ITBench-AA: A New Benchmark for Evaluating AI in Enterprise IT Tasks

IBM and Artificial Analysis have unveiled ITBench-AA, a benchmark for assessing AI performance on Site Reliability Engineering tasks, on which frontier models score below 50%.

A developer reflects on their experience with AI-assisted coding, revealing critical lessons learned about architecture and feature implementation.

VMware has unveiled an update to its Cloud Foundation suite that aims to improve hardware efficiency and reduce costs for users amid rising hardware prices.

Kelsey Hightower proposes a new approach to automation, termed 'zero-token architecture,' aimed at enhancing productivity while managing AI costs.

A collaboration between IBM Research and UC Berkeley has used the ITBench benchmark and the MAST taxonomy to yield significant insights into why agentic systems fail at IT automation.

A Red Hat engineer critiques the tech industry's persistent hype cycles, calling out several technologies as overhyped and ineffective.