ITBench-AA: A New Benchmark for Evaluating AI in Enterprise IT Tasks

IBM and Artificial Analysis unveil ITBench-AA, a benchmark for assessing AI performance on Site Reliability Engineering (SRE) tasks, on which frontier models score below 50%.
