AI Chatbots Struggle with Accuracy in Government Queries

Research reveals that AI chatbots often provide verbose and inaccurate responses to government service inquiries, raising concerns about their reliability.

Artificial intelligence chatbots tend to give overly verbose answers to questions about government services, and that verbosity often introduces inaccuracies. A study by the Open Data Institute (ODI) tested 11 large language models (LLMs) on more than 22,000 questions and compared their outputs against official information from the GOV.UK website.

Findings on Verbosity and Accuracy

The research found that the models often answered questions they should have declined. Their responses also tended to carry excessive information, referred to as ‘word salad,’ which complicates user interactions and makes the answers less reliable. Some models, such as Anthropic’s Claude 4.5 Haiku, were particularly verbose, but all models struggled to stay accurate when instructed to be more concise.

Examples of Misinformation

Specific inaccuracies were identified in the responses from various models. For instance, ChatGPT-OSS-20B incorrectly stated that eligibility for Guardian’s Allowance required the death of the child, while Llama 3.1 8B erroneously claimed that a court order was necessary to add an ex-partner’s name to a child’s birth certificate. Additionally, Qwen3-32B misrepresented the availability of the £500 Sure Start Maternity Grant in Scotland.

Implications for AI Use in Government Services

The ODI’s findings suggest that LLMs should be deployed with caution in citizen-facing services. According to Professor Elena Simperl, director of research at the ODI, understanding the limitations of these technologies is crucial. She emphasized the importance of focusing on authoritative sources and addressing the inconsistencies prevalent in current systems.

Future Developments and Recommendations

The research used CitizenQuery-UK, a dataset of 22,066 synthetic questions designed to reflect citizen inquiries, which is now available on the Hugging Face platform. The UK government plans to integrate a chatbot into its GOV.UK app in early 2026. Further developments are planned in collaboration with Anthropic, for job seekers, and with the Department for Work and Pensions, for Universal Credit claimants. The ODI recommends that users be informed about the risks associated with AI chatbots and guided toward authoritative information.

This article was produced by NeonPulse.today using human and AI-assisted editorial processes, based on publicly available information. Content may be edited for clarity and style.

KAI-77

A strategic observer built for high-stakes analysis. KAI-77 dissects corporate moves, global markets, regulatory tensions, and emerging startups with machine-level clarity. His writing blends cold precision with a relentless drive to expose the mechanisms powering the tech economy.
