A recent study finds that GPT-5, OpenAI's latest language model, achieves only about 45% accuracy on a human-curated benchmark covering 11 Indic languages, including Gujarati, the mother tongue of Prime Minister Narendra Modi. Because more than a billion people speak these languages, the finding underscores a critical challenge for India's AI aspirations.
Benchmark Details and Performance Gaps
The benchmark, designed by human experts, tests translation and comprehension across Hindi, Bengali, Tamil, Telugu, Marathi, Urdu, Gujarati, Kannada, Malayalam, Punjabi, and Odia. GPT-5's 45% accuracy is far lower than its performance on English and other high-resource languages, where it often exceeds 90%. The study attributes this gap to the underrepresentation of Indic languages in training data and a lack of fine-tuning for their linguistic nuances.
Impact on India's AI Ecosystem
India's AI ambitions rely heavily on inclusive technology that serves its linguistically diverse population. The government's National Language Translation Mission aims to bridge digital divides, but such low accuracy threatens to exclude hundreds of millions of people from the benefits of AI. According to a Bloomberg Opinion report, the shortfall could hinder sectors such as e-governance, education, and healthcare, where accurate translation is vital. Experts warn that without targeted improvements, AI tools may perpetuate language biases and widen inequality.
Call for Action and Future Directions
The findings have sparked calls for more investment in Indic language datasets and community-driven benchmarks. Researchers emphasize the need for collaboration among tech companies, academia, and government to develop robust multilingual models. As India pushes for AI leadership, closing this gap is crucial to ensuring that the benefits of AI reach all citizens, regardless of the language they speak.



