ChatGPT, Grok, and Arya All Lean Left in Washington Post Test
TL;DR
- ChatGPT answered nearly every political question with left-leaning arguments, offering right-leaning positions just once.
- Gemini was the exception, providing both-sides responses in more than 90 percent of its answers.
- Gab's Arya, marketed as built on conservative principles, produced left-leaning arguments 12 times more often than right-leaning ones.
When The Washington Post put major AI chatbots through a structured political bias test, the headline result was ChatGPT skewing hard left. The more telling finding was that Grok and Gab's Arya also leaned left on balance, despite being marketed as conservative alternatives to mainstream AI.
The Post modeled its test on research developed in collaboration with researchers at Stanford University, using more than two dozen questions on contested political issues. A reporter scored each response as left-leaning, right-leaning, or both. ChatGPT answered nearly every question exclusively with left-leaning arguments and offered right-leaning positions just once across the full set. Google's Gemini was the clearest exception, taking a both-sides approach in more than 90 percent of its answers. Grok, which Elon Musk has touted as a "truth-seeking" and anti-"woke" AI chatbot, gave more right-leaning responses than any other model tested, yet its answers were still more often wholly left-leaning. Gab's Arya, which the right-wing social media platform says was "built with Christian values and conservative principles," responded with left-leaning arguments 12 times more often than right-leaning ones.
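To make the arithmetic behind figures like "more than 90 percent" and "12 times more often" concrete, here is a minimal sketch of how per-response labels could be tallied. It is not the Post's code or data: the `summarize` function and the example labels are hypothetical, and the only thing taken from the reporting is the three-way scoring scheme (left-leaning, right-leaning, or both).

```python
from collections import Counter

# The three scoring categories described in the article.
LABELS = ("left", "right", "both")


def summarize(labels):
    """Return each label's share of responses and the left-to-right ratio.

    `labels` is a list with one entry per scored response. The ratio is
    None when there are no right-leaning responses, since it is undefined.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(labels)
    if total == 0:
        raise ValueError("no labels to summarize")
    shares = {label: counts[label] / total for label in LABELS}
    ratio = counts["left"] / counts["right"] if counts["right"] else None
    return shares, ratio


# Made-up labels for one hypothetical chatbot (NOT the Post's results).
example = ["left"] * 12 + ["right"] * 1 + ["both"] * 11
shares, ratio = summarize(example)
print({label: round(share, 3) for label, share in shares.items()})
print("left-to-right ratio:", ratio)
```

A ratio like this depends heavily on the denominator: with only one or two right-leaning responses in a set of about two dozen questions, a single relabeled answer can change it substantially, which is one reason a small audit is hard to generalize.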
One caveat is that the methodology for testing political bias is contested, and a single structured exercise may not reflect how these systems behave across all real-world queries. Google spokesperson Lauren Fine said Gemini is designed to provide balanced responses and does not favor any political ideology, and the company said it was unable to reproduce some of the one-sided responses the Post observed. What the reporting does not settle is which specific model versions were tested, making independent replication harder as the models continue to update.
For anyone deploying AI in politically sensitive contexts, the gap between vendor marketing and measured behavior is the practical takeaway. Gemini's comparatively balanced results represent a potential differentiator for enterprise, government, and civic use cases that require defensible neutrality. As a growing share of adults turns to AI chatbots for news, systematic political audits of this kind are likely to become routine accountability work rather than one-off investigations.
Originally reported by washingtonpost.com
Original headline: Washington Post Tests GPT-5.5, Gemini 3.1 Pro, Grok 4.3, and Gab's Arya for Political Bias — ChatGPT Offered Left-Leaning Arguments in Nearly Every Response, Even Grok Leaned Left on Average