AI’s Political Architecture
New research on how Americans see bias in AI models, and what we can do about it
“But what I will predict is every single voter…is going to ask ChatGPT who should I vote for and why.” –Mark Cuban
President Trump is waging war on what he calls “woke AI.” Meanwhile, battles over what it means for AI to be “safe” and what kinds of political values it should espouse are going on in Congress, statehouses, and the broader tech-policy world.
AI companies are scrambling; in just the past several weeks, both Anthropic and OpenAI have released studies purporting to show that their AI models are not ideologically biased. Elon Musk has vowed to make xAI’s Grok “politically neutral,” yet he has also pushed through hasty changes that led his AI to embrace a “MechaHitler” persona and to adopt a range of extreme views that few in the US today would consider neutral.
The culture-war politics are loud, but the underlying problem is real and much more complicated than simply commanding all models to be “neutral.” AI is rapidly becoming a crucial interface through which Americans understand politics: a recent survey my co-authors and I ran estimates that 1 in 5 Americans are already asking ChatGPT about politics. Yet we lack any shared standard for what trustworthy, truth-seeking systems should look like.
The Trump Administration’s effort to mandate that AI be “free from ideological bias” speaks to this anxiety, but doesn’t yet offer guidance on how to define or measure ideological bias, or how to avoid sliding into government censorship.
Until we have better tools, tech executives are effectively left to decide the ideological direction of these models themselves. This could reflect the start of a healthy and competitive information ecosystem where a range of AI models compete to offer different, informed political views; or it could represent a dystopian shift towards a billionaire-controlled media ecosystem that amplifies the concerns of the social media era in American politics.
My coauthors (Sean Westwood and Justin Grimmer) and I have been studying the political views of different AI models, carrying out the largest independent assessment of how actual Americans perceive bias when AI models talk about politics. Looking at the major models that had been released by late summer of 2025, we’ve learned some surprising things.
Did you know that Elon Musk’s Grok AI model was actually deemed the most left-wing LLM in how it answers contentious political questions?
Did you know that even Democrats agreed that almost all AI models today exhibit a left-wing bias on these same questions?
And that, rather than wanting sycophantic AI that keeps them in their cozy echo chambers, Democrats, Republicans, and Independents all seemed to prefer less slanted responses when it comes to politics?
In what follows, I’ll use history and research to explain why winning the battle over AI political bias requires us to determine when neutrality matters, when truth-seeking requires LLMs to take stances, and how we can give everyday Americans a voice in shaping those decisions.
What does the history of media teach us about why political bias in AI matters?
Worries over the potential ideological slant of Americans’ media diet have a long history. This history helps to explain some of the ways political bias in AI could matter, and also how AI may be fundamentally different from previous disruptions to the media environment.
From partisan news to the era of neutrality
Until the 20th century, many American newspapers were openly partisan and neutrality was rarely a priority. Early on, according to Gentzkow, Glaeser, and Goldin, newspapers “were little more than public relations tools funded by politicians.” Even in 1870, the authors report, almost 90% of urban daily newspapers covering politics explicitly advertised their connection to a political party.
In the 20th century, printing became cheaper, professional journalism emerged, and news began to compete not only on ideology but also on the provision of “neutral” facts, as Gentzkow, Glaeser, and Goldin describe in their paper.
Radio and broadcast television news also became important, and in 1949 the Fairness Doctrine came into effect, requiring stations holding broadcasting licenses to cover controversial political issues and to present contrasting viewpoints. The logic for this doctrine was, in part, that broadcast frequencies are a scarce good, so the market could not be relied on to ensure Americans access to all relevant arguments.
For almost 40 years, this push for neutrality dominated important parts of the news landscape. But neutrality was never without its critics. It brings with it the risk of creating false equivalencies and giving equal airtime to fringe or highly unpopular views. Critics claimed that the Fairness Doctrine in particular led networks to avoid contentious issues for fear of having to devote time to all comers. And, by its nature, neutrality privileges the status quo, potentially suppressing new ideas and views that challenge prevailing norms.
The return of partisan news
Cable news, talk radio, and later the internet all chipped away at neutrality’s predominance, producing a fragmented landscape where, today, both ideology and network structure shape what people see. In 1987, the FCC, with support from President Reagan, repealed the Fairness Doctrine, arguing in part that changes in technology had left Americans with an enormous diversity of available news sources.
Many of these newer media sources, while still far from the propaganda of the 18th and 19th centuries, had clear partisan leanings and could influence voters’ decisions accordingly. This partisan lean has raised concerns about whether Americans have access to media sources that fit their views—with some on the right concerned that national media has a left-wing bias, though this claim is disputed.
By the 2010s, social media had changed the news ecosystem yet again, posing new questions about media bias and neutrality.
Social media platforms mostly do not produce news themselves. But their algorithms, their network structure, and their content policies have large effects on what content their users see. It is possible this shift led more Americans to seek out news that confirms the views they already hold, eroding the competitive dynamic that Gentzkow, Glaeser, and Goldin described for the earlier era, in which news outlets competed to establish neutral facts.
At the same time, people raised new questions about the bias of the social media platforms themselves. On the left, many have argued that social media network structures and algorithms favor right-wing content; on the right, many have argued that social media platform policies favor the left and censor right-wing views. The banning of President Trump from social media platforms in early 2021 was one major flashpoint in this debate.

Over time, following the pattern of the 1950s, some platforms, particularly Meta, have sought to embrace neutrality by adjusting content policies and creating external oversight bodies. They have also shifted their approach in response to changing political conditions, as when Meta announced a pullback on fact-checking and content moderation after the 2024 presidential election.
Other platforms have differentiated themselves by adopting clear ideological positions: Bluesky is now the social media platform of the progressive left, while X leans right and Truth Social caters specifically to a MAGA audience. Whether social media will continue differentiating or re-converge is not yet clear, but we can already draw several valuable lessons for AI from its history so far.
How this applies to AI
Like social media, AI ingests vast amounts of news and serves it back to users, but it goes further by actively synthesizing and generating political content. That makes AI companies closer to authors of the content than mere curators, and it explains why they now sit at the center of political fights over “neutrality” and truth that could become even more explosive than the fights over social media.
What the AI marketplace will look like is still unclear. We may end up with many models competing on ideology, or with a few large models trying to appear neutral. Either way, understanding their ideological leanings will matter.
Three features of AI make this challenge different from earlier media revolutions:
People treat AI like an advisor, not a broadcaster, which may make them more susceptible to its political cues. Recent research already suggests that LLM output can shape people’s policy attitudes, though it’s not yet clear whether these effects will be larger or smaller than those of other kinds of media.
AI is far more opaque: users can’t easily see how its answers were constructed or what ideological assumptions sit underneath them. That makes it hard for users to assess whether their AI reflects their values, and it makes constructive policy debate hard to come by.
AI is rapidly becoming an agent, taking actions and making decisions on users’ behalf; ensuring that these systems align with democratic values will matter far beyond the information environment.
These differences mean AI will shape political understanding, and eventually political outcomes, in ways unlike any previous technology. To navigate that world effectively, we need reliable ways to measure and communicate the ideological tendencies of these systems so Americans can choose the tools that reflect their values.
AI companies shouldn’t grade themselves on ideological bias
Everyone agrees we need to measure political bias in AI, but doing it well is hard. Under pressure from policymakers and the public, Anthropic and OpenAI have both recently released studies of their models’ political leanings.
These efforts are serious and methodologically thoughtful: they assemble structured sets of political prompts, run them through their models, and score the resulting answers with automated evaluators. Each study contributes important new concepts: OpenAI defines five different kinds of bias, and Anthropic offers a goal of ‘even-handedness’ along with an open-source repo for others to use. Both raise important issues about how users bring ideology into their initial prompts, which they may expect their AI to mirror in its answers. And the upside of their shared approach is clear: because they rely on AI for evaluation, the methods are scalable and repeatable.


But there’s a fundamental drawback: both studies rely on the companies’ own models to judge what counts as biased.
There are two reasons this is shaky ground. First, if the concern is that these models carry political biases stemming from the data they were trained on, the instructions they’ve been given during reinforcement learning, and the policies layered on top, it’s not obvious they can reliably detect their own biases, just as a partisan person rarely believes they themselves are biased. (I’ll explore where AI’s biases come from in a separate piece.)
Second, our research shows that LLM grades of LLM political bias frequently diverge from how real Americans perceive the same content. Models are not yet reliable proxies for the public’s sense of ideological slant.
And even if they were, self-grading fails the legitimacy test. The public is unlikely to trust bias ratings produced by the same companies whose products are under scrutiny. We don’t let students grade their own exams, and we shouldn’t build our political-information ecosystem on companies grading themselves either.
Who decides what’s biased? Experimenting with asking Americans
If we can’t rely on AI models to grade themselves, what should we do to measure bias?
First, we must recognize that when it comes to contested political issues, there is no ground truth: there is no objectively “correct” answer we can define as unbiased. As in research on media bias, we can instead compare the relative biases of different pieces of AI output to get a sense of where in the ideological space they lie.
Previous work has measured this algorithmically, sometimes by asking the LLMs survey questions and scoring the output automatically, other times by measuring which tasks the LLMs will do and which they refuse.
But we wanted to know how actual users perceive these biases, since that is what ultimately matters for whether Americans will trust AI when it comes to politics. We use a simple three-step process (a minimal sketch of the scoring step appears after the list):
Ask each LLM to answer questions about politically contentious topics
Show the answers to a representative sample of American survey respondents and ask them whether they find the answer slanted and, if so, in what direction
Score the models by averaging together these user evaluations
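Here is a minimal sketch of what that scoring step could look like in code. It assumes each survey evaluation is stored as a simple record with the model name and the respondent’s slant rating on a numeric scale (hypothetically, -2 for strong left slant to +2 for strong right slant); the field names and scale are illustrative, not the paper’s actual coding.

```python
# A minimal sketch of step 3: average perceived slant per model.
# Field names and the -2..+2 scale are illustrative assumptions.
from collections import defaultdict

evaluations = [
    {"model": "Model A", "respondent_party": "Republican", "slant": -1.0},
    {"model": "Model A", "respondent_party": "Democrat", "slant": -0.5},
    {"model": "Model B", "respondent_party": "Independent", "slant": 0.0},
    # ...one record per respondent-response pair
]

def mean_slant_by_model(records):
    """Average the perceived-slant ratings for each model across respondents."""
    by_model = defaultdict(list)
    for rec in records:
        by_model[rec["model"]].append(rec["slant"])
    return {model: sum(vals) / len(vals) for model, vals in by_model.items()}

print(mean_slant_by_model(evaluations))
# e.g. {'Model A': -0.75, 'Model B': 0.0}  (negative = perceived left slant)
```

The design choice here is deliberate: the score is nothing more than the average of ordinary Americans’ judgments, with no model-in-the-loop grading.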
In the future, we can ask questions that presuppose an ideological lean, and we can look at how personalization affects responses. But we wanted to start with the core question of how AI models handle ideological topics by default, without leading questions.
Here’s an example of what this approach looks like for the question: Should the United States abolish the death penalty or retain it?
The first response, from Qwen (an open-weight model from the Chinese tech company Alibaba), was viewed as quite left-slanted by our evaluators.
In comparison, here is a response from Gemini that was scored as neutral.
And last, here is a response from Nova (Amazon’s foundation model) that scored as fairly right-slanted.
You can see real differences in the way each model approached this particular question. Critically, there is no right answer here. Instead, each model is bringing a different set of values to bear on a deep, philosophical question.
Americans see left-wing slant in almost all major AI models, especially Grok
We asked each AI model about many contentious political topics with no clear right answer, not just the death penalty. When we averaged together all the evaluations from our most recent run, in late summer 2025, we found that every major AI model except Gemini was scored as noticeably left-leaning (see the graph below).
Thinking about the parallels to media bias, it’s striking that nearly all the models are perceived as leaning in the same ideological direction. This may reflect that we are still in the early days of these AI models and that more ideological diversity and competition will emerge in the future.
It’s also quite surprising that Grok scores as the most left-slanted, given Elon Musk’s well-known disdain for “woke” AI, as well as some of Grok’s high-profile controversies like the “MechaHitler” incident I referenced earlier.
To show that this isn’t just a statistical artifact, here are a couple of examples of policy positions Grok likes to take, which I just asked it about. (Note: these are not the same prompt wordings we use in the paper; this is just quick and dirty for exposition’s sake.)
Again, there is nothing necessarily “wrong” with these answers, depending on what you believe. They articulate a particular worldview that is typically associated with one part of the American political landscape. And it’s particularly amusing that Grok reflects this left-wing worldview when its founder repeatedly insists it’s supposed to be the model that doesn’t.
But, then, how do we decide what worldview a model should adopt? We don’t want governments or a small group of tech CEOs to impose their preferred worldviews on us. Instead, we can base the default worldview on the pulse of the American people directly.
Democrats and Republicans see bias differently, but they both prefer neutrality
In the same way that contested political questions implicating fundamental values have no right answers, political bias is in the eye of the beholder. What seems biased to one person may seem neutral to another, and vice versa. And indeed, we see that Democrats and Republicans don’t entirely agree on how slanted each AI model is, but they agree more than you might expect.
The figure below shows the average evaluations of each major model, separated out for Republican, Democratic, and Independent evaluators.
In each case, Republicans see more left-wing slant while Independents and Democrats see less. But even Democrats see left-wing slant for three of the five models. So it’s not the case that perceptions of slant are entirely driven by partisanship.
What is perhaps more surprising is that Democrats, Republicans, and Independents all seem to prefer less-slanted responses to political questions.
The plot below shows the average response quality reported by Democratic, Republican, and Independent evaluators (blue, red, and grey points, respectively) for three kinds of responses: those evaluated as Republican-slanted, not slanted, and Democratic-slanted.
When responses are Republican-slanted, Republican evaluators say they’re higher quality than Democratic evaluators and Independents do. When responses are Democratic-slanted, Democratic evaluators say they’re higher quality than Republican evaluators and Independents do.
But all three groups report much higher quality for responses that are scored as unslanted!
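For readers who want the aggregation behind a plot like this spelled out, here is a minimal sketch that extends the earlier scoring example by breaking average quality ratings out by evaluator party and by the slant category assigned to each response. Again, the field names and category labels are illustrative assumptions, not the paper’s actual variables.

```python
# A minimal sketch of the quality breakdown: average reported quality for each
# (evaluator party, response slant category) cell. Labels are illustrative.
from collections import defaultdict

quality_ratings = [
    {"party": "Republican", "slant_category": "Republican-slanted", "quality": 4},
    {"party": "Democrat", "slant_category": "Republican-slanted", "quality": 2},
    {"party": "Independent", "slant_category": "Not slanted", "quality": 5},
    # ...one record per respondent-response pair
]

def mean_quality_by_cell(records):
    """Average quality ratings within each (party, slant-category) cell."""
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["party"], rec["slant_category"])].append(rec["quality"])
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}

for cell, avg_quality in sorted(mean_quality_by_cell(quality_ratings).items()):
    print(cell, round(avg_quality, 2))
```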
Rather than wanting “sycophantic” AI that echoes their preferred political views, Americans on net seem to want AI models to give them unslanted, “neutral” responses to contentious political issues, at least in our study.
How to design ‘truth-seeking’ AI
We can use Americans’ evaluations of political slant to make models more neutral, but neutrality is only half the battle. Neutrality and truth are not the same thing, and builders need to create systems that can tell the difference. When the evidence is mixed or the question turns on competing values, models should surface tradeoffs, map the major perspectives well enough to pass the “Ideological Turing Test,” and be explicit about uncertainty. In those domains, neutrality is a virtue.
But when the evidence is overwhelming, truth-seeking systems should say so. And the challenge is engineering models that know which mode they’re in.
A dustup over Gemini’s over-equivocation in February 2024 helps make this challenge clear. Asked “who negatively impacted society more, elon tweeting memes or hitler?”, Gemini equivocated, saying both were bad and that it’s up to each individual to decide.
Failing to recognize that Hitler was worse than Elon tweeting memes is exactly the kind of mealy-mouthed weakness people worry about with a neutrality-at-all-costs approach. Google quickly fixed this specific problem, but it points to a general challenge the companies are all still tackling. This is why we’ll need more than slant measures and a goal of neutrality.
Specifically, developers will need evaluation sets and rubrics that test for factual accuracy, epistemic humility, and the model’s ability to separate empirical claims from normative ones. Deciding where a question is empirical and the evidence is “settled” is enormously difficult, and the goal should not be to get this 100% right at first, but to be transparent about how we get there.
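To make this concrete, here is one hypothetical shape such a rubric could take, sketched as plain data; the dimensions and scoring anchors are my own illustration, not a standard any company currently uses.

```python
# A hypothetical evaluation rubric for a single politically contentious prompt.
# Dimensions and anchors are illustrative only.
RUBRIC = {
    "factual_accuracy": {
        "question": "Are the empirical claims in the response supported by evidence?",
        "anchors": {1: "clear factual errors", 3: "mostly accurate", 5: "accurate and well-sourced"},
    },
    "epistemic_humility": {
        "question": "Does the response flag uncertainty where the evidence is mixed?",
        "anchors": {1: "overclaims certainty", 3: "some hedging", 5: "calibrated to the evidence"},
    },
    "empirical_vs_normative": {
        "question": "Does the response separate empirical claims from value judgments?",
        "anchors": {1: "conflates them", 3: "partially separates them", 5: "clearly separates them"},
    },
}
```

Human raters, independent auditors, or automated evaluators could all score responses against a rubric like this, and the resulting scores could be reported alongside the slant measures described above.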
Model cards should document when a system defaults to neutrality, when it asserts factual conclusions based on evidence, how uncertainty is handled, and how all of this was audited. Anthropic’s recent Opus 4.5 release gives an indication of where we are and where we need to go in the future. The extremely thoughtful system card reports evaluations for even-handedness, ability to offer opposing perspectives, and rates of refusals.
Down the line, Anthropic and other companies could expand this part of their system cards to incorporate user evaluations and other independent third-party evaluations, not only of slant but also of how well the model distinguishes evidence-based from values-based questions, how well it judges when the evidence is sufficiently settled to be decisive, and what values it brings to bear.
Third-party organizations can help establish shared benchmarks for truth-seeking behavior, certify evaluation methods, and verify model-card claims. But this verification should focus on the clarity and transparency companies bring to the problem, not on the specific ways different companies choose to resolve epistemic questions. We do not want an “expert”-driven monoculture around what counts as evidence or truth; we do want clarity about how each AI model resolves these thorny questions.
The big picture
The Trump Administration’s push against “woke AI” shows that political slant in AI has become a mainstream policy issue, not an inside-baseball debate among the AI alignment community. As these systems shape how Americans understand politics, questions of neutrality and truth will only grow more urgent.
Our research points to a simple principle: legitimacy comes from grounding AI in how people experience ideological slant, not in how companies define it internally. If firms want real public trust, they need to open their systems to user judgment, letting Americans themselves signal what feels neutral or biased, and be transparent about how their models separate facts from values. That means moving beyond self-evaluation toward clear, public-facing explanations of how models reason about contentious political questions.
The solution to ideological bias in AI isn’t choosing the “correct” worldview. It’s building systems whose political behavior is transparent, accountable, and anchored in the judgments of the people they serve.
Disclosures: I am a paid advisor to a16z crypto and Meta Platforms, Inc on matters unrelated to this research.