Hegseth’s war on Anthropic is the wrong answer to the right question
The government is too reliant on private software vendors for core mission work, and AI will make this much worse. Forcing companies to work at gunpoint won’t fix things.
Last week, the Department of Defense gave Anthropic an ultimatum: Let the Pentagon use its AI models for “any lawful purpose,” without restriction, by 5:01 p.m. Friday or face the consequences. Those consequences turned out to include a “supply chain risk” designation normally reserved for foreign adversaries and firms associated with them, a presidential directive for every federal agency to cease using Anthropic’s technology, and threats to compel cooperation under the extraordinary powers of the Defense Production Act.
Anthropic refused to compromise on two red lines that ought to fall outside the realm of “lawful purpose” anyway: no mass domestic surveillance and no fully autonomous weapons without a human in the loop. The Pentagon labeled this defiance. As commentators across the ideological spectrum have pointed out, the administration’s actions are deeply illiberal and its rationale incoherent. It’s also hard, of course, to look past the relationship that some of Anthropic’s competitors have with the administration and not wonder whether they pushed the DoD to take an excessively hard line for their own benefit. Most Americans agree with Anthropic’s position.
And yet, for all the wrong reasons, the Trump administration has tripped and fallen into an interesting question: How should the government think about preserving its own autonomy while it makes use of privately held AI models?
The administration has attempted to answer this question with allegations that don’t really scan, suggesting, for example, that Anthropic is bad because it is a Trojan horse for effective altruism to “seize veto power over the operational decisions of the United States military” or because “there has been no bigger thief of Americans’ public identity information en masse or creators’ works than by Anthropic.”
As is so often the case, other players in the MAGA ecosystem have now been left to graft logic onto Trump and Hegseth’s impulsive behaviors with varying success. But writing on X, Acting Under Secretary of State for Foreign Assistance, Humanitarian Affairs, and Religious Freedom Jeremy Lewin made one such ex post argument that we should take seriously:
This isn’t about Anthropic or the specific conditions at issue. It’s about the broader premise that technology deeply embedded in our military must be under the exclusive control of our duly elected/appointed leaders. No private company can dictate normative terms of use — which can change and are subject to interpretation — for our most sensitive national security systems. The @DeptofWar obviously can’t trust a system a private company can switch off at any moment.
Lewin is mostly right in principle. No government could, would, or should tolerate a private vendor dictating the terms of how it fulfills its core functions. It’s anti-democratic to outsource so much of your government to private industry that you fail to translate policy goals into real results.
But here’s the thing: Even if we accept the dubious premise that this is a case in which a private company is exercising inappropriate influence over how the government executes its legitimate mission, that is not a new problem. It is, in fact, one of the most common and well-documented failures in federal operations. We just don’t usually talk about it in such dramatic terms. It manifests instead in now-expected tech meltdowns, call centers that don’t scale, and websites that are unusable to the people they need to serve. But it costs us far more in money, failed programs, and degraded public services than any harm Anthropic’s usage policy has ever caused.
And while Lewin correctly identifies this as a problem, the solution cannot be a sudden resort to punitive intervention in the market; this won’t work. Instead, it requires habitual changes inside every agency, office, and program. For anything more complex than a pencil, government contracting poses a classic principal-agent problem: It is difficult for the ostensible boss to track what the ostensible underling is actually doing. Fixing it requires doing something radical: growing the government’s own internal capacity at the expense of an industrial base that has captured large parts of the government and profits by perpetuating dysfunction.
That insight has taken decades to surface, and we struggle mightily to execute on it even now. But AI is going to pour jet fuel on this problem, because it implicates so many normative questions about humanity, ethics, and ownership. We need to ask bigger questions and remain open to bigger answers about how to address this before it’s too late.
We know how to solve vendor capture, actually
The Anthropic fight has captured the public’s interest, but most of the time, vendor capture doesn’t look like a CEO drawing ethical red lines on national television. It looks like an agency that can’t switch IT contractors because no one on the government side understands how the old system works. It looks like a contractor who has maintained a critical system for so long that all the institutional knowledge has migrated to their side of the table. It looks like contracts designed — sometimes deliberately, sometimes through sheer inertia — to make it functionally impossible for anyone else to compete for them.
The practical impact isn’t usually an ideological company imposing its values on public policy. It’s billions of dollars siphoned off through dependency; programs hobbled by vendors who face no real accountability; and a government so reliant on outside firms that it can’t perform even basic oversight of their work.
It’s easy to imagine that this is nefarious, but the reality is generally quite banal. Agencies can’t function without their contractors not because the contractors are so brilliant but because the government has let its own capacity atrophy to the point where the dependency is total. As my colleague Matthew Burton recently explained, vendor capture comes in many forms.
There’s technological capture, where a vendor controls a government IT system so completely, through proprietary code, managed-service arrangements, or contractual terms the agency failed to negotiate, that the government can’t even access, modify, or migrate the system it paid to build.
There’s intellectual capture, where all the institutional knowledge of how a critical system works has migrated to the vendor’s side of the table. This happens sometimes because the agency never required adequate documentation and sometimes because the agency’s best engineer retired and was immediately hired by the incumbent contractor at twice their government salary.
And there’s psychological capture, where agency leaders have internalized the idea that their vendors are simply better than they are, and stop asking hard questions as a result.
In one sense, this is just a textbook example of the principal-agent problem: Because vendors are profit-maximizing entities, they will behave in ways that perpetuate their own incumbency and widen the “moat” around their position, both at the expense of the very entity that hired them in the first place. This is a well-studied pathology, though one that has gotten worse in recent decades as the complexity of 21st-century technology has grown.
Over the past decade, a hard-won consensus has emerged about how to address vendor capture in government. The answer isn’t to threaten vendors or nationalize their products. It’s to close what I call the sophistication gap: the gulf between a government that can barely understand and evaluate what it is buying and a vendor who seems to know everything.
This means the government needs its own technical capacity. It doesn’t need to do everything involved in deploying a system itself — that’s neither practical nor desirable. But it needs to be able to understand the process. It needs product managers who can set priorities and make tradeoffs: its own people who decide what to build and for whom, not just contracting officers who check compliance boxes. It needs its own software engineers who can review a vendor’s code, spot architectural problems, and course-correct before the project goes sideways. It needs people to maintain systems in steady state during transitions between vendors, between administrations, and over the long term. The absence of this capacity is staggeringly expensive. Many, many government IT programs fail outright, and many that do eventually launch cost billions more than expected.
We’ve seen what happens when this capacity doesn’t exist and when it does. When a several-hundred-million-dollar redesign of the Free Application for Federal Student Aid collapsed in 2023, the proximate cause was that the main vendors doing the work had produced code that simply didn’t work. But the upstream problem was that the department ostensibly managing those vendors didn’t know the right questions to ask about their work. When Jeremy Singer, a College Board executive whom the White House brought in to manage the turnaround, arrived at FAFSA, he found “no ability to check the veracity of what the vendor said as far as status, quality of code — all the things you’d want to do.” Federal Student Aid wasn’t getting a good picture of the actual state of the product. As Singer put it, “I don’t think the department knew how screwed they were.”
However, the Department of Education didn’t subsequently fix FAFSA by threatening its vendors with forced contracting or arbitrary risk designations. Instead, it managed to right the ship by bringing on a cadre of empowered, senior talent who were capable of reviewing whether contractors’ work was technically sound and explaining tradeoffs to policymakers. This isn’t rocket science and doesn’t require compelling private industry to act at gunpoint.
It’s telling, though, that this lesson had to be relearned. FAFSA was basically in the same failure mode as Healthcare.gov a decade earlier: well-meaning but outgunned policy professionals, unable to manage a vendor doing work they didn’t really understand. We keep making the same mistake because scaling this mindset across the largest enterprise in the country is extraordinarily hard, and because the structural incentives in federal hiring, budgeting, and procurement all push against it.
But AI is genuinely different
Here is where the DoD-Anthropic standoff points at something genuinely new, even if the administration both misunderstood its own problem and then prescribed exactly the wrong medicine.
The traditional answer to vendor capture rests on a key assumption: that if you put a sufficiently skilled person on the government’s side of the table, they can look under the hood. Traditional software is opaque but reconstructable. If a COBOL mainframe has no documentation (and many of them don’t!), you can hire a developer to trace the logic line by line and figure out what it does. It’s expensive, but the code is eventually legible to a competent reader. The architecture is knowable. The behavior is, in principle, fully explicable, even if it’s hard to recreate and expensive to maintain.
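To make that concrete, here is a minimal, hypothetical sketch (invented for illustration, not drawn from any real federal system) of the kind of undocumented rule that lives in legacy code:

```python
# A hypothetical, undocumented eligibility rule of the sort buried in
# legacy government systems. Even with no comments or documentation,
# an auditor can trace every branch and explain every output.

def eligible(age: int, income: int) -> bool:
    if age < 18:
        return False
    if income > 45_000:
        return False
    return True

# Why 45,000? A reviewer can find out: trace the callers, check the
# regulation the threshold encodes, and test the edge cases directly.
assert eligible(30, 40_000) is True
assert eligible(30, 50_000) is False
assert eligible(17, 10_000) is False
```

The exercise is tedious, but nothing about the program’s behavior is beyond a competent reader’s reach.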
AI models are different in kind, not just in degree. The weights and parameters that determine a model’s behavior are trade secrets, but even if they weren’t, the relationship between those weights and any given output is extraordinarily difficult to trace. There is evidence of emergent behavior (i.e., capabilities that arise from training without anyone designing them in) and a long-standing focus by all frontier AI labs on preventing “misaligned” behavior. The companies building these tools do not themselves always fully understand why their models behave the way they do. In fact, AI is so illegible to its own designers that understanding it has become a major area of active research called interpretability.
Dario Amodei, the CEO of Anthropic, described this succinctly last year:
Modern generative AI systems are opaque in a way that fundamentally differs from traditional software. If an ordinary software program does something — for example, a character in a video game says a line of dialogue, or my food delivery app allows me to tip my driver — it does those things because a human specifically programmed them in. Generative AI is not like that at all. When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does — why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate.
This difference matters enormously for government accountability. With traditional software, the strategy of closing the sophistication gap works.
With AI, it is not obvious that anyone the government hires could pop open the hood and investigate whether a model was putting its thumb on the scale — purposely or accidentally — in ways inconsistent with the public sector’s priorities. This problem is particularly dire because of the oligopolistic nature of the AI market. There are lots of engineers who could build an alternative software product with enough time and resources, but there are very few alternative AI models and tremendously high capital costs to train a new one. In such a highly concentrated market, one or two key players declining to work with the government is itself policy-influencing. And while switching costs between models are relatively low right now, the market pressure to raise them will be great.
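As an illustration of why switching costs are low today, here is a minimal sketch, with entirely hypothetical vendor names and no real SDK calls, of the thin seam that makes a model swap a configuration change rather than a rewrite:

```python
# Hypothetical sketch: if an agency keeps one narrow dispatch point
# between its systems and its AI vendors, changing vendors is a
# one-line configuration change. The vendor functions are stand-ins;
# in practice each would wrap that vendor's real client library.

from typing import Callable

def _call_vendor_a(prompt: str) -> str:
    return f"[vendor A response to: {prompt}]"

def _call_vendor_b(prompt: str) -> str:
    return f"[vendor B response to: {prompt}]"

VENDORS: dict[str, Callable[[str], str]] = {
    "vendor_a": _call_vendor_a,
    "vendor_b": _call_vendor_b,
}

def ask(prompt: str, vendor: str = "vendor_a") -> str:
    """Route a request to whichever vendor is currently configured."""
    return VENDORS[vendor](prompt)

# Today, swapping vendors can be this easy:
print(ask("Summarize this benefits filing.", vendor="vendor_b"))
```

The pressure to raise switching costs is pressure to widen that seam: vendor-specific fine-tunes, proprietary tool integrations, and data formats with no counterpart on the other side of the dispatch table.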
Furthermore, the nature of AI technology is such that the ethics, values, and risk tolerances of its builders are fundamental to the product in a way that isn’t true of more traditional enterprise software. It may simply not be possible to require an AI company to check those values at the door, as the Trump administration seems to be demanding, because of that tight integration. Civil servants are acculturated to speak up but ultimately follow the president’s direction; AI models do not necessarily share this impulse.
One day, this may change as interpretability advances or the market fragments, but that day is still a long way off. As we ask AI to perform increasingly complex tasks, this gap between what the technology can do and what the government understands becomes a significant risk to the public interest.
Imagine the shoe on the other foot
Imagine a future Democratic administration arrives at the Department of Health and Human Services and discovers that Elon Musk’s xAI has expanded its footprint beyond “generation of first drafts of documents and other communication materials” and is now deeply integrated into software looking for patterns of fraud in Medicare data. Per the terms of their contract, xAI declines to explain why the system keeps flagging supposed fraud cases overwhelmingly in blue states. The agency’s technical staff can’t look beyond the outputs they receive. They can’t audit the model’s reasoning because, in a meaningful sense, there isn’t traditional “reasoning” to audit — just statistical patterns baked into billions of parameters that no one can fully trace and that the vendor is under no obligation to provide insight into.
The incoming administrators have no choice but to cancel the contract entirely. The program grinds to a halt while they change vendors. It’s disruptive but ultimately recoverable. You could imagine this same scenario playing out in a law enforcement agency that contracts with Palantir, for example, as ties to the company increasingly become problematic in Democratic primary fights.
Now imagine something more extreme but entirely plausible. A president comes to power deeply opposed to the small number of firms that represent all the major players in frontier AI, perhaps because she made opposition to data centers a signature part of her campaign. Before the election, those firms announce — as is their right — that they’ll decline to work with that candidate if she wins. But this capital strike would leave an incoming government without options, particularly as citizens have come to expect AI-enabled service delivery. If the IRS is staffed on the assumption that a significant chunk of its filing-season call volume will be diverted to an AI chatbot, for example, a new administration could find its first 100 days marred by the resulting chaos. The government could go after overt collusion in the example above, but what about the more ordinary and common case: a capital-intensive industry controlled by an extremely small group of people that simply has different political preferences or interests than the people in power?
This is not a technocratic problem. It’s a deep dilemma fundamental to liberal democracy, which has always struggled with the tension between individual liberty and the practical concerns of the state and society. Banning discrimination in public accommodations also impairs free enterprise and free association, but it is understood to be more desirable because it enhances the liberty of everyone else. Some monopolies (e.g., for utilities) are tolerated because astronomical infrastructure costs permit no alternative, but they are regulated more closely as a result. The original idea of the Defense Production Act was similarly reasonable: In a time of total war, it might really be necessary for the government to jump to the front of the line for steel that was being produced anyway. But these cases generally do not involve the compulsion of private action, particularly when that action is expressive.
In a real sense, finding the right balance between these competing interests is the hard work of democracy. Despite what unitary-executive fanatics would have you believe, getting it right requires broad political engagement from all of our institutions and not simply a pronouncement from the president. And, even if you did agree to compel action by a privately held company, would you really want to hinge national security on the work of anyone helping under duress?
The principal-agent problem we can’t solve the old way
The administration’s DPA gambit is a lousy, dangerous, and incoherent answer to this question, but it is not a lousy question. The more the government entwines the private sector in its day-to-day operations, the more difficult it is to ensure that the tail doesn’t wag the dog. This is the heart of it, and it’s easily obscured amid the storm over Hegseth’s tantrum.
We’ve slowly and painfully learned that to solve the principal-agent problem in government technology, we need to put a technically skilled person on the public’s side of the table. It took us a couple of decades to reach this insight for regular software development, and we still struggle to act on it. Now, it may be on the verge of becoming obsolete as we get lapped by technology.
This is a bigger claim than “Anthropic should cooperate” or “the Pentagon was too aggressive.” It’s a claim that our entire framework for managing the government’s relationship with its technology vendors needs a radical overhaul, and that we need to get it right much faster this time.
No amount of DPA threats, supply chain designations, or contract renegotiations will fix a structural problem. Rather than threatening to compel industry to work for the government on the government’s terms, an administration genuinely concerned about this should be working with Congress to build something new.
One option would be a publicly governed national lab for AI, structured as a Federally Funded Research and Development Center and perhaps housed at a consortium of public universities across the country or patterned after the national laboratories. It could also look like a direct government capability along the lines of NASA’s early monopoly on spaceflight: an agency with a Senate-confirmed head, annual appropriations subject to congressional oversight, and all of the transparency requirements that follow from that. It could also look like a venture controlled by multiple branches of government simultaneously. The federal government was in the driver’s seat for many of the most important innovations of the modern age (the internet, GPS, spaceflight) and could be again.
This wouldn’t need to replace private AI vendors, but it would give the federal government an independent capacity to evaluate, test, and, when necessary, replicate what the private sector is providing. Entities like the National Institute of Standards and Technology are already doing some of this work, but not necessarily with an eye toward replication, and without the capacity to deliver services directly. In cases where it would be inappropriate or entirely too risky to deploy a private-sector model, having a government-operated baseline would provide some measure of security for policymakers of both parties.
I want to be honest about what I don’t know here. A public AI entity would face enormous challenges: competing with the private sector for talent, keeping pace with a technology that’s evolving at breakneck speed, and operating inside a government procurement and personnel system that was not designed for this kind of work. It’s hard to imagine running an AI lab of any consequence amid constant government shutdowns, for instance, or one that has to navigate growing public backlash to data centers and the associated political incentives to tear things down. On the other hand, it could also enable applications of AI that no private lab would be willing to support but that Congress and the president nonetheless agree to direct.
But even if we never build a public AI lab, we need new governance models for the vendor relationship itself. If the fundamental problem is that the government can’t verify what’s happening inside the technology it depends on, then we need institutions designed to address that specific gap. This could mean mandatory interpretability standards. It could mean independent third-party auditing regimes with real teeth. It could mean extremely long-term contracts (say 10 years) entered into jointly by Congress and the president. It could mean moving deprecated, well-understood (but still useful!) models into the hands of the government when they “retire” as part of the normal business cycle. It could mean something we haven’t thought of yet.
Act before we’re stuck
We’ve seen a version of this movie before. The federal government’s transition to complex computer systems in the 1980s and 1990s happened without enough serious, careful thinking about vendor governance, and we ended up with exactly the captured, dysfunctional ecosystem we’re now trying to reform: legacy systems held together by contractors who can never be replaced, billions wasted on modernization projects that fail because the government doesn’t have the capacity to manage them, and a feeling among generations of policy professionals that they aren’t actually in control of the government.
AI is going to turbo-charge this problem. The models are less auditable than traditional software. The market is more concentrated. The dependency, once established, will be harder to unwind. And the stakes, given what these systems will be asked to do, are considerably higher. If AI opens the aperture on disruption, it should also open our aperture for solutions.
The DoD’s Defense Production Act debacle is the wrong answer. It’s inconsistent with the liberal values at the heart of the American tradition. It stifles private enterprise and innovation. It supposes that the government can arbitrarily compel the action of private citizens without a single vote in Congress, or take their property without due process. It does none of the careful balancing between the public’s basic interest in a government not reliant on the goodwill of a small handful of privately held companies, the clear right of those companies to decline some work, and the practical challenges of day-to-day deployment. It is odious and wrong on so many levels that it’s not surprising the administration is having a hard time justifying its actions.
But there is a glimmer of a good question inside of this whole affair: Who controls the technology that the government depends on, and how do we ensure adequate protection of the public’s interests when that technology is fundamentally opaque?
This has been one of the most important governance questions of the last decade and will only become more important in the next one. We need to start building the institutions to answer it now, before we wake up one day and find ourselves with a government so reliant on private firms that it is barely a “public” sector at all.
Gabe Menchaca is a Senior Policy Analyst at the Niskanen Center and, among many other things, is a former management staffer at the Office of Management and Budget and former management consultant. At Niskanen, he writes about civil service reform, the state capacity crisis, and other government management issues.