We live in a revolutionary time of social data expansion, methodological innovation and active engagement between policymakers and researchers. As a result, we are better able to learn about social problems and the effects of policy than ever before. Yet this has left many dissatisfied: The promises of greater understanding, policy advancement and problem-solving associated with evidence-based policymaking have often gone unmet. Much of what we are learning is that large, lasting changes are rare, and that they are not produced by small policy tweaks.
Although there are important lessons to be learned from the limits of this revolution, we should resist the temptation to use them as an excuse to retreat to our theoretical or ideological assumptions. If we find that we cannot easily solve a social problem, that a promising policy solution does not pan out or that successes cannot easily be scaled up or replicated, that does not mean that your previously preferred social arrangement will work better or that a different revolution is more justified. It does not mean that dismissing scientific evaluation (or relaxing our standards of “evidence”) would make it easier to agree on sweeping policy changes. It just means that social change is hard, comes with trade-offs and is likely to require adaptation.
Social science is about failing better
Many of these debates are perennial: From the beginning, social science has sought to provide evidence for effective social rearrangement but has usually fallen far short of its ambitious practical goals. Yet real progress has been made. Data relevant to social problems and policymaking are now much more widely collected and available. The methods available for making causal inferences and generalizations, and for accumulating findings across research literatures, have made great strides in recent decades.
But these advances have often looked like failures. Past findings have regularly not held up to methodological scrutiny, and even well-done studies have failed to replicate. Findings from the lab and from observational data have not translated into effective real-world interventions that change outcomes. And human behavior has often changed more quickly, varied more across geographies and institutions, and been less consistent than we expected. These are not research failures, however: They are real facts about the world that we have relearned.
The advances were not limited to data and methods. Social scientists have become much more aware of how their unrepresentativeness — including in terms of nationality, race, gender, class, religion, partisanship and ideology — affects the questions they ask and the way they interpret results. The disciplines themselves have learned from each other, with economics rediscovering the importance of individual psychology while psychology discovers the role of political institutions and context.
Perhaps we have also learned some humility. In my surveys, social scientists have become more reluctant to make grand claims. They now believe that most findings do not apply across countries or time periods. Scholars have become interested in applying their work to policy issues, but they also recognize that doing so successfully is very difficult.
Disappointment does not vindicate ideology
“Evidence-based” policymaking has faced considerable constraints, especially when limited to randomized controlled trials. Even more than basic social science research, policy field experiments have not lived up to the hype. They have often found small or null effects. They have often failed to replicate, shown diminishing returns or generalized poorly. Even to get to the point of testing, interventions often face considerable pushback and must scale back their ambitions.
It is tempting to use these collective results to argue that the testing infrastructure and social scientific aspirations are to blame. We have not found enough evidence-backed interventions, so we must be looking in the wrong places. Yet tellingly, this same argument is used to support diametrically opposed political agendas. Conservatives say the problem is a progressive expectation that social engineering can work. We should learn, they say, that government should limit its scope and leave more to the private sector, individuals or other social institutions. Progressives, in contrast, say the problem is a timid tinkering around the edges; the search for causal evidence limits the potential for necessary social revolutions.
Neither the conservative nor the progressive conclusion comes close to following from the evidence, even though both are reacting to reasonable reviews of the record. There is certainly evidence to support the conservative contention that researchers have overwhelmingly tested liberal policy interventions. But that is because conservatives simply put forward fewer new policy ideas; tests of conservative interventions are no more successful, and conservatives tend to return to the same remedies. A rightward policy move is no less disruptive and chancy in its social implications than a leftward policy move; it still involves the same trade-offs and expectations of limited effectiveness, even if conservatives view it as closer to an imagined state of nature.
The critique from the left is also correct in pointing out that we often test relatively small policy interventions. But we pursue pilot projects for good reasons: Policy effects tend to be strongest under the most favorable circumstances, and interventions bring challenges of implementation and scaling up. Policy effectiveness research is hardly dominated by randomized controlled trials; there are plenty of attempts to combine qualitative and quantitative tools in evaluation, but they also find that even successful policies start small (in effects as well as application) and rely on localized and adaptable implementation. If we pursue a small intervention that does not succeed, we cannot assume that doing more, faster, would work better. It might well be the case that policy effects are weakened by broader social forces and institutions, but that hardly means we can easily change those larger contexts. It is not that researchers limit themselves to one kind of test; it is that accumulated evidence often does not point to obvious policy interventions even when large problems are identified.
Limited evidence of policy effectiveness might be a reason to prefer the status quo: If we do not know what will work, perhaps we should do nothing or move as slowly as we can. But our political systems already build in substantial status quo bias: Most policymaking efforts fail, and little or no change is always the most likely policy outcome. I do not think the lesson should be to do even less, any more than it should be that we really need a revolution on the left or right.
The politics of muddling through
I come to this debate as an academic who regularly works with policymakers and teaches faculty how to assist government officials. I usually put the disconnect rather bluntly: Policymakers want immediate, definitive answers about the size of problems and the effects of small changes, delivered by well-known but objective policy area experts. Scholars usually offer data-heavy presentations focused on their own recent findings, accompanied by a tacked-on list of unvetted policy implications. Each side comes to the partnership with different priorities and strengths. Researchers care about causal inference, impact and generalizability, while policymakers stress feasibility, constituency reaction, implementation, appearance and values. Researchers bring knowledge of policy variation, potential unintended effects and uncertainty. But policymakers understand what has been tried before, what is (and is not) under their control and how people will react to ideas during political debates and implementation.
These differences are often bridged by settling on interventions that cause little political resistance and give researchers the stable environment they need to focus their research. Consider, for example, the highly politically successful but substantively disappointing experience of “nudge” units in government. Economists and regulation scholars convinced global policymakers that they could apply social science tools with little downside and high impact, focusing on testing small policy interventions like informational guidance and changing technical default options. As in other areas of social science, the initial hoopla gave way to much more limited lessons. It turns out that small changes to promotional texts or application materials do not often produce radical social changes.
The effort should be understood in its political context. It was responsive to political constraints: Policymakers did want to make big changes despite their limited tools, and they did want to reduce counterarguments by preferring technical fixes over new requirements or more funding. Researchers were selling what policymakers were buying.
And policymakers buy the incremental approach for good reason. Policymaking involves compromise, and most successful lawmakers focus on what small steps can be achieved rather than on optimal policy. Any policy proponent learns that the easiest way to move policy forward is to scale it back, addressing any objections. When they do get their way, the victories may not last. The most common way for policy to fail is to produce a backlash before the policy can demonstrate success. Despite what activists believe, public and elite opinion most often move against the direction of policy: The more policy moves leftward, the bigger the backlash from the right (and vice versa).
There is thus little alternative to muddling through, incrementally and with setbacks. And it is better to do so with all available evidence than to suggest that the limited evidence of effectiveness for what we have tried so far justifies your own preferred options. It might just mean the world is harder to change than you thought.
Matt Grossmann is director of the Institute for Public Policy and Social Research, a professor of political science at Michigan State University, and a senior fellow at the Niskanen Center. He is author of the book “How Social Science Got Better: Overcoming Bias with More Evidence, Diversity, and Self-Reflection.” His podcast is “The Science of Politics.”