Back in 2015, Daniel Kaufman, a philosopher and the editor of the online magazine the Electric Agora, had me on his show to talk about the social sciences. He wanted to ask the kind of questions that a philosopher would ask of a social scientist, that is, questions about the premises and conceptual underpinnings of the field.
When Dan and I had this conversation, the discipline of psychology, and behavioral psychology in particular, was in the midst of a well-publicized crisis. A study had been published showing that the results of many recent experiments in psychology, even those published in top-tier journals, could not be replicated. This would be a problem for any field that regards itself as a science. If you can replicate the conditions of an experiment, the results of the initial experiment and the replication should remain basically constant. If the results of the replication deviate significantly from the initial study, the validity of that study may be thrown into doubt. And if enough studies can’t be replicated, the validity of the entire field may be thrown into doubt.
There were worries at the time that our contemporary understanding of human psychology was undergirded by faulty premises. And this worry spread out to other disciplines in the social sciences. As I explain below, there are relatively few incentives for researchers to spend their time and energy replicating others’ research. But it’s also true that studying human behavior, whether through the lens of psychology or economics, is in many ways a messier business than studying the behavior of atoms or chemicals. You can make fairly accurate predictions about how people in the aggregate will behave under some set of market conditions, but it’s anybody’s guess how any one individual will act.
In the excerpt below, the replication crisis in psychology leads Dan and me to discuss what distinguishes the physical sciences from the social sciences. As Dan says, it’s not reasonable to expect social science research to produce findings with the same kind of accuracy as research in the physical sciences. But then what are our expectations of the social sciences? And how can we best implement their findings?
DANIEL KAUFMAN: I'm interested in your opinion. There's been a recent brouhaha over the fact that it turns out that a very large number of experiments in psychology could not be replicated. And this has caused quite a stir and has caused people, at least in the popular public conversation, to start wondering aloud about just how good a science psychology is and how good the theories are. And I guess I'm interested in your opinion on this issue as it applies to psychology, just because it's been so discussed in the last few weeks since the study came out. But I was also going to ask you whether there are similar problems in economics, or whether economics experiments also resist replication.
GLENN LOURY: So, I don't know the answer to the second question, just as a matter of fact. To what extent have people tried and failed to replicate the results that are found in—and as I said, the experiments in economics are to some degree artificial. You put people in a laboratory. Experimental economics is a field that consists of studies in which people try to find [something] out by varying certain kinds of conditions.
I'll give one example. This is about, would I be more willing to support the provision of public goods, which cost me something but which benefit everybody? You know, you've got this classic problem, the classic public goods problem, which is if I make an investment whose benefits are broadly available to everybody, generally I'm gonna make a socially too-small investment in that activity, because I don't internalize the benefits that other people are getting. So I'll under-invest.
So one argument for the state, for coercive centralized public action, taxes in order to pay for the national defense—which no one would be willing to pay for on their own—is that we need these public goods, but our incentives are all screwed up about providing them in an autonomous way. So we need to be, in effect, coerced into providing them for our own benefit. Are people more willing to go along with that coercion if they have a say in how the monies are being spent than if they don't?
We can create a laboratory environment—I'm just giving you an extended example—in which we have a kind of game where people get to decide how much they want to contribute and the benefits are available to everybody. Then there's a precondition where either they get to discuss and vote on the protocol or they don't get to vote on it. And then by comparing how people provide for the public good in the condition in which they have a say and in which they don't have a say, people want to draw a conclusion about the extent to which democracy, participatory democracy, promotes citizenship and public spiritedness.
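[The free-rider logic behind this kind of experiment can be made concrete with a small sketch of a linear public goods game, a standard design in experimental economics. The parameters below are illustrative, not drawn from any particular study.]

```python
# Linear public goods game: n players, each with an endowment of e tokens.
# Contributions are pooled, multiplied by m, and shared equally. With
# 1 < m < n, each token contributed returns only m/n < 1 to the contributor
# privately, but m > 1 to the group socially -- hence under-investment.
# Parameters (n=4, e=20, m=1.6) are purely illustrative.

def payoff(my_contribution, others_total, n=4, e=20, m=1.6):
    """One player's payoff in the linear public goods game."""
    pot = (my_contribution + others_total) * m
    return e - my_contribution + pot / n

n, e = 4, 20
others = (n - 1) * e  # suppose everyone else contributes fully

# Private incentive: free-riding beats contributing, even when all others give.
print(payoff(0, others))  # → 44.0 (free-rider's payoff)
print(payoff(e, others))  # → 32.0 (full contributor's payoff)

# Social outcome: total welfare is higher when everyone contributes anyway.
print(n * payoff(e, others))  # → 128.0 (all contribute)
print(n * payoff(0, 0))       # → 80.0  (no one contributes)
```

The gap between the individually optimal choice (contribute nothing) and the socially optimal one (contribute everything) is the tension the laboratory protocol is designed to probe.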
Now, that's a large, large step from a laboratory experiment. So what I'm trying to say is, it may be that in the journal, when you send the paper in, the editor says that we'll publish this paper, and then someone sees it and they say, “Aha, I want to replicate this experiment.” Of course, the problem is going to be, it was the American Economic Review where the original paper is published, but the replication isn't fit for the American Economic Review. It's going to be in a second-tier journal run by somebody at a second-tier [university], you know, not at Princeton or Stanford or MIT, but you know, whatever. And so there's really very little incentive for the most talented people in the profession to engage in that kind of activity.
But I actually don't know the answer to the question. Is there a replication crisis in economics comparable to the one that has been exposed in psychology? I'm not an empiricist. I just want to tell you, I don't do experiments. I'm a theoretician. You could replicate my theorems just by checking, you know, that the proof is correct. But I generally think that the major journals now are requiring people who do large-scale empirical investigations to make their data publicly available, so that another investigator—it's probably going to be a graduate student, it's probably going to come up in some advanced graduate class where the instructor says, “Go get that data set and see if you can't replicate this guy's findings.” So then you have to get down in the bowels of the data, you have to deal with all the mechanics of how you organize and code it, and then you have to do the statistical analysis to see whether the same outcomes come out.
Now, that kind of thing is being done quite a bit in economics. And famously, Steven Levitt, for example, a well-known economist, author of Freakonomics, a first-rate economist, a winner of the John Bates Clark Medal in economics some years ago, which honors the best economist under the age of 40, and so on. This is Steven Levitt, University of Chicago. He's a major player. But he's had to walk back a couple of results in some of his papers where a graduate student at Berkeley has gotten the data, gone through it, and found out, “Aha, you made a mistake in coding some variable with a one instead of a zero. And when you do it the right way, the result collapses at the first stage of his two-stage least squares. And now, you know, blah, blah, blah.” And [Levitt]'s had to say, “Oops.”
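[The kind of coding error described here can be sketched with simulated data—hypothetical numbers, not anything from Levitt's actual work. Mis-recording some entries of a 0/1 variable acts like measurement error and attenuates the estimated first-stage relationship toward zero, which is how a result can "collapse at the first stage."]

```python
import random

random.seed(0)

# Simulate a binary instrument z and a variable x that truly depends on it
# with coefficient 2.0 (an arbitrary, illustrative value).
n = 1000
z = [random.randint(0, 1) for _ in range(n)]
x = [2.0 * zi + random.gauss(0, 1) for zi in z]

def slope(z, x):
    """OLS slope of x on z."""
    mz = sum(z) / len(z)
    mx = sum(x) / len(x)
    cov = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    var = sum((zi - mz) ** 2 for zi in z)
    return cov / var

# Miscode 40% of the instrument at random (ones recorded as zeros and
# vice versa), then re-estimate the first-stage slope.
z_bad = [1 - zi if random.random() < 0.4 else zi for zi in z]

print(slope(z, x))      # close to the true effect, 2.0
print(slope(z_bad, x))  # attenuated toward zero by the coding error
```

With the data coded correctly the estimate recovers the true relationship; with the miscoded variable it shrinks sharply, which is exactly the kind of discrepancy a replicator digging through the raw data can catch.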
So that kind of thing is happening with empirical investigations where the data can be made publicly available. Since experimental investigation isn't as critical in economics, I don't know that the replication of experiment question has come up. But these lab experiments that the experimental economists are doing with the little programs on computer screens are very easy to replicate. Anybody can get the protocol and get 20 or 40 undergraduates in a room and see if the things come out the same way.
And do we know whether, in that small arena of economics that is experimental, how successful replication has been? Or has it been sort of a fiasco like in psychology?
My impression is “pretty successful.” My impression, I stand to be corrected here if anybody in the audience knows better than I do about this, is that we don't really hear about these results unless they get replicated. I mean, they don't become canonical. They don't gain the status of having a purchase on the profession's attention. Because the replication of these laboratory experiments is pretty easy to do.
I see. Do you have any feelings about what happened in [psychology]? I guess what I'm wondering is whether, in general, we would expect that social science would have a poorer rate of replicability than physical science. That we would expect that as a matter of course, even if only because it's much more difficult to simulate the original conditions when the palette one is working on is a social palette, as opposed to just space or a few substances.
In other words, are we being a bit unfair? Is this jumping on psychology a bit unfair, because we should expect that of course social science experiments are going to be less replicable than physical science experiments. They're going to have a poorer rate of replicability.
I think that's right, for the reasons that you just stated. It's going to be harder to exactly recreate the conditions of the initial investigation. That's one reason why I say that, with respect to the laboratory experimentation the experimental economists do, it's relatively easy to replicate the computer program that everybody sat and looked at. But it might be pretty hard [with] some psychological experiments, and I'm not an expert in this field, to get that replication just right.
Also, to the extent that there's noise, I mean, I find the result, but there's always variance, right? There's always an error. Results can happen by chance. If I flip a coin long enough, I'm going to get ten straight heads. That doesn't mean that the coin is weighted in favor of heads. There's a chance that the original finding was not a faithful reflection of the underlying structure but just a noisy observation of that structure. Hence, when I repeat the experiment, if there's a lot of noise in the inferential process, I might not get the same result, just because the first result was what it was by chance and not by fidelity to the true structure.
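[The point about chance findings can be illustrated with a quick simulation, not tied to any real study: even when there is no true effect at all, a predictable fraction of "experiments" clear the conventional significance bar by luck, and those lucky findings are exactly the ones unlikely to replicate.]

```python
import random

random.seed(1)

def experiment(n=30):
    """Mean of n draws from a population whose true mean is zero."""
    return sum(random.gauss(0, 1) for _ in range(n)) / n

# A sample mean is "significant" if it exceeds roughly 1.96 standard
# errors -- the usual 5% bar.
threshold = 1.96 / 30 ** 0.5

# Run many studies of a nonexistent effect.
results = [experiment() for _ in range(2000)]
false_positives = [r for r in results if abs(r) > threshold]
print(len(false_positives) / len(results))  # ≈ 0.05: chance alone "finds" effects

# Now replicate only the "significant" studies: almost none hold up,
# because there was never anything there to find.
replications = [experiment() for _ in false_positives]
replicated = sum(abs(r) > threshold for r in replications)
print(replicated, "of", len(false_positives), "replicated")
```

About one study in twenty clears the bar by chance, and on replication only about one in twenty of those clears it again, which is why a single unreplicated finding is such weak evidence.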
I think we are expecting a bit too much from the social sciences. I think that, because we don't think enough about and talk enough about the difference between what the social sciences do and what the physical sciences do, we expect of the social sciences the same kind of results we expect from the physical sciences, which is unfair. But also, I would say, we are, in a sense, through our policies, trusting the social sciences maybe a little bit too much, don't you think?
We aren't patient enough, it seems to me, Daniel. A single finding is not a conclusion. And so keep your powder dry. So there's that study, but wait a minute, it hasn't been replicated. Let's wait. Let's see if we do it in a different population, if we do it on the other side of the Atlantic or the equator. Let's see. It might take five years, not five months, before we have a sense of what the answer is here. Interesting, provocative, but let's keep our powder dry. But of course, the public communications mechanisms are not friendly to patience.
And I guess policy makers are pushing very hard to get so-called answers so that they can go ahead and fix problems for which there is tremendous political pressure being put upon them to fix. But I sometimes fear or worry that we're employing all sorts of remedies based on science that we're treating as if this was like chemistry. But we really don't know very well at all what's the cause of some of these phenomena.
What's an example of the fear that you have there?
Some of the remedies that we bring to bear, for example, for all sorts of negative social behavior. Theories of how we deal with kids who are unruly in the classroom. I am disturbed by the fact that so many of my undergraduates have been medicated for much of their teen years. People who, just looking at them, I would think would never have been medicated—it would never even have occurred to people to medicate them. When I was a kid in school, you know, back in the '70s or whatever, it almost seems to me like ... You know, theories of punishment, the area that you're interested in. Locking people up for long periods of time. Do we have any reason to think that this actually solves problems? I think we have many reasons to think that it doesn't, when we look at other countries that don't incarcerate the way we do, as you know better than I do.
So I guess I think that we're bringing a lot of social science to bear on some very important social problems that we have but placing too much confidence in it and engaging in remedies which really have tremendous consequence. I mean, you medicate whole generations of young people, I don't think you know what kind of people you're going to wind up with when you're done with it. I wonder whether part of this is because we don't appreciate the extent to which the social sciences simply don't provide us with the kind of knowledge that physical science does. That we have to be a lot more careful with what we do with the results.
Yeah.
So that's my feeling about it at least.
I suggest the problem in the case of social science is that experiments with humans, which involve many uncontrollable and even unrecognized variables, are used to contradict social policies that were developed by civil society through trial and error over decades, centuries, or even millennia. To call this high risk (irresponsible high risk) is understating the problem. It is hubris writ large. The results have been dreadful for society. Yet the government treats social science dictums as gospel. I make the same claim about economics. Time to recognize that economists do not understand enough about the ramifications of their theories to be making policy. We are seeing this play out today, with inflation, perpetual and rapidly growing debt, and a Fed with no way out. And less than 15 years after the past policy disaster. Social sciences have done far more damage than good. Time to admit it and treat both as being at the primitive, experimental stage of science.
So the academy strains to promulgate immutable truths in the social sciences, a fool's errand!