Episode Transcript
Carmelina Contarino: Hello and welcome back to The HPS Podcast where we discuss all things History, Philosophy and Social Studies of Science for a broad audience. I’m your host Carmelina Contarino and today we are joined by Professor Uljana Feest, philosopher of psychology and Chair of Philosophy of Social Science and Social Philosophy at the Leibniz University of Hannover. Uljana’s background covers psychology, philosophy, and the history and philosophy of science.
Today we will be talking about her work on the Replication Crisis as it relates to psychology. In her latest article, ‘What is the Replication Crisis a Crisis Of?’, Uljana questions the two dominant camps of metascience: on one hand, those calling for reforms in methods, such as statistical reform; on the other, those calling for more focus on building better theories. Uljana suggests the two camps could pay more attention to the subject matter of psychology – what we are trying to investigate and what comes out of our investigations – in order to better understand how the way we look at scientific practice can change our perception of the crisis.
Carmelina Contarino: Hi Uljana, welcome to the podcast.
Uljana Feest: Hi Carmelina, thanks so much for having me.
Carmelina Contarino: Firstly, can you tell us how you came to HPS?
Uljana Feest: I grew up in Germany and had a variety of interests, which were mostly in the arts. If you had asked me at age 12 what I wanted to be, I would have said I wanted to be an artist. Then a little later I became aware of social inequalities, racism, sexism, and the ways in which people suffer as a result of these issues, and then I wanted to be a social worker. Then somebody said, ‘why don't you study psychology instead?’ So that idea took hold in my mind, even though I only had a very hazy idea of what psychology actually is, which I think is probably the case for many people who study psychology.
So then, when I went to university, I picked psychology as my main subject, but I still didn't really know what it was. Even at the time, I was less interested in being a psychologist and more interested in understanding what kind of field psychology as a discipline was. Looking back, it's pretty clear that I was interested in understanding what psychology is, what its methods are, what its subject matter is, but also how it thinks about its subject matter on a more meta level. I also had a hunch that answering these questions would also require looking at the way in which psychology developed historically. So, I was always interested in HPS without knowing that this was a discipline that actually existed.
I studied psychology. I didn't get many answers to those questions, but I kept being interested in them. I also started taking philosophy classes. One of the professors was doing what is still called in the German system ‘theoretical philosophy’. This person didn't have a particular focus on philosophy of science, but he was very supportive of me. I became one of his student assistants for his introductory classes.
When I got my psychology degree I still didn't really know where I was going with any of this. I ended up working for two years as part of an interdisciplinary group that was looking at situated robotics from a science studies perspective. This was a group of six of us, all from different disciplines, and we interviewed people working in robotics. It became very clear that we had a hard time communicating with each other because we all had very different interests. By the end of those two years it had also become clear that the questions I was asking were actually philosophical ones, not sociological ones. At that point, I realised I should do a PhD in philosophy, and my advisor in philosophy agreed to take me on, but also said, ‘I think you should really go somewhere where they do philosophy of science’, and he recommended that I go to Pittsburgh. I applied for a fellowship, and I got it. It wasn't until I literally walked into the department that I realised that I was in the Department of History and Philosophy of Science. The philosophy department was across the hall, but I was in HPS. It was a revelation that I was in exactly the right place. Somehow my advisor had realised this, but I hadn't, because he had always talked about how wonderful Pittsburgh is and how great the philosophy there is, and I never understood that there were actually two different departments.
It was really at that point that I found my academic home and realised there was actually a name for the thing that I was hoping to do.
I stayed there for a number of years and ended up with a PhD in history and philosophy of science focusing specifically on operationalism or operationism, as I call it, in psychology. It had components from the history of psychology, but it also had philosophical components, which were more informed by the philosophy of experimentation.
Carmelina Contarino: In your latest article you discuss recent debates surrounding the replication crisis, can you tell us what the replication crisis was about?
Uljana Feest: So the replication crisis started with the recognition that there are many seemingly well-established findings, in psychology but also in other fields, that cannot be replicated. So, for example, there's a phenomenon called the priming effect, which basically means that if you're exposed to a particular stimulus, this influences your subsequent behaviour, even if you're not aware that this is what's influencing your behaviour. There is semantic priming, which is a phenomenon that's still pretty uncontested. There is also something called social priming, which has become much more contested.
For example, there's a famous study by John Bargh that supposedly showed that if you're exposed to words or stimuli that somehow have to do with old age, you subsequently walk more slowly. This seems like a pretty spectacular finding, and obviously, people like spectacular findings. For a while, this was cited as an interesting result from behavioural research, but it was really hard to replicate. This was only one of many, many such results, and it really caused the scientific community to stop and start thinking: what are we doing here? What if all these things that we thought were correct don't even replicate? Then they are much less certain than we thought they were.
Once psychologists became aware of this issue, a famous study was conducted that ran replications of 100 experiments and found that only about 35 of them achieved the same result as they had the first time around. Of course, this raises a number of questions, and in particular the question of whether there was something shoddy going on in the initial study. If you can't replicate a result, one conclusion you could draw is that maybe it's not a very stable effect; another is that it was presented as if it were a stable effect, but clearly things are not that easy.
This then led to widespread soul searching amongst psychologists about what had gone wrong. Early on, it became clear that one of the problems was that many psychologists routinely engage in what has become known as questionable research practices. Questionable research practices are, as the name suggests, research practices that maybe you shouldn't engage in, but that happen, and maybe happen more frequently than they should.
Once it became apparent that people were doing this, a lot of the reform efforts focused on trying to make sure that they weren't, that is, on addressing methodological issues at the level of statistical analysis.
Carmelina Contarino: Thanks Uljana, you go on to discuss reform approaches that focus on either methodological transformation or theory building. Can you tell us about those?
Uljana Feest: When I first heard about the replication crisis, I was also really intrigued. But since my own interest isn't in philosophy of statistics, I was more interested in what researchers did when they redid their experiments, and in what kind of reasoning goes into designing experiments to begin with. As a philosopher and historian of psychology and experimentation, my primary interest is not in how data are analysed once we have them; rather, I was always intrigued by the question of how the data are produced in the first place.
One question that informed my research was, how do you do experiments or any kind of empirical research when you're dealing with a subject matter that cannot be very straightforwardly observed?
Let's assume you're a psychologist interested in memory. This is not an object that you can dissect or take and put under a microscope. It's something very evasive or elusive. So, the question that I was interested in was, how do you do research on such objects that are not very tangible? My thesis is that a psychologist wanting to study an object such as memory has to put in a lot of prior conceptual work in order to even design an experiment.
To get back to the question about methodological reforms that have been debated in the last 10 or 15 years: a lot of them, especially initially, focused almost exclusively on this whole issue of statistical analyses, on making sure certain kinds of questionable research practices aren't employed. Subsequently, people started to think more about theory, about building theories, because there aren't a lot of theories in psychology. But the way I see it, there's a piece missing in the middle: the issue of what psychological theories are actually about.
So let me just say, coming back to your question about my recent paper, the problem as I see it in the replication crisis is neither exclusively one of questionable research practices with regard to statistical methods, for example, nor is it exclusively about a lack of theory. It's about paying insufficient attention to the question of how we conceptualise the very objects of psychology. What I'm doing in this paper that you mentioned is to say that recent debates about whether there is really a crisis and different assessments of how bad the crisis really is, can be traced back to different construals of the subject matter of psychology.
Carmelina Contarino: In your article, you discuss the difference between effects and complexity. How do they relate to these two different modes of investigating or reacting to the crisis?
Uljana Feest: What I'm trying to say in the article is that, depending on your general understanding of the subject matter and of what you're trying to do when you do psychological research, you're going to come to different assessments of how difficult it's going to be to replicate a result. If you think of the psychological subject matter as incredibly complex and context sensitive, and you want to study the subject matter in its complexity, you won't be surprised that results can be hard to replicate. You might expect that small differences in the experiment might have unexpected consequences. On the other hand, if you think it's the job of the experimenter to isolate specific stable effects, and you take pride in doing that well, then it is pretty disturbing if those effects turn out to be hard to replicate. So those are two ways of looking at the subject matter. Do I think it's my job to find these stable effects? Or do I think it's my job to understand this very complex subject matter in more detail? In a way I feel that this is a false dichotomy, but to argue this, I think we need to get a more nuanced picture of what actually happens in psychological experiments.
One question is, what do we take to be the actual object of research in a given study? I can think of the object of research as a particular effect. I can say, ‘I'm interested in whether there is an elderly social priming effect. I'm trying to show that it exists in an experiment. Then I have evidence it exists in the real world.’ I can also say, ‘No, I’m not really interested in this particular effect. This is just an example. This is just a particular data point for something more general that I'm interested in.’ So, I might describe my research object not as elderly priming, but rather as social stereotyping. Social stereotyping then, of course, is something much more complex, much more general. It's not this particular effect that we're interested in, it is this kind of broader object that we're interested in.
I think we also need to distinguish between at least two meanings of the word ‘effect’. When psychologists run experiments and produce an experimental effect, that effect, I would argue, doesn't constitute the object of research; it's actually produced to generate evidence. So, we need to distinguish between the effect as the object we're interested in and the effect that we're generating as evidence. Often the effect that we're producing in the experiment has very little to do with the effect that we're trying to gain evidence for. Merely asking whether this effect can be replicated, I think, detracts from the more interesting question: why did they do this funny experiment in the first place?
Carmelina Contarino: Is part of the problem that we're looking to make sure that the generalisability of the phenomena that we're testing is A-OK and ready to go, and everything must be able to be extrapolated across populations without taking into account the actual complexity of the various contextual situations?
Uljana Feest: Yes, thank you. I think that's a very good way of putting it. I mean, that is how I've been thinking about it, certainly. After the replication crisis happened there was a lot of discussion. In the meantime, a bunch of other crises have been declared and one of them is the crisis of generalisation. Tal Yarkoni coined that term. Other people have been talking about the crisis of validation or validity. So, there are all these other problems.
One way to think about the replication crisis that several people have been pushing is to say, when we do an experiment not only is the object very elusive, but part of what's elusive about the object is that we don't know how it behaves under various different circumstances, how it behaves in different populations. In my mind, that is part of the subject matter of psychology. We have to be prepared to assume that there probably are stable effects, but there are also many things that really are highly context sensitive. That's not necessarily a bug. That's not just a potential confounder. It's also an interesting fact about the subject matter.
So, yes, I think part of the problem is that we expect that when we do an experiment, the result should be replicable under various circumstances. When it isn't, rather than thinking, ‘Oh, that's interesting, maybe we've hit on something significant here?’, we worry about psychologists having made mistakes. I think there are many problems with research practice in psychology, but I'd like to steer attention towards these issues that you were just mentioning.
Carmelina Contarino: Given that there is this tendency to see failed replications as a problem with the initial study, rather than looking at reasons why, do you feel that we should be investigating those marginal questions a little more in terms of creating more robust theories?
Uljana Feest: First of all, I don't want to be misunderstood as saying that it's all context sensitive, that there are all these kinds of marginal things that affect how people behave. Because, in the end, we want the evidence to be robust and replicable. But when it's not, that doesn't automatically give us insight, but it can potentially give us insight into specific ways in which the complexity and context sensitivity of the psychological object play out. I do think that when we think about that, we are engaging in a form of theory building, but I think we need a notion of theory that can capture that.
In philosophy of science, for a long time, there was this very idealised fiction of, ‘theory exists in this one realm, and then there is the messy reality on the other side, and what connects them are the data, which are somehow neutral.’ I don't think psychologists typically start out with a well-formed theory. So, in that respect, I would agree with the people who have emphasised theory building. But I think we need to think more closely about what we mean by theory building. I don't think we ought to be sitting in our study building a theory and then, when it's done, going to the lab to test it. Rather, I think we should ask: what are the objects that we're interested in? If my object is memory, or if my object is emotion, or another one that I've become interested in recently, personality, then we want to think about those objects in a theoretical manner, but also in a manner that is empirically informed. This is why I bring in concepts as the missing link.
Most psychologists, when they work on something, start out not with a well-formed theory, but they also don't start out with a blank slate, obviously. They start out with some kind of conception of what they're studying, and that concept then informs the kinds of experiments they do. Of course, that's no guarantee that the experiment is good, because it might turn out that the concept isn't adequate. It might turn out that certain phenomena that we associate with memory, for example, are, in the light of later research, better explained by something else. So, I don't want to reify these concepts. I don't want to say, ‘you start out with this concept and therefore this thing exists.’ But I do think you can't do experiments on any object of psychological research without having a concept, reflecting on this concept, and thinking about how you would operationally define it, which is part of what psychologists need to do when they do experiments. We can talk more about operational definitions because it's also a contested concept in itself. But yes, I think theory building is really important.
I just think we should move beyond the fiction of the theory being the thing that's already there, which we then test. Rather, I'm really interested in the growing understanding of the subject matter, which includes a descriptive understanding of the subject matter. I think ‘theory’ is often used synonymously with ‘explanation’. In psychology, we need a notion of theory that's not just explanatory theory, because part of what we're doing is describing the object.
Carmelina Contarino: Going back to what you said about operationalising the objects of study, how do we do that?
Uljana Feest: Yes, thank you for the question. This is something that I've tried to think about for the last 25 years. First, let me make a terminological distinction, and I know that I haven't always been entirely precise about this. In general, I would distinguish between operationally defining a concept and operationalising a research question. So, if I have a particular hypothesis, let's say that muscular contraction has an impact on emotional experience, then I have to design an experiment that involves operationalising all the various aspects of this. For example, what kind of stimuli do I give? How do I control for confounders? How do I measure the result of a particular manipulation on the subject? So that's the very broad sense of operationalising.
More specifically, if I wanted to measure the impact of a manipulation on a particular object, such as emotion: I do this manipulation, and now I want to see whether people who had the manipulation experience more emotion than the ones who didn't. Then I need a way to measure emotion. In order to do so, I have to, quote unquote, ‘operationally define’ emotion for the purposes of this study. I have to say: I have a particular test here, for example, putting funny cartoons in front of people and recording their ratings of how funny they are on a scale from one to seven, or something like that. I am going to treat this way of measuring emotion as an operational definition of emotion. Only if this is adequate can I even consider the data that I'm getting as evidence for my hypothesis, which was about the impact of the muscle movement on the emotion.
Of course, with this account of operational definition, I'm really departing from the traditional way of thinking about operational definitions. I should say this here because I'm sure that there are people listening to this who will say, ‘well, wait a second, this is not what operational definition means. Operational definition [traditionally] means defining a concept in terms of necessary and sufficient conditions given by a particular operation, which means that there is nothing more to the meaning of this concept than what's given in this operation.’ So, this is how operational definitions have traditionally been understood, and I should say that there are reasons for understanding it that way. But my general take is that, when you look at what researchers are actually doing, it would seem bizarre to attribute to them the belief that all we mean by emotion is how somebody responds to a funny cartoon. I just don't think we are doing justice to psychological research practices if we take them to be saying that.
My general approach in this entire project, which is much broader than the replication crisis as you can tell, has always been to say, I think that there is something more interesting in the practice of operationally defining concepts than what the philosophical caricature has traditionally assumed.
Carmelina Contarino: What can scientists in other disciplines draw from this?
Uljana Feest: That's a great question, and I don't feel particularly competent to answer it as I don't know other sciences that well. I know that you're interested in exploratory experimentation. One person who has done a lot of work on this is the historian of physics, Friedrich Steinle, and historians often say to me, ‘Oh, of course, that makes a lot of sense, psychology is a very young discipline. When physics was younger, or less developed, our concepts were less well defined. Also, during periods when there were conceptual shifts, physics found itself in a situation that was very similar to the one that you are describing for psychology, where the objects are not very well understood and there's a lot of conceptual openness and epistemic uncertainty. So, the things that you're describing about psychology maybe describe physics at an earlier time.’ I think that's an interesting answer. I mean, I think that's certainly one way in which there's a parallel.
I would like to think, however, that what I'm talking about is also relevant even maybe for more recent science. I think the issue of how you get from a scientific question to a valid and sound measurement that answers the question is one that doesn't just occur in the early stages of research and it doesn't just occur in psychology.
Carmelina Contarino: Thank you. So, for the general public, what do you think they can take from this discussion?
Uljana Feest: Especially in the current climate when there's so much scepticism about science, I think the replication crisis has added to the sense of ‘anything goes’, that scientists don't know what they're doing, they're all arguing with each other, that all these findings, you know, turn out to have been flawed. I do think that the replication crisis has drawn attention to problems in scientific practices. I think it has also drawn attention to problems in science reporting, to be honest. Maybe science is often portrayed as producing more permanent and certain results than it actually does, but I also don't like the science bashing, and in particular, I don't like the psychology bashing.
What I'd like the public to take away is, first, we're actually dealing with an incredibly interesting subject matter, and second, when you look more closely, a lot of the research is much more sophisticated than it gets credit for.
It's because often we just hear about these weird effects. Then, a year later we hear that that effect didn't replicate. We should have been more sceptical of the effect to begin with, and we should have had a more nuanced reporting about what people were trying to do, why they produced this effect. What was this effect meant to be evidence for? What are various ways in which this effect might be modified by variations in the environment or in the context?
But I think that what the general public should take away is - psychology is more interesting and also more sophisticated than might appear if you only listen to this or that effect not having been replicated.
Carmelina Contarino: I think there's certainly a lot more we could say about science communication, but I did want to talk to you very briefly about a book that you've just finished writing.
Uljana Feest: Yes, I'm super, super excited that I have finally finished writing a book that I worked on for a long time. The book is titled Operationism in Psychology: An Epistemology of Exploration. As somebody who had done psychology, I felt that the notion that operational definitions just try to reduce meaning didn't do justice to what psychologists were doing. I thought it was much more plausible that when they operationally define a concept, they are just setting some kind of standard. I looked at the work of early operationists in psychology as evidence for my hypothesis that they were not primarily interested in meaning; they were interested in the subject matter, and they brought a much broader conception of the subject matter than ‘I want to reduce it to particular operations and then apply this.’ The book also spells out the relevance of this insight to more recent debates within the philosophy of psychology and the philosophy of experimentation.
Carmelina Contarino: Fabulous, and when will that be released?
Uljana Feest: It will come out with the University of Chicago Press, and I'm told that it won't come out until next February.
Carmelina Contarino: Now, what's next for your research?
Uljana Feest: From the perspective of philosophers, psychology deals with cognition. But when you look at psychology departments, there's actually a lot of research that's not concerned with cognition, or that is concerned with cognition only as part of a broader interest. People might be interested in development. People might be interested, of course, in clinical phenomena. People might be interested in differential psychology, which is concerned not with general functions that we all share but with features with respect to which we differ. This is in fact something that I want to work on next. I've become really interested in personality. So, my next big project is going to be about the history and philosophy of personality research.
Carmelina Contarino: Thank you so much, Uljana. It has been great having you on the podcast and we look forward to talking to you again when your book is released next year.
Uljana Feest: Thank you so much. It's been a lot of fun.