
How Incentives Lead Us Astray in Academia


The Teaching Trap

I did it again this week: I tried to teach students. Yes, it's my job, and I love it. But that is completely my own fault. If I followed the incentives I encounter at the academic institution where I work, it would be far better not to spend time on teaching at all. What counts most heavily for my career in academia is how many publications I can place in top journals. For some, this is even the only thing that counts: their promotion depends solely on the number of publications. Last week, on the train home, I overheard a young researcher from our university's medical school say to a colleague: "I would be a sucker to spend time on teaching!"

I remember what I did when I was their age. I worked at another university, in an era when excellent publications were not yet counted by journal impact factors. My dissertation supervisor asked me to teach a Sociology 101 class, and I spent all of my time on it. I loved it. I developed fun class assignments with creative methods. I gave students weekly writing assignments and scribbled extensive comments in the margins of their essays. Students learned, and the essays they wrote at the end of the course were much better than those at the beginning.

A few years later things started to change. We were told to 'extensify' teaching: spend less time on it as teachers while keeping the students as busy as ever. I developed checklists for students ('Does my essay have a title?' – 'Is the reference list in alphabetical order and complete?') and codes to grade essays with, ranging from 'A. This sentence is not clear' to 'Z. Remember the difference between substance and significance: a p-value only tells you something about statistical significance, not necessarily about the size of the effect'. It was efficient for me – grading was much faster with the codes – and it kept students busy – they could figure out for themselves where to improve their work. It was less engaging for students, though, and they progressed less than they used to. The extensification was required because the department spent too much time on teaching relative to the compensation it received from the university. I realized then that the department and my university earn money from teaching: for every student who passes a course, the department earns money from the university, because for every student who graduates, the university earns money from the Ministry of Education.
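Grading code 'Z' deserves a small demonstration. The simulation below is my own sketch, not course material, and all numbers in it are made up: with a large enough sample, a practically negligible difference between two groups still comes out 'statistically significant'.

```python
# A minimal illustration (mine, not from the post) of grading code 'Z':
# with a large enough sample, a practically negligible effect is still
# "statistically significant". All numbers below are made up for the demo.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 100_000                                   # very large sample per group
a = rng.normal(loc=0.00, scale=1.0, size=n)
b = rng.normal(loc=0.02, scale=1.0, size=n)   # tiny true difference

res = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd

print(f"p = {res.pvalue:.2e}, Cohen's d = {cohens_d:.3f}")
# p clears the .05 bar easily, yet a Cohen's d of about 0.02 is far too
# small to matter in practice: significance is not substance.
```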

This incentive structure is still in place, and it is completely destroying the quality of teaching and the value of a university diploma. As a professor, I could save a lot of time by simply letting students pass the courses I teach without trying to have them learn anything: by not giving them feedback on their essays, by not having them write essays at all, by not offering a retake after a failed exam, or even by marking their exams as 'passed' without reading what they wrote.

[Image: 'Allemaal een tien' – everyone gets a perfect ten]

My awareness that incentives lead us astray has grown ever since the 'extensify' movement dawned. The latest illustration came earlier this academic year, when I talked to a group of people interested in doing dissertation work as external PhD candidates. The university earns a premium from the Ministry of Education for each PhD dissertation that is defended successfully. Back in the old days, well before I entered academia, a dissertation was an eloquent monograph. By the time I graduated, the dissertation had become a set of four connected articles framed by a literature review and a concluding discussion chapter. Today, the dissertation is a compilation of three articles, one of which may be a literature review. Diploma inflation has worked its way up to the PhD level. The minimum quality required of dissertations has also declined. The procedures in place to check whether the research of external PhD candidates meets minimum standards are weak. And why should they be strong, if stringent criteria lower the universities' profit?

The Rat Race in Research

Academic careers are evaluated and shaped primarily by the number of publications a researcher produces, the impact factors of the journals in which they appear, and the number of citations they attract from other researchers. At higher ranks, the size and prestige of research grants start to count as well. The dominance of output evaluations not only works against the attention paid to teaching; it also has perverse effects on research itself. The goal of research these days is not so much to get closer to the truth as to get published as frequently as possible in the most prestigious journals. It is a classic example of substantive rationality being replaced by instrumental rationality, an inversion of means and ends: the instrument becomes a goal in itself.[1] At some universities, researchers can earn a salary bonus for each publication in a 'top journal'. This leads to opportunistic behavior: salami tactics (slicing the same research project as thinly as possible into many publications), self-plagiarism (publishing the same or virtually the same research in different journals), self-citation, and even outright data fabrication.

What about the self-correcting power of science? Will reviewers not weed out the bad apples? Clearly not. The number of retractions in academic journals is increasing, and not because reviewers are catching more cheaters. It is because colleagues and other bystanders witness misbehavior and worry about the reputation of science, or because they personally feel cheated or exploited. The recent high-profile cases of academic misbehavior, as well as the growing number of retractions, show how surprisingly easy it is to engage in sloppy science. Because incentives lead us astray, it ultimately comes down to our self-discipline and moral standards.

As an author of academic research articles, I have rarely encountered reviewers who doubted the validity of my analyses. Never have I encountered reviewers who asked for a more elaborate explanation of the procedures used or who wanted to see the data themselves. Only once did I receive such a request: a graduate student at another university asked me for the dataset and the code I used in an article. I feel good about having been able to provide the original data and code, even though they sat on a computer I had not used for three years and were stored in a format belonging to software that has been updated seven times since. But why have I not received such requests on other occasions?

As a reviewer, I recently tried to replicate the analyses of a publicly available dataset reported in a paper. It was the first time I ever went to the trouble of locating the data, interpreting the manuscript's description of the data handling, and rerunning the analyses. I arrived at different estimates and discovered several omissions and other mistakes in the analyses. Usually it is not even possible to replicate results, because the underlying data are not publicly available. But they should be. Secret data are not permissible.[2] Next time I review an article, I might ask the authors to 'show, don't tell'.

As an author, I have experienced how easy and tempting it is to engage in p-hacking: "exploiting (perhaps unconsciously) researcher degrees-of-freedom until p < .05".[3] It is not really difficult to publish a paper with a fun finding from an experiment that was initially designed to test a hypothesis predicting a different finding.[4] The hypothesis was not confirmed, and that result was less appealing than the fun finding. I adapted the title of the paper to reflect the fun finding, and people loved it.
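How strongly do researcher degrees-of-freedom distort results? A minimal simulation of my own (a sketch, not from Simonsohn's note) makes the point: measure five outcomes on pure-noise data and report whichever one reaches p < .05, and the false positive rate more than quadruples.

```python
# A minimal sketch (mine, not from Simonsohn's note) of one researcher
# degree-of-freedom: measure five outcomes on pure-noise data and report
# whichever reaches p < .05. Every "finding" here is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2013)
n_studies, n_per_group, n_outcomes = 10_000, 30, 5

false_positives = 0
for _ in range(n_studies):
    # Two groups with no true difference on any outcome.
    treatment = rng.normal(size=(n_per_group, n_outcomes))
    control = rng.normal(size=(n_per_group, n_outcomes))
    pvals = [stats.ttest_ind(treatment[:, k], control[:, k]).pvalue
             for k in range(n_outcomes)]
    if min(pvals) < 0.05:          # report the most "fun" finding
        false_positives += 1

print(f"nominal alpha: 0.05, realized: {false_positives / n_studies:.2f}")
# With five independent outcomes to choose from, roughly
# 1 - 0.95**5, or about 0.23, of null studies yield a publishable p < .05.
```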

The temptation to report fun findings and suppress rejections is reinforced by the behavior of reviewers and journal editors. On multiple occasions I have encountered reviewers who did not like my findings when they rejected hypotheses – usually hypotheses the reviewers had promoted in their own previous research. A surprising new finding, once published, is rarely followed into print by a null finding. Still, I try to publish null findings, and increasingly so.[5] It may take a few years, and the article may end up in a B-journal.[6] But persistence pays off. Recently a colleague took the lead on an article in which we replicate one such null finding using five different datasets.

In criminology, it is considered a trivial fact that crime increases with its profitability and decreases with the risk of detection. Academic misbehavior is like crime: the more profitable it is and the lower the risk of getting caught, the more attractive it becomes. On both counts, the current incentives are strong. There must be an iceberg of academic misbehavior out there. Shall we crack it below the waterline, or let it hit a cruise ship full of tourists?
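That criminological logic can be captured in a toy expected-value calculation, in the spirit of Becker's economics of crime. The sketch below is my own, with purely illustrative numbers.

```python
# A toy expected-value sketch (mine, with illustrative numbers) of the
# rational-choice view of misconduct: cheating pays whenever the expected
# gain outweighs the expected cost of getting caught.
def expected_payoff(gain: float, p_caught: float, penalty: float) -> float:
    """Expected payoff of misconduct in a simple Becker-style model."""
    return (1 - p_caught) * gain - p_caught * penalty

# A publication bonus, a small detection risk, a career-sized penalty:
print(expected_payoff(gain=5_000, p_caught=0.01, penalty=100_000))  # 3950.0
# Raising the detection risk by a factor of ten flips the sign:
print(expected_payoff(gain=5_000, p_caught=0.10, penalty=100_000))  # -5500.0
```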


[1] This was Max Weber's criticism of capitalism in The Protestant Ethic and the Spirit of Capitalism (1905).

[2] As Graham Greene wrote in Our Man in Havana: “With a secret remedy you don’t have to print the formula. And there is something about a secret which makes people believe… perhaps a relic of magic.”

[3] The description is from Uri Simonsohn, http://opim.wharton.upenn.edu/~uws/SPSP/post.pdf

[4] The title of the paper is ‘George Gives to Geology Jane: The Name Letter Effect and Incidental Similarity Cues in Fundraising’. It appeared in the International Journal of Nonprofit and Voluntary Sector Marketing, 15 (2): 172-180.

[5] On average, 55% of the coefficients reported in my own publications are not significant. The figure increased from 46% in 2005 to 63% in 2011.

[6] It took six years before the paper ‘Trust and Volunteering: Selection or Causation? Evidence from a Four Year Panel Study’ was eventually published in Political Behavior (32 (2): 225-247), after initial rejections at the American Political Science Review and the American Sociological Review.



Frequently Unanswered Questions (FUQ)

Dear journalists, before we embark on a journey through all-too-familiar landscapes, please read this.

Q (Question) 1. Mr. Bekkers, you study ‘giving to charities’. How do you know whether a donation to a charity is well spent?

  • U (Unanswer) 1. Well, I don't, actually. My research is indeed about giving to charities; I do not study how charities spend the funds they raise. I can tell you that donors say they care about how charities spend their money. In fact, this is often an excuse: people who complain about the inefficiency of charities are typically those who would never donate money in a million years, regardless of any evidence showing that donations are spent efficiently.

Q2. Mr. Bekkers, what is the reason why people give to charity?

  • U2. There is not one reason; there are eight different types of reasons, also called 'mechanisms': buttons you can push to create more giving. You can read more about them here. You said you wanted fewer reasons? Well, I can give you a list of four: egoism, altruism, collectivism, and principlism. Oh no, there are only three types of reasons: emotions, cognitions, and things we are not aware of. Wait, there are only two reasons: truly altruistic reasons and disguised egoism.

Q3. Speaking of altruism, isn’t all seemingly altruistic behavior in the end somewhat egoistic?

  • U3. Yes, you're probably right. I would say about 95% of all giving (just a ballpark figure) is motivated by non-altruistic concerns: being asked, knowing someone who suffered from a problem, knowing someone who benefited, benefiting oneself, getting tax breaks and deductions, social pressure to comply with requests for donations, feeling good about giving, having an impact on others, feeling in power, paternalism, having found a cookie or something else that cheered you up, or letting the wife decide about charities to keep her busy and save the marriage.

Q4. Sorry, what I meant to ask is this: does true altruism exist at all?

  • U4. No, probably not, but we don't know. Nobody has ever come up with a convincing experiment that rules out all non-altruistic motives for giving. Many have tried; all have been unsuccessful. It is hard to eliminate all emotions, cognitions, and awareness on the donor's part of the consequences of the donation.

Q5. I mean, isn’t all giving in the end also about helping ourselves, like when you’re feeling good about giving?

  • U5. That could be right: we can't rule out the 'warm glow' without blowing out the candle. But if you were only interested in feeling good, a chocolate bar would be a lot cheaper.

Q6. Why do people volunteer?

  • A1. See U2 above. In many respects, giving time is like giving money.

Q7. Are you a generous man yourself? What do you give to charity?

  • U6. I am not at liberty to answer this question.

Q8. How much do we give in the Netherlands?

  • A2. Read all about the numbers in our Giving in the Netherlands volume, published biennially. A summary in English is here. The latest estimates cover 2011; on April 23, 2015, we will publish new estimates covering 2013.

Q9. Is it true that the Dutch are a very generous population?

Q10. Is altruism part of human nature?

  • U8. I will answer this question with the only decent answer a scientist can ever give: "Well, it depends". In this case, it all depends on what you call 'altruism' (and 'human nature', of course). If you count spontaneous and repeated helping of humans and conspecifics in the absence of rewards as altruism, then chimpanzees are altruistic; if you count cooperating with other males to maintain mating access to single females as altruism, then bottlenose dolphins are altruistic; and if you count promoting the survival chances of your genes as altruism, then even maize plants can be altruistic.

Hat tips to Roel van Geene and Melissa Brown.

Update: 16 July 2014

