The Teaching Trap
I did it again this week – I tried to teach students. Yes, it’s my job, and I love it. But that’s entirely my own fault. If I followed the incentives I encounter at the academic institution where I work, it would be far better not to spend time on teaching at all. For my career in academia, what counts most heavily is how many publications in top journals I produce. For some, this is even the only thing that counts: their promotion depends solely on the number of publications. Last week, going home on the train, I overheard a young researcher from the medical school of our university say to a colleague: “I would be a sucker to spend time on teaching!”
I remember what I did when I was their age. I worked at another university, in an era when excellent publications were not yet counted by the impact factors of journals. My dissertation supervisor asked me to teach a Sociology 101 class, and I spent all of my time on it. I loved it. I developed fun class assignments with creative methods. I gave weekly writing assignments to students and scribbled extensive comments in the margins of their essays. Students learned, and wrote much better essays at the end of the course than at the beginning.
A few years later things started to change. We were told to ‘extensify’ teaching: spend less time as teachers while keeping the students as busy as ever. I developed checklists for students (‘Does my essay have a title?’ – ‘Is the reference list complete and in alphabetical order?’) and codes to grade essays with, ranging from ‘A. This sentence is not clear’ to ‘Z. Remember the difference between substance and significance: a p-value only tells you something about statistical significance, not necessarily about the effect size’. It was efficient for me – grading was much faster using the codes – and it kept students busy – they could figure out for themselves where to improve their work. But it was less engaging for students, and they progressed less than they used to. The extensification was required because the department spent too much time on teaching relative to the compensation it received from the university. I realized then that the department and my university earn money with teaching: for every student who passes a course the department earns money from the university, because for every student who graduates the university earns money from the Ministry of Education.
This incentive structure is still in place, and it is destroying the quality of teaching and the value of a university diploma. As a professor I can save a lot of time by simply letting students pass the courses I teach without trying to have them learn anything: by not giving them feedback on their essays, by not having them write essays at all, by not having them retake a failed exam, or even by giving every exam at least a ‘passed’ mark without reading what the students wrote.
The awareness that incentives lead us astray has grown in me ever since the ‘extensify’ movement dawned. The latest illustration came earlier this academic year, when I talked to a group of people interested in doing dissertation work as external PhD candidates. The university earns a premium from the Ministry of Education for each PhD dissertation that is defended successfully. Back in the old days, long before I entered academia, a dissertation was an eloquent monograph. When I graduated, the dissertation had become a set of four connected articles, introduced by a literature review and closed by a conclusion and discussion chapter. Today, the dissertation is a compilation of three articles, one of which may be a literature review. Diploma inflation has worked its way up to the PhD level. The minimum level of quality required for dissertations has also declined. The procedures in place to check whether the research of external PhD candidates meets minimum standards are weak. And why should they be strong, if stringent criteria lower the profit for universities?
The Rat Race in Research
Academic careers are evaluated and shaped primarily by the number of publications, the impact factors of the journals in which they appear, and the number of citations by other researchers. At higher ranks, the size and prestige of research grants start to count as well. The dominance of output evaluations not only works against the attention paid to teaching, but also has perverse effects on research itself. The goal of research these days is not so much to get closer to the truth as to get published as frequently as possible in the most prestigious journals. It is a classic example of the replacement of substantive rationality by instrumental rationality, the inversion of means and ends: an instrument becomes a goal in itself. At some universities, researchers can earn a salary bonus for each publication in a ‘top journal’. This leads to opportunistic behavior: salami tactics (slicing the same research project into as many publications as possible), self-plagiarism (publishing the same or virtually the same research in different journals), self-citations, and even outright data fabrication.
What about the self-correcting power of science? Will reviewers not weed out the bad apples? Clearly not. The number of retractions in academic journals is increasing, but not because reviewers are catching more cheaters. It is because colleagues and other bystanders witness misbehavior and are concerned about the reputation of science, or because they personally feel cheated or exploited. The recent high-profile cases of academic misbehavior, as well as the growing number of retractions, show how surprisingly easy it is to engage in sloppy science. Because incentives lead us astray, it really comes down to our self-discipline and moral standards.
As an author of academic research articles I have rarely encountered reviewers who doubted the validity of my analyses. Never have I encountered reviewers who asked for a more elaborate explanation of the procedures used, or who wanted to see the data themselves. Only once did I receive such a request – from a graduate student at another university, who asked me for the dataset and the code I used in an article. I do feel good about being able to provide the original data and code, even though they were located on a computer I had not used for three years and were stored with software that has received seven updates since that time. But why haven’t I received such requests on other occasions?
As a reviewer, I recently tried to replicate the analyses of a publicly available dataset reported in a paper. It was the first time I ever went to the trouble of locating the data, interpreting the description of the data handling in the manuscript, and replicating the analyses. I arrived at different estimates and discovered several omissions and other mistakes in the analyses. Usually it is not even possible to replicate results, because the data on which they are based are not publicly available. But they should be made available: secret data are not permissible. Next time I review an article, I might ask: ‘Show, don’t tell.’
As an author, I have experienced how easy and tempting it is to engage in p-hacking: “exploiting – perhaps unconsciously – researcher degrees-of-freedom until p<.05”. It is not really difficult to publish a paper with a fun finding from an experiment that was initially designed to test a hypothesis predicting another finding. The hypothesis was not confirmed, and that result was less appealing than the fun finding. I adapted the title of the paper to reflect the fun finding, and people loved it.
The temptation to report fun findings and not to report rejections is reinforced by the behavior of reviewers and journal editors. On multiple occasions I have encountered reviewers who did not like my findings when they led to rejections of hypotheses – usually hypotheses they had promulgated in their own previous research. The original publication of a surprising new finding is rarely followed by a null-finding. Still, I try to publish null-findings, and increasingly so. It may take a few years, and the article may end up in a B-journal. But persistence is fertile. Recently a colleague took the lead on an article in which we replicate such a null-finding using five different datasets.
In the field of criminology, it is considered a trivial fact that crime increases with its profitability and decreases with the risk of detection. Academic misbehavior is like crime: the more profitable it is, and the lower the risk of getting caught, the more attractive it becomes. With detection risk this low and profitability this high, there must be an iceberg of academic misbehavior. Shall we crack it under the waterline, or let it hit a cruise ship full of tourists?
 In 1905, this was Max Weber’s criticism of capitalism in The Protestant Ethic and the Spirit of Capitalism.
 As Graham Greene wrote in Our Man in Havana: “With a secret remedy you don’t have to print the formula. And there is something about a secret which makes people believe… perhaps a relic of magic.”
 The title of the paper is ‘George Gives to Geology Jane: The Name Letter Effect and Incidental Similarity Cues in Fundraising’. It appeared in the International Journal of Nonprofit and Voluntary Sector Marketing, 15 (2): 172-180.
 On average, 55% of the coefficients reported in my own publications are not significant. The figure increased from 46% in 2005 to 63% in 2011.
 It took six years before the paper ‘Trust and Volunteering: Selection or Causation? Evidence from a Four Year Panel Study’ was eventually published in Political Behavior (32 (2): 225-247), after initial rejections at the American Political Science Review and the American Sociological Review.