In the prehistoric era of competitive science, researchers were like magicians: they earned a reputation for tricks that nobody could repeat and shared their secrets only with trusted disciples. In the new age of open science, researchers share by default, not only with peer reviewers and fellow researchers but with the public at large. The transparency of open science reduces the temptation of private profit maximization and the collective inefficiency that information asymmetries create in competitive markets. In a seminar organized by the University Library at Vrije Universiteit Amsterdam on November 1, 2018, I discussed recent developments in open science and its implications for research careers and progress in knowledge discovery. The slides are posted here. The podcast is here.
Scientists across the globe spend a substantial part of their time writing research proposals for competitive grant schemes. Usually, fewer than one in seven proposals gets funded. Moreover, competition is intensifying, and so is the time wasted on proposals that do not receive funding.
The most important funder of science in the Netherlands, the Netherlands Organization for Scientific Research (NWO), is painfully aware of the research competition crisis. On April 4, 2017, more than one hundred of the nation’s scientists gathered at a conference to come up with solutions for the crisis. I was one of them.
The conference made clear that the key problem is that we have too many good candidates and high-quality research proposals that cannot be funded within the current budget. Without an increase in the budget for research funding, that problem is unlikely to go away.
Stan Gielen, the new director of NWO, opened the conference. Because the universities and NWO have little bargaining power with the government, which sets NWO’s budget, he asked the scientists at the conference to think about ‘streamlining procedures’. In roundtable discussions, researchers talked about questions like: “How can the time between a final ranking in a grant competition and the announcement of the result to applicants be reduced?”
Many proposals came up during the meeting. The more radical ones were to discontinue funding for NWO altogether and reallocate the money back to the universities, to give a larger number of smaller grants, to allocate funding through lotteries among top-rated applications, and Scheffer’s idea to give researchers voting rights on funding allocations. I left the meeting with an increased sense of urgency but with little hope for a solution. Gielen concluded the meeting with the promise to initiate conversations with the Ministry of Education, Culture and Science about the results of the conference and to report back within six months.
Yesterday, NWO presented its proposals. None of the ideas above made it. Instead, a set of measures was announced that is unlikely to increase chances of funding. The press release does not say why ineffective measures were favored over effective ones.
Two of NWO’s proposals shift work to the universities, giving them responsibility for pre-evaluations of proposals. At the Vrije Universiteit Amsterdam we already invest considerably in such pre-evaluations, but not all universities do. The universities are also now told to use an instrument to reduce the number of proposals: the financial guarantee. This proposal, too, is akin to a measure we already had in place, the obligatory budget check. The financial guarantee is simply an additional hurdle applicants have to clear.
The proposal to give non-funded but top-rated ERC proposals a second chance at NWO reduces some of the work for applicants, but does not increase chances for funding.
A final proposal is to ask applicants to work together with other applicants with related ideas. It may be a good idea for other reasons, but does not increase chances for funding.
One of the causes of declining funding chances is the reward that universities receive for PhD graduations (‘promotiepremie’). This reward keeps up the supply of good researchers. PhD candidates are prepared and motivated for careers in science, but those careers are increasingly hard to get into. As long as the dissertation defense reward is in place, one long-term solution is to change the curriculum in graduate schools, orienting them towards non-academic careers.
Another long-term solution is to diversify funding sources for science. In previous cabinets, the Ministry of Economic Affairs co-controlled funding allocations to what were labeled ‘topsectors’. Evaluations of this policy have been predominantly negative. One of the problems is that the total budget for science was not increased; instead, part of the available budget was reallocated to applied research in energy, water, logistics, and so on. It is unclear how the new government thinks about this, but it seems a safe bet not to expect creative ideas from this side. There is hope, however, for a private sector solution.
There is a huge amount of wealth in the Netherlands that investment bankers are trying to invest responsibly. As wealth has grown, the number of private foundations supporting research and innovation has increased strongly in the past two decades. These foundations are experimenting with new financial instruments such as impact investing and venture philanthropy. The current infrastructure and education at universities, however, are totally unfit to tap into this potential. Which graduate program offers a course in creating a business case for investments in research?
Experiments can have important advantages over other research designs. The most important advantage concerns internal validity. Random assignment to treatment reduces the attribution problem and increases the possibilities for causal inference. An additional advantage is that control over participants reduces the heterogeneity of observed treatment effects.
The extent to which these advantages are realized in the data depends on the design and execution of the experiment. Experiments have a higher quality if the sample size is larger and the theoretical concepts are measured more reliably and with higher validity. The sufficiency of the sample size can be checked with a power analysis. Most effect sizes in the social sciences are small (d = 0.2); to detect an effect of this size at conventional significance levels (p < .05) with 95% power, a sample of about 1300 participants is required (see appendix). Even for a stronger effect size (d = 0.4), more than 300 participants are required. The reliability of normative scale measures can be judged with Cronbach’s alpha. A rule of thumb for unidimensional scales is that alpha should be at least .63 for a scale consisting of 4 items, .68 for 5 items, .72 for 6 items, .75 for 7 items, and so on. The validity of measures should be justified theoretically and can be checked with a manipulation check, which should reveal a sizeable and significant association with the treatment variables.
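These sample size figures can be reproduced with standard power-analysis tools. Here is a minimal sketch, assuming Python with statsmodels and numpy are available; the cronbach_alpha helper is my own illustration, not a library function:

```python
# Sketch: required sample sizes for a two-group t-test, plus a
# Cronbach's alpha helper; assumes statsmodels and numpy.
import numpy as np
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.2, 0.4):  # small and moderate effect sizes (Cohen's d)
    n_per_group = analysis.solve_power(effect_size=d, alpha=0.05,
                                       power=0.95, alternative='two-sided')
    print(f"d = {d}: about {2 * n_per_group:.0f} participants in total")
# d = 0.2 needs roughly 1300 participants in total; d = 0.4 still over 300.

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```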
The advantages of experiments are reduced if assignment to treatment is non-random and treatment effects are confounded. In addition, a variety of other problems may endanger internal validity. Shadish, Cook & Campbell (2002) provide a useful list of such problems.
Experiments can also have important disadvantages. The most important disadvantage is that the external validity of the findings is limited to the participants and the setting in which their behavior was observed. This disadvantage can be mitigated by creating more realistic decision situations, for instance in natural field experiments, and by recruiting (non-‘WEIRD’) samples of participants that are more representative of the target population. As Henrich, Heine & Norenzayan (2010) noted, results based on samples of participants in Western, Educated, Industrialized, Rich and Democratic (WEIRD) countries have limited validity in the discovery of universal laws of human cognition, emotion, or behavior.
Recently, experimental research paradigms have received fierce criticism. Results of research often cannot be reproduced (Open Science Collaboration, 2015), and publication bias is ubiquitous (Ioannidis, 2005). It has become clear that there is a lot of undisclosed flexibility in all phases of the empirical cycle. While these problems have been discussed widely in communities of researchers conducting experiments, they are by no means limited to one particular methodology or mode of data collection. It is likely that they also occur in communities of researchers using survey or interview data.
In the positivist paradigm that dominates experimental research, the empirical cycle starts with the formulation of a research question. To answer the question, hypotheses are formulated based on established theories and previous research findings. Then the research is designed, data are collected, a predetermined analysis plan is executed, results are interpreted, the research report is written and submitted for peer review. After the usual round(s) of revisions, the findings are incorporated in the body of knowledge.
The validity and reliability of results from experiments can be compromised in two ways. The first is by juggling the order of phases in the empirical cycle. Researchers can decide to amend their research questions and hypotheses after they have seen the results of their analyses. Kerr (1998) labeled the practice of reformulating hypotheses HARKing: Hypothesizing After the Results are Known. Amending hypotheses is not a problem when the goal of the research is to develop theories to be tested later, as in grounded theory or exploratory analyses (e.g., data mining). But in hypothesis-testing research, HARKing is a problem, because it increases the likelihood of publishing false positives. Chance findings are interpreted post hoc as confirmations of hypotheses that are a priori rather unlikely to be true. When these findings are published, they are unlikely to be reproducible by other researchers, creating research waste and, worse, reducing the reliability of published knowledge.
The second way the validity and reliability of results from experiments can be compromised is by misconduct and sloppy science within various stages of the empirical cycle (Simmons, Nelson & Simonsohn, 2011). The data collection and analysis phase as well as the reporting phase are most vulnerable to distortion by fraud, p-hacking and other questionable research practices (QRPs).
- In the data collection phase, observations that (if kept) would lead to undesired conclusions or non-significant results can be altered or omitted. Also, fake observations can be added (fabricated).
- In the analysis of data, researchers can try alternative specifications of the variables, scale constructions, and regression models, searching for those that ‘work’ and choosing those that reach the desired conclusion.
- In the reporting phase, things go wrong when the search for alternative specifications and the sensitivity of the results to decisions in the data analysis phase are not disclosed.
- In the peer review process, there can be pressure from editors and reviewers to cut reports of non-significant results, or to collect additional data supporting the hypotheses and the significant results reported in the literature.
The results of these QRPs are that null findings are less likely to be published; that published research is biased towards positive findings confirming the hypotheses; that published findings are not reproducible; and that when a replication attempt is made, the published findings prove less significant, less often positive, and smaller in effect size (Open Science Collaboration, 2015).
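How much can this kind of flexibility matter? A small simulation makes it concrete. This is a sketch assuming Python with numpy and scipy, not code from any of the papers cited: a researcher with no true effect measures several outcomes and reports whichever one ‘works’.

```python
# Sketch: how trying multiple outcome measures inflates false positives.
# Assumes numpy and scipy; all true effects are zero.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_studies, n_per_group, n_outcomes = 10_000, 50, 4
false_positives = 0

for _ in range(n_studies):
    # Four outcome measures per group, no treatment effect anywhere.
    treat = rng.normal(size=(n_per_group, n_outcomes))
    control = rng.normal(size=(n_per_group, n_outcomes))
    pvals = [stats.ttest_ind(treat[:, j], control[:, j]).pvalue
             for j in range(n_outcomes)]
    if min(pvals) < 0.05:  # report whichever outcome 'worked'
        false_positives += 1

print(f"False positive rate: {false_positives / n_studies:.2f}")
# With four independent outcomes the rate is near 1 - 0.95**4 = 0.19,
# almost four times the nominal 0.05.
```

This is the core of the argument by Simmons, Nelson & Simonsohn (2011): every undisclosed degree of freedom gives chance another opportunity to produce a publishable result.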
Alarm bells, red flags and other warning signs
Some of the forms of misconduct mentioned above are very difficult for reviewers and editors to detect. When observations are fabricated or omitted from the analysis, only inside information, very sophisticated data detectives, and stupidity of the authors can help us. Many other forms of misconduct are also difficult to prove. While smoking guns are rare, we can look for clues. I have developed a checklist of warning signs and good practices that editors and reviewers can use to screen submissions (see below). The checklist uses terminology that is not specific to experiments but applies to all forms of data. While a high number of warning signs in itself does not prove anything, it should alert reviewers and editors. There is no norm for the number of flags. The list below only mentions the warning signs; the paper version of this blog post also shows a column with the positive poles. Those who would like to count good practices and reward authors for a higher number can count gold stars rather than red flags. The checklist was developed independently of the checklist that Wicherts et al. (2016) recently published.
- The power of the analysis is too low.
- The results are too good to be true.
- All hypotheses are confirmed.
- P-values are just below critical thresholds (e.g., p < .05); a simple caliper test can screen for this (see the sketch after this list).
- A groundbreaking result is reported but not replicated in another sample.
- The data and code are not made available upon request.
- The data are not made available upon article submission.
- The code is not made available upon article submission.
- Materials (manipulations, survey questions) are described superficially.
- Descriptive statistics are not reported.
- The hypotheses are tested in analyses with covariates and results without covariates are not disclosed.
- The research is not preregistered.
- No details of an IRB procedure are given.
- Participant recruitment procedures are not described.
- Exact details of time and location of the data collection are not described.
- A power analysis is lacking.
- Unusual / non-validated measures are used without justification.
- Different dependent variables are analyzed in different studies within the same article without justification.
- Variables are (log)transformed or recoded in unusual categories without justification.
- Numbers of observations mentioned at different places in the article are inconsistent. Loss or addition of observations is not justified.
- A one-sided test is reported when a two-sided test would be appropriate.
- Reported test statistics (p-values, F-values) are incorrect.
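One of these flags, p-values clustering just below critical thresholds, can be screened for mechanically with a caliper test: compare how many reported p-values fall just below versus just above the threshold. A minimal sketch, assuming Python with scipy; the p-values and the .005 window width are hypothetical illustrations, not a published standard:

```python
# Sketch of a caliper test for p-values bunching just below .05.
# Assumes scipy; the reported p-values below are hypothetical.
from scipy.stats import binomtest

reported_p = [0.049, 0.047, 0.051, 0.046, 0.044, 0.048, 0.032, 0.053]
below = sum(0.045 <= p < 0.050 for p in reported_p)
above = sum(0.050 <= p < 0.055 for p in reported_p)

# Absent selective reporting, p-values within this narrow window should
# land on either side of .05 about equally often.
result = binomtest(below, below + above, p=0.5, alternative='greater')
print(f"{below} just below vs {above} just above; p = {result.pvalue:.3f}")
```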
With the increasing number of retractions of articles reporting experimental research in scholarly journals, awareness of the fallibility of peer review as a quality control mechanism has grown. Communities of researchers employing experimental designs have formulated solutions to these problems. In the review and publication stage, the following solutions have been proposed.
- Access to data and code. An increasing number of science funders require grantees to provide open access to the data and the code that they have collected. Likewise, authors are required to provide access to data and code at a growing number of journals, such as Science, Nature, and the American Journal of Political Science. Platforms such as Dataverse, the Open Science Framework and Github facilitate sharing of data and code. Some journals do not require access to data and code, but provide Open Science badges for articles that do provide access.
- Pledges, such as the ‘21 word solution’, a statement designed by Simmons, Nelson and Simonsohn (2012) that authors can include in their papers to certify full disclosure: “We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.”
- Full disclosure of methodological details of research submitted for publication, for instance through psychdisclosure.org, is now required by major journals in psychology.
- Apps such as Statcheck, p-curve, p-checker, and r-index can help editors and reviewers detect fishy business (see the sketch after this list). They also have the potential to improve research hygiene when researchers use these apps to check their own work before they submit it for review.
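The core consistency check that a tool like Statcheck automates can be illustrated in a few lines: recompute the p-value from the reported test statistic and degrees of freedom, and flag reports that do not match. A minimal sketch, assuming Python with scipy; the reported numbers and the tolerance are hypothetical:

```python
# Sketch of a Statcheck-style consistency check; assumes scipy.
# The reported statistics below are hypothetical examples.
from scipy.stats import t

def check_t_report(t_value, df, reported_p, tol=0.005):
    """Recompute a two-sided p-value and compare it with the reported one."""
    recomputed = 2 * t.sf(abs(t_value), df)
    consistent = abs(recomputed - reported_p) <= tol
    return recomputed, consistent

# Example: a paper reports t(48) = 2.10, p = .02. Does that add up?
p, ok = check_t_report(2.10, 48, 0.02)
print(f"recomputed p = {p:.3f}, consistent: {ok}")  # p is about .041: flag it
```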
As these solutions become more commonly used, we should see the quality of research go up. The number of red flags in research should decrease and the number of gold stars should increase. This requires not only that reviewers and editors use the checklist but, most importantly, that researchers themselves use it too.
The solutions above should be supplemented by better research practices before researchers submit their papers for review. In particular, two measures are worth mentioning:
- Preregistration of research, for instance on Aspredicted.org. An increasing number of journals in psychology require research to be preregistered. Some journals guarantee publication of research regardless of its results after a round of peer review of the research design.
- Increasing the statistical power of research is one of the most promising strategies to increase the quality of experimental research (Bakker, Van Dijk & Wicherts, 2012). In many fields and for many decades, published research has been underpowered, using samples of participants that are too small to detect the reported effect sizes. Using larger samples reduces the likelihood of both false positives and false negatives, as the sketch below illustrates.
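The sketch below, assuming Python with numpy and scipy, illustrates one consequence of underpowered research mentioned earlier: when the true effect is small and samples are small, the studies that do reach significance systematically overestimate the effect, which is one reason replications find smaller effects.

```python
# Sketch: effect-size inflation in underpowered studies.
# Assumes numpy and scipy; the true effect is d = 0.2 throughout.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_d, n_studies = 0.2, 10_000

for n in (25, 650):  # per-group sizes: underpowered vs adequately powered
    significant_ds = []
    for _ in range(n_studies):
        treat = rng.normal(true_d, 1, n)
        control = rng.normal(0, 1, n)
        if stats.ttest_ind(treat, control).pvalue < 0.05:
            pooled_sd = np.sqrt((treat.var(ddof=1) + control.var(ddof=1)) / 2)
            significant_ds.append((treat.mean() - control.mean()) / pooled_sd)
    print(f"n = {n} per group: mean significant d = {np.mean(significant_ds):.2f}")
# With n = 25, the significant studies report d around 0.6, roughly three
# times the true effect; with n = 650 the bias largely disappears.
```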
A variety of institutional designs have been proposed to encourage the use of the solutions mentioned above, including removing the career incentives that reward questionable research practices in hiring and promotion decisions, rewarding researchers for good conduct through badges, the adoption of voluntary codes of conduct, and the socialization of students and senior staff through teaching and workshops. Research funders, journals, editors, authors, reviewers, universities, senior researchers, and students all have a responsibility in these developments.
Bakker, M., Van Dijk, A. & Wicherts, J. (2012). The Rules of the Game Called Psychological Science. Perspectives on Psychological Science, 7(6): 543–554.
Henrich, J., Heine, S.J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33: 61-135.
Ioannidis, J.P.A. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8): e124. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124
Kerr, N.L. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2: 196-217.
Open Science Collaboration (2015). Estimating the Reproducibility of Psychological Science. Science, 349(6251): aac4716. http://www.sciencemag.org/content/349/6251/aac4716.full.html
Shadish, W.R., Cook, T.D., & Campbell, D.T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.
Simmons, J.P., Nelson, L.D., & Simonsohn, U. (2011). False positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22: 1359–1366.
Simmons, J.P., Nelson, L.D. & Simonsohn, U. (2012). A 21 Word Solution. Available at SSRN: http://ssrn.com/abstract=2160588
Wicherts, J.M., Veldkamp, C.L., Augusteijn, H.E., Bakker, M., Van Aert, R.C.M., & Van Assen, M.A.L.M. (2016). Researcher degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7: 1832. http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01832/abstract
The Center for Philanthropic Studies I am leading at VU Amsterdam is converting to Open Science.
Open Science offers four advantages to the scientific community, nonprofit organizations, and the public at large:
- Access: we make our work more easily accessible for everyone. Our research serves public goods, which are served best by open access.
- Efficiency: we make it easier for others to build on our work, which saves time.
- Quality: we enable others to check our work, find flaws and improve it.
- Innovation: ultimately, open science facilitates the production of knowledge.
What does the change mean in practice?
First, the source of funding for contract research we conduct will always be disclosed.
Second, data collection – interviews, surveys, experiments – will follow a prespecified protocol. This includes the number of observations foreseen, the questions to be asked, the measures to be included, the hypotheses to be tested, and the analyses to be conducted. New studies will preferably be preregistered.
Third, data collected and the code used to conduct the analyses will be made public, through the Open Science Framework for instance. Obviously, personal or sensitive data will not be made public.
Fourth, results of research will preferably be published in open access mode. This does not mean that we will publish only in Open Access journals. Research reports and papers for academic journals will be made available online in working paper archives, as ‘preprint’ versions, or in other ways.
December 16, 2015 update:
A fifth reason, following directly from #1 and #2, is that open science reduces the costs of science for society.
July 8, 2017 update:
See this previous post for links to our Giving in the Netherlands Panel Survey data and questionnaires.
On the television program 1Vandaag on February 19, I said that the Netherlands has no code of conduct for the solicitation of charitable bequests. This turns out not to be true: there is indeed a guideline for legacy fundraising. This guideline, however, is not to be found in the rules of the Centraal Bureau Fondsenwerving for the CBF-Keur (its seal of approval) or on the website of the VFI (Vereniging voor Fondsenwervende Instellingen), the branch association for charities. The VFI does have a guideline for the settlement of bequests. But that one concerns the settlement, once the money has already come in, not the solicitation of bequests.
It turns out that the guideline for the solicitation of bequests was published by a third organization, the Instituut Fondsenwerving. This organization drew up a guideline in 2012 for fundraising organizations that solicit bequests. The guideline is not binding. The Instituut Fondsenwerving also has a code of conduct that its members must observe, but the bequest guideline does not have the status of a code of conduct. According to its membership list, 231 charitable organizations are affiliated with the Instituut Fondsenwerving (click here for an overview in Excel). The Leger des Heils (Salvation Army) is a member of the IF, but the Zonnebloem is not. Other large recipients of bequests, such as KWF Kankerbestrijding, are also absent from the membership list. They are members of the VFI, which counts 113 members and 11 aspiring members.
The VFI responded to the broadcast on its website and mentioned the guideline of the Instituut Fondsenwerving. In that response, the status of the guideline was upgraded to a code of conduct. This would mean that members who do not comply with the guideline can be expelled. Incidentally, Gosse Bosma, director of the VFI, said in the 1Vandaag broadcast that the VFI had not checked whether the members involved had complied with the guideline, and that he did not consider this necessary either. The IF itself has not responded. The receiving charities, the Zonnebloem and the Leger des Heils, also responded this week via Filanthropium, the trade journal for philanthropy. They declared themselves willing to correct any irregularities. No doubt to be continued in the next phase of this case, or when a new case presents itself.
Het Financieel Dagblad devotes a long article to the meaning of the CBF-Keur for charities, prompted by the question: “Where does my donated euro go?” The “seal of approval and accounting rules are no guarantee of meaningful spending”, according to the newspaper. Further on in the article, my name appears next to the claim that the CBF-Keur ‘does not rule out fraud or excessive costs’ and even that it is ‘meaningless’. It is true that the fact that a charity holds the CBF-Keur does not mean the organization works perfectly. The seal does not make fraud impossible, nor does it always force organizations to spend available resources in the most efficient way. In the past, abuses at several CBF-Keur holders have made the news, and at some organizations these led to withdrawal of the seal.
But the CBF-Keur is not entirely ‘meaningless’ either. That is not how I see it. The CBF-Keur certainly does say something. Before an organization may carry the CBF-Keur, it must go through an extensive procedure to meet requirements on financial reporting, independence of the board, costs of fundraising, and the formulation of policy plans. These are relevant criteria. They ensure that, as a donor, you can trust that the organization works in a professional manner. The CBF-Keur just does not say much about the efficiency of a charity’s spending. Many people think it does, as we found in research from 2009.
This is tricky material. A guarantee is what you get on a product you buy in a store, allowing you to return it if it does not work or breaks down within a short time. Such guarantees are difficult to give for donations to charities. You could only give such a guarantee if the quality of the work of charitable organizations could be monitored and a minimum requirement could be formulated for it. That seems impossible to me. The CBF-Keur is not like a driver’s license that you must have before you may drive a car. The market for charities is freely accessible; everyone is allowed on the road. Some charities have a seal of approval, but it mainly tells you how much they paid for fuel, what kind of car they drive, and who is behind the wheel. It does not say much about the number of accidents they have ever experienced or caused, or whether they are taking the shortest or the fastest route.
Last year, the De Jong committee proposed establishing a philanthropy authority that would vet organizations before they are allowed to enter the charity market. There would be a charity police force that could also monitor compliance with the rules and hand out fines. That proposal was too expensive for the government. For the charities it was unattractive because they would have to comply with new rules. Moreover, it was not clear whether those new rules would actually reduce the number of accidents. At the moment it is not at all clear how well the drivers of charities know the road and how many accidents they cause. A better system would start with a measurement of the number of violations in charity traffic and a count of the number of drivers with and without a license. It would then be good to set up a driving school, open to everyone who wants to enter the market, that teaches the skills every driver should have. I hope the article in Het Financieel Dagblad leads to a discussion that makes this clear.
Meanwhile, the CBF has responded with the assurance that it is working on guidelines for ‘reactive supervision of performance’. The VFI, the branch association for charities, also issued a response along those lines. That is good news. But those new guidelines are still a long way off. In the meantime, the CBF issues seals of approval, and the Dutch media, the freest in the world after Finland’s, occasionally publish a speed-camera photo of traffic offenders. That seems to be enough to let charity traffic regulate itself and prevent the worst accidents. Because there are only a few of those.
Here’s CRAP: a new policy regarding review requests that I’ve decided to try out. CRAP stands for Conditional Review Acceptance Policy, my new default response to review requests. I will perform the review only if the journal agrees to publish the article in Free Open Access mode – making the article publicly available without charging any fees for it from universities, authors, or readers.
Here’s the story behind CRAP. If you’re an academic, you will recognize the pattern: you get an ‘invitation’ or a ‘request’ to review a paper submitted to the journal because ‘you have been identified as an expert on the topic’. If you’re serious about the job, you easily spend half a day reading the article, thinking about the innovations in the research questions and the consistency of the hypotheses, wondering why previous research was ignored, vetting the reliability and validity of the data and methods used, and checking the tables, leaving aside the errors in the references that the author copied from a previously published article. As a reviewer who accepts the task of reviewing a paper, you sometimes get a 25% discount on the hugely overpriced books by the publisher, or access to journal articles which your university library has already paid for.
You accept the invitation because you know the editor personally, you want to help improve science, you want to facilitate progress in the field, or because by refusing you would miss an opportunity to influence the direction your field is taking, or simply to block rubbish from being published. *Or, if you have less laudable objectives, because you want others to know and cite your work. I confess I have fallen prey to this temptation myself.* But it does not end after the job is done. There’s a good chance you will get the article back after the authors have revised it, and you are invited again to check whether they have done a good job incorporating your comments. In the meantime, you’ve received seven more review requests. I could fill my entire week reviewing papers if I accepted all the invitations I receive.
In the world outside academia, complying with a request means doing people a favor, which at some point in the future you can count on being returned. Not so in academia. The favors that we academics do are used by publishers to make a profit, by selling the journals we work for as unpaid volunteers to university libraries. The journal prices that publishers charge are ridiculously high, but libraries have no choice but to accept them because they cannot afford to miss the journal in their collections. And ultimately we keep up the system by continuing to accept review requests. Academic publishers exploit scholars, asking for reviews and giving nothing in return.
If you’re not an academic, you may find all this very strange. When I told my parents in 2003 that my first article had been accepted by an international journal, they asked: “How much did they pay you for the article?” Journalists and freelance writers for magazines may get paid for the content they produce, but not academics. The content we hope to help produce as reviewers is a public good: valid and reliable knowledge. Falsity should be avoided at all cost; the truth, the truth, and nothing but the truth should be published. The production of this public good is facilitated by public money. But the reviews we provide are not public goods. They are private goods. They are typically anonymous and not shared publicly. We send them to the editorial assistant, who sends them to the authors (and sometimes, ‘as a courtesy’, to the other volunteer reviewers). The final product is again a private good, sold by the publisher. Collectively, our favors are creating a public bad: increasing costs for journal subscriptions.
What can we do about this? Should the volunteer work we do be monetized? Should we go on strike to demand an adequate wage? According to the profitability of the journal, perhaps, so that the higher the profit the publisher makes on a journal, the higher the compensation for reviewers? This would do nothing to reduce the public bad. Instead, I think we should move to Free Open Access publishing. Knowledge is public in nature, its production is made possible by public funding, and it should be accessible to the public. It is fair that some compensation is given to the journal’s publisher for the costs of copy-editing the article and hosting the electronic manuscript submission system. These costs are relatively low. I am leaving the number crunching for another time or another geek, but my hunch is that if we monetized our volunteer work as reviewers, it would be enough to pay for the publication of one article.
Academic publishers are not stupid. They see the push towards open access coming and are now actively offering open access publication in their journals. But everything comes at a price. So they are charging authors (i.e., authors’ funders) fees for open access publication, ranging from several hundred to thousands of dollars. Obviously, this business model is quite profitable; otherwise commercial publishers would not adopt it. Thugs and thieves abuse the fee-based open access model by creating worthless journals that will publish any article, cashing the fees to make a profit. The more respectable publishers are now negotiating with universities and public funders of science about a better model, circumventing the authors. Undoubtedly the starting point for such a model is that the academic publication industry remains profitable. In all of this, the volunteer work of reviewers is still the backbone of high-quality journal publications. And it is still not compensated.
So my plea to fellow academics is simple. We should give CRAP as our new default response. Agree to review if the publisher agrees to publish the article in Free Open Access. It may be the only way to force Free Open Access into existence. I will keep you updated on the score.
Update: 14 October 2014
Response: 5 declined (Journal of Personality and Social Psychology, Sociology of Religion, Nonprofit and Voluntary Sector Quarterly, Science and Public Policy, Qualitative Sociology); 1 offered Green Access (Public Management Review); 2 responded that the review was no longer needed (Journal for the Scientific Study of Religion and Body & Society).
One managing editor wrote: “Thank you for offering to review this manuscript. Unfortunately, our publisher has not yet approved free Open Source. Those of us who actually work for the journal instead of the publishing institution would gladly provide open access to articles if it was up to us. These kinds of decisions are not left up to our discretion however. I greatly respect your stance and hope it is one that will eventually lead to greater access to academic publications in the future.”
In a message titled “Your assignment”, the associate editor of JPSP wrote: “I appreciate your willingness to review manuscript #[omitted] for Journal of Personality and Social Psychology: Personality Processes and Individual Differences. As it turns out, your review will not be needed for me to make a decision, so please do not complete your review. ”
The message from the editor of Sociology of Religion, probably composed by an Oxford University Press employee, says: “Sociology of Religion does not have an author-pays Open Access option in place, which would require the author or the body that funded the research to pay an Author Processing Charge—there is a range of APCs, beginning at $1,800. This is the only system currently in place at Oxford University Press for optional Open Access (some journals, of course, are entirely Open Access by design, generally with significant society sponsorship). The request and APC would need to come from the author, not the manuscript reviewer. Moreover, if an author requires OA to comply with requirements from his or her funding body, then the author submits it to a journal that has a OA option. Also, all authors of published articles are given a toll-free URL to post wherever they like—this allows the final version to be read without payment by anyone using that link, and importantly, counts toward online usage statistics. While this isn’t exactly the same as OA, it does make it freely available through that link as it is posted or distributed by the author.”
This is an interesting response from OUP. Leaving aside the question of why the ‘Author Processing Charge’ must be as high as $1,800: if three reviewers each charged $600 for the volunteer work they provide for the journal by reviewing the paper, the APC would be covered. As a courtesy to reviewers, OUP could waive the APC. Reviewers could waive the review fee as a courtesy to OUP. With wallets closed, everybody benefits.
The editor of Science and Public Policy, another OUP journal, responded: “unfortunately at present an unconditional policy for open access publishing is not in place for our journal, rather the following policy applies, which is not in line with your conditions, according to which Authors may upload their accepted manuscript PDF to an institutional and/or centrally organized repository, provided that public availability is delayed until 24 months after first online publication in the journal.”
The editor of the Journal for the Scientific Study of Religion wrote in what seems to be a standard reply: “It has become apparent that I will not need you to review the manuscript at this time. I hope you will be able to review other manuscripts for JSSR in the near future.”
In contrast, the editorial assistant for Body & Society wrote: “We’re just writing to let you know that we no longer require you to review this paper. Enough reviews have come in for the editorial board to be able to make a decision. Thank you for having agreed to review, and we apologise for any inconvenience caused.”
*HT to @dwcorne for identifying this less benign motivation.