Why the ‘qualitative’ vs ‘quantitative’ divide is not a good idea

The distinction between ‘qualitative’ and ‘quantitative’ sub-communities aligns with the self-identities of many researchers in the social sciences. In my experience advocating for open science in multi-method environments, these identities create polarization. In the social sciences, self-identified ‘qualitative’ scholars oppose open science principles because they have been developed by a ruling class of ‘quantitative’ researchers, and do not fit their goals and practices.

Many scholars who self-identify as ‘qualitative’ view the request for compliance with principles of open science as disrespectful. The ‘qualitative’ self-identity unifies scholars who are not analyzing numerical data against ‘quantitative’ scholars who are working with numerical data. Obviously, guidelines for data storage should fit the type of data. But it is not solely the type of data scholars work with that determines whether they self-identify as ‘qualitative’ or ‘quantitative’. There are also differences in ontologies (beliefs about reality), epistemologies (philosophies of science), knowledge objectives, relationships with research participants, relationships with societal stakeholders, methods of data collection, and methods of analysis.

The stereotypical difference between scholars who self-identify as ‘qualitative’ and those who self-identify as ‘quantitative’ is that the former have an ontology that is relativistic (vs realistic), an epistemology that is subjectivist (vs objectivist), knowledge objectives that are post-modernist (vs positivist), relationships with participants and stakeholders that are personal (vs absent), a data collection method that involves personal participation (vs passive measurement), and methods of analysis that involve human interpretation (vs mechanistic analysis).

At the same time, there is a large degree of variety in both groups. You can find critical scholars working with societal stakeholders collecting textual data for purely exploratory analyses, who are positioned in the ‘quantitative’ camp because they convert texts to numerical data. Similarly, you can find structural realists not working with societal stakeholders conducting interviews for purely descriptive analyses, who are positioned in the ‘qualitative’ camp because they do not convert interview transcripts to numerical data.

In addition, there is a movement towards mixed methods research in the social sciences. Mixed methods research combines values on each of these dimensions. There is utility in epistemic diversity – a variety and a combination of data from interviews and surveys, registers and focus groups, of quotes interpreted with care and imagination, and of numbers, analyzed with fancy statistical models.


Filed under data, methodology, open science, science

Respecting Epistemic Diversity in Open Science

The social sciences and humanities have a high level of epistemic diversity: they harbor different research traditions, with a wide variety of types of empirical data, methods of analysis, philosophies of knowledge, and epistemologies in their approaches to knowledge creation. Here I’d like to share experiences trying to enhance transparency as a common objective in two environments with high epistemic diversity.

The multidisciplinary field of nonprofit and philanthropy research, my home base in the past twenty years, includes social science scholars from business schools, accounting, finance, economics, marketing and management research, but also from social policy and social work, sociology, and public administration, as well as psychologists, anthropologists, historians, and philosophers. Their data collection strategies and methods of analysis vary widely: from participant observation to experiments, and from interpretative reflection to the estimation of econometric models. Journals in the field such as Nonprofit & Voluntary Sector Quarterly and Voluntas have published work with the full diversity of approaches to research. How can a scholarly community with such a high epistemic diversity collaborate to achieve open science objectives?

The Faculty of Social Sciences at VU Amsterdam, where I work, is the second environment with high epistemic diversity that I’m in. The Faculty includes departments of anthropology, organization science, public administration and political science, sociology, and communication science. Scholars analyze data they collected themselves with participant observation, interviews, surveys, or web scraping, or data obtained from official registers and from social media. How can a community with such different approaches be encouraged to improve data and methods transparency? Before I go into differences, here are three common challenges.

Challenge #1: the incentive problem

The incentives work against transparency. Even when researchers endorse ideals of transparency and recognize its public benefits, they see little or no personal benefit in it. Preregistration, data documentation, sharing of data and code, and the production of open software are not rewarded as much as journal publications and grants obtained. Moreover, researchers know that documenting and sharing data, code and software costs time. One of the reasons why the costs of transparency are high is that researchers are not knowledgeable in research data management, and have not been trained in it. Data and methods transparency will never be free. It will always involve additional work and extra care, even for experienced researchers. To facilitate the extra work, researchers should get proper training.

Challenge #2: the dinosaur problem

Speaking of experienced researchers: I’ve also noticed a cohort difference. Researchers who have made careers in times and institutions that did not require much data and methods transparency are more likely to oppose enhanced transparency requirements. For all their working lives they have been able to do their work just fine, without preregistration or providing access to data and code. Why would it be required for new projects? Some of the dinosaurs have much to lose as transparency requirements go up, because they know that their prior work will not hold up against closer scrutiny. For these researchers open science requirements are just a nuisance. As we develop training and guidance for researchers, we should not focus exclusively on the young and the willing, but also involve older researchers.

Challenge #3: work pressure and competition for resources

The resistance against transparency is also a result of work pressure, and unhealthy levels of competition due to scarcity in academia. In the Netherlands and in many other countries, funding for academic research by professors has not kept up with the increase in the number of students in the past decades. The resulting increase in work pressure has exacerbated the level of competition for research funds that was high already. The work pressure is especially salient for early career researchers in ‘up or out’ tenure track systems, with performance criteria for the number of publications. Why spend time documenting and sharing data if you can publish papers without the extra hassle? Open science tools and platforms should be easy to work with. To make the extra work worthwhile, researchers should get more credit for it in the form of citations and in tenure and promotion decisions.

Averting a culture war

The open science movement has the potential to transform the peaceful coexistence of scholarly communities identifying themselves as scholars with ‘qualitative’ or ‘quantitative’ methodologies into a culture war. Data and methods transparency implies more work for the types of data and methods used by scholars who identify as ‘qualitative researchers’, while the risks associated with data transparency are higher and the collective benefits smaller than for researchers who identify as ‘quantitative researchers’. If the call for more transparency comes from ‘quantitative’ researchers, it can revive the paradigm wars that have flared up regularly in the social sciences since the 1960s. If we want to increase the credibility of the social sciences, we should not revert to internal dissent.

Respecting epistemic diversity in open science

We should develop transparency policies that are aligned with the epistemic diversity in our field. There are at least four dimensions of knowledge creation in which we should take diversity into account.

1. Data collection: strength of ties with participants. Scholars who have developed stronger personal ties with research participants, e.g. through participant observation or focus groups, are more hesitant to share data than scholars who have had no meaningful interaction with research participants, e.g. through online surveys. The privacy risks of participants being identified are perceived to be higher when researchers have a personal relationship with the participants.

2. Data processing: numeric vs textual data. Scholars who work with numeric data from registers or surveys are more interested in reproducibility than scholars who work with textual data from interviews and their personal memories of events from participant observation. Documenting and anonymizing textual data in such a way that they can be shared responsibly is simply much more work than documenting numerical data.

3. Data objectives: idiographic vs nomothetic knowledge. Data transparency is associated with the knowledge goals that researchers want to achieve. Scholars seeking idiographic knowledge of specific individuals, groups and contexts are less likely to engage in open science practices than scholars seeking nomothetic knowledge by exploring regularities or testing theories and hypotheses. Producing generalizable knowledge is not a goal for idiographic research, while it is important for nomothetic research. For idiographic research, preregistration to avoid ‘hypothesizing after results are known’ does not make sense, because hypothesizing is simply not the goal.

4. Ontologies. Crystallized opinions about social phenomena also shape attitudes towards transparency and replication. For constructivists and interpretivists, the term ‘data’ in its literal meaning of ‘things being given’ is a logical impossibility. If researchers construct meaning in interaction with others, a recreation of these meanings will always produce different results when new researchers or different actors are involved. As a result, reproducibility is not possible.

For extreme contextualists, all social phenomena are determined by contextual influences, and there are so many of them that we cannot draw definitive conclusions about anything. Conclusions from a study in one context will never be generalizable to other contexts. In this case, reproduction is possible, but meaningless, and not a property of good scholarship. Every reproduction will always produce different results, and what we can learn from these differences is limited. It is more interesting to study a new context.

Keeping everyone on board

Discussions about philosophies of science may be interesting, but they can also unearth and widen existing divisions. To avoid polarization between communities of scholars with strong social identities as ‘qualitative’ and ‘quantitative’, it is a better strategy to invite researchers working with different types of data and methods to self-organize conversations about community standards. If the support for transparency varies between scholars working with different types of data and methods, and if transparency means different things in practice for different types of data and methods, ‘one size fits all’ requirements to improve transparency are a bad idea. Respect for epistemic diversity means that different communities of researchers formulate their own norms with respect to good data and methods. Organizing discussions on transparency separately for each type of data and methods can keep everyone on board – even if we sail on different ships. Norms are more likely to be observed when they are formulated by community members themselves. Such a differentiated strategy will improve transparency for each of these communities, and ensure that the norms are tailored to their needs.

An afterthought

Here’s a personal anecdote that may explain why the desire to bridge different worlds runs so deep for me. I remember that one of the reviewers of the grant proposal for my dissertation research objected that the project would never be able to explain altruistic behavior. The reviewer stated that the study I had proposed departed from a rational choice approach, and that it used survey data for a large random sample of the population. Both choices would set me up for failure, according to the reviewer. The assumption in the first part of the argument was not true. I was skeptical of rational choice approaches, and had selected prosocial behaviors benefiting strangers such as charitable giving and blood donation precisely because they could not easily be explained with social exchange and direct reciprocity. The assumption in the second part was true: indeed I was going to use survey data. There had been few studies of prosocial behavior using survey data. But the reviewer did not want me to do such a study. Instead, I should use a small n, in depth study of exemplary altruists. My supervisors appealed against the negative review because it contained factual errors and was based on a prejudiced perception of the proposal. The reviewer had not read the proposal and had advised against funding it because it would benefit a competing school of research in the social sciences.

I was shocked. Why would an impartial reviewer submit a negative review of a proposal because it came from a school that used certain theories and methods? Eventually the board disqualified the reviewer, and awarded the grant. Though the outcome was favorable, I still think it would have been more interesting to actually have a discussion on the unique benefits of various methods and theories, and perhaps a way to combine them. More about how to make epistemic diversity productive for knowledge creation perhaps in a next post.

Hat tip to Joeri Tijdink for the reference to Leonelli, S. (2022). Open Science and Epistemic Diversity: Friends or Foes? Philosophy of Science, 89, 991–1001. https://doi.org/10.1017/psa.2022.45

More about governance instruments to promote transparency in this post.

Filed under academic journals, data, incentives, methodology, regulation, science

A Reminder

Dear editor,

As you will recall, I recently replied to your invitation to review #[JRNL-YY-NUMBER] for your journal with the request to provide access to the data and code the authors used to produce results reported in the manuscript.

Unfortunately, you have not yet agreed to ask the authors of this manuscript to provide access.

Please do so at your earliest convenience. There is no need to log onto my Journal Editor Manager system.

You have no username.

If you forgot your password, do not worry. You do not need it. You do not have to click the ‘Send Login Details’ link on a Login page or something.

Please just ask the authors.

You will find the author contact details in your database. Please copy the message below and send it at your earliest convenience.

“Dear authors, it is good practice to provide access to the data and code (if any) you used to produce the results reported in the manuscript. Please provide a URL to a repository providing access to the data and code (if any) at your earliest convenience, in order not to delay the review process for your manuscript.”

In case you accept to make this request, just reply with a message saying “Yes, thanks. Will do.” There is no need to click on a link.

If you do not have time to do this, or do not feel qualified, please reply with a message saying “No, thanks.” and refrain from further correspondence regarding this manuscript. Again, there is no need to click on a link or login with a user name and password.

Please let me know if you have any questions.

Thank you very much.

With kind regards,

Journal Invitation Handling Office

Rene Bekkers

This letter does not contain confidential information, is published online without restrictions for reuse, and may be forwarded to third parties.

Recipients of this email are not registered as users within Journal Editor Manager database. We may publish your invitation behavior in the process of considering and evaluating journal editors and publishers. If you no longer wish to receive messages on access to data and code for manuscripts submitted to your journal that you ask me to review, please refrain from sending me invitations, or by default require authors to provide access to data and code and specify the repository in the invitation.

Filed under academic journals, code, data, prosocial behavior

Beware of predatory publishers

In all likelihood, you also get numerous requests to review articles, contribute articles to special issues, compile or edit a special issue, or even to join an editorial board.

Recently colleagues asked me how to deal with such requests. Most of these requests come from predatory publishers such as SCIRP, MDPI, Hindawi and Frontiers. These publishers have made it their business model to exploit the voluntary labor of academics while at the same time charging authors hefty publication fees for articles in their journals, sometimes called “Article Processing Charges” (APCs) or “Open Access fees”. The apparent advantage of a quick turnaround and acceptance at these journals is often explained by the lax peer review process.

How can you recognize a trustworthy request and distinguish it from an exploitative one? Here are three questions you can use to determine whether you are dealing with a predatory publisher.

  1. Do you know the person making the request? 
  2. Did you know the journal before you received the request?
  3. Do you know any of the persons listed on the journal website as editorial board members?

If the answers to these questions are “no”, you’re probably dealing with a predatory publisher. Then it is safe to ignore the request and delete the message. If you know the person inviting you, or if you know that the journal has previously published good work, or if you know one of the persons on the editorial board, the request may be worth considering. Every week I receive a dozen or so such messages. Most of them are filtered automatically and end up in the spam folder, and I just delete them.

The practices of one particular publisher, MDPI, have recently attracted attention. Its journal “Sustainability”, for instance, published more than 3000 special issues in the year 2021. Another MDPI journal, the International Journal of Environmental Research and Public Health (IJERPH), was removed last week by Clarivate (Web of Science) from its list of recognized academic journals. More about that here: https://mahansonresearch.weebly.com/blog/mdpi-mega-journal-delisted-by-clarivate-web-of-science. Among the 82 journals that were removed were also journals published by Routledge / Taylor & Francis, Wiley, Sage, and Springer. More about predatory journals: https://predatoryreports.org/news/f/questions-to-consider-when-receiving-a-call-for-papers?blogcategory=Submission+Invitation


Filed under academic journals, academic misconduct, incentives, publications, research, research integrity, science

A moral appeal does not encourage everyone to give more

Do people give more if they are reminded of the social norm to give? Several experiments among students show that explicit reminders with texts such as “Do the right thing” lead to more cooperation in social dilemmas (Dal Bó & Dal Bó, 2014; Capraro & Vanzo, 2019). There is also evidence that moral norms in crowdfunding campaigns can lead to more giving behavior (Capraro & Vanzo, 2019; Van Teunenbroek, Bekkers & Beersma, 2021).

We tested the effect of a moral appeal among a large sample of respondents (n = 1,196) in the 2019 wave of the Giving in the Netherlands Panel Survey (Bekkers, Boonstoppel, De Wit, Van Teunenbroek, & Fraai, 2022). After completing the questionnaire, participants could donate their reward for completing the questionnaire to a charity. They could choose to donate to the AIDS Fund, KWF Kankerbestrijding, the Netherlands Heart Foundation, or the Red Cross. Respondents could also receive the reward in the form of a voucher. Half of the respondents were presented with the text, “Did you know that 63% of the Dutch find that everyone has a responsibility to help others when they need it?” The other half of the respondents were not presented with this text. The purpose of the experiment was to test whether a moral appeal would influence giving behavior.

As in previous experiments (Bekkers, 2007), only a small proportion of participants (3.8%) gave away their points to charity. We see no difference in giving behavior between the group that saw the text (3.7% donated to charity) and the group that was not shown the text (3.8%). This finding seems to indicate that the text had no effect. But it is a bit more complicated than that.

The effect of a moral appeal depends on personal norms

In each edition we ask to what extent participants agree with the statement “Everybody in this world has a responsibility to help others when they need assistance.” This statement is an item in the principle of care scale (Bekkers & Ottoni-Wilhelm, 2016). We ask this question at the beginning of the questionnaire, before participants can distribute points. This is how we know that 63% of respondents “agree” or “strongly agree” with the statement.

Figure: the percentage donating their reward to charity when a moral appeal was shown or not, for two groups that did not agree (disagree strongly, disagree, or neither disagree nor agree) vs agreed or agreed strongly with the statement that “Everyone has a responsibility to help others when they need it.”

Here’s the plot twist: when we make a moral appeal to participants to give by reminding them of the norm, we see that donations among those who do not agree personally with the norm go up significantly. Without a moral appeal, hardly anyone from the group of participants who did not agree with the statement gave the reward to charity (1%). But with a moral appeal, the percentage of respondents who donated the reward is more than three times higher (3.6%). In contrast, the group of participants who do agree with the norm of helping others seems to be somewhat less generous after seeing the moral appeal: among these participants, 3.8% gave the reward away after the appeal, compared with 5.3% without a reminder of the norm. Perhaps the information that 63% agreed with the statement disappointed them – it implies that 37% did not (completely) agree with the statement.

The positive effect of the moral appeal for the group that personally disagreed with the statement was canceled out by the negative effect for the group that did agree with the statement. The null finding that moral appeals did not work concealed an interaction with personal views of participants on the statement – those agreeing with it reduced their giving when they were confronted with the appeal, while those who did not agree were encouraged by it. This indicates that a moral appeal does not work positively for everyone. The effect of a moral appeal depends on personal norms.
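The cancellation can be checked with a bit of arithmetic. The sketch below uses the four donation rates reported above and weights them by the share of respondents in each group. The group sizes per experimental arm are not reported here, so the 63% / 37% split in both arms is an assumption, and the pooled figures are approximations rather than the exact survey results.

```python
# Donation rates (in %) reported in the text, by agreement with the norm
# and by whether the moral appeal was shown.
rates = {
    "disagree": {"no_appeal": 1.0, "appeal": 3.6},
    "agree": {"no_appeal": 5.3, "appeal": 3.8},
}

# ASSUMPTION: 63% of respondents agree with the norm in both arms of the
# experiment. The actual group sizes per arm are not reported in this post.
share = {"disagree": 0.37, "agree": 0.63}


def pooled_rate(condition):
    """Weighted average donation rate (in %) across the two groups."""
    return sum(share[g] * rates[g][condition] for g in rates)


# Effect of the appeal within each group, in percentage points.
effects = {g: rates[g]["appeal"] - rates[g]["no_appeal"] for g in rates}

# Effect of the appeal when both groups are pooled.
pooled_effect = pooled_rate("appeal") - pooled_rate("no_appeal")

for group, effect in effects.items():
    print(f"{group}: {effect:+.1f} percentage points")
print(f"pooled: {pooled_effect:+.2f} percentage points")
```

Under this assumption the two within-group effects have opposite signs, and the pooled effect is close to zero, in line with the nearly identical overall donation rates in the two arms.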

Open data, code and results

The data are here and the code is here. The results are here. This text is a translation of pages 111-112 from Bekkers & Van Teunenbroek (2022).


Bekkers, R. (2007). Measuring Altruistic Behavior in Surveys: The All-Or-Nothing Dictator Game. Survey Research Methods, 1(3): 139-144. https://ojs.ub.uni-konstanz.de/srm/article/download/54/530

Bekkers, R., Boonstoppel, E., De Wit, A., Van Teunenbroek, C. & Fraai, P. (2021). Giving in the Netherlands Panel Survey: User Manual. Amsterdam: Center for Philanthropic Studies, Vrije Universiteit Amsterdam. https://osf.io/7ta2g

Bekkers, R., Gouwenberg, B., Koolen-Maas, S. & Schuyt, T. (2022, Eds.). Geven in Nederland 2022: maatschappelijke betrokkenheid in kaart gebracht. Amsterdam: Amsterdam University Press. https://osf.io/download/kqa8j/

Bekkers, R. & Ottoni-Wilhelm, M. (2016). ‘Principle of Care and Giving to Help People in Need’. European Journal of Personality, 30(3): 240-257. http://dx.doi.org/10.1002/per.2057

Bekkers, R. & Van Teunenbroek, C. (2022). Geven door huishoudens. Pp. 84-118 in: Bekkers, R., & Gouwenberg, B.M. (Eds.). Geven in Nederland 2022. Amsterdam: Amsterdam University Press. https://renebekkers.files.wordpress.com/2022/06/bekkers_vanteunenbroek_22_h1_gin22.pdf

Capraro, V., & Vanzo, A. (2019). The power of moral words: Loaded language generates framing effects in the extreme dictator game. Judgment and Decision Making, 14: 309-317. https://www.sas.upenn.edu/~baron/journal/19/190107/jdm190107.html

Dal Bó, E., & Dal Bó, P. (2014). “Do the right thing”: The effects of moral suasion on cooperation. Journal of Public Economics, 117: 28-38. https://www.nber.org/papers/w15559

Van Teunenbroek, C., Bekkers, R. & Beersma, B. (2021). They ought to do it too: Understanding effects of social information on donation behavior and mood. International Review of Public and Nonprofit Marketing, 18, 229–253. https://doi.org/10.1007/s12208-020-00270-3

Filed under altruism, data, experiments, fundraising, Netherlands, open science, philanthropy, principle of care, survey research

Why do people (not) give?

What are the motivations and obstacles for charitable giving? Depending on the research design, there are two perspectives from which we can think about answers to this question. The first is an interventionist perspective, focusing on mechanisms in giving behavior and conditions that could affect it. In the Science of Generosity review Pamala Wiepking and I call this ‘Type 2 knowledge’ on Why People Give. The NVSQ paper identifying eight mechanisms summarizes research on this type of knowledge. The other perspective is descriptive. We call this ‘Type 1 knowledge’ on Who Gives What. Two articles in Voluntary Sector Review (here and here) summarize this type of knowledge.

From both perspectives, the question why people do something (i.e. give) is the same as the question why people don’t do something (not give) – the focus of attention is just on the alternative potential outcome. From a collective welfare perspective it makes sense that most interventions studied in research on giving are designed to increase charitable giving, and not decrease it.

| | Gift (Amount) | No Gift |
|---|---|---|
| Treatment: intervention to make people give (or give more) | | |
| Control: no intervention | | |

Thinking from an interventionist perspective, the mechanisms that promote giving are the same as the ones that reduce giving. Take solicitation for instance. In an experiment randomizing participants to either receive a direct solicitation for a gift or not, the evidence that solicitation works is the difference in the proportion of participants who give or do not give. Another example is prosocial values. One would expect that an appeal to prosocial values will lead to more giving. Some experiments (such as this one and this one) do show this, though in some experiments I have not been able to reproduce such findings, and even found backfiring effects.

Thinking from a descriptive (type 1 knowledge) rather than interventionist perspective, those who receive a direct solicitation are more likely to give than those who do not. The finding from experiments that solicitation creates giving is supported by evidence from correlational designs that almost all people who do not give are not asked, and those who do give are much more likely to have been asked. Persons with higher scores on prosocial values are more likely to give, and those with lower scores are called greedy and give less. The underlying dimension is the same though. This holds not only for greed (vs generosity) but also for self-interest (vs altruism), and for other constructs.

How not to measure the influence of motivations

In surveys, the question why people give (or not) is often asked in a general sense with a list of statements about potential motivations or barriers to give. Participants are then asked to express their agreement or disagreement with these statements. The responses to such questions do not help us much in terms of potential interventions that will improve compliance with requests to give. For instance, full agreement with the statement “I don’t trust charities” or with the statement “I don’t give to charities because I don’t trust them” does not tell us what intervention would work to create charitable confidence, or could circumvent the lack of charitable confidence. They do tell us something about a certain societal sentiment and characteristics of people, which are hard to change for fundraisers.

Asking about motivations for giving specifically among those who gave is even worse. From these responses we cannot learn much about why people give, because there’s no observation of the very same motivations among those who did not give. It is very well possible that those who did not give have similar motivations. The same happens when a survey presents obstacles or barriers to those who did not give. In methods jargon this is called selection on the dependent variable. It is known to produce biased estimates. It is better to ask everyone the same questions on motivations, and then compare differences between donors and non-donors.
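To see why selection on the dependent variable is a problem, here is a minimal sketch with invented numbers. The respondents and counts are hypothetical and do not come from any real survey. It compares what we learn from asking only donors with what we learn from asking everyone.

```python
# HYPOTHETICAL survey: each respondent is a pair (gave, motivated).
# The counts are invented for illustration; they are not from a real survey.
respondents = (
    [(True, True)] * 80      # donors who endorse the motivation
    + [(True, False)] * 20   # donors who do not
    + [(False, True)] * 240  # non-donors who endorse the motivation
    + [(False, False)] * 60  # non-donors who do not
)


def share_motivated(rows):
    """Proportion of respondents in rows who endorse the motivation."""
    return sum(motivated for _, motivated in rows) / len(rows)


donors = [r for r in respondents if r[0]]
non_donors = [r for r in respondents if not r[0]]

# Design 1: ask only donors. Most of them endorse the motivation, which
# looks like an explanation for giving.
print(f"donors only: {share_motivated(donors):.0%}")

# Design 2: ask everyone. Non-donors endorse the motivation just as often,
# so it does not distinguish donors from non-donors in these data.
print(f"non-donors:  {share_motivated(non_donors):.0%}")
difference = share_motivated(donors) - share_motivated(non_donors)
print(f"difference:  {difference:+.0%}")
```

In this made-up example, a survey among donors alone would suggest the motivation matters a great deal, while the comparison with non-donors shows no difference at all.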

Also, such surveys usually lack a behavioral outcome observed after the motivation measure. Retrospective questions on motivations (“Why did you give?”) or barriers (“Why did you not give?”) are likely to elicit post-hoc justifications that do not reflect the actual reasons why people did or did not give in the situation in which they had the opportunity to do so. Therefore this type of data is not very useful for the design of interventions in fundraising campaigns.

What measures of motivations do tell us

Responses to questions about motivations for giving do have some value – they tell us something about the virtues and sins that people view as acceptable reasons to give or not give. This type of data is particularly useful in time series so we can observe societal trends. For instance, if the support for statements like “I don’t trust charities” is increasing, that is an important signal that charitable confidence is a problem that organizations need to work on.


Filed under data, fundraising, household giving, methodology, research, survey research, trends, trust

A Connected World of Academics in the Fediverse

If your vision of an ideal connected world is an online platform for free exchange of knowledge and reasoned debate, untainted by prejudice and hate, you will be fascinated by the demise of Twitter since October 27, 2022. On that day, the then richest man in the world, Elon Musk, took over the company for $44 billion. In the following weeks, he fired half the workforce, caused hundreds more to leave the company, and failed to make the company profitable. The promise to publish the company’s algorithms as open source was not kept.

For scientists, Twitter used to be a great place to learn and collaborate. It was indeed a platform that facilitated free exchange of knowledge and debate. I learned most of what I know about behavioral and molecular genetics, Genome Wide Association Studies, the reproducibility crisis, and meta science through Twitter. I found interesting scholars who made their own and others’ work accessible in crisp summaries of 140 characters. I learned about presentations of new research at conferences I could not attend personally via scientists who live tweeted them. I built consortia of researchers with similar interests in dozens of countries around the world. I would not have been able to work with them otherwise.

At the same time, Twitter was also an open gutter of hate speech and a channel of misinformation that contributed to the premature death of millions of people who were misled about COVID-19. By muting words and blocking accounts I was able to keep the worst filth out of my timeline. Occasionally, conspiracy intuitionists commented on things I said about philanthropists on television, but I managed to escape vile attacks and threats by anonymous accounts. After the Musk takeover, reinstated hate speech accounts, troll farms and foreign operatives increased their activities.

Millions of users – including myself – left the blue bird site for the federated servers of Mastodon. The current count is nine million. By the time you read this, the number of active users of the platform will have grown further. Like Twitter, Mastodon is a microblogging service, with some very similar functionality. A message on Mastodon is called a ‘toot’, and you can identify its topic with a #tag. At the same time, the affordances and features of the two platforms are also different in many ways. Mastodon is free from commercial interests. It has no owners, no advertisements, and no for-profit business model. Instances are installed and maintained by users, and paid for by donations and sponsorships. It is like a set of islands, each with a set of rules for inhabitants. The community governs itself. The islands are mildly undemocratic – they are run by volunteer moderators who can enforce community standards, such as bans on posting political and commercial content, and requirements to provide content warnings and textual descriptions of images.

It is rather difficult on Mastodon to quickly build a flock of new followers. The exodus of refugees from Twitter to Mastodon is still relatively small. The nine million users are only a fraction of the 400 million users that Twitter used to have. On Mastodon, users have to invest in conversations to build connections. As a result, ties between current users are stronger and the community is more close-knit. User-built tools allow Twitter refugees to find old connections who have migrated to different islands. It is difficult to predict how the site will change as Twitter collapses further. Currently, Mastodon is less susceptible to hypes. Toots are less likely to ‘go viral’ than tweets; users – not algorithms – determine who sees which messages. Because of these features, Mastodon will probably not grow to the proportions of Twitter. Mastodon founder Eugen Rochko declined Silicon Valley investment offers. Such investments would not make much sense to begin with: the site has no control over its users that it can easily monetize.

At the same time, the site is better aligned with the structure and spirit of academic communities: collective, self-organized initiatives, facilitating knowledge exchange and debate, untainted by commercial interests. The association of universities in the Netherlands is setting up its own instance for students and staff. Through its move from Twitter to Mastodon, the connected world of academia has become a little less exploitative and more cooperative.


Ten Meta Science Insights to Deal With the Credibility Crisis in the Social Sciences

A decade of meta science research on social science research has produced devastating results. While the movement towards open science is gaining momentum, awareness of the credibility crisis remains low among social scientists. Here are ten meta science insights on the credibility crisis plus solutions on how to fight it.

This is a blog version of the SocArxiv preprint at https://osf.io/preprints/socarxiv/rm4p8/

1. At least half of all researchers use questionable research practices

Research on research integrity has estimated the prevalence of integrity violations in many subfields of science, including the social and behavioral sciences. According to the best evidence to date from the Netherlands Survey of Research Integrity (Gopalakrishna et al., 2022a), half of researchers in the social and behavioral sciences (50.2%) reported having engaged in at least one questionable research practice in the past two years. The most common questionable research practices in the social and behavioral sciences are not submitting valid negative studies for publication (17.2%) and insufficient discussion of study flaws and limitations (17.2%). Two other frequently reported violations are inadequate note taking of the research process (14.4%) and selective citation of references to enhance findings or convictions (11%).

A smaller but still non-negligible proportion of researchers in the social and behavioral sciences in the Netherlands self-reports data fabrication or falsification in the past two years (5.7%). This proportion seems low, but it should be zero. The estimate implies that one out of every 17.5 researchers fabricated or falsified data.

These estimates are valid for researchers in the Netherlands who responded to a survey on research integrity. There are at least three reasons to suppose that these estimates are underestimates for the global community of social scientists. One reason is that socially undesirable behaviors such as research misconduct and questionable research practices are underreported in surveys (John, Loewenstein & Prelec, 2012). A second reason is that the response rate to the survey was only 21%. Response is usually selective, and higher among those who have an interest in the study topic (Groves et al., 2006). Among the non-respondents the proportion of researchers who engaged in violations of research integrity is likely to be higher than among respondents (Tourangeau & Yan, 2017). The third reason is that the survey was conducted in the Netherlands. A meta-analysis of studies on integrity violations found that estimates of the prevalence of violations are higher in lower and middle income countries than in high income countries such as the United States and the Netherlands (Xie, Wang, & Kong, 2021). One audit of survey research found that datasets produced outside the US contained more fabricated observations (Judge & Schechter, 2009).

Codes of conduct for scientific researchers such as the guidelines of the United States Office of Research Integrity (ORI; Steneck, 2007), the Netherlands Code of Conduct for Research Integrity (KNAW et al., 2018) and the European Code of Conduct for Research Integrity (ALLEA, 2017) explicitly forbid not only fraud, fabrication and falsification of data, but also mild violations of integrity and questionable research practices. Clearly, the mere existence of a code of conduct is not enough to eradicate bad research. To design more effective quality control procedures, it is important to understand how researchers make decisions in practice.

2. Researcher degrees of freedom facilitate widely different conclusions

When researchers are allowed to keep their workflow private, and only give access to the final results, it is difficult to detect data fabrication, falsification, and questionable research practices. Throughout the empirical research process, researchers have many degrees of freedom (Simmons, Nelson & Simonsohn, 2011): they can limit their samples to specific target groups, have different sampling strategies, use different modes of data collection, ask questions in particular ways, treat missing values in different ways, code variables in more or less fine-grained categories, add or omit covariates, and run different types of statistical tests and models. While some of these decisions are described in publications, many of the choices are not disclosed. Gelman & Loken (2014) compare these choices with a walk in a garden of forking paths. Taking different turns leads researchers to follow different paths and see different things, and they may end up at completely different exits.

Meta research projects of the ‘Many analysts, one dataset’ type, in which many researchers are testing the same hypothesis with the same dataset, demonstrate that researcher degrees of freedom easily lead to entirely different conclusions. In a recent study relying on international survey data, researchers were asked to estimate the association between immigration and public support for government provision of welfare (Breznau et al., 2022). Of the estimates, 25% were significantly negative, 17% were positive, and 58% had a confidence interval including 0. Moreover, the magnitude of the relationships varied strongly, with standardized effect sizes ranging from less than -0.2 to more than +0.2. Even more striking differences between estimates emerge from projects relying on observational data on discrimination (Silberzahn et al., 2018), group processes (Schweinsberg et al., 2021) and financial markets (Menkveld et al., 2021). Documenting all steps taken to obtain the estimates is the only way in which the validity of the estimates can be evaluated.

3. Published research contains questionable research practices

Even though researchers are not required to document all decisions they made in the collection, analysis and reporting of data, divergences from standards of good practice are apparent in the body of published research. Meta research in the past decade has identified many traces of questionable research practices in published research. Here are three indicators.

A first indicator is a suspiciously high rate of success in supporting the hypotheses. Across all social sciences, the proportion of studies supporting the hypotheses has increased in the past decades to unreasonably high levels (Fanelli, 2012). In some subfields of psychology such as applied social psychology, up to 100% of studies support the hypotheses (Schäfer & Schwarz, 2019). The trick is nothing short of magical – it works every time.

A second indicator is the excess of test statistics just above cut-off values for statistical significance. Test statistics tend to be more common when they are just above the critical value of 1.96, which corresponds to a p-value just below .05, in sociology (Gerber & Malhotra, 2008a), political science (Gerber & Malhotra, 2008b), psychology (Simonsohn, Nelson & Simmons, 2014), and economics (Brodeur, Lé, Sangnier, & Zylberberg, 2016). The prevalence of p-hacking is particularly high in online experiments in marketing conducted through MTurk (Brodeur, Cook & Heyes, 2022).

A third indicator is the lack of statistical power to test hypotheses. Studies in psychology (Maxwell, 2004), economics (Ioannidis, Stanley & Doucouliagos, 2017) and political science (Arel-Bundock et al., 2022) tend to be underpowered.

4. Publication bias and questionable research practices reduce the reliability of published research

Besides questionable research practices, a second explanation for the low replicability of published research is publication bias (Friese & Frankenbach, 2020; Smaldino & McElreath, 2016; Grimes, Bauch & Ioannidis, 2018). Negative or null-findings are less likely to be published than positive findings, not only because researchers are less likely to write reports on negative results, but also because they are evaluated by reviewers in a less positive manner (Franco, Malhotra & Simonovits, 2014). Scarcity of resources and perverse effects of incentive systems in academia create a premium for novelty and prolific publication (Smaldino & McElreath, 2016; Brembs, 2019). It is no wonder that researchers engage in questionable research practices to obtain positive results, and to get their research published.

Transparency does not guarantee quality; it enables a fair and independent assessment of quality (Vazire, 2019). Transparency is crucial for the detection of questionable research practices, fraud, fabrication, and plagiarism (Nosek, Spies & Motyl, 2012). Open science practices can improve the reliability of published research (Smaldino, Turner & Contreras Kallens, 2019). Studies with high statistical power, preregistration, and complete methodological transparency are more reliable and replicate well (Protzko et al., 2020). When studies are more replicable with the same methods and new data from the same target population, they are also more generalizable to other populations across time and place (Delios et al., 2022).

5. Closed science facilitates integrity violations

If the rate of questionable research practices is unacceptably high, why does that not change? The reason why research integrity violations continue to be so prevalent is that researchers are allowed to hide them. Compared to other industries, science has a particularly lax system of quality control. Before roads are built and new toys for kids are allowed to be sold there are safety and health checks of the builders, their materials, their construction plans and manufacturing processes, and ultimately the safety of their products. But when we do science, there is much less of this. We ask volunteers to primarily look at the product. If we buy a car at an authorized dealer, there’s a money back guarantee. But reviewers of scientific papers do not even start the engine of the data and code to check whether the thing actually works. That is not good enough.

Researchers are not required to be fully transparent about all the choices they have made. As a result, violations of research integrity are rarely detected in the classical peer review process (Altman, 1994; Smith, 2006, 2010). Without extensive training, peer reviewers are bad at catching mistakes in manuscripts (Schroter et al., 2008). The current peer review system is far from a guarantee of flawless research. If a study is published after it went through peer review, that does not mean it is true. Even at the journals with the highest impact factors, the review process does not successfully keep out bad science (Brembs, 2018).

In order to enhance the reliability of the published record of research, the peer review process needs to change in the direction of more openness and transparency (Smith, 2010; Munafò et al., 2017). In addition, transparency requirements can deter questionable research practices. Violations of research integrity are like crime: when the probability of being detected is high enough, potential perpetrators will not engage in violations. Data from the Netherlands Survey of Research Integrity show that a higher likelihood of being detected by a reviewer or collaborator for data fabrication is associated with a lower likelihood of engaging in questionable research practices (Gopalakrishna et al., 2022a) and a higher likelihood of engaging in responsible research practices such as sharing data and materials (Gopalakrishna et al., 2022b).

6. Roughly half of all published studies do not replicate

With half of researchers admitting that they engage in questionable research practices, it is no surprise that research on the replicability of research published in the social sciences demonstrates that a large proportion of published findings claiming a general regularity in human behavior cannot be replicated with new data by new researchers. This conclusion holds not only for psychology (Open Science Collaboration, 2015), but also for other fields of the social and behavioral sciences, even for publications in the highest ranked journals, such as Nature and Science (Camerer et al., 2018). In psychology, 97% of 100 original studies reported significant effects with a standardized effect size of .40. In independent replications, only 36% produced significant effects, with a standardized effect size of .20 (Open Science Collaboration, 2015). The independent replication project of studies published in Nature and Science found significant effects for 62% of original studies with an effect size of .25, also half of the original effect size (Camerer et al., 2018).

A key difference between original studies and replications, which explains why replications are much less likely to achieve significant results, is that original studies are not pre-registered. Registered reports in psychology are achieving positive results in only 44% of hypothesis tests, while standard reports obtain positive results in 96% of tests (Scheel, Schijen & Lakens, 2021).

The finding that individual studies are likely to present overestimates also implies that meta-analyses of published findings are too positive (Friese & Frankenbach, 2020). The presence of publication bias, data fabrication and falsification and questionable research practices in the body of peer-reviewed publications implies that statistical meta-analysis is fundamentally unfit to estimate the size of an effect based on previous research. The only way to obtain an accurate estimate of a published association is to conduct an independent, preregistered replication (Van Elk et al., 2017). Effect sizes reported in meta-analyses are two to three times as large as those found in independent preregistered replications (Kvarven, Strømland, & Johannesson, 2020).

These findings imply that the reliability of published research is low. For each pair of published studies, only one will replicate in the original direction. For each published study that does replicate, the magnitude of the association is only half of the original. In short, you cannot trust roughly half of all published research. You will have to evaluate the quality of published research yourself. A key indicator is whether it was pre-registered. As a rule of thumb, divide the effect size reported in a study that was not preregistered by two, and the effect size from a meta-analysis by three.
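The rule of thumb can be written down as a small function. This is only a sketch of the heuristic stated above, not an estimator from any of the cited studies; the function name and its arguments are made up for illustration.

```python
def discounted_effect_size(reported, preregistered=False, meta_analysis=False):
    """Discount a reported effect size with the rule of thumb from the text.

    Hypothetical helper: divide by three for a meta-analysis of published
    findings, by two for a single study that was not preregistered, and
    leave a preregistered study as reported.
    """
    if meta_analysis:
        return reported / 3
    if preregistered:
        return reported
    return reported / 2


if __name__ == "__main__":
    # A non-preregistered study reporting d = .40
    print(discounted_effect_size(0.40))
    # A meta-analysis reporting d = .60
    print(discounted_effect_size(0.60, meta_analysis=True))
```

The divisors are coarse averages across many studies, so the result is a rough prior for planning a replication rather than a corrected estimate for any individual finding.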

7. Why ‘data available upon request’ is not enough

One way to detect data fabrication and falsification and questionable research practices is through close inspection of research data. Fabricated and falsified data contain patterns that original research data do not (Judge & Schechter, 2009; Heathers et al., 2018).
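As an illustration of what such a pattern check can look like, here is a minimal sketch of a first-digit test against Benford's Law, the regularity that Judge & Schechter (2009) use for survey data. The function names are hypothetical, and the sketch is not their procedure: it only compares the observed frequencies of leading digits with the Benford frequencies log10(1 + 1/d) using a chi-square statistic.

```python
import math
from collections import Counter

# Critical value of the chi-square distribution with 8 degrees of freedom
# at the .05 level (nine digit categories minus one).
CHI2_CRITICAL_8DF_05 = 15.507


def first_digit(x):
    """Return the leading significant digit of a non-zero number."""
    # Scientific notation puts the leading significant digit first.
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        statistic += (counts.get(d, 0) - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Powers of two are a textbook example of Benford-distributed numbers.
    natural = [2 ** k for k in range(100)]
    # Every leading digit equally often, as an invented "too even" pattern.
    too_even = [d * 10 for d in range(1, 10)] * 20
    print(benford_chi_square(natural) > CHI2_CRITICAL_8DF_05)
    print(benford_chi_square(too_even) > CHI2_CRITICAL_8DF_05)
```

A statistic above the critical value is a reason to take a closer look at the data, not proof of fabrication: many legitimate variables, such as ratings on a bounded scale, do not follow Benford's Law in the first place.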

However, researchers rarely provide access to research data. A recent estimate by Serghiou et al. (2021) showed that only 8% of 27,000 articles from the Social Sciences included in the PubMed database provided access to data. In a smaller sample of 250 publications in Scopus-indexed outlets from the period 2014-2017, Hardwicke et al. (2020) found that 7% provided access to data.

A journal encouragement to share data is not a guarantee that authors actually do share data. In a study among researchers who published in Nature and Science, which both require authors to promise they will give access to research data, still only 40% of psychologists and social scientists complied with a request to access the data (Tedersoo et al., 2021). In a study among economists who had indicated in their publications that data and materials were available upon request, only 44% complied in practice (Krawczyk & Reuben, 2012). Among psychologists who had published in the top journals of the field and promised data and materials would be available upon request, only 26% complied (Wicherts et al., 2006).

Thus, a data sharing policy is ineffective if it is not enforced (Stodden, Seiler, & Ma, 2018; Christensen et al., 2019). In other words, the promise that “data are available upon request” usually means that the data are not made available. A policy relying on such promises is not strict enough to prevent the publication of manuscripts containing results based on questionable research practices and on fabricated and falsified data. Authors should not only be required to share data and code, but a data editor should also verify the computational reproducibility of the data and code.

8. Artificial intelligence can support peer review

Just as artificial intelligence facilitates plagiarism detection, it can also support the peer review process by screening manuscripts for errors and for the presence of information about relevant indicators of research quality. One example of a useful tool is StatCheck, which helps reviewers check the consistency between reported p-values and the test statistics (http://statcheck.io; Nuijten & Polanin, 2020). Another example of a useful tool is the p-curve app, which quickly provides reviewers with relevant information about the evidentiary value of a set of experiments (https://shinyapps.org/apps/p-checker/; see Simonsohn, Nelson & Simmons, 2014a, 2014b).
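The idea behind such a consistency check is simple enough to sketch. The example below is not StatCheck's code or interface; it is a minimal illustration, restricted to two-sided z-tests, of recomputing a p-value from a reported test statistic and comparing it with the reported p-value while allowing for rounding. The function names and the default numbers of decimals are assumptions made for this sketch.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_consistent(z_reported, p_reported, z_decimals=2, p_decimals=3):
    """Check whether a reported p-value can match a reported z statistic.

    Both numbers are assumed to be rounded, so the unrounded z may lie
    anywhere within half a unit of its last reported decimal.
    """
    z_half = 0.5 * 10 ** -z_decimals
    p_half = 0.5 * 10 ** -p_decimals
    # Smallest and largest absolute z compatible with the reported value.
    z_low = max(abs(z_reported) - z_half, 0.0)
    z_high = abs(z_reported) + z_half
    # A larger z gives a smaller p, so the bounds swap.
    p_max = two_sided_p_from_z(z_low)
    p_min = two_sided_p_from_z(z_high)
    return p_min - p_half <= p_reported <= p_max + p_half


if __name__ == "__main__":
    print(is_consistent(1.96, 0.050))  # consistent pair
    print(is_consistent(1.96, 0.010))  # p-value too small for this z
```

A flagged pair does not by itself indicate misconduct; typos and one-sided tests produce mismatches too. The value for reviewers is that the check is cheap and points to results that deserve a second look.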

Advancements in natural language processing have enabled software engineers to build tools that automatically screen full texts of articles and extract information about ethics statements, randomization, sample sizes, sharing of data and code, and other indicators of research quality (Menke et al., 2020; Riedel, Kip & Bobrov, 2020; Serghiou et al., 2021; Zavalis & Ioannidis, 2022). Publishers should create an infrastructure in which new submissions are screened automatically and transparency indicators are reported. While peer review should not be automated altogether, artificial intelligence will certainly help improve peer review (Checco et al., 2021; Schulz et al., 2022).

9. Introducing registered reports will improve the credibility of research

A registered report is a new submission format that effectively eliminates an evaluation of the results of research from the review process (Nosek & Lakens, 2014; Chambers & Tzavella, 2022). Reviewers only evaluate the research design: the hypotheses, data collection and analysis plans. Authors receive feedback and may alter their plans in a revised version. Editors then decide whether to accept the study for publication. Only after authors receive the acceptance letter do they proceed to collect and analyze the data. This format blinds both reviewers and researchers to the results, and increases the likelihood that null-findings and negative results are published. Journals across all fields of the social sciences are introducing registered reports (Hardwicke & Ioannidis, 2018). When this format becomes the standard for academic research publications the reliability of published research will increase.

10. Replications should be encouraged and actively facilitated

One way to encourage replications is to invite authors to submit preregistered replication reports of published research. A preregistration is a document describing the hypotheses, data collection and analysis plans for a study before it is conducted (Nosek, Ebersole, DeHaven & Mellor, 2018). Public preregistrations enable reviewers to check whether authors changed the hypotheses, whether they report on all of them, and how the data analysis reported in manuscripts differs from the original plans. The use of preregistrations is increasing across all fields of the social sciences (Nosek et al., 2022). The combination of preregistrations with a registered report effectively reduces questionable research practices such as altering hypotheses after results are known, hiding negative results, and exploiting researcher degrees of freedom to obtain significant results (Soderberg et al., 2021).


The credibility crisis in the social sciences should lead us to redesign the industry of academic research and publications to raise the bar in quality control procedures. Enforcement of open science practices is the solution. Voluntary accountability mechanisms such as promises to uphold standards of good conduct and symbolic rewards such as badges depend on the intrinsic motivation of researchers. At the same time, tenure and promotion systems as well as award criteria in grant proposal competitions introduce extrinsic incentives that lead researchers to produce a high level of output at the expense of quality. Universities, research funders and journals should redesign reward systems so that prestige depends solely on research quality, not quantity. While such reforms are under way, professional training of reviewers and artificial intelligence can enhance the detection and deterrence of bad research practice.


ALLEA (2017). European Code of Conduct for Research Integrity. https://www.allea.org/wp-content/uploads/2017/05/ALLEA-European-Code-of-Conduct-for-Research-Integrity-2017.pdf

Altman, D. G. (1994). The scandal of poor medical research. British Medical Journal, 308(6924), 283-284. https://doi.org/10.1136/bmj.308.6924.283

Arel-Bundock, V., Briggs, R. C., Doucouliagos, H., Mendoza Aviña, M., & Stanley, T. D. (2022). Quantitative Political Science Research is Greatly Underpowered. I4R Discussion Paper Series, No. 6. http://hdl.handle.net/10419/265531

Bollen, K., Cacioppo, J.T., Kaplan, R.M., Krosnick, J.A., & Olds, J.A. (2015). Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science. Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences. https://nsf.gov/sbe/AC_Materials/SBE_Robust_and_Reliable_Research_Report.pdf

Brembs, B. (2018). Prestigious Science Journals Struggle to Reach Even Average Reliability. Frontiers in Human Neuroscience, 12 (37): 1‐7. https://doi.org/10.3389/fnhum.2018.00037

Brembs, B. (2019). Reliable novelty: New should not trump true. PLoS Biology, 17(2), e3000117. https://doi.org/10.1371/journal.pbio.3000117

Breznau, N., Rinke, E. M., Wuttke, A., Nguyen, H. H., Adem, M., Adriaans, J., … & Van Assche, J. (2022). Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty. Proceedings of the National Academy of Sciences, 119(44), e2203150119. https://doi.org/10.1073/pnas.2203150119

Brodeur, A., Cook, N. & Heyes, A. (2022). We Need to Talk about Mechanical Turk: What 22,989 Hypothesis Tests Tell Us about Publication Bias and p-Hacking in Online Experiments. IZA Discussion Paper No 15478. https://www.econstor.eu/bitstream/10419/265699/1/dp15478.pdf

Brodeur, A., Cook, N. & Neisser, C. (2022). P-Hacking, Data Type and Data-Sharing Policy, IZA Discussion Papers, No. 15586. http://hdl.handle.net/10419/265807

Brodeur, A., Lé, M., Sangnier, M., & Zylberberg, Y. (2016). Star wars: The empirics strike back. American Economic Journal: Applied Economics, 8(1), 1-32. https://doi.org/10.1257/app.20150044

Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T. H., Huber, J., Johannesson, M., … & Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637-644. https://doi.org/10.1038/s41562-018-0399-z

Chambers, C.D., & Tzavella, L. (2022). The past, present, and future of Registered Reports. Nature Human Behaviour, 6: 29-42. https://doi.org/10.1038/s41562-021-01193-7

Checco, A., Bracciale, L., Loreti, P., Pinfield, S., & Bianchi, G. (2021). AI-assisted peer review. Humanities and Social Sciences Communications, 8(1), 1-11. https://doi.org/10.1057/s41599-020-00703-8

Christensen, G., Dafoe, A., Miguel, E., Moore, D.A., & Rose, A.K. (2019). A study of the impact of data sharing on article citations using journal policies as a natural experiment. PLoS ONE 14(12): e0225883. https://doi.org/10.1371/journal.pone.0225883

Delios, A., Clemente, E. G., Wu, T., Tan, H., Wang, Y., Gordon, M., … & Uhlmann, E. L. (2022). Examining the generalizability of research findings from archival data. Proceedings of the National Academy of Sciences, 119(30), e2120377119. https://doi.org/10.1073/pnas.2120377119

Fanelli, D. (2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90(3), 891-904. https://doi.org/10.1007/s11192-011-0494-7

Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502-1505. https://doi.org/10.1126/science.1255484

Friese, M., & Frankenbach, J. (2020). p-Hacking and publication bias interact to distort meta-analytic effect size estimates. Psychological Methods, 25(4), 456. https://doi.org/10.1037/met0000246

Gelman, A., & Loken, E. (2014). The statistical crisis in science data-dependent analysis—a “garden of forking paths”—explains why many statistically significant comparisons don’t hold up. American Scientist, 102(6), 460-465. https://www.jstor.org/stable/43707868

Gerber, A. S., & Malhotra, N. (2008a). Publication bias in empirical sociological research: Do arbitrary significance levels distort published results? Sociological Methods & Research, 37(1), 3-30. https://doi.org/10.1177/0049124108318973

Gerber, A., & Malhotra, N. (2008b). Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science, 3(3), 313-326. http://dx.doi.org/10.1561/100.00008024

Gopalakrishna, G., Ter Riet, G., Vink, G., Stoop, I., Wicherts, J. M., & Bouter, L. M. (2022a). Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands. PloS one, 17(2), e0263023. https://doi.org/10.1371/journal.pone.0263023

Gopalakrishna, G., Wicherts, J. M., Vink, G., Stoop, I., van den Akker, O. R., ter Riet, G., & Bouter, L. M. (2022b). Prevalence of responsible research practices among academics in The Netherlands. F1000Research, 11(471), 471. https://doi.org/10.12688/f1000research.110664.2

Grimes, D. R., Bauch, C. T., & Ioannidis, J. P. (2018). Modelling science trustworthiness under publish or perish pressure. Royal Society Open Science, 5(1), 171511. https://doi.org/10.1098/rsos.171511

Groves, R. M., Couper, M. P., Presser, S., Singer, E., Tourangeau, R., Acosta, G. P., & Nelson, L. (2006). Experiments in producing nonresponse bias. Public Opinion Quarterly, 70(5), 720-736. https://doi.org/10.1093/poq/nfl036

Hardwicke, T.E., & Ioannidis, J.P.A. (2018). Mapping the universe of registered reports. Nature Human Behaviour, 2: 793–796. https://doi.org/10.1038/s41562-018-0444-y

Hardwicke, T.E., Wallach, J.D., Kidwell, M.C., Bendixen, T., Crüwell, S. & Ioannidis, J.P.A. (2020). An empirical assessment of transparency and reproducibility-related research practices in the social sciences (2014–2017). Royal Society Open Science, 7: 190806. https://doi.org/10.1098/rsos.190806

Heathers, J. A., Anaya, J., van der Zee, T., & Brown, N. J. (2018). Recovering data from summary statistics: Sample parameter reconstruction via iterative techniques (SPRITE). PeerJ Preprints, e26968v1. https://doi.org/10.7287/peerj.preprints.26968v1

Ioannidis, J., Stanley, T. D., & Doucouliagos, H. (2017). The Power of Bias in Economics Research. Economic Journal, 127(605): F236-F265. https://doi.org/10.1111/ecoj.12461

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524-532. https://doi.org/10.1177/0956797611430953

Judge, G., & Schechter, L. (2009). Detecting problems in survey data using Benford’s Law. Journal of Human Resources, 44(1), 1-24. https://doi.org/10.3368/jhr.44.1.1

KNAW, NFU, NWO, TO2‐federatie, Vereniging Hogescholen & VSNU (2018). Netherlands Code of Conduct for Research Integrity. http://www.vsnu.nl/files/documents/Netherlands%20Code%20of%20Conduct%20for%20Research%20Integrity%202018.pdf

Krawczyk, M. & Reuben, E. (2012). (Un)Available upon Request: Field Experiment on Researchers’ Willingness to Share Supplementary Materials. Accountability in Research, 19: 175–186. https://doi.org/10.1080/08989621.2012.678688

Kvarven, A., Strømland, E., & Johannesson, M. (2020). Comparing meta-analyses and preregistered multiple-laboratory replication projects. Nature Human Behaviour, 4(4), 423-434. https://doi.org/10.1038/s41562-019-0787-z

Lakomý, M., Hlavová, R. & Machackova, H. (2019). Open Science and the Science-Society Relationship. Society, 56, 246–255. https://doi.org/10.1007/s12115-019-00361-w

Maxwell, S. E. (2004). The Persistence of Underpowered Studies in Psychological Research: Causes, Consequences, and Remedies. Psychological Methods, 9(2), 147–163. https://doi.org/10.1037/1082-989X.9.2.147

Menke, J., Roelandse, M., Ozyurt, B., Martone, M., & Bandrowski, A. (2020). The rigor and transparency index quality metric for assessing biological and medical science methods. iScience, 23(11): 101698. https://doi.org/10.1016/j.isci.2020.101698

Menkveld, A. J., Dreber, A., Holzmeister, F., Huber, J., Johannesson, M., Kirchler, M., … & Weitzel, U. (2021). Non-standard errors. Working paper. https://dx.doi.org/10.2139/ssrn.3961574

Munafò, M. R., Nosek, B. A., Bishop, D. V., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J. & Ioannidis, J. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1(1), 1-9. https://doi.org/10.1038/s41562-016-0021

Nosek, B.A., Ebersole, C.R., DeHaven, A.C. & Mellor, D.T. (2018). The Preregistration Revolution. Proceedings of the National Academy of Sciences, 115(11): 2600-2606. http://www.pnas.org/cgi/doi/10.1073/pnas.1708274114

Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., … & Vazire, S. (2022). Replicability, robustness, and reproducibility in psychological science. Annual Review of Psychology, 73, 719-748. https://doi.org/10.1146/annurev-psych-020821-114157

Nosek, B. A., & Lakens, D. (2014). Registered reports: A method to increase the credibility of published results. Social Psychology, 45(3), 137–141. https://doi.org/10.1027/1864-9335/a000192

Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science, 7(6), 615–631. https://doi.org/10.1177/1745691612459058

Nuijten, M.B. & Polanin, J.R. (2020). “statcheck”: Automatically detect statistical reporting inconsistencies to increase reproducibility of meta‐analyses. Research Synthesis Methods, 11(5): 574–579. https://doi.org/10.1002/jrsm.1408

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). https://doi.org/10.1126/science.aac4716

Protzko, J., Krosnick, J., Nelson, L. D., Nosek, B. A., Axt, J., Berent, M., … & Schooler, J. (2020). High replicability of newly-discovered social-behavioral findings is achievable. PsyArxiv preprint. https://psyarxiv.com/n2a9x/

Riedel, N., Kip, M., & Bobrov, E. (2020). ODDPub—a text‑mining algorithm to detect data sharing in biomedical publications. Data Science Journal, 19:42. http://doi.org/10.5334/dsj-2020-042 

Schäfer, T. & Schwarz, M.A. (2019). The Meaningfulness of Effect Sizes in Psychological Research: Differences Between Sub-Disciplines and the Impact of Potential Biases. Frontiers in Psychology, 10:813. https://doi.org/10.3389/fpsyg.2019.00813

Scheel, A. M., Schijen, M. R., & Lakens, D. (2021). An excess of positive results: Comparing the standard Psychology literature with Registered Reports. Advances in Methods and Practices in Psychological Science, 4(2). https://doi.org/10.1177/25152459211007467  

Schroter, S., Black, N., Evans, S., Godlee, F., Osorio, L., & Smith, R. (2008). What errors do peer reviewers detect, and does training improve their ability to detect them? Journal of the Royal Society of Medicine, 101(10), 507-514. https://doi.org/10.1258/jrsm.2008.080062

Schulz, R., Barnett, A., Bernard, R., Brown, N. J., Byrne, J. A., Eckmann, P., … & Weissgerber, T. L. (2022). Is the future of peer review automated? BMC Research Notes, 15(1), 1-5. https://doi.org/10.1186/s13104-022-06080-6

Schweinsberg, M., Feldman, M., Staub, N., van den Akker, O. R., van Aert, R. C., Van Assen, M. A., … & Schulte-Mecklenbeck, M. (2021). Same data, different conclusions: Radical dispersion in empirical results when independent analysts operationalize and test the same hypothesis. Organizational Behavior and Human Decision Processes, 165, 228-249. https://doi.org/10.1016/j.obhdp.2021.02.003

Serghiou, S., Contopoulos-Ioannidis, D.G., Boyack, K.W., Riedel, N., Wallach, J.D., & Ioannidis, J.P.A. (2021). Assessment of transparency indicators across the biomedical literature: How open is open? PLoS Biology, 19(3): e3001107. https://doi.org/10.1371/journal.pbio.3001107

Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., … & Nosek, B. A. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1(3), 337-356. https://doi.org/10.1177/2515245917747646

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359-1366. https://doi.org/10.1177/0956797611417632

Simonsohn, U., Nelson, L.D. & Simmons, J.P. (2014a). P-Curve: A Key to the File Drawer. Journal of Experimental Psychology: General, 143 (2): 534–547. https://doi.org/10.1037/a0033242

Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014b). p-curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9(6), 666-681. https://doi.org/10.1177/1745691614553988

Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. https://doi.org/10.1098/rsos.160384

Smaldino, P. E., Turner, M. A., & Contreras Kallens, P. A. (2019). Open science and modified funding lotteries can impede the natural selection of bad science. Royal Society Open Science, 6(7), 190194. https://doi.org/10.1098/rsos.190194

Smith, R. (2006). Peer review: a flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99: 178–182. https://dx.doi.org/10.1177/014107680609900414

Smith, R. (2010). Classical peer review: an empty gun. Breast Cancer Research, 12 (Suppl 4), S13. https://doi.org/10.1186/bcr2742

Soderberg, C. K., Errington, T. M., Schiavone, S. R., Bottesini, J., Thorn, F. S., Vazire, S., … & Nosek, B. A. (2021). Initial evidence of research quality of registered reports compared with the standard publishing model. Nature Human Behaviour, 5(8), 990-997. https://doi.org/10.1038/s41562-021-01142-4

Song, H., Markowitz, D. M., & Taylor, S. H. (2022). Trusting on the shoulders of open giants? Open science increases trust in science for the public and academics. Journal of Communication, 72(4), 497-510. https://doi.org/10.1093/joc/jqac017

Steneck, N. H. (2007). Introduction to the Responsible Conduct of Research. Washington, DC: Department of Health and Human Services, Office of Research Integrity. https://ori.hhs.gov/sites/default/files/2018-04/rcrintro.pdf

Stodden, V., Seiler, J. & Ma, Z. (2018). An empirical analysis of journal policy effectiveness for computational reproducibility. Proceedings of the National Academy of Sciences, 115(11): 2584-2589. https://doi.org/10.1073/pnas.1708290115

Tedersoo, L., Küngas, R., Oras, E., Köster, K., Eenmaa, H., Leijen, Ä., Pedaste, M., Raju, M., Astapova, A., Lukner, H., Kogermann, K. & Sepp, T. (2021). Data sharing practices and data availability upon request differ across scientific disciplines. Scientific Data, 8(1), 1-11. https://doi.org/10.1038/s41597-021-00981-0

Tourangeau, R., & Yan, T. (2007). Sensitive questions in surveys. Psychological Bulletin, 133(5), 859-883. https://doi.org/10.1037/0033-2909.133.5.859

Trisovic, A., Lau, M.K., Pasquier, T. & Crosas, M. (2022). A large-scale study on research code quality and execution. Scientific Data, 9, 60. https://doi.org/10.1038/s41597-022-01143-6

Tsai, A. C., Kohrt, B. A., Matthews, L. T., Betancourt, T. S., Lee, J. K., Papachristos, A. V., … & Dworkin, S. L. (2016). Promises and pitfalls of data sharing in qualitative research. Social Science & Medicine, 169, 191-198. https://doi.org/10.1016/j.socscimed.2016.08.004

Van Elk, M., Matzke, D., Gronau, Q., Guang, M., Vandekerckhove, J., & Wagenmakers, E. J. (2015). Meta-analyses are no substitute for registered replications: A skeptical perspective on religious priming. Frontiers in Psychology, 1365. https://doi.org/10.3389/fpsyg.2015.01365

Vazire, S. (2019, March 14). The Credibility Revolution in Psychological Science. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2404

Vlaeminck, S. (2021). Dawning of a New Age? Economics Journals’ Data Policies on the Test Bench. LIBER Quarterly, 31 (1): 1–29. https://doi.org/10.53377/lq.10940

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. https://doi.org/10.1037/0003-066X.61.7.726

Xie, Y., Wang, K., & Kong, Y. (2021). Prevalence of research misconduct and questionable research practices: a systematic review and meta-analysis. Science and Engineering Ethics, 27(4), 1-28. https://doi.org/10.1007/s11948-021-00314-9

Zavalis, E.A., & Ioannidis, J.P.A. (2022). A meta-epidemiological assessment of transparency indicators of infectious disease models. PLoS ONE, 17(10): e0275380. https://doi.org/10.1371/journal.pone.0275380


Governance instruments to promote open science

Which governance instruments can we employ to promote open science? In this post I discuss governance instruments that can encourage researchers to engage in two important practices of open science: documenting and sharing research data. The practice of documenting data makes science more transparent. Sharing data makes science more open. Documenting data is a necessary first step towards more open science.

Open Science cultural change pyramid and corresponding governance instruments, Source: Nosek (2019) – governance instruments added by Bekkers

A range of infrastructural reforms can promote open science practices, as displayed in the figure above. The left part of the figure displays a pyramid of cultural change created by the Center for Open Science (Nosek, 2019). For each layer of the pyramid, the right part of the figure lists governance instruments that researchers, universities, funders, and other stakeholders can employ to promote culture change towards open science.

Infrastructure. The foundation of the pyramid is the infrastructure for open science: servers and online repositories where researchers can publish preprints, research materials, data, and code. Platforms such as the Open Science Framework and Zenodo provide such an infrastructure. The availability of this infrastructure makes open science possible, but does not ensure that researchers actually engage in open science.

User experience. Therefore, the second layer is also important: a user-friendly interface that makes it easy for researchers to use the infrastructure. New locally provided infrastructure should be tested by users before it is implemented, and adapted to enable a better user experience.

Communities. Once the infrastructure and interface enable a positive user experience, communities can provide guidelines for their members. The guidelines include norms that researchers in different communities should follow. User communities can also provide examples of good practices, and praise users who set good examples. One way to do that is by symbolically rewarding good practices with prizes and badges indicating that publications conform to guidelines for open science. Badges alone, however, are not a strong enough incentive to motivate most researchers. In addition to providing symbolic rewards, universities and research communities can train researchers in how to meet guidelines for open science.

Incentives. When a normative framework specifies which practices should be encouraged and which should be discouraged, incentives can be created to formally reward engaging in open science. Providing public access to data or software does not yet yield the same benefits in terms of citations as publishing research articles. Researchers will publish data and software more often if they receive credit for doing so in the same way as they receive credit for publishing research articles. Universities can also incentivize open science practices by rewarding them in tenure and promotion decisions.

Policy. Finally, data documentation and data sharing can be imposed and monitored by employers, publishers, and funders. Public funders of research such as the European Research Council have imposed requirements for grantees that incentivize open science. Federally funded research in the US will also come with such requirements. Researchers have to submit data management plans in order to receive funding, and publish articles with an open access license. An increasing number of academic journals in the social sciences require authors to provide access to data and code in a replication package, and verify whether the results reported in manuscripts actually correspond to the results produced by the data and code. In addition, research groups can organize audits verifying whether data and code are available, and actually produce the results reported in manuscripts.
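To make the idea of verifying results concrete, here is a minimal sketch in Python of the comparison step in such an audit. It assumes that the estimates reported in a manuscript and the estimates reproduced from the replication package have already been collected by hand into two dictionaries. The function name, the labels, and the tolerance are illustrative choices, not part of any journal's actual procedure.

```python
import math


def audit_results(reported, reproduced, rel_tol=1e-3):
    """Compare reported estimates with values reproduced from data and code.

    `reported` and `reproduced` map an estimate's label to a number
    (a hypothetical layout). Returns a dict mapping each reported label
    to "match", "mismatch", or "missing".
    """
    outcome = {}
    for name, value in reported.items():
        if name not in reproduced:
            # The replication package did not produce this estimate at all.
            outcome[name] = "missing"
        elif math.isclose(value, reproduced[name], rel_tol=rel_tol):
            # Equal up to a small relative tolerance, to allow for rounding.
            outcome[name] = "match"
        else:
            outcome[name] = "mismatch"
    return outcome


# Made-up numbers, for illustration only.
reported = {"b_age": 0.250, "b_income": 1.10, "n": 500}
reproduced = {"b_age": 0.2501, "b_income": 1.30}
print(audit_results(reported, reproduced))
```

The tolerance matters: rounding in a manuscript should not count as a discrepancy, so a real audit would set it according to the number of decimals reported.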


Open call for contributors: Transparency in Nonprofit Research

How do publications in nonprofit and philanthropic studies describe the type of data they use? What are the characteristics of samples and measurement instruments used in research? Which research designs and methods are used in the analysis of data? To what extent do publications describe the generalizability and validity of observations? Which criteria do publications use to support the methodological quality of research in nonprofit and philanthropic studies?

Research in nonprofit and philanthropic studies uses a variety of data and methods. In order to evaluate their quality, it is imperative that the data and methods are clearly described. In this meta science project, we provide an assessment of the transparency in research products sampled randomly from the Knowledge Infrastructure for Nonprofit and Philanthropic Studies (KINPS).

The assessment starts with the current state of research. Retrospective assessments include research from the past 50 years, working backwards from 2022, at ten year intervals such that the state of research of today can be compared with representative samples from 2012, 2002, 1992, 1982 and 1972. The goal of the project is to identify weaknesses in the quality of research in nonprofit and philanthropic studies, and provide suggestions for how to improve research quality.
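As an illustration of this sampling design, the sketch below draws a random sample of publications for each of the six sampling years. It is only a sketch under assumptions: the record layout (a list of dictionaries with a "year" key), the function name, and the sample size per year are hypothetical and do not describe the actual KINPS database or the project's procedure.

```python
import random

# Sampling years from the text: 2022 back to 1972 in ten-year steps.
SAMPLE_YEARS = list(range(2022, 1971, -10))


def sample_by_year(records, per_year, seed=2022):
    """Draw up to `per_year` records at random for each sampling year.

    `records` is a list of dicts with a "year" key (a hypothetical layout).
    Returns a dict mapping each sampling year to its sampled records.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = {}
    for year in SAMPLE_YEARS:
        pool = [r for r in records if r["year"] == year]
        k = min(per_year, len(pool))  # early years may have few publications
        sample[year] = rng.sample(pool, k)
    return sample
```

Fixing the seed and documenting it alongside the sample is in line with the open approach of the project: anyone can redraw the same sample from the same database.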

Ideas for the current meta-science project are outlined here: https://osf.io/hvw73. This project is supported by the Revolutionizing Philanthropy Research (RPR) initiative. See https://osf.io/46e8x/ for the foundational ideas. Previous projects of the RPR consortium include the Knowledge Infrastructure for Nonprofit and Philanthropic Studies (KINPS), https://osf.io/g9d8u/. You can access the database through https://public.tableau.com/app/profile/ji.ma/viz/KINPS/Story1.

Would you like to contribute to this project? If you care about the quality of data and methods in our field, are interested in its development over time, or want to be part of it in the future, we need your help. We could use assistance in going through a sample of publications, building a database, and analyzing it. First results will be presented at the upcoming ARNOVA Conference in November 2022. We strongly believe in an open science approach. Therefore, this will be an open project, to which all members of the nonprofit research community can contribute. We are open to any suggestions you may have. Anyone with basic skills can participate, and become a co-author on the paper we will write together. If you have advanced skills in programming, data analysis, visualization, or writing, you can assume more responsibility. If you’re interested, please describe your interest and potential contribution in an email message to René Bekkers, r.bekkers@vu.nl. Visit https://osf.io/p5kxr/ to learn about the current state of the project.

The first meeting of the project team was on Thursday September 29. The slides are here.

A second meeting will take place on Monday, October 10, at 10 PM CET | 11 AM HDT | 3 PM CDT | 6 AM AEST. Please write an email to obtain access details for the webinar.
