Reproducibility and replicability of research
ALLEA Code:
According to the ALLEA Code, researchers should design, carry out, analyse and document research in a careful and well-considered manner. In addition, researchers should aim to report their results in a way that is compatible with the standards of the discipline and, where applicable, can be verified and reproduced.
Based on the definition proposed by the US National Academies of Sciences, Engineering, and Medicine (Reproducibility and Replicability in Science (2019), p. 6), “reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis. This is different from replicability which is obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data.” Please note that within the literature, reproducibility and replicability are often used interchangeably. While they are strictly speaking not synonyms, a failure to reproduce and a failure to replicate both point to the same underlying problem, namely inconsistent results.
During the last decade, concerns have been raised regarding the lack of reproducibility and replicability of research findings in many disciplines, including the social sciences (Camerer, 2018), psychology (Open Science Collaboration, 2015) and the biomedical sciences (Pusztai, 2013). This has called the reliability of science into question (Nature, 2016) and is often referred to as the ‘reproducibility crisis’.
Why is a study not reproducible or replicable?
TED-Ed Animation Is there a reproducibility crisis in science? by Matt Anticole (licensed under CC BY-NC-ND 4.0 International). Please note that within the video the concepts of reproducing, replicating and repeating research results are sometimes used as synonyms.
There are many different reasons why a study might not be reproducible, including a faulty experimental design, variables that are not taken into account, incorrect statistical analysis, biased reporting that focuses on the desired effects, lack of access to all information necessary to reproduce the work, or conditions that cannot be recreated. Within the biomedical sciences, failures to replicate research have sometimes been explained by the use of the wrong reagents, including incorrectly labelled mice or cell lines (Kafkafi et al., 2018; Eckers et al., 2018).
Failing to reproduce or replicate findings is an inherent feature of science. Therefore, a failure to reproduce or replicate a study does not necessarily mean that the study is faulty and cannot be trusted. In some research fields the interpretation of the data, e.g. of historical sources, does not necessarily require reproducibility, although it might still be desirable. However, in other research fields, researchers have to take sufficient measures to make the research reproducible/replicable in order to be able to draw conclusions. Potential measures can include:
- clear standard operating protocols
- detailed logging of all aspects of the research throughout the project
- detailed logging of all metadata (see the sketch after this list)
- …
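To make the logging measures above concrete, here is a minimal sketch, assuming a Python-based analysis; the file name, parameters and helper function are hypothetical examples, not a prescribed standard. It records the random seed, software versions and analysis settings of a run alongside the results, so that the run can be repeated later:

```python
import json
import platform
import random
import sys
from datetime import datetime, timezone

def log_run_metadata(path, parameters, seed):
    """Write the settings needed to rerun this analysis to a JSON metadata file."""
    metadata = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version,
        "platform": platform.platform(),
        "random_seed": seed,
        "parameters": parameters,
    }
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)

# Example: fix the random seed and record it together with the analysis settings.
seed = 12345
random.seed(seed)
log_run_metadata("run_metadata.json", {"sample_size": 200, "alpha": 0.05}, seed)
```

Electronic lab notebooks and workflow tools can automate this kind of record-keeping, but even a small script like this makes it considerably easier for others to rerun an analysis.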
In addition, sufficient details regarding the research methodology and analysis have to be made available to those not directly involved in the research.
The Reproducibility Manifesto provides some concrete actions that can help make your research more reproducible.
When to think about this?
In order to make research reproducible/replicable, transparency is needed across all stages of research: starting at the conceptualization stage (what is the research protocol?) and the data collection/analysis stages (which data are shown, which are not, and why?), and continuing into the publication of the work (does the manuscript provide sufficient details regarding the methodology and analysis protocol used, and are the data available?).
Available tools for reproducible and replicable research
As mentioned before, p-hacking and HARKing significantly reduce the potential to replicate research findings. As such, the use of preregistration and registered reports will have a very positive effect on the replicability of research.
Moreover, to increase the reproducibility and replicability of research, reporting guidelines and checklists have been drawn up for several disciplines. These usually “specify a minimum set of items required for a clear and transparent account of what was done and what was found in a research study, reflecting, in particular, issues that might introduce bias into the research” (adapted from the blog “It’s a kind of magic: how to improve adherence to reporting guidelines”, Marshall, D. & Shanahan, D. (2016, February 12)).
Some examples include:
- The ARRIVE (Animal Research: Reporting of in Vivo Experiments) guidelines intended to improve the reporting of research using animals.
- The Materials Design Analysis Reporting (MDAR) Checklist for Authors that is applicable to studies in the life sciences.
- The ICMJE Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals.
- Within psychology, reporting standards for quantitative research have been published by the American Psychological Association.
- Inspiration for qualitative research can be drawn from the SRQR and COREQ standards.
- Researchers in the humanities or empirical social sciences can look into the standards published by the American Educational Research Association (AERA).
- Resources for Reproducible Research by Reproducibility for Everyone
Finally, within certain fields, specific guidance and tools may exist to increase the reproducibility of the analyses (focusing on the code and software used).
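As a minimal, generic illustration of what such tooling can cover (the file name and approach are only an example, not a field-specific standard), one common practice is to record the exact versions of the software packages used for an analysis so that others can recreate the computational environment:

```python
from importlib.metadata import distributions

# Write the exact version of every installed Python package to a file,
# so the analysis environment can later be recreated
# (for example with `pip install -r requirements-frozen.txt`).
with open("requirements-frozen.txt", "w") as f:
    for dist in sorted(distributions(), key=lambda d: (d.metadata["Name"] or "").lower()):
        name = dist.metadata["Name"]
        if name:  # skip packages without readable metadata
            f.write(f"{name}=={dist.version}\n")
```

Container images and workflow management systems provide more thorough ways to achieve the same goal, but even a simple version listing like this greatly increases the chance that an analysis can be rerun later.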