
Statistics in research


Depending on your discipline, statistical analysis can be crucial to substantiate your results. The problem many researchers face is that they are trained to be researchers, not statisticians, and statistics is a whole field of its own. With this in mind, researchers have to be critical with regard to the statistical analyses they perform and consult experts wherever necessary. It is much like working with advanced technical equipment: you consult someone with expertise in that equipment, but you still have to understand how it works and what information can and cannot be drawn from it.

Good academic practices on statistical analysis

The way the data are analysed can severely impact the conclusion and validity of your research. As such, researchers should aim to have a sound statistical analysis strategy even before collecting the first data. Some good practices:

  • Everything starts with a clearly defined research question. Set up your study and plan the statistical analysis according to that question: how many observations will you need, which variables will you consider, and which statistical tests are relevant?
  • Take courses in the field of data analysis. Although this will not make you an expert statistician, it gives you a much better idea of what the different techniques are capable of and which pitfalls you should be aware of.
  • Involve statisticians across all stages of a research project, and foresee money for statistical assistance in the budget calculation of your project.
  • Be aware of the limitations of different statistical tests. Be critical and do not use a test just because this is how another researcher did it or because it is the standard method in your research group.
  • Be critical of your data. A statistically significant result does not automatically mean that the result is robust or relevant. Are there reasons to explain the result?
  • Exploring different statistical tests is not in itself a problem, as long as the tests are suited to the intended purpose and are not used merely to ‘choose’ the best results. Analyses should account for the number of tests carried out (see the sketch after this list).
  • Similarly, omission of data might be acceptable, but the reasons for this should be acknowledged and valid.
Source: Best practices for statistical data analysis
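
Where the last point applies, a multiple-testing correction is one concrete way to account for the number of tests carried out. Below is a minimal, hypothetical sketch in Python (assuming statsmodels is installed; the p-values are invented for illustration):

```python
# Minimal sketch: correcting a set of p-values for multiple testing.
# The p-values below are invented for illustration only.
from statsmodels.stats.multitest import multipletests

p_values = [0.003, 0.021, 0.048, 0.11, 0.74]  # hypothetical results of five tests

# Holm's step-down procedure controls the family-wise error rate at 5%
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")

for raw, adj, sig in zip(p_values, p_adjusted, reject):
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.3f} "
          f"({'significant' if sig else 'not significant'})")
```

Note that with the Holm correction, a raw p-value of 0.048 that looks ‘significant’ on its own is no longer significant once the other four tests are taken into account.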

Flames, Flanders’ interuniversity training network for statistics and methodology, is rooted in the five Flemish universities.


A few other methods covered elsewhere in this module, such as preregistration and registered reports, can also make your analysis more robust.

Who is involved?

Junior Researcher - PhD student

Junior researchers/PhD researchers will in most cases be responsible for the practical aspects such as data collection and the statistical analysis. Training in statistical data analysis is part of your education as a researcher.

Senior Researcher

Statistics is something many researchers are not familiar with. As such, the more experienced researchers should provide guidance to the junior ones, ensuring the statistical analysis is scientifically sound and/or referring them to experts for help if necessary.

(Co-) Author

(Co-)Authors have to be sufficiently critical to ensure that the statistical analysis is robust.

Journals - Publishers

Journals should make sure the submitted work provides sufficient information to allow readers to interpret the data and assess its significance.


Journals don’t always provide a dedicated section to indicate who is responsible for the statistical analysis. You can use the CRediT contributor roles to credit the statisticians involved in your paper in the best possible way.


The perception exists that results with a p-value below 0.05 have more value and attract more attention within science. As such, researchers may have an incentive to seek out statistically significant results. This slippery slope can be summarised as p-hacking: researchers try out several analyses and/or data-eligibility specifications and then selectively report those that produce significant results. This increases the possibility of false positive associations (arising merely by chance). Practices that can result in p-hacking include:

  • Conducting analyses midway through experiments and terminating data collection prematurely because a significant p-value has been obtained
  • Adjusting the sample size in the hope that the difference becomes significant
  • Limiting the analysis to a subset of the data
  • Removing outliers without a good reason
  • Exploring different statistical tests and only continuing with the ones that meet the researcher’s personal beliefs or support the hypothesis
  • Looking, when analysing the data, primarily for data that confirm the researcher’s hypotheses or personal experience, while overlooking data inconsistent with those beliefs

P-hacking is a problem because it influences the data collection process and/or the statistical analysis, which in turn can lead to inflation bias: the effect sizes reported in the literature do not correspond with the (experimental) observations. This can ultimately have a severe impact on the robustness of the data and the reproducibility/replicability of the results.
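
The inflation of false positives caused by peeking at the data and stopping early can be made concrete with a small simulation. The following hypothetical Python sketch (using NumPy and SciPy) draws both groups from the same distribution, so every ‘significant’ result is a false positive, and compares a single pre-planned test with repeated testing during data collection:

```python
# Hypothetical simulation: optional stopping inflates the false positive rate.
# Both groups are drawn from the same distribution, so any "significant"
# difference is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 5000
alpha = 0.05

fixed_hits = 0    # analyse once, at the planned sample size
peeking_hits = 0  # test after every batch and stop at the first p < .05

for _ in range(n_experiments):
    a = rng.normal(size=100)
    b = rng.normal(size=100)

    # Pre-planned analysis: one test on the full sample
    if stats.ttest_ind(a, b).pvalue < alpha:
        fixed_hits += 1

    # 'Peeking': test after 20, 40, ..., 100 observations per group
    for n in range(20, 101, 20):
        if stats.ttest_ind(a[:n], b[:n]).pvalue < alpha:
            peeking_hits += 1
            break

print(f"False positive rate, fixed sample size: {fixed_hits / n_experiments:.3f}")
print(f"False positive rate, with peeking:      {peeking_hits / n_experiments:.3f}")
```

With peeking, the false positive rate rises well above the nominal 5%, even though nothing about the data has changed.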

Tip:

You can use statcheck to check for inconsistencies in reported p-values at the paper level.
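
statcheck itself is an R package and web tool, but the consistency check it automates is easy to illustrate. A hypothetical Python sketch: recompute the p-value from a reported test statistic and its degrees of freedom, and compare it with the p-value stated in the paper:

```python
# Hypothetical illustration of the kind of check statcheck automates:
# does the reported p-value match the reported test statistic?
from scipy import stats

# Invented example of a reported result: t(28) = 2.10, p = .02
t_value, df, reported_p = 2.10, 28, 0.02

# Two-sided p-value implied by the t statistic and its degrees of freedom
recomputed_p = 2 * stats.t.sf(abs(t_value), df)

print(f"recomputed p = {recomputed_p:.3f}, reported p = {reported_p:.2f}")
if round(recomputed_p, 2) != reported_p:
    print("Inconsistent: the reported p-value does not match the test statistic.")
```

Here the t statistic implies p ≈ 0.04, so the reported p = .02 would be flagged as inconsistent.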

[Cartoon by xkcd, under a Creative Commons Attribution – NonCommercial 2.5 license]