Research Data Management (RDM)
What are ‘research data’?
Research data are all digital or physical data – regardless of the manner in which these data are collected or stored – used or analysed to support research findings, validate research results or support scientific reasoning, discussion or calculation in the study. Research data cover the entire spectrum from raw data to processed and analysed data included or discussed in a publication. These data can be generated, derived or composite data, and can be self-generated or provided by third parties. Some examples are: survey results, statistical data, graphics, computer-generated data, simulations, software developed for research purposes, computational metadata, prints, video and audio tapes, a corpus, organisms, gene sequences, synthetic or chemical compounds, samples of any kind, patient records, protocols, measurements, notebooks, and so on.
Research data form the beating heart of academic research, and are the engine of enormous progress in technology, healthcare and on a socio-economic level.
Research Data Management (RDM)
Although researchers continuously work with data to test their research hypotheses, these data are often not processed and stored carefully during and after the project. At the same time, our society is becoming increasingly datafied, with more and more processes and actions based on digital data. This also enables a specific type of data-driven research that combines (big) data with new analysis techniques (i.e. “data science”).
This extensive use and potential reuse of data makes solid management of research data very important to the research community: the good handling of data, known as research data management (RDM), should be one of the cornerstones of good academic research practice.
RDM includes all steps before, during and after the project, i.e. “the research data lifecycle”: data planning, collection, processing, analysis, security, storage, preservation, access, sharing and reuse. All these steps are bound by conditions and regulations at legal, ethical and technological levels.
The importance of research data management
Prudent and thoughtful RDM leads to better quality and integrity of research, greater research impact, and greater visibility and reuse of research data. Good data handling can prevent data loss or corruption, fraud and bad science; it makes your research process run more smoothly and ensures that you can find and reuse data later. In addition, it facilitates the sharing and reuse of data by future researchers, so that new research can be developed both inside and outside the university.
Imagine that you have read a scientific paper and would like to know how the algorithm the authors used works ‘under the hood’, or how the authors constructed a certain plot. How easy would it be if you could just click on a link and download that information? Imagine that you have come up with a brilliant new way of analysing the data that underpin a certain study, but the original data are nowhere to be found. How much time would you lose collecting new data? Or imagine that you were actually able to get your hands on the original data, but on close scrutiny the data are utterly incomprehensible: the measurement unit of variable B is impossible to decode, and so on. How disappointed would you be after the initial effort of obtaining the data? Indeed, it’s such a hassle.
Every researcher has a responsibility to contribute to a new world where the research data that underlie research findings are easily findable and interpretable. Hence the need for proper research data management: researchers are increasingly required to put a thorough data management approach into practice throughout the research process. These fairly recent RDM requirements are often formalised in the policies of research institutions, funders (e.g. the requirement to develop a data management plan) and journals (e.g. the inclusion of a data availability statement in the article). At first glance, these requirements might seem difficult to comply with for researchers who are not yet familiar with them. When you look further, however, you will notice that RDM helps you during your research, and that the requirements also serve the purpose of improving the way we do science by making scientific results more transparent and reproducible.
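The last scenario, data whose variables cannot be decoded, is the kind of problem a small data dictionary stored next to the dataset can help prevent. The sketch below is only an illustration: the file contents, the column names, the descriptions, the units and the helper function `undocumented_columns` are all hypothetical and not part of any prescribed RDM tool. It checks that every column of a CSV file has a documented entry.

```python
import csv
import io

# Hypothetical measurements file; the column names are illustrative only.
DATA_CSV = """sample_id,variable_a,variable_b
s1,0.82,21.5
s2,0.79,22.1
"""

# Hypothetical data dictionary: one entry per column, with a description
# and a measurement unit (None where a unit does not apply).
DATA_DICTIONARY = {
    "sample_id": {"description": "Sample identifier", "unit": None},
    "variable_a": {"description": "Absorbance at 600 nm", "unit": "dimensionless"},
    "variable_b": {"description": "Water temperature", "unit": "degC"},
}


def undocumented_columns(csv_text, dictionary):
    """Return the CSV columns that have no entry in the data dictionary."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in header if name not in dictionary]


missing = undocumented_columns(DATA_CSV, DATA_DICTIONARY)
print(missing)
```

An empty list means every column is described. In practice the dictionary would typically live in its own file (for example a README or codebook) that is archived together with the data.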
Retraction of an article because access to the underlying data was not granted.
Case: Lancet, NEJM retract controversial COVID-19 studies based on Surgisphere data – Retraction Watch: “Two days after issuing expressions of concern about controversial papers on Covid-19, The Lancet and the New England Journal of Medicine have retracted the controversial articles on Covid-19 because a number of the authors were not granted access to the underlying data […]”.
“[…] Because all the authors were not granted access to the raw data and the raw data could not be made available to a third-party auditor, we are unable to validate the primary data sources underlying our article, “Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19.” We therefore request that the article be retracted. We apologize to the editors and to readers of the journal for the difficulties that this has caused.”
Who is involved?
The researcher develops a data management plan for their research that describes how research data will be collected, organised, documented, stored, used and looked after throughout the research lifecycle; implements good data management practices; and ensures that the data remain available in the long term.
The supervisor supports and advises the researcher on data management practices and ethical, legal and contractual responsibilities.
The institution/university provides the tools, infrastructure and policies for the researcher to implement good data management practices.
Data stewards and research support staff provide advice, support and training on data management (planning) to the researcher.
The ALLEA Code also strongly confirms the importance of research data management and includes the following:
- Researchers, research institutions, and organisations ensure appropriate stewardship, curation, and preservation of all data, metadata, protocols, code, software, and other research materials for a reasonable and clearly stated period.
- Researchers, research institutions, and organisations ensure that access to data is as open as possible, as closed as necessary, and where appropriate in line with the FAIR principles (Findable, Accessible, Interoperable and Reusable) for data management.
- Researchers, research institutions, and organisations are transparent about how to access and gain permission to use data, metadata, protocols, code, software, and other research materials.
- Researchers inform research participants about how their data will be used, reused, accessed, stored, and deleted, in compliance with GDPR.
- Researchers, research institutions, and organisations acknowledge data, metadata, protocols, code, software, and other research materials as legitimate and citable products of research.
- Researchers, research institutions, and organisations ensure that any contracts or agreements relating to research results include equitable and fair provisions for the management of their use, ownership, and protection under intellectual property rights.