Category: Good Academic Practices
Citation and referencing
Learning how to write in an academic way is a skill. It requires knowledge of some basic rules and a lot of practice. This doesn’t happen overnight.
The ALLEA Code lists the following unacceptable practices:
- Citing selectively or inaccurately.
- Expanding unnecessarily the bibliography of a study to please editors, reviewers, or colleagues, or to manipulate bibliographic data.
- Re-publishing substantive parts of one’s own earlier publications, including translations, without duly acknowledging or citing the original (‘self-plagiarism’).
For all knowledge a researcher uses in new academic work, a correct reference to the original has to be made, whether this is text, images, document structure, online information, etc. This requires an in-text citation and a full reference in the bibliography. Information flows need to be traceable for readers, at all times. If this is not properly done, it can be considered plagiarism.
How information from original sources is being processed can differ and therefore requires a different approach.
With quotations, an author uses the exact words of a source, copied directly and without any change. This identical use of original material requires additional care in referencing: quotations must appear in a noticeable format (e.g. with quotation marks or in italic font) and must be cited with an in-text citation that includes the page number. A full reference in the bibliography is also required.
It is advised to use quotations when:
- You want to add the power of an author’s words to support your argument
- You want to disagree with an author’s argument
- You want to highlight particularly eloquent or powerful phrases or passages
- You are comparing and contrasting specific points of view
- You want to note the important research that precedes your own.
Source text: The writing center – When to summarize, paraphrase, and quote.
In all other cases you have to paraphrase. You then generate new content that uses or is influenced by ideas from other authors, but you write those ideas down in your own words. The general rules for referencing apply (in-text citation plus full reference).
Reference styles
There are many reference styles, often depending on the discipline or journal you write for. To find out which style to use, ask your research leader, supervisor or a colleague which style is most common in your discipline, or check the author guidelines on the journal’s website.
The best-known systems are:
- APA – 7th ed. (American Psychological Association)
- Chicago – 17th ed. (used for footnotes)
- MLA – 8th ed. (Modern Language Association)
- Harvard
- Vancouver (Mainly used in biomedicine)
In some cases, you will refer to a piece of knowledge without having read it in the original source. Instead, you read it in another document whose author read the original source. This is called secondary referencing. There is a chance that the information was misread, misinterpreted or cited selectively. By taking over the content without checking the original source, for example because of a paywall, you perpetuate (or even worsen) the misinformation.
How to handle secondary referencing?
- You should always consult the original source yourself, check content and use referencing as described above. In addition, you add “Cited in:” and add the reference of the work you read. In case it consists of a quotation, the same, extended, reference is needed.
- The best research practice is to consult the original source yourself, check the content and use referencing as described above. To keep the effort reasonable and pragmatic, it is advisable to do so for knowledge or arguments at the core of your work. For sideline details, however, researchers can use a reference similar to the one described above, add “Cited in:” and then add the reference of the work they read. If it concerns a quotation, the same extended reference is needed.
To manage your references over time and across articles, you can use a reference manager such as EndNote (Clarivate Analytics) (via your university research platform), Zotero (free) or Mendeley (Elsevier).
Author affiliation
Besides author names, publications contain author affiliations, which provide more information on the university, faculty or department with which each author is affiliated. An author can have multiple affiliations, for example when he or she is affiliated with multiple entities within the same university, or is appointed at multiple institutions.
When listing affiliations, it is important to note that these should reflect where the research has taken place. Although it is tempting to include as many affiliations as possible, researchers should be aware that an affiliation should only be claimed if the actual work and research underlying the publication have been performed at the institution(s) listed in the affiliation.
So why is a correct author affiliation important?
In addition to helping to identify authors (when multiple researchers have the same name) and giving recognition to the host institution, a correct affiliation assigns responsibility to the institutions involved, as it tells readers whom to contact in case of questions and/or problems with the research, for example with regard to the ethics and research integrity of the work. Correct affiliations are also of key importance for identifying potential (financial) conflicts of interest.
Providing a wrong affiliation, thereby failing to give credit to the appropriate institution. A common mistake is the situation in which a researcher worked in institution A, and after ending the experimental work, but before having the work published, moved to institution B. Although one might feel it makes sense to provide the affiliation of institution B on the publication, given this is the current institution of the researcher, this is not correct as the actual work was not performed in institution B. As such, only institution A should be listed. In order to illustrate that the researcher has switched institutions, the current address can be listed in a footnote.
Providing a false affiliation in order to manipulate the perception and credibility of the research and thus increase the chances of having it published. In addition, being a student at your university does not necessarily mean that your work can be published as originating from this university. Research activities that have been performed on the researcher’s own initiative and without any supervision by the host institution should not be attributed to the university.
Failing to disclose a relevant affiliation: researchers might get into the situation that omitting a certain affiliation might actually help to get the work published. This is often related to the concealing of a conflict of interest, which is of course an unacceptable practice.
When to think about this?
Affiliations are important to show which institutions are involved in the research project. This is applicable every time one communicates about the research project and it is as such relevant throughout the project:
- When submitting a research proposal to apply for funding
- For progress reports on the research (for example in the form of presentation at other institutions or congresses).
- When communicating at the end of the research project (in many cases in the form of a publication).
- During follow up after finalising the research.
Authorship
Who is an author? Criteria for authorship
Authorship is an explicit way to give credit to everyone who made a significant contribution to the work. In turn, this implies that it can be expected that all authors are fully accountable for all aspects of the work, unless otherwise specified.
ALLEA Code:
- Authors formally agree on the sequence of authorship, acknowledging that authorship itself is based on: (1) a significant contribution to the design of the research, relevant data collection, its analysis, and/or interpretation; (2) drafting and/or critical reviewing the publication; (3) approval of the final publication; and (4) agreeing to be responsible for the content of publication, unless specified otherwise in the publication.
- All authors are fully responsible for the content of publication, unless otherwise specified.
- Authors include an ‘Author Contribution Statement’ in the final publication, where possible, to describe each author’s responsibilities and contributions.
- Authors acknowledge important work and contributions of those who do not meet the criteria for authorship, including collaborators, assistants, and funders who have enabled the research.
Unacceptable practices:
- Manipulating authorship or denigrating the role of other researchers in publications.
As with authorship, it is important to reach clear agreements on the authorship order. According to the ALLEA code, all authors should agree on the sequence of authorship. Any form of listing is possible, if in line with the principles of research integrity and the policies that apply. As an author, you should also be able to explain the system of and the reasoning behind the agreed author order.
Like science itself, standards for attributing authorship may also evolve, e.g. as prevailing practices within a discipline change over time. The research context itself is also a determining factor. For example, it can become difficult (but not impossible) to correctly attribute authorship in the case of (large) collaborations, increasing specialisation, an increasing degree of inter- and transdisciplinary research, etc.
Nonetheless, the basic principles listed in the ALLEA code must be followed, as they constitute the minimum standard for all researchers, in all disciplines and for all forms of output. This means that:
- All those designated as authors should meet all criteria for authorship, and all who meet the criteria should be identified as authors.
- Those who do not meet all criteria should be acknowledged, e.g. in a separate list in the acknowledgements or a footnote.
- In addition to being accountable for the parts of the work done, an author should be able to identify which co-authors are responsible for specific other parts of the work.
- Finally, authors should have confidence in the integrity of the contributions of their co-authors.
Besides the ALLEA code as the leading framework, authorship guidelines are drawn up by many other stakeholders in science, e.g. funders and journals. Examples include the guidelines of the International Committee of Medical Journal Editors (ICMJE), originally developed for the (bio)medical field but now also followed in many other fields, and the guidelines of the American Psychological Association Journals (APA). Please note that journals and/or your host institution may also have developed their own authorship policies. It is therefore necessary to always check them as early as possible, especially in the case of (international) collaboration.
Authorship order
In addition to being included as an author (or not), the order of authors is also determined by specific agreements. According to the ALLEA code, all authors should agree on the sequence of authorship.
Different systems can be used for this, not infrequently depending on the discipline. Alphabetical order or degree of contribution/collaboration are the most well-known protocols for ordering authors, but should not be seen as an absolute way to determine who contributed most to the study. Any form of listing is possible, if in line with the principles of research integrity and the policies that apply.
As an author, you should also be able to explain the system of and the reasoning behind the agreed author order.
It is advised for researchers to indicate the system used and the decisions derived from it, e.g. in the footnote of the contribution. In this way, readers/evaluators can correctly appreciate the listing (and thus the underlying contribution).
Good academic practices on authorship
To avoid authorship issues, good communication between the researchers involved in the project is key. Authorship disputes are one of the biggest, if not the biggest, drivers of conflict between researchers.
Some good academic practices to avoid authorship issues:
- Do not postpone agreement about authorship – authorship should not be decided on when getting a manuscript ready for submission. Instead, expectations about authorship should be discussed as early as possible when drafting the article format, in a transparent way and preferably in writing throughout the project.
- As authorship contributions might change over the course of the research/article, this might also impact whether a researcher can remain an author and/or whether additional researchers have to be added to the list of authors. Be transparent regarding necessary changes in authorship and have these changes approved by all authors.
- Be consistent when awarding authorship and use the same criteria for all involved and across all publications.
- There is more than one way to reward contributions to articles. Contributors who don’t meet the authorship criteria can e.g. be mentioned in the acknowledgements or in an expression of gratitude in the notes or at the beginning of the article text.
- Inform all authors and, if necessary, contributors before submitting a manuscript and have the last version of the manuscript approved by all contributors.
In order to increase transparency, many universities and journals recommend or even compel authors to disclose the contribution of each of the authors in the form of an authorship contribution disclosure and to publish this information together with or in the paper.
This is also a good practice listed within the ALLEA Code:
Authors include an ‘Author Contribution Statement’ in the final publication, where possible, to describe each author’s responsibilities and contributions.
In 2020 the VCWI published a general advice on authorship contribution statements. In their advice, the VCWI deemed the use of authorship contribution statements a commendable practice that benefits science in general as they:
- make sure that interdisciplinary research remains feasible by demarcating responsibilities;
- contribute to a fair assessment of researchers, and
- discourage questionable authorship practices such as honorary authorship.
The full text of this and other general advice can be consulted via the website of the VCWI.
Specifying contributions can take different forms: a written statement in one’s own words, the so-called ‘author(ship) contribution statement’, whether or not in a predetermined format; use of ‘digital badges’ where each contribution corresponds to a specific colored badge, e.g. a red badge for writing the first draft. The most well-known example is a pre-established classification of different (traditional and other) roles in a “Contributor Roles Taxonomy” (e.g. CRediT). This high-level taxonomy consists of 14 roles (Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft and Writing – review & editing) which can be used to uniformly describe each contributor’s role in the research.
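As a minimal sketch of how such a taxonomy can be applied, the Python snippet below checks a set of declared roles against the 14 CRediT roles listed above and builds a one-line contribution statement. The function name `contribution_statement`, the output format and the author names are hypothetical and chosen for illustration only; journals may prescribe their own format.

```python
# The 14 CRediT roles, as listed in the text above.
CREDIT_ROLES = (
    "Conceptualization", "Data curation", "Formal Analysis",
    "Funding acquisition", "Investigation", "Methodology",
    "Project administration", "Resources", "Software", "Supervision",
    "Validation", "Visualization", "Writing – original draft",
    "Writing – review & editing",
)


def contribution_statement(contributions):
    """Build a one-line statement from an {author: [roles]} mapping.

    Raises ValueError if a role is not one of the 14 CRediT roles.
    """
    parts = []
    for author, roles in contributions.items():
        unknown = [r for r in roles if r not in CREDIT_ROLES]
        if unknown:
            raise ValueError(f"Unknown role(s) for {author}: {unknown}")
        # Keep roles in taxonomy order so every statement reads the same way.
        ordered = [r for r in CREDIT_ROLES if r in roles]
        parts.append(f"{author}: {', '.join(ordered)}")
    return "; ".join(parts) + "."


# Hypothetical authors, for illustration only.
example = {
    "A. Author": ["Writing – original draft", "Conceptualization"],
    "B. Author": ["Formal Analysis", "Software"],
}
statement = contribution_statement(example)
print(statement)
```

Rejecting unknown role names keeps the statement uniform, which is the purpose of using a pre-established taxonomy rather than free text.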
There are several instances where the authorship list does not reflect the actual contributions made to the research, all of which are unacceptable research practices. To name three of the most well-known:
- Honorary authorship: the inclusion of authors for hierarchical reasons, e.g. the head of the department where the research was performed. Please note that providing funding does not necessarily mean that authorship is warranted.
- Gift authorship: inclusion of a non-contributing colleague expecting that the colleague will return a favor.
- Guest authorship: inclusion of authors in the hope their appearance on the manuscript facilitates the review process or will lead to more visibility after publication.
The opposite may also occur:
- Ghost authorship: an individual who deserves authorship is not included in the author list, either because that person was forgotten or ignored, or for strategic reasons, for example to avoid having to declare a conflict of interest, which might in turn affect the review process.
‘I avoided authorship discussions with collaborators—until I learned some hard lessons.’
Testimony in Science in which a researcher testifies about his own experiences with ghost and gift authorship, and the importance of making good agreements from the start of a project.
Below are some other useful resources that can be used when discussing authorship:
- How to handle authorship disputes: a guide for new researchers (COPE).
- Authorship agreement form developed by the University of North Carolina at Charlotte Graduate School
- Documents provided by the American Psychological Association (APA):
  - Authorship agreement: a contract that states the authorship order and includes brief descriptions of author contributions.
  - Authorship determination scorecard: a worksheet used to determine a numeric value for each author’s contributions in order to facilitate the discussion on who gets to be an author.
  - Authorship tie-breaker scorecard: a worksheet used to determine the order of authors when filling out the Authorship determination scorecard results in a tie (two authors having the same numeric value).
  - Publication contract: a contract outlining author roles in submitting a paper for publication.
Cartoon by Patrick Hochstenbach under a Creative Commons CC BY-SA 4.0 license
When to think about this?
This part is relevant throughout the research cycle (design, execution, publication). Authorship discussion should preferably take place at the start of a research project or the planning of a collaboration. Furthermore, expectations regarding authorship need to be discussed throughout the research project.
When one of the authors leaves the institution
The timeframe for getting articles published doesn’t always match a researcher’s current academic affiliation. Sometimes researchers leave before a project or article is finalized. In this case, it is important to make additional arrangements to settle contributions to an article and the accompanying rewards. In all cases, the work done by a researcher should be acknowledged correctly, whether that person is still employed at the time of publication or not.
Additional arrangements concern:
- Is the researcher leaving able/willing to follow-up on the article?
- What if (minor or major) changes have to be made after peer review?
- What if additional research has to be carried out, e.g. extra calculations?
- What is the possible effect on authorship contribution/order?
- What if the paper is not accepted? How will follow-up be discussed, e.g. when the article is submitted to a different journal?
- How will decisions about these changes be made?
- …
Even though publication of an article can take some time, up-to-date contact details of the person leaving are still necessary. It is important to inform that person about content-related and practical changes, as well as publication progress, at all times.
Image processing
In addition to the presentation of categorical and non-categorical data, special care should be taken when presenting digital images to illustrate different experimental conditions. Images have to be considered as data and are more than a simple illustration accompanying the summary statistics. Inconsistencies in the image (e.g. the use of the same image to illustrate different experimental conditions) or selective modification of images may severely decrease the confidence of your peers in your work and may warrant a correction or even a retraction of the publication. This is especially relevant in the biomedical and biological sciences, which rely on a number of specific technologies (microscopy, western blotting, fluorescence-activated cell sorting (FACS), …) to illustrate the results.
Good academic practices when working with and presenting images
Some general good practices when working with and presenting images are:
- Modifications such as adjustments of contrast, brightness and/or color balance might be acceptable but only if the adjustments are done on the entire image and do not influence a proper perception of the data.
- Researchers should be able to trace back and motivate the adjustments.
- Selective modification of an image, for example to remove or emphasize specific features, is generally not acceptable, even if the modification is performed to remove an unrelated imperfection (e.g. a hair or a fingerprint).
- Upon combining multiple images into a single field, this should be obvious from the presentation of the image and the text of the figure legend. For example, splicing of bands in case of a Western blot may be acceptable in some cases, but only if properly acknowledged.
- Always have the original, unaltered images available and only make modifications on a copy of the original image.
- Be prepared to make the raw image files available to the reviewers and/or readers of your work, for example by providing these as supplementary data or using an online data repository.
- As many issues arise by the accidental selection of the wrong image, proper data management is a key element in the prevention of mistakes.
- While generating figures, check and double check whether the correct images have been selected. With this in mind, the use of placeholder images is discouraged.
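Two of the practices above (editing only a copy of the original, and double-checking which image ends up in which panel) can be supported with file checksums. The Python sketch below is a minimal illustration under stated assumptions: the helper names, file names and stand-in file contents are hypothetical, and the duplicate check only detects byte-identical files, not cropped, rotated or re-saved versions of the same image.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def working_copy(original, workdir):
    """Copy the original into workdir and return the path of the copy."""
    target = Path(workdir) / f"edit_{Path(original).name}"
    shutil.copy2(original, target)
    return target


def find_exact_duplicates(panels):
    """Map each digest that occurs more than once to the panels using it.

    `panels` is a {panel label: file path} mapping for one figure.
    """
    by_digest = {}
    for label, path in panels.items():
        by_digest.setdefault(sha256_of(path), []).append(label)
    return {d: labels for d, labels in by_digest.items() if len(labels) > 1}


# Demonstration with stand-in files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    raw_a = tmp / "blot_control.tif"
    raw_b = tmp / "blot_treated.tif"
    raw_a.write_bytes(b"stand-in bytes for image A")
    raw_b.write_bytes(b"stand-in bytes for image B")

    # Record the checksum of the original, then edit the copy only.
    digest_before = sha256_of(raw_a)
    copy_a = working_copy(raw_a, tmp)
    copy_a.write_bytes(b"adjusted version of image A")
    original_untouched = sha256_of(raw_a) == digest_before

    # Panel C mistakenly reuses the control image.
    figure = {"A": raw_a, "B": raw_b, "C": raw_a}
    duplicate_labels = sorted(find_exact_duplicates(figure).values())

print(original_untouched)
print(duplicate_labels)
```

A check like this does not replace careful data management or visual inspection; it only catches the simplest case, in which exactly the same file is selected for two panels.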
Further reading:
- The Office of Research Integrity (ORI): Online Learning Tool for Research Integrity and Image Processing
- Rossner, M. and Yamada, K. (2004). What’s in a picture? The temptation of image manipulation.
- American Journal Experts: Avoiding image fraud: 7 rules for editing images.
“I find it tempting to selectively modify images or to provide an unrelated image in order to have an image that represents my average results or makes the figure more appealing, but this is not an acceptable practice.”
The example below illustrates a case in which the same blot image is used twice to illustrate two different experimental conditions:
Bik et al. (2016). Prevalence of inappropriate image duplication in biomedical research publications. bioRxiv preprint.
While this might be an honest error, it also illustrates how easy it is to make a mistake or to use an unrelated image. In the current example, the issue could be detected because the same image was presented twice. However, had the image only been used to illustrate the unrelated condition, this would not have been picked up.
Presenting your data
Data presentation is the foundation of our collective scientific knowledge, as readers’ understanding of a dataset is generally limited to what the authors present in their publications. Figures are critically important because they often show the data that support key findings (Weissgerber 2015).
Unfortunately, authors generally use simple graphs to present summary statistics, instead of providing detailed information about the distribution of the data or showing the full data. In addition, digital images are not to be considered as just nice illustrations, but are underlying data and should be treated as such (Cromey 2013). As figures are the main method of giving insight into the results of your work, researchers should strive to present their data faithfully and transparently.
ALLEA Code:
- Researchers share their results in an open, honest, transparent, and accurate manner, and respect confidentiality of data or findings when legitimately required to do so.
- Researchers report their results and methods, including the use of external services or AI and automated tools, in a way that is compatible with the accepted norms of the discipline and facilitates verification or replication, where applicable.
Reporting bias
Researchers who are unable to reproduce or replicate previous results are often not inclined to pursue their investigation further or to publish the findings. Nevertheless, the resulting reporting bias can affect the correctness of the scientific literature (publication bias), potentially leading to the canonization of false facts (Nissen et al 2016). In addition, not having the full picture might distort the conclusions that can be drawn from meta-analyses and systematic reviews, and may as such lead to a biased view which in turn might impact policy decisions.
Who is involved?
In most cases analysis and presentation of the data will be in the hands of the researchers that have collected the data.
Supervisors and promotors are responsible for checking the integrity of the collected data and making sure that the analysis, presentation and conclusions of the research are faithful to the obtained data.
Publishers should have clear journal policies in place with regard to the presentation of data in figures. These guidelines should not only be available but also be enforced.
Peer reviewers should be sufficiently critical regarding the figures, look out for potential pitfalls and propose alternative presentation methods.
When to think about this?
Thinking about data presentation is most relevant during the research phase with regard to correct data management and data analysis, during the publication process when preparing the figures and reviewing the data, and after publication, for example when questions arise about the validity of the presented data.
Illustrating research with graphs
Although it has become standard practice to illustrate continuous data using bar graphs, which present the data through summary statistics such as averages and deviations, researchers are advised not to use this approach. Instead, look for other ways to present the data that also show its distribution, especially when the results are based on a low number of observations.
When generating graphs, you should ask yourself whether the presentation accurately presents the research findings and does not in any way mask particularities that could affect the way the data is perceived. The data should be convincing by itself, not because of the presentation!
Some good practices when presenting (continuous) data:
- Provide as much of the data as possible, so that others can interpret the distribution of the data.
- Avoid bar graphs for continuous data, especially when working with low n-values. Examples of alternative graph presentation are dot and/or box plots (examples can be found in the Weissgerber 2019 paper).
- Show outliers, indicate whether or not these have been included in the statistical analysis and explain why.
- In addition to listing the different statistical tests used in the methods section, please also indicate the test used for each particular dataset.
The figure below illustrates that different data distributions can lead to the same bar graph. Having access to the full data may, however, suggest different conclusions from the summary statistics (Weissgerber 2015).
Weissgerber TL, Milic NM, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation.
This illustrates the importance of having access to the full data set, as this will show potential outliers or unequal distribution of the data, which might in turn affect the statistical analysis and/or the perception of the data.
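The point can be reproduced with a small worked example. The Python sketch below uses two hypothetical sets of measurements, constructed for illustration only, that share the same mean and sample standard deviation and would therefore produce identical bar graphs with identical error bars, although their distributions differ.

```python
import statistics

# Hypothetical measurements, constructed so that both groups share the same
# mean and the same sample standard deviation.
group_a = [8, 9, 9, 11, 11, 12]   # values spread evenly around the mean
group_b = [9, 9, 9, 10, 10, 13]   # tight cluster plus one high value


def summarize(values):
    """Return what a bar graph shows (mean, SD) plus what it hides."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": round(statistics.stdev(values), 3),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


summary_a = summarize(group_a)
summary_b = summarize(group_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(name, summary)
```

Both groups have a mean of 10 and a standard deviation of about 1.549, but the medians (10 versus 9.5) and maxima (12 versus 13) differ. A dot plot of the raw values would show immediately that group B consists of a tight cluster and a single high value, which a bar graph would hide.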
Reporting results
Responsibilities within the publication process
In order for research to be robust and trustworthy, it is key that correct scientific behavior is not only implemented during the research phase, but also when reporting the results to the scientific community and society in general. Overall, transparency is key. Although research starts with an idea that can culminate in a paper, it is not only up to the scientists performing the work to make sure that everything is correct and that the work is subsequently used in a correct way. There are other actors in this ecosystem, such as the journals, publishers and other channels that give a platform to the research. In addition, it is crucial that both the scientific community and society in general are sufficiently critical when embracing results. As such, responsible conduct of research is a shared responsibility.
Who is involved?
The above of course does not reduce the responsibility of authors as they are the ones responsible for making sure the reporting is accurate, timely and transparent. Good academic practices will encompass correct behaviour when granting authorship, how to properly cite previous work, how to choose a proper platform to report your work and how to deal with negative results.
The most common way to disseminate research findings is still to publish the work in a peer reviewed scientific journal. Journals exist in many forms, sometimes with a broad spectrum of topics, while others are tailored to specific disciplines or topics. Regardless of their focus, journals have the important task of performing a quality check of the research to make sure that low quality research is not published and that research findings are only communicated after they have been verified by peers. This process is called peer review. Journals have to develop internal quality criteria and organise a review process that is adequate for the task. For this review process, journals rely on reviewers to be critical and to identify where a study falls short.
It is important to note that the responsibility of the author(s) and the journal/publisher does not end once the work has been published. Both authors and journals/publishers have to take responsibility whenever post-publication issues arise, if necessary by correcting or even retracting the work in a transparent way. The Committee on Publication Ethics (COPE) provides a number of core practices together with a variety of useful flowcharts that can assist readers of the work and journal editors in determining a strategy in case they encounter questionable research practices.
Finally, it is of utmost importance to acknowledge the role of the readers of the work. Those who read publications will determine the importance of the findings, for example by citing the work. It is therefore important that this group is critical of the work and does not misuse it for its own benefit. Responsible readers look further than the title and abstract of the work, and dive into the paper to fully grasp the content. Reading is to understand whether the conclusions of the work are justified, and how they might be applied.
Gender-sensitive reporting of research
The Sex and Gender Equity in Research (SAGER) guidelines encourage a more systematic approach to the reporting of sex and gender in research across disciplines. They apply to all research with humans, animals or any material originating from humans and animals (e.g. organs, cells, tissues), as well as to other disciplines whose results will be applied to humans, such as engineering.
General principles:
- Authors should use the terms ‘sex’ and ‘gender’ carefully in order to avoid confusing the two.
- Where the subjects of research comprise organisms capable of differentiation by sex, the research should be designed and conducted in a way that can reveal sex-related differences in the results, even if these were not initially expected.
- Where subjects can also be differentiated by gender (shaped by social and cultural circumstances), the research should be conducted similarly at this additional level of distinction.
Recommendations per section of the article:
- Title and abstract: if only one sex is included in the study, or if the results of the study are to be applied to only one sex or gender, the title and the abstract should specify the sex of animals or any cells, tissues and other material derived from these and the sex and gender of human participants.
- Introduction: authors should report, where relevant, whether sex and/or gender differences may be expected.
- Methods: authors should report how sex and gender were taken into account in the design of the study, whether they ensured adequate representation of males and females, and justify the reasons for any exclusion of males or females.
- Results: where appropriate, data should be routinely presented disaggregated by sex and gender. Sex- and gender-based analyses should be reported regardless of positive or negative outcome. In clinical trials, data on withdrawals and dropouts should also be reported disaggregated by sex.
- Discussion: The potential implications of sex and gender on the study results and analysis should be discussed. If a sex and gender analysis was not conducted, the rationale should be given. Authors should further discuss the implications of the lack of such analysis on the interpretation of the results.
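As an illustration of the recommendation for the Results section, the following minimal Python sketch shows one way to present an outcome and the number of withdrawals disaggregated by sex. The records, the field layout and the function name `disaggregate_by_sex` are hypothetical and only serve the example; they are not prescribed by the SAGER guidelines.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (participant id, sex, outcome score, withdrew?)
records = [
    ("p01", "female", 4.0, False),
    ("p02", "male", 6.0, False),
    ("p03", "female", 5.0, True),
    ("p04", "male", 8.0, False),
]


def disaggregate_by_sex(rows):
    """Return, per sex, the participant count, mean outcome and withdrawals."""
    groups = defaultdict(list)
    for _pid, sex, outcome, withdrew in rows:
        groups[sex].append((outcome, withdrew))

    summary = {}
    for sex, values in groups.items():
        summary[sex] = {
            "n": len(values),
            "mean_outcome": mean(outcome for outcome, _ in values),
            "withdrawals": sum(1 for _, withdrew in values if withdrew),
        }
    return summary


summary = disaggregate_by_sex(records)
for sex in sorted(summary):
    print(sex, summary[sex])
```

The same grouping could be repeated for gender where that information has been collected. Reporting the per-group numbers even when no difference is found follows the recommendation to report analyses regardless of outcome.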
Data Management Plan (DMP)
Writing a data management plan (DMP) when you start your research can help to ensure you achieve a standard of transparency and integrity in your research. In a DMP you describe the data you plan to collect, generate and use for your research, whether it is data you create yourself or existing data created by someone else. You describe in a structured way how you will manage those data during and after your research. You write about:
- how the data were created and what they mean
- safe storage of data so they can’t go missing or be tampered with
- data security so that only people allowed to access the data can access them
- publishing the data as evidence for your published papers (unless there are restrictions to do so), so that your findings can be verified and reproduced
- individual and institutional responsibilities to look after your data
- safeguards for ethical and privacy reasons if research involves human participants
Good data management does not end with planning. It is important that your research data are then managed according to this plan. You can review and update the plan according to the progress of the research (it’s a living document). After your research has ended your DMP will form the permanent track record for the data your research produced.
Key issues to find out when you start writing a data management plan are:
- know your institution’s policies and services, such as storage and backup strategy, intellectual property rights policy, data management policy and any data sharing facilities like an institutional repository
- ownership of your data
- your legal, ethical and other obligations regarding research data, towards research participants, colleagues, research funders and your institution
Here are some examples where good data management plans can help achieve research integrity:
- Documenting in detail how your data were created and processed provides clear evidence, for example:
- a lab book describes the experimental set-up and all parameters that define your data, shows the processes used to collect the data and gives an overview of all data collected
- an interview schedule and question list describe the collection of information via interviews, with a referenced codebook showing your interpretation of the interview content during your analysis
- commentary lines in computer code describe the logic of what your code does step by step.
- The licence agreement and use conditions of third party data explain how you can or cannot use those data in your research. For example, you may be allowed to use data for analysis but not to copy and publish them. You should also cite the data you use (like you would cite a publication), so it is important to check who owns those data.
- In research with human participants, documenting the informed consent procedures used in your research and the personal data you may collect helps to plan which level of security is needed when storing and handling the data and how data need to be anonymised to respect the privacy of people.
- Publishing your research data in a FAIR way gives transparency about how you reached your research findings based on those data.
- If other researchers want to reproduce your results, they need to be able to access the data and any documentation that clearly explains how the data were generated and how to interpret them. Data can for example be made available openly in a data repository with a clear use licence.
In collaborative research, it is important to describe the planned data management practices at each partner organisation, and to have a dedicated person at each site responsible for data management. In international collaborations, there may be differences in the ethical and legal framework for research, or in the expectations of institutions or funders for data management. Developing a data management plan can help to ensure that all these aspects are addressed before data collection starts.
FAIR data principles for research data
One of the main objectives of good research data management (RDM) practice during the research project is to enable the long-term preservation of FAIR data objects after the research results have been published and/or the research project has ended. By preserving data for the long term, it becomes possible to (1) reproduce the findings of a certain study at a later stage and to (2) re-use the data for new research purposes.
The guiding principle should be that the data are as open as possible, and as closed as necessary.
Long-term preservation of the data in itself is not sufficient to turn the data into a valuable and citable research output that is on par with publications. Barely documented data can be stored for a very, very long time on the private server of a research department, gathering dust and steadily sinking into oblivion. But digital innovations will leave that data unreadable for both machine and researcher.
Detailed documentation of your data collection processes (as part of your general data management) can help to ensure that any selection of data is clear, readable and can be made available to others. Preregistration of your methodology can also help to prevent threats to research integrity.
Cartoon by Patrick Hochstenbach under a Creative Commons CC BY-SA 4.0 license
FAIR Principles
This is where the FAIR guiding principles for scientific data management and stewardship enter the picture. The FAIR principles were initially conceived for research data (Wilkinson et al, 2016), but are also being applied to more specific types of research outputs such as software (Lamprecht et al, 2019). Data can be turned into FAIR objects, which makes them ‘exploitable’ for the broader research community in the long run. FAIR stands for Findable, Accessible, Interoperable and Re-usable.
- Findable
Ideally, data and accompanying documentation/metadata are made findable by both humans and ICT systems. Concretely, this findability is typically ensured by means of ‘discovery metadata’ that are available via a data search engine such as DataCite. If you search for the data on the search engine, the associated discovery metadata, including the name(s) of the data creator(s), the subject of the data etc., appear among the search results. Commonly, the discovery metadata would include a persistent identifier (e.g. DOI, handle, etc.) that directs you to the landing page where the (non-sensitive) data are available for download. Note that findability comes in different flavours. Data published on a personal website or a specific project website are to some extent findable, but not in any meaningful, structural way.
- Accessible
The access conditions for the data are well-defined, supported by the appropriate license (e.g. a Creative Commons license for open data). Data are published in open access when possible, but restricted/closed access is applied in case of sensitive data (e.g. personal data). Although sensitive data are not made publicly available in open access, they can often still be re-used by other researchers, albeit via a more complex procedure that safeguards the rights of the data subjects and ensures data security. Note that the discovery metadata referring to the sensitive data can still be publicly available, even though the data themselves are not.
- Interoperable
Interoperable data are data that can be combined with other datasets by humans as well as ICT systems, and that have no unnecessary legal obstacles (e.g. an OA license with overly complex restrictions). Additionally, the data can easily interoperate with automated analysis workflows or other applications. It is also important that the documentation/metadata accompanying the data adhere as closely as possible to discipline-specific standards, for instance by using ‘controlled vocabularies’, and are encoded in a standardised, structured format in order to make them machine-readable. Examples of generic metadata standards are Dublin Core and the DataCite Metadata Schema.
- Re-usable
The three pillars ‘findable’, ‘accessible’ and ‘interoperable’ are all necessary prerequisites to make the data eventually re-usable and interpretable by other researchers. Particularly important is the documentation/metadata that accompanies the data, such as a codebook that explains the different variables or an explanation of how the data were collected. Without adequate documentation, data are generally difficult to interpret, which obviously hampers re-use. Note that, if the data are sensitive, re-use is not impossible, but has to comply with stringent conditions stipulated in a ‘data use agreement’.
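To make the idea of machine-readable discovery metadata more concrete, here is a minimal Python sketch of a metadata record that uses Dublin Core element names and is written out as JSON. The element names (`title`, `creator`, `identifier`, `rights`, etc.) come from Dublin Core; all values, including the DOI, are made-up placeholders, and the list of "required" elements is an assumption for this example only, since a real repository defines its own mandatory fields.

```python
import json

# Dublin Core element names; every value below is a made-up placeholder.
record = {
    "title": "Example survey dataset",
    "creator": ["Doe, Jane"],
    "subject": ["research integrity"],
    "description": "Anonymised survey responses with an accompanying codebook.",
    "date": "2024-01-15",
    "type": "Dataset",
    "format": "text/csv",
    "identifier": "https://doi.org/10.1234/example",  # hypothetical DOI
    "rights": "https://creativecommons.org/licenses/by/4.0/",
}

# Assumption for this sketch: the elements we want filled in before depositing.
REQUIRED = ("title", "creator", "date", "identifier", "rights")


def missing_elements(rec, required=REQUIRED):
    """Return the required element names that are absent or empty in rec."""
    return [name for name in required if not rec.get(name)]


# A structured, standardised encoding keeps the record readable by machines.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
print("missing:", missing_elements(record))
```

In practice a data repository usually generates and validates such records for you. The sketch only illustrates why a persistent identifier, a clear license and structured field names help both humans and ICT systems to find and re-use the data.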
Data Repositories
Researchers should hone their FAIRification skills, in order to make the data that they collect or generate as FAIR as possible. However, they are not alone in this endeavour. In addition to the RDM support services at research institutions, there is also a pivotal role for the ‘trustworthy repository’ where the research data are ultimately archived. For example, the data repository can be well-connected to the broader data ecosystem, enhancing the findability of the archived data, and can provide infrastructure to implement certain metadata standards, improving interoperability.
- Re3data is a registry of institutional, disciplinary and interdisciplinary research data repositories worldwide.
When to think about this?
As already stated before: “RDM includes all steps before, during and after the project”, which means you have to handle your research data correctly throughout the whole research data lifecycle to ensure the quality and integrity of your research. A data management plan (DMP) is the best tool to help you to do just that.
Research Data Management (RDM)
What are ‘research data’?
Research data are all digital or physical data – regardless of the manner in which these data are collected or stored – used or analysed to support research findings, validate research results or support a scientific reasoning, discussion or calculation in the study. Research data cover the entire spectrum from raw data to processed and analysed data included or discussed in a publication. These data can be generated data, derived or composite data, as well as self-generated data and data provided by third parties. Some examples are: survey results, statistical data, graphics, computer-generated data, simulations, software developed for research purposes, computational metadata, prints, video and audio tapes, a corpus, organisms, gene sequences, synthetic or chemical compounds, samples of any kind, patient records, protocols, measurements, notebooks, and so on.
Research data form the beating heart of academic research, and are the engine of enormous progress in technology, healthcare and on a socio-economic level.
Research Data Management (RDM)
Although researchers are continuously working with data to test them against the research hypothesis, these are often not processed and stored carefully during and after the project. At the same time, our society is becoming ever more focused on increasing datafication, with more and more processes and actions based on digital data. This also enables a specific type of data-driven research, based on the combination of (big) data with new techniques for analysis (i.e. “data science”).
This large use and potential reuse of data therefore makes solid management of research data very important in the research community, meaning that the good handling of data, the so-called research data management (RDM), should be one of the cornerstones of good academic research practice.
RDM includes all steps before, during and after the project, i.e. “the research data lifecycle”: data planning, collection, processing, analysis, security, storage, preservation, access, sharing and reuse. All these steps are bound by conditions and regulations at legal, ethical and technological levels.
The importance of research data management
Prudent and thoughtful RDM leads to better quality and integrity of research, a greater impact of research and a greater visibility and reuse of research data. Good data handling can prevent data loss/corruption, fraud and/or bad science. It makes your research process run more smoothly and ensures that you can find and reuse data later. In addition, the sharing and reuse of data by future researchers is facilitated in order to develop new research both inside and outside the university.
Imagine that you have read a scientific paper and would like to know how the algorithm the authors used works ‘under the hood’ or how the authors constructed a certain plot. How easy would it be if you could just click on a link and download that information? Imagine that you have come up with a brilliant new way of analysing the data that underpin a certain study, but the original data are nowhere to be found. How much time would you lose collecting new data? Or imagine that you actually were able to get hold of the original data, but after close scrutiny, the data are utterly incomprehensible: the measurement unit of variable B is impossible to decode, etc. How disappointed would you be after the initial effort of obtaining the data? Indeed, it’s such a hassle.
Every researcher has a responsibility to contribute to a new world where the research data that underlie research findings are easily findable and interpretable. Hence the need for proper research data management: researchers are increasingly required to put a thorough data management approach into practice throughout the research process. These fairly recent RDM requirements are often formalised in policies of research institutions, funders (e.g. the necessity to develop a data management plan) and journals (e.g. the inclusion of a data availability statement in the article). At first glance, these RDM requirements might seem difficult to comply with for researchers who are not yet familiar with them. When you look further, however, you will notice that RDM helps you during your research and also serves the purpose of improving the way we do science, making scientific results more transparent and reproducible.
Retraction of an article because access to the underlying data was not granted.
Case: Lancet, NEJM retract controversial COVID-19 studies based on Surgisphere data – Retraction Watch: “Two days after issuing expressions of concern about controversial papers on Covid-19, The Lancet and the New England Journal of Medicine have retracted the controversial articles on Covid-19 because a number of the authors were not granted access to the underlying data […]”.
“[…] Because all the authors were not granted access to the raw data and the raw data could not be made available to a third-party auditor, we are unable to validate the primary data sources underlying our article, ‘Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19.’ We therefore request that the article be retracted. We apologize to the editors and to readers of the journal for the difficulties that this has caused.”
Who is involved?
The researcher develops a data management plan for his/her research that describes how research data will be collected, organised, documented, stored, used and looked after throughout the research lifecycle; implements good data management practices; and ensures that the data remain available in the long term.
The supervisor supports and advises the researcher on data management practices and ethical, legal and contractual responsibilities.
The institution/university provides the tools, infrastructure and policies for the researcher to implement good data management practices.
Data stewards and research support staff provide advice, support and training on data management (planning) to the researcher.
The ALLEA Code also strongly confirms the importance of research data management and includes the following good practices:
- Researchers, research institutions, and organisations ensure appropriate stewardship, curation, and preservation of all data, metadata, protocols, code, software, and other research materials for a reasonable and clearly stated period.
- Researchers, research institutions, and organisations ensure that access to data is as open as possible, as closed as necessary, and where appropriate in line with the FAIR principles (Findable, Accessible, Interoperable and Reusable) for data management.
- Researchers, research institutions, and organisations are transparent about how to access and gain permission to use data, metadata, protocols, code, software, and other research materials.
- Researchers inform research participants about how their data will be used, reused, accessed, stored, and deleted, in compliance with GDPR.
- Researchers, research institutions, and organisations acknowledge data, metadata, protocols, code, software, and other research materials as legitimate and citable products of research.
- Researchers, research institutions, and organisations ensure that any contracts or agreements relating to research results include equitable and fair provisions for the management of their use, ownership, and protection under intellectual property rights.
Research funding
Due to competition and low chances of obtaining funding (Garner et al 2013), researchers are now more than ever struggling to obtain the resources to fund their research projects. As applying for funding can be time-intensive, researchers might feel that they spend more time on writing applications than on research. So, when you have a great idea, why not try to sell it to multiple funders and see which one ‘bites’?
“With grant success at all-time low, scientists are working harder than ever to fund their research. They respond to the competitive economic times by submitting more applications. They may also simultaneously or serially submit applications to multiple funding agencies to increase their odds of getting funding. Some grant agencies allow the submission of applications with identical or highly similar specific aims, goals, objectives and hypotheses. But we believe that researchers should not accept duplicate funding for the same work – either the whole study or any part of it.”
Quote from: Same work, twice the money? – Harold R. Garner, Lauren J. McIver & Michael B. Waitzkin – Nature 493 (2013).
It is therefore not surprising for a research proposal to be submitted to multiple funding bodies, either in identical or slightly modified form. This system of parallel applications increases the chance of obtaining funding. In addition, when funding is only partially granted, additional funding obtained from (an)other source(s) can help to acquire the budget needed for the complete project.
The Flemish Commission for Scientific Integrity (Vlaamse Commissie voor Wetenschappelijke Integriteit – VCWI) has formulated a general advice on Plagiarism in funding applications (2017).
Double dipping
Although there might be acceptable reasons to seek complementary funding from different funders, for example personnel costs covered by one funder and consumables by another, researchers should make sure not to accept funding twice for the same aspect of the research project.
While parallel applications aren’t necessarily problematic, some good practices should be kept in mind. First of all, researchers should be transparent towards the funding body and acknowledge if a (partially overlapping) proposal is under evaluation elsewhere. The same principle should be followed in case of overlap with an already granted project. Although there might be good reasons for the overlap, this should be communicated in a clear way. Finally, if during the application phase, the proposal is granted elsewhere, proper action should be taken in order not to obtain double funding. Researchers should not accept funding twice.
Note that more and more funders, as part of the application phase, are asking to declare whether a related proposal has been submitted/approved elsewhere. Do not assume related proposal(s) will go unnoticed if you don’t mention them, as this might have severe consequences on the further processing of your current and future application(s) and can be reported to the host institution.
ALLEA Code:
- Researchers make proper and conscientious use of research funds.