In every research project, efficient management of research data is an implicit part of good research conducted with integrity. Best practices for research data management:
A data management plan (DMP) is a dynamic document in which you write down your intentions concerning data management during and after your research project. At Hasselt University, a researcher must write an initial data management plan within six months of the official start of the research project (art. 6, General policy plan on research data management at Hasselt University). Click here for a full overview of the funder requirements for Research Data Management.
A data management plan (DMP) is a dynamic (i.e. 'living') document in which you write down your intentions concerning data management during your research. Creating a DMP at the beginning of your project helps you to reflect on how you will collect and process data, where you will store them, which security measures you should provide, which costs you should consider, etc. That way, you are well prepared from the very start: you reduce the risk of data loss, confusion (which version was the last one?), legal issues and other unhappy outcomes, while making your data findable and reusable by others (win-win).
The plan is intended to be used actively as a guide, so keep it up to date as your research project evolves. The final version of your DMP should truly represent how you have handled your data during the project and how you will handle them afterward. Remember that the DMP itself is not a place to store data.
(within the first six months; initial full DMP)
1 | Write your DMP | In DMPonline (more information below) or in MS Word using a funder template: FWO/BOF/VLAIO-cSBO, BELSPO, Horizon Europe, H2020, ERC. Optional - Make your life easier: |
2 | Request feedback | |
3 | Revise your DMP | If necessary, adjust your DMP based on the feedback. |
4 | Export your DMP | |
5 | Submit your DMP (initial full DMP) | |
Update the DMP regularly based on how the data management is implemented in your research project. Contact us if you have any questions throughout your research project. You do not need to send updates of your plan to RDM.
For FWO, consult this Cheat sheet: submitting a final DMP https://doi.org/10.5281/zenodo.10985170
Complete your DMP (either the Word version or in DMPonline). Send the PDF version of your final DMP to rdm@uhasselt.be and, if necessary, submit it to your funder with the final report.
A user-friendly tool for creating a data management plan (DMP) is DMPonline. Below you will find a basic step-by-step plan for this tool, and an extensive manual can be found via this link. A benefit of using DMPonline is that you will have access to example answers and Hasselt University guidance.
You are not required to use DMPonline. You may also use another tool, but make sure the questions from your funder's DMP template are answered.
Sound data management – and, hence, a well-thought-out Data Management Plan (DMP) – starts with identifying a complete and detailed list of all data you will collect, generate, and (re)use.
Research data are all data generated, collected or used in the context of any research project.
As a result, this broad definition includes a wide array of types and formats of data, ranging from raw data to processed and even published data. Examples may include, but are not limited to: notes, surveys, figures, objects, audio-visual files, spreadsheets, databases, statistical data, geographical data, research software, simulations, samples (including biological material, personal data, patient data, etc.).
Origin of the data | Generate new data (primary data); reuse existing data (secondary data) |
Stage in the research project | Raw data; processed data; analyzed data |
Materiality of the data | Digital data; non-digital, analogue or physical data |
Type of digital data | Observations; experiments; derivation or compilation; computations, models or simulations; references (canonical data) |
Format of digital data | When focussing upon the format of research data, one sees that (digital) research data occur in many different technical file formats, depending on the software used for collection, analysis, and processing. However, once data analysis is completed and the data are being prepared for long-term archiving, you should consider converting your research data to a more limited set of open or standard formats to ensure long-term accessibility and usability (more information in the section On sustainable formats for data archiving in Preservation) |
In some disciplines – such as theoretical mathematics or law – you may ask yourself whether you actually have research data. In that case, research data can be defined as all information, generated as part of the scientific process, on which scientific conclusions are based. Just imagine: your computer crashes and your (home) office is destroyed by fire – which lost information would you have needed to write a scientific publication? Well, that’s research data!
Keep in mind that physical items, such as books, codes, maps, and artefacts, are also research data, as are (handwritten) notes, proofs, annotations, etc. that support the conclusions in your published work.
Secondary data or existing data are generated by third parties and/or within the scope of another research project.
At the start of your research, look up whether there are datasets that you can reuse. In doing so …
But where can you find existing datasets that are relevant to your research project? The places where you can share your own dataset(s) (see section Where should you (not) preserve your data? in Preservation) are a good starting point: the creator of the dataset, a data repository, a data paper, … There are also specific discovery services to help you find datasets, e.g. EOSC Research Hub for Data, Datasearch and DataCite.
When you have found the relevant data for your research, it is important to consider the following:
Once you have established the accessibility, interoperability and (re)usability of the dataset, you can start processing the data.
Make sure that you cite the dataset properly, using a data citation and a persistent identifier (e.g. DOI). When there are multiple versions of the dataset, make it explicit which version you have used in your research.
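As an illustration of the citation advice above, the following minimal sketch assembles a dataset citation that always carries the persistent identifier and, when given, the version. The function name, its arguments and the element order (creators, year, title, version, publisher, identifier) are assumptions made for this example, and the DOI shown is a placeholder, not a real dataset. Always check the citation format requested by the repository or journal you use.

```python
def format_data_citation(creators, year, title, publisher, identifier, version=None):
    """Build a one-line dataset citation (illustrative element order)."""
    parts = [f"{'; '.join(creators)} ({year})", title]
    if version:
        # State explicitly which version of the dataset was used.
        parts.append(f"Version {version}")
    parts.append(publisher)
    # The persistent identifier goes last, without a trailing full stop,
    # so that the link stays intact when copied.
    parts.append(identifier)
    return ". ".join(parts)


# Hypothetical example values.
citation = format_data_citation(
    creators=["Doe, Jane", "Doe, John"],
    year=2024,
    title="Example survey dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.1234/example",
    version="1.1",
)
print(citation)
```

Keeping the version as a separate argument makes it harder to forget when several versions of a dataset exist.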
Research software is software newly created during a research project to support research activities such as data analysis, simulation, visualization, and machine learning. Research software includes:
For any software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) used for research but not created during or with a clear research intent, it is important to consider the restrictions or licenses that apply to the software. Examples:
Re-using (parts of) open-source software in your newly generated research software requires some additional attention.
It is important to review whether the open-source software has a license:
YES | Adhere to the conditions of reuse: Who can use the software? How can the software be used? Can the software be modified, redistributed, and used for commercial purposes? |
NO | We advise you not to reuse the software since the author did not provide permission to copy or reuse it. If you do want to use it, contact the author to verify which license applies. |
Need support in determining the license conditions? Contact the RDM helpdesk.
Good Practices |
Software Management Plan: A guide to implement best practices for research software development facilitating accessibility and reproducibility.
License: free (GitHub Enterprise access via the GitHub Campus Program)
GitHub is a cloud-based platform to store and maintain Git repositories facilitating collaboration. Available via the UHasselt Software Center or via:
Link your ORCID profile to GitHub | If you already have a GitHub account and an ORCID iD, sign in to GitHub and go to https://github.com/settings/profile to authenticate your ORCID iD. On the settings page, scroll down to the button labelled Connect your ORCID iD. Clicking it takes you to the ORCID sign-in page to authorize access. Check out this helpful instruction video to walk through the process. |
Consider the ethical and legal issues when handling personal, confidential, or third-party data. These ethical and legal issues can have implications for data storage, security, preservation, and sharing.
Ethical clearance is required for research proposals funded by Europe, SB, BOF and FWO.
Before the start of the data collection, verify if you need to apply for ethical approval when your research involves:
Before reusing data, verify if you need to apply for ethical approval when your research involves:
Personal data are all information about an identified or identifiable natural person. A natural person is considered identifiable if they can be identified directly or indirectly.
Read more about (processing of) personal data via GDPR for researchers at UHasselt.
At the start of your research project, contact your business developer if your research project includes:
The business developer can verify if you need an agreement or contract, for example:
More information on the Tech Transfer Office intranet (Dutch only)
Clear and detailed documentation of research data is essential to improve the data quality as well as to make your data understandable and (re)usable for yourself and others.
Documentation is needed at two levels: documentation about the entire study or project on the one hand, and documentation about individual records, observations or data points on the other.
Study-level documentation | Study-level documentation provides high-level information about the research context and design, for example, the project title and summary, data collection methods, authors and institutions involved, sources of secondary data, license and identifier for each dataset, folder structure, file naming conventions, versioning system, the relation between files or resulting publications, and other general information. |
Data-level documentation | Data-level or object-level documentation provides in-depth information about individual variables or records, for example, variable names, labels and descriptions, data types (numeric, string, regular expression, date, etc.), units of measurement (cm, kg, etc.), calibration of instruments, controlled vocabulary or ontology terms accepted as values for each variable, missing value codes, etc. |
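Data-level documentation is often kept as a codebook (data dictionary) with one row per variable. The sketch below is one possible way to do this, not a prescribed format. The column names, the example variables and both helper functions are assumptions made for illustration.

```python
import csv
import io

# Hypothetical codebook entries; replace them with your own variables.
CODEBOOK = [
    {"variable": "participant_id", "label": "Pseudonymous participant code",
     "type": "string", "unit": "", "allowed_values": "", "missing_code": ""},
    {"variable": "height", "label": "Body height",
     "type": "numeric", "unit": "cm", "allowed_values": "", "missing_code": "-999"},
    {"variable": "interview_date", "label": "Date of the interview",
     "type": "date", "unit": "", "allowed_values": "YYYY-MM-DD", "missing_code": ""},
]

FIELDNAMES = ["variable", "label", "type", "unit", "allowed_values", "missing_code"]


def codebook_to_csv(entries):
    """Serialise codebook entries to CSV text, one row per variable."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_columns, entries):
    """Return the data columns that have no codebook entry yet."""
    documented = {entry["variable"] for entry in entries}
    return [column for column in data_columns if column not in documented]


print(codebook_to_csv(CODEBOOK))
print(undocumented_columns(["participant_id", "height", "weight"], CODEBOOK))
```

Running the second helper against the column headers of a data file is a quick way to spot variables that still lack documentation.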
Data documentation can take many different forms. Depending on your discipline, examples may include, but are not limited to:
Best practice: create at least one readme.txt-file per dataset
A more general approach to data documentation is a so-called readme.txt file. It is a plain text file in which you bring together all information that peers or your future self might need to understand and (re)use the research data. Such a readme.txt file typically contains more information on:
Like all other forms of data documentation, the file should be created simultaneously with the dataset itself, and updated if needed. Inspiring templates and examples can be found on the websites of Harvard and Cornell University.
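A readme.txt file can be started from a small template so that no item is forgotten. The sketch below is a minimal example under stated assumptions: the field list is borrowed from the study-level documentation items mentioned earlier on this page, not from an official UHasselt template, and the function names are made up for illustration.

```python
# Assumed field list, based on the study-level documentation items above.
README_FIELDS = [
    "Project title",
    "Summary",
    "Authors and institutions",
    "Data collection methods",
    "Sources of secondary data",
    "License",
    "Identifier",
    "Folder structure",
    "File naming conventions",
    "Versioning",
]


def build_readme(info):
    """Return readme text; fields without a value are marked TODO."""
    lines = []
    for field in README_FIELDS:
        value = info.get(field, "").strip() or "TODO"
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n"


def missing_fields(info):
    """List the fields that still need to be filled in."""
    return [field for field in README_FIELDS if not info.get(field, "").strip()]


# Hypothetical project information.
draft = {"Project title": "Example project", "License": "CC BY 4.0"}
print(build_readme(draft))
print(missing_fields(draft))
```

Because the file is plain text, the result can be saved as readme.txt next to the dataset and updated whenever the dataset changes.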
Metadata serve the same purpose as the data documentation described above: they provide all information needed to understand and reuse the data. However, while documentation is written to be interpreted by humans, metadata are automated “translations” of this information that can also be read by machines. They are typically formatted as an .xml or .json file, either embedded in the data file itself or captured separately.
Because these metadata must be machine-readable, they are highly structured and comprise a fixed set of elements, as defined by an established metadata schema. Therefore, it is advisable not to create your own schema but to use an existing and community-endorsed standard.
By doing so, you improve your compliance with the FAIR principles and your funder's requirements.
Domain-specific metadata schema | Depending on your discipline, various domain-specific standards have already been established. You can browse for them using the following websites:
|
Generic metadata schema | If no specific standard for your type of research exists to date, you can always resort to a generic schema, such as Dublin Core. In its most simple form, it comprises 15 elements that can be applied to virtually every discipline. A handy tool to create your own metadata according to this schema can be found here. |
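To give an idea of what a machine-readable record in the generic Dublin Core schema can look like, here is a minimal sketch using Python's standard library. The 15 element names and the namespace are those of the Dublin Core Metadata Element Set. The wrapping `metadata` element, the function name and the example values are assumptions for illustration, and a repository may expect a different container format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Return an XML string with one dc:* element per supplied value."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # assumed wrapper element
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]  # every element may be repeated
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description.
record = build_dc_record({
    "title": "Example survey dataset",
    "creator": ["Doe, Jane", "Doe, John"],
    "date": "2024-05-28",
    "rights": "CC BY 4.0",
})
print(record)
```

Rejecting unknown element names is a small safeguard against inventing your own schema by accident.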
As a UHasselt researcher, you are expected to upload the metadata of the datasets underlying your peer-reviewed publications. This can be done via the UHasselt metadata repository, which uses the metadata standard DataCite. This repository has been integrated into the Document Server (the database to deposit your publications).
Solutions for organizing research data (folder structures, file naming, versioning, etc.), security measures, back-ups, and collaboration. Consider the implications of any legal or ethical issues that might apply to your data.
Date YYYY-MM-DD | Version | Changes | Author (who?) | Motivation & remarks |
2024-05-28 | V1.0 | Initial version | Jane Doe | Draft |
2024-05-29 | V1.1 | Initial version | John Doe | Review |
|
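A version log like the one above can also be maintained with a few lines of code, which helps to keep the date format and the column order consistent. The sketch below is only an illustration. The function names are assumptions, and the entry for V1.1 uses made-up text.

```python
from datetime import date

COLUMNS = ["Date", "Version", "Changes", "Author", "Motivation & remarks"]


def add_version_entry(log, version, changes, author, remarks, when=None):
    """Return a new log with one extra entry; dates are stored as YYYY-MM-DD."""
    when = when or date.today()
    entry = {
        "Date": when.isoformat(),
        "Version": version,
        "Changes": changes,
        "Author": author,
        "Motivation & remarks": remarks,
    }
    return log + [entry]


def log_to_text(log):
    """Render the log as pipe-separated rows, header first."""
    rows = [" | ".join(COLUMNS)]
    rows += [" | ".join(entry[column] for column in COLUMNS) for entry in log]
    return "\n".join(rows)


log = add_version_entry([], "V1.0", "Initial version", "Jane Doe", "Draft",
                        when=date(2024, 5, 28))
# Hypothetical follow-up entry.
log = add_version_entry(log, "V1.1", "Corrected variable labels", "John Doe",
                        "Review", when=date(2024, 5, 29))
print(log_to_text(log))
```

The log itself can be stored as a plain text or CSV file in the same folder as the data it describes.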
More information on secure storage.
Institutional storage solution: Google Shared Drive | Google Drive, accessible by your UHasselt account, is an institutional cloud-based storage solution. The drives can be mapped on your device (G-drive): My Drive and Shared drives. Google Shared Drives is the storage solution recommended by UHasselt for research data and related documentation. A Google Shared Drive is owned by a team of at least two contributors:
Request a Shared drive via the UHasselt IT Service Desk. Good to know:
|
Vlaams Supercomputer Centrum (VSC) | Storage solution for research projects with large datasets and high-performance computing needs. More information is available through the VSC website, such as: If you conduct research in collaboration with private companies, please get in touch with Geert Jan Bex for more information. |
Electronic Research Notebooks (ERN/ELNs) | What is an Electronic Research Notebook (ERN)? Endorsed ELNs
Support by ELN champions These are the UHasselt ELN champions for ELabFTW (Discipline: BIOMED):
If you would like more information about ELNs, please get in touch with our RDM helpdesk. |
Electronic Data Capture (EDC) platform | What is an EDC? Endorsed EDC
Support |
GitHub | License: free (GitHub Enterprise access via the GitHub Campus Program) GitHub is a cloud-based platform to store and maintain Git repositories for software code and text-based data files, facilitating collaboration. Available via the UHasselt Software Center. |
One-time data transfer | Belnet filesender allows you to securely share large data files (up to 5 TB). Do not use external devices (e.g., USB drives, external hard drives), as they carry a risk of data loss and security breaches. If you do need to use an external device, make sure to encrypt it, for example with Bitlocker To Go (available via your Windows device). |
Collaboration | Google (Shared) Drive provides access control, allowing you to share files and folders throughout the research project. It is recommended to add collaborators at the level of the drive instead of at file/folder level to keep an overview of the controlled access ('no access', 'read only', 'read and write', 'admin'). |
Google shared drive | Automatic backups are created if you use the institutionally recommended storage solution, Google Shared Drive. |
Automated backup | Do you use another storage solution (e.g., C-drive of your device)? Do you want to back up, synchronize, or mirror data to various locations?
|
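Whichever backup or mirroring tool you use, it is worth checking now and then that the copy really matches the original. The sketch below compares two folders by SHA-256 checksum using only the Python standard library. It is a minimal illustration, not a UHasselt tool: the function names are assumptions, and files that exist only in the copy are deliberately not reported.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def compare_copies(original, copy):
    """Report files that are missing from the copy or differ in content."""
    source, target = build_manifest(original), build_manifest(copy)
    return {
        "missing": sorted(set(source) - set(target)),
        "changed": sorted(
            name for name in source.keys() & target.keys()
            if source[name] != target[name]
        ),
    }
```

A call such as `compare_copies("path/to/original", "path/to/backup")` (both paths are placeholders) returns two empty lists when every file in the original is present and identical in the copy.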
Data security is relevant to any form or type of data: it protects data from unauthorized access, prevents data loss, and safeguards research integrity.
To keep your data safe, consider the Five Safes framework, which aims for the maximum possible security at 5 levels:
Safe People | Researchers can be trusted to use data properly by
|
Safe Projects | The use of the data is appropriate, lawful, ethical, and sensible:
|
Safe Settings | Digital and physical access control to restrict access to authorized researchers only. |
Safe Data | Limit the risk of a data breach or disclosure. |
Safe Output | Statistical results (published in articles) do not contain confidential information. |
Once your research project is wrapped up, it is important that your data are suitably stored or archived for future reuse in new research and for verification purposes.
Hasselt University recommends in its RDM policy plan to keep relevant research data generated during research projects for a minimum of 5 years for reproducibility, verification and potential reuse:
Valid reasons not to keep certain research data include:
Do not forget to consider the preservation plans for your physical data:
It is recommended to archive your data in a so-called data repository, that is, an online database where you can deposit your dataset(s). It provides many benefits, such as unique and persistent identification of datasets (e.g. DOI), the provision of rich metadata, curation through automatic back-ups and checksums, access control possibilities (e.g. authentication procedures), licensing options, etc. In order to find a trustworthy and appropriate repository for your dataset(s):
✔ | You can search for a suitable domain-specific repository using Re3Data and/or Fairsharing. |
✔ | If you cannot find a domain-specific repository, you can turn to a general-purpose repository, such as Figshare, Dryad, Harvard Dataverse or Zenodo. For an overview, see the Generalist Repository Comparison Chart (3.0) by Stall, S. et al. (2023). |
❌ | It is not advised to use local devices (e.g., USB or external hard drives) for archiving purposes, as you risk losing your data if the device is damaged or lost, and you risk unauthorized access. |
All relevant research data generated during research projects at Hasselt University should be kept for at least 5 years. In this regard, the university is in line with the requirements of some major Flemish funders (including FWO, EOS and VLAIO cSBO), which also mandate a data retention period of at least 5 years.
For clinical trials with medicinal products for human use, the clinical trial master file must be kept for 25 years (Regulation (EU) No 536/2014 of 16 April 2014).
When working with personal data, the General Data Protection Regulation requires that these data are not kept longer than necessary for your current research or for possible further analyses of the data (storage limitation principle). Nonetheless, personal data can be kept longer if they are needed, for example, to follow up in longitudinal studies, to verify published results, to comply with contractual obligations, or to protect Intellectual Property Rights. In addition, when your data subjects have given explicit consent to the processing of their personal data, you can also ask for their permission to keep the personal data for a fixed period of time (e.g. 5 years).
Given the enormous variety of data types, it comes as no surprise that (digital) research data likewise occur in many different technical file formats, depending on the software used for analysis and processing. However, once data analysis is completed and the data are being prepared for long-term archiving, you should consider converting your research data to a more limited set of standard, interchangeable and longer-lasting formats. Using or converting to such sustainable formats ensures the long-term usability, accessibility and sustainability of your data, and consequently is one of the key elements for FAIR data.
This typically means using open or standard formats instead of proprietary ones. Common examples of open formats include: OpenDocument Format (ODF), ASCII, tab-delimited format (.tsv), comma-separated values (.csv) and XML. For more information on recommended file formats, check out these websites: UK Data Service and DANS.
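As a small illustration of converting data to an open, standard format, the sketch below rewrites a delimited text export (for example a semicolon-separated file in a legacy encoding) as comma-separated, UTF-8 encoded CSV, using only the Python standard library. The function name and its default arguments are assumptions for this example. Converting proprietary binary formats usually requires the original software or an additional library and is not covered here.

```python
import csv


def convert_delimited(src_path, dst_path, src_delimiter=";", src_encoding="utf-8"):
    """Rewrite a delimited text file as comma-separated UTF-8 CSV.

    Returns the number of rows written, which can be compared with the
    source as a basic completeness check.
    """
    rows = 0
    with open(src_path, newline="", encoding=src_encoding) as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter=src_delimiter):
            # Values containing commas are quoted automatically by the writer.
            writer.writerow(row)
            rows += 1
    return rows
```

A call such as `convert_delimited("export.txt", "export.csv", src_delimiter="\t")` (file names are placeholders) would convert a tab-delimited file. Keep the original file until you have verified the converted copy.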
Once your research data are safely archived on a trusted platform, you can choose to share these data with others.
When you share your data, your results can be replicated and verified, which enhances the quality and integrity of your research. In addition, you will help to accelerate innovation, because other researchers can build on your findings. If that does not convince you, take a look at what the benefits are for you personally:
You do not have to share all your data, but only the data that are scientifically relevant and crucial for follow-up research.
Think twice about opening up the following types of data:
For more information on sensitive data, third-party data, IPR and valorisation, see the Ethical and legal webpage.
Do you want to hop on the data sharing train, but have no idea where to go? Here are the recommended destinations:
Wherever you deposit or publish your (meta)data, make sure that they adhere to the FAIR principles. For this reason, we do not recommend using any of the following alternative routes to data sharing, as they do not allow for any sort of version control or licensing, and they do not make your data findable and accessible to a wider audience:
When sharing your research data, selecting a suitable reuse license for your data is crucial so that other researchers clearly know under which conditions they can or cannot reuse your data. For example, do you want attribution for your work, or do you want to allow others to use your work commercially, or do you want to allow others to remix, adapt, or build upon your work?