Making data FAIR
The UT supports the principles of FAIR data, which means research data, at the latest when they are static, should be Findable, Accessible, Interoperable and Reusable. Of course this holds for shared and published static data, but also for data which are only archived at the UT.
The amount of research data produced by scholars has increased tremendously, and with it the need to get the most out of published research data in the digital era of science, as stated by Wilkinson et al. The FAIR principles, initiated by FORCE11, are guidelines that help researchers make their research data Findable, Accessible, Interoperable and Reusable, with the ultimate goal of making the data reusable by both humans and machines.
FINDABLE: The first step in making research data FAIR is ensuring that the data and metadata can be found. The (meta)data should be uniquely and persistently identifiable through a persistent identifier (PID), such as a DOI. Moreover, it is of great importance that data are described with rich metadata.
ACCESSIBLE: It should be possible for humans and machines to gain access to your data, under specific conditions or restrictions where appropriate. The (meta)data should be easily retrievable through PIDs assigned by the repositories.
INTEROPERABLE: Interoperability is the ability of research data to be easily combined with other datasets, applications and workflows by humans as well as computer systems. This can be achieved by: i) using well-known and preferably open file formats (4TU.ResearchData, DANS-Easy) and software whenever possible, ii) using relevant standards for metadata (4TU.ResearchData, DANS-Easy), and iii) using community-agreed schemas, controlled vocabularies, keywords, thesauri or ontologies where possible.
REUSABLE: Data should be sufficiently well-described with metadata and provenance information (make clear how, why and by whom the data have been created and processed). There should be an accessible data usage license so others know what kinds of reuse are permitted.
Following the FAIR principles brings a great deal of benefits to the academic community as well as to individual researchers, research organizations and funders:
- Achieving maximum potential from research data.
- Achieving maximum impact from research.
- Increasing the visibility and citations of research.
- Improving transparency in research and thus, enabling reproducibility, replicability and reliability of research.
- Speeding up discoveries and revealing new insights into the research, thus facilitating new research questions to be answered. If all researchers make their data FAIR and publicly available it will save a lot of time and money. FAIR data accelerate research and make it more transparent.
- Staying aligned with international standards and approaches.
- Attracting new partnerships with researchers, business, policy and broader communities.
- Staying up to date with new innovative research approaches and tools.
In order to make research data FAIR, be aware of the following:
- FAIR does not necessarily mean that data need to be open. In the cases where the data cannot be made openly accessible, it should be still possible to make the metadata publicly available.
- Metadata is data describing other datasets, by means of formal labels, such as title, creator, year or publisher. Documentation describes data by means of broader categories, such as collection methodology, structure of and relations between data files, and terms of use. Metadata is crucial to find and manage your data, whereas documentation is needed to understand the context of your data and files. Both metadata and documentation are important for reuse of the data.
- FAIR data need a persistent identifier (PID). Trusted repositories, such as 4TU.ResearchData and DANS (list) will assign a PID when publishing datasets.
- Register at ORCID (your personal persistent author identifier) and use this identifier with all your (data) publications.
- Using open, non-proprietary or common file formats will increase accessibility and interoperability.
- Be consistent in your file names, data variables, scripts, script variables and similar annotations.
- Attach the programming scripts you used to gather or analyze your data.
- Provide your data with a clear license that governs the terms of its reuse. Commonly used licenses such as Creative Commons (CC) can be linked to your data or software.
- Create a README file (guidance / template) so that your data can be correctly interpreted and re-analyzed by others.
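As an illustration of the metadata and README points above, the following Python sketch renders a small plain-text README from a metadata dictionary. The field names, the example values and the `build_readme` helper are assumptions made for this example; they are not the UT README template, which remains the authoritative guidance.

```python
# Labels shown in the README, paired with the (hypothetical) metadata keys.
REQUIRED_FIELDS = [
    ("Title", "title"),
    ("Creator", "creator"),
    ("ORCID", "orcid"),
    ("Year", "year"),
    ("Publisher", "publisher"),
    ("License", "license"),
]


def build_readme(meta: dict) -> str:
    """Render a plain-text README from a metadata dictionary.

    Raises ValueError when a required field is missing or empty.
    """
    missing = [key for _, key in REQUIRED_FIELDS if not meta.get(key)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))

    lines = [f"{label}: {meta[key]}" for label, key in REQUIRED_FIELDS]
    # Optional free-text documentation (methodology, file structure, terms of use).
    if meta.get("description"):
        lines += ["", "Description:", str(meta["description"])]
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
example = {
    "title": "Example dataset",
    "creator": "A. Researcher",
    "orcid": "0000-0000-0000-0000",
    "year": 2024,
    "publisher": "University of Twente",
    "license": "CC BY 4.0",
    "description": "Raw measurements and the scripts used to analyze them.",
}
readme_text = build_readme(example)
```

Failing early on missing fields is a deliberate choice in this sketch: an incomplete description is easier to fix before a dataset becomes static than afterwards.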
Your research data are a valuable asset to yourself and to other scientists, society, citizens and entrepreneurs. Therefore, the University of Twente, and a growing number of journals and funders require you to make your data FAIR and if possible open. Your data are ‘FAIR’ if they are easily Findable, Accessible, Interoperable and Re-usable. At the same time, you must comply with privacy legislation. Thus, making your data FAIR can sometimes be complex. In that case, you can ask UT’s FAIR Data Steward for help.
Aims and services
Data preservation means archiving data in a sustainable way. Publishing research data can be seen as sharing them with the public on a structural basis.
Research data is often regarded as the crown jewels of science. It forms the basis of the results of scientific work. The quality of research results is also determined by the possibility of verification by means of the underlying datasets (see Netherlands Code of Conduct for Research Integrity, 3.2 art. 12a). Besides that, scientific development will benefit from sharing and reuse of research data. Good preservation forms the basis of verification, sharing and reuse of the research data.
Data preservation or archiving aims in the first place at preventing physical data loss or destruction and securing the authenticity of data. Besides, it contributes to the quality and impact of your scientific work by enabling verification and possible reuse, for instance for further analysis or follow-up, new research or as a contribution to a data resource for the scientific community.
Watch this video from DANS-KNAW in which scientists explain the importance of preserving the data of their research in a sustainable way.
For archiving data at the UT and sharing in your research group or with specified people outside the group, use Areda.
For publishing data, use 4TU.ResearchData, DANS or another trusted data repository.
Data policies and responsibilities
Research groups are responsible for the care of the data collected or generated in the research project, especially when it has a permanent character. This responsibility for proper data archiving extends beyond the end of the project and is, in the first place, in the group’s own interest. Secondly, this responsibility is based on the general principle, formulated in the UT RDM policy, that intellectual property rights on research data collected or generated by UT staff (“database right”) are vested in UT.
The scope of UT RDM policy and the research group’s responsibility of data archiving does not include research data collected or generated by bachelor or master students. However, study programmes or research groups may have developed their own policies and guidelines and can make use of Areda for archiving these research data.
In general, archiving of the right selection of data is the responsibility of the researcher, (former) project leader, the supervisor in case of research by a BSc or MSc student, or the head of the research group. What materials need to be archived should be in accordance with data policy of the research group or higher organizational entity, and possible agreements with involved third parties.
Be aware that when preserving and publishing research data, contracts and other written agreements between involved parties in a project may contain information about rights and licences related to these data.
Publishing data in a repository
If possible, make research data open by publishing them in a trusted data repository, such as 4TU.ResearchData or DANS. This is highly recommended, in both the individual and the public interest.
One of the services of a trusted repository is the issuing of a persistent identifier, which guarantees sustainable access. Watch this video about Persistent identifiers and data citation explained (Research Data Netherlands).
When publishing in a data repository, it is important for proper reuse to add metadata and a README file with documentation (guidance / template). You can reuse the metadata and README file you added in Areda. In the near future, automatic linking from Areda/Pure to 4TU.ResearchData and DANS will be realized.
Before publishing, please check specific project or research group policies.
Once you have published the research data, you can enhance your publication(s) based on the dataset. You need to let the dataset refer to your article(s), and vice versa.
Both 4TU.ResearchData and DANS offer the possibility to include a reference to your published article(s). This reference will be part of the metadata describing the data.
For upcoming articles, please make sure that the data reference is included in the reference list of your article. It is also recommended to mention this reference in your cover letter, so reviewers can verify your research.
Preserving data at UT (via Areda)
Preserving, or archiving, data means that a copy, including description and documentation, is durably stored at the UT, preferably in the data archive Areda.
Apart from merely storing data for the long term, you can add metadata to make the data findable, as well as proper documentation for the sake of interpretation, verification, interoperability and reuse of the data. For adding metadata and documentation (README file), Areda is linked to the UT Research Information System (Pure).
Areda is the University of Twente archive for the long-term storage of static data, which is collected, generated or used in UT research projects.
Areda offers research groups:
- Certified storage for long-term preservation (at least 10 years) of static datasets.
- A ‘bucket’ where data archive files (zip or tar) can be uploaded by group members.
- Technically suitable storage for all kinds of static data, including special categories of personal data.
- The possibility to add metadata and documentation making use of the UT research information system (Pure).
Please, keep in mind that:
- Areda should not be used for datasets that still might change (dynamic data), or need to be used regularly.
- For the moment there is a maximum upload of 1 TB.
- Access to datasets is at research group level: all group members have access to the datasets in the group’s bucket (folder). Groups and members are based on the HR system.
- In case of uploading personal or other confidential data, access has to be managed by using encryption. Areda offers an encryption instruction. Encryption keys need to be securely managed by at least one permanent staff member of the research group besides the principal investigator, preferably the chair of the research group.
- Areda should not be used for archiving digital informed consent forms or pseudonymization keys. These must be stored encrypted and separately from the anonymized (or pseudonymized) data, for instance in JOIN (for ET, BMS, and ITC, please contact the data steward for the procedure), or the p-drive.
- Although in the group’s bucket a folder structure (by creating a path) is possible, it is advised to use it as a single archive (database) of zip-files containing a structured set of data files of a research project.
- For making overviews and searching of datasets, use the UT Research Information Portal.
Archiving of datasets demands some preparation:
- selecting and organizing the data files,
- writing data documentation,
- creating an archive file (zip or tar), and
- encryption in case of personal or other confidential data.
For more information, please consult the guide Archiving datasets in Areda.
Archiving datasets in Areda means uploading the archive file you created during the preparation to the bucket of your research group. Next, add metadata about the dataset, such as title, creator, etc., and a copy of the README file in the UT research information system (Pure). There you can also link the dataset to one or more of your publications.
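The preparation steps above include creating an archive file (zip or tar). A minimal Python sketch of that step, using only the standard library, might look as follows. The `create_archive` helper, the `README.txt` file name and the folder layout are assumptions for illustration. Encryption of personal or confidential data is not covered here and should follow the Areda encryption instruction.

```python
import zipfile
from pathlib import Path


def create_archive(dataset_dir, archive_path):
    """Zip every file under dataset_dir and return the archived file names.

    Refuses to run when the (assumed) README.txt is missing, so the
    documentation always travels with the data.
    """
    dataset_dir = Path(dataset_dir)
    archive_path = Path(archive_path)
    if not (dataset_dir / "README.txt").is_file():
        raise FileNotFoundError("add a README.txt before archiving")

    names = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(dataset_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in the same folder.
            if not path.is_file() or path.resolve() == archive_path.resolve():
                continue
            arcname = path.relative_to(dataset_dir).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return names
```

Storing paths relative to the dataset folder keeps the structure of the project readable inside the zip file without leaking local directory names.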
[Figure: visual of the archiving intake process]
To help you with the intake process, please consult the guide Archiving datasets in Areda. Metadata will be reviewed by the data steward of the faculty. After this, the archive file will be transferred to the bucket of the research group.
Apart from archiving in Areda, it is recommended to publish research data and share it with others outside the UT. Sharing or publishing datasets means that you upload them to a trusted data repository, preferably 4TU.ResearchData or DANS.
Datasets are first archived in the research group bucket before being moved to and permanently archived at Surf Data Archive (SDA). All research group members have access to the bucket of their own group. Access to files can only be restricted by means of encryption.
It is advisable to add terms of use in the documentation of the dataset in the README file (guidance / template).
After datasets are moved to SDA, they can be retrieved by means of an access request. To make a request, please fill in the form. The department chair will examine your request and decide if access can be granted. You will be notified of the outcome.
The General Data Protection Regulation (GDPR) requires that personal data (any information related to an identified or identifiable natural person) are not kept longer than necessary to achieve the purposes for which they are processed. If longer preservation is needed, anonymization is preferred. More information about handling personal data in research can be found here.
In any case, make sure that you have registered the processing of personal data in compliance with the GDPR. If you indicated in your GDPR registration that personal data need to be preserved, you should act in accordance with what you registered about the preservation and protection of the data.
When archiving datasets with personal data, please be aware that file encryption is needed. Instruction about file encryption is offered during the Areda archiving process.
Areda is a UT facility especially for archiving static datasets. In Areda research groups can easily manage the datasets which need to be archived, not only for verification but also for internal reuse. Areda offers a cheap and reliable object storage for long-term, persistent and immutable archiving.
You can start archiving datasets right away. Guidance is available on the Areda portal.
No, because Areda is not aimed at data publishing. When you also publish the dataset in a data repository (4TU.ResearchData, DANS, etc.), a persistent identifier, such as a DOI, will be issued. Read more about persistent identifiers in Making data FAIR.
UT bachelor and master students cannot archive datasets in Areda by themselves. If you need to archive datasets, ask your supervisor to add them to his or her group’s bucket in Areda. As you are in principle the dataset rights holder, check policies and guidelines of the research group, especially regarding use rights and licenses.
Areda offers you information about encryption and instructions about tools to be used for encrypting data files with personal or other confidential content. Recommended tools can be downloaded and installed by yourself. Please, be aware that encryption demands some extra preparation time.
General metadata can be added in the UT Research Information System. Choose ‘Dataset’ and fill in the information asked, such as Title, Description, Date of data production, Contributors, Publisher, DOI, Access information, Temporal coverage, and Geo location.
Other metadata, and documentation, can be included in the README file which should accompany the dataset and be added to the description in UT Research Information System (guidance / template).
Metadata and documentation elements
|  | Metadata | Documentation |
| --- | --- | --- |
| Descriptive | author, contributor, title, abstract, keywords, measurement type, project ID, geomapping, time period, subject area | software scripts, instrument settings, methodology, experimental protocol, codebook, laboratory notebook |
| Administrative | data format, date, size, access rights, preservation period, persistent identifier, license for use | user agreements, provenance (description of the origin of the data), terms of use |
| Structural | related content, related projects, version | database scheme, relations between files, table of contents |
As yet a maximum volume of archived datasets has not been determined.
Apart from preparing the dataset itself and the documentation (writing a README file and adding metadata), the processing time is largely determined by the size of the zipped data file to be uploaded and the capacity of the internet connection. If you upload a 50 GB zipped data file, it may take at least one hour.
Warning: Make sure your PC or laptop does not go into standby, sleep or hibernate mode during the upload.
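To get a rough feeling for the upload time mentioned above, the arithmetic can be sketched in Python. The connection speeds below are assumed example values, not measured figures for the UT network, and the estimate ignores protocol overhead; `upload_time_hours` is a hypothetical helper.

```python
def upload_time_hours(size_gb: float, speed_mbit_per_s: float) -> float:
    """Estimate the upload time in hours for a file of size_gb gigabytes.

    Uses decimal units: 1 GB = 1000 MB, and 1 byte = 8 bits.
    """
    if speed_mbit_per_s <= 0:
        raise ValueError("speed must be positive")
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / speed_mbit_per_s
    return seconds / 3600


# A 50 GB file over an assumed sustained 100 Mbit/s connection.
estimate = upload_time_hours(50, 100)
print(f"{estimate:.2f} hours")  # prints "1.11 hours"
```

Under these assumptions, a sustained upload speed of about 100 Mbit/s is consistent with the "at least one hour" estimate for 50 GB; slower connections scale the time up proportionally.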
No. Once a dataset is archived, it will remain unchanged. It is a so-called immutable object preserved for at least 10 years. You can archive a newer version separately and indicate the relation with the older version in filename and documentation.
No costs will be charged. The costs are paid from the central budget.
Data are stored on ISO 27001 and NEN 7510 certified servers at the University of Twente. The back-up facility is hosted by Surf; the data centers are located in Utrecht and Amsterdam, the Netherlands.
By default, all members of a research group have access to the datasets in the bucket of the group. Access to data files can be restricted by means of encryption.
By default, datasets will be preserved in Areda for 10 years. In the near future it will be possible to indicate other preservation periods.
Shortly before the preservation period expires, the research group receives a message asking it to decide whether the dataset must be deleted. Prolongation of the preservation period may not be free of charge.
FAQ & Contact
Check these practical questions and answers about archiving research data at the UT. For other questions you can contact the data steward in your faculty.
You can archive datasets in the UT facility Areda. Apart from that you can publish the dataset in a trusted repository, preferably 4TU.ResearchData or DANS.
You can archive all types of datasets, both as supporting material for a publication (PhD theses, journal articles, etc.) and as stand-alone items.
Datasets may be accompanied by related materials, such as
- specific viewing and analysis tools (models, algorithms, scripts, analysis or simulation software, schemas)
- laboratory or field notebooks, diaries
- questionnaires, transcripts, codebooks
- standard operating procedures and protocols
- informed consent forms
Yes, as long as archiving fulfils the following requirements:
- it is aimed at securing data authenticity, verification and/or reuse
- the data are static, so not subject to changes anymore.
The M- and P-drives are not suitable for long-term persistent storage of data because they cannot guarantee that the datasets will remain immutable, which is a prerequisite for the authenticity of the data.
Trusted data repositories are certified because they abide by 16 requirements, such as
- the explicit mission to provide access to and preserve data in its domain,
- having a continuity plan to ensure ongoing access to and preservation of its holdings and
- having adequate funding and sufficient numbers of qualified staff managed through a clear system of governance to effectively carry out the mission.
Whether datasets are FAIR highly depends on the way they are described by means of metadata and documentation, more than merely the location where they are archived or published. Look at Making data FAIR for more information.
The table below presents what to archive and publish, depending on the purpose of preservation.
| Purpose | What to archive/publish |
| --- | --- |
| Verification | Datasets underlying research results in publications, plus analysis tools (scripts, etc.) |
| Reuse | All raw datasets relevant for further or other research, together with necessary scripts, models, software etc. and documentation |
| Data publication | Datasets which are refined for publication, together with additional documentation |
The General Data Protection Regulation (GDPR) requires that personal data (any information related to an identified or identifiable natural person) are not kept longer than necessary to achieve the purposes for which they are processed. If longer preservation is needed, anonymization is preferred.
When datasets contain personal data, please be aware that file encryption is needed when archiving. Instruction about file encryption is offered during the Areda archiving process.
If the data subjects agreed to participate in your research project, it means that they implicitly agreed to the archiving of the research data. Therefore, they cannot demand that their data be erased before the end of the retention period.
If data subjects want to exercise their rights, they can contact the Data Protection Officer (DPO).
Use general, non-proprietary formats to prevent loss of access to files and to improve the chance that the data can still be interpreted in the future. Preferred file formats include PDF, plain text, TIFF, FLAC, CSV and XML (see also the extended lists of formats from DANS or from 4TU.ResearchData).
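A quick check of a file list against the preferred formats above could be sketched in Python as follows. The mapping from format names to file extensions (for example plain text to `.txt`) and the `non_preferred_files` helper are assumptions for this example; the extended lists from DANS and 4TU.ResearchData contain more formats than are shown here.

```python
from pathlib import Path

# Extensions assumed to correspond to the preferred formats named above.
PREFERRED_EXTENSIONS = {".pdf", ".txt", ".tif", ".tiff", ".flac", ".csv", ".xml"}


def non_preferred_files(filenames):
    """Return the file names whose extension is not in the preferred set."""
    return [
        name
        for name in filenames
        if Path(name).suffix.lower() not in PREFERRED_EXTENSIONS
    ]


# Hypothetical file names, for illustration only.
to_review = non_preferred_files(["results.CSV", "survey.xlsx", "notes.txt", "scan.tiff"])
print(to_review)  # prints ['survey.xlsx']
```

Files flagged this way are candidates for conversion to an open format before archiving; whether conversion is appropriate depends on the data and the discipline.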