Preserving and publishing data (FAIR)

Making data FAIR

The UT supports the FAIR principles. In practice, at the latest when a dataset is static (no longer changing), it should be Findable, Accessible, Interoperable and Reusable (FAIR)—for datasets you preserve (archive) at the UT and for datasets you publish in a trusted repository.

What are FAIR principles?

FAIR principles are widely adopted guidelines to make data discoverable and reusable by humans and machines: Findable, Accessible, Interoperable, Reusable, as first formally published by Wilkinson et al. (2016). Applying FAIR makes your datasets findable, accessible under clear conditions, interoperable and reusable, which increases visibility and citations, improves transparency and reproducibility, and reduces duplication and costs. It also keeps you aligned with community standards and fulfils funder requirements while respecting proportionate openness—as open as possible, as closed as necessary.

FAIR in practice
  • FAIR ≠ open: if data can’t be open, publish the metadata and set appropriate access conditions.
  • Use a trusted repository to preserve (archive) and/or publish: e.g. 4TU.ResearchData or DANS Data Stations. A persistent identifier (PID)—usually a DOI—is assigned by the repository on publication.
  • Provide rich metadata and a short README (guidance / template) so others can understand and cite the dataset; see the sketch after this list.
  • Choose a clear licence (e.g. a Creative Commons licence) to state reuse conditions.
  • Favour open or widely used formats and software whenever possible (see the preferred formats from 4TU.ResearchData and DANS), and keep consistent file/folder names.
  • Link your ORCID, include scripts/code, and use community standards (metadata schemas, controlled vocabularies, ontologies), where relevant.
  • For research software/code, see the five recommendations at fair-software.nl (repository, licence, registry, citation, checklist), and the Research Software Management section at the UT Service Portal.
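
As an illustration of the metadata, README and licence points above, the sketch below writes a minimal metadata record and a README skeleton next to a dataset folder. It is a minimal sketch only: the field names, file names and layout are assumptions about the kind of information a repository or the Areda intake will ask for, not a prescribed UT or repository schema.

  # Minimal sketch: generate a metadata record and README skeleton for a dataset.
  # Field names, file names and layout are illustrative, not a prescribed schema.
  import json
  from datetime import date
  from pathlib import Path

  dataset_dir = Path("my_dataset")          # hypothetical dataset folder
  dataset_dir.mkdir(exist_ok=True)

  metadata = {
      "title": "Example dataset title",
      "creators": [{"name": "Surname, Given", "orcid": "0000-0000-0000-0000"}],
      "description": "Short abstract of what the data contain and how they were collected.",
      "keywords": ["example", "FAIR"],
      "license": "CC BY 4.0",               # state reuse conditions explicitly
      "date_published": date.today().isoformat(),
      "related_publication_doi": "",        # fill in once the article is published
  }

  readme_lines = [
      "README",
      f"Title: {metadata['title']}",
      f"License: {metadata['license']}",
      "",
      "Contents: describe the folder structure, file formats and naming conventions.",
      "Methods: describe how the data were collected and processed (instruments, protocols, scripts).",
  ]

  (dataset_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
  (dataset_dir / "README.txt").write_text("\n".join(readme_lines) + "\n")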

For general advice on FAIR at UT, contact the UT FAIR data steward. For discipline-specific questions, consult your faculty data steward(s).

Aims and services

Data preservation (also called archiving) means keeping final research data in a sustainable way so they remain usable over time. Publishing means sharing selected datasets with the public via a trusted repository. For formal definitions, see the UT Research Data Management Policy.

Research data underpin scientific results. Verification depends on access to the underlying datasets, and science benefits from sharing and reuse (see Netherlands Code of Conduct for Research Integrity, 3.2 art. 12a). Good preservation is the basis for verification and reuse; publishing increases discoverability and impact.

Aims

Data preservation (archiving) prevents loss and protects authenticity; it improves research quality by enabling verification and reuse, and it fosters new collaborations.

Services

Data policies and responsibilities

Research groups are responsible for the care of research data collected or generated in their projects, especially when the dataset is static (no longer changing). This responsibility for proper data preservation (also called archiving) continues beyond the end of the project and is grounded in the UT Research Data Management Policy (v4.1), which states that intellectual property rights (“database right”) on research data collected or generated by UT staff are vested in UT. For authoritative rules (definitions, roles, retention), consult the UT Research Data Management (RDM) Policy.

For an overview of applicable policies, visit Data policies & requirements (UT and faculty policies; funder requirements).

Student's research data

Research data collected or generated by bachelor’s or master’s students fall outside the scope of the UT RDM Policy and the research group’s responsibility for data preservation (archiving). However, study programmes or research groups may have their own policies and guidelines (e.g., on selection, retention periods, supervision, or where to store data). Where appropriate, they may use Areda to preserve student datasets that are static (no longer changing). Please consult your programme or research group for the applicable guidance.

Data selection

In general, selecting the research data to preserve (archive) is the responsibility of the researcher, the (former) project leader, the supervisor in case of BSc/MSc students, or the head of the research group. What must be preserved should follow the data policy of the research group or higher organisational entity and any agreements with involved third parties. (For retention, see UT RDM Policy Section 3.6: at least ten years.)

Third party contracts

Before preserving or publishing, check the applicable agreements (e.g., project or consortium agreements) and follow their conditions alongside the UT RDM Policy.

Publishing data in a repository

Publishing research data in a trusted repository improves discoverability, citation, and reuse. Where appropriate, make data open, following the principle “as open as possible, as closed as necessary.” Use a trusted repository such as 4TU.ResearchData or DANS.

Sustainable access and reuse

Trusted repositories assign a persistent identifier (PID)—usually a DOI—to support long-term access and citation. Watch this video about Persistent identifiers and data citation explained (Research Data Netherlands).

When publishing in a data repository, it is important for proper reuse to add metadata and a README file with documentation (guidance / template). You can reuse the metadata and README you added in Areda. Before publishing, please check specific project or research group policies. See also the step-by-step publishing guidelines for 4TU.ResearchData and for DANS.

Enhancing your publication

Once your dataset is published, cross-link it with your articles. Add the article DOI in the dataset metadata (where supported), and cite the dataset DOI in your article (Data Availability Statement and reference list).

Both 4TU.ResearchData and DANS Data Stations support adding references to related publications in the dataset record.

For new submissions, consider mentioning the dataset citation in your cover letter so reviewers can verify the data.

Preserving data at the UT (Areda)

Data preservation (archiving) means storing a copy of the dataset—together with its description and documentation—durably at the UT, preferably in the Areda data archive. Beyond long-term storage, adding metadata is crucial for findability, and adding clear documentation is essential for interpretation, verification, interoperability and reuse. For adding metadata and a README, Areda is linked to the UT Research Information System (PURE).

What is Areda

Areda is the University of Twente archive for the long-term storage of static data collected, generated or used in UT research projects.

Areda offers research groups:

  • Certified storage for long-term preservation (at least 10 years) of static datasets.
  • A ‘bucket’ where data archive files (zip or tar) can be uploaded by group members.
  • Technically suitable storage for all kinds of static data, including special categories of personal data.
  • The possibility to add metadata and documentation using the UT research information system (Pure).

Please, keep in mind that:

  • Areda should not be used for datasets that might still change (dynamic data) or that need to be used regularly.
  • For the moment, the maximum upload size is 1 TB.
  • Access to datasets is at research group level: all group members have access to the datasets in the group’s bucket (folder). Groups and members are based on the HR system.
  • When uploading personal or other confidential data, access has to be managed using encryption; Areda offers encryption instructions. Encryption keys need to be securely managed by at least one permanent staff member of the research group besides the principal investigator, preferably the chair of the research group.
  • Areda should not be used for archiving digital informed consent forms or pseudonymization keys. These must be stored encrypted and separately from the anonymized (or pseudonymized) data, for instance in JOIN (for ET, BMS, and ITC, please contact the data steward for the procedure), or the p-drive.
  • Although a folder structure (created via a path) is possible in the group’s bucket, it is advised to use the bucket as a single archive (database) of zip files, each containing a structured set of data files from a research project.
  • To create overviews of and search for datasets, use the UT Research Information Portal.

Preparation for archiving datasets

Archiving of datasets demands some preparation:

  • selecting and organizing the data files,
  • writing data documentation,
  • creating an archive file (zip or tar), and
  • encryption in case of personal or other confidential data.

For more information, please consult the guide Archiving datasets in Areda.
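
As a sketch of the "creating an archive file" step, the snippet below packs a project folder into a single zip file and records a SHA-256 checksum that can be noted in the README, so the archived copy can later be verified. The folder and file names are hypothetical, and recording a checksum is good practice rather than an Areda requirement.

  # Pack a project folder into a zip file and record a SHA-256 checksum.
  # Folder and file names are hypothetical; follow your group's conventions.
  import hashlib
  import shutil
  from pathlib import Path

  project_dir = Path("project_2024_dataset")   # the selected, organised data files
  archive_path = Path(shutil.make_archive("project_2024_dataset_v1", "zip", project_dir))

  sha256 = hashlib.sha256()
  with archive_path.open("rb") as f:
      for chunk in iter(lambda: f.read(1024 * 1024), b""):
          sha256.update(chunk)

  # Store the checksum alongside the archive and mention it in the README.
  Path(f"{archive_path}.sha256").write_text(f"{sha256.hexdigest()}  {archive_path.name}\n")
  print("Created", archive_path, "with checksum", sha256.hexdigest())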

Archiving datasets in Areda

Archiving datasets in Areda means uploading the archive file you created during the preparation to the bucket of your research group. Next, add metadata about the dataset (title, creator, etc.) and a copy of the README file in the UT research information system (Pure). There you can also link the dataset to one or more of your publications.

To help you with the intake process, please consult the guide Archiving datasets in Areda.

Metadata will be reviewed by the data steward of the faculty. After this, the archive file will be transferred to the bucket of the research group.

Apart from archiving in Areda, it is recommended to publish research data and share them with others outside the UT. Sharing or publishing datasets means that you upload them to a trusted data repository, preferably 4TU.ResearchData or DANS.

Access and sharing

Datasets are first archived in the research group bucket before being moved to and permanently archived at the SURF Data Archive (SDA). All research group members have access to the bucket of their own group. Access to files can only be restricted by means of encryption.

It is advisable to add terms of use in the documentation of the dataset in the README file (guidance / template).

After datasets are moved to the SDA, they can be retrieved by means of an access request. To make a request, please fill in the form. The department chair will examine your request and decide if access can be granted. You will be notified of the outcome.

Personal data

The General Data Protection Regulation (GDPR) requires that personal data (any information relating to an identified or identifiable natural person) are not kept longer than necessary to achieve the purposes for which they are processed. If longer preservation is needed, anonymization is preferred. More information about handling personal data in research can be found here.

In any case, make sure you have registered the processing of personal data in compliance with the GDPR. If you indicated in your GDPR registration that personal data need to be preserved, act in accordance with that registration when preserving and protecting the data.

When archiving datasets with personal data, please be aware that file encryption is needed. Instructions about file encryption are provided during the Areda archiving process.

Questions and Answers

Why is Areda the best place to archive datasets?

Areda is a UT facility specifically for archiving static datasets. In Areda, research groups can easily manage the datasets that need to be archived, not only for verification but also for internal reuse. Areda offers cheap and reliable object storage for long-term, persistent and immutable archiving.

Can I start archiving datasets in Areda right away?

Yes, you can start right away. Guidance is available on the Areda portal.

Does Areda issue a persistent identifier with the archived dataset?

No, because Areda is not aimed at data publishing. When you also publish the dataset in a data repository (4TU.ResearchData, DANS, etc.), a persistent identifier, such as a DOI, will be issued. Read more about persistent identifiers in Making data FAIR.

As a UT bachelor or master student, can I archive datasets in Areda?

UT bachelor and master students cannot archive datasets in Areda themselves. If you need to archive datasets, ask your supervisor to add them to their group’s bucket in Areda. As you are in principle the rights holder of the dataset, check the policies and guidelines of the research group, especially regarding use rights and licenses.

How can I encrypt data files in Areda?

Areda offers information about encryption and instructions on the tools to use for encrypting data files with personal or other confidential content. You can download and install the recommended tools yourself. Please be aware that encryption requires some extra preparation time.
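
As an illustration only (the Areda encryption instructions and recommended tools are leading), the sketch below shows symmetric encryption of an archive file with the Python cryptography package (Fernet). The file names are hypothetical, the whole file is read into memory (fine for modest archives; a streaming tool is preferable for very large ones), and the key must be stored securely outside Areda.

  # Illustration only: symmetric encryption of an archive file with the Python
  # "cryptography" package (pip install cryptography). Follow the Areda
  # encryption instructions for actual archiving; file names are hypothetical.
  from pathlib import Path
  from cryptography.fernet import Fernet

  key = Fernet.generate_key()                 # store securely, NOT in Areda
  Path("dataset_v1.key").write_bytes(key)     # e.g. hand over to a permanent staff member

  fernet = Fernet(key)
  plaintext = Path("project_2024_dataset_v1.zip").read_bytes()
  Path("project_2024_dataset_v1.zip.enc").write_bytes(fernet.encrypt(plaintext))

  # Decryption later requires the same key:
  # original = Fernet(Path("dataset_v1.key").read_bytes()).decrypt(
  #     Path("project_2024_dataset_v1.zip.enc").read_bytes())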

Which metadata and documentation should I add when archiving datasets in Areda?

General metadata can be added in the UT Research Information System. Choose ‘Dataset’ and fill in the requested information, such as Title, Description, Date of data production, Contributors, Publisher, DOI, Access information, Temporal coverage, and Geo location.

Other metadata, and documentation, can be included in the README file which should accompany the dataset and be added to the description in UT Research Information System (guidance / template).

Metadata and documentation elements

Descriptive
  • metadata: author, contributor, title, abstract, keywords, measurement type, project ID, geomapping, time period, subject area
  • documentation: software scripts, instrument settings, methodology, experimental protocol, codebook, laboratory notebook

Administrative
  • metadata: data format, date, size, access rights, preservation period, persistent identifier, license for use
  • documentation: user agreements, provenance (description of the origin of the data), terms of use

Structural
  • metadata: related content, related projects, version
  • documentation: database scheme, relations between files, table of contents

Is there a maximum volume of datasets I can archive in Areda?

A maximum volume of archived datasets has not yet been determined.

How much time will the archiving process take?

Apart from the time needed to prepare the dataset itself, write a README file and add metadata, the processing time is largely determined by the size of the zipped data file to be uploaded and the capacity of the internet connection. Uploading a 50 GB zipped data file, for example, may take at least one hour.

Warning: Make sure your PC or laptop does not go into standby, sleep or hibernate mode during the upload.
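
For a rough estimate of your own upload time, you can work from the file size and a realistic sustained network speed, as in the back-of-the-envelope calculation below; the 100 Mbit/s figure is an assumed example throughput, not an Areda specification.

  # Back-of-the-envelope upload time; replace both numbers with your own situation.
  size_gb = 50                     # size of the zipped data file in GB
  throughput_mbit_s = 100          # assumed sustained network speed in Mbit/s

  seconds = size_gb * 8 * 1000 / throughput_mbit_s
  print(f"~{seconds / 60:.0f} minutes")   # 50 GB at 100 Mbit/s -> roughly 67 minutes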

Can I replace an archived dataset with a new version?

No. Once a dataset is archived, it remains unchanged: it is a so-called immutable object, preserved for at least 10 years. You can archive a newer version separately and indicate the relation to the older version in the filename and documentation.

What are the costs charged for archiving datasets in Areda?

No costs are charged; archiving is paid from the central budget.

What is the storage location of the datasets when archived in Areda?

Data are stored on ISO 27001 and NEN 7510 certified servers at the University of Twente. The back-up facility is hosted by SURF, with data centers located in Utrecht and Amsterdam, the Netherlands.

Who can access the datasets archived in Areda?

By default, all members of a research group have access to the datasets in the group’s bucket. Access to data files can be restricted by means of encryption.

What is the preservation period of the datasets when archived in Areda?

By default, datasets will be preserved in Areda for 10 years. In the near future, it will be possible to indicate other preservation periods.

What happens to datasets archived in Areda after the preservation period has expired?

Shortly before the preservation period expires, the research group receives a message asking whether the dataset should be deleted. Prolonging the preservation period may not be free of charge.

Can I make a request to obtain access to an archived dataset?

To make a request, please fill in the form. The department chair will examine your request and decide if access can be granted. You will be notified of the outcome.

FAQ & Contact

Check these practical questions and answers about archiving research data at the UT. For other questions you can contact the data steward in your faculty.                                         

Where can I archive and publish datasets?

You can archive datasets in the UT facility Areda. Apart from that, you can publish datasets in a trusted repository, preferably 4TU.ResearchData or DANS.

What kind of data materials can I archive?

You can archive all types of datasets, both as supporting material for a publication (PhD theses, journal articles, etc.) and as stand-alone items.

Datasets may be accompanied by related materials, such as:

  • specific viewing and analysis tools (models, algorithms, scripts, analysis or simulation software, schemas)
  • laboratory or field notebooks, diaries
  • questionnaires, transcripts, codebooks
  • standard operating procedures and protocols
  • informed consent forms

Can I archive datasets at any moment during my research?

Yes, as long as archiving fulfils the following requirements:

  • it is aimed at securing data authenticity, verification and/or reuse
  • the data are static, i.e. no longer subject to change.

Should I archive all datasets?

Not necessarily. Selecting which datasets to archive is the responsibility of the researcher and the research group; see Data selection above and “What data should I archive and publish?” below.

Can I archive datasets on the M- or P-drive?

The M- and P-drives are not suitable for long-term persistent storage of data because they cannot guarantee that the datasets will remain immutable, which is a prerequisite for the authenticity of the data.

Why is a trusted data repository the best place to publish datasets?

Trusted data repositories are certified because they abide by 16 requirements, such as:

  • the explicit mission to provide access to and preserve data in its domain,
  • having a continuity plan to ensure ongoing access to and preservation of its holdings and
  • having adequate funding and sufficient numbers of qualified staff managed through a clear system of governance to effectively carry out the mission.

Are the datasets FAIR when archived in Areda and published in a trusted data repository?

Whether datasets are FAIR depends largely on how they are described by means of metadata and documentation, rather than merely on where they are archived or published. See Making data FAIR for more information.

What data should I archive and publish?

The overview below shows what to archive and publish, depending on the purpose of preservation.

  • Verification: datasets underlying research results in publications, plus analysis tools (scripts, etc.)
  • Reuse: all raw datasets relevant for further or other research, together with the necessary scripts, models, software, etc. and documentation
  • Data publication: datasets refined for publication, together with additional documentation

How can I archive personal or other confidential data?

The General Data Protection Regulation (GDPR) requires that personal data (any information related to an identified or identifiable natural person) are not kept longer than necessary to achieve the purposes for which they are processed. If longer preservation is needed, anonymization is preferred.

When datasets contain personal data, please be aware that file encryption is needed when archiving. Instructions about file encryption are provided during the Areda archiving process.

Can data subjects demand that their data are to be erased?

If data subjects agreed to participate in your research project, they implicitly agreed to the archiving of the research data. Therefore, they cannot demand that their data be erased before the end of the retention period.

If data subjects want to exercise their rights, they can contact the Data Protection Officer (DPO).

Which formats should I choose for the data files?

Use general, non-proprietary formats to prevent loss of access to files and to enhance the chance of future interpretability of the data. Preferred file formats include PDF, plain text, TIFF, FLAC, CSV and XML (see also the extended lists of preferred formats from DANS and from 4TU.ResearchData).
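
As a minimal sketch of such a conversion, the snippet below exports data held in a proprietary spreadsheet to CSV, an open and widely supported format. The file names are hypothetical, and the pandas and openpyxl packages are assumed to be installed (pip install pandas openpyxl).

  # Export a proprietary spreadsheet to an open, preferred format (CSV).
  # File names are hypothetical; pandas and openpyxl are assumed to be installed.
  import pandas as pd

  df = pd.read_excel("measurements.xlsx")      # proprietary source format
  df.to_csv("measurements.csv", index=False)   # open, widely supported format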
