Data Storage | BMS - BMS Datalab

Storing your research data is important for several reasons. First of all, according to the Netherlands Code of Conduct for Scientific Practice (VSNU, part III), researchers are obliged to store their raw research data for at least ten years (no maximum period) for validation purposes. Secondly, journals or funders may require you to give open access to your research data or at least share your data with other researchers upon request (see Data Sharing).

The UT provides a table with an Overview of the Data Storage at the UT. See also UT Research Support on the topic of storing & sharing research data.

The best storage solution also depends on the specific project. This Decision tool can help you find the most suitable storage solution for your research:

Decision tool for the best storage solution

Storage for Students
Also check the micro-lectures for students teaching how to handle data!
Students can use some of the same storage facilities as staff. Please check the table in the UT overview of data storage for up-to-date information. In short, you can use:
- Google Drive/OneDrive (with your UT account): store personally identifiable data (the server is GDPR compliant) and share data with your supervisor or team members. Capacity: 1 TB.
- SurfDrive: your supervisor can invite you as a student to share data. You cannot use SurfDrive without an invitation from your supervisor.
- BMS LAB: has facilities for you to store research data. Indicate this with a project sign-up at BMS LAB.
- Dropbox/personal devices: store and share anonymized data only (not GDPR compliant).
- M-/P-drive: UT students do not have an M-/P-drive of their own. However, your supervisor can request a shared P-drive with you via the LISA service portal.
NOTE: Never store personally identifiable data of your research participants on your personal laptop or an unprotected USB device! Those data need sufficient protection under the GDPR (Dutch: AVG).
Data preservation and Reuse
Did you finish your thesis? What to do with the research data you collected and processed? Please discuss this with your supervisor(s), as:
- Supervisors decide whether data will be kept, which data, and where, because they are jointly responsible for the integrity of the research.
- The data are more likely to be reused by the supervisors or their students.
If supervisors do not want to archive the data, the UT will not require students to archive the data. Are you wondering whether you can keep the data yourself? If your data do NOT contain personal data, you can keep a copy. If the data do contain personal data, your access to them should be blocked immediately after you finish the research.
Where to store
• Raw research data must be stored for at least 10 years, to the extent that this is compatible with the GDPR, which requires that personal data be stored no longer than necessary.
• Data should never be stored solely on personal and/or local drives. Data storage on the M-/P-drive of the UT is certified according to the ISO/IEC 27001 and NEN 7510 standards. This is the highest level of protection for your personal and sensitive data.
• Raw data can be stored on the central, secured BMS server (also ISO-certified). Privacy-sensitive data of a project can be protected by encryption. Indicate this with a project sign-up at BMS LAB.
• The data files will be stored in the same folder as the EC approval.
• The best storage solution also depends on the specific project. This Decision tool can help you find the most suitable storage solution for your research.
• BMS Datavault: BMS LAB offers a safe vault for your sensitive research data.
• After the research, data will be stored in a trusted repository (e.g. DANS) or permanently stored on one of the secured servers of the faculty. This concerns at least the raw data.
What to Store
- Raw data file: the raw data file contains the originally collected, unprocessed data.
- Derived dataset: the derived dataset is the dataset underlying certain results or publications. You can derive different datasets from your raw data for different purposes.
- Syntaxes: a syntax file contains the code, algorithms or commands used to create your derived dataset from your original, raw dataset. It also contains (stepwise) information about the transformations and analyses performed on the raw dataset.
- Metadata file: a metadata file is a separate file attached to your dataset, which contains information about your dataset for future use (by yourself or others). For example, a metadata file should contain information on the following subjects: creator, access conditions, context, collection methods, time references, structure and organization of data files, variable names, labels and descriptions of variables and values, codes for missing values, file formats, and hard- and software used to process and analyse the data. Examples of metadata standards.
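As an illustration of the metadata file described above, the following is a minimal sketch that writes the listed subjects to a JSON file stored next to the dataset. JSON is only one possible format, and the function name `write_metadata`, the file names and all values are hypothetical placeholders, not a UT or BMS standard.

```python
import json
from pathlib import Path


def write_metadata(data_file: Path, metadata: dict) -> Path:
    """Write the metadata as a JSON file next to the dataset and return its path."""
    sidecar = data_file.with_name(data_file.stem + "_metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar


# Every value below is a placeholder for illustration only.
example_metadata = {
    "creator": "A. Researcher",
    "access_conditions": "On request via the supervisor",
    "context": "Master thesis survey",
    "collection_methods": "Online questionnaire",
    "time_references": {"collected_from": "2024-01-15", "collected_until": "2024-02-15"},
    "file_structure": ["survey_raw.csv", "survey_derived.csv", "derive.py"],
    "file_formats": [".csv", ".py"],
    "variables": {
        "age": {"label": "Age in years", "missing_code": -99},
        "q1": {
            "label": "Satisfaction with the course",
            "values": {"1": "very low", "5": "very high"},
            "missing_code": -99,
        },
    },
    "software": "Python 3.11",
}
```

Calling `write_metadata(Path("survey_raw.csv"), example_metadata)` would create `survey_raw_metadata.json` in the same folder. A plain-text format keeps the metadata readable without specific software.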
As common sense dictates, storing and sharing (sensitive) data should be handled with care (see Guidelines Personal Information). The level of precaution that should be taken depends on the sensitivity of the data, and can range from ‘simple’ precaution to storage on a secured, isolated and off-line computer or encrypted USB sticks in the IGS data vault.
Preferred file Formats
To ensure long-term preservation that is independent of specific software, you are encouraged to save your files in commonly used and easily reusable file formats with open documentation (e.g. .pdf, .csv, .odf, .xml; for image files .jpeg, .tiff and .png). Please find a list of different preferred and acceptable file formats for different types of data here.
At some point during your research you may need to convert or migrate your data files from one format to another or from one system to another. Note that file conversion and migration come with risks, such as loss of information or quality. Prevent problems by cleaning data (e.g. replace special characters in your files, replace footnotes with normal text) before you convert. Always check your files thoroughly after conversion.
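The check after conversion can be partly automated. The sketch below assumes a hypothetical case: a semicolon-separated CSV file in Latin-1 encoding is converted to a comma-separated UTF-8 file, and the cell contents before and after are then compared. The function names, encodings and delimiters are assumptions for illustration. Adapt them to your own files, and still inspect the result manually.

```python
import csv
from pathlib import Path


def read_rows(path: Path, encoding: str, delimiter: str) -> list[list[str]]:
    """Read all rows of a delimited text file as lists of strings."""
    with path.open(newline="", encoding=encoding) as handle:
        return list(csv.reader(handle, delimiter=delimiter))


def convert_csv(source: Path, target: Path,
                source_encoding: str = "latin-1", source_delimiter: str = ";") -> None:
    """Write the source file as a comma-separated UTF-8 file."""
    rows = read_rows(source, source_encoding, source_delimiter)
    with target.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def check_conversion(source: Path, target: Path,
                     source_encoding: str = "latin-1",
                     source_delimiter: str = ";") -> list[str]:
    """Return a list of differences; an empty list means the cell contents match."""
    before = read_rows(source, source_encoding, source_delimiter)
    after = read_rows(target, "utf-8", ",")
    problems = []
    if len(before) != len(after):
        problems.append(f"row count changed: {len(before)} -> {len(after)}")
    for number, (old, new) in enumerate(zip(before, after), start=1):
        if old != new:
            problems.append(f"row {number} differs: {old} -> {new}")
    return problems
```

A typical use would be to call `convert_csv(source, target)` followed by `check_conversion(source, target)` and to investigate any reported difference before discarding the original file.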
Data compression
To be completed.
Reproducibility
In general, any scientific work should be reproducible. This applies to the social sciences as much as it does to the natural sciences. In practice, this means that the whole process of how you handle data should be documented. Gathering, cleaning, coding, transforming and scaling as well as analyses performed should all be documented. It is good practice to perform the above tasks using syntax, and to store the syntax along with the data.
Note that, even though it may be tempting to perform a ‘quick fix’ in the SPSS data view, such a change may become lost or be overlooked, rendering reproduction of the research more difficult.
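To illustrate what working with syntax can look like outside SPSS, here is a minimal Python sketch that creates a derived dataset from a raw file. The column names (`respondent`, `age`), the missing-value code `-99` and the age grouping are hypothetical. The point is only that every cleaning and transformation step is written down in the script and that the raw file is read but never changed.

```python
import csv
from pathlib import Path

MISSING_CODE = "-99"  # assumed code for missing values in the raw file


def derive_dataset(raw_file: Path, derived_file: Path) -> int:
    """Create the derived dataset from the raw file and return the number of rows written.

    The raw file is only read, so the original data stay untouched.
    """
    with raw_file.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))

    derived = []
    for row in rows:
        # Step 1 (cleaning): leave out respondents with a missing age.
        if row["age"] == MISSING_CODE:
            continue
        # Step 2 (transforming): recode age into two groups.
        row["age_group"] = "under_30" if int(row["age"]) < 30 else "30_plus"
        derived.append(row)

    with derived_file.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=["respondent", "age", "age_group"],
            extrasaction="ignore",  # columns not listed above are not written
        )
        writer.writeheader()
        writer.writerows(derived)
    return len(derived)
```

Storing such a script together with the raw data means the derived dataset can be recreated at any time by running it again.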
