Professor Rob Lammertink, head of the Soft Matter, Fluidics and Interfaces (SFI) department, believes that research-data management is an important skill for young researchers to learn. He is working on standardising and streamlining data-management processes. To gain a better sense of what is needed to ensure that archived data can be reused by others, his department participated in the pilot programme for the UT archive for research data: Areda.
SFI is part of the Membranes, Science and Technology cluster (MST), a group of various departments that conduct research and give instruction on both the material and process-related aspects of membranes. The cluster employs around 70 people, including researchers, PhD students, technicians, and support staff.
Experimental and measurement data
‘We generally work in experimental labs on the characterisation of materials and structures, and we collect a lot of experimental data,’ says Lammertink. ‘We look at the properties of polymers, for example, and the performance indicators of separation processes. We also work with microscopy and electron microscopes, so a lot of what we do involves images and video footage. We measure flow, concentration, and speed profiles, and analyse structures. We also run simulations, which means working with mathematical and numerical models. They produce data, and the model itself is also a research result. Most of what our cluster produces is measurement data.’
Article data accessible through 4TU.ResearchData
Until now, when MST researchers were done with their research, they would store their data on external hard drives and in lab journals, Lammertink explains. ‘For each article published by my own department, we already include the associated data in the 4TU.ResearchData repository, where it is publicly available for reuse by other researchers. We also put metadata into 4TU.ResearchData. But data and datasets that are not associated with any article are currently stored only on external hard drives, making it hard for others to find and use them. That's why I'm excited about Areda: the UT archive for research data, where data can be stored safely and long-term.’
Making data easy to find and reuse
The first step in archiving research data in Areda is proper documentation, so that the data can easily be reused by both the original researcher and others, Lammertink continues. ‘Areda forces you to think hard about how to organise your data right from the start of the research project. That makes me very happy, because sometimes we find out after four years that somebody has stored their data in a way that seems illogical to me. That makes finding data – such as experimental results – a huge challenge. Developing a more uniform data-management process will make data easier to reuse, for both the original researcher and their colleagues. I want researchers to be more aware from the beginning that their data must be easy for other people to find. Right now, we can find the data, but it often requires a great deal of effort.’
For the Areda pilot, Lammertink asked two recent PhD graduates to archive their data in Areda. ‘We wanted to test user-friendliness and the process in general, to see what problems might be encountered when creating a research-group repository (a “bucket”) in Areda. We also wanted to know what steps during the research process are necessary to ensure the data can be easily archived in Areda.’
Lab journals in English
‘One of the two students is Chinese, and she wrote all her lab journals using Chinese characters. So now, if I want to look up experiments in her lab journal, I need somebody who can read Chinese. In the future, I want everybody to draw up their documentation in English and to incorporate metadata at the outset. The Chinese student's contract is already over, and it would be simply impossible for her to translate everything and add the metadata now. So, we only asked her to create a dataset and a bucket, so we can test for problems in the process. I am now looking to see how well I can navigate the dataset. The other student, it turned out, had already included descriptions in her Excel file, which made for adequate documentation. Including her lab journal in Areda may not even be necessary anymore. We want to use these experiences to establish a data-management process for the future.’
‘From now on, we want all new PhD students to realise that the data they generate over the next four years must be processed and stored in a way that allows for easy archiving in Areda. Among other things, that means adding metadata at various stages. For those whose research is already underway, we are informing them about Areda so they can make advance preparations for archiving their data. Good data management takes time, but starting out right saves time down the line, so on balance it does not take any extra time.’
Greater curriculum focus on RDM
At the upcoming MST summer school, Lammertink is emphasising the importance of Research Data Management (RDM). ‘I want to talk about publishing data in 4TU.ResearchData, as well as archiving in Areda. Training new researchers is one of our cluster's most important responsibilities. Research skills also include writing for publication, giving presentations, and – last but not least – proper data management. The last of these is still a little neglected. At the summer school, we are planning an open discussion on what is already going well, what could be improved, and where support and training are necessary. It is my feeling that the curriculum could include a greater focus on research-data management. The nice thing is that researchers can now get hands-on data-management experience in 4TU.ResearchData, Pure and Areda.’
Lammertink has seen that researchers in the wider membrane field are open to sharing their data. One example is the Belgian website https://openmembranedatabase.org, where a lot of information is available on various types of membranes. Lammertink also sees it as a ‘plus’ when researchers reuse his data for other models.
More about archiving data in Areda
Each UT research group has its own section in Areda, to which only researchers of that group have access. Before archiving data, you must provide documentation. This makes it easier for your research-group colleagues (and yourself) to reuse the data. Safe and secure archiving makes it possible to verify and correctly interpret your research (data). UT researchers can use Areda free of charge: it is paid for from the central budget.
When data are properly archived in Areda, they remain accessible and reusable even after PhD candidates or fellow researchers have left the UT. You can link from your publications to the dataset in the UT research information system.
Digital Competence Centre and data stewards
You can ask your faculty’s research data steward(s) (who are part of the Digital Competence Network) for support and advice before and while archiving the data. They will also perform a final check.