Data collection is essential in the social sciences. To ensure safe and sustainable data collection and storage, BMS Datalab supports researchers with methodological, legal and ethical advice on data collection. In addition, BMS Datalab offers researchers and students of the University of Twente the software and hardware to collect data through online surveys on its own secure server.
Data collection involves understanding the different types of data you collect. Depending on the nature of your research, there are different methods of collecting data and therefore different types of data. Your data may be physical (paper records or archival forms) or digital (database contents or Excel data). The data may come from an external source, be collected by you, or be generated by a machine.
Research should be conducted in such a way that the personal life of subjects is disturbed as little as possible. When collecting data, this means that only data that is necessary for the research should be gathered, and that it should be gathered anonymously whenever possible. This is especially true in the case of special, sensitive data.
Types of research data
Research data can be categorized in different ways. These are criteria for data type distinction:
- Character: qualitative – quantitative
- Source: primary – secondary
- Creation: observational – experimental – simulation – derived or compiled
- Phase: raw – prepared – analysed – published
New (primary) data
Any time personal information is gathered directly from respondents, their express and informed consent is required. Respondents should be aware of what data is gathered, by whom, and where they can turn for additional information. Data gathered should be in proportion to the research aim: researchers should only gather data that is needed to answer the research question, and no more. At the same time, practicality and ethical concerns require that the data gathered is sufficient to answer the research question. Balancing these requirements can be challenging, but at the very least researchers should resist the urge to gather more data just because they can.
Existing (secondary) data
The codes of conduct for research and data protection apply to existing datasets as well. In practice, this means that you can only use existing data when respondents have given their consent to the initial researchers. Additionally, care should be taken when combining datasets: when two anonymous datasets with non-personal information allow individuals to be identified once they are combined, the new dataset should be treated as personally identifiable data.
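The risk of combining datasets can be illustrated with a small check. The sketch below is not a BMS Datalab tool or a complete anonymity assessment; it only counts how many records share the same combination of values. The datasets, the field names (`id`, `postcode`, `birth_year`, `score`) and the helper `smallest_group_size` are all hypothetical and chosen for illustration.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share
    the same combination of values for the given fields."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return min(counts.values())


# Hypothetical dataset A: survey scores with a coarse postcode area.
survey = [
    {"id": 1, "postcode": "7522", "score": 4},
    {"id": 2, "postcode": "7522", "score": 2},
    {"id": 3, "postcode": "7511", "score": 5},
    {"id": 4, "postcode": "7511", "score": 3},
]

# Hypothetical dataset B: birth years for the same respondent numbers.
background = [
    {"id": 1, "birth_year": 1990},
    {"id": 2, "birth_year": 1990},
    {"id": 3, "birth_year": 1985},
    {"id": 4, "birth_year": 1972},
]

# Combine the two datasets on the shared respondent number.
background_by_id = {row["id"]: row for row in background}
combined = [{**row, **background_by_id[row["id"]]} for row in survey]

# Before combining, every postcode area is shared by at least two people.
k_before = smallest_group_size(survey, ["postcode"])

# After combining, postcode plus birth year may single out one person.
k_after = smallest_group_size(combined, ["postcode", "birth_year"])

print(k_before, k_after)
```

In this made-up example the smallest group shrinks to a single record after combining, which is a signal that the combined dataset may need to be treated as personal data. A real assessment would also have to consider which fields an outsider could plausibly know.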
The aim of open science is that researchers reuse other parties' research data and services where possible and make their own data available as far as possible.
Consider reusing existing datasets in your research. Check UT Research support for more data sources.