Secure Blocking for Record Linkage

MASTER Assignment

Type : Master M-CS

Period: May, 2022 - Dec, 2022

Student : Witlox, K.H.D. (Kevin, Student M-CS)

Date Final project: Dec 16, 2022

Thesis

Supervisors:

Abstract:

Record linkage is the problem of joining datasets that lack a shared unique identifier. It can be used to combine multiple data sources to answer research and policy questions, and can therefore be a valuable tool. However, even when the desired answer is only an aggregate, record linkage may be infeasible due to privacy concerns, because it requires entire datasets to be shared between parties. This problem may be solved by implementing record linkage as a secure computation, which preserves the privacy of the underlying data. Currently, however, no blocking technique (a pre-processing step that speeds up record linkage) is designed to work as a secure computation, which limits the scalability of secure record linkage solutions. We therefore design and implement the first secure blocking solution. We compare the running time of our solution against that of a secure record linkage solution and show that secure blocking greatly reduces running time.