Privacy-Utility Trade-Off In Healthcare Metadata Sharing And Beyond: A Normative and Empirical Evaluation at Inter and Intra Organizational Levels
Syeda Sohail is a PhD student in the Department of Datamanagement & Biometrics. Promotors are prof.dr.ir. M. van Keulen and dr. F.A. Bukhsh from the Faculty of Electrical Engineering, Mathematics and Computer Science.
Big data analytics plays a crucial role in enabling better-informed and more efficient decision-making across many domains, particularly healthcare. There, the increased reliance on (meta)data raises critical concerns about protecting personally identifiable information (PII) when metadata is shared across institutional boundaries, and the privacy-utility trade-off (PUT) emerges as a central challenge. This trade-off reflects the tension between the utility gained from data analytics and the privacy risks posed to individuals, organizations, and communities at local, national, and international levels. Using healthcare data effectively while maintaining privacy and regulatory compliance remains a significant practical and technical challenge. In its extended form, PUT manifests as a complex interaction between the FAIR (Findable, Accessible, Interoperable, Reusable) and FACT (Fair, Accurate, Confidential, Transparent) data principles, operating at both national and international levels.
In view of these challenges and their multilevel implications for metadata sharing across information systems and organizations in healthcare and beyond, this study offers an overarching normative and empirical evaluation of the privacy-utility trade-off at inter- and intra-organizational levels. It identifies the factors and indicators prioritized in both theory and practice, focusing on actionable measures that organizations can adopt. The research begins in the healthcare domain, where privacy is paramount under the GDPR, and extends to less sensitive domains, such as software solution providers, to derive evidence-based practices for managing privacy risks and legal compliance.
This Ph.D. research systematically evaluates PUT through both normative and empirical methods, following the Design Science Methodology by Roel Wieringa (2014). The normative evaluation (a top-down approach) uses content analysis of scientific literature, regulatory frameworks, and official sources to establish theoretical foundations. The empirical evaluation (a bottom-up approach) applies data-driven analysis, using process mining techniques on real-world metadata from healthcare and beyond, to capture practical insights and process utilization patterns. The findings are represented through conceptual models (REA ontology, e3 value modeling, and BPMN 2.0) and evaluated by domain experts and conceptual modeling specialists to ensure veracity and practical applicability.
The key artifacts proposed and evaluated are multilevel privacy assurance, the privacy-enhancing process mining project methodology (PEPM2), the assessment matrix for PEPM2, the core components for process utility in process mining, and the process utility evaluation matrix (PUEM), which aims to optimize process utility using process mining (PM) while protecting privacy. Evaluations using expert opinion and real-world case studies demonstrate tangible improvements in process optimization, privacy compliance, and data utility. For example, applying PEPM2 made it possible to identify bottlenecks in data-sharing workflows and yielded actionable recommendations for improving workflow efficiency while enhancing privacy preservation.
Despite these contributions, certain limitations must be acknowledged. The evaluation relies on a limited number of expert reviews and case studies, which may constrain generalizability. Furthermore, the rapidly evolving nature of data technologies and privacy regulations poses challenges for long-term applicability. Future research should extend empirical validation across domains and explore dynamic adaptations of the proposed methodologies, evaluation frameworks, and conceptual models to keep pace with the ever-changing data-driven, technological, and regulatory landscapes.




