Balancing Privacy and Accuracy in Counting Traveler Movements: An Evaluation of Encrypted Bloom Filters and K-anonymity
Nadia Shafaeipoursarmoor is a PhD student in the Department of Geo-information Processing. (Co)Promotors are prof.dr.ir. M.R. van Steen and dr. F.O. Ostermann from the faculty of Geo-Information Science and Earth Observation.
Accurately counting the number of people using public transport is crucial for effective urban planning, as it enables better decision-making for optimizing routes, schedules, and overall infrastructure. A key resource for obtaining this data is the information collected through smart card-based automated fare collection (AFC) systems. These systems can provide valuable insights, such as the number of travelers at specific locations and the movement of travelers between different locations, in the form of estimated counts. However, since this data is linked to individuals, significant concerns arise regarding the privacy of travelers.
Although several approaches have been proposed to address these privacy concerns, most solutions have proven insufficient, as they still allow individuals to be identified by tracing their travel patterns. To overcome these challenges, we use two privacy-preserving methods that protect individual data while still enabling the computation of estimated counts of travelers and their movements between locations.
The first method involves anonymizing detection data in real time, achieving what we refer to as detection k-anonymity, which ensures that the privacy of collected data is preserved. The second method serves as an alternative for scenarios where detection k-anonymity yields less accurate traveler counts. This method employs probabilistic data structures known as Bloom filters (BFs), which encode the detection data. The encoded BFs are then encrypted using homomorphic encryption (HE), allowing statistical computations to be performed directly on the encrypted data. This approach ensures that only the final results are revealed to the intended user, without compromising the underlying data. We investigate both fully homomorphic encryption (FHE) and, for scenarios where FHE is impractical, partially homomorphic encryption (PHE), and compare the accuracy and precision of the traveler counts achieved with these two encryption methods.
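To illustrate how a Bloom filter can yield traveler counts without storing card identifiers in the clear, the following is a minimal sketch and not the implementation used in the thesis. It assumes a plain, unencrypted filter; the class name `BloomFilter`, the function `estimate_movement`, the filter size `m`, the number of hash functions `k`, and the card identifiers are all hypothetical choices made for the example. The count is estimated from the fraction of set bits, and the movement between two locations is estimated by inclusion-exclusion on the two filters and their union. The homomorphic encryption step described above is omitted entirely.

```python
import hashlib
import math


class BloomFilter:
    """Plain (unencrypted) Bloom filter; m and k are illustrative values."""

    def __init__(self, m=8192, k=3):
        self.m = m              # number of bits in the filter
        self.k = k              # number of hash functions
        self.bits = [0] * m

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def estimate_count(self):
        # Standard cardinality estimate from the number of set bits x:
        # n ~ -(m / k) * ln(1 - x / m)
        x = sum(self.bits)
        if x == self.m:
            return float("inf")  # a saturated filter carries no count information
        return -(self.m / self.k) * math.log(1 - x / self.m)

    def union(self, other):
        # The bitwise OR of two filters encodes the union of their item sets.
        merged = BloomFilter(self.m, self.k)
        merged.bits = [a | b for a, b in zip(self.bits, other.bits)]
        return merged


def estimate_movement(bf_a, bf_b):
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    return (bf_a.estimate_count() + bf_b.estimate_count()
            - bf_a.union(bf_b).estimate_count())


# Hypothetical detections: 300 cards seen at each location, 100 seen at both.
bf_a, bf_b = BloomFilter(), BloomFilter()
for card in range(0, 300):
    bf_a.add(f"card-{card}")
for card in range(200, 500):
    bf_b.add(f"card-{card}")

count_a = bf_a.estimate_count()
movement_ab = estimate_movement(bf_a, bf_b)
```

In this sketch the estimates are approximate by construction: a smaller `m` or a larger number of travelers raises the fill ratio and widens the error, which is one way the accuracy of counts can depend on the chosen parameters.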
Our research evaluates the effectiveness of these two privacy-preserving methods in providing accurate traveler counts while ensuring privacy. Specifically, it assesses the impact of privacy-preserving techniques on the accuracy of traveler estimations and explores the conditions that allow accurate counting of travelers between locations while preserving privacy. Both methods are implemented and tested using simulated public transport data and real-world datasets. Our results demonstrate that, under specific conditions, it is possible to accurately estimate the number of travelers while maintaining robust privacy protections; these conditions are explored in detail throughout the thesis.