PhD Defence Son Minh Nguyen | Underspecification of Transformers in Indoor Localization and Indoor Navigation

The PhD defence of Son Minh Nguyen will take place in the Waaier building of the University of Twente and can be followed by a live stream.

Son Minh Nguyen is a PhD student in the Department of Pervasive Systems. (Co)Promotors are prof.dr.ir. M.R. van Steen, dr. D.V. Le Viet Duc and dr. Ö. Durmaz-Incel.

Transformer architectures have gained widespread popularity, especially with the rise of large language models that successfully unify natural language understanding and computer vision. Despite their success in broad domains such as machine translation, text summarization, and video question answering, the generalization of transformer architectures remains limited, particularly in niche applications that demand specialized understanding of domain-specific modalities, such as indoor localization and inertial navigation. In this thesis, we identify fundamental limitations of standard transformer models that hinder their adoption in these contexts. Motivated by these limitations, we introduce a series of technical innovations designed to extend the capabilities of transformer architectures, making them more effective and adaptable to the unique challenges posed by indoor localization and inertial navigation tasks.

The first two parts of this thesis focus on the indoor localization domain using Received Signal Strength (RSS) measurements. In Part I, we conduct a detailed examination of key components within standard transformer architectures and propose specialized modifications to enhance their ability to learn location-specific patterns, an essential feature for processing RSS measurements in indoor localization. In particular, we develop and present three specialized transformer variants, each designed to address the localization task at different levels of abstraction and operational granularity.

In Part II of the thesis, we focus on enhancing the generalizability of indoor localization models by enabling knowledge transfer across diverse RSS fingerprint datasets. This task presents unique challenges due to discrepancies in infrastructure, device configurations, and environmental conditions inherent to independently collected datasets. To address these issues, we propose a novel plug-and-play knowledge transfer framework that aligns underlying representations across datasets, enabling the learning of transferable features that are robust to environmental dynamics. This approach significantly enhances the performance and generalizability of state-of-the-art localization models.

In the final part of the thesis, we turn to the indoor navigation domain, extending our investigation to inertial navigation using Inertial Measurement Unit (IMU) signals. Derived primarily from accelerometers and gyroscopes, these signals capture motion dynamics but pose unique challenges, such as high sensor noise and substantial variability in human motion, that significantly hinder the effectiveness of conventional deep learning models. To overcome these issues, we propose a novel transformer-based architecture that progressively decomposes complex IMU sequences into interpretable motion components, thereby enhancing the detection and representation of motion events. In addition, we introduce a particle-based mechanism that adaptively fine-tunes motion representations to model the inherent uncertainty in human motion and sensor readings. This architecture demonstrates substantial improvements in trajectory reconstruction quality.