Heterogeneous Datasets Fusion for Human Analytics

At imec, Body Area Networks

Background and problem statement

With the advent of commercial wearable devices, the amount of physiological data gathered from our bodies is growing rapidly. This thesis explores the feasibility of aggregating data from heterogeneous datasets so that standard analytic tools can be applied to them.

Skills

Fluent in at least one of MATLAB/Python and Java

Strong background in Data Mining and Information Theory

Knowledge of Big-data Analysis is highly valuable

Knowledge of Distributed Computing is highly valuable

Motivated student eager to work independently and expand knowledge in the field

Good written and verbal English skills

Assignment

We live in a digital era in which the amount of data created is on the order of several exabytes per day (1 exabyte = 10^18 bytes). To cope with this volume, Big-data tools allow the aggregation and fusion of data from heterogeneous sources.
With the advent of commercial wearable devices, the amount of physiological data gathered from our bodies is growing rapidly too. Nevertheless, analytics on human data are generally performed on a single dataset collected under predefined conditions. Datasets gathered from heterogeneous devices are not analyzed jointly. In particular, when data are collected under different conditions, there is currently no suitable way to analyze them together.
In this thesis, we want to explore the feasibility of aggregating data from heterogeneous datasets collected from different wearable devices in order to run standard analytic tools on them. In particular, we want to study the boundary conditions under which data can be aggregated, and to define suitable Information Theory measures that indicate whether an aggregation helps or harms the creation of learning models.
The successful candidate has knowledge of Data Mining and Information Theory and is fluent in at least one of MATLAB/Python and Java. Knowledge of Distributed Computing and Big-data analysis is highly valuable.

Components

 Literature Review on Information Fusion with focus on Wearable Sensors

 Setting up a virtual Big-data analysis framework for dataset fusion

 Development of state-of-the-art Distributed Data Analytics Algorithms

 Definition of Information Theory measures describing the data aggregation process
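
To give a flavour of the last component, here is a minimal sketch of one candidate measure, the Jensen-Shannon divergence between the value distributions of the same signal recorded by two devices. It is an illustration only, not the measure the thesis is expected to arrive at. The function names, the bin settings, and the heart-rate values are all hypothetical and made up for the example.

```python
import math


def histogram(samples, bin_width, lo, hi):
    """Normalized histogram of samples over fixed-width bins covering [lo, hi)."""
    n_bins = int(math.ceil((hi - lo) / bin_width))
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            idx = min(int((x - lo) // bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical heart-rate samples (beats per minute) from two different wearables.
device_a = [62, 64, 67, 71, 73, 75, 78, 81]
device_b = [65, 66, 70, 72, 74, 79, 83, 88]

# Both datasets are binned identically so that the distributions are comparable.
p = histogram(device_a, bin_width=10, lo=60, hi=100)
q = histogram(device_b, bin_width=10, lo=60, hi=100)

print("JS divergence (bits):", js_divergence(p, q))
```

A value close to 0 suggests the two datasets have similar distributions for this signal, while a value close to 1 suggests they differ strongly. Whether a low divergence actually predicts a useful aggregation is one of the questions the thesis would need to investigate.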

Educational program

MSc

Computer Science / Software Engineering

Research theme

From Human Sensory-Motor Function to Patient-Practitioner Interaction

Principal Investigator track

H.Hermens?

B.J. van Beijnum?

Supervision and info

imec supervision:

Pierluigi Casale

(Pierluigi.Casale@imec-nl.nl)

UT supervision:

?

For all inquiries, please contact:

Ms Sandra Maas, Management Assistant Human Resources.

Telephone number: +31 (0)40 40 20 500