Identifying Application Phases in Mobile Encrypted Network Traffic

Master Assignment

Type: Master M-CS

Location: University of Twente

Period: Jan, 2018 - Sep, 2019

Student: Teesselink, T. (Tycho, Student CS)

Date Final project: September 27, 2019

Thesis

Supervisors:


Abstract:

Mobile devices have overtaken personal computers for everyday tasks. These devices produce massive amounts of data that contain valuable information. Two fields that rely on monitoring such mobile data are application identification and user action identification. These fields focus on identifying a single user action or individual applications out of a known set. Monitoring this traffic can be useful for, among other things, fingerprinting traffic, intrusion detection and user profiling. One limitation of previous work is that it applies to only a single user action or application. In this paper we generalise the concept of user actions by introducing mobile application phases. An application phase describes the state an application is in after a set of user actions has been performed. In contrast to user actions, these phases are application-agnostic. This means that a method capable of classifying application phases is scalable and not limited to known applications. We formally define seven different application phases and show how to detect them in Android logs. We also present four different algorithms to detect these application phases in encrypted network traffic. We look at network traffic because it makes the method more scalable than a host-based solution and is less privacy-invasive. These algorithms treat network data as a time series rather than as a set of flows, in order to take advantage of periods in which network data is scarce. To assess the quality of these algorithms we generated two novel datasets consisting of encrypted network data of 361 Android applications. We were able to detect the installation of applications with 100% accuracy and distinguish foreground from background traffic with 93% accuracy.
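
To illustrate the difference between a flow perspective and a time-series perspective mentioned in the abstract, the following is a minimal sketch and not one of the four algorithms from the thesis. The `Packet` type, its field names, the one-second bin width and the example addresses are all assumptions made for illustration. The point it shows is that fixed-width time bins keep silent periods visible as zeros, whereas grouping by flow discards them.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; only metadata, no payload is needed."""
    timestamp: float  # seconds since the start of the capture
    src: str
    dst: str
    size: int  # bytes on the wire


def flow_view(packets):
    """Total bytes per (src, dst) pair. Timing and silent periods are lost."""
    flows = defaultdict(int)
    for p in packets:
        flows[(p.src, p.dst)] += p.size
    return dict(flows)


def timeseries_view(packets, bin_seconds, duration):
    """Bytes per fixed-width time bin. Bins without traffic stay at 0."""
    n_bins = math.ceil(duration / bin_seconds)
    series = [0] * n_bins
    for p in packets:
        idx = int(p.timestamp // bin_seconds)
        if 0 <= idx < n_bins:  # ignore packets outside the window
            series[idx] += p.size
    return series


# Assumed example capture: two packets early on, then silence, then one more.
packets = [
    Packet(0.2, "192.0.2.1", "198.51.100.7", 100),
    Packet(0.7, "192.0.2.1", "198.51.100.7", 50),
    Packet(4.1, "192.0.2.1", "198.51.100.7", 200),
]

flows = flow_view(packets)
series = timeseries_view(packets, bin_seconds=1.0, duration=5.0)
print(flows)   # one flow carrying 350 bytes in total
print(series)  # [150, 0, 0, 0, 200]
```

In the flow view the three packets collapse into a single 350-byte entry, while the time-series view retains the three-second gap, which is the kind of scarce-traffic period the abstract refers to.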