Hiding in the deep - Online animal activity recognition using motion sensors and machine learning
Due to the COVID-19 crisis measures, the PhD defence of Jacob Kamminga will take place (partly) online in the presence of an invited audience.
The PhD defence can be followed by a live stream.
Jacob Kamminga is a PhD student in the research group Pervasive Systems (PS). His supervisor is prof.dr. P.J.M. Havinga from the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS).
The activity of animals is a rich source of information that not only provides insights into their life and well-being but also their environment. Animal activity recognition (AAR) is a new field of research that supports various goals, including the conservation of endangered species and the well-being of livestock. Over the last decades, the advent of small, lightweight, and low-power electronics has made it possible to attach unobtrusive sensors to animals that can measure a wide range of aspects such as location, temperature, and activity. These aspects are highly informative properties for numerous application domains, including wildlife monitoring, anti-poaching, and livestock management. In this thesis, we focus on AAR that aims to automatically recognize activities from motion data, on the animal, while the activities are performed (online). Specifically, we use motion data recorded through an inertial measurement unit (IMU) that comprises an accelerometer, gyroscope, and magnetometer to classify up to eleven different activities.
Numerous factors challenge online AAR; in this thesis, we address the following:
(i) Sensors are often attached to (wild) animals using collars around the neck. In many cases, it is difficult or impossible to recharge the battery after the collar is attached to the animal. Furthermore, the sensor tags are restricted in weight and size to remain unobtrusive. Therefore, the tags not only have limited energy but are also constrained in computation and memory resources.
(ii) Animal collars are exposed to rough behavior and environments. As a result, the collar can rotate around the neck, causing the sensors to change position and orientation over time. This variability causes significant errors if activity classifiers are sensitive to sensor orientation, and it is challenging to develop AAR that is insensitive to it.
(iii) Animal groups are heterogeneous; sensitivity of an AAR system to individual traits may degrade its performance. A system trained with data from a subset of animals may not generalize well to others. Moreover, there are many species, and independently developing AAR for each species is a daunting task.
(iv) Various machine learning (ML) algorithms are used as AAR classifiers. Each classifier has different properties and is affected by aspects such as feature type, parameter settings, and class imbalance in the dataset. It is challenging to determine the impact of each aspect on classification performance.
(v) Classifiers rely on a clear feature representation of the raw sensor data, so feature extraction is an essential aspect of AAR. It is challenging to find the most resource-efficient and effective feature extraction method.
(vi) Data acquisition and labeling are tedious, especially for AAR, because it can be difficult to observe (wild) animals, which is required for labeling (annotating data with ground truth). As a result, labeled datasets are often small and class-imbalanced; e.g., a labeled dataset may comprise mostly eating activity but little running.
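One simple way to quantify such class imbalance is the ratio between the most and least frequent class in a labeled set. This is an illustrative sketch only; the thesis does not prescribe this particular measure, and the label names are hypothetical:

```python
from collections import Counter

def imbalance_ratio(labels):
    # Ratio of the most to the least frequent class label; 1.0 means balanced.
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical label stream dominated by "eating", as in the example above.
labels = ["eating"] * 90 + ["running"] * 5 + ["walking"] * 5
ratio = imbalance_ratio(labels)  # 90 / 5 = 18.0
```

A ratio this high means a classifier can score well on accuracy while rarely predicting the minority activities, which is why the later chapters evaluate robustness to imbalance explicitly.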
To address the challenges in AAR, we pose the following main research question:
How can we recognize various animal activities using motion data while considering resource-efficiency, orientation-independence, and genericity so that it can be executed online and easily deployed across a wide variety of species?
Tremendous innovations in computation, memory, and battery technology allow us to process the sensor data locally on the animal tag while the activities are performed. Locally processing the data eliminates the need to store or transmit large amounts of raw data and provides opportunities to develop resource-efficient AAR.
We simultaneously address the challenges of resource constraints and dynamic sensor orientation and position through a framework that finds an optimal feature set that is both lightweight and robust to the sensor’s orientation. We extract various summary statistic features from 3D motion data. We select the most informative features using a method that performs feature selection and classification simultaneously, and the feature sets are optimized through 10-fold cross-validation. We investigate the effect of different sizes of the resulting feature sets. Our results show that the accelerometer is the best sensor for AAR and that three features are sufficient.
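To make the idea of lightweight, orientation-robust summary statistic features concrete, the sketch below computes a few such features over one window of 3D accelerometer samples. Features of the acceleration magnitude are insensitive to sensor orientation, because rotating the sensor frame does not change the vector norm. The function name and exact feature set are illustrative, not the thesis's selected set:

```python
import math
import statistics

def window_features(ax, ay, az):
    # Summary statistic features for one window of 3D accelerometer samples.
    # The magnitude |a| is invariant under rotation of the sensor frame, so
    # statistics of it are orientation-independent (sketch; feature names
    # are hypothetical, not the optimized set from the thesis).
    mag = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
    return {
        "mean_mag": statistics.mean(mag),
        "std_mag": statistics.pstdev(mag),
        "min_mag": min(mag),
        "max_mag": max(mag),
    }

# Example: a stationary window (gravity only, ~9.81 m/s^2 along one axis).
feats = window_features([0.0] * 10, [0.0] * 10, [9.81] * 10)
```

Because only a handful of such statistics are needed per window, this style of feature extraction fits within the computation and memory budget of an embedded tag.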
To address the heterogeneity of animals, we propose a framework that employs multitask learning (MTL) for embedded platforms. We investigate the trade-off between classification performance and the genericity of an AAR system. Specifically, we study the effect of MTL on the performance of a single classifier that classifies the activities of two species. We test and compare seven classifiers in three scenarios: (i) trained with individual data, (ii) trained with data from one species and tested on the other, and vice versa, and (iii) trained with mixed data from both species. Our results indicate that it is possible to train classifiers with good genericity performance for similar species.
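The three evaluation scenarios can be sketched as train/test data pairings. This is a hypothetical helper under the assumption that scenario (i) trains and tests within a single species; the data layout (one list of labeled windows per species) and the function name are illustrative:

```python
def genericity_scenarios(goat_windows, sheep_windows):
    # Train/test pairings for the three cross-species evaluation scenarios
    # (sketch; each argument is a list of labeled windows for one species).
    return {
        "individual": [(goat_windows, goat_windows),
                       (sheep_windows, sheep_windows)],
        "cross-species": [(goat_windows, sheep_windows),
                          (sheep_windows, goat_windows)],
        "mixed": [(goat_windows + sheep_windows,
                   goat_windows + sheep_windows)],
    }

scenarios = genericity_scenarios(["g1", "g2"], ["s1"])
```

In practice each pairing would still be split with cross-validation; the dictionary only makes explicit which species' data feeds training versus testing in each scenario.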
We investigate the effect of feature type, hyperparameter tuning, and the data partitioning method (used to create the validation datasets) on AAR performance. First, we compare the quality of time-domain and frequency-domain summary statistic features; while investigating the effect of feature type, we validate the orientation-independent feature set obtained for goats on a different species. Overall, most classifiers yield higher performance using frequency-domain features and similar performance when the orientation-independent feature set is used. Second, we compare two validation partitioning methods: k-fold and leave-one-subject-out cross-validation. When an AAR classifier is used on animals that were not included in the training dataset, the performance is reflected more accurately by leave-one-subject-out cross-validation. Finally, we demonstrate the effect of hyperparameter tuning and show the resulting improvement in classification and genericity performance.
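Leave-one-subject-out cross-validation can be sketched as follows: every split holds out all windows of exactly one animal for testing, so the score reflects performance on individuals unseen during training. The `(subject_id, features, label)` tuple layout is a hypothetical convention for this sketch:

```python
def leave_one_subject_out(samples):
    # `samples`: list of (subject_id, features, label) tuples (hypothetical
    # layout). Each split tests on every window of exactly one subject and
    # trains on all remaining subjects, so no individual appears in both sets.
    subjects = sorted({sid for sid, _, _ in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test

samples = [(1, [0.1], "walk"), (1, [0.2], "eat"),
           (2, [0.3], "walk"), (3, [0.4], "run")]
splits = list(leave_one_subject_out(samples))  # one split per subject
```

By contrast, plain k-fold shuffles windows from all animals into every fold, so windows from the same individual can leak between train and test, which inflates the apparent performance on unseen animals.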
To address the challenge of effective feature extraction while labeled datasets are small and imbalanced, we investigate unsupervised representation learning (URL). We present an analysis framework that compares various deep URL techniques that exploit unlabeled data to improve feature extraction. The most compelling reason to use URL as a feature extraction method is its ability to learn from unlabeled data. We compare three URL techniques with three conventional feature extraction methods. To investigate the effect of the sizes of both the labeled and unlabeled parts of the dataset on feature quality, we train the URL methods and classifiers using various sample sizes. We demonstrate that the performance of URL features can approach, and in some cases outperform, that of conventional features. Our results show that the convolutional deep belief network (CDBN) benefited the most from an increasing amount of unlabeled data, especially on the more diverse and imbalanced dataset. Furthermore, we evaluate the effect of depth in URL and demonstrate that concatenating the first- and second-layer features often results in better classification performance. Finally, we analyze and discuss the effect of class imbalance in the dataset on feature quality. Our results indicate that deep representations are more robust to imbalance in smaller labeled datasets.
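The layer-concatenation step can be sketched as below, assuming each depth of an unsupervised encoder yields one feature list per window; the encoders themselves are omitted and the helper name is hypothetical:

```python
def concat_layer_features(layer1_feats, layer2_feats):
    # Per-window concatenation of the representations produced at two depths
    # of an unsupervised encoder (sketch; the encoders are not shown). The
    # classifier then sees both lower-level and more abstract features.
    return [f1 + f2 for f1, f2 in zip(layer1_feats, layer2_feats)]

# One window, with a 2-dimensional layer-1 and a 1-dimensional layer-2 code.
combined = concat_layer_features([[1.0, 2.0]], [[3.0]])
```

The combined vector is what the downstream classifier is trained on, which is how depth can improve classification without discarding the first layer's representation.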
We collected three datasets comprising 3D motion data recorded on animals. The first dataset comprises labeled motion data from four goats and two sheep and is used to study the genericity performance of AAR classifiers. The second dataset comprises labeled motion data from five goats, collected at six positions in diverse orientations around the neck of the animals; it is used to address sensor-orientation-independent AAR. The third dataset comprises motion data from 18 horses, of which the data from 11 subjects is partly labeled; it is used to study various factors that impact the performance of AAR and unsupervised representation learning.
In this thesis, we show that it is possible to perform AAR with excellent classification performance while remaining resource-efficient and orientation-independent. Although performance versus resource usage is often a trade-off, we show that excellent performance can be obtained using only the accelerometer, a small set of features, and a classifier that is efficient in memory and computation, allowing the AAR to be executed online. When sufficient aspects are similar, e.g., the species, sensor location, and activity types, it is possible to train classifiers such that AAR can be deployed across a variety of similar species. Improving orientation-independent feature extraction by exploiting unlabeled data is a promising direction for further research.