Automated CTC Classification, Enumeration and PhenoTyping

Leonie Zeune is a PhD student in the research group Medical Cell BioPhysics. Her supervisors are prof.dr. L.W.M.M. Terstappen from the Faculty of Science and Technology and prof.dr. S.A. van Gils from the Faculty of Electrical Engineering, Mathematics and Computer Science.

Cancer is one of the leading causes of death worldwide. It starts with the formation of a primary tumor and can spread throughout the body, and this spread ultimately causes most cancer-related deaths. Tumor cells that break away from the tumor and invade the bloodstream are called Circulating Tumor Cells (CTCs). Extravasation of these CTCs can give rise to new tumors at distant sites, a process generally referred to as the formation of metastases. The CTC load is strongly correlated with the survival time of cancer patients, and the number of CTCs can be used to monitor cancer therapy. Yet CTCs are very rare, and accurately detecting, characterizing and counting them in images of fluorescently labeled cells is very challenging and often performed manually. This manual process is prone to errors and user bias. Therefore, this thesis aims to develop image analysis and machine learning models that automatically and accurately detect and classify CTCs in fluorescent images, and to demonstrate the added value of automated image analysis for patients.

CTCs and their fluorescent signals vary strongly in size, shape and signal intensity. We addressed the presence of multiple size and intensity scales in the fluorescent images by developing a multiscale segmentation model. The model is almost parameter-independent and extracts the scale information from a spectral decomposition of the input signal using nonlinear diffusion models. This results in an accurate segmentation of the cells combined with a clustering of the cells based on their size and intensity. We further adapted the model to a purely size-based clustering of the cells, which excels at segmenting even very dim fluorescent signals.

Based on the segmentation of all cells present in the images, we can extract quantifiable features per cell, such as size or mean fluorescent intensity. Providing these extracted features to researchers who score the fluorescent expression of CTCs as positive or negative could greatly reduce user bias and unify the results used for patient treatment. Moreover, we used the extracted features to identify new cell populations that were also overrepresented in cancer patients compared to healthy subjects. Identifying and understanding these cell populations could improve our understanding of the variation among patients in disease progression and treatment response.
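As an illustration of per-cell feature extraction, the following minimal sketch computes the area and mean intensity of each segmented cell. It is not the implementation used in the thesis: the function name `extract_cell_features`, the convention that label 0 marks background, and the small example arrays are all assumptions made for this example.

```python
import numpy as np


def extract_cell_features(labels, image):
    """Return {cell_id: {"area_px", "mean_intensity"}} for every labeled cell.

    labels: integer array, one positive id per segmented cell.
            Label 0 is ASSUMED to mark background (hypothetical convention).
    image:  fluorescence intensities, same shape as labels.
    """
    labels = np.asarray(labels)
    image = np.asarray(image, dtype=float)
    if labels.shape != image.shape:
        raise ValueError("labels and image must have the same shape")

    features = {}
    for cell_id in np.unique(labels):
        if cell_id == 0:  # skip background
            continue
        mask = labels == cell_id
        features[int(cell_id)] = {
            "area_px": int(mask.sum()),                   # size in pixels
            "mean_intensity": float(image[mask].mean()),  # mean signal
        }
    return features


# Hypothetical toy data: two "cells" in a 3 x 4 image.
labels = np.array([[0, 1, 1, 0],
                   [0, 1, 0, 2],
                   [0, 0, 2, 2]])
image = np.array([[5, 100, 120, 4],
                  [6, 110, 5, 40],
                  [3, 4, 50, 60]])

features = extract_cell_features(labels, image)
for cell_id, f in features.items():
    print(cell_id, f["area_px"], f["mean_intensity"])
```

Reporting such numbers alongside each image gives every reviewer the same quantitative basis for a positive or negative call, which is the bias-reducing effect described above.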

We further analyzed the consensus among multiple reviewers who manually scored cells as “CTC” or “no CTC”. The results showed that, although all reviewers were trained according to the same guidelines, there is major disagreement on what constitutes a CTC. This motivates the use of automated methods. We compared the reviewers' scores to those of an expert panel and to the output of a Deep Learning network trained to separate CTCs from other cells. Remarkably, the Deep Learning network and the reviewers showed the highest agreement.
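Agreement between two sets of scores can be quantified in several ways. The text does not state which statistic was used, so the sketch below simply assumes two common choices: raw percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The function names and the example scores are hypothetical and are not data from the thesis.

```python
def percent_agreement(a, b):
    """Fraction of cells on which two raters gave the same score."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    categories = set(a) | set(b)
    # Chance agreement from each rater's marginal label frequencies.
    p_chance = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_chance == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical scores for eight cells (not data from the thesis).
reviewer_a = ["CTC", "CTC", "no CTC", "no CTC", "CTC", "no CTC", "CTC", "no CTC"]
reviewer_b = ["CTC", "no CTC", "no CTC", "no CTC", "CTC", "no CTC", "CTC", "CTC"]

agreement = percent_agreement(reviewer_a, reviewer_b)
kappa = cohens_kappa(reviewer_a, reviewer_b)
print(f"percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

The same two functions could be applied to any pair of score lists, for example a reviewer against an expert panel or against a network's predictions, under the assumption that all lists refer to the same cells in the same order.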

This motivated further investigation of Deep Learning networks for cell classification. We ultimately trained a convolutional neural network to classify cells into CTCs and four other cell classes that can be found in EpCAM-enriched blood samples from cancer patients. We analyzed how network architectures based on autoencoders differ from standard architectures used for cell classification in terms of how they encode and interpret the image input data. The chosen network, in combination with advanced visualization techniques, makes it possible not only to classify cells but also to reveal new cell populations and subclasses of known populations. This paves the way to identifying all cell populations in the fluorescent images and investigating their implications for patient outcome.

The work presented in this thesis resulted in the development of an open-source software toolbox for CTC analysis called ACCEPT (Automated CTC Classification, Enumeration and PhenoTyping). The toolbox facilitates and automates the process of detecting and classifying CTCs and other cell populations in fluorescent images obtained from various microscopic systems and is now actively being used by several research groups in the field of CTC research.