Machine Learning: A new application in patients with vasculitis

Master Assignment Data Science (DS)

Type: Master BMT

Location: RUG

Period: Dec 2017 - July 2018

Student: Classified.

Supervisors:

Abstract:

Giant cell arteritis (GCA) is a vasculitis characterized by arterial wall inflammation, sometimes overlapping with polymyalgia rheumatica (PMR). Diagnosis of GCA and PMR is challenging and can be missed, with the risk of complications. Novel tests are needed for early diagnosis of GCA/PMR, and Machine Learning may be helpful. Aim: to develop a diagnostic Machine Learning algorithm, tested on a prospective UMCG GCA and PMR cohort and validated in a historical cohort of the ZGT as proof of concept. Finally, the diagnostic Machine Learning algorithm will be tested for use in primary care to aid early referral.

Research Proposal:

Giant cell arteritis (GCA) is an immune-mediated vasculitis characterized by inflammation of the large and medium-sized arteries. GCA is closely linked to polymyalgia rheumatica (PMR), and both diseases occur only in people over 50 years of age. The diagnostic work-up and treatment of both GCA and PMR are rapidly changing, creating a need for early recognition and a timely diagnostic work-up that results in a rapid start of treatment. The gains of timely treatment are less sight loss and fewer strokes in GCA, and improved physical and mental quality of life (QoL) in PMR. For both GCA and PMR, the gains of a correct diagnosis are fewer drug side effects (diabetes, osteoporosis, hypertension, hospitalization). These benefits can result in a reduction of health care costs.

GCA can be localized in the temporal artery (cranial GCA, c-GCA) and in other branches of the internal and external carotid arteries, or in the aorta and its main branches, more centrally in the thorax and abdomen (large vessel vasculitis GCA, LVV-GCA) (Figure 1).

The incidence of PMR and GCA ranges between 12 and 70 per 100,000 inhabitants (see also “Project Impact”). Clinically, GCA and PMR belong to one disease spectrum and often coexist in the same patient. Nearly half of the patients with GCA have evidence of PMR, while approximately 30% of patients with PMR have concomitant GCA.

Since both GCA and PMR can present with atypical signs and symptoms such as low-grade fever, malaise, morning stiffness and weight loss, the diagnosis is difficult and easily missed. A missed or delayed diagnosis may lead to loss of eyesight and cerebral stroke. Several diagnostic modalities are essential for the diagnostic work-up of GCA and PMR, for instance vascular biopsy, laboratory tests and imaging (FDG PET/CT, ultrasound; Figure 1). As soon as the diagnosis is confirmed, treatment with glucocorticoids (GC) should be started without delay. Digital diagnostic modelling, such as machine learning, may improve the diagnostic process.

Machine learning (ML) may support the diagnostic work-up because it can discover complex patterns, for instance of disease-related symptoms, in a wide variety of voluminous data sources (Figure 2). In the case of GCA/PMR, it could potentially pick up patterns involving atypical signs and symptoms combined with diagnostic modalities such as laboratory tests. ML works as follows: from a collection of labelled data, such as textual reports, laboratory tests, imaging and other findings, a computer algorithm automatically develops a model. The model can be seen as a black box containing all discovered clinical patterns. It can then be used to predict the probability of a diagnosis from similar but unseen data for a new patient, and so help the clinician. Mathematically, the model can be seen as a very complex function with a huge number of coefficients that maps the available data to a prediction of a certain diagnosis, such as GCA or PMR.
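The workflow described above (labelled data in, a fitted function with coefficients out, a probability for a new patient) can be illustrated with a minimal sketch. It assumes logistic regression as a stand-in for the model; the proposal does not specify which algorithm will be used. The feature names and all numbers below are made up for illustration and are not patient data.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit coefficients (weights and bias) by batch gradient descent on log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of log loss with respect to the logit
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(model, x):
    """Map the features of one new patient to a predicted probability."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical labelled data (assumption, NOT real patient data).
# Columns: [scaled inflammation marker, morning stiffness (0/1), headache (0/1)]
X = [
    [0.9, 1, 1], [0.8, 1, 0], [0.7, 0, 1], [0.85, 1, 1],   # labelled as disease
    [0.2, 0, 0], [0.3, 1, 0], [0.1, 0, 0], [0.25, 0, 1],   # labelled as control
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = train_logistic(X, y)
p_new_high = predict_proba(model, [0.8, 1, 1])   # unseen patient, many findings
p_new_low = predict_proba(model, [0.15, 0, 0])   # unseen patient, few findings
print(f"predicted probabilities: {p_new_high:.2f} and {p_new_low:.2f}")
```

Here the "coefficients" are just three weights and a bias; a realistic model would have far more, but the mapping from data to a predicted probability works the same way.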

To improve outcomes in GCA and PMR, faster and more accurate referral from primary to secondary care would be advantageous. At present this is difficult because of the atypical signs and symptoms in the earlier stages of the disease, combined with its low incidence. Given the ability of ML to deal with such complex data, it may eventually be used to develop a referral tool that supports swift and accurate referral to secondary care.

Technical challenges

ML is a newcomer in the medical arena and is gradually moving from research to clinical application. A unique aspect of ML is that it can combine data of various types from different sources, including information from textual reports and clinical findings (laboratory tests, imaging), which is currently exploited only to a limited extent. ML can find more complex patterns than humans can, using very subtle clues and signals. Once the model is developed, it can be applied in a highly unobtrusive manner, for instance as a mobile or web app or integrated into existing software.

The combination of machine learning / data mining with natural language processing holds high promise as a diagnostic support tool, as shown in a growing number of studies across a wide variety of diseases, for example heart failure and breast cancer. Current research by UT suggests that such complex patterns can indeed be found, even in surprising data sources containing only subtle clues and signals.
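To make the link between natural language processing and ML concrete, the sketch below turns free-text reports into numeric vectors that a learning algorithm could consume. It assumes a simple bag-of-words representation, which is only one of many possible text representations and is not stated in the proposal; the example reports are invented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocabulary(reports):
    """Map every distinct token in the reports to a column index (sorted for determinism)."""
    tokens = sorted({tok for report in reports for tok in tokenize(report)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(report, vocabulary):
    """Turn one report into a term-count vector aligned with the vocabulary.

    Tokens that are not in the vocabulary are ignored.
    """
    counts = Counter(tokenize(report))
    vector = [0] * len(vocabulary)
    for tok, count in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector

# Invented example reports (assumption, NOT real medical records).
reports = [
    "Patient reports headache and jaw pain, headache worse at night",
    "Morning stiffness of shoulders and hips",
]
vocab = build_vocabulary(reports)
vectors = [vectorize(r, vocab) for r in reports]
print(vectors[0][vocab["headache"]])  # how often "headache" occurs in the first report
```

The resulting count vectors can be placed next to laboratory values and other findings, so that text and structured data feed the same model.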

We aim to develop an ML algorithm using the unique UMCG database, which contains a large cohort of GCA/PMR patients. The UMCG database, including the language used in medical reports, will serve as the training set for ML. A control group of patients with fever of unknown origin (FUO) will be included for comparison.

The newly developed ML algorithm will be validated in a historical cohort from the ZGT. Future plans are to develop the ML algorithm further so that it can be applied in primary care to aid timely referral.
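Validation in a separate cohort means that the model is fitted on one data set and then scored, unchanged, on patients it has never seen. The sketch below assumes sensitivity and specificity as the evaluation measures and a decision threshold of 0.5; the proposal names neither, and the labels and probabilities are invented for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(y_true, probabilities, threshold=0.5):
    """Threshold predicted probabilities and return (sensitivity, specificity)."""
    y_pred = [1 if p >= threshold else 0 for p in probabilities]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical external cohort (assumption): confirmed diagnoses and the
# probabilities that an already trained model assigned to each patient.
external_labels = [1, 1, 1, 0, 0, 0, 0, 1]
external_probs = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.8]

sens, spec = sensitivity_specificity(external_labels, external_probs)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The important design point is that nothing is refitted on the validation cohort: only the predictions are compared with the known diagnoses.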

By combining the experience of UT in the development of ML with the clinical data and experience of the ZGT and the UMCG, we may improve the diagnostic process for GCA/PMR.

The technical challenge is to develop Machine Learning in a hospital setting and to adjust it to support the GP in faster and more accurate recognition of GCA and PMR.

Project Impact:

The incidence of PMR has remained relatively stable over time, with an age-adjusted incidence per 100,000 people of 69.8 (95% CI: 61.2, 78.4) among women and 44.8 (95% CI: 37.0, 52.6) among men. In persons older than 50, PMR is more frequent than rheumatoid arthritis (RA). The age-adjusted incidence of GCA per 100,000 people is 24.4 (95% CI: 20.3, 28.6) among women and 10.3 (95% CI: 6.9, 13.6) among men. The estimated incidence rate of GCA in the Netherlands is 18-24 per 100,000 persons per year.

There is an urgent need for improvement in early recognition, diagnostic work-up (preferably in fast-track clinics) and treatment. The gains are less sight loss, fewer strokes and fewer hospitalizations. Early recognition of GCA and PMR is important and highly needed, both by the GP and by doctors in the hospital. The impact of this project lies in earlier recognition of GCA and PMR and a reduction of the diagnostic delay, and thereby in the prevention of sight loss.

The major factor in visual loss seems to be delayed initiation of treatment. Once patients have sight loss, they need IV medication in a day care or hospital setting, which generates more costs than oral treatment at home. Permanent sight loss is not only detrimental for the patient but also raises costs. In the USA alone, the projected cost of visual impairment due to GCA will exceed 76 billion US dollars, while the cost of inpatient care for GCA patients will be about 1 billion US dollars.
