The AD prediction challenge
Techniques to predict onset of AD are badly needed to help focus trials of potential treatments on the right people. To conduct an effective clinical trial it is crucial to:
- Identify individuals who are likely to need and respond to treatment. AD treatments are most likely to be effective at early disease stages, even before any outward signs of dementia.
- Accurately predict the change in disease indicators so that we can assess the effect of the treatment.
Our limited understanding of AD makes prediction of symptom onset hard. However, several approaches are available in the scientific literature; we summarise them here and refer you to Weiner et al., Alz. Dement. (2017) for a comprehensive review. These are just examples, and TADPOLE welcomes and encourages alternative strategies.
Manual prediction by a clinical expert
An informed clinician experienced in interpreting multi-modal data can judge prognosis and predict conversion to a more severe diagnostic category by drawing on their knowledge of the clinical history of patients with a similar presentation, e.g., through visual rating of brain scans (Harper et al., Brain (2016)).
We strongly encourage clinical experts to take part in the challenge this way — can you beat the machines?
Statistical prediction using regression
Regression is a statistical technique to model the relationship between variables and thus to predict one set of variables from another. In TADPOLE, one might regress markers of AD or clinical assessments against time in historical data to make predictions of future measurements or changes in patient status. Examples from the literature include regression of clinical diagnosis against anatomical volumes from MRI (Scahill et al., PNAS (2002)), cognitive test scores (e.g., Yang et al., JAD (2011); Sabuncu et al., Arch. Neurol. (2011)), and rate of cognitive decline (e.g., Doody et al., Alz. Res. Ther. (2010)), as well as retrospective staging of subjects by time to conversion between diagnoses (e.g., Guerrero et al., NeuroImage (2016)). In familial AD, Bateman et al., New Eng. J. Med. (2015) regress several markers of AD against expected time to onset.
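As a minimal sketch of regressing a measurement against time, the Python snippet below fits an ordinary least-squares line to one subject's past visits and extrapolates it to a future time point. The visit times, the scores and the function names (fit_line, predict) are made up for illustration and are not taken from the TADPOLE data or from any of the cited methods, which are generally more sophisticated than a single straight line per subject.

```python
from statistics import mean


def fit_line(times, values):
    """Ordinary least-squares fit of values = intercept + slope * time."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    return intercept, slope


def predict(intercept, slope, time):
    """Evaluate the fitted line at a (possibly future) time."""
    return intercept + slope * time


# Synthetic example (assumption): years since baseline and a hypothetical
# test score that worsens, i.e. increases, over time.
visit_years = [0.0, 0.5, 1.0, 2.0]
scores = [10.0, 11.0, 12.0, 14.0]

intercept, slope = fit_line(visit_years, scores)
forecast = predict(intercept, slope, 3.0)  # extrapolate to year 3
print(f"slope per year: {slope:.2f}, forecast at year 3: {forecast:.2f}")
```

A straight line is the simplest possible choice; in practice one would likely pool information across subjects (for example with mixed-effects models) and report uncertainty alongside the point forecast.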
Supervised machine learning
Supervised machine learning techniques, such as support vector machines, random forests, and artificial neural networks, learn the relationship between the values of a set of predictors and their labels. They can prove very effective in high-dimensional classification and regression problems, such as those that TADPOLE presents. In AD, Klöppel et al., Brain (2008) showed the ability to discriminate AD patients from cognitively normal subjects using support vector machines applied to MR images; later work uses a wider variety of biomarkers (Zhang et al., NeuroImage (2011)). Others aim for a more fine-grained classification, separating MCI subjects who go on to convert to AD within a certain time frame from those who do not; see for example Young et al., NeuroImage: Clinical (2013) and Mattila et al., JAD (2011).
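To illustrate what learning the relationship between predictors and labels means in practice, here is a small Python sketch of a supervised classifier. It uses logistic regression trained by gradient descent as a simple stand-in; it is not a support vector machine, random forest or neural network, and it is not the method of any of the cited papers. The two standardised features and the converter / non-converter labels are synthetic assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Estimated probability that feature vector x has label 1."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic, standardised predictors (assumption):
# [brain-volume z-score, cognitive-impairment z-score]
features = [[-1.2, 1.0], [-0.8, 1.3], [-1.0, 0.7],
            [1.1, -0.9], [0.9, -1.2], [0.7, -0.8]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = converter, 0 = non-converter (made up)

weights, bias = train_logistic(features, labels)
p_new = predict_proba(weights, bias, [-1.0, 1.0])  # unseen hypothetical subject
print(f"estimated conversion probability: {p_new:.2f}")
```

With real data one would hold out subjects for validation, since accuracy on the training set alone says little about predictive performance.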
Data-driven disease progression models
Data-driven disease progression models are a more recent innovation in AD modelling and prediction using unsupervised learning. These techniques do not rely on prior knowledge of disease status, but rather aim to extract a picture of how all biomarkers evolve concurrently during the disease. Examples include models built on a set of scalar biomarkers to produce discrete (Fonteijn et al., NeuroImage (2012); Young et al., Brain (2014)) or continuous (Jedynak et al., NeuroImage (2012); Donohue et al., Alz. Dement. (2014)) pictures of disease progression; richer but less comprehensive models that leverage structure in data such as MR images (Durrleman et al., IJCV (2013); Lorenzi et al., Neurobiol. Aging (2015); Bilgel et al., NeuroImage (2016)); and models of disease mechanisms (Seeley et al., Neuron (2009); Zhou et al., Neuron (2012); Raj et al., Neuron (2012); Iturria-Medina et al., PLoS Comput. Biol. (2016)).