Techniques to predict onset of AD are badly needed to help focus trials of potential treatments on the right people. To conduct an effective clinical trial it is crucial to:

1. Identify individuals who are likely to need and respond to treatment. AD treatments are most likely to be effective at early disease stages, even before any outward signs of dementia.
2. Accurately predict the change in disease indicators so that we can assess the effect of the treatment.

Our limited understanding of AD makes prediction of symptom onset hard. However, several approaches are available in the scientific literature; we summarise them here and refer you to Weiner et al., Alz. Dement. (2017) for a comprehensive review. These are just examples, and TADPOLE welcomes and encourages alternative strategies.

#### Manual prediction by a clinical expert

Clinical dementia expert Professor Nick Fox explains to the former prime minister of the UK, David Cameron, how to spot signs of Alzheimer’s disease in brain images. Source: zimbio.com

An informed clinician experienced in interpreting multi-modal data can judge prognosis and predict conversion to a more severe diagnostic category by drawing on their knowledge of the clinical history of patients with a similar presentation, e.g., through visual rating of brain scans (Harper et al., Brain 2016).

We strongly encourage clinical experts to take part in the challenge this way — can you beat the machines?

#### Statistical prediction using regression

Predicted ADAS-Cog values for different patient subgroups (slow, intermediate or fast progressors, determined by their rate of decline in MMSE score) estimated using statistical regression.
Source: Doody et al., Alz. Res. Ther. (2010).

Regression is a statistical technique that models the relationship between variables, so that one set of variables can be predicted from another. In TADPOLE, one might regress markers of AD or clinical assessments against time in historical data to make predictions of future measurements or changes in patient status. Examples from the literature include regression of clinical diagnosis against anatomical volumes from MRI (Scahill et al., PNAS (2002)), cognitive test scores (e.g., Yang et al., JAD (2011); Sabuncu et al., Arch. Neurol. (2011)), and rate of cognitive decline (e.g., Doody et al., Alz. Res. Ther. (2010)), as well as retrospective staging of subjects by time to conversion between diagnoses (e.g., Guerrero et al., NeuroImage (2016)). In familial AD, Bateman et al., New Eng. J. Med. (2015) regress several markers of AD against expected time to onset.
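
As a minimal sketch of regressing a clinical score against time, the Python snippet below fits a straight line to one subject's past visits by ordinary least squares and extrapolates it to a future time point. The visit times and scores are made-up illustrative numbers, not TADPOLE data, and the helper names `fit_line` and `predict` are hypothetical. Published approaches typically use richer models (for example mixed effects or non-linear trajectories).

```python
from statistics import mean


def fit_line(times, scores):
    """Ordinary least squares fit of score = intercept + slope * time."""
    t_bar, s_bar = mean(times), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, scores))
    slope = sxy / sxx
    intercept = s_bar - slope * t_bar
    return intercept, slope


def predict(intercept, slope, time):
    """Extrapolate the fitted line to a new time point."""
    return intercept + slope * time


# Hypothetical history for one subject: years since baseline and an
# ADAS-Cog-like score at each visit (illustrative values only).
visit_years = [0.0, 0.5, 1.0, 2.0]
scores = [18.0, 19.0, 20.0, 22.0]

intercept, slope = fit_line(visit_years, scores)
forecast = predict(intercept, slope, 3.0)  # predicted score at year 3
print(f"slope per year: {slope:.2f}, forecast at year 3: {forecast:.2f}")
```

A straight line is the simplest possible choice here; it ignores measurement noise structure and differences between subjects, which is why the literature cited above goes further.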

#### Machine learning

An example of a supervised machine learning technique combining multi-modal patient information to make a prediction. Source: Young et al., Neuroimage Clin (2013).

Supervised machine learning techniques, such as support vector machines, random forests, and artificial neural networks, learn the relationship between the values of a set of predictors and their labels. They can prove very effective in high-dimensional classification and regression problems, such as those that TADPOLE presents. In AD, Klöppel et al., Brain (2008) showed that support vector machines can discriminate AD patients from cognitively normal subjects using MR images; later work uses a wider variety of biomarkers (Zhang et al., NeuroImage (2011)). Others aim for a more fine-grained classification, separating MCI subjects who go on to convert to AD within a certain time frame from those who do not; see for example Young et al., NeuroImage: Clinical (2013) and Mattila et al., JAD (2011).
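
To illustrate the idea of learning a mapping from predictors to labels, here is a small Python sketch of a nearest-centroid classifier. This is a deliberately simple stand-in, not one of the methods named above (support vector machines, random forests, neural networks) and not the approach of any cited paper. The two features and all values are hypothetical and assumed to be already scaled to comparable ranges.

```python
from math import dist


def fit_centroids(features, labels):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(centroids, x):
    """Assign x to the label whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Hypothetical training set of MCI subjects. Each row holds two made-up,
# pre-scaled features: [hippocampal volume, cognitive score].
train_x = [[0.9, 0.8], [1.0, 0.9], [0.4, 0.3], [0.5, 0.2]]
train_y = ["stable", "stable", "converter", "converter"]

centroids = fit_centroids(train_x, train_y)
new_subject = [0.5, 0.4]
print(classify(centroids, new_subject))
```

In practice one would evaluate any such classifier on held-out subjects, since performance on the training set says little about how well it predicts future conversion.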

#### Data-driven disease progression models

Predicted severity of different biomarkers with age estimated using a data-driven disease progression model. Source: Li et al., arXiv:1703.10266 (2017).

Data-driven disease progression models are a more recent innovation in AD modelling and prediction using unsupervised learning. These techniques do not rely on prior knowledge of disease status, but rather aim to extract a picture of how all biomarkers evolve concurrently during the disease. Examples include models built on a set of scalar biomarkers to produce discrete (Fonteijn et al., NeuroImage (2012); Young et al., Brain (2014)) or continuous (Jedynak et al., NeuroImage (2012); Donohue et al., Alz. Dem. (2014)) pictures of disease progression; richer but less comprehensive models that leverage structure in data such as MR images (Durrleman et al., IJCV (2013); Lorenzi et al., Neurobiol. Aging (2015); Bilgel et al., NeuroImage (2016)); and models of disease mechanisms (Seeley et al., Neuron (2009); Zhou et al., Neuron (2012); Raj et al., Neuron (2012); Iturria-Medina et al., PLoS Comput. Biol. (2016)).
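
The flavour of a discrete progression picture can be conveyed with a toy Python sketch. It orders biomarkers by how often they are abnormal across a cohort, without using any diagnostic labels, and then assigns each subject a stage along that ordering. This is a crude frequency heuristic written for illustration only; it is not the event-based model of Fonteijn et al. or any other cited method, which estimate the ordering probabilistically. The biomarker names, the pre-binarised abnormal/normal values, and the helper names are all assumptions.

```python
def order_events(subjects):
    """Order biomarkers from most to least frequently abnormal.

    Heuristic assumption: a biomarker that is abnormal in more subjects
    tends to become abnormal earlier in the disease.
    """
    names = list(subjects[0])
    return sorted(names, key=lambda n: sum(s[n] for s in subjects), reverse=True)


def stage(subject, sequence):
    """Count how many leading events in the sequence are abnormal."""
    reached = 0
    for name in sequence:
        if not subject[name]:
            break
        reached += 1
    return reached


# Hypothetical cohort: True means the biomarker is abnormal for that
# subject (the thresholds used to binarise are assumed, not shown).
cohort = [
    {"amyloid": False, "hippocampus": False, "cognition": False},
    {"amyloid": True, "hippocampus": False, "cognition": False},
    {"amyloid": True, "hippocampus": True, "cognition": False},
    {"amyloid": True, "hippocampus": True, "cognition": True},
    {"amyloid": True, "hippocampus": False, "cognition": False},
]

sequence = order_events(cohort)
stages = [stage(subject, sequence) for subject in cohort]
print(sequence, stages)
```

A subject's stage can then serve as a simple data-driven summary of disease severity, though the hard thresholds and lack of uncertainty estimates make this far weaker than the probabilistic models in the literature.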
