The leaderboard is a system through which participants can submit preliminary results and compare the accuracy of their predictions with other teams, using existing ADNI data. More precisely, we generate a leaderboard table (bottom of the page), which shows the performance scores of every team submission. The table is updated live. Participants need to make predictions using the datasets created from existing ADNI data:
- LB1 – longitudinal data from ADNI1 (the equivalent of D1). As with D1, this is typically the dataset on which the model is trained.
- LB2 – longitudinal data from ADNI1 rollovers into ADNIGO and ADNI2 (equivalent of D2). This dataset is normally used after training, as input for the forecasts.
The forecasts from LB2 will be evaluated against post-ADNI1 follow-up data from the individuals in LB2:
- LB4 – leaderboard test set (equivalent of D4). As this is the prediction dataset, it should NOT be used at all by the model providing the forecast.
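To make the roles of the three datasets concrete, here is a toy sketch of the intended workflow, using entirely synthetic numbers (not real ADNI data) and a deliberately naive forecasting rule: fit on LB1, forecast for the LB2 subjects, and score against the held-out LB4 values.

```python
# Toy illustration of the LB1/LB2/LB4 roles (synthetic data, not real ADNI
# values): train on LB1, forecast for LB2 subjects, score against LB4.

lb1 = {"s1": [(70, 10.0), (71, 11.0)],   # subject -> [(age, ADAS13)] history
       "s2": [(65, 8.0), (66, 8.5)]}
lb2 = {"s1": 72, "s2": 67}               # ages at which forecasts are requested
lb4 = {"s1": 12.0, "s2": 9.0}            # held-out ground truth: NEVER train on this

def linear_forecast(history, age):
    """Extrapolate the slope between the last two observed visits."""
    (a0, y0), (a1, y1) = history[-2], history[-1]
    slope = (y1 - y0) / (a1 - a0)
    return y1 + slope * (age - a1)

# Forecasts use only LB1 histories; LB4 appears only in the scoring step.
forecasts = {s: linear_forecast(lb1[s], a) for s, a in lb2.items()}
mae = sum(abs(forecasts[s] - lb4[s]) for s in lb4) / len(lb4)
```

The same separation applies to a full submission: D1 plays the role of LB1, D2 of LB2, and D4 of LB4.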
We provide scripts for generating the LB1, LB2 and LB4 datasets (this requires the TADPOLE datasets spreadsheet, downloadable from ADNI). See the "evaluation" folder of the official TADPOLE github repository; the README and Makefile explain the scripts. For transparency, the repository also contains the scripts that compute the performance metrics and build the table itself. The Makefile shows how the scripts should be run, and can also run the full pipeline via 'make leaderboard' (for a leaderboard submission) or 'make eval' (for a full, non-leaderboard TADPOLE submission). Before running the Makefile, don't forget to update the MATLAB path at the top of the Makefile. If you need further assistance with generating the datasets, do not hesitate to contact us on the Google Group.
The leaderboard is currently LIVE and is updated every hour. We encourage TADPOLE participants to try the leaderboard first, before making a proper submission.
- Participants are NOT allowed to cheat by fitting their models on the prediction dataset (LB4) or by using any information from the LB4 entries. There are no prizes associated with the leaderboard; it is simply a space for participants to compare their models' performance while the competition is running.
- Participants can make as many leaderboard submissions as they like; each submission should include an index at the end of its file name.
- Participants should name their submission as 'TADPOLE_Submission_Leaderboard_<YourTeamName><SubmissionIndex>.csv'. This will enable our script to pick up the submission file automatically. Example names: TADPOLE_Submission_Leaderboard_PowerRangers1.csv or TADPOLE_Submission_Leaderboard_OxfordSuperAccurateLinearModelWithAgeRegression5.csv. No underscores should be included in team names.
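The naming convention above can be checked programmatically before uploading. The regular expression and helper below are our own illustrative sketch, not part of the official tooling, which may apply stricter checks.

```python
import re

# Expected pattern: TADPOLE_Submission_Leaderboard_<TeamName><SubmissionIndex>.csv
# where the team name contains no underscores and the index is numeric.
# Note: a team name that itself ends in digits is inherently ambiguous here;
# this sketch treats the trailing run of digits as the submission index.
NAME_RE = re.compile(r"^TADPOLE_Submission_Leaderboard_([A-Za-z][A-Za-z0-9]*?)(\d+)\.csv$")

def parse_submission_name(filename):
    """Return (team_name, submission_index), or None if the name is invalid."""
    m = NAME_RE.match(filename)
    return (m.group(1), int(m.group(2))) if m else None
```

For example, `parse_submission_name("TADPOLE_Submission_Leaderboard_PowerRangers1.csv")` yields `("PowerRangers", 1)`, while a name containing an underscore in the team part is rejected.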
- Participants do not necessarily have to make a leaderboard submission in order to see their results. The script that computes the performance metrics for the leaderboard dataset (evalOneSubmission.py) is provided in the github repository. Users can run the script on their local machine, make changes to their model, and submit a leaderboard entry when ready.
The following scripts need to be run in this order:
- makeLeaderboardDataset.py - creates the leaderboard datasets LB1 (training), LB2 (subjects for which forecasts are required) and LB4 (biomarker values for LB2 subjects at later visits). Also creates the leaderboard submission skeleton, TADPOLE_Submission_Leaderboard_TeamName.csv
- TADPOLE_SimpleForecastExampleLeaderboard.m - generates forecasts for every subject in LB2 using a simple method. Modify this file (or replace it) to implement your own forecasting method.
- evalOneSubmission.py - evaluates the previously generated user forecasts against LB4
If everything runs without errors and step 3 prints out the performance measures successfully, your leaderboard submission spreadsheet is ready to be uploaded via the TADPOLE website. You must be registered on the website, and logged in, in order to upload via the Submit page.
See the Makefile (leaderboard section) for the exact commands required to run these scripts. If you need further help on how to run the Python/MATLAB scripts, see this thread on the Google Group.
This is the leaderboard table, which is updated live (every hour). Some test entries might be included along the way, which will be tagged with 'UCLTest'. These entries are there for us to test further modifications to the leaderboard system and should be disregarded.
- MAUC - Multiclass Area Under the Curve
- BCA - Balanced Classification Accuracy
- MAE - Mean Absolute Error
- WES - Weighted Error Score
- CPA - Coverage Probability Accuracy for 50% Confidence Interval
- ADAS - Alzheimer's Disease Assessment Scale Cognitive (13)
- VENTS - Ventricle Volume
- RANK - computed based on MAUC.
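For intuition, the regression metrics above can be sketched as follows. This is our own minimal restatement of the definitions, and it assumes the WES weights are the inverse widths of the submitted 50% confidence intervals; evalOneSubmission.py in the repository remains the authoritative implementation.

```python
# Illustrative sketch of the regression metrics (MAE, WES, CPA).
# Assumption: WES weights each error by 1 / (upper - lower), the inverse
# width of the submitted 50% confidence interval.

def mae(estimates, truths):
    """Mean Absolute Error over all (estimate, truth) pairs."""
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)

def wes(estimates, truths, lowers, uppers):
    """Weighted Error Score: errors weighted by inverse CI width,
    so confident (narrow-interval) forecasts count for more."""
    weights = [1.0 / (u - l) for l, u in zip(lowers, uppers)]
    num = sum(w * abs(e - t) for w, e, t in zip(weights, estimates, truths))
    return num / sum(weights)

def cpa(truths, lowers, uppers):
    """Coverage Probability Accuracy: |actual coverage - 0.5| for the
    submitted 50% confidence intervals (0 is best)."""
    covered = sum(1 for t, l, u in zip(truths, lowers, uppers) if l <= t <= u)
    return abs(covered / len(truths) - 0.5)
```

Note that when all intervals have the same width, WES reduces to MAE.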
Table last updated on 2017-09-23 08:03 (UTC+0)
| RANK | TEAM NAME | MAUC | BCA | ADAS MAE | VENTS MAE | ADAS WES | VENTS WES | ADAS CPA | VENTS CPA | DATE |
|------|-----------|------|-----|----------|-----------|----------|-----------|----------|-----------|------|
| 1 | TeamAlgosForGood1 | 0.809 | 0.856 | 4.087 | 4.52e-03 | 4.087 | 3.81e-03 | 0.091 | 0.006 | 2017-09-18 09:34 (UTC+0) |
| 2 | FPC1 | 0.758 | 0.722 | 5.000 | 4.19e-03 | 4.976 | 4.19e-03 | 0.350 | 0.381 | 2017-09-18 09:34 (UTC+0) |
| 3 | FPC3 | 0.706 | 0.721 | 6.369 | 2.56e-03 | 6.736 | 2.56e-03 | 0.250 | 0.267 | 2017-09-12 22:51 (UTC+0) |
| 4 | FPC2 | 0.706 | 0.721 | 6.369 | 2.56e-03 | 6.711 | 2.56e-03 | 0.392 | 0.324 | 2017-09-18 09:34 (UTC+0) |