Nested Cross-Validation: Robust Machine Learning Methods for Alzheimer’s Detection
At the Alzheimer’s Association International Conference in 2022, we presented our work on designing and validating machine learning models for Alzheimer’s detection using EEG. The work focused on one of the hardest problems in building classifiers, overfitting, and presented our solution and recommendation to the industry: a nested cross-validation approach that is much stricter than standard approaches and yields more trustworthy estimates of model performance.
The Impact of Overfitting
When building machine learning models, one of the most important challenges to overcome is “overfitting”: the tendency for models to accurately predict the dataset that they are trained on, but fail when applied to new data in the real world. Overfitting can be caused by problems in how data is structured and how the model building process is constructed. Models that are built on relatively small datasets, such as the datasets that are commonly available in clinical contexts, are also highly susceptible to overfitting and demand stronger protections.
Overfit models look great on paper, but they crash and burn when deployed to real clinical settings, leading to bad outcomes, missed opportunities, and loss of trust. In our review of the literature on machine learning models for dementia detection, we’ve found that many authors describe architectures that are highly susceptible to overfitting problems, which may lead to drastic over-estimation of model performance in the clinic.
SPARK’s High Standards
Quirk et al. compared different cross-validation approaches and examined the extent of overfitting. In particular, they hypothesized that an approach called nested cross-validation would perform better than standard cross-validation by enforcing a strict separation of the data that is used to train the model from the data that is used to validate the model. They found that nested cross-validation methods provide the strongest protection against these pitfalls.
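To make the idea of strict separation concrete, here is a minimal sketch of nested cross-validation in plain Python. It is an illustration only, not the pipeline from the poster: the function names (k_folds, knn_predict, nested_cv), the toy nearest-neighbor classifier, and the synthetic one-dimensional data are all assumptions made for this example. The inner loop chooses a setting (the number of neighbors) using only the development data, and each outer test fold is used once, purely to score the model selected without it.

```python
import random
from statistics import mean


def k_folds(indices, k):
    """Split a list of indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, k):
    """Toy 1-D k-nearest-neighbor classifier with binary labels (0 or 1)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(x, y, train_idx, test_idx, k):
    """Fit on train_idx only, then score on test_idx only."""
    tx = [x[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    hits = sum(knn_predict(tx, ty, x[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def nested_cv(x, y, candidate_ks, outer_k=5, inner_k=3, seed=0):
    """Return (mean outer accuracy, list of per-fold outer accuracies)."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    outer_folds = k_folds(idx, outer_k)
    outer_scores = []
    for o, test_idx in enumerate(outer_folds):
        # Development data: everything except the held-out outer fold.
        dev_idx = [i for j, f in enumerate(outer_folds) if j != o for i in f]
        inner_folds = k_folds(dev_idx, inner_k)

        def inner_score(k):
            # Model selection never touches test_idx.
            scores = []
            for v, val_idx in enumerate(inner_folds):
                tr = [i for j, f in enumerate(inner_folds) if j != v for i in f]
                scores.append(accuracy(x, y, tr, val_idx, k))
            return mean(scores)

        best_k = max(candidate_ks, key=inner_score)
        # The outer fold is used exactly once, to score the chosen setting.
        outer_scores.append(accuracy(x, y, dev_idx, test_idx, best_k))
    return mean(outer_scores), outer_scores


# Synthetic two-class data (hypothetical; stands in for real features).
rng = random.Random(42)
labels = [i % 2 for i in range(60)]
features = [rng.gauss(2.0 * label, 1.0) for label in labels]
estimate, per_fold = nested_cv(features, labels, candidate_ks=[1, 3, 5])
print(f"nested CV accuracy estimate: {estimate:.2f}")
```

In a standard, non-nested setup, the same folds would be used both to pick the setting and to report accuracy, so the reported number would tend to be optimistic. Here the reported estimate comes only from data that played no part in choosing the setting.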
This work underscores how critical model design is to building reliable and accurate tools for dementia detection. At SPARK, we follow this and similar approaches as we design machine learning models. Careful model validation is critical when patient health is at stake, and we encourage the rest of the industry to follow our lead in adopting this more rigorous and conservative approach.
Check out the poster below.