Published Research

Canadian Medical Association Journal – Problems in the deployment of machine-learned models in health care

KEY POINTS
Decision-support systems or clinical prediction tools based on machine learning (including the special case of deep learning) are similar to clinical support tools developed using classical statistical models and, as such, have similar limitations.

If a machine-learned model is trained using data that do not match the data it will encounter when deployed, its performance may be lower than expected.

When training, machine learning algorithms take the “path of least resistance,” leading them to learn features from the data that are spuriously correlated with target outputs instead of the correct features; this can impair the effective generalization of the resulting learned model.

Avoiding errors related to these problems involves careful evaluation of machine-learned models using new data from the performance distribution (the data the model will encounter once deployed), including data samples that are expected to “trick” the model, such as those with different population demographics, difficult conditions or poor-quality inputs.
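
The points above can be illustrated with a small synthetic sketch. It is not taken from the article: the data, the site names, the 80%, 95% and 50% agreement rates, and the helper functions (`make_data`, `fit_single_feature`, `accuracy_by_group`) are all assumptions chosen for illustration. A deliberately simple "model" picks whichever single feature best predicts the label in training. Because a spurious "shortcut" feature (for example, a scanner marker) happens to track the label closely at the training site, the model selects it over the genuinely informative signal, and its accuracy falls toward chance at a new site where the shortcut no longer tracks the label.

```python
import random

def make_data(n, shortcut_agreement, site, rng):
    """Build n synthetic records (all values are made up for illustration).

    "signal" agrees with the label about 80% of the time at every site.
    "shortcut" agrees with the label with probability shortcut_agreement,
    which differs between sites.
    """
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = label if rng.random() < 0.8 else 1 - label
        shortcut = label if rng.random() < shortcut_agreement else 1 - label
        rows.append({"signal": signal, "shortcut": shortcut,
                     "label": label, "site": site})
    return rows

def accuracy(rows, feature):
    """Fraction of rows where the feature value equals the label."""
    return sum(r[feature] == r["label"] for r in rows) / len(rows)

def fit_single_feature(rows):
    """Toy 'training': keep the one feature that scores best on the training data."""
    return max(("signal", "shortcut"), key=lambda f: accuracy(rows, f))

def accuracy_by_group(rows, feature, key):
    """Accuracy reported separately for each value of key (e.g. each site)."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r)
    return {g: accuracy(members, feature) for g, members in groups.items()}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# At the training site the shortcut agrees with the label 95% of the time.
train = make_data(2000, 0.95, "training_site", rng)

# New evaluation data: same site, plus a new site where the shortcut is uninformative.
new_data = (make_data(2000, 0.95, "training_site", rng)
            + make_data(2000, 0.50, "new_site", rng))

chosen = fit_single_feature(train)
per_site = accuracy_by_group(new_data, chosen, "site")

print("feature chosen in training:", chosen)
print("training accuracy:", round(accuracy(train, chosen), 3))
for site, acc in per_site.items():
    print(f"accuracy at {site}:", round(acc, 3))
print("accuracy of true signal on new data:",
      round(accuracy(new_data, "signal"), 3))
```

Reporting accuracy per site, rather than as a single pooled number, is what exposes the failure here: the pooled figure averages a strong result at the training site with a near-chance result at the new one.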