Some claim that machine learning technology has the potential to transform healthcare systems, but a study published by The BMJ finds that machine learning models have similar performance to traditional statistical models and share similar uncertainty in making risk predictions for individual patients.
The NHS has invested £250m ($323m; €275m) to embed machine learning in healthcare, but researchers say the level of consistency (stability) within and between models should be assessed before they are used to make treatment decisions for individual patients.
Risk prediction models are widely used in clinical practice. They combine statistical techniques with information about people, such as their age and ethnicity, to identify those at high risk of developing an illness and to inform decisions about their care.
Previous research has found that a traditional risk prediction model such as QRISK3 performs very well at the population level, but carries considerable uncertainty in its risk predictions for individual patients.
Some studies claim that machine learning models can outperform traditional models, while others argue that they cannot provide explainable reasons behind their predictions, potentially leading to inappropriate actions.
What’s more, machine learning models often ignore censoring, which occurs when patients are lost to follow-up during a study (either by error or by being unreachable). A model that ignores censoring assumes these patients remain disease free, which leads to biased predictions.
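To make the censoring problem concrete, the following is a minimal sketch in Python. It is not the researchers' method: the ten-patient cohort is invented for illustration, the function names are hypothetical, and the Kaplan-Meier estimator is used here simply as one standard way of accounting for censored follow-up. It compares a naive risk estimate, which counts censored patients as disease free, with an estimate that removes them from the at-risk group once they are lost.

```python
# Toy cohort (invented for illustration, not study data).
# Each entry is (years of follow-up, True if a cardiovascular event was observed).
# Patients with False and fewer than 10 years were lost to follow-up (censored).
cohort = [
    (2, False), (3, True), (4, False), (5, False), (6, True),
    (7, False), (10, False), (10, False), (10, False), (10, False),
]


def naive_risk(follow_up, horizon):
    """Share of patients with an observed event by `horizon`.

    Censored patients are treated as disease free, which is the
    assumption that ignoring censoring amounts to.
    """
    events = sum(1 for t, had_event in follow_up if had_event and t <= horizon)
    return events / len(follow_up)


def kaplan_meier_risk(follow_up, horizon):
    """One minus the Kaplan-Meier survival estimate at `horizon`.

    At each event time, only patients still under observation
    count towards the at-risk denominator.
    """
    survival = 1.0
    event_times = sorted({t for t, had_event in follow_up if had_event and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u, _ in follow_up if u >= t)
        events = sum(1 for u, had_event in follow_up if had_event and u == t)
        survival *= 1 - events / at_risk
    return 1 - survival


naive = naive_risk(cohort, horizon=10)
adjusted = kaplan_meier_risk(cohort, horizon=10)
print(f"naive 10-year risk:          {naive:.3f}")
print(f"Kaplan-Meier 10-year risk:   {adjusted:.3f}")
```

In this toy cohort the naive estimate is lower than the censoring-aware one, because patients who left the study early are counted as if they had been followed for the full ten years without an event.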
To explore these issues further, researchers in the UK, China and the Netherlands set out to assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions.
They assessed 19 different prediction techniques (12 machine learning models and seven statistical models) using data from 3.6 million patients registered at 391 general practices in England between 1998 and 2018.
Data from general practices, hospital admission and mortality