To answer medical questions that apply to a wide patient population, machine learning models rely on large, diverse datasets from a variety of institutions. However, health systems and hospitals are often reluctant to share patient data because of legal, privacy, and cultural concerns.
An emerging technique called federated learning offers a solution to this dilemma, according to a study published Tuesday in the journal Scientific Reports, led by senior author Spyridon Bakas, Ph.D., an instructor of Radiology and Pathology & Laboratory Medicine in the Perelman School of Medicine at the University of Pennsylvania.
Federated learning—an approach first implemented by Google for keyboards’ autocorrect functionality—trains an algorithm across multiple decentralized devices or servers holding local data samples, without exchanging those samples. While the approach could potentially be used to answer many different medical questions, Penn Medicine researchers have shown that federated learning succeeds specifically in the context of brain imaging: it can analyze magnetic resonance imaging (MRI) scans of brain tumor patients and distinguish healthy brain tissue from cancerous regions.
A model trained at Penn Medicine, for example, can be distributed to hospitals around the world. Doctors can then train on top of this shared model by inputting their own patients’ brain scans. Each locally updated model is then transferred to a centralized server, where the models are eventually reconciled into a consensus model that has gained knowledge from each of the hospitals, and is therefore more clinically useful.
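The workflow described above — local training at each site, followed by server-side reconciliation — can be sketched with a simple federated averaging loop. This is an illustrative toy, not the method from the Penn study: the "hospitals," the logistic-regression local trainer, and the sample-weighted averaging are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One hospital trains the shared model on its private data
    (plain logistic-regression gradient steps); the raw scans/labels
    never leave the site -- only the updated weights do."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)      # cross-entropy gradient
        w -= lr * grad
    return w

def federated_average(local_weights, sample_counts):
    """Central server reconciles the site models into a consensus model,
    weighting each site by the number of samples it trained on."""
    total = sum(sample_counts)
    return sum(n / total * w for w, n in zip(local_weights, sample_counts))

# Toy demo: three "hospitals" hold private slices of the same task.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
sites = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    y = (X @ true_w > 0).astype(float)
    sites.append((X, y))

global_w = np.zeros(2)
for _ in range(10):  # communication rounds
    local = [local_update(global_w, X, y) for X, y in sites]
    global_w = federated_average(local, [len(y) for _, y in sites])
```

After a few communication rounds the consensus model reflects what every site has seen, even though no site ever shared its data — which is the property the article describes.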
“The more data the computational model sees, the better it learns the problem, and the better it can address the question that it was designed to answer,” Bakas said. “Traditionally, machine learning has used data from a single institution, and then it became apparent that those models do not perform or generalize well on