Just a few decades ago, scientists didn’t think much about diversity when studying new medications. Most clinical trials enrolled mainly white men living near urban research institutes, on the assumption that any findings would apply equally to the rest of the country. Later research proved that assumption false: examples accumulated of medications that turned out to be less effective, or to cause more side effects, in populations that were underrepresented in the initial studies.
To address these inequities, federal requirements for participation in medical research were broadened in the 1990s, and clinical trials now attempt to enroll diverse populations from the outset of the study.
But we are now at risk of repeating these same mistakes as we develop new technologies, such as AI. Researchers from Stanford University examined clinical applications of machine learning and found that most algorithms are trained on datasets from patients in only three geographic areas, and that the majority of states have no patients represented at all.
“AI algorithms should mirror the community,” says Amit Kaushal, an attending physician at VA Palo Alto Hospital and Stanford adjunct professor of bioengineering. “If we’re building AI-based tools for patients across the United States, as a field, we can’t have the data to train these tools all coming from the same handful of places.”
Kaushal, along with Russ Altman, a Stanford professor of bioengineering, genetics, medicine, and biomedical data science, and Curt Langlotz, a professor of radiology and biomedical informatics research, examined five years of peer-reviewed articles in which a deep-learning algorithm was trained for a diagnostic task intended to assist with patient care. Among U.S. studies where geographic origin could be characterized, they found the majority (71%) used patient data from California, Massachusetts, or New York to train the algorithms. Some 60%