Press Release

JHU Researchers Begin to Tackle Bias in AI Used in Disease Diagnosis and Health Care

Tue, 02/18/2020 - 13:43

Every year, millions of patients receive medical diagnoses, care and treatment plans based on algorithms. But a recent study finding a lack of ethnic diversity in the source data for one widely used treatment-decision algorithm has researchers suggesting that similar tools used by hospitals, health care systems and government agencies could be doing more harm than good.

Researchers at the Johns Hopkins Applied Physics Laboratory (APL) in Laurel, Maryland, have been investigating a type of artificial intelligence technique that may help address some of the bias in machine-learning tools, while improving training for human clinicians.

Artificial intelligence, encompassing areas such as machine learning, natural language processing and robotics, offers wide-ranging applications to medical research and health care delivery. Systems using AI can not only analyze large amounts of image and symptom data but also potentially “learn” from that data, and arrive at decisions much faster than humans.

But there is growing concern that algorithms may reproduce disparities in training data due to cultural, social, economic, racial and gender factors, says APL’s Philippe Burlina, an artificial intelligence researcher.

What Biased Data Looks Like

APL researchers have seen this bias firsthand in collaborative work with other JHU researchers. Burlina has been working with Dr. John Aucott, of the Johns Hopkins Medicine Division of Rheumatology, on AI tools to diagnose skin lesions. “We have been using clinical as well as public domain data,” he said. “We realized that nearly all data available in the public domain includes mostly examples from white individuals, not Asians or African Americans.”

In another collaboration, with Dr. Neil Bressler at the Johns Hopkins Wilmer Eye Institute, Burlina tested AI-based retinal diagnosis tools. They found the tools offered different levels of performance — and outcomes — when examining the retinas of white and black patients. They also found that algorithms developed using data from mostly white individuals had lower performance when tested on other ethnic groups such as Chinese, Malay or Indian — and vice versa.

Most current AI algorithms are data-driven, Bressler explained, so their classification performance depends on the data available from large cohorts of individuals in which a “gold standard” diagnosis was determined for each individual.

“In the simplest cases, the root cause of the bias problem may be a lack of balance in the training datasets,” Burlina said. “That is, the dataset might have been from retina photographs predominantly of white individuals. Often, the problem is more complex than just data imbalance and there can be other underlying factors causing bias. For example, the quality and diversity of the data may vary by ethnic or gender groups. Sometimes finding the root cause of biased AI can be a challenging problem itself.”
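As a rough illustration of the imbalance and performance gaps described above, the following minimal sketch reports, for each group, its share of a dataset and a classifier's accuracy on it. It is not the researchers' method. The function name, the record layout and the toy data (groups "A" and "B") are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def audit_by_group(records):
    """Summarize representation and accuracy per group.

    `records` is an iterable of (group, true_label, predicted_label)
    tuples. This layout is an assumption made for the sketch.
    """
    counts = defaultdict(int)   # number of examples per group
    correct = defaultdict(int)  # number of correct predictions per group
    for group, truth, pred in records:
        counts[group] += 1
        if truth == pred:
            correct[group] += 1

    total = sum(counts.values())
    return {
        group: {
            "share": counts[group] / total,             # fraction of the dataset
            "accuracy": correct[group] / counts[group]  # accuracy within the group
        }
        for group in counts
    }


# Hypothetical toy data: group "A" is over-represented relative to group "B".
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1),
    ("A", 0, 0), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0),
]

report = audit_by_group(records)
for group, stats in sorted(report.items()):
    print(f"{group}: share={stats['share']:.2f}, accuracy={stats['accuracy']:.2f}")
```

A large gap in share or accuracy between groups would only flag a possible problem. As Burlina notes, the root cause may lie in factors other than imbalance.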

But now that AI diagnostic tools are starting to be deployed in clinics, the potential bias in AI should be addressed, they said.

Overcoming Bias

There are several ways to approach the challenge, Burlina said. In some cases, it is possible to acquire more data or, in the case of prospective data collection, to design experiments so that certain so-called protected groups (such as those identified by ethnicity, gender or age) are appropriately represented. But in cases using retrospective data, acquiring more data to correct the problem may not be an option.

This is where AI tools may help. Burlina and his colleagues have been working on AI techniques known as generative models that attempt to fill in “missing” data or remove bias from existing data. Machines can then learn from this newly processed data to try to predict equitable outcomes.

“We have shown that in some specific instances this approach may help address bias issues in AI,” Burlina said. “There remains, however, a lot of work to be done to further develop these ideas and work out other methods that are based on different algorithms to robustly address bias in AI.”

“APL is in a unique position to address these problems since it has a breadth of experience in applying AI to many domains,” said Ashley Llorens, the chief of APL’s Intelligent Systems Center.

APL’s second annual National Health Symposium will explore real-world AI applications for health care, harnessing AI technologies to accelerate advances while doing no harm, and ensuring the safety and security of health care while realizing AI’s full potential. Learn more and register at