Is my AI Discriminatory?


A discussion about bias in healthcare AI, and building models with fairness and ethics in mind. [5 min read]

Meggie Debnath (DSAA, Unity Health Toronto)
https://chartdatascience.ca
2022-06-21

Artificial intelligence (AI), and machine learning (ML) in particular, is now part of decision making across industries, the public sector, and everywhere in between. From identifying fraudulent bank transactions to listing the shows and movies we’re most likely to enjoy, AI is deeply embedded within our everyday lives. Oftentimes the outputs of these decisions are relatively harmless. Increasingly, however, machine learning models are trained on complex and sensitive data and used in decision-making processes such as diagnosing diseases or making hiring decisions. While these models have the ability to transform and improve lives, sometimes the decisions made or informed by AI can have far-reaching consequences.

As a part of our team’s bi-weekly journal clubs, we talked about sources of bias in AI models, the potential consequences and harms they can create, and what we can do about them as data scientists within the healthcare space.

Bias is everywhere

Bias is a part of human nature, arising from the limited view of the world that any single person or group can achieve. Whether implicitly or explicitly, this bias gets captured within our institutions and, by extension, the data that we record. It can then be reflected and amplified by artificial intelligence models trained on this data. Generally, the bias encoded within AI tools results in the greatest harm to already disadvantaged groups, such as racial minorities.

Figure 1: Different types of bias that can exist when training machine learning models ([source](https://www.sciencedirect.com/science/article/pii/S2666389921002026))

There are a few different ways bias can affect the predictions or decisions made by an algorithm, from how the training data are collected and labelled to how the model itself is designed and deployed (Norori et al. 2021); Figure 1 illustrates several of them.

By the same token, the harms that result from biased AI can manifest in different ways, from unfairly withholding resources and opportunities to reinforcing harmful stereotypes about particular groups.

In this way, AI can be a flawed reflection of our society and its systemic biases, and can become a “gatekeeper” for jobs, medical treatments, and opportunities.

Healthcare data, like any data, is flawed

Within the context of healthcare services, it is especially important to consider the types of bias within our data, as AI-supported tools can influence critical decisions such as which patients receive additional care or what medication dosages are prescribed. As in many other industries, healthcare and medical data can be biased, incorrect, missing, and incomplete.

Even without the presence of AI tools, healthcare data holds implicit bias. For example, when visiting the emergency department for abdominal pain, men wait an average of 49 minutes before receiving an analgesic, whereas women wait an average of 65 minutes (Chen et al. 2008). The COVID-19 pandemic has also highlighted many existing racial inequities in healthcare, with morbidity and mortality rates higher for Black Americans, Native Americans, Pacific Islanders, and Hispanic/Latino patients than for White Americans (Gawthrop 2022).

When machine learning models are trained using data that already contains historical and societal inequities, these patterns are learned by the model, and the biases can be amplified when making predictions for new patients. Models deployed with underlying biases can disadvantage the groups who were underrepresented or misrepresented within the training data. For example, algorithms trained to identify disease within chest radiograph images were found to have higher underdiagnosis rates for female patients, patients under 20 years old, Black patients, and Hispanic patients. In other words, the risk of being falsely predicted as “healthy” was higher for these groups of people, meaning their clinical treatment could have been delayed or missed entirely (Seyyed-Kalantari et al. 2021).
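
To make this kind of subgroup evaluation concrete, here is a minimal sketch of how underdiagnosis rates could be compared across groups. The dataframe, its columns (`group`, `y_true`, `y_pred`), and the values are hypothetical stand-ins, not the study’s data.

```python
import pandas as pd

# Hypothetical evaluation set: y_true = 1 if disease is present,
# y_pred = 1 if the model flags disease, group = a demographic attribute.
df = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 1, 0, 1, 1, 1, 0],
    "y_pred": [1, 0, 0, 1, 0, 0, 0],
})

# Underdiagnosis rate per group: among truly sick patients, the fraction
# the model falsely labels "healthy" (i.e., the false negative rate).
sick = df[df["y_true"] == 1]
underdiagnosis = (sick["y_pred"] == 0).groupby(sick["group"]).mean()
print(underdiagnosis)  # group A: 0.50, group B: 0.67
```

A large gap between groups on a metric like this is exactly the kind of disparity the chest radiograph study surfaced.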

Building fairer models

We know that our models can contain harmful biases. But what can we do as data scientists in the healthcare space to ensure our models benefit the most people and don’t cause harm? This is a daunting question, and one that led to many more questions for our team.

Building fairer models is an iterative process, and one that requires more than one solution. Although not all are possible to implement everywhere, especially all at once, below are a few things our team is learning about and working on:

  1. Understanding sources and limitations of data. This involves thinking about where the data comes from and whether any of the variables could be biased. For example, data containing a single variable “gender” with limited response options may actually capture a mixture of sex and perceived gender rather than a patient’s true gender identity.
  2. Building models with an interdisciplinary and diverse team. When developing any kind of AI tool for clinical deployment, we heavily collaborate with the clinician teams that are involved. In addition, our project teams consist of people with varied backgrounds, experiences, cultures, and training.
  3. Evaluating model performance across sub-groups and applying techniques for improving explainability. There are many tools and resources for evaluating model fairness and understanding how a model performs for subgroups, such as InterpretML and modelStudio.
  4. Creating and following standards for data, processes, models, and reporting. Standardization of these elements of a data science project ensures that there are clear guidelines and expectations, consistency among and across projects, and benchmarks to evaluate quality.
  5. Monitoring data, model usage, and performance over time. Monitoring how our models perform after deployment is important to catch data drift or changes in the environment that could degrade performance (a minimal sketch of one such check follows this list).
  6. Learning, discussing, and sharing. We believe it’s important to keep learning, discussing through things like journal clubs, and where possible, sharing our processes, code, research, and learnings.
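
As an illustration of point 5, below is a minimal sketch of one way post-deployment drift could be flagged, using the population stability index (PSI) on a single feature. The feature, the simulated data, and the 0.2 threshold are placeholder assumptions for illustration, not our production pipeline.

```python
import numpy as np

def psi(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """Population stability index between a baseline (training-time)
    sample and a recent (post-deployment) sample of one feature."""
    # Bin edges come from the baseline distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    o_frac = np.histogram(observed, bins=edges)[0] / len(observed)
    # Floor the fractions to avoid division by zero and log(0).
    e_frac = np.clip(e_frac, 1e-6, None)
    o_frac = np.clip(o_frac, 1e-6, None)
    return float(np.sum((o_frac - e_frac) * np.log(o_frac / e_frac)))

# Hypothetical example: patient ages at training time vs. a recent month.
rng = np.random.default_rng(0)
baseline = rng.normal(60, 15, size=5000)  # training-time distribution
recent = rng.normal(66, 15, size=500)     # shifted post-deployment sample
print(f"PSI = {psi(baseline, recent):.3f}")  # a common rule of thumb flags PSI > 0.2 as drift
```

In practice a check like this could run on every model input, and on the model’s predictions themselves, raising an alert whenever the score crosses the chosen threshold.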

AI has countless potential benefits, especially within healthcare: improving patient care, increasing hospital efficiency, and supporting decision-making. Working to build fairer models will help improve trust in clinically deployed AI tools and ensure that all groups of people can benefit from the decisions made and supported by AI.

Takeaways

Additional resources

Below is the full list of topics and readings that we dove into for our journal club series on bias, fairness, and ethics in healthcare AI.

| Topic | Reading Materials |
|-------|-------------------|
| Introduction to Bias, Fairness, and Ethics in AI | Medicine’s Machine Learning Problem |
| Indigenous Data, Representing Race in AI, and Structural Racism in Healthcare | Structural racism in precision medicine: leaving no one behind |
| | Racial Disparities and Mistrust in End-of-Life Care |
| | Dissecting racial bias in an algorithm used to manage the health of populations |
| | Racism and Health: Evidence and Needed Research |
| | The disturbing return of scientific racism |
| Machine Learning Best Practices & Regulations | FDA In Brief: FDA Collaborates with Health Canada and UK’s MHRA to Foster Good Machine Learning Practice |
| | Algorithmic Impact Assessment tool |
| | Suicide hotline shares data with for-profit spinoff, raising ethical questions |
| | Their Bionic Eyes are now Obsolete and Unsupported |
| Failure modes and Equity Concerns in Medical Imaging Models | Reading Race: AI Recognizes Patient’s Racial Identity in Medical Images |
| | Under-diagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations |
| | Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans |
| Bias and Assessing Model Fairness & Transparency | Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings |
| | How to make sure your model is fair, accountable, and transparent |
| | AI FactSheets 360 |
| Representing Sex & Gender in AI and Healthcare Data | Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations |
| | Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare |
References

Chen, Esther H., Frances S. Shofer, Anthony J. Dean, Judd E. Hollander, William G. Baxt, Jennifer L. Robey, Keara L. Sease, and Angela M. Mills. 2008. “Gender Disparity in Analgesic Treatment of Emergency Department Patients with Acute Abdominal Pain.” Academic Emergency Medicine 15 (5): 414–18. https://doi.org/10.1111/j.1553-2712.2008.00100.x.

Gawthrop, Elisabeth. 2022. “Color of Coronavirus: COVID-19 Deaths Analyzed by Race and Ethnicity.” APM Research Lab. https://www.apmresearchlab.org/covid/deaths-by-race.

Norori, Natalia, Qiyang Hu, Florence Marcelle Aellen, Francesca Dalia Faraci, and Athina Tzovara. 2021. “Addressing Bias in Big Data and AI for Health Care: A Call for Open Science.” Patterns 2 (10): 100347. https://doi.org/10.1016/j.patter.2021.100347.

Seyyed-Kalantari, Laleh, Haoran Zhang, Matthew B. A. McDermott, Irene Y. Chen, and Marzyeh Ghassemi. 2021. “Underdiagnosis Bias of Artificial Intelligence Algorithms Applied to Chest Radiographs in Under-Served Patient Populations.” Nature Medicine 27 (12): 2176–82. https://doi.org/10.1038/s41591-021-01595-0.