Health in Fragile Contexts Challenge
AEquity: A Framework for Mitigating Dataset Bias
What is the name of your solution?
AEquity: A Framework for Mitigating Dataset Bias
Provide a one-line summary of your solution.
AEquity is a deep learning-based metric that measures potential biases by investigating the underlying structure of population-specific subsets of the data. This allows for appropriate augmentation to the existing dataset to better represent that population.
Film your elevator pitch.
What specific problem are you solving?
Algorithmic bias is the inability of an algorithm to generalize to a given group, causing selective underperformance. In medicine, biased models can perpetuate existing disparities in vulnerable populations such as women, people of color, older adults, and people on Medicare and Medicaid. Algorithmic bias in medicine has been demonstrated in nephrology, pulmonology, radiology, cardiology, and dermatology. Despite best practices such as model calibration and training on representative datasets, algorithms still exhibit detrimental biases: algorithms developed using data from systems with longstanding discrimination and inequities tend to recapitulate those biases.
As an example, applying standard computer vision models to chest radiographs results in selective under-diagnosis of patients who are Black, Latino, or on Medicaid. As a result, algorithms have been used to deny care and insurance claims to Black and disadvantaged patients. Under-diagnosis worsens patient quality of life and perpetuates the very inequalities that healthcare seeks to remediate. Under-diagnosis bias in diagnostic chest X-ray deep-learning algorithms can occur by race, gender, socioeconomic status, and age. Because an average of 236 chest X-rays are taken per 1,000 individuals, with 40% representing patients from under-represented subgroups, this inequity can affect over 24 million patients annually.
A second example is cost-predictive algorithms, which are applied to over 200 million individuals in the United States; under-represented individuals make up 10-20% of that patient population. Cost-predictive algorithms predict that Black patients are less likely to use care than their white counterparts with similar comorbidities, so Black patients are subsequently allocated fewer healthcare resources, receive worse care, and have worse outcomes.
Underdiagnosis and under-allocation of resources are two key examples of how bias occurs in the healthcare system. These two biases are particularly dangerous because under-diagnosis can lead to under-allocation and vice versa. As artificial intelligence becomes increasingly part of our healthcare system, mitigating biases before deploying models is absolutely crucial to prevent healthcare inequities from being propagated at scale.
Two approaches to measuring and mitigating bias in healthcare dominate the literature – model-centric methods and grassroots interventions. Model-centric methods such as calibration and prospective validation can remediate biased models and deter their deployment. However, these typically involve recalibration, which implies a tradeoff between sensitivity and specificity, leading to marginal improvements within a group at the expense of overall performance. Moreover, many types of algorithmic bias are resistant to simple calibration techniques because of implicit biases within the structure of the underlying data. Grassroots interventions such as anti-racist policies and patient-provider education combat systemic biases, but these changes require years of effort that far exceed the timescale of clinical model development, with numerous models already active.
What is your solution?
We developed a deep-learning based approach – AEquity – that measures potential biases by investigating the underlying structure of population-specific subsets of the data. This allows for appropriate augmentation to the existing dataset to better represent that population. Following a data-centric AI paradigm, AEquity exhibits dataset-specific but model-agnostic properties and can be used subsequently to guide data-driven, actionable feedback such as selecting informative outcomes, collecting more diverse data or prioritizing collection of data from a specific subgroup.
The three types of bias we identify in healthcare data are sampling, complexity, and label bias. Sampling bias arises from non-random sampling of patients, which results in a lower or higher probability of sampling one group over another; when sampling bias is the only type of dataset bias present, AEquity recommends increasing dataset diversity. Complexity bias arises when different groups exhibit different distributions across a given label: one group presents more heterogeneously with a diagnosis, so its class label is more complex to learn than another group's. When complexity bias is the only type present, AEquity recommends prioritizing that population for future data collection. Third, label bias occurs when labels are assigned incorrectly at different rates for different groups; here AEquity recommends switching to a more informative outcome.
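The mapping from diagnosed bias type to recommended intervention described above can be restated as a small sketch. This is purely illustrative: the function and dictionary names below are hypothetical and are not the actual AEquity API.

```python
# Illustrative only: restates the three bias types and their recommended
# data-centric interventions from the text. Names are hypothetical, not
# the AEquity library's API.

INTERVENTIONS = {
    "sampling": "increase dataset diversity across groups",
    "complexity": "prioritize the affected subgroup in future data collection",
    "label": "switch to a more informative outcome label",
}

def recommend(detected_bias_types):
    """Return the intervention suggested for each detected bias type."""
    return [INTERVENTIONS[b] for b in detected_bias_types]

print(recommend(["sampling", "label"]))
```

In practice, AEquity produces the diagnosis itself from the structure of subgroup-specific data; this sketch only captures the decision logic that follows.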
We have packaged AEquity as a Python library that can be easily downloaded and used locally by data scientists, minimizing the barrier to adoption. AEquity runs on a single 4-CPU machine in 7 minutes for a 150-column tabular dataset and on a single GPU in 25 minutes for a standard chest X-ray imaging dataset. AEq can detect potential biases at relatively small sample sizes, with a sample efficiency of 200- to 300-fold, which allows it to prospectively guide data-centric interventions. Collecting the appropriate data decreases under-diagnosis bias by 15% and improves precision by 25% for a chest X-ray model in intersectional populations such as Black patients on Medicaid. Selecting an informative label on the electronic health record dataset can improve test area-under-the-curve for Black patients by 10%.
We have provided both a short demo (AEquity - Demo - Abbreviated) and a more thorough demonstration (AEquity - Demo - Extended) to allow easy use.
Who does your solution serve, and in what ways will the solution impact their lives?
When you walk into the East Harlem Health Outreach Partnership, a student-run and physician-supervised free clinic that provides primary care to uninsured adults in East Harlem, you immediately notice that the patients have a complex range of needs, from endocrine disorders to maternal wellness. Residents of East Harlem have higher rates of obesity (28%), diabetes (17%), and hypertension (34%), and higher rates of poverty (23%), unemployment (11%), and rent burden (48%), compared to residents of the Upper East Side, who live less than a mile away. Residents of East Harlem also face higher rates of gun violence and incarceration compared to the rest of New York City. In spite of these detrimental social determinants of health, the predominantly Latinx and Black neighborhood is, perhaps, the liveliest neighborhood in the city. Some of the city's best restaurants, such as Ricardo's Steak House and Tres Leches Cafe, can be found in East Harlem. The East Harlem Night Market hosts vendors from around the neighborhood with creative home-designed hoodies, furniture, kitchenware, and spices. In the middle of winter, the East Harlem Night Market boasts a live DJ with a packed dance floor featuring talent of all ages.
When it comes to improving healthcare in underserved areas like East Harlem and many of the other communities that Mount Sinai serves, the successful development, validation, and training of clinical algorithms is becoming increasingly essential. For example, algorithms that predict deterioration in chronic kidney disease are crucial to patients in East Harlem because Black and Latinx patients are disproportionately affected by mutations in the APOL1 gene (14-16% in individuals of African descent, and 5% in individuals of Afro-Caribbean descent). However, risk stratification in chronic kidney disease, such as via estimated glomerular filtration rate, can inappropriately place Black patients at lower risk than their white counterparts with the exact same lab values. This delays diagnosis, resulting in higher rates of dialysis and long wait times for kidney transplant. There are countless more examples in cardiology with EKG predictions, rheumatology with systemic lupus erythematosus, and radiology with simple chest X-rays, where inequities in the healthcare system are being perpetuated by models trained on data representing those very same inequities.
AEquity is a model-agnostic, dataset-specific tool that can help detect, characterize and mitigate these biases before deployment into clinical practice for given subgroups by age, race, gender, and socioeconomic status. With widespread deployment, AEquity can mitigate key biases like under-diagnosis bias and under-resource allocation bias. By mitigating under-diagnosis bias, AEquity can reduce time to treat, and by mitigating under-allocation bias, AEquity can enable equitable healthcare resource allocation. Ultimately, the goal of AEquity is to build trust in the healthcare system by providing basic checks and balances.
How are you and your team well-positioned to deliver this solution?
Personally, as the team lead and a resident of East Harlem, I am committed to understanding the ways in which I can personally assist the East Harlem community. I was on the executive board of the East Harlem Health Outreach Partnership (EHHOP) – a medical-student-driven initiative to provide healthcare to individuals who are uninsured. As the technology chair for EHHOP, I helped expedite workflows for clinicians and patients, building out tools to help providers better serve their patients. I have also spent time understanding the breadth of healthcare needs in the East Harlem community, learning about incarceration, gun violence, and maternal health through courses led by community leaders. Finally, I have volunteered beyond the clinic with the New York Common Pantry, providing nutritional necessities to East Harlem residents.
As a team geared towards combating inequity, partnering with underrepresented populations is an essential part of our mission. AEquity partners with leaders in the two complementary approaches to inequity – community health partners and equity researchers. First, we partner with the Diversity and Innovation Hub at Mount Sinai (dihub.co). The mission of the dihub is to initiate, accelerate, and launch solutions to address social determinants of health that perpetuate disparities in health care. As an incubator that empowers underrepresented populations to target social determinants of health, dihub provides insights into making the tools we create accessible, under the guidance of community health workers and leaders. Second, AEquity partners with equity researchers at academic institutions like the University of Michigan and Harvard Medical School, as well as private corporations like IBM, to actively evaluate models for potential biases.
AEquity tackles the intersection of healthcare, artificial intelligence, and equity. Consequently, our team, which consists of software engineers, physician-scientists, and activists, is uniquely positioned to tackle this problem. Our team is based at Mount Sinai Hospital, a growing hub for artificial intelligence, healthcare innovation, and diversity, equity, and inclusion initiatives. Positioned in New York City, Mount Sinai serves a diverse patient population that encompasses all the boroughs and surrounding areas. Approximately 25% of patients are on Medicaid, 21% are on Medicare, and many more are uninsured or underinsured. Our patient population is racially and ethnically diverse: 27% Hispanic, 22.9% Black or African American, 4.1% Asian, 13.1% White, 22.1% other, and 4.1% unknown. Access to a diverse patient population means the institution can quickly scale impact to a breadth of different populations and generalize these results to a global scale.
Mount Sinai has a long and well-developed track record for improving patient outcomes via scalable AI/ML deployment due to a culture of interdisciplinary research with academics and industry collaborators. With the assistance of clinician data scientists, computational support, and full-stack developers, AEquity has the capacity to incorporate into data science pipelines. Building tools to help diverse patient populations is a core part of Sinai's technology-focus, activist-driven identity.
Which dimension of the Challenge does your solution most closely address?
Improve accessibility and quality of health services for underserved groups in fragile contexts around the world (such as refugees and other displaced people, women and children, older adults, LGBTQ+ individuals, etc.)
In what city, town, or region is your solution team headquartered?
New York City
In what country is your solution team headquartered?
What is your solution’s stage of development?
Pilot: An organization testing a product, service, or business model with a small number of users
How many people does your solution currently serve?
Currently, our solution serves a subset of data scientists and researchers at Mount Sinai who use the platform to check the biases in their data before developing the algorithm. This approximately impacts 50 researchers and a cohort of 60,000 patients.
Why are you applying to Solve?
The next steps for AEquity are end-to-end integration into a clinical data science pipeline at Mount Sinai Health System, and wide-scale deployment of AEquity on publicly available datasets as a universal quality metric. Building an API, both internally and externally, would drive model development and dataset collection. An internal API would allow integration of AEquity into the hospital data-science pipeline and would serve as a guide for future integration into other hospitals. An external API with a UI that reports AEquity metrics for publicly available electronic health record data, such as the UK Biobank, MIMIC, NIH, All of Us, and the Million Veteran Program, would affect future model development globally, as these datasets are used across institutions with hundreds of already-known applications and models. We hope that Solve will be able to provide us with the resources and technical expertise for full-stack support in building this API for public use.
Second, after building the API, we would like to build connections with key organizations. Key organizational partners can be split into three main categories – regulatory agencies, industry partners, and community health centers. The key regulatory agency is the Centers for Medicare and Medicaid Services because AEquity directly serves to mitigate biases against patients on Medicare and Medicaid. To enforce algorithmic fairness, CMS can require AEquity deployment and reporting to qualify third-party vendors for reimbursements. As a result, healthcare companies like EPIC, 3M Health Information Systems, Optum, SCIO Health Analytics, and Truven Health, under scrutiny from regulatory institutions and academic researchers, are financially incentivized to demonstrate that the data underlying the model is reliable. Third, at community health centers like KP Colorado and OCHIN, the low-resource setting encourages rapid integration of third-party algorithms. Because CHCs serve under-represented patient populations, biases are more likely to be exacerbated there. Widespread integration can incentivize these health systems to be more careful in adopting quality algorithms. In this case, the Solve network would provide the perfect sounding board to amplify the potential impact of AEquity by enabling connections to industry and government leaders.
In which of the following areas do you most need partners or support?
Who is the Team Lead for your solution?
Faris Gulamali
What makes your solution innovative?
We created a deep learning-based metric that can be used for the characterization and mitigation of social and predictive biases in healthcare datasets. We call this metric AEquity because it utilizes an autoencoder to provide actionable, data-driven feedback towards equitable performance of models. AEquity runs on a single 4-CPU machine in 7 minutes for a 150-column tabular dataset and on a single GPU in 25 minutes for a standard chest X-ray imaging dataset. AEq can detect potential biases at relatively small sample sizes, with a sample efficiency of 200- to 300-fold, which allows it to prospectively guide data-centric interventions. Collecting the appropriate data decreases under-diagnosis bias by 15% and improves precision by 25% for a chest X-ray model in intersectional populations such as Black patients on Medicaid. Selecting an informative label on the electronic health record dataset can improve test area-under-the-curve for Black patients by 10%.
Without the use of AI, evaluating and mitigating the biases underlying datasets requires post-hoc technical solutions or a qualitative, grassroots approach. Post-hoc technical solutions can help mitigate bias to a certain extent. However, these typically involve recalibration which, except in narrow circumstances, implies a tradeoff between sensitivity and specificity in underserved populations, leading to worse performance. Moreover, the process of model retraining can be costly and inefficient. Finally, if a model fails to perform following recalibration, either the inequities are ignored and the model is deployed regardless, which can perpetuate biases, or the model is discarded with no follow-up on why. The grassroots approach of anti-racist policies and patient-provider education is necessary and complementary to our AI-based approach. However, these interventions operate on the timescale of years, in contrast to our method, which works in hours or days depending on dataset complexity and computational resources.
Broad adoption of AEquity for the detection and mitigation of dataset bias will inevitably cause large amounts of dataset shift in healthcare data. AEquity can impact future data collection and clinician practices, which may change the underlying dataset characteristics. Because AEquity prospectively changes how data is collected as well as the selection of informative outcomes, the properties of the dataset may change, as may the group-specific embeddings. As a result, models developed on datasets subject to AEquity may initially show significant reductions in prospective performance across all populations without the necessary interventions. However, dataset shift towards equitable data will ultimately improve the fairness of future models by mitigating disparities in the data, as more models are appropriately trained with AEquity-guided interventions, which have been shown to improve performance in under-represented populations.
What are your impact goals for the next year and the next five years, and how will you achieve them?
Goal 1 (0-12 months): Build an API that automatically provides fairness metrics with data requests from the Mount Sinai Data Warehouse.
We have already built AEquity into an easy-to-use Python package. Building an internal API therefore requires sourcing talent that can modify the existing code to integrate it into a backend pipeline. Second, we would need to purchase adequate computational resources for real-time deployment of the bias-detection pipeline.
Goal 2 (12-24 months): Conduct beta testing and prospective trial of the effect of fairness metrics on model development and healthcare delivery.
First, this requires gathering approvals from various parts of the institution to enable beta testing of the pipeline within Mount Sinai. Second, it requires establishing quantitative and qualitative metrics to evaluate the effectiveness of AEquity on behavioral modification and dataset collection.
Goal 3 (24-30 months): Scale the pipeline to publicly available healthcare data via full-stack deployment of an external API.
Key public electronic health record datasets include MIMIC, the UK Biobank, All of Us, and the Million Veteran Program. Scaling to publicly available data requires a full-stack solution with an outward-facing UI and API. Because different researchers use these datasets for a range of different reasons, deployment of AEquity is intended to shift the paradigm in how models are trained, which subsets are used for which purposes, and the implications that can be drawn from training any given model.
Goal 4 (30-36 months): Deploy AEquity at partner institutions such as University of Michigan, IBM, and Harvard Medical School.
Deployment within a single hospital system requires a thorough understanding of the hospital’s technical infrastructure. However, poor interoperability may pose a technical challenge to expansion due to variations in infrastructure. Training researchers, developers and clinicians within other institutions may require a thorough analysis of how other organizations technically operate.
Goal 5 (37-48 months): Integrate AEquity as a pre-requisite for reimbursement for the Center for Medicare and Medicaid Services.
Bias in healthcare algorithms is increasingly becoming a focus for regulators. First, the US Department of Health and Human Services has proposed a rule under Section 1557 of the Affordable Care Act to ensure that "a covered entity must not discriminate against any individual on the basis of race, color, national origin, sex, age, or disability through the use of clinical algorithms in decision-making". Second, the Good Machine Learning Practice guidance from the FDA, Health Canada, and the MHRA emphasizes the importance of ensuring that datasets are representative of the intended patient population. However, this alone is not sufficient because significant bias can arise even when standard machine learning methods are applied to diverse datasets. Serving as a "nutrition label" for the dataset to accompany model cards, AEquity prompts a movement towards healthy datasets by acknowledging the biases in care, promoting behavioral changes, and enabling equity for all.
Which of the UN Sustainable Development Goals does your solution address?
How are you measuring your progress toward your impact goals?
Good Health and Well-Being:
We will measure our progress towards the good health and well-being goal via both quantitative and qualitative metrics. Quantitatively, we can highlight changes in health and well-being via rapid artificial intelligence/machine learning clinical trials, which can silently but prospectively validate algorithms in patients at Mount Sinai. The model will be evaluated via fairness metrics such as false negative rate, false positive rate, precision, and false discovery rate, and we can quantify the difference AEquity has made across different subpopulations.
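The subgroup fairness metrics named above follow standard confusion-matrix definitions. A minimal sketch (standard formulas, not the AEquity evaluation code) using only NumPy:

```python
# A minimal sketch of the subgroup fairness metrics named in the text,
# using standard confusion-matrix definitions. Not the AEquity codebase.
import numpy as np

def fairness_metrics(y_true, y_pred):
    """Compute per-group fairness metrics from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # true positives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    tn = np.sum(~y_true & ~y_pred)  # true negatives
    return {
        "fnr": fn / (fn + tp),        # false negative rate (under-diagnosis)
        "fpr": fp / (fp + tn),        # false positive rate
        "precision": tp / (tp + fp),  # positive predictive value
        "fdr": fp / (fp + tp),        # false discovery rate
    }

# Disparities are then the gaps between subgroups, e.g.:
# gap_fnr = fairness_metrics(yt_a, yp_a)["fnr"] - fairness_metrics(yt_b, yp_b)["fnr"]
```

Comparing these metrics across subgroups (rather than reporting a single aggregate) is what makes the evaluation a fairness audit rather than an overall-performance check.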
Reducing Inequity:
Second, we will highlight the impact of AEquity on reducing inequity by generating an interview-based report to evaluate trends in clinical decision-making following deployment. We will focus on both the provider-facing and patient-facing aspects of AEquity, with questions like: 1) Does wide-scale deployment of AEquity make models more trustworthy for the provider? 2) Does AEquity influence clinician pre-test and post-test probabilities for diagnostic models when a provider has a patient from an under-represented population?
What is your theory of change?
Mission Statement: Improved trust in the healthcare system.
Activity: Applying the AEquity framework to a Chest X-ray Dataset.
Output: AEquity based metrics for each subgroup (gender, race, socioeconomic status, age).
Short-Term Outcomes: Modifications in the way that the data is collected - either increasing sampling diversity or population prioritization.
Medium-Term Outcomes: Improvement in the generalizability for an algorithm on an under-represented population.
Long-Term Outcomes: Reduction in under-diagnosis bias leads to earlier time to treatment and improvement in patient outcomes.
Mission Statement: Equitable resource allocation between different racial identities.
Activity: Applying the AEquity framework to a healthcare utilization algorithm.
Output: AEquity based metrics for each subgroup (gender, race, socioeconomic status, age).
Short-Term Outcomes: Modifications in the way that the data is labeled - changing output metric from average total cost to comorbidity score.
Medium-Term Outcomes: Improvement in the generalizability for an algorithm on an under-represented population.
Long-Term Outcomes: Improved resource allocation for a given under-represented population.
Describe the core technology that powers your solution.
Diagnostic and prognostic algorithms using artificial intelligence may recapitulate and perpetuate systemic biases against underserved populations. Post-hoc technical solutions are insufficient to overcome biased data, but data-oriented approaches applied during algorithm development may be useful. We created a sample-efficient deep-learning based metric (AEquity or AEq) that measures the learnability of subsets of data representing underserved populations. We apply a systematic analysis of AEq values across subpopulations to identify and mitigate manifestations of racial bias in two known cases of algorithmic bias in healthcare: diagnosis on chest radiographs with deep convolutional neural networks, and healthcare utilization prediction with multivariable logistic regression. We show that using AEq values to guide data collection and to select informative outcomes for these algorithms improves generalization performance as measured by a range of fairness metrics. Using AEq can help advance equity by diagnosing and remediating bias in healthcare datasets, so that algorithms developed from these datasets are more equitable.
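The "learnability" idea can be illustrated with a deliberately simplified sketch. This is NOT the published AEquity implementation: we stand in for the autoencoder with a rank-k linear one (PCA-style reconstruction) and track held-out reconstruction error as the training sample grows. A subgroup whose error decays more slowly at matched sample sizes is harder to learn from the same data budget, which is the kind of signal AEq summarizes.

```python
# Conceptual sketch of learnability via a linear autoencoder (rank-k PCA
# reconstruction). Assumed simplification, not the AEquity implementation.
import numpy as np

def reconstruction_error(train, test, k=2):
    """Held-out mean squared reconstruction error of a rank-k linear
    autoencoder fit on `train`."""
    mu = train.mean(axis=0)
    # Right singular vectors of the centered data give the optimal
    # rank-k linear encoder/decoder (Eckart-Young).
    _, _, vt = np.linalg.svd(train - mu, full_matrices=False)
    basis = vt[:k]                          # (k, n_features)
    proj = (test - mu) @ basis.T @ basis    # encode, then decode
    return float(np.mean((test - mu - proj) ** 2))

rng = np.random.default_rng(0)
held_out = rng.normal(size=(200, 10))
# A learning curve: error at increasing training-set sizes. Comparing
# such curves across subgroups approximates the AEq-style analysis.
errors = [reconstruction_error(rng.normal(size=(n, 10)), held_out)
          for n in (20, 80, 320)]
```

The published method uses a trained deep autoencoder and a calibrated summary statistic rather than raw PCA error, but the comparison-across-subgroups logic is the same.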
Webpage containing demonstrations, pdf and github repositories: NIH NCATS Bias Detection Challenge Submissions (nadkarni-lab.github.io)
Publication supporting proof of concept: Autoencoders for sample size estimation for fully connected neural network classifiers | npj Digital Medicine (nature.com)
YouTube tutorial of method in practice: AEquity - Demo - YouTube
Which of the following categories best describes your solution?
A new application of an existing technology
Please select the technologies currently used in your solution:
If your solution has a website or an app, provide the links here:
https://nadkarni-lab.github.io/ncats-submission-2023/
In which countries do you currently operate?
In which countries will you be operating within the next year?
What type of organization is your solution team?
Nonprofit
How many people work on your solution team?
Full-time staff: 2; Part-time staff: 2; Contributors: 6
How long have you been working on your solution?
3-5 years.
What is your approach to incorporating diversity, equity, and inclusivity into your work?
Our team consists of researchers, industry partners, and collaborators dedicated to mitigating dataset bias in under-represented populations. Key members include Dr. Carol R. Horowitz, Dean for Gender Equity at the Icahn School of Medicine at Mount Sinai; Dr. Girish N. Nadkarni, System Chief of Data-Driven Medicine at Mount Sinai Hospital; Dr. Emmanuel Mensah, Managing Director of the Center for Integration Sciences in Global Health Equity at Brigham and Women's Hospital; Jianying Hu, Director of Healthcare and Life Sciences Research at IBM; and Karandeep Singh, Assistant Professor at the University of Michigan. These individuals and I have a long track record of tackling inequities in healthcare across nephrology, urology, pulmonology, critical care, and many more fields.
As a tool geared towards combating inequity, partnering with underrepresented populations is an essential part of our mission. AEquity partners with leaders in the two complementary approaches to inequity – community health partners and equity researchers. First, we partner with the Diversity and Innovation Hub at Mount Sinai (dihub.co). The goal of the dihub is to initiate, accelerate, and launch solutions to address social determinants of health that perpetuate disparities in health care. As an incubator that empowers underrepresented populations to target social determinants of health, dihub provides insights into making the tools we create accessible, under the guidance of community health workers and leaders. Second, AEquity partners with equity researchers at institutions like the University of Michigan and IBM, who actively evaluate models for potential biases. Third, as mentioned previously, we will conduct a qualitative evaluation of how providers interact with various metrics after deployment.
We acknowledge that a data-driven solution is not the only solution to inequity, but it is one of the three key prongs necessary to target systemic inequity in healthcare as a whole.
What is your business model?
Key organizational partners can be split into three main categories – regulatory agencies, industry partners, and community health centers. The key regulatory agency is the Centers for Medicare and Medicaid Services because AEquity directly serves to mitigate biases against patients on Medicare and Medicaid. To enforce algorithmic fairness, CMS can require AEquity deployment and reporting to qualify third-party vendors for reimbursements. As a result, healthcare companies like EPIC, 3M Health Information Systems, Optum, SCIO Health Analytics, and Truven Health, under scrutiny from regulatory institutions and academic researchers, are financially incentivized to demonstrate that the data underlying the model is reliable. Third, at community health centers like KP Colorado and OCHIN, the low-resource setting encourages rapid integration of third-party algorithms. Because CHCs serve under-represented patient populations, biases are more likely to be exacerbated there. Widespread integration can incentivize these health systems to be more careful in adopting quality algorithms.
We anticipate using an API to provide quality metrics for private companies to test their datasets and publicly report potential biases, as well as internally validate prior to model development and deployment.
Do you primarily provide products or services directly to individuals, to other organizations, or to the government?
Government (B2G)
What is your plan for becoming financially sustainable?
AEquity has been developed by a cross-disciplinary team of physician-software engineers with expertise in clinical informatics, machine learning, and model deployment at the Icahn School of Medicine at Mount Sinai. Mount Sinai has a team of clinical data scientists and researchers who can maintain the codebase and sustain deployment. Also, public funding sources such as the NIH, which have shown interest in combatting bias through programs like the NIH NCATS initiative, would support further expanding AEquity to public datasets. Last, sustained integration into startups like Blue Clarity and Solas.ai could bring AEquity into clinical data science pipelines outside of Mount Sinai. By reducing regulatory scrutiny of, and liability for, industry partners under Section 1557 of the Affordable Care Act, Good Machine Learning Practice guidelines, and the FDA's Clinical Decision Support Software guidance, AEquity can add value to industry partners and academic research institutions.
Solution Team
Faris Gulamali, Icahn School of Medicine
Our Organization
Icahn School of Medicine at Mount Sinai