
Weighing the benefits and risks of collecting race and ethnicity data in clinical settings for medical artificial intelligence

  • Amelia Fiske
  • Sarah Blacker
  • Lester Darryl Geneviève
  • Theresa Willem
  • Marie Christine Fritzsche
  • Alena Buyx
  • Leo Anthony Celi
  • Stuart McLennan
  • Technical University of Munich
  • York University
  • University of Basel
  • Centre de Recherche en Santé Durable
  • Université Laval, Faculté de médecine
  • Massachusetts Institute of Technology
  • Harvard T.H. Chan School of Public Health
  • Harvard Medical School

Research output: Contribution to journal, Review article, peer-review

12 Scopus citations

Abstract

Many countries around the world do not collect race and ethnicity data in clinical settings. Without such identified data, it is difficult to identify biases in the training data or output of a given artificial intelligence (AI) algorithm, and to work towards medical AI tools that do not exclude or further harm marginalised groups. However, the collection of these data also poses specific risks to racially minoritised populations and other marginalised groups. This Viewpoint weighs the risks of collecting race and ethnicity data in clinical settings against the risks of not collecting those data. The collection of more comprehensive identified data (ie, data that include personal attributes such as race, ethnicity, and sex) has the possibility to benefit racially minoritised populations that have historically faced worse health outcomes and health-care access, and inadequate representation in research. However, the collection of extensive demographic data raises important concerns that include the construction of intersectional social categories (ie, race and its shifting meaning in different sociopolitical contexts), the risks of biological reductionism, and the potential for misuse, particularly in situations of historical exclusion, violence, conflict, genocide, and colonialism. Careful navigation of identified data collection is key to building better AI algorithms and to working towards medicine that does not exclude or harm marginalised groups.

Original language: English
Pages (from-to): e286-e294
Journal: The Lancet Digital Health
Volume: 7
Issue number: 4
State: Published - Apr 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 16 - Peace, Justice and Strong Institutions
