Real-world data: A health equity lens for research

Health inequities around the world produce disparities of a shocking magnitude. For example, life expectancy at birth ranges from approximately 55 to 85 years depending on the country in which a child is born. Linked with degrees of social disadvantage, these same profound differences can be seen within a single country’s borders, including the United States. These health inequities arise from the circumstances in which people grow, live, work, and age, and from the systems put in place to deal with illness. They are profound, yet they can be addressed with efforts to understand and intervene on their origins.

One way in which the development of societies can be judged is by how fairly health is distributed across the social spectrum and how well societies protect individuals from disadvantage arising from illness. To achieve health equity, we must take action to first understand the social determinants of health (SDoH) that give rise to the inequitable distribution of illness. When evaluating SDoH, one overarching research goal is to unveil how health differs across groups of patients defined by various social characteristics. A second step is to leverage those findings to investigate how to intervene on these salient factors, or their consequences, to level observed disparities. 

Historically, randomized controlled trials (RCTs) have been considered the gold standard for generating evidence on comparative effectiveness. Yet most trials are poorly positioned to investigate questions of health equity, such as differences in treatment access and effectiveness across diverse populations defined by age, gender, race and ethnicity. Analyses of the composition of randomized trial populations indicate that patient groups defined by sex, age, and race/ethnicity are often starkly underrepresented. As one example, a systematic review of RCTs for patients with cardiovascular disease that were cited in American Heart Association practice guidelines revealed that 80-85% of these patients were white and approximately 70% were men.

In April 2022, the U.S. Food and Drug Administration (FDA) published a guidance document on improving trial enrollment from underrepresented racial and ethnic groups. While the document focuses on diversity defined by race and ethnicity, the FDA defines a much broader set of underrepresented groups – including by sex, gender identity, age, socioeconomic status, disability, pregnancy status, lactation status and other clinical characteristics – and encourages sponsors to enrich their trial populations along these axes.

Advantages of Real-World Data for Health Equity Research

While trial design will hopefully evolve to include more representative patient populations, real-world data (RWD) offer an immediate and powerful vehicle to study health equity. RWD are any routinely collected data related to patient health status and/or the delivery of healthcare. RWD can be drawn from electronic health records (EHRs), claims, patient-reported outcomes (PROs), registries and other diverse sources. Because of their potential breadth, multidimensional richness and long follow-up, these data offer powerful advantages and efficiencies for conducting health equity research, including:

  • Patient representativeness and heterogeneity: Using RWD, researchers can examine treatment effects or access in populations representative of real-world practice and in diverse populations not typically included in trials. Broader generalizability is expected in terms of age, race/ethnicity, socioeconomic status, disability and other clinical characteristics. The data offer meaningful opportunities to unveil disparities in outcomes and access. Heterogeneity can also reveal safety and efficacy effects that may not be apparent in randomized trial environments.
  • Efficiency: RWD allow for the analysis of larger sample sizes in subgroups, the ability to generate insights rapidly, and greater statistical power to characterize underrepresented populations with precision. Analyses can look at complex networks of variables – including examining multiple SDoH at the same time, their relative contributions and potential interactions.
  • Diverse data sources enriched with SDoH: Patients can be deterministically linked across data sources to characterize social, behavioral, and clinical determinants of health. RWD can also be linked to non-clinical datasets, such as employment records.
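To make the idea of deterministic linkage concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular platform's method: the salted-hash token scheme, the field names, the `deprivation_index` measure and the sample rows are all invented for this example.

```python
import hashlib


def link_token(first_name, last_name, birth_date, salt="demo-salt"):
    """Return a deterministic token for one patient (hypothetical scheme)."""
    # Normalize so trivial formatting differences do not break the match.
    parts = [first_name, last_name, birth_date]
    key = "|".join(p.strip().lower() for p in parts)
    return hashlib.sha256((salt + "|" + key).encode("utf-8")).hexdigest()


def link_sources(ehr_rows, claims_rows, area_sdoh_by_zip):
    """Attach claims and area-level SDoH to each EHR row by exact token match."""
    claims_by_token = {}
    for row in claims_rows:
        token = link_token(row["first"], row["last"], row["dob"])
        claims_by_token.setdefault(token, []).append(row["claim_id"])

    linked = []
    for row in ehr_rows:
        token = link_token(row["first"], row["last"], row["dob"])
        linked.append({
            "token": token,
            "claim_ids": claims_by_token.get(token, []),
            # Area-level SDoH (an assumed deprivation index) keyed by ZIP code.
            "area_sdoh": area_sdoh_by_zip.get(row["zip"]),
        })
    return linked


# Invented sample rows, for illustration only.
ehr = [
    {"first": "Ana", "last": "Diaz", "dob": "1970-01-31", "zip": "02110"},
    {"first": "Mei", "last": "Chen", "dob": "1965-07-04", "zip": "99999"},
]
claims = [
    {"first": " ana ", "last": "DIAZ", "dob": "1970-01-31", "claim_id": "C1"},
    {"first": "Ana", "last": "Diaz", "dob": "1970-01-31", "claim_id": "C2"},
]
area_sdoh = {"02110": {"deprivation_index": 0.42}}

linked = link_sources(ehr, claims, area_sdoh)
```

Because the match is exact, two records link only when their normalized identifiers agree completely. Records with typos or name changes would be missed, which is the usual trade-off against probabilistic linkage.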

A Case Example: Leveraging RWD for Breast Health Equity

Breast cancer carries one of the highest observed racial disparities in mortality and is a priority for understanding the role of SDoH in inequities in care and outcomes. OM1 is working with a leading mammography device company on a health equity-focused RWD platform that explores the comparative effectiveness of 3D versus 2D mammography in underrepresented and high-risk subgroups, including black and Asian women.

This study draws from a mix of academic and community sites with targeted enrollment of more than one million women and many millions of screens. Due to the study design, the cohort features a greater proportion of black women than the U.S. population and dramatically higher representation than corresponding clinical trials of breast cancer screening modalities. A recent finding from this platform is that black women have reduced access to cutting-edge 3D mammography for routine screening compared with white women. In addition, the comparative effectiveness of 3D vs 2D mammography was examined in subgroups defined by race and ethnicity, demonstrating that while black women have reduced access, they are no less likely than white women to benefit from the improved screening profile of 3D mammography.

Beyond the case study, SDoH data can be studied across virtually all disease areas (e.g., mental health, cardiology, and dermatology). SDoH analyses may measure and track disparities by geography, based on either patient residence or health system location. They can also be used to direct interventions to areas where patients have a particular need or barrier to access, and to provide insights into the factors driving the observed disparities. By specifying hypotheses a priori about the pathways or mechanisms that may underlie observed health inequities, researchers can collect the relevant data elements and design analyses that investigate the possible contributing role of multiple variables, such as income and education.
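One simple way to measure a disparity of the kind described above is a rate ratio with a confidence interval. The sketch below uses the standard large-sample interval on the log scale. The function name and the counts are invented for illustration and are not results from any study.

```python
import math


def rate_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Rate ratio of group A vs group B with an approximate 95% CI."""
    if min(events_a, events_b) <= 0 or events_a > total_a or events_b > total_b:
        raise ValueError("need 0 < events <= total in both groups")
    ratio = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(ratio) for two independent proportions.
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


# Toy counts, invented for illustration: screens performed with a newer
# modality out of all screens, in two hypothetical patient groups.
ratio, lower, upper = rate_ratio(300, 1000, 450, 1000)
print(f"rate ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A ratio below 1 with an interval that excludes 1 would suggest lower access in group A. An unadjusted ratio like this is only a starting point. Investigating the pathways behind a disparity requires models that account for several SDoH at once.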

Considerations When Working with SDoH Data

When investigating the relationships between SDoH, such as race, and health outcomes, it is important to be clear that there is broad consensus that these characteristics are social constructs and not biological determinants. Humans are remarkably genetically similar to one another, and the fraction of genetic variation that occurs within groups is at least as large as that occurring between groups defined by race or ethnicity.

Due to a long and shameful history of misuse of racial classifications in research and medicine, there is disagreement about whether race should continue to be collected and used for research purposes in the U.S., and if collected, how it should be used. It is worth noting that there are marked differences around the world in how SDoH are collected and studied in relation to health. France collects no census or other data on the race (or ethnicity) of its citizens, which limits any potential to evaluate differences by race. In the United Kingdom, there is an emphasis on evaluating differences in health by social class rather than race/ethnicity. In the United States, however, significant race-associated differences in health outcomes continue to be observed across many health conditions. A health equity lens advocates for understanding these differences by carefully collecting data on SDoH and designing equity-oriented research in the service of identifying interventions to level health-related disparities.

Additional considerations include acknowledging the diversity within racial groups and carefully considering how categories of race/ethnicity are defined and collected. It is also important to specify, during the research design phase, why information on race is to be collected (e.g., because of a previously documented disparity), and to describe how race was measured (observer-coded vs. self-reported, the number and names of categories, whether multiple responses were allowed). While investigations of SDoH often focus on “one variable at a time,” investigators should consider during the design phase the comprehensive set of SDoH needed to fully explore associations and investigate possible hypotheses, including possible explanatory mechanisms. To do so, it is often important to consider whether data collection should include measures of racism, social class, culture, ancestry, migration history, language and genetic variation (if applicable), so that the basis of observed differences can be determined.
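The reporting elements listed above (why race is collected, how it was measured, which categories were used, and whether multiple responses were allowed) can be captured in a small structured record at the design stage. The following is a sketch only. The class name, field names and checks are assumptions made for illustration, and the category labels are placeholders rather than a recommended classification.

```python
from dataclasses import dataclass, field


@dataclass
class RaceEthnicityMeasure:
    """Design-phase record of why and how race/ethnicity is collected."""
    rationale: str   # e.g., a previously documented disparity
    method: str      # "self-reported" or "observer-coded"
    categories: list = field(default_factory=list)
    multiple_responses_allowed: bool = False

    def problems(self):
        """Return a list of gaps to resolve before data collection starts."""
        issues = []
        if not self.rationale.strip():
            issues.append("rationale for collecting race is missing")
        if self.method not in ("self-reported", "observer-coded"):
            issues.append("method must be self-reported or observer-coded")
        if len(self.categories) < 2:
            issues.append("category names are not listed")
        return issues


# Placeholder values, for illustration only.
measure = RaceEthnicityMeasure(
    rationale="Previously documented disparity in screening access",
    method="self-reported",
    categories=["Category A", "Category B", "Category C"],
    multiple_responses_allowed=True,
)
incomplete = RaceEthnicityMeasure(rationale="", method="unknown")
```

Calling `measure.problems()` returns an empty list for the complete record, while `incomplete.problems()` lists the three gaps to address before collection begins.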

The Future of Healthcare Equity

With careful consideration, access to high-quality data sources and sound scientific methods, RWD and SDoH data offer an enormous opportunity for addressing gaps in care, access and treatment responsiveness. Using a set of characteristics that are meaningful determinants of health outcomes, we can begin to understand the underlying causes of healthcare inequities and to develop policies and therapies in a more personalized and equitable way.


¹ Closing the gap in a generation: health equity through action on the social determinants of health – Final report of the Commission on Social Determinants of Health. WHO/IER/CSDH/08.1. 27 August 2008.

² Flores LE, et al. Analysis of Age, Race, Ethnicity, and Sex of Participants in Clinical Trials Focused on Eating Disorders. JAMA Netw Open. 2022;5(2):e220051. doi:10.1001/jamanetworkopen.2022.0051

³ Flores LE, et al. Assessment of the Inclusion of Racial/Ethnic Minority, Female, and Older Individuals in Vaccine Clinical Trials. JAMA Netw Open. 2021;4(2):e2037640. doi:10.1001/jamanetworkopen.2020.37640

⁴ Downing NS, et al. Participation of the elderly, women, and minorities in pivotal trials supporting 2011–2013 U.S. Food and Drug Administration approvals. Trials. 2016;17:199.

⁵ DeFilippis EM, et al. Improving Enrollment of Underrepresented Racial and Ethnic Populations in Heart Failure Trials: A Call to Action From the Heart Failure Collaboratory. JAMA Cardiol. 2022;7(5):540–548. doi:10.1001/jamacardio.2022.0161

⁶ Mendis S, et al. Sex Representation in Clinical Trials Associated with FDA Cancer Drug Approvals Differs Between Solid and Hematologic Malignancies. The Oncologist. 2021;26(2):107–114.

⁷ Sardar MR, et al. Underrepresentation of women, elderly patients, and racial minorities in the randomized trials used for cardiovascular guidelines. JAMA Intern Med. 2014 Nov;174(11):1868-70.

⁸ Diversity Plans to Improve Enrollment of Participants from Underrepresented Racial and Ethnic Populations in Clinical Trials: Guidance for Industry.

⁹ Alsheik N, et al. Outcomes by Race in Breast Cancer Screening With Digital Breast Tomosynthesis Versus Digital Mammography. J Am Coll Radiol. 2021;18(7):906-918. doi:10.1016/j.jacr.2020.12.033
