AI2 The Impact of Including Race and Ethnicity in Risk Prediction Models on Racial Bias


      Risk prediction algorithms can support clinical decision-making, but there is no consensus on when and how sociodemographic factors, especially the social construct of race/ethnicity, should be included in these algorithms. Our objective was to assess the impact of including race/ethnicity as a predictor in a risk prediction algorithm on racial bias in model performance.


      We used data from a large integrated health care system to develop a recurrence risk prediction model for adults with colorectal cancer who underwent resection. We fitted three Cox proportional hazards models using clinical and demographic variables: one excluding race/ethnicity as a predictor (“race-blind”), one including race/ethnicity (“race-sensitive”), and one adding interactions between predictors and race/ethnicity. We compared racial bias in model performance across these models, measured by discrimination (area under the receiver operating characteristic curve, AUC) and by sensitivity at a fixed specificity of 80%.
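To make the subgroup comparison concrete, the sketch below (illustrative only, with hypothetical toy data, not the study's code or cohort) computes discrimination separately within each racial/ethnic group, given predicted risk scores and observed recurrence outcomes from any of the fitted models:

```python
# Hypothetical sketch: rank-based AUC computed within each racial/ethnic
# subgroup, mirroring the per-group discrimination comparison described above.
# Scores, labels, and group assignments here are invented toy data.

def auc(scores, labels):
    """Rank-based AUC: probability a recurrence case outranks a non-case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def subgroup_auc(scores, labels, groups):
    """AUC within each group (e.g., NHW, Hispanic, Black/AA, Asian/PI)."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        out[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return out

# Toy data: predicted risk scores, recurrence labels, group labels.
scores = [0.9, 0.2, 0.8, 0.4, 0.7, 0.1, 0.6, 0.3]
labels = [1,   0,   1,   0,   1,   0,   0,   1]
groups = ["NHW", "NHW", "NHW", "NHW", "AA", "AA", "AA", "AA"]
print(subgroup_auc(scores, labels, groups))
```

In practice the Cox model fitting itself would be done with a survival analysis package; this sketch only illustrates how performance is stratified by group after scores are in hand.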


      Among 4515 patients (mean age 65 years; 48% female), 53% were non-Hispanic White (NHW), 22% Hispanic, 13% Black/African American (AA), and 12% Asian/Pacific Islander. The 5-year cumulative incidence of recurrence for stage III patients differed across racial groups: 25%, 33%, 35%, and 32% among NHW, Hispanic, AA, and Asian patients, respectively. In the “race-blind” model, AUCs varied across groups (0.74, 0.58, 0.66, and 0.61 among NHW, Hispanic, AA, and Asian patients, respectively). Adding race improved the AUCs only slightly. Adding interaction terms resulted in AUCs of 0.75, 0.57, 0.63, and 0.64. Sensitivities at 80% specificity also varied across groups (50%, 42%, 36%, and 33% among NHW, Hispanic, AA, and Asian patients in both the “race-blind” and “race-sensitive” models). Including interaction terms increased sensitivities only for some groups.
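The sensitivity comparison above fixes one operating point for the whole cohort and then reads off per-group sensitivity at that shared threshold. A minimal sketch of that procedure, again on invented toy data rather than the study's:

```python
# Hypothetical sketch: pick the risk threshold achieving 80% specificity in
# the full cohort, then report sensitivity at that shared threshold within
# each racial/ethnic group. All data below are invented for illustration.

def sensitivity_at_specificity(scores, labels, groups, target_spec=0.80):
    """Per-group sensitivity at a cohort-wide threshold; flag high risk
    when score >= threshold."""
    negs = [s for s, y in zip(scores, labels) if y == 0]
    # Lowest candidate threshold classifying >= target_spec of negatives
    # as low risk (i.e., below the threshold).
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    t = next(c for c in candidates
             if sum(s < c for s in negs) / len(negs) >= target_spec)
    out = {}
    for g in sorted(set(groups)):
        pos = [s for s, y, gi in zip(scores, labels, groups)
               if y == 1 and gi == g]
        out[g] = sum(s >= t for s in pos) / len(pos)
    return out, t

# Toy data: 5 non-recurrences, 4 recurrences, two groups.
scores = [0.1, 0.2, 0.3, 0.4, 0.9, 0.5, 0.6, 0.35, 0.8]
labels = [0,   0,   0,   0,   0,   1,   1,   1,    1]
groups = ["NHW", "AA", "NHW", "AA", "NHW", "NHW", "NHW", "AA", "AA"]
print(sensitivity_at_specificity(scores, labels, groups))
```

Because the threshold is set on the pooled cohort, groups whose cases cluster at lower predicted risk receive lower sensitivity, which is one mechanism behind the disparities reported above.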


      Risk prediction models performed worse in minority racial subgroups than in NHW patients, even with the explicit inclusion of race as a predictor or of race interaction terms. Risk model developers and users need to identify such algorithmic disparities and understand their potential implications.