Methodology | Volume 24, Issue 9, P1285-1293, September 2021

# Mapping EQ-5D-3L to EQ-5D-5L

Open Archive. Published: May 17, 2021

## Abstract

### Objectives

The original 3-level EQ-5D (EQ-5D-3L) includes 5 dimensions with 3 levels of problems per dimension. Since 2010, a more sensitive version with 5 levels of problems per dimension (EQ-5D-5L) has become available. Population value sets have been developed for both versions of the questionnaire. The objective of this research was to develop a mapping function to link EQ-5D-3L responses to value sets for the EQ-5D-5L.

### Methods

Various algorithms were developed to link EQ-5D-3L and EQ-5D-5L responses using data from an observational study including members of 10 subgroups (N = 3580) who completed both versions of the questionnaire. Nonparametric and ordinal logistic regression models were fit to the data and compared using Akaike’s information criterion (AIC) as well as the mean absolute error and root mean squared error of predictions. Results were contrasted qualitatively and quantitatively with those of an alternative copula-based approach.

### Results

Including indicants of problems for other EQ-5D-3L dimensions as regressors in the modeling yielded the greatest improvement in prediction accuracy. Adding age and gender lowered the AIC without improving predictions, while including a latent factor lowered the AIC further and slightly improved predictive accuracy. Models that conditioned on problems in other EQ-5D-3L dimensions yielded more accurate predictions than the alternative copula-based approach in subgroups defined by age and gender.

### Conclusion

We present novel algorithms to map EQ-5D-3L responses to EQ-5D-5L value sets. The recommended approach is based on an ordinal logistic regression that disregards age and gender and accounts for unobserved heterogeneity using a latent factor.

## Introduction

The EQ-5D questionnaire was developed to enable the measurement and valuation of health-related quality of life. The original 3-level version of the EQ-5D (EQ-5D-3L) includes 5 dimensions with 3 problem levels (none, moderate, or extreme) per dimension (EuroQol Group, "EuroQol - a new facility for the measurement of health-related quality of life"). Altogether, the questionnaire defines 243 unique states of health. To enhance the sensitivity of the EQ-5D-3L, a 5-level version of the EQ-5D (EQ-5D-5L) was developed by providing an increased number of item responses and improved item wording (Herdman, Gudex, Lloyd, et al., "Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L)"). The EQ-5D-5L describes 3125 health states and has been shown to exhibit superior psychometric properties to the EQ-5D-3L in a variety of conditions (Buchholz, Janssen, Kohlmann, and Feng, "A systematic review of studies comparing the measurement properties of the three-level and five-level versions of the EQ-5D"; Pickard, Neary, and Cella, "Estimation of minimally important differences in EQ-5D utility and VAS scores in cancer"; Khan, Morris, Pashayan, Matata, Bashir, and Maguirre, "Comparing the mapping between EQ-5D-5L, EQ-5D-3L and the EORTC-QLQ-C30 in non-small cell lung cancer patients"; Thompson and Turner, "A comparison of the EQ-5D-3L and EQ-5D-5L").
After the development of the EQ-5D-5L, it was expected that it would take time for countries to produce national tariffs to inform economic evaluations. In 2012, to support the use of the EQ-5D-5L during this time, a research team from the EuroQol Group explored a variety of modeling approaches and produced a crosswalk enabling the linkage of EQ-5D-3L value sets to EQ-5D-5L responses (van Hout, Janssen, Feng, et al., "Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets"). This crosswalk was unidirectional because, at that time, there was no anticipated need to map from the EQ-5D-3L to the EQ-5D-5L.
There is a small but growing interest in the use of EQ-5D-3L data to predict utilities based on EQ-5D-5L value sets. Although many biopharmaceutical developers have transitioned to use of the EQ-5D-5L in their clinical trials, licensing of the EQ-5D-3L continues. Approaches to map EQ-5D-3L responses to EQ-5D-5L value sets enable end users to assess the potential cost-effectiveness implications of switching to the newer version of the instrument. In addition, it is anticipated that at some point in the near future health technology assessment bodies will request that sponsors use the EQ-5D-5L to inform their economic evaluations due to the availability of value sets, quality controls enforced in the EuroQol Valuation Technology, and general improvements in data collection and statistical modeling relative to health state valuation studies conducted for the EQ-5D-3L. Given the ongoing use of the EQ-5D-3L, this could engender a need for methods to link available data to EQ-5D-5L value sets.
In 2017, the Decision Support Unit (DSU) of the National Institute for Health and Care Excellence (NICE) published an EQ-5D-mapping approach based on mixture-copula models that allowed for bidirectional prediction (Hernandez-Alava, Wailoo, and Pudney). Restricting their focus to the prediction of utilities for EQ-5D-5L health states based on an EQ-5D-3L value set, the authors compared the accuracy of their approach against that of the EuroQol Group’s research team under a variety of conditions and reported only very small differences in goodness-of-fit indices. Comparisons based on aggregate fit statistics tended to favor the EuroQol approach, whereas those based on predictions within strata defined by age and gender tended to favor the DSU algorithm.
This article describes an adaptation of the EuroQol approach to map EQ-5D-3L responses to EQ-5D-5L value sets. We enrich the approach by including a latent factor as a common covariate to all dimensions, to capture unobserved heterogeneity. We present estimates with and without age and gender as covariates and compare our approach, both qualitatively and quantitatively, with the DSU approach. An R program is made available to predict utilities based on any EQ-5D-5L value set using EQ-5D-3L data.

## Data and Methods

The data that were previously used to develop a crosswalk to link EQ-5D-5L responses to EQ-5D-3L value sets were also used in this study (van Hout, Janssen, Feng, et al., "Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets"). Responses to the EQ-5D-3L and EQ-5D-5L were collected from participants in 6 countries divided among 8 disease populations: COPD/asthma (n = 342), diabetes (n = 275), liver disease (n = 426), rheumatoid arthritis/arthritis (n = 372), cardiovascular disease (n = 250), stroke (n = 614), depression (n = 250), and personality disorders (n = 384). The sample also included 443 students and 334 patients with nonspecific diagnoses. Table 1 presents a crosstabulation of the data used in the analyses.
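
Table 1. Data used to map from 3L to 5L (numbers before listwise deletion between parentheses). Rows give the EQ-5D-5L level; columns give the EQ-5D-3L level.

| Dimension and EQ-5D-5L level | EQ-5D-3L level 1 | EQ-5D-3L level 2 | EQ-5D-3L level 3 |
|---|---|---|---|
| **Mobility** | **No** | **Some** | **Confined to bed** |
| No | 1651 (1782) | 0 (29) | 0 (1) |
| Slight | 103 (119) | 507 (552) | 0 (1) |
| Moderate | 10 (16) | 525 (586) | 4 (4) |
| Severe | 0 (1) | 355 (386) | 25 (30) |
| Unable | 0 (4) | 0 (23) | 105 (112) |
| **Self-care** | **No** | **Some** | **Unable** |
| No | 2280 (2468) | 0 (43) | 0 (3) |
| Slight | 73 (82) | 382 (408) | 0 (5) |
| Moderate | 10 (13) | 288 (313) | 6 (6) |
| Severe | 0 (5) | 90 (109) | 30 (35) |
| Unable | 0 (0) | 0 (6) | 126 (140) |
| **Usual activities** | **No** | **Some** | **Unable** |
| No | 1308 (1382) | 0 (42) | 0 (5) |
| Slight | 146 (163) | 601 (661) | 0 (70) |
| Moderate | 15 (20) | 608 (656) | 19 (23) |
| Severe | 0 (9) | 254 (274) | 122 (134) |
| Unable | 0 (0) | 0 (15) | 212 (239) |
| **Pain/discomfort** | **No** | **Moderate** | **Extreme** |
| No | 1061 (1126) | 0 (65) | 0 (1) |
| Slight | 199 (211) | 787 (850) | 0 (4) |
| Moderate | 16 (21) | 761 (837) | 18 (19) |
| Severe | 0 (6) | 221 (239) | 149 (159) |
| Extreme | 0 (2) | 0 (8) | 73 (82) |
| **Anxiety/depression** | **No** | **Moderate** | **Extreme** |
| No | 1275 (1352) | 0 (45) | 0 (1) |
| Slight | 207 (219) | 752 (841) | 0 (3) |
| Moderate | 26 (30) | 631 (692) | 14 (17) |
| Severe | 0 (10) | 147 (164) | 150 (158) |
| Extreme | 0 (3) | 0 (6) | 83 (93) |
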
The general approach to model development was to predict for each EQ-5D-3L health state the probability of being in any one of the 3125 EQ-5D-5L health states with probabilities summing to one. Predicted utilities for the 243 EQ-5D-3L health states were subsequently obtained by summing the products of the 3125 probabilities and corresponding EQ-5D-5L health state values within a given EQ-5D-3L health state. Respondents were excluded from model estimation if they demonstrated inconsistent behavior by indicating (1) no problems using the EQ-5D-3L and severe or extreme problems using the EQ-5D-5L, (2) some problems using the EQ-5D-3L but no or extreme problems using the EQ-5D-5L, or (3) extreme problems using the EQ-5D-3L and no or slight problems using the EQ-5D-5L.
Let k = 1,…, 5 index the EQ-5D dimension (mobility, usual activities, self-care, pain/discomfort, and anxiety/depression), i = 1,…, 5 index problem levels as measured by the EQ-5D-5L, and j = 1,…, 3 index problem levels as measured by the EQ-5D-3L. Then $i_k$ and $j_k$ represent the observed problem levels for the kth dimension of the EQ-5D-5L and EQ-5D-3L, respectively. For the kth dimension, the probability of observing the ith EQ-5D-5L problem level conditional on the jth EQ-5D-3L problem level can be denoted as $p_{i,j}^{k}$. The probability of being in a given EQ-5D-5L health state contingent on an EQ-5D-3L health state can be derived as the product of the 5 dimension-specific probabilities. This is represented in Eq. 1.
$$P(i_1,\dots,i_5 \mid j_1,\dots,j_5)=\prod_{k=1}^{5} p_{i_k,j_k}^{k} \tag{1}$$
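
The aggregation step described above (multiplying the 5 dimension-specific probabilities and weighting each of the 3125 EQ-5D-5L health state values by the result) can be sketched as follows. This is a minimal illustration, not the authors' R program: the probability table `toy_probs` and the value function `toy_value` are hypothetical stand-ins for estimated probabilities and a real EQ-5D-5L value set.

```python
from itertools import product


def expected_5l_utility(state_3l, probs, value_5l):
    """Expected EQ-5D-5L utility for one EQ-5D-3L health state.

    state_3l : tuple of five 3L levels (1..3), one per dimension
    probs    : probs[k][j][i] = P(5L level i+1 | 3L level j+1) for dimension k
    value_5l : function mapping a 5L state (tuple of levels 1..5) to a utility
    """
    total = 0.0
    for state_5l in product(range(1, 6), repeat=5):  # all 3125 5L states
        p = 1.0
        for k in range(5):  # product of dimension-specific probabilities
            p *= probs[k][state_3l[k] - 1][state_5l[k] - 1]
        total += p * value_5l(state_5l)
    return total


# Hypothetical inputs, for illustration only (same probabilities for every dimension).
toy_rows = [
    [0.90, 0.10, 0.00, 0.00, 0.00],  # given 3L level 1
    [0.00, 0.40, 0.40, 0.20, 0.00],  # given 3L level 2
    [0.00, 0.00, 0.05, 0.25, 0.70],  # given 3L level 3
]
toy_probs = [toy_rows for _ in range(5)]


def toy_value(state_5l):
    # Hypothetical additive value function, NOT a published value set.
    return 1.0 - 0.05 * sum(level - 1 for level in state_5l)


utility_11111 = expected_5l_utility((1, 1, 1, 1, 1), toy_probs, toy_value)
utility_21111 = expected_5l_utility((2, 1, 1, 1, 1), toy_probs, toy_value)
```

Repeating the call for all 243 EQ-5D-3L states would yield the full set of predicted utilities; with a real value set, `toy_value` would be replaced by a lookup of the published tariff.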

Different approaches were used to estimate the probabilities $p_{i,j}^{k}$. Nine models were fit to map EQ-5D-3L responses to those of the EQ-5D-5L: a nonparametric model (which was chosen as the “best” model when mapping from the EQ-5D-5L to the EQ-5D-3L; van Hout, Janssen, Feng, et al., "Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets") and 2 versions of ordinal logistic regression models. In the simpler version of the models, each EQ-5D-5L dimension problem level was related to its corresponding EQ-5D-3L problem level using 2 dummy variables, the first taking on a value of 1 if the EQ-5D-3L problem level was two (some problems) and the second taking on a value of 1 if the problem level was three (extreme problems). In the more complex version of the models, dummy variables coding for moderate and extreme problems in each of the 4 other dimensions were included as regressors (8 dummy variables in total). The more complex models were fit with and without the inclusion of gender (male, female), continuous age, and age-squared. Additionally, both the simple and more complex models were fit with and without the inclusion of a latent factor common to all dimensions to account for unobserved heterogeneity (see Appendix A in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.03.009).
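
The dummy coding described above can be sketched in a few lines. The indicator names (`mob2`, `sc3`, and so on) are hypothetical labels chosen to echo the coefficient subscripts in Table 2, and the dimension order follows the tables; neither is taken from the authors' code.

```python
DIMENSIONS = ["mob", "sc", "ua", "pd", "ad"]  # order used in Tables 1 and 2


def design_row(state_3l, k, include_other_dims=True):
    """Dummy-coded regressors for predicting EQ-5D-5L dimension k.

    state_3l : tuple of five 3L levels (1..3)
    k        : index (0..4) of the dimension being predicted
    Returns a dict of 0/1 indicators: two for the dimension itself and,
    in the more complex specification, two for each of the 4 other dimensions.
    """
    names = DIMENSIONS if include_other_dims else [DIMENSIONS[k]]
    row = {}
    for name in names:
        level = state_3l[DIMENSIONS.index(name)]
        row[name + "2"] = int(level == 2)  # some/moderate problems
        row[name + "3"] = int(level == 3)  # extreme problems
    return row


simple = design_row((2, 1, 3, 1, 1), k=0, include_other_dims=False)
complex_ = design_row((2, 1, 3, 1, 1), k=0)
```

For EQ-5D-3L state 21311, the simple specification yields 2 indicators for mobility, while the complex one yields 10 (the same 2 plus 8 for the other dimensions).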
The logic behind adjusting for problems in other EQ-5D-3L dimensions when fitting the more complex models was that the EQ-5D-3L response metric enforces a form of censoring that is ameliorated with the EQ-5D-5L. The problem level for each EQ-5D-3L dimension can be seen as a rounded number on a continuous scale such that a respondent might provide a somewhat lower value (relatively healthier) or higher value (relatively less healthy) if enabled to do so. Consider 2 groups of respondents with the same level of problems as measured by the EQ-5D-3L for the kth dimension with the first group having more extreme problems in other dimensions than the second group. If offered the more graded set of response options included in the EQ-5D-5L, members of the first group would be expected to rate their problem level for the kth dimension as being worse than would members of the second group. Similarly, among persons having the same level of problems for a given dimension as measured by the EQ-5D-3L, one might expect older individuals to grade their problem level for that dimension as more extreme than would younger individuals if presented with the EQ-5D-5L.
Applying the nonparametric approach, the probabilities $p_{i,j}^{k}$ were a function of the EQ-5D-3L dimension-specific problem levels and estimated by the cell frequencies defined in Table 1. The form of these probabilities is represented in Eq. 2:
$$p_{i,j}^{k}=\frac{R_{i,j}^{k}}{\sum_{i'=1}^{5} R_{i',j}^{k}} \tag{2}$$

where $R_{i,j}^{k}$ refers to the number of respondents in the ith row and jth column of Table 1 for dimension k, and all other terms are as previously defined. For instance, $p_{2,1}^{1}$ was estimated by the number of respondents who categorized themselves as having slight mobility problems using the EQ-5D-5L and no mobility problems using the EQ-5D-3L divided by the total number of respondents classified as having no mobility problems using the EQ-5D-3L, that is, [103/(1651 + 103 + 10)] = 0.0584.
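
The worked example above can be reproduced directly from the mobility block of Table 1 (counts after listwise deletion). The helper name `column_proportions` is an illustrative choice, not part of any published program.

```python
# Mobility counts from Table 1: rows = EQ-5D-5L level, columns = EQ-5D-3L level.
mobility_counts = [
    [1651, 0, 0],    # no problems
    [103, 507, 0],   # slight
    [10, 525, 4],    # moderate
    [0, 355, 25],    # severe
    [0, 0, 105],     # unable
]


def column_proportions(counts):
    """Divide each cell by its column total, giving P(5L level | 3L level)."""
    n_cols = len(counts[0])
    col_totals = [sum(row[j] for row in counts) for j in range(n_cols)]
    return [[row[j] / col_totals[j] for j in range(n_cols)] for row in counts]


p_mob = column_proportions(mobility_counts)
print(round(p_mob[1][0], 4))  # 0.0584
```

Each column of `p_mob` sums to one, so the five probabilities for a given EQ-5D-3L level form a proper distribution over EQ-5D-5L levels.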
When fitting ordinal logistic regression models, the probabilities for each dimension k and each respondent $r=1,…,R$ were modeled as
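$$p_{i,j}^{k}(\mu_r)=\frac{1}{1+\exp\left[-\left(\alpha_{i}^{k}-x_j'\beta^{k}-\theta_k\mu_r\right)\right]}-\frac{1}{1+\exp\left[-\left(\alpha_{i-1}^{k}-x_j'\beta^{k}-\theta_k\mu_r\right)\right]} \tag{3}$$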

for i = 2, … , 4 and j = 1, …, 3, where $\alpha_{i}^{k}-x_j'\beta^{k}$ was a linear function of the regressors, and $\mu_r$ was a latent factor capturing unobserved heterogeneity due to within-respondent correlation. The latter was assumed to be normally distributed with mean zero and variance $\sigma^2$. The effects of the latent factor on dimensions were measured by $\theta_k$ with $\theta_1$ normalized to 1. When interpreting parameter estimates for this model, the $\hat{\beta}$’s reflect the degree to which the probability of being in a more severe state increases as a function of the regressors, while the $\hat{\alpha}$’s represent thresholds (points on the latent outcome) used to differentiate adjacent levels of the response variable.
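
To illustrate how thresholds and a linear predictor translate into the 5 level probabilities, the sketch below assumes the standard cumulative-logit form, with the latent term entering alongside the linear predictor. It fixes the latent factor at zero, which is an illustrative simplification: population-level predictions would require integrating over its normal distribution, so the output should not be read as the model's fitted probabilities. The cut points and the coefficient are the mobility estimates from Table 2.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(cutpoints, linear_predictor, theta=1.0, mu=0.0):
    """Category probabilities under an assumed cumulative-logit model.

    P(level <= i) = logistic(alpha_i - x'beta - theta * mu); the probability
    of each level is the difference between adjacent cumulative values.
    """
    eta = linear_predictor + theta * mu
    cumulative = [logistic(a - eta) for a in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for i in range(1, len(cumulative)):
        probs.append(cumulative[i] - cumulative[i - 1])
    return probs


# Table 2, mobility column: cut points 1|2 .. 4|5 and the coefficient for
# EQ-5D-3L mobility level 2, with no problems in any other dimension.
alpha_mob = [3.933, 9.472, 12.135, 18.118]
probs_some_mobility = ordinal_probs(alpha_mob, linear_predictor=8.019)
probs_no_problems = ordinal_probs(alpha_mob, linear_predictor=0.0)
```

Under these assumptions the most probable EQ-5D-5L level shifts from "no problems" to "slight problems" when the EQ-5D-3L mobility level moves from 1 to 2.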
All models were estimated using maximum likelihood. Details on the derivation of the likelihood, including the latent factor, are presented in Appendix A in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.03.009. To facilitate the assessment of goodness of fit, Akaike’s information criterion (AIC) was derived and reported for each model. Additionally, following the DSU report, the in-sample predictive accuracy of each model was assessed by calculating the mean absolute error (MAE) and the root mean squared error (RMSE) comparing predicted utilities based on the EQ-5D-3L descriptive system with observed EQ-5D-5L health state values. Evaluations were performed in the whole sample as well as in subgroups defined by age and gender. To facilitate comparability with DSU results, respondents who were identified as having provided inconsistent data and excluded from the estimation were included in assessments of in-sample predictive accuracy. The out-of-sample predictive accuracy of the models was also assessed. This was accomplished by dividing the sample into groups defined by disease states (eg, patients with asthma vs all others). Models were fit to the data for patients without the disease in question and predictions generated and compared against observed values using data for the patient subgroup. In all cases, the English EQ-5D-5L value set (Devlin, Shah, Feng, Mulhern, and van Hout, "Valuing health-related quality of life: An EQ-5D-5L value set for England") was used to predict utilities.
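
The accuracy and fit indices named above follow their usual textbook definitions, sketched here for reference. The function names and the toy numbers are illustrative assumptions, not study data.

```python
import math


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed utilities."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def root_mean_squared_error(predicted, observed):
    """Square root of the average squared prediction error."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def aic(log_likelihood, n_parameters):
    """Akaike's information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


# Toy values, for illustration only.
toy_predicted = [0.8, 0.6]
toy_observed = [0.7, 0.8]
toy_mae = mean_absolute_error(toy_predicted, toy_observed)
toy_rmse = root_mean_squared_error(toy_predicted, toy_observed)
```

Because RMSE squares the errors before averaging, it penalizes large individual mispredictions more heavily than MAE, which is one reason the two indices can rank models differently.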

### Comparison With the DSU Approach

Comparisons with the DSU mapping approach involved qualitative and quantitative elements. The qualitative component entailed the comparison of a list of “desired” characteristics that a mapping function should exhibit. The quantitative component entailed comparing model predictions by means of MAE and RMSE in subgroups defined by age and gender. The DSU model predictions were calculated using the Stata command EQ5DMAP with the Model=EQGcopula option (Hernandez-Alava and Pudney, "eq5dmap: a command for mapping from 3-level to 5-level EQ-5D").

## Results

The relative frequencies in Table 1 constitute the parameter estimates underlying the nonparametric approach. For the kth dimension, the probability of exhibiting the ith EQ-5D-5L problem level conditional on the jth EQ-5D-3L problem level is reflected by the corresponding ijth cell count divided by the jkth column-sum. Table 2 presents the parameter estimates for ordinal logistic regressions excluding age and gender but including the latent factor. The first 4 estimates are the ancillary cut points (see Eq. 3). The parameter estimates for all other models are presented in Appendix B in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.03.009.

Table 2. Ordered logistic regression including adjacent dimensions and a latent factor, parameter estimates.
Coefficients that are NOT significant at the 5% level are presented in boldface font. The parameter θ was estimated for all dimensions save mobility.

| Parameter | Mobility: Estimate | Mobility: SE | Self-care: Estimate | Self-care: SE | Usual activities: Estimate | Usual activities: SE | Pain/Discomfort: Estimate | Pain/Discomfort: SE | Anxiety/Depression: Estimate | Anxiety/Depression: SE |
|---|---|---|---|---|---|---|---|---|---|---|
| 1\|2 $(\alpha_1)$ | 3.933 | 0.188 | 6.091 | 0.347 | 3.291 | 0.172 | 2.243 | 0.101 | 2.027 | 0.094 |
| 2\|3 $(\alpha_2)$ | 9.472 | 0.437 | 11.673 | 0.579 | 8.822 | 0.405 | 7.442 | 0.288 | 6.428 | 0.227 |
| 3\|4 $(\alpha_3)$ | 12.135 | 0.493 | 14.608 | 0.658 | 11.940 | 0.495 | 10.216 | 0.313 | 9.001 | 0.253 |
| 4\|5 $(\alpha_4)$ | 18.118 | 0.779 | 18.481 | 0.844 | 16.474 | 0.658 | 14.473 | 0.425 | 13.131 | 0.395 |
| $\beta_{2}^{mob}$ | 8.019 | 0.400 | 2.221 | 0.276 | 0.906 | 0.154 | 0.771 | 0.118 | -0.225 | 0.124 |
| $\beta_{3}^{mob}$ | 14.996 | 0.793 | 4.076 | 0.477 | 2.901 | 0.457 | 0.953 | 0.315 | 0.052 | 0.315 |
| $\beta_{2}^{sc}$ | 1.499 | 0.147 | 7.957 | 0.433 | 1.421 | 0.152 | 0.818 | 0.118 | 0.240 | 0.123 |
| $\beta_{3}^{sc}$ | 2.165 | 0.370 | 13.817 | 0.748 | 2.513 | 0.402 | 1.170 | 0.297 | 0.188 | 0.297 |
| $\beta_{2}^{ua}$ | 0.904 | 0.171 | 0.996 | 0.301 | 7.361 | 0.369 | 0.791 | 0.118 | 0.541 | 0.117 |
| $\beta_{3}^{ua}$ | 2.063 | 0.255 | 2.607 | 0.359 | 12.812 | 0.568 | 1.056 | 0.195 | 1.351 | 0.192 |
| $\beta_{2}^{pd}$ | 0.472 | 0.159 | -0.063 | 0.231 | 0.250 | 0.143 | 6.239 | 0.274 | 0.031 | 0.107 |
| $\beta_{3}^{pd}$ | 2.235 | 0.250 | 0.089 | 0.291 | 1.096 | 0.238 | 10.930 | 0.394 | 0.019 | 0.185 |
| $\beta_{2}^{ad}$ | -0.019 | 0.123 | 0.480 | 0.163 | 0.678 | 0.121 | 0.091 | 0.094 | 6.046 | 0.220 |
| $\beta_{3}^{ad}$ | -0.287 | 0.224 | 0.773 | 0.270 | 1.767 | 0.220 | 0.547 | 0.167 | 11.517 | 0.379 |
| $\theta$ | 1.000 | 0.000 | 1.055 | 0.133 | 1.146 | 0.158 | 0.523 | 0.075 | 0.460 | 0.072 |
| $\sigma^2$ | 1.121 | 0.049 | | | | | | | | |

Figure 1 provides an efficient means of interpreting the results. For the kth dimension, a histogram is presented for the distribution of $x'\hat{\beta}^{k}$ along with the predicted probability of each of the 5 problem levels conditional on $x'\hat{\beta}^{k}$. At very low values of $x'\hat{\beta}^{k}$, respondents were most likely to exhibit no problems as measured by the EQ-5D-5L. Respondents pivoted to the right (exhibited increasing problem extremity) with increases equal to the values of parameter estimates for regressors. The degree to which this occurred is illustrated at the top of each panel. The points at which the curves cross relate to the estimated ancillary cut points.
As one might expect, the coefficients suggest that for each dimension the problem level as indicated by the EQ-5D-5L was most sensitive to the problem level for the same dimension as indicated by the EQ-5D-3L. The probability of a respondent reporting a more extreme problem level for a given dimension increased with the worsening severity of problems in other dimensions. For example, compared to respondents having a similar level of mobility problems as indicated by the EQ-5D-3L, respondents exhibiting more extreme problems for most other dimensions had a higher likelihood of reporting worse mobility problems as measured by the EQ-5D-5L. Notable exceptions included the effects of EQ-5D-3L problems in anxiety/depression on the likelihood of EQ-5D-5L problems in mobility and the effects of EQ-5D-3L problems in pain/discomfort on the likelihood of EQ-5D-5L problems in self-care, which were limited. Moreover, problems in EQ-5D-3L dimensions other than usual activities had no or limited impact on the likelihood of EQ-5D-5L problems in anxiety/depression. In these exceptional cases, our findings suggest that problems in EQ-5D-3L dimensions were not informative to the likelihood of observing problems in the implicated EQ-5D-5L dimensions after controlling for problems in other conceptually related EQ-5D-3L dimensions (eg, there was no independent association of pain/discomfort with self-care after controlling for the effects of mobility and usual activities). On the whole, these results are consistent with the findings of Shaw and colleagues, who used correlation networks to explore relationships among attributes defined by the EQ-5D-3L (Shaw JW, Bennett B, Trigg A, DeRosa M, Taylor F, Cocks K, "Associations between the EQ-5D-3L and QLU-C10D descriptive systems: use of correlation networks to explore preference differences in solid tumor trials," 25th Annual International Meeting of the International Society for Pharmacoeconomics and Outcomes Research, May 18-20, 2020, Abstract PCN317; Shaw JW, Bennett B, Trigg A, et al., "A comparison of generic and condition-specific preference-based measures using data from nivolumab trials: EQ-5D-3L, a mapping to the EQ-5D-5L, and QLU-C10D," manuscript submitted for publication).
The results with respect to age and gender were mixed. Without including the latent factor or age-squared in the model, age was not a statistically significant predictor for any EQ-5D-5L dimension. However, when including age-squared, age was determined to be a significant predictor for mobility and anxiety/depression, while age and age-squared were significant predictors for usual activities. When including just the latent factor in the model, age was a significant predictor for mobility, self-care, and anxiety/depression, with the parameter estimate being positive for the first 2 dimensions and negative for the last. The effect of gender was significant only in predicting pain/discomfort, regardless of whether age, age-squared, or the latent factor was included in the model.
Figure 2 presents a comparison of predictions based on the EQ-5D-5L value set for the 243 EQ-5D-3L health states between models fit without age and gender and with age, age-squared, and gender. Both models included the latent factor. Conditioning on age and gender changed the magnitude of predicted values as well as the ranking of the values. The ranking distortion was most notable for health state 23311, which was ranked 123 out of 243 health states without conditioning on age and gender. Assuming male gender, the rank of this health state increased to 140 when conditioning on an age of 20 years, 131 when conditioning on an age of 50 years, and fell to 123 again when conditioning on an age of 80 years. The effects of gender on top of those of age and age-squared were minor with the majority of rankings remaining unchanged.
Table 3 presents indicants of goodness of fit and predictive accuracy for the nonparametric model as well as ordinal logistic regression models adjusting for problems in other EQ-5D-3L dimensions with or without the inclusion of age, age-squared, gender, and the latent factor. For each set of comparisons, the first row presents in-sample results for all respondents, including those deemed to provide inconsistent data. The following rows present results based on out-of-sample predictions. Including dummy variables coding for problems in other dimensions as well as the latent factor yielded reductions in AIC, MAE, and RMSE as compared to the nonparametric model. Adding age, age-squared, and gender also resulted in improvements in indices in spite of the variance in statistically significant relationships. When considering in-sample predictions, the model excluding demographic characteristics and the latent factor had the lowest MAE and RMSE, while based on out-of-sample predictions MAE and RMSE were lowest for the model including the latent factor without age and gender.
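
Table 3. Mean absolute error, root mean squared error, and AIC by disease group; 3L to 5L.
Minimum figures are presented in boldface font. OLR = ordered logistic regression; CD = complementary dimensions; LF = latent factor.

| Group | Nonparametric | OLR + CD | OLR + CD, age and gender | OLR + CD, age, age² and gender | OLR + CD + LF | OLR + CD + LF, age and gender | OLR + CD + LF, age, age² and gender |
|---|---|---|---|---|---|---|---|
| **Mean absolute error** | | | | | | | |
| All | 0.0811 | **0.0706** | 0.0708 | 0.0707 | 0.0756 | 0.0772 | 0.0753 |
| COPD/asthma | 0.1010 | 0.0874 | 0.0881 | 0.0882 | **0.0815** | 0.0873 | 0.0876 |
| Diabetes | 0.0688 | 0.0554 | 0.0550 | 0.0549 | **0.0500** | 0.0512 | 0.0510 |
| Liver disease | 0.0667 | 0.0535 | 0.0523 | 0.0523 | **0.0434** | 0.0469 | 0.0467 |
| RA/arthritis | 0.0984 | 0.0838 | 0.0837 | 0.0837 | **0.0781** | 0.0824 | 0.0826 |
| CVD | 0.1042 | 0.0981 | 0.0989 | 0.0990 | **0.0912** | 0.0986 | 0.0989 |
| Stroke | 0.1206 | 0.1047 | 0.1048 | 0.1048 | **0.0954** | 0.1020 | 0.1021 |
| Depression | 0.0848 | 0.0720 | 0.0716 | 0.0712 | **0.0613** | 0.0687 | 0.0684 |
| Personality disorders | 0.0718 | 0.0656 | 0.0657 | 0.0658 | **0.0601** | 0.0644 | 0.0644 |
| Students | 0.1075 | 0.1040 | 0.1039 | 0.1038 | **0.0804** | 0.1012 | 0.1010 |
| Other | 0.0639 | 0.0529 | 0.0547 | 0.0555 | **0.0417** | 0.0521 | 0.0530 |
| **Root mean squared error** | | | | | | | |
| All | 0.1101 | **0.1016** | 0.1019 | 0.1018 | 0.1145 | 0.1163 | 0.1151 |
| COPD/asthma | 0.1356 | 0.1223 | 0.1236 | 0.1237 | **0.1208** | 0.1259 | 0.1261 |
| Diabetes | 0.0934 | 0.0836 | 0.0833 | 0.0833 | **0.0794** | 0.0827 | 0.0825 |
| Liver disease | 0.0834 | 0.0761 | 0.0757 | 0.0757 | **0.0672** | 0.0725 | 0.0725 |
| RA/arthritis | 0.1296 | 0.1185 | 0.1185 | 0.1186 | **0.1109** | 0.1200 | 0.1204 |
| CVD | 0.1479 | 0.1447 | 0.1457 | 0.1460 | **0.1328** | 0.1479 | 0.1483 |
| Stroke | 0.1595 | 0.1392 | 0.1388 | 0.1389 | **0.1337** | 0.1390 | 0.1390 |
| Depression | 0.1137 | 0.1052 | 0.1051 | 0.1052 | **0.0949** | 0.1056 | 0.1059 |
| Personality disorders | 0.0924 | 0.0941 | 0.0938 | 0.0937 | **0.0909** | 0.0954 | 0.0953 |
| Students | 0.1550 | 0.1579 | 0.1580 | 0.1579 | **0.1118** | 0.1586 | 0.1582 |
| Other | 0.0826 | 0.0773 | 0.0779 | 0.0781 | **0.0648** | 0.0778 | 0.0778 |
| **AIC** | | | | | | | |
| All | 21315 | 19950 | 19647 | 19639 | 19459 | 19138 | **19127** |
| COPD/asthma | 18836 | 17689 | 17435 | 17428 | 17262 | 16997 | **16987** |
| Diabetes | 20114 | 18825 | 18555 | 18549 | 18367 | 18077 | **18068** |
| Liver disease | 19848 | 18597 | 18308 | 18300 | 18157 | 17850 | **17839** |
| RA/arthritis | 18533 | 17436 | 17171 | 17167 | 17024 | 16745 | **16740** |
| CVD | 19505 | 18186 | 17907 | 17902 | 17752 | 17459 | **17452** |
| Stroke | 16866 | 15919 | 15648 | 15643 | 15522 | 15232 | **15222** |
| Depression | 19853 | 18611 | 18323 | 18316 | 18154 | 17851 | **17841** |
| Personality disorders | 19260 | 17810 | 17540 | 17535 | 17299 | 17017 | **17009** |
| Students | 19241 | 17881 | 17594 | 17583 | 17457 | 17150 | **17135** |
| Other | 19778 | 18501 | 18260 | 18248 | 18017 | 17758 | **17740** |
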

### Comparison With the DSU Approach

The DSU approach was based on a set of copulas for each dimension without deleting any cases but using a mixture error distribution. It included age and gender as covariates as well as a latent factor common to all dimensions. The qualitative and quantitative comparisons below were guided by a list of conditions that the DSU researchers suggested a mapping model satisfy. The text in italics is quoted from Hernandez-Alava et al. (Hernandez-Alava and Pudney, "Econometric modelling of multiple self-reports of health states: The switch from EQ-5D-3L to EQ-5D-5L in evaluating drug therapies for rheumatoid arthritis").
The model should:

1. *Treat the 3L and 5L responses symmetrically.* The best road to work is not necessarily the best road home. Symmetry may well be an unnecessary restriction that is likely to lead to suboptimal predictions.
2. *Avoid the assumption that the 5L response scale is simply a more detailed categorization than the 3L scale of the same underlying concept.* The EQ-5D-5L was developed to provide a more graded, sensitive, and responsive descriptive system than the EQ-5D-3L (Herdman, Gudex, Lloyd, et al., "Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L)"). Aside from the expansion of the number of problem levels for each dimension, deviations from the EQ-5D-3L in wording and formatting are minimal. Accordingly, the aforementioned assumption would seem to be justified.
3. *Allow for the effects of covariates.* The approaches applied in this research accommodated the effects of various regressors, including problems in other EQ-5D-3L dimensions, age, and gender. Conversely, the DSU approach does not allow for the generation of predictions without conditioning on age and gender.
4. *Capture the strong association between 3L and 5L responses within each health domain, without necessarily assuming that the strength of the association is the same in all parts of the health distribution.* The nonparametric approach described in this article does not assume that the strength of the association between EQ-5D-3L and EQ-5D-5L responses is constant across the health continuum. All of the ordinal logistic regression models captured the association between domain-specific problems as measured by the EQ-5D-3L and EQ-5D-5L via the inclusion of dummy variables. However, the models did assume proportionality, which means that for any split of response variable categories (eg, no problems vs slight or more, extreme problems vs severe or less), the parameter estimates would remain unchanged.
5. *Be sufficiently flexible to fit the diverse response patterns... so we generalize the usual assumption of normally distributed errors by allowing for a 2-part normal mixture distribution.* Appendix 1 in the DSU report shows that 86.5% of cases fell into a category with a mean of 0.151 and a variance of 0.373, whereas 13.5% of cases belonged to a category with a mean of -0.976 and a variance of 3.947 (Hernandez-Alava, Wailoo, and Pudney). Respondents with a high likelihood of belonging to this latter group affected the central estimate less than respondents belonging to the first group. This may be a concern for the respondents who were excluded from estimation in the current study due to inconsistent data because these individuals likely resembled those in the second group. Aside from this, each dimension may have its own mixture allowing for a respondent’s data to contribute differentially to different dimensions. This reads like the econometric version of pairwise deletion, which the DSU researchers once labeled as a dangerous practice.
6. *Allow dependence across the five domains of EQ-5D ... incorporating a random latent factor influencing responses in all domains.* Ordinal logistic regression can be adapted to allow for dependence in the problem levels for different dimensions, as has been done in this research.

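
The 2-part mixture figures quoted under condition 5 can be checked with a little arithmetic. The sketch below applies the standard formulas for the mean and variance of a finite mixture to the reported weights, means, and variances; the interpretation in the closing note is an observation from that arithmetic, not a statement from the DSU report.

```python
# Component weights, means, and variances as quoted from the DSU report.
weights = [0.865, 0.135]
means = [0.151, -0.976]
variances = [0.373, 3.947]

# Mixture mean: weighted average of component means.
mix_mean = sum(w * m for w, m in zip(weights, means))

# Mixture variance: E[X^2] - (E[X])^2, with E[X^2] = sum of w * (var + mean^2).
second_moment = sum(w * (v + m ** 2) for w, m, v in zip(weights, means, variances))
mix_var = second_moment - mix_mean ** 2

print(round(mix_mean, 3), round(mix_var, 3))
```

The overall mean comes out close to 0 and the overall variance close to 1, which suggests that the mixture behaves like a standardized error term in which a small, widely dispersed component sits below a large, tightly concentrated one.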
Predictions using the DSU approach were calculated using the published Stata command, EQ5DMAP, including information about age and gender and choosing the Model=EQGcopula option. In their analyses, the DSU researchers excluded eight 15-year-old respondents and two 14-year-old respondents. When comparing the observed and predicted values from the DSU approach with those of the ordinal logistic regression excluding age and gender, the former yielded a mean error of 0.0006, MAE of 0.0804, and RMSE of 0.1168, while the latter yielded a mean error of 0.001, MAE of 0.0706, and RMSE of 0.1016.
With respect to their mappings from the EQ-5D-5L to the EQ-5D-3L, the DSU researchers used a type of league table to show that their approach performed better than the EuroQol researchers’ crosswalk in a selection of groups defined by age and gender (Hernandez-Alava, Wailoo, and Pudney). Table 4 presents prediction accuracy indices for the copula-based mapping of EQ-5D-3L responses to utilities based on the EQ-5D-5L value set along with models developed in this research. The copula-based approach consistently performed the worst in terms of MAE. It also performed poorly regarding comparisons of RMSE, being judged the least accurate model in 9 of 14 comparisons across age and gender strata.
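Table 4. Mean absolute error and root mean squared error by age and gender; 3L to 5L.

The model values with the lowest mean errors for the various age/gender groups are presented in boldface font. MAE indicates mean absolute error; RMSE, root mean squared error; OLR, ordered logistic regression.

| Age group | MAE: OLR, no age and gender, -Latent | MAE: OLR, no age and gender, +Latent | MAE: OLR, age, age² and gender, -Latent | MAE: OLR, age, age² and gender, +Latent | MAE: Copula | RMSE: OLR, no age and gender, -Latent | RMSE: OLR, no age and gender, +Latent | RMSE: OLR, age, age² and gender, -Latent | RMSE: OLR, age, age² and gender, +Latent | RMSE: Copula |
|---|---|---|---|---|---|---|---|---|---|---|
| **Males** | | | | | | | | | | |
| <26 | 0.0518 | **0.0491** | 0.0526 | 0.0495 | 0.0528 | 0.0749 | 0.0752 | **0.0742** | **0.0742** | 0.0745 |
| 26-35 | 0.1026 | **0.1014** | 0.1029 | 0.1015 | 0.1091 | **0.1812** | 0.1835 | 0.1826 | 0.1855 | 0.1872 |
| 36-45 | 0.0654 | 0.0613 | 0.0640 | **0.0594** | 0.0669 | 0.1083 | **0.1068** | 0.1078 | 0.1070 | 0.1092 |
| 46-55 | 0.0803 | 0.0768 | 0.0786 | **0.0748** | 0.0821 | 0.1210 | 0.1211 | **0.1207** | 0.1215 | 0.1248 |
| 56-65 | 0.0820 | 0.0786 | 0.0807 | **0.0768** | 0.0833 | 0.1130 | **0.1125** | 0.1128 | 0.1129 | 0.1144 |
| 65-75 | 0.0733 | 0.0714 | 0.0729 | **0.0706** | 0.0745 | **0.1049** | 0.1056 | **0.1049** | 0.1060 | 0.1051 |
| >75 | 0.0991 | **0.0979** | 0.0994 | 0.0980 | 0.1031 | 0.1360 | 0.1370 | **0.1359** | 0.1367 | 0.1381 |
| **Females** | | | | | | | | | | |
| <26 | 0.0596 | **0.0573** | 0.0613 | 0.0587 | 0.0632 | 0.0861 | 0.0864 | **0.0858** | 0.0859 | 0.0881 |
| 26-35 | 0.0711 | 0.0678 | 0.0711 | **0.0676** | 0.0754 | 0.1040 | **0.1037** | 0.1039 | 0.1041 | 0.1130 |
| 36-45 | 0.0638 | 0.0596 | 0.0627 | **0.0578** | 0.0638 | 0.0897 | 0.0881 | 0.0886 | **0.0871** | 0.0889 |
| 46-55 | 0.0882 | 0.0863 | 0.0878 | **0.0859** | 0.0897 | **0.1238** | 0.1253 | 0.1243 | 0.1270 | 0.1269 |
| 56-65 | 0.0834 | 0.0818 | 0.0830 | **0.0813** | 0.0837 | **0.1193** | 0.1202 | 0.1195 | 0.1210 | 0.1205 |
| 65-75 | 0.0886 | **0.0868** | 0.0892 | 0.0871 | 0.0891 | **0.1173** | 0.1182 | 0.1180 | 0.1189 | 0.1184 |
| >75 | 0.1025 | **0.1008** | 0.1045 | 0.1024 | 0.1078 | **0.1404** | 0.1407 | 0.1411 | 0.1410 | 0.1435 |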
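As an illustration of how the lowest-error model in a Table 4 row can be identified, the sketch below transcribes two rows of the table and returns every model tied for the minimum. The model labels and the helper name `best_models` are shorthand introduced here, not names used by the authors.

```python
# Column order follows Table 4; the labels are shorthand introduced for this sketch.
MODELS = [
    "OLR, no age/gender, -Latent",
    "OLR, no age/gender, +Latent",
    "OLR, age/age2/gender, -Latent",
    "OLR, age/age2/gender, +Latent",
    "Copula",
]

# Two rows transcribed from Table 4.
TABLE_ROWS = {
    ("Males", "<26"): {
        "MAE": [0.0518, 0.0491, 0.0526, 0.0495, 0.0528],
        "RMSE": [0.0749, 0.0752, 0.0742, 0.0742, 0.0745],
    },
    ("Females", "36-45"): {
        "MAE": [0.0638, 0.0596, 0.0627, 0.0578, 0.0638],
        "RMSE": [0.0897, 0.0881, 0.0886, 0.0871, 0.0889],
    },
}

def best_models(errors):
    """Return every model tied for the lowest error in one table row."""
    lowest = min(errors)
    return [model for model, err in zip(MODELS, errors) if err == lowest]

best = {
    (stratum, metric): best_models(values)
    for stratum, metrics in TABLE_ROWS.items()
    for metric, values in metrics.items()
}
```

Returning a list rather than a single label keeps ties visible, as in the RMSE comparison for males under 26, where two models share the value 0.0742.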
As previously illustrated, conditioning on age and gender can affect the ranking of health states based on predicted utilities. It should never, however, produce a situation in which a logically better state receives a lower predicted utility than a logically worse state, that is, one in which the effect of age overshadows the effect of worsened health. Examples of such pairs are states 21113 and 21123, or 11211 and 11311. Indeed, such logical inconsistencies were not observed for any of the ordinal logistic regression models fit in this research. However, the same cannot be said of the copula-based approach, for which the effects of age and gender were more pronounced. For example, when applied to the data used in this investigation, the approach yielded a predicted utility of 0.640 for a 32-year-old respondent in EQ-5D-3L state 21113 and a predicted utility of 0.642 for a 75-year-old respondent in EQ-5D-3L state 21123. Additionally, the approach generated a predicted utility of 0.911 for a 54-year-old female in EQ-5D-3L state 11311, while predicted utilities for males and females under the age of 30 years in EQ-5D-3L state 11211 were consistently below this value.
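A check for this kind of logical inconsistency can be sketched as follows. State A is treated as logically better than state B when it is no worse on every dimension and strictly better on at least one. The function names are hypothetical; the two predictions are the copula-based values quoted in the text.

```python
def dominates(better, worse):
    """True if `better` is no worse on every dimension and better on at least one.

    States are 5-character strings such as "21113"; a lower digit means fewer problems.
    """
    pairs = list(zip(better, worse))
    return all(b <= w for b, w in pairs) and any(b < w for b, w in pairs)

def find_inconsistencies(predictions):
    """Flag pairs where a logically better state has the lower predicted utility.

    predictions: list of (label, state, predicted_utility) tuples.
    """
    flagged = []
    for label_a, state_a, utility_a in predictions:
        for label_b, state_b, utility_b in predictions:
            if dominates(state_a, state_b) and utility_a < utility_b:
                flagged.append((label_a, label_b))
    return flagged

# Copula-based predictions quoted in the text.
predictions = [
    ("32-year-old in 21113", "21113", 0.640),
    ("75-year-old in 21123", "21123", 0.642),
]
flagged = find_inconsistencies(predictions)
```

Here the pair is flagged because state 21113 is better than 21123 on pain/discomfort and equal elsewhere, yet its predicted utility is lower.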

## Discussion

When members of the EuroQol Group developed their initial crosswalk (van Hout et al, 2012), it was not anticipated there would be a need for predicting utilities based on EQ-5D-5L value sets from EQ-5D-3L responses. Since then, the NICE DSU published an alternative mapping function and compared its approach to map EQ-5D-5L responses to EQ-5D-3L value sets with that of the EuroQol Group. This comparison included only the nonparametric model developed by the EuroQol researchers; no comparisons with the researchers’ other investigated approaches were rendered. The DSU model has the advantage of being able to facilitate mappings from the EQ-5D-3L to EQ-5D-5L value sets. This article summarizes the results of using the EuroQol approach to do so, and our findings show that ordinal logistic regressions including regressors coding for other EQ-5D-3L dimensions predict values based on the English EQ-5D-5L value set more accurately than the DSU approach.
A number of factors may explain the between-model performance differences observed in this research. Each EQ-5D dimension is essentially continuous, though respondents can only categorize themselves as having no, moderate, or extreme problems using the EQ-5D-3L. When completing the EQ-5D-3L, there is a form of censoring whereby a respondent’s true problem level for a dimension is unobserved and substituted with a discretized value. For a given dimension, it is easy to imagine that the distribution of true problem levels affected by this censoring is shifted to the right (meaning worse) for respondents with more extreme problems in other dimensions as well as for older respondents. The ordinal logistic regressions fit in this research suggest that dummy variables coding for problem levels in other dimensions effectively captured this phenomenon. This reflects the intercorrelation among EQ-5D dimensions. A correlation analysis revealed that anxiety/depression had the lowest correlation with mobility, while usual activities had the highest correlation with anxiety/depression (results not shown). These findings are in accordance with the magnitude and significance of the parameter estimates for variables coding for problems in other dimensions (Table 2). However, similar relationships were not observed when considering self-care, usual activities, and pain/discomfort.
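To make the idea of dummy variables coding for problem levels in other dimensions concrete, the sketch below encodes an EQ-5D-3L state into indicator variables for every dimension except the one being modeled, then forms a linear predictor. The variable naming scheme and the coefficient values are hypothetical and are not estimates from Table 2; the authors' actual specification may differ.

```python
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)

def other_dimension_dummies(state_3l, target):
    """Level-2 and level-3 indicators for every dimension except `target`."""
    if len(state_3l) != 5 or any(c not in "123" for c in state_3l):
        raise ValueError("expected a 5-digit EQ-5D-3L state such as '21113'")
    if target not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {target}")
    dummies = {}
    for name, level in zip(DIMENSIONS, state_3l):
        if name == target:
            continue  # the target dimension is the outcome, not a regressor
        dummies[f"{name}_level2"] = int(level == "2")
        dummies[f"{name}_level3"] = int(level == "3")
    return dummies

def linear_predictor(dummies, coefficients):
    """Sum of coefficient times indicator; missing coefficients count as zero."""
    return sum(coefficients.get(name, 0.0) * value for name, value in dummies.items())

# Hypothetical coefficients, for illustration only.
HYPOTHETICAL_COEFS = {"pain_discomfort_level2": 0.4, "anxiety_depression_level3": 0.9}

x = other_dimension_dummies("21113", target="mobility")
shift = linear_predictor(x, HYPOTHETICAL_COEFS)
```

A positive shift of this kind moves the predicted distribution for the target dimension toward worse levels, which is the "shifted to the right" effect described above.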
Adjusting for the influence of problem levels in other dimensions, age had only a very limited influence on model fit and predictive accuracy. The DSU model does not condition on the problem levels for other dimensions and may assume that their effects are captured by the incorporation of a latent factor for each dimension. When applying the DSU model, age and gender were identified as significant factors in the estimation, and when comparing logically dominant health states, almost illogically so. We speculate that this was a consequence of imposing a “symmetric” model, which demands that one take the same road home as taken to work.
The inclusion of age and gender in the DSU model may have untoward consequences for economic evaluations that use predictions derived from it. The EQ-5D is commonly administered in clinical trials to generate utility data to inform cost-effectiveness analyses. While an enrolled trial sample will generally reflect the characteristics of the intended patient population, this is not always the case. Accordingly, the incorporation of demographic characteristics in the DSU model could result in biases in mapped utilities (whether mapping from the EQ-5D-3L to EQ-5D-5L or vice versa) when trial sample characteristics are not reflective of the population of interest. The implications of this for economic modeling and subsequent decision making are unknown and warrant further study.
Building on the DSU modeling approach, our ordinal logistic regressions were adapted by including a latent factor, common to each dimension, reflecting unobserved heterogeneity. Although the latent factor was statistically significant and improved the predictive power of the models, differences between predictions with and without it were small. Regardless, we advocate for its inclusion in the final selected model. The interpretation of the latent factor may differ between the models. In the ordinal logistic regression model fit without adjustment for problems in other EQ-5D-3L dimensions, the latent factor may have captured the influence of general health. In the models fit without age and gender, it may have captured the effects of these variables, and in all models it may have reflected the influence of individual response strategies. It was expected that the variance of the latent factor would decrease with the inclusion of more regressors, and indeed a reduction from 1.19 to 1.12 was observed after including dummy variables coding for problems in other EQ-5D-3L dimensions. However, the variance of the latent factor increased to 1.16 with the addition of age, age-squared, and gender to the models.
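One way to see what a latent factor does to predicted level probabilities is to average an ordinal logistic model over a normally distributed random effect. The sketch below is an assumption-laden illustration, not the authors' estimation code: the thresholds are hypothetical, the normal distribution of the latent factor is assumed, and the integral is approximated with a simple midpoint rule. Only the variance of 1.12 is taken from the text.

```python
import math

def level_probabilities(eta, thresholds):
    """Ordinal logit: P(Y <= k) = logistic(threshold_k - eta), differenced into levels."""
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

def marginal_probabilities(eta, thresholds, variance, n_points=201, width=6.0):
    """Average level probabilities over a N(0, variance) latent factor (midpoint rule)."""
    sd = math.sqrt(variance)
    step = 2.0 * width * sd / n_points
    totals = [0.0] * (len(thresholds) + 1)
    weight_sum = 0.0
    for i in range(n_points):
        z = -width * sd + (i + 0.5) * step    # grid point for the latent factor
        weight = math.exp(-0.5 * (z / sd) ** 2)  # unnormalized normal density
        weight_sum += weight
        for k, p in enumerate(level_probabilities(eta + z, thresholds)):
            totals[k] += weight * p
    return [t / weight_sum for t in totals]

HYPOTHETICAL_THRESHOLDS = [-2.0, -1.0, 1.0, 2.0]  # four cut points give five levels
conditional = level_probabilities(0.0, HYPOTHETICAL_THRESHOLDS)
marginal = marginal_probabilities(0.0, HYPOTHETICAL_THRESHOLDS, variance=1.12)
```

In this toy setting the latent factor spreads probability toward the extreme levels relative to the model without it, while the overall shape changes only modestly.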
It is notable that the current iteration of EQ5DMAP only allows mapping to UK/English value sets (Hernandez-Alava and Pudney, 2018). This is particularly constraining as NICE does not advocate use of the existing EQ-5D-5L English value set. The models developed in this research have been encoded in an R program that is freely accessible and easily adapted to other EQ-5D-5L value sets.
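The reason a mapping of this kind adapts readily to other value sets can be sketched as follows: once the model yields a probability for each EQ-5D-5L level in each dimension, the expected utility is a probability-weighted sum of value-set decrements, so the value set is just an input table. The sketch is in Python rather than the authors' R program, assumes a purely additive value set with no interaction terms, and uses placeholder decrements that are not taken from any published value set.

```python
def expected_utility(level_probs, decrements):
    """Expected utility under an additive value set.

    level_probs: dimension -> list of 5 probabilities (levels 1 to 5).
    decrements:  dimension -> list of 5 utility decrements (level 1 is 0).
    """
    utility = 1.0
    for dimension, probs in level_probs.items():
        if abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {dimension} do not sum to 1")
        utility -= sum(p * d for p, d in zip(probs, decrements[dimension]))
    return utility

DIMENSIONS = ["mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression"]

# Placeholder decrements, NOT a real value set; swap in another table to change country.
PLACEHOLDER_VALUE_SET = {d: [0.0, 0.05, 0.10, 0.20, 0.30] for d in DIMENSIONS}

no_problems = [1.0, 0.0, 0.0, 0.0, 0.0]
probs = {d: no_problems for d in DIMENSIONS}
probs["mobility"] = [0.0, 0.5, 0.5, 0.0, 0.0]  # split between levels 2 and 3

full_health = expected_utility({d: no_problems for d in DIMENSIONS}, PLACEHOLDER_VALUE_SET)
mixed = expected_utility(probs, PLACEHOLDER_VALUE_SET)
```

Value sets that include interaction or threshold terms would need the joint distribution of levels across dimensions rather than the per-dimension probabilities used here.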
Finally, it is important to acknowledge 2 limitations of this research. First, a decision was made to exclude cases with inconsistent data from the model estimation. In reviewing prior work, the DSU researchers argued for the use of mixture models to accommodate heterogeneous population subgroups (Hernandez-Alava, Wailoo, and Pudney, 2017). By excluding respondents with inconsistent data, it is conceivable that estimates may have been biased due to selection. However, while the inclusion of all respondents has a certain democratic appeal, some individuals may deliberately provide bad data or unintentionally do so (eg, due to cognitive impairment). Additionally, mixture models are not without their limitations, including difficulty in identifying subgroups and appropriate mixture distributions as well as computational burden. Second, while ordinal logistic regression is a powerful tool as compared to other discrete choice estimators, it is founded on the assumption of proportionality (or proportional odds). Violations of this assumption suggest the need to consider alternative specifications, such as the generalized logistic regression for ordinal variables. While a test of the proportional odds assumption can be readily performed for ordinal models fit using pre-programmed routines in statistical packages, no such option was available for the models coded de novo in this research, and a means of extrapolating the test to these models was not readily apparent.
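For readers unfamiliar with the proportional odds assumption, it can be stated in standard textbook notation as follows. The symbols are introduced here for illustration and are not taken from the article: $Y$ is the ordinal response with $K$ levels ($K = 5$ for an EQ-5D-5L dimension), $x$ is the regressor vector, $\tau_k$ are cut points, and $\beta$ is the slope vector.

```latex
% Proportional odds: a single slope vector \beta is shared by all K-1 cut points
\operatorname{logit} P(Y \le k \mid x) = \tau_k - x^{\top}\beta,
\qquad k = 1, \dots, K-1

% Generalized ordinal logit: the slopes \beta_k may differ by cut point
\operatorname{logit} P(Y \le k \mid x) = \tau_k - x^{\top}\beta_k,
\qquad k = 1, \dots, K-1
```

Testing the assumption amounts to asking whether the restriction $\beta_1 = \dots = \beta_{K-1}$ is compatible with the data.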

## Conclusions

This research presents novel algorithms to map EQ-5D-3L responses to EQ-5D-5L value sets. The recommended approach for use in applications is based on an ordinal logistic regression that excludes age and gender and accounts for unobserved heterogeneity using a latent factor. This model has a number of practical advantages over an alternative copula-based approach for mapping to EQ-5D-5L value sets. Notably, it exhibits superior predictive accuracy irrespective of age and gender and is not subject to producing logically inconsistent predictions. The proposed model is recommended for interim use during the period in which end users transition to full adoption of the EQ-5D-5L.

## Article and Author Information

Author Contributions: Concept and design: van Hout
Analysis and interpretation of data: van Hout, Shaw
Drafting of the manuscript: van Hout, Shaw
Critical revision of the paper for important intellectual content: Shaw
Statistical analysis: van Hout
Obtaining funding: van Hout
Administrative, technical, or logistic support: van Hout
Supervision: van Hout
Conflict of Interest Disclosures: Dr Shaw is an employee and stockholder of Bristol-Myers Squibb. No other disclosures were reported.
Funding/Support: The work of Dr van Hout was supported by a grant from the EuroQol foundation. Dr Shaw did not receive any financial support for this research.
Role of the Funder/Sponsor: The funder designed and organized the data collection which was used in an earlier study. It did not have any role in the management, analysis, and interpretation of the data; the preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.

## Acknowledgment

The authors thank Simon Pickard, Paul Kind, and 2 anonymous reviewers for helpful comments on earlier versions of the manuscript.

## Supplemental Material

• Appendix A
• Appendix B

## References

• EuroQol Group. EuroQol—a new facility for the measurement of health-related quality of life. Health Policy. 1990; 16: 199-208
• Herdman M, Gudex C, Lloyd A, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Quality of Life Research. 2011; 20: 1727-1736
• Buchholz I, Janssen MF, Kohlmann T, Feng YS. A systematic review of studies comparing the measurement properties of the three-level and five-level versions of the EQ-5D. Pharmacoeconomics. 2018; 36: 645-661
• Pickard AS, Neary MP, Cella D. Estimation of minimally important differences in EQ-5D utility and VAS scores in cancer. Health and Quality of Life Outcomes. 2007; 5
• Khan I, Morris S, Pashayan N, Matata B, Bashir Z, Maguirre J. Comparing the mapping between EQ-5D-5L, EQ-5D-3L and the EORTC-QLQ-C30 in non-small cell lung cancer patients. Health and Quality of Life Outcomes. 2016; 14: 60
• Thompson A, Turner A. A comparison of the EQ-5D-3L and EQ-5D-5L. Pharmacoeconomics. 2020; 38: 575-591
• van Hout B, Janssen MF, Feng Y-S, et al. Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets. Value in Health. 2012; 15: 708-715
• Hernandez-Alava M, Wailoo A, Pudney S. Methods for mapping between the EQ-5D-5L and the 3L for technology appraisal. Report by the Decision Support Unit. Decision Support Unit, ScHARR, University of Sheffield, Sheffield, UK; 2017
• Devlin NJ, Shah KK, Feng Y, Mulhern B, van Hout B. Valuing health-related quality of life: An EQ-5D-5L value set for England. Health Economics. 2018; 27: 7-22
• Hernandez-Alava M, Pudney S. eq5dmap: a command for mapping from 3-level to 5-level EQ-5D. The Stata Journal. 2018; 18: 395-415
• Shaw JW, Bennett B, Trigg A, DeRosa M, Taylor F, Cocks K. Associations between the EQ-5D-3L and QLU-C10D descriptive systems: use of correlation networks to explore preference differences in solid tumor trials. 25th Annual International Meeting of the International Society for Pharmacoeconomics and Outcomes Research, May 18-20, 2020. Abstract PCN317.
• Shaw JW, Bennett B, Trigg A, et al. A comparison of generic and condition-specific preference-based measures using data from nivolumab trials: EQ-5D-3L, a mapping to the EQ-5D-5L, and QLU-C10D. Manuscript submitted for publication.
• Hernandez-Alava M, Pudney S. Econometric modelling of multiple self-reports of health states: The switch from EQ-5D-3L to EQ-5D-5L in evaluating drug therapies for rheumatoid arthritis. Journal of Health Economics. 2017; 55: 139-152