Patient-Reported Outcomes | Volume 23, Issue 8, P1056-1062, August 01, 2020

# Normative Estimates and Agreement Between 2 Measures of Health-Related Quality of Life in Older People With Frailty: Findings From the Community Ageing Research 75+ Cohort

Open Archive | Published: July 17, 2020

## Highlights

• International guidelines identify development, evaluation, and implementation of new interventions to improve quality of life for older people with frailty as a key priority.
• Evidence on normative estimates and agreement for different measures of health-related quality of life across the spectrum of frailty is critical for designing interventions and cost-effectiveness evaluation.
• Researchers should consider using the short-form 36-item health questionnaire or the short-form 6-dimension if fit older people are the planned target. In interventions involving older people with increasing frailty, both the EuroQol 5-dimension health questionnaire and short-form 6-dimension should be included.

## Abstract

### Background

Previous studies have summarized evidence on health-related quality of life for older people, identifying a range of measures that have been validated, but have not sought to present results by degree of frailty. Furthermore, previous studies did not typically use quality-of-life measures that generate an overall health utility score. Health utility scores are a necessary component of quality-adjusted life-year calculations used to estimate the cost-effectiveness of interventions.

### Methods

We calculated normative estimates, as means and standard deviations, for the EuroQol 5-dimension 5-level health questionnaire (EQ-5D-5L), short-form 36-item health questionnaire (SF-36), and short-form 6-dimension (SF-6D) for a range of established frailty models. We compared response distributions across dimensions of the measures and investigated agreement using Bland-Altman and intraclass correlation techniques.

### Results

The EQ-5D-5L, SF-36, and SF-6D scores decrease and their variability increases with advancing frailty. There is strong agreement between the EQ-5D-5L and SF-6D across the spectrum of frailty. Agreement is lower for people who are most frail, indicating that different components of the 2 instruments may have greater relevance for people with advancing frailty in later life. There is a greater risk of ceiling effects using the EQ-5D-5L rather than the SF-6D.

### Conclusions

We recommend the SF-36/SF-6D as an appropriate measure of health-related quality of life for clinical trials if fit older people are the planned target. In trials of interventions involving older people with increasing frailty, we recommend that both the EQ-5D-5L and SF-36/SF-6D are included, and are used in sensitivity analyses as part of cost-effectiveness evaluation.

## Introduction

The World Health Organization defines quality of life (QOL) as “an individual’s perceptions of their position in life, in the context of the culture and value systems in which they live, and in relation to their goals, expectations, standards and concerns” (WHOQOL study protocol, 1993). International guidelines identify development, evaluation, and implementation of new interventions and services to improve QOL for older people with frailty as a key priority (British Geriatrics Society, “Fit for frailty”; National Institute for Health and Care Excellence, “Multimorbidity: clinical assessment and management”). A 2012 consensus report by the US Institute of Medicine has recommended a focus on QOL outcome measures for research and program evaluation of interventions for people living with long-term health conditions (“Living well with chronic illness: a call for public health action”).
Previous reviews have summarized evidence on health-related quality of life (HRQOL) measures for older people, identifying a wide range of measures that have been validated, but have not sought to present results stratified by degree of frailty (Haywood et al., 2006). More recently, reviews have focused on QOL for people living with frailty, but have identified limitations in the evidence base (Crocker et al., 2019). For example, although frailty is best understood as a graded condition, with evidence for the existence of mild frailty, or prefrailty, that typically precedes development of more advanced frailty, studies typically dichotomized frailty into “not frail” and “frail” categories (Crocker et al., 2019). Furthermore, included studies did not typically use QOL measures that enable generation of an overall health utility score, whereby individual health profiles are converted into single utility scores by applying preexisting weights based on preferences of the general population. Health utility scores are a necessary component of quality-adjusted life-year calculations used in health economic evaluations to estimate value for money of interventions.
Notably, no studies have evaluated the EuroQol 5-dimension health questionnaire (EQ-5D) in frailty as a well-established measure of HRQOL that enables generation of a health utility score. The EQ-5D is the preferred UK National Institute for Health and Care Excellence (NICE) measure of HRQOL in adults (NICE, “Guide to the methods of technology appraisal 2013”) and is also the most evaluated HRQOL measure internationally (Rowen et al., 2017; RTI Health Solutions, “Summary of guidance on health-utility measures by selected health technology assessment agencies”). Furthermore, although previous studies have evaluated the short-form 36-item health questionnaire (SF-36) in frailty (Crocker et al., 2019), none evaluated the short-form 6-dimension (SF-6D) health utility score that can be derived from the SF-36 for health economic modeling. Although the EQ-5D and SF-6D both enable derivation of a health utility score, and have been demonstrated to converge at the aggregate level, there is ongoing uncertainty regarding differences across patient groups and illness severity (Brazier et al., 2004). The 2 measures differ in their dimensions, items, and preference weights and therefore can potentially assign different utility scores to the same individual (Brazier et al., 2004; Barton et al., 2008). Furthermore, the SF-6D has the potential to tap into broader aspects of HRQOL through its role and social functioning dimensions. These are particularly salient for the population of older people living with frailty because they often have complex health and social care needs and thus the social value of an intervention may be more important than health improvement.
The absence of evidence on health utility scores for older people living with frailty is problematic, because normative estimates are needed for design of clinical trials to evaluate new interventions. Furthermore, HRQOL estimates for people living with different grades of frailty inform development of robust economic models, for example, decision analytic cost-effectiveness models that incorporate transition between frailty categories. Also, investigation of agreement between different HRQOL measures across different frailty categories would help inform selection of instruments for both observational research studies and clinical trials.

### Objectives

To report normative estimates for the EQ-5D and SF-6D for a range of established frailty measures, compare response distribution across dimensions of the 2 HRQOL measures, and investigate agreement between the EQ-5D and SF-6D using Bland-Altman (BA) and intraclass correlation techniques.

## Methods

### Study Design

Secondary analysis of prospective cohort data from the Community Ageing Research 75+ study, collected between December 2014 and November 2018.

### Setting

Multisite, community-based cohort study, recruiting from UK general practices across a range of urban and rural areas, with wide sociodemographic representation (Heaven et al., 2019).

### Participants

People aged 75 years and older and living at home were eligible. Care home residents, people living at home and bedbound, and people in the terminal stage of life were excluded.

### Variables

#### Phenotype model

The phenotype model of frailty, based on the 5 physical characteristics as reported in the original Cardiovascular Health Study (slow walking speed, weight loss, exhaustion, weak grip strength, low energy expenditure), uses standardized cut points (Fried et al., 2001). Those with no characteristics were identified as fit, those with 1 or 2 characteristics as pre-frail, and those with 3 to 5 characteristics as frail.
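The classification rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the example participant are hypothetical, and the sketch assumes each of the 5 characteristics has already been scored as present or absent using the standardized cut points, which are not reproduced here.

```python
def classify_phenotype(n_characteristics: int) -> str:
    """Map the number of phenotype characteristics present (0-5) to a category."""
    if not 0 <= n_characteristics <= 5:
        raise ValueError("the phenotype model counts between 0 and 5 characteristics")
    if n_characteristics == 0:
        return "fit"
    if n_characteristics <= 2:
        return "pre-frail"
    return "frail"


# Hypothetical participant: slow walking speed and exhaustion only.
characteristics = {
    "slow_walking_speed": True,
    "weight_loss": False,
    "exhaustion": True,
    "weak_grip_strength": False,
    "low_energy_expenditure": False,
}
print(classify_phenotype(sum(characteristics.values())))  # pre-frail
```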

#### Cumulative deficit model

The research-standard 60-item frailty index (FI) is based on the cumulative deficit model of frailty and was previously validated as part of the English Longitudinal Study of Ageing (Marshall et al., “Cohort differences in the levels and trajectories of frailty among older people in England”). The FI score is calculated as an equally weighted proportion of the number of deficits present in an individual relative to the total possible. The FI groups individuals into 4 categories: very fit (FI score of 0-0.10), well (>0.10-0.14), vulnerable (>0.14-0.24), and frail (>0.24) (Hubbard et al., “Frailty, financial resources and subjective well-being in later life”).
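As an illustration of the cumulative deficit calculation, the following minimal Python sketch computes an FI score as the proportion of deficits present and maps it to the 4 categories given above. The function names and the example participant are hypothetical; which 60 items count as deficits, and how each is scored, is not specified here and is assumed to have been decided upstream.

```python
def frailty_index(deficits: list[bool]) -> float:
    """Equally weighted proportion of deficits present among those assessed."""
    if not deficits:
        raise ValueError("at least one deficit item is required")
    return sum(deficits) / len(deficits)


def fi_category(score: float) -> str:
    """Map an FI score to the categories quoted in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("an FI score is a proportion between 0 and 1")
    if score <= 0.10:
        return "very fit"
    if score <= 0.14:
        return "well"
    if score <= 0.24:
        return "vulnerable"
    return "frail"


# Hypothetical participant with 9 of 60 deficits present.
items = [True] * 9 + [False] * 51
score = frailty_index(items)
print(round(score, 2), fi_category(score))  # 0.15 vulnerable
```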

#### Electronic frailty index (eFI)

The eFI score is based on the cumulative deficit model of frailty, including 36 variables recorded in the primary care electronic health record as part of routine care. The score is calculated as an equally weighted proportion of the number of deficits present in an individual relative to the total possible. The eFI enables identification of frailty categories: fit (0-0.12), mild frailty (>0.12-0.24), moderate frailty (>0.24-0.36), and severe frailty (>0.36) (Clegg et al., “Development and validation of an electronic frailty index using routine primary care electronic health record data”).

#### EuroQol 5-dimension health questionnaire, 5-level version (EQ-5D-5L)

The 5 dimensions of the EQ-5D-5L are mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each dimension has 5 levels of severity: no problems, slight problems, moderate problems, severe problems, and extreme problems. The levels reported on the 5 dimensions are combined into a 5-digit number describing 1 of 3125 possible health states, which can be converted into a utility index ranging from -0.29 to 1 (0 for dead, 1 for perfect health, and negative values for states worse than death) for use in economic evaluation (Devlin et al., “Valuing health-related quality of life: an EQ-5D-5L value set for England”).
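To make the 5-digit descriptive system concrete, the sketch below enumerates all profiles and decodes one of them. It is illustrative only: the function name is hypothetical, and the sketch deliberately does not convert profiles to utility values, because that step requires the published value set, which is not reproduced here.

```python
from itertools import product

DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (
    "no problems",
    "slight problems",
    "moderate problems",
    "severe problems",
    "extreme problems",
)


def describe_profile(profile: str) -> dict[str, str]:
    """Decode a 5-digit EQ-5D-5L profile such as '21311' into level labels."""
    if len(profile) != 5 or any(ch not in "12345" for ch in profile):
        raise ValueError("a profile is 5 digits, each between 1 and 5")
    return {dim: LEVELS[int(ch) - 1] for dim, ch in zip(DIMENSIONS, profile)}


# Every combination of 5 levels on 5 dimensions.
all_profiles = ["".join(p) for p in product("12345", repeat=5)]
print(len(all_profiles))  # 3125
print(describe_profile("21311")["usual activities"])  # moderate problems
```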

#### SF-36

The RAND SF-36 questionnaire includes 36 questions spanning 8 health domains: physical functioning, bodily pain, role limitations due to physical health problems, role limitations due to personal or emotional problems, general mental health, social functioning, energy/fatigue, and general health perceptions. It also includes a single item that provides an indication of perceived change in health. The SF-36 enables calculation of Physical Component Summary (PCS) and Mental Component Summary (MCS) scores. The SF-36 domain, PCS, and MCS scores are on a 0-100 scale, with higher scores indicating better health.

#### SF-6D

The SF-6D is a health utility score derived from 11 items of the SF-36 questionnaire. The items are converted into a 6-dimension health state classification system with 4 to 6 levels per dimension, allowing for a total of 18 000 unique health states. Dimensions of the SF-6D include physical functioning, role limitations, social functioning, pain, mental health, and vitality. The SF-6D index score has values ranging from 0.29 to 1, with lower values representing worse HRQOL (Janssen et al., “Is EQ-5D-5L better than EQ-5D-3L? A head-to-head comparison of descriptive systems and value sets from seven countries”).
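The figure of 18 000 health states follows from multiplying the number of levels on each dimension. The per-dimension level counts are not listed in the text; the values below are the ones commonly quoted for the SF-36-derived SF-6D and should be read as an assumption of this sketch.

```python
from math import prod

# Assumed number of levels per SF-6D dimension (not stated in the text).
SF6D_LEVELS = {
    "physical functioning": 6,
    "role limitations": 4,
    "social functioning": 5,
    "pain": 6,
    "mental health": 5,
    "vitality": 5,
}

n_states = prod(SF6D_LEVELS.values())
print(n_states)  # 18000
```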

### Methods of Assessment

All measures, except eFI scores, were obtained during face-to-face assessments in the participant’s own home. The eFI scores were obtained directly from primary care electronic health records.

### Bias

All measures were collected using an electronic data capture system, and researchers were unable to review previous scores at follow-up points for the same individual, limiting potential for assessment bias.

### Statistical Methods

We analyzed the 2 HRQOL measures (EQ-5D-5L and SF-6D), the 8 individual dimensions of the SF-36, and the SF-36 PCS and MCS scores. We generated summary statistics for the entire sample and for frailty subsamples. To examine agreement between the 2 HRQOL measures, we used a qualitative technique, the BA plot, and a quantitative technique, the intraclass correlation coefficient (ICC).
The ICC method formally tests significance of agreement in the sample under study. We used the consistency-of-agreement intraclass correlation coefficient (CA-ICC), a 2-way mixed-effects ICC in which the 2 HRQOL measures are modeled as fixed effects, to account for the fact that the 2 utility scores are measured on different scales. The ICC method is dependent on the range of the measurement rather than the actual scale of measurement. Further, a high ICC is based on the assumption that discrepancies in measuring health utility are the same across the possible range of outcomes. This might be considered too restrictive when we compare 2 health utility measures that, under an ordinality assumption, only need to preserve ranking to be equivalent and thus allow for non-constant biases across the different values of the indices. We therefore used an additional, qualitative measure of agreement, the BA plot (Altman and Bland, “Measurement in medicine: the analysis of method comparison studies”; Bland and Altman, “Statistical methods for assessing agreement between two methods of clinical measurement”), which shows variation in agreement over the entire range of values. We performed 2 data transformations for the BA analysis. First, to omit individual-specific clustering, we excluded all but the last observation for each individual. Second, we collapsed the data to averages of the EQ-5D for each value of the SF-6D and averages of the SF-6D for each value of the EQ-5D and retained 1 observation per individual for the analysis.
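The two agreement techniques can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: it computes a single-measure consistency ICC from the usual 2-way ANOVA mean squares, and the Bland-Altman bias with conventional 95% limits of agreement (mean difference ± 1.96 standard deviations). Function names and the example scores are hypothetical, and the sketch omits confidence intervals and the data-collapsing step described above.

```python
from statistics import mean, stdev


def consistency_icc(a, b):
    """Single-measure consistency ICC for paired scores from 2 instruments."""
    a, b = list(a), list(b)
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    n, k = len(a), 2
    grand = mean(a + b)
    row_means = [(x + y) / k for x, y in zip(a, b)]
    col_means = [mean(a), mean(b)]
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between individuals
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between instruments
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    # Consistency form: systematic differences between instruments are ignored.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired utility scores, one pair per individual.
eq5d = [0.95, 0.88, 0.74, 0.61, 0.43]
sf6d = [0.84, 0.80, 0.70, 0.63, 0.55]
print(round(consistency_icc(eq5d, sf6d), 2))
print(tuple(round(v, 2) for v in bland_altman(eq5d, sf6d)))
```

In a Bland-Altman plot, each difference would be plotted against the mean of the pair, with horizontal lines at the bias and the two limits.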
We analyzed the distributions of self-reported responses for individual dimensions of the EQ-5D-5L and SF-6D questionnaires for the whole sample and for subsamples based on frailty categories. We also examined correlation between dimensions of these 2 questionnaires using Spearman’s rank correlation coefficient.

#### Missing data

Data are assumed missing at random throughout the analysis.

## Results

### Participants

Data from 2472 assessments of 1038 individual Community Ageing Research 75+ participants are included, with 75% of the study population aged 75 to 84 and slightly more women (52.7%). Based on the phenotype model, 20.2% of the sample was classified as fit, 51.4% pre-frail, and 28.4% frail. According to the cumulative deficit model, 28.3% were classified as fit, 15.5% well, 30% vulnerable, and 26.2% frail. The eFI distribution of frailty suggests that 22.4% were fit, 32.8% had mild frailty, 32.3% moderate frailty, and 12.5% severe frailty (Table 1).
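
Table 1. Summary statistics for the sample of older people with frailty (values are percentages of the sample).

| Characteristic | Category | % |
| --- | --- | --- |
| Age | 75-84 | 75 |
| Age | 85+ | 25 |
| Sex | Female | 52.7 |
| Frailty: phenotype model | Fit | 20.2 |
| Frailty: phenotype model | Pre-frail | 51.4 |
| Frailty: phenotype model | Frail | 28.4 |
| Frailty: cumulative deficit model | Fit | 28.3 |
| Frailty: cumulative deficit model | Well | 15.5 |
| Frailty: cumulative deficit model | Vulnerable | 30 |
| Frailty: cumulative deficit model | Frail | 26.2 |
| Frailty: electronic frailty index | Fit | 22.4 |
| Frailty: electronic frailty index | Mild | 32.8 |
| Frailty: electronic frailty index | Moderate | 32.3 |
| Frailty: electronic frailty index | Severe | 12.5 |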

### Main Results

#### Quality-of-life scores

Table 2 presents normative data, in the form of means and standard deviations, for the EQ-5D-5L, SF-6D, PCS, MCS, and 8 dimensions of the SF-36 for the sample as a whole and by frailty categories.
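
Table 2. Means and standard deviations (SD) for EQ-5D, SF-6D, PCS, MCS, and 8 dimensions of SF-36 for the whole sample and by frailty categories.

| Measure | Statistic | All | Phenotype: Fit | Phenotype: Pre-frail | Phenotype: Frail | Frailty Index: Very fit | Frailty Index: Well | Frailty Index: Vulnerable | Frailty Index: Frail | eFI: Fit | eFI: Mild | eFI: Moderate | eFI: Severe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EQ-5D | Mean | 0.82 | 0.92 | 0.85 | 0.71 | 0.95 | 0.91 | 0.83 | 0.65 | 0.91 | 0.85 | 0.79 | 0.69 |
| EQ-5D | SD | 0.19 | 0.10 | 0.17 | 0.21 | 0.09 | 0.12 | 0.14 | 0.21 | 0.13 | 0.16 | 0.20 | 0.23 |
| SF-6D | Mean | 0.75 | 0.84 | 0.77 | 0.65 | 0.85 | 0.81 | 0.74 | 0.62 | 0.82 | 0.76 | 0.70 | 0.64 |
| SF-6D | SD | 0.13 | 0.08 | 0.12 | 0.11 | 0.07 | 0.08 | 0.10 | 0.10 | 0.09 | 0.11 | 0.12 | 0.12 |
| PCS | Mean | 40.79 | 50.57 | 43.00 | 30.80 | 51.76 | 47.28 | 39.24 | 27.97 | 49.25 | 42.63 | 36.20 | 30.04 |
| PCS | SD | 12.01 | 5.87 | 10.95 | 9.40 | 5.29 | 7.10 | 9.27 | 8.38 | 7.92 | 10.90 | 11.23 | 11.06 |
| MCS | Mean | 55.57 | 57.17 | 56.01 | 53.81 | 57.52 | 57.27 | 56.41 | 51.65 | 56.76 | 56.51 | 54.71 | 52.95 |
| MCS | SD | 8.24 | 5.54 | 7.60 | 10.21 | 5.06 | 5.94 | 7.47 | 11.05 | 6.38 | 7.57 | 8.76 | 10.41 |
| Physical functioning | Mean | 61.08 | 88.64 | 67.17 | 33.42 | 90.30 | 79.57 | 58.89 | 24.64 | 82.22 | 67.40 | 49.77 | 29.89 |
| Physical functioning | SD | 30.72 | 8.91 | 27.78 | 22.04 | 9.17 | 14.99 | 21.61 | 19.39 | 19.49 | 26.47 | 29.21 | 26.60 |
| Role (physical) | Mean | 63.42 | 88.03 | 69.63 | 37.30 | 91.13 | 82.54 | 58.16 | 31.41 | 84.27 | 68.44 | 51.03 | 38.20 |
| Role (physical) | SD | 41.42 | 26.87 | 39.21 | 39.05 | 23.97 | 29.91 | 40.04 | 37.69 | 30.71 | 39.80 | 41.63 | 41.19 |
| Role (emotional) | Mean | 84.99 | 95.02 | 87.10 | 75.10 | 96.57 | 94.20 | 85.63 | 67.48 | 92.47 | 89.35 | 80.27 | 73.16 |
| Role (emotional) | SD | 31.52 | 19.19 | 29.43 | 37.91 | 14.86 | 19.28 | 30.48 | 41.33 | 23.03 | 27.65 | 34.79 | 38.67 |
| Energy/fatigue | Mean | 58.91 | 72.88 | 61.76 | 45.26 | 75.32 | 67.67 | 56.98 | 40.09 | 71.02 | 61.91 | 52.26 | 43.31 |
| Energy/fatigue | SD | 22.21 | 15.91 | 20.70 | 20.94 | 14.67 | 16.97 | 18.24 | 20.03 | 17.80 | 20.56 | 21.47 | 22.40 |
| Mental health | Mean | 82.32 | 88.14 | 83.88 | 75.96 | 88.70 | 86.98 | 83.37 | 72.21 | 86.46 | 83.97 | 79.67 | 75.26 |
| Mental health | SD | 15.07 | 11.49 | 13.26 | 17.69 | 10.44 | 11.01 | 12.01 | 18.52 | 11.91 | 13.59 | 15.64 | 18.56 |
| Social functioning | Mean | 83.12 | 95.11 | 86.48 | 69.77 | 96.40 | 93.34 | 84.94 | 62.11 | 93.50 | 87.48 | 77.45 | 64.00 |
| Social functioning | SD | 25.23 | 12.76 | 22.80 | 29.16 | 10.71 | 14.35 | 20.56 | 31.11 | 17.10 | 20.40 | 27.53 | 31.93 |
| Bodily pain | Mean | 72.66 | 84.61 | 75.17 | 60.86 | 87.47 | 81.28 | 70.82 | 55.49 | 84.63 | 74.51 | 66.71 | 60.05 |
| Bodily pain | SD | 26.15 | 18.58 | 24.81 | 27.89 | 16.67 | 19.95 | 24.39 | 28.11 | 19.33 | 24.89 | 27.15 | 28.82 |
| General health perceptions | Mean | 62.24 | 75.70 | 65.34 | 48.46 | 77.63 | 70.54 | 61.63 | 43.21 | 74.80 | 65.28 | 55.14 | 46.53 |
| General health perceptions | SD | 21.39 | 13.54 | 19.57 | 21.05 | 12.60 | 14.42 | 17.65 | 20.67 | 14.47 | 18.35 | 21.79 | 22.98 |

eFI indicates electronic frailty index; EQ-5D, EuroQol 5-dimension health questionnaire; MCS, Mental Component Summary; PCS, Physical Component Summary; SD, standard deviation; SF-6D, short-form 6-dimension; SF-36, short-form 36-item health questionnaire.
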
The EQ-5D-5L and SF-6D utility scores consistently decrease with frailty. The EQ-5D-5L mean exceeds the SF-6D mean for all frailty categories. In general, the utility score mean difference decreases with advancing frailty across all 3 indices.
Mean SF-36 scores decrease across all 8 dimensions with increasing frailty. Similarly, SF-36 PCS scores decrease with increasing frailty. Although SF-36 MCS scores decrease with frailty, the differences between 2 consecutive groups for the phenotype and eFI model are small (less than 2 points, or 4%) and somewhat larger (within 4.76 points, or 9.2%) for the Frailty Index.
Variance estimates increase with frailty for most indices. This pattern is different on dimensions related to physical functioning. In particular, standard deviation is lower for the most frail category on the PCS and 2 of its components (physical function and role limitations due to physical problems).
Normative estimates were further stratified by age group and sex (see Appendix Tables 4-7 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2020.04.1830). Quality-of-life scores consistently decrease with age. Although men, in general, report higher quality of life, women with advancing frailty and aged 85 and older report higher or similar EQ-5D, SF-6D, composite MCS, and mental health and emotional role SF-36 dimensions.
Variance estimates increase with age and are higher for women. For individuals with advancing frailty aged 85 and older, the standard deviation is consistently lower on the composite PCS and physical functioning and physical role SF-36 dimensions across all frailty models.

#### Agreement

The CA-ICC results indicate stronger agreement between the individual EQ-5D-5L and SF-6D scores across the entire sample (CA-ICC 0.61) than within individual frailty subgroups, where CA-ICC estimates are generally lower (Table 3). The CA-ICC estimates are lower for the fit and frail categories of the phenotype model, but close in magnitude to the CA-ICC estimate for the entire sample for the pre-frail category. The FI CA-ICC estimates for different frailty subgroups are low in magnitude. In the case of eFI, the CA-ICC is of similar magnitude across different frailty categories and closer in magnitude to the estimate for the entire sample. Precision of CA-ICC estimates (confidence interval width 0.05) is higher for the entire sample compared with frailty subsamples.
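
Table 3. CA-ICC between EQ-5D-5L and SF-6D for the whole sample and by frailty categories of phenotype, Frailty Index, and eFI models.

| Statistic | All | Phenotype: Fit | Phenotype: Pre-frail | Phenotype: Frail | Frailty Index: Very fit | Frailty Index: Well | Frailty Index: Vulnerable | Frailty Index: Frail | eFI: Fit | eFI: Mild | eFI: Moderate | eFI: Severe |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CA-ICC | 0.61 | 0.33 | 0.6 | 0.44 | 0.27 | 0.34 | 0.42 | 0.38 | 0.50 | 0.56 | 0.56 | 0.53 |
| CA-ICC average | 0.67 | n/a | n/a | n/a | 0.37 | 0.37 | 0.49 | 0.40 | 0.60 | 0.61 | 0.58 | 0.58 |

CA-ICC indicates consistency of agreement intraclass correlation coefficient; eFI, electronic frailty index; EQ-5D-5L, EuroQol 5-dimension 5-level health questionnaire; SF-6D, short-form 6-dimension.
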
The BA plot (Fig. 1), constructed using averages of one utility measure given the value of the other, shows that there is systematic variation in the EQ-5D-5L and SF-6D scores. Frailer individuals have a lower average value of the 2 utility measurements than fit individuals. Within the frail group, SF-6D scores are typically higher than EQ-5D-5L scores. Conversely, within the fitter group, EQ-5D-5L scores are typically higher. Results indicate greater variation in estimates with advancing frailty.
Given the differences in range and valuation of the EQ-5D and SF-6D utility measures, we checked whether ranks of utility scores are better suited for the BA analysis. The plot for ranks (in Supplemental Materials found at https://doi.org/10.1016/j.jval.2020.04.1830) shows higher variability in scores at better levels of health. This, at least partially, can be explained by the fact that the number of individuals with similar EQ-5D and SF-6D scores is significantly higher at the healthier end of the utility spectrum. As a result, the same error in measurement between the EQ-5D and SF-6D will lead to a larger discrepancy in rank. The BA plot for ranks is also symmetric around 0, suggesting there is no bias in predicting rank of one utility score using the rank of the other.

#### Correlation between EQ-5D-5L and SF-6D dimensions

The correlations between similar dimensions (Whitehurst and Bryan, “Another study showing that two preference-based measures of health-related quality of life (EQ-5D and SF-6D) are not interchangeable. But why should we expect them to be?”) of the EQ-5D-5L and SF-6D utility indices are either high or moderate. The lowest correlations are observed among 3 pairs: (1) EQ-5D-5L pain/discomfort dimension and SF-6D mental health ($ρ=0.189$); (2) EQ-5D-5L anxiety/depression and SF-6D pain ($ρ=0.190$); and (3) EQ-5D-5L anxiety/depression and SF-6D vitality ($ρ=0.190$). (For Spearman’s correlation estimates, see Appendix in Supplemental Materials found at https://doi.org/10.1016/j.jval.2020.04.1830.)
For each dimension of the EQ-5D-5L, a large proportion of the responses is concentrated in the top level. Thirty-seven percent of the overall study population reported optimal HRQOL (EQ-5D score of 11111), whereas only 1% achieved the highest possible SF-6D score. The predominant response for SF-6D physical functioning is level 2, while responses on pain and vitality have 2 equally probable levels (levels 1 and 2 for pain and 2 and 3 for vitality). (For more details on distributions of self-responses, see Appendix in Supplemental Materials found at https://doi.org/10.1016/j.jval.2020.04.1830.)
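A ceiling proportion such as the 37% reported above is simply the share of respondents whose profile is the best possible state. A minimal sketch, with a hypothetical function name and made-up profiles rather than study data:

```python
def ceiling_share(profiles, best_state="11111"):
    """Proportion of respondents reporting the best possible health state."""
    profiles = list(profiles)
    if not profiles:
        raise ValueError("at least one profile is required")
    return sum(p == best_state for p in profiles) / len(profiles)


# Made-up EQ-5D-5L profiles for 8 respondents; 3 report no problems on any dimension.
sample = ["11111", "21211", "11111", "32321", "11112", "11111", "22222", "31411"]
print(f"{ceiling_share(sample):.1%}")  # 37.5%
```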

## Discussion

### Key Results

#### Quality-of-life scores

Our results indicate that mean EQ-5D-5L and SF-6D scores decrease but overall variability of scores increases with advancing frailty. Compared with the EQ-5D-5L, the SF-6D utility index value is higher for people with advancing frailty, consistent with previous research that has compared the 2 measures in more severe illness states (Brazier et al., 2004; Barton et al., 2008).
Mean scores for the 8 dimensions of the SF-36 decrease with frailty. The decrease in mean scores is more notable for physical components of the SF-36 and the overall physical component summary score, compared with the mental component summary score. Differences between means for the MCS are small, with larger differences between groups of the FI model. As with the EQ-5D-5L and SF-6D scores, variability estimates for the 8 SF-36 dimensions typically increase with advancing frailty. Notable exceptions are the PCS and its physical function and physical role components, where variability is lower in the most frail category. The variability of physical characteristics appears to decrease as individuals become very old, potentially reflecting greater similarity in physical capabilities in very advanced old age.

#### Agreement

The CA-ICC for the entire sample is larger than the frailty subgroup estimates, because it is affected by the variability across a population. We found that variability within eFI categories and the pre-frail category of the phenotype index is similar to the variability in the entire sample, as indicated by CA-ICC estimates. The CA-ICC precision for the entire sample is higher, because the confidence interval narrows as sample size increases.
We also observed stronger agreement between utility values at higher health levels on the BA plot. Owing to a higher concentration of individuals at this end of the utility spectrum, the BA plot based on ranks demonstrates higher variability. We conclude that value, not rank, is the appropriate measure of analysis.

#### Response distribution

The EQ-5D-5L has been shown to reduce the ceiling effects of the earlier 3-level version (EQ-5D-3L) (Janssen et al., “Measurement properties of the EQ-5D-5L compared to the EQ-5D-3L across eight patient groups: a multi-country study”). This study, however, has identified that more than 1 in 3 older individuals (37%) scored in the top level on all 5 dimensions, indicating optimal HRQOL (Lutomski et al., “Measurement properties of the EQ-5D across four major geriatric conditions: findings from TOPICS-MDS”), even though some had frailty classed as advanced by frailty models (8% by the Frailty Index and 15% by the phenotype and eFI models), raising ongoing concern for ceiling effects with the 5-level version in some groups of older people (Huber et al., “Valuing health-related quality of life: systematic variation in health perception”).

### Limitations

The findings in this study are based on a sample of individuals who are older than 75 and live at home, with a relatively low prevalence of dementia. As a result, normative estimates and additional findings from this study cannot necessarily be extrapolated to older people living with dementia or care home residents (Usman et al., “Measuring health-related quality of life of care home residents, comparison of self-report with staff proxy responses for EQ-5D-5L and HowRu: protocol for assessing proxy reliability in care home outcomes testing”).
In this study, we assessed agreement between the EQ-5D-5L and SF-6D. The EQ-5D and SF-6D measures are used in healthcare decision making by NICE in the UK and health technology assessment agencies in other countries including Brazil, China, Norway, South Korea, and Spain (RTI Health Solutions, “Summary of guidance on health-utility measures by selected health technology assessment agencies”), with the EQ-5D being the preferred measure of HRQOL in adults. NICE currently does not recommend using the 5L valuation set (NICE, “Position statement on use of the EQ-5D-5L valuation set for England (updated October 2019)”). Existing evidence (Thompson and Turner, “A comparison of the EQ-5D-3L and EQ-5D-5L”), however, suggests that the 5L has measurement properties superior to those of the 3L and is preferable in a population with multimorbidities, which is likely to share some characteristics with the population of older people living with frailty. As the cost-effectiveness results obtained from the 2 measures will likely differ, a sensitivity analysis using the SF-6D to explore uncertainty in estimates for the population of individuals with increasing frailty is thus needed.

### Interpretation

This study provides important information on normative estimates and agreement for different measures of HRQOL across the spectrum of frailty. These normative estimates can be used for robust sample size calculations by trialists investigating novel interventions for older people with frailty where HRQOL is the primary outcome of interest.
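As an illustration of how a normative standard deviation can feed a sample size calculation, the sketch below uses the standard normal-approximation formula for comparing 2 means, $n = 2\left((z_{1-\alpha/2} + z_{1-\beta})\,\sigma/\delta\right)^2$ per arm. Everything about the trial design is an assumption of this sketch: a 2-arm parallel design with equal allocation, a 2-sided 5% significance level, 90% power, and a hypothetical target difference of 0.05 in EQ-5D utility. Only the standard deviations (0.19 for the whole sample and 0.21 for the phenotype frail group) come from Table 2.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(sd: float, min_difference: float,
              alpha: float = 0.05, power: float = 0.90) -> int:
    """Participants per arm to detect min_difference between 2 group means."""
    if sd <= 0 or min_difference <= 0:
        raise ValueError("sd and min_difference must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 2-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / min_difference) ** 2)


# SDs from Table 2; the 0.05 target difference is hypothetical.
print(n_per_arm(sd=0.19, min_difference=0.05))  # whole sample
print(n_per_arm(sd=0.21, min_difference=0.05))  # phenotype frail group
```

Because variability increases with frailty, the same target difference requires a larger sample in frailer groups. A real calculation would also need to allow for attrition and any clustering.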
Our findings indicate that HRQOL decreases with advancing frailty when either the EQ-5D-5L or SF-6D is used as the measure. There appears to be greater impact on physical health–related quality of life than mental health–related quality of life, which may in part be explained by the greater emphasis on physical characteristics within frailty models.
Overall health utility scores are more consistent for people who are fit, with greater variability in scores for people with increasing frailty, whereas physical components of the SF-36 and composite PCS demonstrate consistent decline with frailty. Findings are consistent across different frailty measures and constructs. We have identified the possibility of a greater risk of ceiling effects using the EQ-5D-5L compared with the SF-6D.
Findings indicate good agreement between the EQ-5D-5L and SF-6D across the spectrum of frailty, lending support for the 2 measures identifying a common construct of HRQOL in frailty. Agreement is lower for those who are most frail, indicating that different components of the 2 instruments may have greater relevance for people with advancing frailty in later life.
We recommend that researchers consider using the SF-36/SF-6D as an appropriate measure of HRQOL for clinical trials involving older people if there are concerns about the impact of ceiling effects for outcome measurement, for example, if fit older people are the planned target. In trials involving older people with increasing frailty, where the social value of an intervention may be more relevant and ceiling effects less of a concern, we recommend that both the EQ-5D-5L and SF-36/SF-6D are included as measures of HRQOL, and are used in sensitivity analysis as part of planned cost-effectiveness evaluation. Further research to clarify individual priorities for older people living with different degrees of frailty will help guide the future selection of appropriate tools for measurement of HRQOL in this population.
