Minimally Important Difference of the EQ-5D-5L Index Score in Adults with Type 2 Diabetes

Open Archive. Published: April 03, 2018. DOI: https://doi.org/10.1016/j.jval.2018.02.007

      Abstract

      Background

      The EuroQol five-dimensional questionnaire (EQ-5D) is a generic preference-based measure of health-related quality of life, and several studies have made attempts to estimate the minimally important difference (MID) for the EQ-5D index score.

      Objectives

      To estimate the MID of the five-level EQ-5D (EQ-5D-5L) index score in a population-based sample of adults with type 2 diabetes and to explore whether the MID estimate varies by baseline index score and the direction of change in health status.

      Methods

      We used longitudinal survey data of adults with type 2 diabetes in Alberta, Canada. The EQ-5D-5L MID was estimated first by the instrument-defined approach, which used the difference between the baseline index scores and the index scores of simulated single-level transitions, and then by the anchor-based approach, which categorized 1-year changes in depressive symptoms, diabetes-related distress, as well as physical and mental health functioning into no change, small change, and large change groups, wherein the MID was estimated as the average change in index score of the small change group.

      Results

      Using the instrument-defined approach, MID estimates were 0.043, 0.040, and 0.045, whereas anchor-based MID estimates were 0.042, 0.034, and 0.049 for all change, improvement, and deterioration, respectively. Larger MID estimates were observed for lower baseline index scores and for deterioration in health status.

      Conclusions

      MID estimates of the EQ-5D-5L index score were consistent between instrument-defined and anchor-based approaches and ranged between 0.03 and 0.05. Estimates varied by baseline index score and the direction of change, with similar results for patient subgroups.

      Introduction

      Patient-reported outcome measures, particularly measures of health-related quality of life (HRQOL), are increasingly common in routine measurement of health outcomes as a means of capturing patients’ perspective of their own health and the valuation of health services in terms of the health produced, which can be used to support patient-centered decision making [Devlin N.J., Appleby J. Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making]. Nevertheless, there are many unanswered questions that need to be addressed to facilitate the application of patient-reported outcome measures in health systems and to realize the full value of the collected data [Devlin N.J., Appleby J. Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making].
      Diabetes is a prevalent chronic condition that can have adverse effects on a patient’s health and quality of life [World Health Organization. Global Report on Diabetes]. According to the World Health Organization’s 2016 global report on diabetes, an estimated 422 million people, or 8.5% of the adult population, were living with diabetes in 2014, with approximately 6 new cases per 1000 individuals diagnosed annually in Canada [World Health Organization. Global Report on Diabetes; Public Health Agency of Canada. Diabetes in Canada: Facts and Figures from a Public Health Perspective]. Measuring the HRQOL of patients with diabetes may be useful for understanding changes in health status, improving care, and informing health care investment decisions [Al Sayah F., Majumdar S.R., Soprovich A., et al. The Alberta’s Caring for Diabetes (ABCD) study: rationale, design and baseline characteristics of a prospective cohort of adults with type 2 diabetes; Janssen M.F., Lubetkin E.I., Sekhobo J.P., Pickard A.S. The use of the EQ-5D preference-based health status measure in adults with type 2 diabetes mellitus].
      The EuroQol five-dimensional questionnaire (EQ-5D) is a generic preference-based measure of HRQOL developed by the EuroQol Group (http://www.eurqol.org) [Herdman M., Gudex C., Lloyd A., et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L)]. The EQ-5D was developed as a brief measure of health status and can be used for calculating quality-adjusted life-years for use in economic studies. There is good evidence supporting its validity, reliability, and responsiveness in type 2 diabetes [Janssen M.F., Lubetkin E.I., Sekhobo J.P., Pickard A.S. The use of the EQ-5D preference-based health status measure in adults with type 2 diabetes mellitus; Devlin N.J., Brooks R. EQ-5D and the EuroQol group: past, present and future]. Nevertheless, the routine collection of EQ-5D data in health systems for quality evaluation and improvement has created a demand from end users to interpret changes or differences in index score over time or between groups [Devlin N.J., Appleby J. Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making; Walters S.J., Brazier J.E. Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D]. Commonly, statistical significance is used as a decision criterion to interpret what may be considered meaningful change or difference in a score. Although statistical significance is useful for quantifying the role of random variation giving rise to the observed change, it is not necessarily reflective of the value that the patient places on the change [Devlin N.J., Appleby J. Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making]. For example, the large sample sizes obtained by routine outcome measurement may allow for the detection of statistically significant change or differences among patients or as a result of an intervention, but this may not necessarily equate to a meaningful change or difference [Gutacker N., Street A. Use of large-scale HRQoL datasets to generate individualised predictions and inform patients about the likely benefit of surgery]. To this end, estimating the minimally important difference (MID) of the five-level EQ-5D (EQ-5D-5L) index score, defined as the smallest change in index score that would be considered meaningful to the patient, may be a useful method to support the interpretability of the EQ-5D-5L [Devlin N.J., Appleby J. Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making; Walters S.J., Brazier J.E. Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D; Gutacker N., Street A. Use of large-scale HRQoL datasets to generate individualised predictions and inform patients about the likely benefit of surgery; King M.T. A point of minimal important difference (MID): a critique of terminology and methods; Johnston B.C., Ebrahim S., Carrasco-Labra A., et al. Minimally important difference estimates and methods: a protocol; Coretti S., Ruggeri M., McNamee P. The minimum clinically important difference for EQ-5D index: a critical review; Revicki D., Hays R.D., Cella D., Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes].
      Because the MID purports to capture the value that patients place on change, it may be considered specific to a patient population and/or clinical context of interest [King M.T. A point of minimal important difference (MID): a critique of terminology and methods]. Similarly, it can be important to consider whether patients place a different value on health improvement (or gain) versus health deterioration (or loss) as well as with regard to their baseline (or current) health status as measured by the instrument, and how these factors may affect the MID. Previous studies have suggested that the MID differs for health improvement compared with health deterioration, whereas patients in better health may perceive a different threshold of change as minimally important compared with those in worse health [King M.T. A point of minimal important difference (MID): a critique of terminology and methods; Crosby R.D., Kolotkin R.L., Williams G.R. Defining clinically meaningful change in health-related quality of life].
      The primary objective of this study was to estimate the MID of the EQ-5D-5L index score in a representative sample of adults with type 2 diabetes in Alberta, Canada. The secondary objectives were 1) to explore whether the MID estimate varies by baseline index score, 2) to examine whether it varies by the direction of change in health status (improvement vs. deterioration), and 3) to determine MID estimates of defined patient subgroups.

      Methods

      Data Source

      Data (N = 1927) were from baseline and 1-year follow-up of an ongoing cohort of adults with type 2 diabetes in Alberta, Canada (Alberta’s Caring for Diabetes cohort study). Details of the study have been reported elsewhere [Al Sayah F., Majumdar S.R., Soprovich A., et al. The Alberta’s Caring for Diabetes (ABCD) study: rationale, design and baseline characteristics of a prospective cohort of adults with type 2 diabetes]. Briefly, the study aims to research the various factors associated with the development of disease complications and other health outcomes in patients with type 2 diabetes by gathering information on individual medical, behavioral, and psychosocial factors [Al Sayah F., Majumdar S.R., Soprovich A., et al. The Alberta’s Caring for Diabetes (ABCD) study]. Participants completed a mailed self-reported survey, which included questions on sociodemographic factors (i.e., age and sex), diabetes history, comorbidities, diabetes complications, self-care, diabetes-specific management, well-being, and HRQOL measures including the EQ-5D-5L [Al Sayah F., Majumdar S.R., Soprovich A., et al. The Alberta’s Caring for Diabetes (ABCD) study]. Item wording of selected measures is presented in the Appendix in Supplemental Materials found at https://doi.org/10.1016/j.jval.2018.02.007.

      Measures

      Five-level EuroQol five-dimensional questionnaire

      The EQ-5D-5L is based on a multi-attribute health classification system that includes five response levels for five dimensions: mobility (MO), self-care (SC), usual activities (UA), pain/discomfort (PD), and anxiety/depression (AD) [Herdman M., Gudex C., Lloyd A., et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L)]. According to the Canadian scoring algorithm, EQ-5D-5L scores range between −0.148 for the worst health state (55555) and 0.949 for the best health state (11111) [Xie F., Pullenayegum E., Gaebel K., et al. A time trade-off-derived value set of the EQ-5D-5L for Canada].

      Patient health questionnaire 8-item

      The patient health questionnaire 8-item (PHQ8) is an eight-item questionnaire measuring depression, in which each item is scored on a four-point Likert scale anchored from 0 (not at all) to 3 (nearly every day), and with a reference period of the last 2 weeks. The sum of all eight items yields a total score ranging from 0 (no depressive symptoms) to 24 (severe depressive symptoms).

      Problem areas in diabetes 5-item questionnaire

      The problem areas in diabetes 5-item questionnaire (PAID5) measures emotional functioning with five items by way of a five-point Likert scale anchored from 0 (not a problem) to 4 (serious problem) at the present time. The average of all five items gives a final score ranging between 0 and 4, wherein higher scores indicate more problems.
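      The two scoring rules described above (a sum for the PHQ8, an average for the PAID5) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the handling of missing items is not described in the text, so the sketch simply rejects incomplete responses.

```python
def phq8_total(items):
    """Sum eight PHQ8 items, each scored 0 (not at all) to 3 (nearly every day)."""
    if len(items) != 8 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PHQ8 needs eight item scores, each between 0 and 3")
    return sum(items)  # total ranges from 0 to 24


def paid5_score(items):
    """Average five PAID5 items, each scored 0 (not a problem) to 4 (serious problem)."""
    if len(items) != 5 or any(i not in (0, 1, 2, 3, 4) for i in items):
        raise ValueError("PAID5 needs five item scores, each between 0 and 4")
    return sum(items) / len(items)  # final score ranges from 0 to 4


print(phq8_total([1, 0, 2, 1, 0, 3, 1, 0]))
print(paid5_score([0, 1, 2, 3, 4]))
```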

      12-Item short form health survey

      The 12-item short form health survey (SF-12, version 2) is a 12-item questionnaire with eight subscales: physical functioning, energy and vitality, role limitations due to physical problems, role limitations due to emotional problems, bodily pain, general health perceptions, social functioning, and mental health [Fleishman J.A., Selim A.J., Kazis L.E. Deriving SF-12v2 physical and mental health summary scores: a comparison of different scoring algorithms]. The reference period of the items varies from the present moment to as long as the past 4 weeks. The subscale scores are used to calculate two summary scores: the physical component summary (PCS) score and the mental component summary (MCS) score. The summary scores are transformed to a scale score ranging from 0 (worst score) to 100 (best score) with a mean of 50 ± 10.

      MID Estimation

      We used the instrument-defined and the anchor-based approaches to estimate the MID of the EQ-5D-5L index score in the overall sample and in patient subgroups on the basis of sex, splits at approximately the median value for age, duration of diabetes, and number of comorbidities. In addition, changes in EQ-5D-5L health states were analyzed using a Paretian classification method [Devlin N.J., Parkin D., Browne J. Patient-reported outcomes in the NHS: new methods for analysing and reporting EQ-5D data].

      Instrument-defined approach

      This approach is based on the average of index score differences between the baseline health state and single-level transitions to other health states. Further details of the instrument-defined approach have been published elsewhere [McClure N.S., Al Sayah F., Xie F., et al. Instrument-defined estimates of the minimally important difference for EQ-5D-5L index scores; Luo N., Johnson J.A., Coons S.J. Using instrument-defined health state transitions to estimate minimally important differences for four preference-based health-related quality of life instruments]. Instrument-defined MID estimates were calculated for baseline EQ-5D-5L profiles using all single-level transitions (all), only transitions to a better state (improve), or only transitions to a worse state (deteriorate). Because a proportion of the sample had baseline EQ-5D-5L profiles of 11111 (or the maximum index score), these individuals had no transitions to a better state and were therefore excluded from the improve instrument-defined MID estimate. In the Canadian scoring algorithm, transitions between level 3 (i.e., moderate problems) and level 4 (i.e., severe problems) in the MO, SC, UA, PD, and AD dimensions represent as much as 2.31, 2.28, 6.66, 4.17, and 4.40 times the value of other scoring parameters within each respective dimension. On the basis of the assumption that transitions involving a maximum-valued scoring parameter within a dimension constitute a difference or change in index score that is larger than an MID, we excluded the maximum-valued scoring parameters within each dimension. Therefore, instrument-defined MID estimates were calculated both using all single-level transitions (represented as idMID) and excluding maximum-valued scoring parameters (represented as idMID*).
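      The single-level transition logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scoring model is a toy additive one, the decrements in TOY_DECREMENTS are hypothetical numbers and not the Canadian value set, and the idMID* exclusion of maximum-valued scoring parameters is not covered.

```python
from statistics import mean

DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")

# HYPOTHETICAL utility decrements for levels 1..5 of each dimension.
# Illustrative numbers only; NOT the Canadian EQ-5D-5L scoring algorithm.
TOY_DECREMENTS = {
    "MO": (0.00, 0.04, 0.06, 0.15, 0.25),
    "SC": (0.00, 0.03, 0.05, 0.12, 0.20),
    "UA": (0.00, 0.04, 0.05, 0.14, 0.22),
    "PD": (0.00, 0.05, 0.07, 0.18, 0.28),
    "AD": (0.00, 0.06, 0.08, 0.19, 0.30),
}
TOY_FULL_HEALTH = 1.0  # hypothetical score for state 11111


def index_score(state):
    """Score a five-digit health state such as '21321' under the toy model."""
    loss = sum(TOY_DECREMENTS[d][int(level) - 1]
               for d, level in zip(DIMENSIONS, state))
    return TOY_FULL_HEALTH - loss


def single_level_transitions(state, direction="all"):
    """List states that differ from `state` by one level on one dimension."""
    neighbours = []
    for pos, char in enumerate(state):
        for step in (-1, 1):  # -1 moves to a better level, +1 to a worse one
            new_level = int(char) + step
            if not 1 <= new_level <= 5:
                continue
            if direction == "improve" and step != -1:
                continue
            if direction == "deteriorate" and step != 1:
                continue
            neighbours.append(state[:pos] + str(new_level) + state[pos + 1:])
    return neighbours


def id_mid(state, direction="all"):
    """Average absolute index score difference over single-level transitions."""
    base = index_score(state)
    diffs = [abs(index_score(s) - base)
             for s in single_level_transitions(state, direction)]
    return mean(diffs) if diffs else None  # None: no transition available


print(id_mid("21321"))
print(id_mid("11111", "improve"))  # no better state exists for 11111
```

      A sample-level estimate would then average id_mid over the baseline profiles of all respondents, skipping those for whom no transition in the requested direction exists.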

      Anchor-based approach

      Because there is no “one-size-fits-all” method to estimating the MID of HRQOL scores, it is generally considered good practice to estimate the MID using multiple approaches, and within each approach to use different methods (i.e., anchors and distribution parameters) to yield a pooled or triangulated MID estimate and/or a plausible range [Revicki D., Hays R.D., Cella D., Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes]. In this way, the anchor-based approach used multiple anchors with distribution-based cutoffs. Specifically, one-half SD of the baseline anchor score was used as the lower cutoff of small change for each anchor with an upper cutoff of 2 times the lower cutoff [Norman G.R., Sloan J.A., Wyrwich K.W. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation; Yost K.J., Eton D.T., Garcia S.F., Cella D. Minimally important differences were estimated for six PROMIS-Cancer scales in advanced-stage cancer patients; Tabberer M., Brooks J., Wilcox T. A meta-analysis of four randomized clinical trials to confirm the reliability and responsiveness of the Shortness of Breath with Daily Activities (SOBDA) questionnaire in chronic obstructive pulmonary disease]. The anchor-based approach involved first categorizing the change scores of the anchors (PHQ8, PAID5, PCS, and MCS) at follow-up into three groups: no change (<0.5 SD at baseline), small change (≥0.5 SD and ≤1 SD), and large change (>1 SD). MID estimates were calculated as the average change in EQ-5D-5L index score of the small change group. The all MID estimate included minimal important change for both improving and worsening anchor change scores, in which worsening scores were multiplied by −1. Otherwise, MID estimates were categorized as improve or deteriorate if they included only a subset of the data representing improving or worsening anchor change scores, respectively. A pooled MID estimate was then calculated as the average across all anchor-based MID estimates.
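      The categorization and averaging steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the (anchor change, index change) record layout, and the higher_is_better flag are all hypothetical, and the example data are invented for illustration.

```python
from statistics import mean


def classify_change(anchor_change, baseline_sd):
    """Label an anchor change score as 'none', 'small', or 'large'."""
    magnitude = abs(anchor_change)
    if magnitude < 0.5 * baseline_sd:
        return "none"
    if magnitude <= baseline_sd:
        return "small"
    return "large"


def anchor_mid(records, baseline_sd, direction="all", higher_is_better=True):
    """Mean EQ-5D-5L index change in the small change group.

    records: iterable of (anchor_change, index_change) pairs.
    higher_is_better: True for SF-12 PCS/MCS; False for PHQ8 and PAID5,
    where a higher score indicates worse health.
    Index changes of respondents whose anchor worsened are multiplied by -1.
    """
    values = []
    for anchor_change, index_change in records:
        if classify_change(anchor_change, baseline_sd) != "small":
            continue
        improved = (anchor_change > 0) == higher_is_better
        if direction == "improve" and not improved:
            continue
        if direction == "deteriorate" and improved:
            continue
        values.append(index_change if improved else -index_change)
    return mean(values) if values else None


# Invented example: anchor with a baseline SD of 10 where higher is better.
example = [(6, 0.04), (-7, -0.06), (2, 0.01), (15, 0.20), (-12, -0.30)]
print(anchor_mid(example, baseline_sd=10))
print(anchor_mid(example, baseline_sd=10, direction="improve"))
print(anchor_mid(example, baseline_sd=10, direction="deteriorate"))
```

      A pooled estimate would then be the plain average of the per-anchor results, as the text describes.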

      MID estimate as a function of baseline index score

      Anchor-based MID estimates and mean baseline index scores were calculated using a combined moving average and loess (i.e., local regression) smoothing approach [Cleveland R.B., Cleveland W.S., McRae J.E., Terpenning I. STL: a seasonal-trend decomposition procedure based on loess]. This involved ordering the data set on the basis of baseline index score (from the lowest to the highest score) and taking multiple sequential subsamples, each of which included at least 20% of the total baseline data set. To form each subsequent subsample, the group of individuals with the lowest baseline index score in the current subsample was dropped and the next 20% of the ordered baseline data was taken. For each subsample, the baseline index score was calculated as the mean of the subsample, and the MID was estimated for each anchor using the same methodology as previously described. The instrument-defined MID estimates were similarly represented as the MID estimate of each subsample. Scatterplots were generated to show the loess curve of the MID estimate as a function of baseline index score.
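      The moving subsample step can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes a fixed window of 20% of the ordered data (the text says "at least 20%"), advances the window by dropping everyone tied at the current lowest baseline score, and leaves out the loess smoothing. The function name and record layout are hypothetical.

```python
import math
from statistics import mean


def moving_subsamples(records, key, fraction=0.2):
    """Return overlapping windows over records sorted by baseline index score.

    records: any sequence of respondent records.
    key: function returning the baseline index score of a record.
    """
    ordered = sorted(records, key=key)
    n = len(ordered)
    # Window size; the small epsilon guards against float round-up in ceil().
    size = max(1, math.ceil(fraction * n - 1e-9))
    windows = []
    start = 0
    while start + size <= n:
        windows.append(ordered[start:start + size])
        lowest = key(ordered[start])
        # Drop the whole group tied at the lowest score before the next window.
        while start < n and key(ordered[start]) == lowest:
            start += 1
    return windows


# Invented example: (baseline index score, 1-year index change) pairs.
sample = [(0.1, 0.05), (0.1, 0.02), (0.2, 0.04), (0.3, 0.03), (0.4, 0.03),
          (0.5, 0.02), (0.6, 0.02), (0.7, 0.01), (0.8, 0.01), (0.9, 0.00)]
for window in moving_subsamples(sample, key=lambda r: r[0]):
    print(round(mean(r[0] for r in window), 3), len(window))
```

      Each window would then supply one point (mean baseline index score, MID estimate) for the scatterplot that the loess curve is fitted to.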

      Effect size and standardized response mean

      The effect size (ES) and standardized response mean were calculated by dividing the MID estimate (numerator) by the SD of the EQ-5D-5L index score at baseline (ES) and, for anchor-based MID estimates, by the SD of the EQ-5D-5L change score (standardized response mean). With respect to the ES estimates, we are mainly interested in small ESs (i.e., between 0.2 and 0.5) to show that the MID is the smallest meaningful change or difference in index score.
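      As a worked illustration of these two ratios, the sketch below uses the PHQ8 "all" MID estimate (0.043) and the baseline index score SD (0.171) reported in this article. The SD of the change score is not reported in the text, so the value used for the standardized response mean is purely hypothetical. Function names are illustrative.

```python
def effect_size(mid, sd_baseline):
    """ES: MID divided by the SD of the baseline index score."""
    return mid / sd_baseline


def standardized_response_mean(mid, sd_change):
    """SRM: MID divided by the SD of the index change score."""
    return mid / sd_change


def is_small_effect(es):
    """Small ES range stated in the text (between 0.2 and 0.5)."""
    return 0.2 <= es <= 0.5


es = effect_size(0.043, 0.171)  # values taken from Tables 1 and 2
print(round(es, 3), is_small_effect(es))

HYPOTHETICAL_SD_CHANGE = 0.11  # assumption; not reported in the text
print(round(standardized_response_mean(0.043, HYPOTHETICAL_SD_CHANGE), 3))
```

      The computed ES of roughly 0.25 agrees with the tabulated value for PHQ8; small discrepancies for other rows are expected because the published ratios were presumably computed from unrounded MID estimates.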
      Depending on the response rate to each anchor questionnaire, between 19% and 33% of patients had incomplete follow-up information at 1 year (Table 1). The subset with missing follow-up information had worse baseline anchor and EQ-5D-5L index scores compared with the entire data set at baseline (Table 1). Results reported for anchor-based MID estimates were based on complete cases (i.e., complete EQ-5D and anchor score information). All analyses were conducted using R statistical software (The R Foundation, Vienna, Austria).
      Table 1. Descriptive statistics of sample at baseline and follow-up

      Columns between "Baseline (All)" and "Follow-up" show baseline data specific to each anchor, split into complete cases (CC) and those lost to follow-up (LTFU).

      | Characteristic | Baseline (All) | PHQ8 CC | PHQ8 LTFU | PAID5 CC | PAID5 LTFU | SF-12 MCS CC | SF-12 MCS LTFU | SF-12 PCS CC | SF-12 PCS LTFU | Follow-up |
      |---|---|---|---|---|---|---|---|---|---|---|
      | Sample size | 1927 | 1428 | 499 | 1432 | 495 | 1288 | 639 | 1288 | 639 | 1560 |
      | Age, median (IQR) | 64.7 (57.2–72.2) | 64.9 (58.1–72.0) | 63.0 (55.1–72.7) | 64.9 (58.0–72.1) | 63.4 (55.1–72.4) | 64.6 (57.7–71.2) | 65.2 (56.2–74.3) | 64.6 (57.7–71.2) | 65.2 (56.2–74.3) | 66.2 (59.4–73.4) |
      | Female (%) | 44.9 | 45.9 | 42.3 | 46.0 | 41.8 | 45.8 | 43.2 | 45.8 | 43.2 | 43.2 |
      | Diabetes duration, median (IQR) | 10.7 (5.3–16.8) | 10.8 (5.6–16.7) | 9.0 (4.8–18.5) | 10.8 (5.6–16.7) | 9.2 (4.8–18.1) | 10.8 (5.5–16.6) | 10.1 (5.0–18.5) | 10.8 (5.5–16.6) | 10.1 (5.0–18.5) | 10.8 (5.6–16.8) |
      | No. of comorbidities, median (IQR) | 4 (3–6) | 4 (2–6) | 4 (3–6) | 4 (2–6) | 4 (3–6) | 4 (3–6) | 4 (2–6) | 4 (3–6) | 4 (2–6) | 4 (2–6) |
      | EQ-5D-5L: no. of health states | 281 | 220 | 159 | 219 | 157 | 209 | 178 | 209 | 178 | 239 |
      | EQ-5D-5L: 11111 | 15.9% | 16.9% | 13.0% | 16.8% | 13.5% | 17.3% | 13.1% | 17.3% | 13.1% | 14.3% |
      | EQ-5D-5L: 55555 | 0.1% | 0.0% | 0.2% | 0.0% | 0.2% | 0.0% | 0.2% | 0.0% | 0.2% | 0.0% |
      | EQ-5D-5L: index score, mean ± SD | 0.790 ± 0.171 | 0.802 ± 0.161 | 0.758* ± 0.194 | 0.802 ± 0.160 | 0.758* ± 0.196 | 0.804* ± 0.160 | 0.762* ± 0.189 | 0.804* ± 0.160 | 0.762* ± 0.189 | 0.792 ± 0.170 |
      | PHQ8, mean ± SD | 5.3 ± 5.4 | 4.9* ± 5.1 | 6.6* ± 6.0 | 4.9* ± 5.1 | 6.6* ± 6.0 | 4.9* ± 5.1 | 6.3* ± 5.9 | 4.9* ± 5.1 | 6.3* ± 5.9 | 5.1 ± 5.1 |
      | PAID5, mean ± SD | 0.867 ± 0.880 | 0.813 ± 0.831 | 1.024* ± 0.994 | 0.817 ± 0.833 | 1.018* ± 0.996 | 0.817 ± 0.831 | 0.968* ± 0.966 | 0.817 ± 0.831 | 0.968* ± 0.966 | 0.796 ± 0.821 |
      | SF-12 MCS score, mean ± SD | 47.9 ± 9.8 | 48.6* ± 9.6 | 45.5* ± 10.0 | 48.6* ± 9.6 | 45.6* ± 10.0 | 48.7* ± 9.7 | 45.7* ± 9.7 | 48.7* ± 9.7 | 45.7* ± 9.7 | 48.1 ± 9.8 |
      | SF-12 PCS score, mean ± SD | 46.0 ± 10.8 | 47.2* ± 10.6 | 41.6* ± 11.1 | 47.2* ± 10.5 | 41.6* ± 11.2 | 47.3* ± 10.4 | 41.6* ± 11.3 | 47.3* ± 10.4 | 41.6* ± 11.3 | 46.5 ± 10.5 |

      Note. Values in the LTFU columns (set in italic in the original table) are specific to the data subset that was lost to follow-up for each anchor.
      All, entire baseline data set; CC, complete case; EQ-5D-5L, five-level EuroQol five-dimensional questionnaire; IQR, interquartile range; LTFU, lost to follow-up; MCS, mental component summary; PAID5, problem areas in diabetes 5-item; PCS, physical component summary; PHQ8, patient health questionnaire 8-item; SF-12, 12-item short form health survey.
      * A statistically significant difference (P < 0.05) based on a two-sample t test compared with the entire baseline data set.

      Results

      The median age of participants at baseline (N = 1927) was 64.7 years (interquartile range [IQR] 57.2–72.2 years), and 45% were women (Table 1). The median duration for which participants had lived with diabetes was 10.7 years (IQR 5.3–16.8 years), and a median of 4.0 (IQR 3.0–6.0) comorbidities was reported. At baseline there were 281 unique EQ-5D-5L health states reported with an average EQ-5D-5L index score of 0.79 ± 0.17, decreasing to 239 unique health states at 1-year follow-up but with a similar average index score and SD (Table 1). The mean ± SD of the anchor scores at baseline were 5.3 ± 5.4, 0.87 ± 0.88, 47.9 ± 9.8, and 46.0 ± 10.8 for PHQ8, PAID5, MCS, and PCS, respectively. The strength of the association between change in anchor score and change in EQ-5D-5L index score varied across the anchor measures (Fig. 1), with correlations of 0.41 (95% confidence interval [CI] 0.35–0.48) for PHQ8, 0.27 (95% CI 0.20–0.33) for PAID5, 0.45 (95% CI 0.39–0.50) for MCS, and 0.51 (95% CI 0.45–0.55) for PCS.
      Fig. 1. The association between change in anchor score and change in EQ-5D-5L index score. The dotted lines represent the limits of the small change group on the basis of the standard deviation of the baseline anchor score. EQ-5D-5L, five-level EuroQol five-dimensional questionnaire; MCS, mental component summary; PAID5, problem areas in diabetes 5-item; PCS, physical component summary; PHQ8, patient health questionnaire 8-item; SF-12, 12-item short form health survey.
      The Paretian Classification of Health Change shows that a large proportion of changes in EQ-5D-5L health states are ambiguous (i.e., mixed) within the small and large anchor change groups (see Appendix Table S1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2018.02.007). Furthermore, the distribution of EQ-5D-5L health states by level and dimension (see Appendix Figures S1 and S2 in Supplemental Materials found at doi:10.1016/j.jval.2018.02.007) as well as the density of index scores (see Appendix Figure S3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2018.02.007) change from baseline to 1-year follow-up, and vary among anchors and by the direction of change.

      MID Estimate for “All” Changes

      On the basis of the instrument-defined estimation method, the MID estimates ranged between 0.037 and 0.049 with an ES of 0.22 to 0.28 in the overall sample (Fig. 2; Table 2). MID estimates (0.037–0.053) and ES (0.20–0.40) were consistent across examined subgroups (see Appendix Table S2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2018.02.007). On the basis of the anchor-based approach, the MID estimates ranged from 0.031 to 0.057 (pooled estimate 0.042) with an ES of 0.18 to 0.33 (pooled ES 0.25) in the overall sample (Fig. 2; Table 2), and pooled MID estimates ranged from 0.033 to 0.047 (pooled ES 0.22–0.30) across subgroups (see Appendix Table S2 in Supplemental Materials).
      Fig. 2. MID estimates of the EQ-5D-5L index score by estimation method and direction of change; point estimates (solid dots) and 95% confidence intervals based on 1000 bootstrap replicates (solid lines) for all change and the direction of change (improve vs. deteriorate). Note that values for direction of change deteriorate have been multiplied by −1. EQ-5D-5L, five-level EuroQol five-dimensional questionnaire; idMID, instrument-defined minimally important difference (*excluding maximum-valued scoring parameters); MCS, mental component summary; MID, minimally important difference; PAID5, problem areas in diabetes 5-item; PCS, physical component summary; PHQ8, patient health questionnaire 8-item; Pooled, average of anchor-based estimates; SF-12, 12-item short form health survey.
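      Table 2. MID estimates of the EQ-5D-5L index score by estimation method and direction of change

      | Direction of change | Method | Sample size | Mean [+] | SD [+] | MID | 95% CI | ES | SRM |
      |---|---|---|---|---|---|---|---|---|
      | All | Instrument-defined: idMID | 1927 | --- | --- | 0.049 | 0.048–0.049 | 0.285 | 0.434 |
      | All | Instrument-defined: idMID* | 1927 | --- | --- | 0.037 | 0.037–0.037 | 0.217 | 0.331 |
      | All | Anchor-based: PHQ8 | 301 | 0.012 | 0.086 | 0.043 | 0.030–0.056 | 0.251 | 0.382 |
      | All | Anchor-based: PAID5 | 266 | −0.001 | 0.096 | 0.037 | 0.022–0.053 | 0.219 | 0.333 |
      | All | Anchor-based: SF12 MCS | 318 | 0.015 | 0.087 | 0.031 | 0.021–0.042 | 0.184 | 0.281 |
      | All | Anchor-based: SF12 PCS | 294 | 0.016 | 0.085 | 0.057 | 0.046–0.069 | 0.334 | 0.509 |
      | All | Anchor-based: Pooled | --- | 0.010 | 0.089 | 0.042 | 0.030–0.055 | 0.247 | 0.376 |
      | Improve | Instrument-defined: idMID | 1927 | --- | --- | 0.043 | 0.043–0.044 | 0.254 | 0.386 |
      | Improve | Instrument-defined: idMID* | 1927 | --- | --- | 0.038 | 0.037–0.038 | 0.220 | 0.335 |
      | Improve | Anchor-based: PHQ8 | 137 | 0.004 | 0.081 | 0.031 | 0.012–0.048 | 0.183 | 0.279 |
      | Improve | Anchor-based: PAID5 | 136 | −0.009 | 0.094 | 0.021 | 0.000–0.042 | 0.122 | 0.186 |
      | Improve | Anchor-based: SF12 MCS | 148 | 0.012 | 0.091 | 0.034 | 0.018–0.050 | 0.198 | 0.301 |
      | Improve | Anchor-based: SF12 PCS | 129 | 0.014 | 0.075 | 0.054 | 0.036–0.071 | 0.314 | 0.478 |
      | Improve | Anchor-based: Pooled | --- | 0.005 | 0.085 | 0.035 | 0.017–0.053 | 0.204 | 0.311 |
      | Deteriorate | Instrument-defined: idMID | 1927 | --- | --- | 0.053 | 0.052–0.054 | 0.312 | 0.475 |
      | Deteriorate | Instrument-defined: idMID* | 1927 | --- | --- | 0.038 | 0.037–0.038 | 0.220 | 0.335 |
      | Deteriorate | Anchor-based: PHQ8 | 164 | 0.012 | 0.083 | 0.053 | 0.034–0.071 | 0.308 | 0.469 |
      | Deteriorate | Anchor-based: PAID5 | 130 | 0.012 | 0.089 | 0.055 | 0.034–0.077 | 0.320 | 0.487 |
      | Deteriorate | Anchor-based: SF12 MCS | 170 | 0.019 | 0.083 | 0.029 | 0.014–0.045 | 0.173 | 0.263 |
      | Deteriorate | Anchor-based: SF12 PCS | 165 | 0.018 | 0.093 | 0.060 | 0.046–0.076 | 0.350 | 0.533 |
      | Deteriorate | Anchor-based: Pooled | --- | 0.015 | 0.087 | 0.049 | 0.032–0.067 | 0.287 | 0.438 |
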
      MID, minimally important difference; EQ-5D-5L, EuroQol five-dimensional five-level questionnaire; idMID, instrument-defined minimally important difference (*excluding maximum-valued scoring parameters); PHQ8, patient health questionnaire 8 items; PAID5, problem areas in diabetes 5 items; SF12, short-form medical survey 12 item; MCS, mental health composite score; PCS, physical composite score; Pooled, average of anchor-based estimates; SD, standard deviation; CI, confidence interval based on 1000 bootstrap replicates; ES, effect size; SRM, standardized response mean. Note that values for direction of change deteriorate have been multiplied by negative one (-1); plus symbol [+] represents a statistic for the no change group (Anchor-based); the “no change group” by direction of change includes responses with anchor change scores between 0 and the corresponding limit of change (i.e., trivial improvement/deterioration); sample size is the number of respondent scores used in the calculation of the MID.
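
      The "Pooled" rows of Table 2 can be checked against the individual anchor rows. The following is a minimal sketch, assuming that pooling means the unweighted mean of the four anchor-based MID estimates (the table note says only "average of anchor-based estimates"). The names `anchor_mid`, `reported_pooled`, and `pooled_mid` are illustrative; the numbers are copied from Table 2.

```python
# Anchor-based MID estimates copied from Table 2, in the order
# PHQ8, PAID5, SF12 MCS, SF12 PCS.
anchor_mid = {
    "All": [0.043, 0.037, 0.031, 0.057],
    "Improve": [0.031, 0.021, 0.034, 0.054],
    "Deteriorate": [0.053, 0.055, 0.029, 0.060],
}

# Pooled MID values as reported in Table 2.
reported_pooled = {"All": 0.042, "Improve": 0.035, "Deteriorate": 0.049}


def pooled_mid(estimates):
    """Unweighted mean of anchor-based MID estimates (assumed pooling rule)."""
    return sum(estimates) / len(estimates)


for direction, estimates in anchor_mid.items():
    pooled = pooled_mid(estimates)
    print(f"{direction}: recomputed {pooled:.4f}, reported {reported_pooled[direction]:.3f}")
```

      Because the inputs are already rounded to three decimals, agreement can only be expected to within rounding error.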

      MID Estimate for “Improve” Changes

      The MID estimates for improvement in the overall sample ranged between 0.038 and 0.043 with an ES between 0.22 and 0.25 using the instrument-defined approach (Fig. 2; Table 2). Using the anchor-based approach, the MID estimates for improvement were between 0.021 and 0.054 (pooled estimate 0.035) with an ES from 0.12 to 0.31 (pooled ES 0.20). Across the examined subgroups, the instrument-defined MID ranged between 0.037 and 0.045 (ES 0.19–0.38) and the pooled anchor-based MID estimates ranged from 0.027 to 0.040 (pooled ES 0.16–0.26) (see Appendix Table S2 in Supplemental Materials).

      MID Estimate for “Deteriorate” Changes

      For the overall sample, MID estimates for worsening health ranged from 0.038 to 0.053 with an ES of 0.22 to 0.31 using the instrument-defined approach (Fig. 2, Table 2). The anchor-based approach gave MID estimates ranging from 0.029 to 0.060 (pooled estimate 0.049) with an ES of 0.17 to 0.35 (pooled ES 0.29). For the examined subgroups, similar MID estimates and ES were observed using the instrument-defined approach (MID 0.037–0.059; ES 0.20–0.42), whereas the anchor-based approach gave pooled MID estimates ranging from 0.037 to 0.056 (pooled ES 0.26–0.33) (see Appendix Table S2 in Supplemental Materials).
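
      The ES and SRM columns reported alongside each MID can be related back to the MID itself. The sketch below assumes the conventional definitions, ES = MID / SD of baseline index scores and SRM = MID / SD of change scores; this section does not restate the definitions, so both formulas are assumptions here. Under them, the two denominators can be back-calculated from the "All" rows of Table 2. The names `rows` and `implied_sds` are illustrative.

```python
# (MID, ES, SRM) for the "All" rows of Table 2.
rows = {
    "idMID": (0.049, 0.285, 0.434),
    "idMID*": (0.037, 0.217, 0.331),
    "PHQ8": (0.043, 0.251, 0.382),
    "PAID5": (0.037, 0.219, 0.333),
    "SF12 MCS": (0.031, 0.184, 0.281),
    "SF12 PCS": (0.057, 0.334, 0.509),
    "Pooled": (0.042, 0.247, 0.376),
}


def implied_sds(mid, es, srm):
    """Back-calculate the two denominators.

    Assumes ES = MID / SD_baseline and SRM = MID / SD_change
    (conventional definitions, not confirmed by this section).
    """
    return mid / es, mid / srm


implied = {name: implied_sds(*values) for name, values in rows.items()}
for name, (sd_baseline, sd_change) in implied.items():
    print(f"{name:9s} implied baseline SD {sd_baseline:.3f}, implied change SD {sd_change:.3f}")
```

      If the assumption holds, every row should imply roughly the same pair of denominators, because a single sample-level SD would be shared by all methods. With the Table 2 values the implied denominators cluster near 0.17 and 0.11, with small differences attributable to rounding.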

      MID Estimate as a Function of Baseline Index Score

      The relationship between the instrument-defined MID estimate and the baseline index score depended on whether maximum-valued scoring parameters were excluded. When excluded, the MID estimates remained at a constant value of approximately 0.037 across the range of baseline index scores (Fig. 3). Conversely, when all single-level transitions were included, the MID estimate started at a larger value for lower baseline index scores and decreased to a minimum of 0.037 as the baseline score approached its upper limit. Specifically, for improvement in health, the maximum MID value of 0.058 was at the minimum baseline index score of 0.54, and it declined to the minimum MID value of 0.036 at a baseline score of 0.76 (Fig. 3). This differs from the deterioration case, in which the MID estimates started at approximately 0.074 at a baseline index score of 0.54, before peaking at approximately 0.084 at a baseline score of 0.67, and then gradually declined to 0.036 at a baseline score of 0.93.
      Fig. 3. MID estimates of the EQ-5D-5L index score, and average change in index score of the no change group as a function of baseline index score, by estimation method and direction of change. Lines are based on local regression (loess) curves of estimates from ordered subsets comprising at least 20% of the baseline data; solid lines represent MID estimates; dashed lines represent average change in index score of the no change group. Note that values for direction of change deteriorate have been multiplied by −1; the no change group by direction of change includes responses with anchor change scores between 0 and the corresponding limit of change (i.e., trivial improvement/deterioration). EQ-5D-5L, five-level EuroQol five-dimensional questionnaire; idMID, instrument-defined minimally important difference (*excluding maximum-valued scoring parameters); MCS, mental component summary; MID, minimally important difference; PAID5, problem areas in diabetes 5-item; PCS, physical component summary; PHQ8, patient health questionnaire 8-item; Pooled, average of anchor-based estimates; SF-12, 12-item short form health survey.
      The anchor-based improve MID estimates start at larger values than the instrument-defined MID estimate (0.071–0.113; pooled estimate of 0.092 at a baseline index score of 0.55) before quickly descending (with some indication of a leveling-off of estimates before dropping off again) to negative minimum values (−0.006 to −0.041; pooled estimate of −0.020) at the upper limit of baseline index scores (0.941; see Fig. 3). For deteriorate MID estimates, there appears to be some consistency among the PAID5 and SF-12 PCS anchors and the instrument-defined estimates; there is, however, an observable divergence at higher baseline index scores (Fig. 3). In contrast, the PHQ8 and SF-12 MCS anchors display a relationship in which the deteriorate MID estimate increases for increasing baseline index score (Fig. 3).

      Discussion

      This study provides evidence that the MID of the EQ-5D-5L index score in adults with type 2 diabetes is in the range of 0.03 to 0.05. The instrument-defined and anchor-based approaches represent two distinct methods of MID estimation for the EQ-5D-5L index score. Anchor-based MID estimates were generally consistent with instrument-defined MID estimates, and under both approaches the estimates differed according to baseline index score and direction of change.
      When including all single-level transitions in the instrument-defined approach, the shape of the curve relating the MID estimate to the baseline index score may be interpreted as follows: at the extremes of the baseline index score (i.e., near −0.148 and 0.949), the MID is more representative of change in a single direction, in which it is important to interpret a small change in the possible direction as meaningful, yielding a small MID estimate. As we move away from the extremes of the baseline index score range, we expect the overall MID to reflect an increasing mixture of transitions to worse and better health states. Consequently, the overall MID estimate becomes larger as the baseline index score moves toward intermediate values (i.e., near 0.5). In summary, by including all possible instrument-defined single-state transitions, we are relying completely on the instrument (and its scoring algorithm) to determine the MID estimate, which in this case suggests that larger MID estimates are associated with intermediate baseline index scores. The larger MID estimates for worsening health compared with MID estimates for improving health suggest that for the same baseline index score, a patient may consider a smaller change to a better health state as important, whereas the same magnitude of change to a worse health state may be unimportant or trivial.
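
      The mechanics of the instrument-defined approach can be illustrated with a toy scoring function. The following is a minimal sketch, not the authors' implementation: the per-dimension decrements in `HYPOTHETICAL_DECREMENTS` are invented for illustration and are NOT the Canadian EQ-5D-5L value set, the scoring is assumed to be purely additive, and averaging the absolute score differences within one health state is an assumed summary rule. All function names are hypothetical.

```python
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")

# HYPOTHETICAL decrements for levels 1..5 of each dimension.
# These are made-up numbers, not a published value set.
HYPOTHETICAL_DECREMENTS = {
    "MO": (0.00, 0.04, 0.06, 0.15, 0.25),
    "SC": (0.00, 0.03, 0.05, 0.12, 0.20),
    "UA": (0.00, 0.03, 0.06, 0.13, 0.19),
    "PD": (0.00, 0.04, 0.07, 0.20, 0.30),
    "AD": (0.00, 0.04, 0.08, 0.18, 0.28),
}


def index_score(state):
    """Toy additive index score: 1 minus the sum of level decrements."""
    return 1.0 - sum(
        HYPOTHETICAL_DECREMENTS[dim][level - 1]
        for dim, level in zip(DIMENSIONS, state)
    )


def single_level_transitions(state, direction):
    """Yield states that differ from `state` by one level in one dimension."""
    step = -1 if direction == "improve" else 1
    for i, level in enumerate(state):
        new_level = level + step
        if 1 <= new_level <= 5:
            yield state[:i] + (new_level,) + state[i + 1:]


def instrument_defined_mid(state, direction):
    """Mean absolute score difference over single-level transitions.

    Returns None when no transition exists in that direction
    (e.g., improvement from state 11111).
    """
    base = index_score(state)
    diffs = [
        abs(index_score(s) - base)
        for s in single_level_transitions(state, direction)
    ]
    return sum(diffs) / len(diffs) if diffs else None


state = (2, 1, 2, 3, 2)
print("improve:", instrument_defined_mid(state, "improve"))
print("deteriorate:", instrument_defined_mid(state, "deteriorate"))
print("improve from 11111:", instrument_defined_mid((1, 1, 1, 1, 1), "improve"))
```

      In this toy setup the estimate depends only on the spacing of the scoring decrements around the starting state, which is what relying completely on the instrument and its scoring algorithm amounts to. A state with no available transition in a given direction contributes nothing to the estimate for that direction.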
      The anchor-based MID estimates also show differences based on baseline index score and direction of health state change; there is, however, greater variability among anchors likely as a result of how changes in an anchor are reflected in the EQ-5D descriptive system (e.g., changes in the PHQ8 may be reflected more in the AD dimension than in other dimensions). Comparison of anchor-based MID estimates with the instrument-defined MID estimates for all, improve, and deteriorate (as well as across subgroups) shows that there is reasonable agreement. On the basis of the 95% bootstrap confidence limits, there is evidence that the PAID5 anchor-based MID estimates have the greatest uncertainty, which is likely attributable to the low correlation between change in anchor score and change in EQ-5D-5L index score. There is an observable ceiling effect in which 14.3% to 17.3% of respondents self-reported “no problems” in all dimensions (i.e., 11111), and thus further improvement in health for these individuals cannot be reflected by the EQ-5D descriptive system. The ceiling effect of the EQ-5D-5L is noticeable in the anchor-based improve MID estimates, which show a sharp decline for increasing baseline index score becoming negative at the upper limit (i.e., reflecting a decrease in index score despite improvement in health as measured by changes in anchor scores). In contrast, the instrument-defined approach excluded health state 11111 in improve MID estimates.
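
      The statement about PAID5 uncertainty can be checked directly against the confidence limits in Table 2. The sketch below assumes that "greatest uncertainty" is read as the widest 95% bootstrap confidence interval; the names `ci` and `widest_anchor` are illustrative, and the limits are copied from the anchor-based rows of Table 2.

```python
# 95% bootstrap confidence limits (lower, upper) copied from Table 2.
ci = {
    "All": {
        "PHQ8": (0.030, 0.056),
        "PAID5": (0.022, 0.053),
        "SF12 MCS": (0.021, 0.042),
        "SF12 PCS": (0.046, 0.069),
    },
    "Improve": {
        "PHQ8": (0.012, 0.048),
        "PAID5": (0.000, 0.042),
        "SF12 MCS": (0.018, 0.050),
        "SF12 PCS": (0.036, 0.071),
    },
    "Deteriorate": {
        "PHQ8": (0.034, 0.071),
        "PAID5": (0.034, 0.077),
        "SF12 MCS": (0.014, 0.045),
        "SF12 PCS": (0.046, 0.076),
    },
}


def widest_anchor(limits):
    """Return the anchor whose interval (upper - lower) is widest."""
    return max(limits, key=lambda anchor: limits[anchor][1] - limits[anchor][0])


for direction, limits in ci.items():
    widths = {a: round(hi - lo, 3) for a, (lo, hi) in limits.items()}
    print(direction, widths, "widest:", widest_anchor(limits))
```

      Interval width is only one way to express uncertainty; it does not account for differences in sample size between anchors, which Table 2 also reports.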
      A number of limitations warrant consideration. First, the results are based on complete case analysis representing 67% to 81% of the original cohort depending on response and completion rates of each anchor questionnaire. There is evidence that those individuals who were lost to follow-up were in a poorer health state at baseline, suggesting that an assumption of missing-at-random is unlikely. Nevertheless, because anchor-based MID estimates are based on a priori small change groups, it may be reasonable to assume that the small change group had less loss to follow-up (than the large change group) so as to not adversely impact the MID estimates. Furthermore, the instrument-defined MID estimates use only the baseline information and are therefore unaffected by attrition. Second, it is important to consider that although we treat the MID estimate as a “threshold” of meaningful change, this threshold cannot suggest or indicate how “difficult” it is to achieve a change score at least as large as the MID estimate. Finally, although this study has used several methods to quantify what a patient may consider as the smallest meaningful change in index score, it is important to further test and validate estimates using other methods and in other clinical contexts.

      Conclusions

      The instrument-defined approach can be a useful method of MID estimation in a specific patient population. It provides a plausible range for the smallest change in index score that may be considered meaningful to the patient. Furthermore, the results suggest that the MID is smaller for health improvement than for health deterioration and decreases with higher baseline index scores; researchers should consider both patterns when interpreting study results. It is, however, unknown whether these phenomena are unique to patients with diabetes and/or the Canadian value set. Further research that seeks patient input directly is needed to determine what patients consider the smallest meaningful change in index score.
      Source of financial support: This study was supported by a EuroQol Research Foundation grant (The EQ Project number is 2016520).

      References

        • Devlin N.J.
        • Appleby J.
        Getting the Most Out of PROMS: Putting Health Outcomes at the Heart of NHS Decision-Making.
        King’s Fund, London, 2010
        • World Health Organization
        Global Report on Diabetes.
        World Health Organization, Geneva, Switzerland, 2016
        • Public Health Agency of Canada
        Diabetes in Canada: Facts and Figures from a Public Health Perspective.
        Public Health Agency of Canada, Ottawa, Ontario, 2011
        • Al Sayah F.
        • Majumdar S.R.
        • Soprovich A.
        • et al.
        The Alberta’s Caring for Diabetes (ABCD) study: rationale, design and baseline characteristics of a prospective cohort of adults with type 2 diabetes.
        Can J Diabetes. 2015; 39: S113-S119
        • Janssen M.F.
        • Lubetkin E.I.
        • Sekhobo J.P.
        • Pickard A.S.
        The use of the EQ-5D preference-based health status measure in adults with type 2 diabetes mellitus.
        Diabet Med. 2011; 28: 395-413
        • Herdman M.
        • Gudex C.
        • Lloyd A.
        • et al.
        Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L).
        Qual Life Res. 2011; 20: 1727-1736
        • Devlin N.J.
        • Brooks R.
        EQ-5D and the EuroQol group: past, present and future.
        Appl Health Econ Health Policy. 2017; 15: 127-137
        • Walters S.J.
        • Brazier J.E.
        Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D.
        Qual Life Res. 2005; 14: 1523-1532
        • Gutacker N.
        • Street A.
        Use of large-scale HRQoL datasets to generate individualised predictions and inform patients about the likely benefit of surgery.
        Qual Life Res. 2017; 26: 2497-2505
        • King M.T.
        A point of minimal important difference (MID): a critique of terminology and methods.
        Expert Rev Pharmacoecon Outcomes Res. 2011; 11: 171-184
        • Johnston B.C.
        • Ebrahim S.
        • Carrasco-Labra A.
        • et al.
        Minimally important difference estimates and methods: a protocol.
        BMJ Open. 2015; 5: e007953
        • Coretti S.
        • Ruggeri M.
        • McNamee P.
        The minimum clinically important difference for EQ-5D index: a critical review.
        Expert Rev Pharmacoecon Outcomes Res. 2014; 14: 221-233
        • Revicki D.
        • Hays R.D.
        • Cella D.
        • Sloan J.
        Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes.
        J Clin Epidemiol. 2008; 61: 102-109
        • Crosby R.D.
        • Kolotkin R.L.
        • Williams G.R.
        Defining clinically meaningful change in health-related quality of life.
        J Clin Epidemiol. 2003; 56: 395-407
        • Xie F.
        • Pullenayegum E.
        • Gaebel K.
        • et al.
        A time trade-off-derived value set of the EQ-5D-5L for Canada.
        Med Care. 2016; 54: 98-105
        • Fleishman J.A.
        • Selim A.J.
        • Kazis L.E.
        Deriving SF-12v2 physical and mental health summary scores: a comparison of different scoring algorithms.
        Qual Life Res. 2010; 19: 231-241
        • Devlin N.J.
        • Parkin D.
        • Browne J.
        Patient-reported outcomes in the NHS: new methods for analysing and reporting EQ-5D data.
        Health Econ. 2010; 19: 886-905
        • McClure N.S.
        • Al Sayah F.
        • Xie F.
        • et al.
        Instrument-defined estimates of the minimally important difference for EQ-5D-5L index scores.
        Value Health. 2017; 20: 644-650
        • Luo N.
        • Johnson J.A.
        • Coons S.J.
        Using instrument-defined health state transitions to estimate minimally important differences for four preference-based health-related quality of life instruments.
        Med Care. 2010; 48: 365-371
        • Norman G.R.
        • Sloan J.A.
        • Wyrwich K.W.
        Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation.
        Med Care. 2003; 41: 582-592
        • Yost K.J.
        • Eton D.T.
        • Garcia S.F.
        • Cella D.
        Minimally important differences were estimated for six PROMIS-Cancer scales in advanced-stage cancer patients.
        J Clin Epidemiol. 2011; 64: 507-516
        • Tabberer M.
        • Brooks J.
        • Wilcox T.
        A meta-analysis of four randomized clinical trials to confirm the reliability and responsiveness of the Shortness of Breath with Daily Activities (SOBDA) questionnaire in chronic obstructive pulmonary disease.
        Health Qual Life Outcomes. 2015; 13: 177
        • Cleveland R.B.
        • Cleveland W.S.
        • McRae J.E.
        • Terpenning I.
        STL: a seasonal-trend decomposition procedure based on loess.
        J Off Stat. 1990; 6: 3-33