Deciding Between SF-6Dv2 Health States: A Think-Aloud Study of Decision-Making Strategies Used in Discrete Choice Experiments

Objective: This study aimed to gain insight into decision-making strategies individuals used when evaluating pairs of SF-6Dv2 health states in discrete choice experiments (DCEs). Methods: This qualitative, cross-sectional, noninterventional study asked participants to use a think-aloud approach to compare SF-6Dv2 health states in DCEs. Thematic analysis focused on comprehension and cognitive strategies used to compare health states and make decisions. Results: Participants (N = 40) used 3 main strategies when completing DCEs: (1) trading, (2) reinterpretation, and (3) relying on previous experience. Trading was the most common strategy, used by everyone at least once, and involved prioritizing key attributes, such as preferring a health state with significant depression but no bodily pain. Reinterpretation was used by 17 participants and involved reconstructing health states by changing underlying assumptions (eg, rationalizing selecting a health state with significant pain because they could take pain medications). Finally, some (n = 13) relied on previous experience when making decisions on some choice tasks. Participants with experience dealing with pain, for instance, prioritized health states with the least impact in this dimension. Conclusions: Qualitatively evaluating the decision-making strategies used in DCEs allows researchers to evaluate whether the tasks and attributes are interpreted accurately. The findings from this study add to the understanding of the generation of SF-6Dv2 health utility weights and the validity of these weights (eg, reinterpreting health states could undermine the validity of DCEs and utility weights), and the overall usefulness of the SF-6Dv2. The methodology described in this study can and should be carried forth in valuing other health utility measures, not just the SF-6Dv2.


Introduction
A discrete choice experiment (DCE) is a survey-based methodology wherein individuals assess attributes of different scenarios and choose between them, indicating their overall preference. 1,2 When focused on health-related quality of life, data from DCEs can be used to derive utility indices for calculating quality-adjusted life-years (QALYs), according to individuals' preferences for living in different health states. 1,2 DCEs have been used to estimate utility weights for health utility measures, such as the EQ-5D and SF-6D Health Utility Survey (SF-6D). 3-7 In DCEs used to value health utility measures, participants are asked to choose between 2 health states, each characterized by a length of survival and a level on each of the health dimensions, or attributes, covered by the measures. Attributes assessed in DCEs need to be leveled (ie, each could be perceived as better or worse than another) and capable of being traded (ie, individuals are willing to swap "worse" attributes for "better" ones). 8,9 The derivation of utility weights rests on assuming (1) participants will assume health states are stable over time (eg, moderate pain for 2 years is twice as bad as 1 year), (2) participants will consider the length of survival and the decrement from negative health states along the dimensions of the measure (eg, negative outcomes are traded against positive outcomes), and (3) participants will tend to choose the health state with the largest utility. 2,10-12 These underlying assumptions come with inherent problems. First, if an attribute is not clearly understood, there is a risk the individual will ignore it (ie, attribute nonattendance). 8,11-18 Next, individuals might be unwilling to trade because they have a dominant preference for a single attribute, and make decisions based solely on that preference (ie, lexicographic preferences). 8,9,14
To minimize measurement error, careful consideration of study design and methods is needed to ensure participants interpret DCE tasks and attributes accurately. 14,19 Despite these potential issues, DCEs are important in the development of health utility measures. They examine patient priorities, informing both healthcare policy and clinician education, and enable survey developers to estimate and select a value set for understanding health preferences. 3,8,20,21 Previous research identified common strategies used when completing DCEs for estimating utility weights. These strategies align with the general assumptions described above and include trading, 8,12,14,18,22 selecting the least risk/most positive scenario, 11,14 making substitutions by ignoring/not attending to certain attributes, 11,14 adding or inferring information not included in the scenario (including inferring causality between attributes), 8,12,14 allowing 1 to 3 key attributes to influence decision making, 8,12,14 changing the key attributes in each scenario, 8,12,14 and considering all attributes. 11,12 Given the variety of potential completion strategies and issues linked to DCE tasks, it is recommended to pilot test DCEs (including asking individuals to complete them using a think-aloud approach) to identify problems a priori 11 and provide evidence that participants interpret the task and attributes accurately. 14 Cognitive interviewing, a qualitative research method informed by cognitive theory, 23 uses think-aloud and probing techniques to understand how individuals form judgments, make decisions, and answer survey questions. 24 It can be used to optimize survey design and implementation and to reduce measurement errors. 25 The purpose of this study was to use a think-aloud approach to gain insight into the decision-making process used by individuals when evaluating and selecting preferred SF-6Dv2 DCE health states.

Study Design
This qualitative, cross-sectional, noninterventional study consisted of one-on-one cognitive interviews. The 75-minute audiorecorded interviews were conducted by 1 of 2 experienced and trained qualitative researchers (L.B. and M.L.C.). All interviews were conducted using videoconferencing software, allowing for nationwide participation by a diverse geographic sample. All study materials were approved by one central independent review board (WCG/New England Institutional Review Board Study #1293768).

Study Population
The study used purposive sampling to recruit 40 participants from the general population via a third-party recruitment vendor. The final sample size was predetermined based on the number of choice tasks, the complexity of the choice tasks, and the desired sample diversity. 26,27 Specific quotas were established to ensure a diverse and representative sample in age, sex, presence of chronic health conditions, race/ethnicity, and education (see Appendix 1, Table 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). All participants were required to be 18 years of age or older, living in the United States, and fluent in English.
Potential participants completed an online screening questionnaire to assess study inclusion criteria and demographic information. A total of 87 potential participants were screened for eligibility, and recruitment was stopped when all quotas were met. Participants were excluded from the study if they were unwilling or unable to participate in an interview. During screening, participants were also asked how they would describe their current health (poor, fair, good, very good, excellent) and to rate their health satisfaction on a scale from 0 (not at all satisfied) to 10 (completely satisfied).

Study Procedures
During each interview, participants completed DCE choice tasks using a think-aloud approach. Participants were asked to read all information in each choice task out loud and to articulate their thoughts (including any points of confusion and how they ultimately decided the meaning of the item in question) as they read the health states and selected which they preferred. Upon completing each choice task, participants responded to semistructured follow-up questions, and spontaneous probing questions if necessary, to help the interviewer better understand their decision-making process and rationale for the choice they made.

[Figure 1. Choice task figure. Recoverable text: attribute label "Years of life left until death"; health state headers "You live for 1 year with the following then you die:" and "You live for 10 years with the following then you die:"; example attribute level "Accomplish less than you would like all of the time".]
For each choice task, participants rated on a scale of 0 to 10 how difficult it was to select a preferred health state. After each interview, the audio recording was sent for verbatim transcription and each participant was given an honorarium.
To minimize burden, participants were divided into 8 groups of 5 by their order of entry into the study. Each group of 5 was asked to complete 1 of 8 sets of 4 choice tasks evaluating and selecting from pairs of SF-6Dv2 health state profiles (health states A and B). Each pair of health state profiles included the following attributes (Fig. 1): years of life left until death, physical function (activity limitations), role function (limitations accomplishing tasks), pain intensity, vitality, social function (limitations in social activities), and mental health (feeling depressed or very nervous). Choice tasks were strategically selected so (1) very unusual health states were avoided by only including health states observed in a large general population data set (N = 75 000), 28 (2) comparisons including identical levels for 1 or more attributes were avoided, and (3) the number of different attribute levels being examined was maximized. The health states were scored using UK utility weights 4 and choice tasks were characterized as representing a large or small QALY difference (compared with the median difference of 2.4) and a high or a low average QALY level (compared with the median average level of 1.7). The first choice task in each set represented a large QALY difference between health states and a high average QALY level, the second choice task represented a large difference and a low level, the third choice task represented a small difference and a high level, and the fourth choice task represented a small difference and a low level. These selection criteria aimed to make the tasks challenging but realistic, starting with easier tasks before proceeding to more challenging tasks.
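The characterization of choice tasks described above can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: that the QALYs for a health state equal its utility weight multiplied by the years of life, and that values exactly at a median count as "small" or "low". The class name, function name, and utility values are hypothetical; only the two median thresholds (2.4 and 1.7) come from the study.

```python
from dataclasses import dataclass

# Thresholds reported in the study.
MEDIAN_QALY_DIFFERENCE = 2.4
MEDIAN_AVERAGE_QALY_LEVEL = 1.7


@dataclass(frozen=True)
class HealthState:
    """One side of a choice task (hypothetical structure)."""
    utility: float  # utility weight of the health state (illustrative values only)
    years: float    # years of life left until death

    @property
    def qalys(self) -> float:
        # Assumption: QALYs = utility weight x duration.
        return self.utility * self.years


def classify_choice_task(a: HealthState, b: HealthState) -> tuple[str, str]:
    """Return (difference label, level label) for a pair of health states."""
    difference = abs(a.qalys - b.qalys)
    average = (a.qalys + b.qalys) / 2
    # Assumption: ties at the median fall into the "small" / "low" category.
    difference_label = "large" if difference > MEDIAN_QALY_DIFFERENCE else "small"
    level_label = "high" if average > MEDIAN_AVERAGE_QALY_LEVEL else "low"
    return difference_label, level_label


# Illustrative pairs with made-up utility values.
easy_pair = (HealthState(utility=0.8, years=10), HealthState(utility=0.5, years=1))
hard_pair = (HealthState(utility=0.3, years=4), HealthState(utility=0.2, years=5))
print(classify_choice_task(*easy_pair))  # ('large', 'high')
print(classify_choice_task(*hard_pair))  # ('small', 'low')
```

Under these assumptions, the first pair would correspond to the kind of task placed first in each set (large difference, high level) and the second pair to the kind placed last (small difference, low level).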

Coding and Analysis
Coding of interview data began immediately after each interview based solely on interviewer field notes. The interviewers populated a spreadsheet with any notable issues that arose during the interview and recorded choice task preferences.
Once received, all transcripts were cross-checked against the initial coding spreadsheet to ensure consistency and completeness and then coded using NVivo software (QSR International Pty Ltd, Burlington, MA, 2018). Transcripts were coded to identify overall opinions on the choice tasks, the decision-making approaches and strategies used when evaluating the health states, and any other suggestions or insights. The researchers independently coded the same first 2 transcripts and then met to review their coding and resolve any discrepancies. This meeting allowed for any adjustments to the codebook and code definitions. Once coding was consistent, the remaining transcripts were randomly divided between the 2 coders and coded independently. The coders communicated throughout coding (3 formal meetings) to ensure consistency and address any questions; the study principal investigator (L.B.) reviewed all coding to ensure coding reliability.
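As a rough illustration of the double-coding step, the sketch below lists the transcript segments on which 2 coders assigned different codes, which is the kind of list a reconciliation meeting could work through. This is not the authors' procedure (coding was done in NVivo); the function name, segment identifiers, and code labels are hypothetical.

```python
def find_discrepancies(coder_a: dict, coder_b: dict) -> dict:
    """Return segments whose sets of codes differ between the 2 coders.

    Each argument maps a segment ID to the set of codes that coder applied.
    A segment coded by only one coder is treated as a discrepancy.
    """
    discrepancies = {}
    for segment in sorted(set(coder_a) | set(coder_b)):
        codes_a = coder_a.get(segment, set())
        codes_b = coder_b.get(segment, set())
        if codes_a != codes_b:
            discrepancies[segment] = (codes_a, codes_b)
    return discrepancies


# Hypothetical coding of one transcript by 2 coders.
coder_a = {
    "T01-S1": {"trading"},
    "T01-S2": {"reinterpretation"},
}
coder_b = {
    "T01-S1": {"trading"},
    "T01-S2": {"trading"},
    "T01-S3": {"previous_experience"},
}

for segment, (codes_a, codes_b) in find_discrepancies(coder_a, coder_b).items():
    print(segment, sorted(codes_a), sorted(codes_b))
```

Segments flagged this way would then be discussed and the codebook or code definitions adjusted before the remaining transcripts are divided between coders.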
Once transcripts were coded, the 2 researchers used inductive thematic analysis to distill meaning from the data. Thematic analysis is a qualitative method in which researchers identify and interpret common themes or patterns of meaning throughout the data. 29,30 Analysis focused on the cognitive strategies used to compare and select preferred health states. This analysis included an assessment of each participant's decision-making process, preferential attributes of each scenario, clarity of health descriptions, and the level of difficulty in making decisions.
Given that each health state was assigned a utility score using the UK weights, 4 the research team queried the coded data to evaluate (1) whether cognitive strategies differed according to utility score (ie, whether different strategies were used for health states with high versus low QALY scores) and (2) whether strategies differed according to large versus small differences between the 2 health states. Finally, the data were organized into sets according to age (18-49 years and 50 years or older) and presence or absence of a chronic health condition and queried to determine whether the cognitive strategies used in decision making differed according to these factors.
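The subgroup queries described above amount to tabulating, for each group, the share of participants who used a given strategy at least once. A minimal sketch of such a tabulation follows; the record layout, field names, and the 4 participant records are hypothetical and are not study data.

```python
from collections import defaultdict


def share_using(participants: list, strategy: str, group_key: str) -> dict:
    """Percentage of participants in each group who used `strategy` at least once."""
    totals = defaultdict(int)
    users = defaultdict(int)
    for participant in participants:
        group = participant[group_key]
        totals[group] += 1
        if strategy in participant["strategies"]:
            users[group] += 1
    return {group: 100.0 * users[group] / totals[group] for group in totals}


# Hypothetical coded records (not the study's participants).
participants = [
    {"id": "P1", "age_group": "18-49", "chronic": True,
     "strategies": {"trading", "reinterpretation"}},
    {"id": "P2", "age_group": "18-49", "chronic": False,
     "strategies": {"trading"}},
    {"id": "P3", "age_group": "50+", "chronic": True,
     "strategies": {"trading", "previous_experience"}},
    {"id": "P4", "age_group": "50+", "chronic": True,
     "strategies": {"trading"}},
]

print(share_using(participants, "reinterpretation", "age_group"))
print(share_using(participants, "trading", "chronic"))
```

The same function can be called with a combined key (for example, a field holding both age group and chronic condition status) to examine the interaction between the 2 factors.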

Participant Demographics
A total of 40 individuals participated in this study. Most participants were white (n = 26, 65.0%), were female (n = 23, 57.5%), had completed some form of post-high school education (n = 29, 72.5%), and had a chronic health condition (n = 29, 72.5%). Half of the sample was between the ages of 18 and 49 years, and the other half was age 50 years or older; 14 participants (35%) rated their overall health as "very good" or "excellent." Health satisfaction ratings were wide ranging across the 11-point scale, with an average of 5.8 (lowest and highest observed scores were 1 and 9, respectively). See Table 1 for additional demographic data.

Strategies Used
Interviewers observed participants understood the choice tasks and easily completed them. This was confirmed by participants' answers to the questions asked after the think-aloud.
Participants used 3 main strategies when comparing and selecting preferred health states: (1) trading, (2) reinterpreting a health state, and (3) relying on previous experience or preconceived ideas of an attribute. Participants most often used 1 decision-making strategy per choice task. Some participants used the same strategy for every choice task, whereas others switched strategies each time. Although decision making was not a linear process, a general pattern for decision making was found across choice tasks. When initially evaluating each choice task, participants were observed easily identifying either a key attribute to prioritize or whether deeper consideration of the health states was needed. Figure 2 illustrates the cognitive strategies participants were observed using in this study. Response strategies not following this pattern included a single participant who added up self-defined positives in each scenario and selected the health state with the highest score.

Trading
Trading was the most common strategy deployed in decision making and was used by every participant at least once. When trading, participants sacrificed 1 or more attributes for another. Participants used hierarchical decision making, prioritizing specific, key attributes over others (Table 2). Participants would compare those key attributes between the 2 health states and ultimately base their decision on this comparison, trading less preferable attributes in favor of the key attributes. Nevertheless, some participants were less focused on key attributes and more generally traded multiple less preferable attributes for preferable ones. Less pain, better mental health, and more years of life were frequently determined to be key attributes in a trade. Only when participants considered the combination of key attributes in description A to have similar values to the combination of those in description B did they consider other health dimensions (vitality; physical, role, and social functioning).

Reinterpretation of health state
Just less than half of participants (n = 17) would, for 1 or more choice tasks, reinterpret or change the underlying assumptions of 1 or both health states (Table 2). In using this strategy, participants refused to believe 1 or more of the attributes presented would remain the same for the duration of time provided or refused to believe attributes could go together. In not accepting the less desirable attributes as static, participants indicated they could change over time given the more positive attributes (eg, not having pain would, over time, improve mental health).
Alternatively, in refusing to believe the attributes they were presented with could coexist (eg, severe pain and no social limitations), participants selected the health state in which the attributes seemed more consistent and in line with their expectations. Participants also reinterpreted the health states by inferring outside information that was not presented to them. For example, some rationalized living with high levels of pain because they could take pain medications. One participant interpreted all the health states to mean they had a chronic or fatal condition and made decisions based on that inference.

Relying on previous experience/preconceived ideas of attribute
The final strategy participants used when deciding between 2 health states was relying on previous experience or preconceived ideas of an attribute (Table 2). This strategy was used at least once by 13 participants. Participants were more apt to use this strategy for attributes with which they had personal experience, most commonly pain and mental health. Participants with experience with significant pain or episodes of depression or anxiety most often prioritized health states with the least impact in these dimensions, regardless of the severity of the other dimensions. This strategy was also used when participants more broadly considered their nonhealth life situations, including how the severity of some attributes might impact their ability to care for their families. Although the use of the other strategies may also have been influenced by personal experience, this strategy is distinct in that participants explicitly indicated when and why they were taking a specific experience or preconceived notion into account rather than focusing on a key attribute.

Comparison of high and low scores and large versus small differences in scores
Trading was the predominant strategy used across all DCEs regardless of utility score or the difference between the utility scores of the 2 health states (see Appendix 2, Table 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). Although some participants used multiple strategies to make their decisions, this approach did not appear to be related to either the utility score level or the difference between the 2 scores. Subsequently, we examined patterns in strategies used by how difficult participants rated the decision-making process (on a scale of 0-10). In general, participants tended to rate health state comparisons with smaller differences (< 2.4) as more difficult, but this trend did not hold across all comparisons (see Appendix 2, Table 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). There was no clear pattern to the difficulty rating when comparing high and low utility scores.

Age
Age did not appear to make a difference in the strategies used when deciding between health states (Appendix 2, Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). All participants used trading at least once, followed by inferring additional information about a health state or an attribute, and finally relying on previous experience or preconceived ideas (specifically, making decisions based on pain and mental health). Participants aged 18 to 49 years tended to reinterpret health states to a larger degree than those aged 50 years and older (55% vs 35%).

Chronic conditions
The decision-making processes used by participants with a chronic condition varied slightly from those without a chronic condition (Appendix 2, Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). Although all members of both groups used trading, participants without a chronic condition also made decisions by reinterpreting health states more often than participants with a chronic condition did (63.3% vs 34.5%).

Interaction between age and chronic conditions
The data suggest decision-making processes were slightly different among individuals who were younger (18-49 years old) and had a chronic condition compared with younger participants without chronic disease (Appendix 2, Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.07.018). Younger participants with a chronic condition relied less on reinterpreting health states than those without (40% vs 70%) and relied more on previous experience (40% vs 20%). Given that there was only 1 participant aged 50 years or older without a chronic condition, it is not possible to make any comparison about the decision-making strategies that were used in that age group.

[Table 2 (excerpt recovered from extraction). Participant quote:]
"I choose Health Description B, because given the choices, that seems to be the best way to live. And the reason that I chose that answer is because a lot of these questions, uh, the reason I chose that answer is because I dealt with my mother, who died last year, who, um, [sighs] was disabled I would say for a lot of her latter years, I would say maybe 20, uh, who-who lived for a long time. She had a miserable life. She-she didn't do anything and, uh, for herself, uh, she, um, had a lot of pain, was depressed all the time, lived in fear, you know. So I-I look at that. I looked at how she lived and I, you know, I didn't want to live like that. So, um, that's it. [laughs]. That's why I answered these questions the way I did." ID 07, Female, 65 y/o
DCE indicates discrete choice experiment.

Discussion
This was a qualitative, cross-sectional, noninterventional study consisting of one-on-one, 75-minute interviews designed to provide insight into the decision-making process used by individuals participating in a DCE study comparing and selecting preferred SF-6Dv2 health states. This research aimed to ascertain participants' understanding of the task they were asked to complete and the strategies they used to do so by asking them to use a think-aloud approach. Data from the interviews were coded and analyzed to distill the decision-making strategies used by participants. Coding and analysis proceeded using an inductive approach, after which the research team identified how the data aligned with previous research.
Using the think-aloud approach and follow-up questions with participants provided insight into the strategies they used. The strategies identified in this study were (1) trading for a priority attribute over lower-priority attributes, (2) reinterpreting attributes and inferring additional information about health states, and (3) relying on preconceived ideas about attributes. Although trading was the dominant strategy used, participants were observed using the strategies in concert with one another.
There was some evidence the choice of strategy was influenced by age group and health status. Older participants and those who live with a chronic condition did not need to infer additional information as frequently as those who were younger or did not have a chronic condition. This may have been because these individuals could more easily relate to attributes of a hypothetical health state given their lived experiences. A future study with a more precise distribution of sample by age would enable more in-depth analyses by chronic condition and age.
The strategies participants used to complete choice tasks align with previous work examining strategies for evaluating health states, indicating decision-making strategies are common across groups and tasks. Previous studies have used a think-aloud approach to examine the decision-making strategies used when completing choice tasks across a number of content areas: breast cancer screening, 18 bowel cancer screening, 12 vaccination preferences, 11,12 primary care preferences, 8 prostate cancer screening, 11 and preferences for funding new health technology assessments. 22 These previous studies indicate, regardless of the topic area or the population, the strategies used to complete choice tasks tend to be the same. Asking participants to complete this task using a think-aloud approach confirmed earlier findings that not all attributes are relevant or of high priority to all individuals and that they can and will disregard those low priority attributes when making their decision. 12 In the current study, the key, preferred attributes included less pain, better mental health, and longer duration of survival, whereas lower-priority attributes were better vitality, physical functioning, role functioning, and social functioning.
In this study, attribute nonattendance was not observed, nor was it described during the think-aloud. This may be because the think-aloud exercise of the choice tasks followed a cognitive debriefing of the SF-6Dv2. 31 This meant participants were familiar with and had a clear understanding of the attributes 1 and had already been considering how each attribute related to their daily lives, making the evaluation during the choice task more personal and less hypothetical in nature. Although not used exclusively, one participant completed choice tasks based solely on their perception of which scenario in the choice task was more "positive." This approach has been observed in previous think-aloud studies 22 and has potential implications for the underlying assumption of DCE studies, which is that individuals make decisions based on preference. Additional decision-making heuristics were not observed.
Some participants reinterpreted SF-6Dv2 health states to make them more palatable or to make the decision-making process easier. This aligns with previous studies in which individuals were found to reinterpret the attributes they were evaluating, including inferring causality between them. 8,12,14 Reconstructing health states presented in choice tasks has potential issues for the assumptions underlying DCEs, namely, that they are being interpreted as written. It is advisable to revise the DCE instructions to alert participants that the health states they are comparing are hypothetical and should be considered "as written." 8,32 Asking participants to consider the health states "as written" also serves to focus those who may have previous experience with one of the disease states.
This study is not without limitations. First, the think-aloud method itself has limitations. The request for verbalization may change the process of selecting health states. It is not always clear that comments made when thinking aloud aligned with participants' processing of the task. To mitigate this issue, a series of semistructured follow-up questions was asked after each DCE task to allow the interviewer to gain a better understanding of the person's process and rationale for choosing a particular health state. Second, it is preferable to conduct these interviews in person, and although videoconferencing software mitigated this limitation, it did not eliminate it. Third, although all health states used in the DCEs had been reported in practice by at least 1 person from a large general population sample, interpretation of what could be considered an unusual health state is subjective. As described earlier, some found combinations unlikely. This has implications for how participants may have valued those specific choice tasks, thus affecting their overall preference. Fourth, the pairs of SF-6Dv2 health states participants were asked to evaluate and choose between did not include an opt-out option and forced participants to make a selection. This adds a degree of difficulty to interpreting the choices made by participants who did not have a clear preference. 2 Finally, participants were presented with fewer choice tasks than they would be asked in a typical DCE study. This may have reduced learning effects, survey fatigue, and decision fatigue.
This study also had a number of strengths. In addition to participants being familiar with the attributes before completing the think-aloud of the DCE health states, qualitative research strategies confirmed their interpretations of the attributes were correct and provided the study team a clear understanding of the decision-making strategies being used. This qualitative understanding of the decision-making strategies has been highlighted as key to developing and conducting DCEs. 1,2,8,12,14,32,33 Additionally, imposing sampling quotas helps ensure the DCEs are understandable across broad populations.
A further strength of this study was using a think-aloud approach to gain a deeper understanding of the decision-making strategies used when completing DCEs evaluating pairs of SF-6Dv2 health states. The think-aloud process highlighted the importance of ensuring the instructions for how to complete a DCE task are clear and uncomplicated. To that end, minor revisions to the instructions have been made so that when participants complete the DCE in future studies, they are clear on what is being asked of them.

Conclusion
To the best of the authors' knowledge, this is the first qualitative study to investigate the decision-making strategies implicit in valuing multiattribute health utility measures such as the SF-6Dv2. Although qualitative studies have been conducted to evaluate the content validity of existing health utility measures, including the SF-6Dv2, 31 this study sought to gain qualitative insight into the decision-making strategies that drive the scoring of health utility measures, namely, the SF-6Dv2. The evidence from this study adds to the depth of understanding of the choice tasks for the SF-6Dv2, to the strength of the utility weights to be developed, and to the overall usefulness of the SF-6Dv2. The findings further suggest the methodology described in this study can and should be carried forth in valuing other health utility measures, not just the SF-6Dv2.

Supplemental Material
Supplementary data and materials, including sample interview guide questions, associated with this article can be found in the online version at https://doi.org/10.1016/j.jval.2022.07.018.
Conflict of Interest Disclosures: Dr Mulhern reported receiving grants outside the submitted work. Dr Brazier reported receiving royalties paid to his institution based on the use of the SF-6D. No other disclosures were reported.
Funding/Support: This work was supported by QualityMetric.
Role of the Funder/Sponsor: The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.