Expert Elicitation of Multinomial Probabilities for Decision-Analytic Modeling: An Application to Rates of Disease Progression in Undiagnosed and Untreated Melanoma

  • Edward C.F. Wilson
    Correspondence
    Address correspondence to: Edward C.F. Wilson, Cambridge Centre for Health Services Research, Institute of Public Health, University of Cambridge School of Clinical Medicine, Forvie Site, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK.
    Affiliations
    Cambridge Centre for Health Services Research, Institute of Public Health, University of Cambridge, Cambridge, UK
  • Juliet A. Usher-Smith
    Affiliations
    Primary Care Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  • Jon Emery
    Affiliations
    Primary Care Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

    Department of General Practice, Centre for Cancer Research, Faculty of Medicine, Dentistry and Health Science, Victorian Comprehensive Cancer Centre, University of Melbourne, Melbourne, Victoria, Australia
  • Pippa G. Corrie
    Affiliations
    Cambridge Cancer Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
  • Fiona M. Walter
    Affiliations
    Primary Care Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK

    Department of General Practice, Centre for Cancer Research, Faculty of Medicine, Dentistry and Health Science, Victorian Comprehensive Cancer Centre, University of Melbourne, Melbourne, Victoria, Australia
Open Archive. Published: December 07, 2017. DOI: https://doi.org/10.1016/j.jval.2017.10.009

      Abstract

      Background

      Expert elicitation is required to inform decision making when relevant “better quality” data either do not exist or cannot be collected. An example of this is to inform decisions as to whether to screen for melanoma. A key input is the counterfactual, in this case the natural history of melanoma in patients who are undiagnosed and hence untreated.

      Objectives

      To elicit expert opinion on the probability of disease progression in patients with melanoma that is undetected and hence untreated.

      Methods

      A bespoke webinar-based expert elicitation protocol was administered to 14 participants in the United Kingdom, Australia, and New Zealand, comprising 12 multinomial questions on the probability of progression from one disease stage to another in the absence of treatment. A modified Connor-Mosimann distribution was fitted to individual responses to each question. Individual responses were pooled using a Monte-Carlo simulation approach. Participants were asked to provide feedback on the process.

      Results

      A pooled modified Connor-Mosimann distribution was successfully derived from participants’ responses. Feedback from participants was generally positive, with 86% willing to take part in such an exercise again. Nevertheless, only 57% of participants felt that this was a valid approach to determine the risk of disease progression. Qualitative feedback reflected some understanding of the need to rely on expert elicitation in the absence of “hard” data.

      Conclusions

      We successfully elicited and pooled the beliefs of experts in melanoma regarding the probability of disease progression in a format suitable for inclusion in a decision-analytic model.

      Introduction

      Evidence-based medicine is defined as the “conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients … [which] means integrating individual clinical expertise with the best available external clinical evidence from systematic research” [
      • Sackett D.L.
      • Rosenberg W.M.
      • Gray J.A.
      • et al.
      Evidence based medicine: What it is and what it isn’t.
      ]. These principles apply equally to population-level decision making, such as whether a health care payer should provide reimbursement for a new drug, treatment pathway, or screening program.
      Decision-analytic models are frequently used by agencies such as the National Institute for Health and Care Excellence in England as a framework to structure all current best (i.e., relevant, quality-assessed) evidence to estimate the overall costs and consequences of alternative treatment strategies over an appropriate time horizon [
      • Drummond M.
      • Sculpher M.
      • Claxton K.
      • et al.
      ,
      • Briggs A.H.
      • Sculpher M.
      • Claxton K.
      Decision Modelling for Health Economic Evaluation.
      ]. A judgment is then made to decide whether the added benefit of a treatment exceeds its opportunity cost. Evidence to populate a model is ideally obtained exclusively from good-quality systematic reviews and meta-analyses of randomized controlled trials (RCTs) or other relevant study designs as appropriate. Nevertheless, because of data limitations, evidence is typically obtained from various sources including routine databases and observational studies. When no suitable previous data exist, decision makers are required to rely on expert opinion to bridge the evidence gaps.
      One such evidence gap is the natural history of undetected and hence untreated melanoma.
      The MelaTools program (www.melatools.org) is a National Institute for Health Research–funded program based in the United Kingdom to improve the early diagnosis of melanoma to reduce associated mortality and morbidity. This includes investigation of the feasibility and cost-effectiveness of introducing a risk-based surveillance program using a self-completed assessment tool [
      • Williams L.H.
      • Shors A.R.
      • Barlow W.E.
      • et al.
      Identifying persons at highest risk of melanoma using self-assessed risk factors.
      ].
      To address this, we developed a decision model to estimate the most cost-effective cutoffs for various intervention policies [

      Wilson E, Usher-Smith J, Emery J, Corrie P, Walter F. A modelling study of the cost-effectiveness of a risk-stratified surveillance programme for melanoma in the UK. Value in Health [In Press].

      ]. Nevertheless, a key component of this model is the counterfactual, in other words, the natural history of melanoma in the absence of medical intervention. Good-quality data exist on prognosis after diagnosis and subsequent treatment [
      • Balch C.M.
      • Gershenwald J.E.
      • Soong S.J.
      • et al.
      Final version of 2009 AJCC melanoma staging and classification.
      ], but there are no data on untreated individuals. Obtaining such data from a prospective study by withholding treatment from newly diagnosed patients would clearly be deeply unethical. Therefore, the only way to estimate the probability of an undiagnosed and hence untreated patient progressing from one disease stage to another is to garner expert opinion.
      In this article, we apply a method to elicit multinomial probabilities from experts regarding their beliefs about the rate of progression from different melanoma disease stages (in situ disease to stage IV) to any other stage or death. The primary purpose of the analysis was to use the resulting multinomial distributions in our decision model to predict the cost-effectiveness of a self-completed risk assessment tool and subsequent surveillance program. Nevertheless, the distributions themselves are of interest because they represent a summary of expert opinion and belief.

      Methods

      Research Problem

      There are four main types of cutaneous melanoma (superficial spreading, lentigo maligna, acral lentiginous, and nodular) [
      • Liu V.
      • Mihm M.C.
      Pathology of malignant melanoma.
      ], which current guidelines categorize into nine stages of invasion [
      • Balch C.M.
      • Gershenwald J.E.
      • Soong S.J.
      • et al.
      Final version of 2009 AJCC melanoma staging and classification.
      ]. All but one are also described with a pre-invasive (in situ) phase; nodular melanoma is by definition invasive. We wished to elicit expert opinion on the rate of progression from each stage to any other. We simplified a possible set of 39 questions into 12 by assuming that invasive disease would progress at the same rate irrespective of primary melanoma subtype, but we allowed the rate of progression from in situ disease to vary by subtype (Table 1). Each question is a multinomial problem: the quantities to be elicited are probabilities, but there are more than two outcomes. For example, after a defined time period, a patient with stage Ia disease may remain in stage Ia or progress to stage Ib, IIa, and so forth. The sum of the probabilities must equal 1.
      Table 1. Starting stages for elicitation questions
      In situ superficial spreading melanoma
      In situ lentigo maligna melanoma
      In situ acral lentiginous melanoma
      Stage Ia
      Stage Ib
      Stage IIa
      Stage IIb
      Stage IIc
      Stage IIIa
      Stage IIIb
      Stage IIIc
      Stage IV
      Note. Participants were asked for their beliefs about the probability of progression from each of the 12 stages stated to any other stage and death.

      Elicitation Protocol

      The protocol and associated materials are given in Appendix 1 in Supplemental Materials found at doi:10.1016/j.jval.2017.10.009. The protocol was designed with the following constraints in mind:
      1. We wished to elicit opinion from experts of more than one country. We chose the United Kingdom as well as Australia and New Zealand (hereafter ANZ) as areas of relatively high melanoma prevalence. Arranging a single workshop event in the same place at the same time would have been prohibitively expensive and extremely difficult to schedule. Therefore, an online webinar approach that could be repeated to suit availability of participants was desired.
      2. Because of demands on experts’ time, the webinar could not exceed 2 hours in length.

      Ethics

      Ethical approval was not required for this study [

      Medical Research Council, NHS Health Research Authority. Do I need NHS REC approval? Decision tool. Available from: http://www.hra-decisiontools.org.uk/ethics/. [Accessed November 21, 2016].

      ]. Invitation letters explained to participants that their responses would be anonymized, with the only details being their broad job title and country (United Kingdom or ANZ).

      Identification and Recruitment of Experts

      Inclusion criteria were that participants had to be located in the United Kingdom, Australia, or New Zealand with an academic or clinical background in dermatology, oncology, plastic surgery, or epidemiology, with a particular interest and expertise in melanoma. A list of potential participants was identified by two of the investigators on the basis of known expertise and relevant publications in the field. Participants were invited to take part via email, and several sessions were scheduled to allow flexibility to maximize recruitment. Participants were paid an honorarium of £200 (A$400) for their time.

      Background Materials

      We circulated background materials to participants before the webinars, including confirmation of date and time, an explanation of the overall purpose of the exercise, a user guide explaining how responses would be recorded (on a specifically designed Microsoft Excel spreadsheet), and relevant background literature. The only relevant literature identified was the current American Joint Committee on Cancer staging recommendations for melanoma, which include survival curves by disease stage at diagnosis [
      • Balch C.M.
      • Gershenwald J.E.
      • Soong S.J.
      • et al.
      Final version of 2009 AJCC melanoma staging and classification.
      ].

      Pre-Elicitation Training

      Each webinar began with a 30-minute presentation introducing the concept of elicitation and example questions, followed by a live demonstration of how to use the Excel spreadsheet.

      Elicitation Method

      Questions were asked in the format of “Imagine a cohort of 100 patients with stage X undiagnosed and hence untreated disease. After 6 months, the patients may be in any of the following stages.” At this point, participants could select from a drop-down list any stage they thought it possible for patients of the cohort to be in. They then ranked these in order of likelihood, from most likely to least likely (screenshot in Fig. 1A). Once participants were happy with their selections, they clicked “Update chart,” which populated a chart with the selected stages, ordered from most to least likely (Fig. 1B), with a default equal spread of probabilities. Edits to the chart could be made by selecting cells in the table below the chart and either clicking increase/decrease or simply entering an appropriate number. The chart updated instantly, providing visual feedback to the participant.
      Participants were asked to first adjust the medians according to their beliefs, working from least to most likely (right to left of the chart, or bottom to top of the table, in the example shown starting from stage IIIb, then stage IIIa, etc.). A restriction was placed that the medians had to sum to 100; the software would not permit participants to finalize their answer unless this was satisfied. Once the participant was happy that the selected medians did indeed represent their median beliefs, the lower and upper bounds of the 95% credibility intervals (CrIs) were set for each stage, again working from right to left of the chart. There was no restriction on the sum of the lower and upper bounds: participants could set them such that the interval contained within them represented the strength of their belief about plausible values. As before, the chart updated immediately, allowing the participant to visualize his or her responses. Once happy, the participant clicked “Submit.” The responses were stored and time-stamped and the spreadsheet automatically moved on to the next question.
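The checks on a submitted response can be illustrated with a minimal Python sketch. This is not the study's Excel macro code, which is not reproduced in this article; the function name `validate_response` and the dictionary layout are hypothetical. The only rule taken from the text is that the medians must sum to 100; the ordering check (lower bound ≤ median ≤ upper bound) is an added assumption about what a coherent response would look like.

```python
def validate_response(stages):
    """Return a list of problems found in one elicited response.

    `stages` is a list of dicts with keys 'stage', 'lower', 'median', 'upper',
    each expressed as a number of patients out of a cohort of 100.
    (Hypothetical data layout, for illustration only.)
    """
    problems = []

    # Rule stated in the text: the medians must sum to 100.
    total = sum(s["median"] for s in stages)
    if abs(total - 100) > 1e-9:
        problems.append(f"medians sum to {total}, not 100")

    # Assumption (not stated in the text): bounds bracket the median.
    # No constraint is placed on the sum of the lower or upper bounds.
    for s in stages:
        if not (0 <= s["lower"] <= s["median"] <= s["upper"] <= 100):
            problems.append(f"{s['stage']}: need 0 <= lower <= median <= upper <= 100")

    return problems


# Hypothetical response for a cohort starting in stage Ia.
response = [
    {"stage": "Ia", "lower": 40, "median": 70, "upper": 90},
    {"stage": "Ib", "lower": 5, "median": 20, "upper": 45},
    {"stage": "IIa", "lower": 0, "median": 10, "upper": 30},
]
print(validate_response(response))  # an empty list means the response passes
```

Returning a list of problems rather than raising on the first failure mirrors the interactive setting, where a participant would want to see everything that needs adjusting before resubmitting.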
      The webinar was designed such that it could be repeated with relative ease to accommodate the availability of participants. The facilitator remained online to attend to any problems while participants completed the exercise. This also served to ensure that participants completed the exercise at the allocated time rather than rushing through it in their own time. The time stamp on responses allowed monitoring of the length of time spent on each question.

      Fitting Distributions to Elicited Data

      A modified Connor-Mosimann distribution (a generalization of the Dirichlet distribution) [
      • Connor R.J.
      • Mosimann J.E.
      Concepts of independence for proportions with a generalization of the Dirichlet distribution.
      ,
      • Elfadaly F.
      • Garthwaite P.
      Eliciting Dirichlet and Connor–Mosimann prior distributions for multinomial models.
      ] was fitted to the elicited median and 95% CrIs for all dimensions for each participant using the 'modcmfitr' package [

      Wilson E. modcmfitr: Fit a Modified Connor-Mosimann Distribution to Elicited Quantiles in Multinomial Problems. R Package. 2017. Available from: https://CRAN.R-project.org/package=modcmfitr.

      ] in the R programming language [
      • R Core Team.
      R: A Language and Environment for Statistical Computing.
      ]. These summary distributions were then sampled from many times (Monte-Carlo simulation). The empirical median and the upper and lower 95% CrIs (i.e., 2.5th, 50th, and 97.5th centiles) were then calculated from the samples and a modified Connor-Mosimann distribution fitted to these overall figures, representing an aggregate of the participants’ beliefs. This was conducted at a national level (United Kingdom and ANZ) and for all participants together. A copy of the code is available on request from the corresponding author.
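The sampling and pooling step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code (which is available from the corresponding author) and not the 'modcmfitr' API. It samples from a standard Connor-Mosimann distribution, built from independent beta draws in a stick-breaking scheme, rather than the modified version used in the study. All function names and parameter values are hypothetical. Drawing the same number of samples from each expert, which weights experts equally, is also an assumption. The final step of refitting a distribution to the pooled quantiles is not shown.

```python
import random


def sample_connor_mosimann(a, b, rng):
    """Draw one probability vector from a standard Connor-Mosimann distribution.

    a, b: beta shape parameters, one pair per outcome except the last.
    Returns len(a) + 1 probabilities that sum to 1.
    """
    remaining = 1.0
    probs = []
    for a_i, b_i in zip(a, b):
        z = rng.betavariate(a_i, b_i)  # share of the remaining probability mass
        probs.append(remaining * z)
        remaining *= 1.0 - z
    probs.append(remaining)  # last outcome takes whatever is left
    return probs


def percentile(sorted_values, q):
    """Percentile by linear interpolation; q is a fraction between 0 and 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def pool_experts(expert_params, n_draws=2000, seed=1):
    """Pool equal numbers of draws from each expert's distribution and
    return (2.5th, 50th, 97.5th) centiles for every outcome."""
    rng = random.Random(seed)
    n_outcomes = len(expert_params[0][0]) + 1
    pooled = [[] for _ in range(n_outcomes)]
    for a, b in expert_params:
        for _ in range(n_draws):
            for j, p in enumerate(sample_connor_mosimann(a, b, rng)):
                pooled[j].append(p)
    summary = []
    for values in pooled:
        values.sort()
        summary.append(
            (percentile(values, 0.025), percentile(values, 0.5), percentile(values, 0.975))
        )
    return summary


# Two hypothetical experts answering a three-outcome question.
experts = [
    ([8.0, 3.0], [2.0, 2.0]),
    ([6.0, 2.0], [3.0, 4.0]),
]
for lower, median, upper in pool_experts(experts):
    print(f"{lower:.3f}  {median:.3f}  {upper:.3f}")
```

Pooling samples rather than averaging parameters keeps disagreement between experts visible: when two experts hold different beliefs, the pooled interval for that outcome widens instead of collapsing to a midpoint.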

      Piloting

      The protocol was piloted among MelaTools Steering Committee members on three separate occasions, resulting in several modifications to the initial plans. These related to the specification and ordering of questions, length of the webinar, and mode of elicitation.

      Specification and ordering of questions

      As described earlier, to avoid respondent fatigue, progression of melanoma was simplified into 12 stages (and a death state): 3 in situ subtypes and invasive from stage Ia to IV (as per the American Joint Committee on Cancer definitions [
      • Balch C.M.
      • Gershenwald J.E.
      • Soong S.J.
      • et al.
      Final version of 2009 AJCC melanoma staging and classification.
      ]). The training session included an explanation to participants that if they felt there was a substantial difference in the rate of progression from one stage to another by subtype (e.g., stage IIa superficial spreading vs. stage IIa lentigo maligna), then they should consider a weighted average risk on the basis of their experience of case mix. Finally, to further minimize the impact of respondent fatigue, questions were asked in random order.
      Two further issues were the time horizon over which participants were asked to estimate changes and the number of patients in the cohort. This was originally proposed to be 1000 patients over 1 month. It was, however, considered that insufficient patients would have progressed over this time and that asking participants to allocate 1000 patients may lead to spurious precision. Therefore, the time horizon was set at 6 months and participants were asked to allocate a cohort of 100 patients.

      Length of the webinar

      The webinar was originally proposed at 3 hours. However, concerns were raised that it would be difficult to recruit participants for a 3-hour session, and so the timing was reduced to 2 hours. This proved to be a reasonable estimate in piloting sessions.

      Mode of elicitation

      Two main modes of elicitation are the quantile and roulette modes [
      • Oakley J.E.
      • O’Hagan A.
      SHELF: The Sheffield Elicitation Framework (Version 2.0).
      ]. In the roulette mode, participants place “chips in bins” representing the relative strength of their belief about different values for a parameter. Although commonly used to elicit binomial probabilities and continuous quantities, we considered this less suitable for eliciting multinomial probabilities. Therefore, we opted for a quantile approach, where a minimum of three points along the distribution are elicited, typically the median and tertiles. The tertile method was originally proposed: the median expressed as the value x at which the participant would place a 50:50 bet on the “true value” being greater or less than x, and the lower and upper tertiles being the values at which a participant would place a 2:1 bet.
      This was rejected by the steering group because clinicians would not be familiar with tertiles and because of objections to explaining the method by comparison with gambling. Despite the same rules of probability governing both clinical outcomes and games of chance, explanations in terms of betting odds were removed and tertiles (33% CrIs) rejected in favor of 95% CrIs, defined in terms of “it is possible for X to be greater than this value, but I would be extremely surprised if this were to be the case.” The numeric quantity (1 in 40 or 2.5% probability) was also stated as the likelihood associated with this situation.

      Format of Results

      This article is written to conform as closely as possible with the recommendations for reporting of expert judgment [
      • Iglesias C.P.
      • Thompson A.
      • Rogowski W.H.
      • et al.
      Reporting guidelines for the use of expert judgement in model-based economic evaluations.
      ], although this work was designed and conducted before the publication of these guidelines. We report details of the participants, present tables and figures of aggregate distributions, and analyze the feedback forms from participants, including time to complete.

      Results

      Participants

      Sixteen participants from a pool of 39 invited experts agreed to take part in the exercise. Of these, 13 successfully completed the entire exercise, 1 completed questions on invasive disease only, and 2 did not complete any questions/withdrew their participation (Fig. 2). Elicitation from UK participants took place over four webinars scheduled between November 2015 and January 2016. Elicitation from ANZ participants also took place over four webinars between January and February 2016.

      Elicited Probabilities

      The results comprise a modified Connor-Mosimann distribution for each question for each participant, plus a summary distribution representing the aggregation of all responses to each question (all UK, all ANZ, and both combined). Density plots proved somewhat unclear when visualizing these data. Therefore, we present box and whisker plots showing medians, interquartile and 100% ranges for the fitted distributions for UK and ANZ participants, and all combined (Fig. 3). Parameters of the respective modified Connor-Mosimann distributions and resulting medians and 95% CrIs for the United Kingdom, ANZ, and all combined are given in Appendix 2 in Supplemental Materials found at doi:10.1016/j.jval.2017.10.009.
      Fig. 3. Summary results of the United Kingdom, ANZ, and both combined. Fitted modified Connor-Mosimann distributions for combined results of UK and ANZ participants and all combined. Each row represents the starting stage. Thus, the top left cell shows the summary of all UK participants’ beliefs about the probabilities of a patient with AL transitioning to any other state after 6 mo. In this case, the median probability of remaining in the AL stage is 82%, 6% probability of transitioning to stage Ia, 5% to Ib, 2% to IIa, and 1% to stage IIb or death. The IQRs (boxes) and ranges (whiskers) show the overall uncertainty in belief among the participants. AL, in situ acral lentiginous melanoma; ANZ, Australia and New Zealand; D, dead; IQR, interquartile range; LM, in situ lentigo maligna melanoma; SS, in situ superficial spreading melanoma; 1A–4, invasive melanoma of stage Ia to IV, respectively.
      The broad rankings of disease progression are consistent between the United Kingdom and ANZ, but there is some variation: in general, there was least disagreement in point estimates (i.e., medians) for in situ and stage I disease. There was most disagreement in medians for stages IIb, IIc, and IV. A crude mean of the CrIs for each stage suggests that there is the greatest overall uncertainty in progression from stages IIIb, IIIc and IV, and the least from in situ and stage Ib disease, although the 95% CrI varied between almost 0 and 1 for the risk of remaining in an in situ stage. These extremes are driven by the ANZ participants, with tighter 95% CrIs in the United Kingdom. Finally, the results suggested a greater consensus in some of the transitions: regression of disease to an earlier stage was not generally considered possible (one of the ANZ participants believed that there was a very small possibility for this), and progression straight to more extreme stages from in situ or early stage was considered less likely, with appropriately low medians and narrow 95% CrIs.

      Time to Completion and Analysis of Feedback Forms

      Time stamps reporting a time per question greater than 30 minutes were excluded; this was common for the first question when participants had opened the spreadsheet early (details of exclusions are given in Appendix 3 in Supplemental Materials found at doi:10.1016/j.jval.2017.10.009). After excluding these, the mean time to completion per question was 5 minutes 9 seconds ± 3 minutes 16 seconds, with a total time of 59 minutes 20 seconds ± 21 minutes 19 seconds. There was a downward trend with question number, suggestive of a learning effect, respondent fatigue, or both (Fig. 4).
      Fig. 4. Time taken to respond to questions. Points represent each participant’s response to each question. Participants reporting more than 12 questions include repeated answers to previous questions. Lines show fitted mean and associated 95% confidence interval. Note the question number is the chronological order rather than a specific question; the order of questions was randomized.
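The exclusion rule for time stamps can be expressed as a short Python sketch. The function name and the example timings are hypothetical and are not study data. Treating the reported ± value as a sample standard deviation is an assumption, because the text does not say which measure of spread was used.

```python
import statistics


def summarize_times(seconds_per_question, cutoff_seconds=30 * 60):
    """Drop times above the cutoff, then return (mean, sd, number excluded).

    The 30-minute cutoff comes from the text; the use of a sample standard
    deviation for the spread is an assumption.
    """
    kept = [t for t in seconds_per_question if t <= cutoff_seconds]
    excluded = len(seconds_per_question) - len(kept)
    return statistics.mean(kept), statistics.stdev(kept), excluded


# Hypothetical timings in seconds; the 2400 s entry mimics a spreadsheet opened early.
mean_s, sd_s, n_excluded = summarize_times([300, 240, 2400, 360])
print(mean_s, sd_s, n_excluded)
```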
      A summary of the feedback is presented in Table 2. Four participants (29%) had heard of the concept of expert elicitation before this study through involvement in Delphi panels. One participant had not taken part in one before but was familiar with the hierarchy of evidence and the fact that expert opinion is sometimes the only source of relevant data. The mean (median) response to ease of understanding was 2.14 (2.00) on a six-point scale, where 1 was considered easy and 6 difficult, with mean self-reported time to completion of 73.2 minutes (median 60 minutes). This is slightly longer than the measured mean of 59 minutes 20 seconds. (Note the measured mean excludes outliers more than 30 minutes in length.)
      Table 2. Summary of quantitative feedback from participants
      Question | Mean | Median
      Had you heard of the concept of expert elicitation prior to this study? | 29% Yes |
      How easy or difficult did you find the concepts to understand? (1 = easy, 6 = difficult) | 2.14 | 2.00
      How long did it take you to complete the questions (min)? | 73.21 | 60.00
      How confident are you that your answers reflect your belief about the risk of progression from one stage to another? (1 = not at all confident, 6 = very confident) | 3.93 | 4.00
      Do you think this is a valid approach to determining the risk of progression from one stage to another? | 57% Yes |
      Would you take part in one of these exercises again? | 86% Yes |
      Mean confidence in responses was 3.93 (median 4.00) on a six-point scale (1 = not at all confident, 6 = very confident). Free-text explanations stated that although the answers may reflect a participant’s belief, the participant had concerns about the limitations of their beliefs because of lack of “hard” evidence. Other comments focused on the complexity of melanoma as a disease as well as the staging recommendations. The background of the participants was also mentioned, with dermatologists likely to be more confident in early stage progression and oncologists more familiar with later stage disease. Because of small numbers we were unable to identify whether this was reflected in our data. Finally, one respondent commented that the 6-month time horizon was too short and that they would have been more confident making a 1- or 2-year prognosis.
      Only 57% (8 of 14) of participants felt that this was a valid approach to answering the study question. Some free-text responses acknowledged that in the absence of “better” evidence, expert elicitation was the only option. Other participants suggested a cluster RCT of screening versus no screening, or that the approach may be valid given a “large enough” sample size (suggesting 100 respondents).
      All but two participants indicated that they would be willing to take part in such an exercise again, although the participants are a self-selected group; another two participants who initially agreed to take part declined to provide answers and did not provide feedback forms. Most free-text comments suggested that participants found it interesting with a desire to see the final results of the study.

      Discussion

      Summary of Results

      We elicited parametric distributions representing 12 unknown multinomial probabilities describing experts’ beliefs about the rate of progression of an individual with untreated melanoma from one stage to another over a 6-month time horizon. The resulting distributions are in a format suitable for incorporation in a decision model. The exercise revealed where there was varying confidence both within and between individuals in the rate of progression. For example, the probability of progressing from stage Ia to IIc, IIIa, or IIIb had medians of 1% to 3%, with “tight” 95% CrIs of 0% to 17%. Other areas were highly uncertain: the 95% CrI for a patient with in situ acral lentiginous melanoma remaining in that state ranges between 0% and 99%. However, this is far from a uniform distribution (representing complete ignorance) because the median is 81%.

      Comparison with Other Studies

      Probably the most well-known structured elicitation technique is the SHeffield ELicitation Framework (SHELF) [
      • Oakley J.E.
      • O’Hagan A.
      SHELF: The Sheffield Elicitation Framework (Version 2.0).
      ,
      • O’Hagan A.
      • Buck C.E.
      • Daneshkhah A.
      • et al.
      Uncertain Judgements: Eliciting Experts’ Probabilities.
      ]. This is a consensus-based approach requiring participants to agree on a final summary distribution representing their belief about plausible values for a single parameter. Recently, this has been extended to elicit multinomial parameters using a Dirichlet distribution [
      • Zapata-Vázquez R.E.
      • O’Hagan A.
      • Soares Bastos L.
      Eliciting expert judgements about a set of proportions.
      ]. Our analysis extends this by fitting elicited data to a Connor-Mosimann distribution [
      • Connor R.J.
      • Mosimann J.E.
      Concepts of independence for proportions with a generalization of the Dirichlet distribution.
      ,
      • Elfadaly F.
      • Garthwaite P.
      Eliciting Dirichlet and Connor–Mosimann prior distributions for multinomial models.
      ] modified to allow greater flexibility, thus providing a much better fit to the data.
      The SHELF approach, although considered best practice, suffers from several practical limitations. First, the consensus approach requires a face-to-face workshop bringing together all relevant participants into the same room at the same time. Second, as well as being somewhat logistically challenging, this approach limits the number of participants to six to eight at most to facilitate conversation, and the workshop requires an experienced facilitator to ensure even representation of all views. Finally, the consensus approach also limits the number of questions that can realistically be asked in one session to four or five at most.
      Other approaches have used computer-based methods. For example, a study eliciting the opinion of nurses on the effectiveness of different bandages for severe pressure ulcers used a bespoke spreadsheet in Microsoft Excel [
      • Soares M.O.
      • Bojke L.
      • Dumville J.
      • et al.
      Methods to elicit experts’ beliefs over uncertain quantities: application to a cost effectiveness transition model of negative pressure wound therapy for severe pressure ulceration.
      ]. The nurses completed the task together in a computer suite, ensuring that sufficient attention was paid to answering the questions as well as providing technical support in case of difficulties.
      Previous decision modeling studies requiring an estimate of the risk of progression in undiagnosed and untreated melanoma [
      • Losina E.
      • Walensky R.P.
      • Geller A.
      • et al.
      Visual screening for malignant melanoma: a cost-effectiveness analysis.
      ,
      • Wilson E.C.
      • Emery J.D.
      • Kinmonth A.L.
      • et al.
      The cost-effectiveness of a novel SIAscopic diagnostic aid for the management of pigmented skin lesions in primary care: a decision-analytic model.
      ] had used notional 10% annual probabilities with a “wide” 95% CrI of 0.0001% to 54.87% (a Beta(0.3,2.7) distribution). This represents a mean of approximately 5.1% per 6 months. The transition probabilities presented in this analysis are very different from these previous estimates. It would be valuable to rerun those previous models with these new parameter estimates to explore the impact on their conclusions.
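The 10% annual and 5.1% six-month figures quoted above can be checked with a few lines of Python. The text does not state how the conversion was done; the sketch assumes a constant rate of progression over the year, which is one standard conversion that reproduces the quoted value. The function names are illustrative.

```python
def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def rescale_probability(p, from_months, to_months):
    """Convert a probability over one period to another period,
    assuming a constant underlying rate (an assumption, not from the text)."""
    return 1.0 - (1.0 - p) ** (to_months / from_months)


annual = beta_mean(0.3, 2.7)                    # mean of Beta(0.3, 2.7)
six_month = rescale_probability(annual, 12, 6)  # 1 - sqrt(1 - annual)
print(f"annual mean: {annual:.3f}, six-month equivalent: {six_month:.4f}")
```

Simply halving the annual probability would give 5.0%; the constant-rate conversion gives a slightly larger value because progression in the second half of the year can only occur among patients who did not progress in the first half.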

      Practical Issues Associated with the Elicitation Protocol

      Piloting of the protocol was an extremely important component of this project, leading to several changes in approach and indeed delaying the entire project by several months to ensure that the Internet-based workshop was as valuable as possible. Nevertheless, the rejection of tertiles in favor of 95% CrIs may have resulted in loss of precision: eliciting tertiles requires the participant to consider relative odds explicitly, whereas eliciting 95% CrIs requires the participant to consider “almost certainty,” which is somewhat vague. We also relied on verbal confirmation that experts had fully understood the nature of the task, were willing to honestly report their subjective uncertainty, and in particular understood the difference between means and medians. Furthermore, in retrospect it may have been preferable to request estimates of prognosis over a 1- or 2-year time horizon rather than 6 months.
      Hosting the sessions as online webinars most likely increased the overall response rate, allowing scheduling to fit around the diaries of the participants. Nevertheless, we were able to elicit responses from only 14 participants (7 UK and 7 ANZ) across eight facilitated sessions, although this may be a function of the relative rarity of expertise in the area. Repeating the webinar so many times also required a substantial number of facilitator hours, and there was a risk of inconsistency between webinars leading to systematically different results. The effort could have been eased, and the consistency issues partially addressed, by prerecording the presentation component and following it with an opportunity for questions, although this would reduce the interactivity of the session.
      We had originally intended to use a bespoke Web-based tool for participants to record their answers. This would be platform-independent and would automatically upload answers to a study database. Nevertheless, coding such a platform proved troublesome and we found it expedient to use a macro-enabled spreadsheet written in Microsoft Excel. This led to its own problems, with participants requiring appropriate security settings on their computers to allow macros to run. In particular, the macros would run only on Windows PCs, not Macs. Participants with Macs therefore had to source Windows PCs to take part in the study.
      There was also a lack of clarity, and apparent inconsistency between participants, regarding the “dead” state. The facilitator intended this to be specifically melanoma mortality because the decision model into which the results will be entered already includes background mortality. This, however, was not made explicit to the participants. Under the intended definition, a nonzero probability of death from an earlier stage implies that the participant believes a patient could progress through all stages of the disease and die within those 6 months. This is unlikely from stage 1A, and therefore the two participants who allowed nonzero values for death here may have been considering background mortality. Making this explicit is a lesson for future studies of this nature.

      Justification for Seeking Expert Opinion to Inform Parameter Estimates

      The response of participants to the use of expert elicitation in decision making varied widely, ranging from complete acceptance to extreme skepticism, with only 57% considering this a valid approach. It is therefore important to consider the alternative: decisions to adopt or reject new technologies must be made irrespective of the evidence available at the time. A decision to remain with the status quo pending further evidence is still a decision not to adopt and so risks an opportunity loss. The purpose of a decision model is to assemble all available evidence, critically appraise it, and structure it in a way that assists with a decision, for example, short-term effectiveness from an RCT combined with epidemiological data on long-term progression. There will always be gaps in this evidence, or areas where the “best available” data may fall short of the “best conceivable.” Decision modeling highlights these gaps, and expert elicitation provides one means of plugging them where better quality evidence either does not yet exist or cannot exist/be collected (e.g., because of physical impossibility or ethical concerns as in the melanoma example presented here). The alternative to this is subjective, informal discussion among decision makers. A structured consultation process with relevant “experts” focused on carefully eliciting their epistemic uncertainty may be considered superior a priori. It should be noted that when data can be, and are, subsequently collected, the model should be updated accordingly. When those data cannot be collected, the best that can be achieved is to present the results of the elicitation process fully and transparently, allowing readers to decide whether they feel the values are plausible. If so, then all else being equal, the results generated from a decision model using those values must also be plausible.

      Conclusions

      We developed an online structured process that succeeded in eliciting and fitting a multinomial distribution, representing an aggregation of experts’ beliefs about the risk of progression of untreated melanoma from one stage to another. The parameters of the overall distribution have been inserted into a decision model to estimate the cost-effectiveness of various screening and monitoring strategies to identify those at high risk of melanoma in a UK setting [Wilson E, Usher-Smith J, Emery J, Corrie P, Walter F. A modelling study of the cost-effectiveness of a risk-stratified surveillance programme for melanoma in the UK. Value in Health (In Press)]. Critically, the uncertainty in the experts’ beliefs about the rates of progression has been captured and, when combined with uncertainty in other parameters, translated into decision uncertainty as to which strategies are likely to be the most cost-effective.

      Acknowledgments

      We thank the experts in melanoma for taking part in this study and providing their most valuable feedback. We also thank members of the MelaTools Steering Committee for their help and comments in the development of this work (www.melatools.org/team.html), James Brimicombe (IT Manager, Primary Care Unit, University of Cambridge) for assistance and advice with technical aspects of the project, and Becky Lantaff (Research Assistant, Primary Care Unit, University of Cambridge) for coordinating webinars and participants.
      Source of financial support: This work is an unfunded extension to the MelaTools program, which provided participant honoraria. The MelaTools program was funded by a National Institute for Health Research (NIHR) Clinician Scientist Award (RG 68235). E. Wilson was funded by the NIHR Cambridge Biomedical Research Centre. J. Usher-Smith was funded by an NIHR Clinical Lectureship. J. Emery was funded by a National Health and Medical Research Council Practitioner Fellowship. Views expressed in this publication are those of the authors and not necessarily those of the National Health Service, the NIHR, or the Department of Health.

      References

        Sackett D.L., Rosenberg W.M., Gray J.A., et al. Evidence based medicine: What it is and what it isn’t. BMJ. 1996; 312: 71-72.
        Drummond M., Sculpher M., Claxton K., et al. Methods for the Economic Evaluation of Health Care Programmes. 4th ed. Oxford University Press, Oxford, UK; 2015.
        Briggs A.H., Sculpher M., Claxton K. Decision Modelling for Health Economic Evaluation. Oxford University Press, Oxford, UK; 2006.
        Williams L.H., Shors A.R., Barlow W.E., et al. Identifying persons at highest risk of melanoma using self-assessed risk factors. J Clin Exp Dermatol Res. 2011; 2 (pii: 1000129).
        Wilson E., Usher-Smith J., Emery J., Corrie P., Walter F. A modelling study of the cost-effectiveness of a risk-stratified surveillance programme for melanoma in the UK. Value in Health [In Press].
        Balch C.M., Gershenwald J.E., Soong S.J., et al. Final version of 2009 AJCC melanoma staging and classification. J Clin Oncol. 2009; 27: 6199-6206.
        Liu V., Mihm M.C. Pathology of malignant melanoma. Surg Clin North Am. 2003; 83 (v): 31-60.
        Medical Research Council, NHS Health Research Authority. Do I need NHS REC approval? Decision tool. Available from: http://www.hra-decisiontools.org.uk/ethics/. [Accessed November 21, 2016].
        Connor R.J., Mosimann J.E. Concepts of independence for proportions with a generalization of the Dirichlet distribution. J Am Stat Assoc. 1969; 64: 194-206.
        Elfadaly F., Garthwaite P. Eliciting Dirichlet and Connor–Mosimann prior distributions for multinomial models. Test. 2013; 22: 628-646.
        Wilson E. modcmfitr: Fit a Modified Connor-Mosimann Distribution to Elicited Quantiles in Multinomial Problems. R Package. 2017. Available from: https://CRAN.R-project.org/package=modcmfitr.
        R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2016.
        Oakley J.E., O’Hagan A. SHELF: The Sheffield Elicitation Framework (Version 2.0). School of Mathematics and Statistics, University of Sheffield, Sheffield, UK; 2010.
        Iglesias C.P., Thompson A., Rogowski W.H., et al. Reporting guidelines for the use of expert judgement in model-based economic evaluations. Pharmacoeconomics. 2016; 34: 1161-1172.
        O’Hagan A., Buck C.E., Daneshkhah A., et al. Uncertain Judgements: Eliciting Experts’ Probabilities. Wiley, Chichester, UK; 2006.
        Zapata-Vázquez R.E., O’Hagan A., Soares Bastos L. Eliciting expert judgements about a set of proportions. J Appl Stat. 2014; 41: 1919-1933.
        Soares M.O., Bojke L., Dumville J., et al. Methods to elicit experts’ beliefs over uncertain quantities: application to a cost effectiveness transition model of negative pressure wound therapy for severe pressure ulceration. Stat Med. 2011; 30: 2363-2380.
        Losina E., Walensky R.P., Geller A., et al. Visual screening for malignant melanoma: a cost-effectiveness analysis. Arch Dermatol. 2007; 143: 21-28.
        Wilson E.C., Emery J.D., Kinmonth A.L., et al. The cost-effectiveness of a novel SIAscopic diagnostic aid for the management of pigmented skin lesions in primary care: a decision-analytic model. Value Health. 2013; 16: 356-366.