Systematic Review of Cost-Effectiveness Models in Prostate Cancer: Exploring New Developments in Testing and Diagnosis

Objectives: Recent innovations in prostate cancer diagnosis include new biomarkers and more accurate biopsy methods. This study assesses the evidence base on cost-effectiveness of these developments (eg, Prostate Health Index and magnetic resonance imaging [MRI]-guided biopsy) and identifies areas of improvement for future cost-effectiveness models.
Methods: A systematic review using the National Health Service Economic Evaluation Database, MEDLINE, Embase, Health Technology Assessment databases, National Institute for Health and Care Excellence guidelines, and United Kingdom National Screening Committee guidance was performed between 2009 and 2021. Relevant data were extracted on study type, model inputs, modeling methods and cost-effectiveness conclusions, and results narratively synthesized.
Results: A total of 22 model-based economic evaluations were included. A total of 11 compared the cost-effectiveness of new biomarkers to prostate-specific antigen testing alone and all found biomarkers to be cost saving. A total of 8 compared MRI-guided biopsy methods to transrectal ultrasound-guided methods and found MRI-guided methods to be most cost-effective. Newer detection methods showed a reduction in unnecessary biopsies and overtreatment. The most cost-effective follow-up strategy in men with a negative initial biopsy was uncertain. Many studies did not model for stage or grade of cancer, cancer progression, or the entire testing and treatment pathway. Few fully accounted for uncertainty.
Conclusions: This review brings together the cost-effectiveness literature for novel diagnostic methods in prostate cancer, showing that most studies have found new methods to be more cost-effective than standard of care. Several limitations of the models were identified, however, limiting the reliability of the results. Areas for further development include accurately modeling the impact of early diagnostic tests on long-term outcomes of prostate cancer and fully accounting for uncertainty.


Introduction
Prostate cancer is the second most commonly occurring cancer in men worldwide and the fourth most commonly occurring cancer overall. 1 Detection of early disease has historically been achieved using a prostate-specific antigen (PSA) blood test followed by transrectal ultrasound (TRUS)-guided biopsy. Nevertheless, PSA is not a specific marker for prostate cancer, and TRUS-guided prostate biopsy is associated with infection and other adverse effects and can lead to false negative results in up to 25% of cases. 2,3 Therefore, current diagnostic methods lead to overdetection of cancers that may not progress to become clinically important in a man's lifetime, but can also miss aggressive, potentially fatal prostate cancer. 4,5 Overdetection can have a significant effect on the quality of life (QOL) of the men affected owing to the adverse effects associated with testing and unnecessary treatment. 6 It is also a poor use of limited healthcare resources. In the absence of robust evidence, current UK policy does not advocate population screening. Several large trials, including the European Randomised Study of Screening for Prostate Cancer (ERSPC), 7 the US Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, 8 and the UK Cluster Randomized Trial of PSA Testing for Prostate Cancer, 5 have found limited mortality benefit of PSA-based screening when considered overall. 9 Therefore, as it stands, the benefits of screening seem insufficient to outweigh the potential harms of overtreatment.
Recent years have seen the development of biomarker tests to complement PSA-based testing, for example, the Prostate Health Index (PHI), 4Kscore, SelectMDx, and PCA3. These tests act as additional reflex tests to aid the decision about when a man should be referred for prostate biopsy. Multiparametric magnetic resonance imaging (MRI) is another recent development that, when used as a triage test after PSA or other biomarker testing, might allow men with no or likely indolent cancer to avoid unnecessary biopsy and improve diagnostic accuracy with respect to more aggressive disease. 10,11 Therefore, there is potential for a reduction in overdiagnosis and higher specificity for potentially lethal cancer. 4 Nevertheless, it is not yet clear whether these new developments should be implemented either individually or in combination with one another at a national level within a screening program.
As innovations that aim to address the overdiagnosis associated with prostate cancer screening become available, healthcare policy makers must make informed decisions regarding their use in national screening strategies. As such, it is essential to establish the cost-effectiveness of these developments and their combinations, to make rational decisions about the allocation of limited healthcare resources.
This systematic review aimed to identify published economic models assessing the impact of these innovations on the costs and outcomes of prostate cancer diagnosis. The population of interest was men at risk of developing prostate cancer, the interventions reviewed were novel biomarkers and MRI-guided biopsy techniques as prostate cancer diagnostic tools, the alternatives against which the interventions were compared were standard diagnostic tools such as the PSA test, TRUS-guided biopsy, or no intervention, and the outcome considered was the cost-effectiveness of these interventions in comparison with each other. This review also determines the current evidence base and provides an overview of model characteristics. It provides information on novel tests, how they have been modeled, and the data available to populate such models, which will assist the development of new cost-effectiveness models in prostate cancer screening. It assesses the limitations of available models, highlighting ways in which a future model may improve on these, and provides overall conclusions on the cost-effectiveness of these new diagnostic tools.

Study Selection
Study selection proceeded from title/abstract screening against the eligibility criteria through full-text review to data extraction. E.K. was involved at all stages. A second reviewer (J.M.) independently screened 10% of the titles and abstracts and performed data extraction on 20% of the included studies. Studies were categorized according to model-based economic evaluations of new (1) biomarkers/tests/risk models for screening in prostate cancer, (2) biopsy methods for definitive diagnosis after an initial triage screening test in prostate cancer, and (3) follow-up testing and diagnostic strategies for men initially found to have no or low-risk prostate cancer.

Search Strategy
In April 2021, studies were identified by searching the National Health Service Economic Evaluation Database (2009-2014), MEDLINE, Embase, Health Technology Assessment databases, National Institute for Health and Care Excellence (NICE) guidelines, UK National Screening Committee guidance, and reference lists from relevant studies. The review was restricted to evidence from January 2009 onward to reflect current practice in screening and testing for prostate cancer and because the aim was to identify novel tests in prostate cancer diagnosis. Search terms included free text and medical subject headings terms (Appendix 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.07.002). The search was limited to English language publications.

Eligibility Criteria
Studies were included if they were model-based economic evaluations of screening or diagnostic strategies for prostate cancer beyond the standard PSA test plus TRUS-guided biopsy. Cost-effectiveness, cost-utility, cost-consequence, and cost-benefit analyses were considered. Models could use primary data from a trial or secondary data from the literature. They could compare any novel test or diagnostic strategy for diagnosing or ruling out prostate cancer or any subsequent follow-up regime (aside from PSA testing) when prostate cancer has not been identified at initial biopsy. Models from any country or type of health system were considered.

Data Extraction
Data extraction forms were developed and pilot-tested on a random sample (5%) of included studies and refined accordingly. The data extraction form is shown in Appendix 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.07.002.
Information extracted from each study included context (ie, perspective and country), characteristics of the tests compared (ie, frequency of testing and threshold for a positive result), the population the strategy was applied to (ie, screening start and stop age and the prevalence of prostate cancer), outcome measures (eg, cost per quality-adjusted life-year [QALY] gained), and cost-effectiveness result. Information was also extracted on characteristics of the model including model type (eg, decision tree, Markov model) and structure (how clinical pathways are represented), sensitivity analyses (including the extent to which uncertainty in the cost-effectiveness result had been quantified), the source of evidence for utility values assigned to health states and costs included in the model, and the source of evidence for accuracy of tests.

Quality Assessment
The purpose of the review was to determine the current evidence base and provide an overview of the characteristics of available models. Therefore, a formal quality checklist was not used to exclude studies from the review. Nevertheless, existing economic evaluation checklists were used as a guide to reporting the studies. 12,13 The quality of the included economic evaluations was assessed using the Consolidated Health Economic Evaluation Reporting Standards checklist. 14 A score of 0, 1, or 2 was allocated for each criterion corresponding to a decision of criterion not met, criterion met, or criterion not applicable. Risk of bias was assessed using the Bias in Economic Evaluation checklist. 15 Every item was rated as yes, no, partly, unclear, or not applicable.
The review follows the reporting standards for reviews of economic evaluations. 16-18
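The checklist scoring described above (0 for criterion not met, 1 for criterion met, 2 for criterion not applicable) lends itself to a simple calculation of the percentage of applicable criteria met by a study. The following is a minimal sketch of that arithmetic only; the function name and the example scores are hypothetical and are not data from the review.

```python
# Score codes as described in the text: 0 = not met, 1 = met, 2 = not applicable.
NOT_MET, MET, NOT_APPLICABLE = 0, 1, 2


def percent_applicable_met(scores):
    """Return the percentage of applicable checklist criteria that were met.

    Criteria scored as not applicable are excluded from the denominator.
    """
    applicable = [s for s in scores if s != NOT_APPLICABLE]
    if not applicable:
        raise ValueError("no applicable criteria to score")
    met = sum(1 for s in applicable if s == MET)
    return 100.0 * met / len(applicable)


# Hypothetical scores for three illustrative studies (not from the review).
example_scores = {
    "study_a": [1, 1, 0, 2, 1],
    "study_b": [1, 0, 0, 1, 2],
    "study_c": [1, 1, 1, 1, 1],
}
percentages = {name: percent_applicable_met(s) for name, s in example_scores.items()}
```

Excluding the not applicable items from the denominator means that a study is not penalized for criteria that could not apply to its design.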

Results
In total, 1075 studies were identified. Most studies were excluded at the abstract stage because they were not model-based economic evaluations or did not compare tests for diagnosing prostate cancer. After removing duplicates and checking for eligibility, 55 full-text articles were retrieved (Fig. 1). Of the 55 full-text articles, 22 studies were included in the review. A total of 16 articles were excluded because these were conference abstracts and the rest were excluded because they (1) were background articles rather than original studies, (2) had the wrong study design, for example, cost-comparison rather than cost-effectiveness analyses, or (3) had the wrong population, for example, men with biochemical recurrence after radical prostatectomy.

Study Type
Of the 22 studies, 11 compared the cost-effectiveness of new urinary or blood biomarkers to each other or to the standard of care (a PSA test alone) (Table 1). Another 8 studies compared different approaches to prostate biopsy. A total of 3 studies compared follow-up strategies in men who have a negative initial biopsy result. The studies were based in the United States (n = 6), United Kingdom (n = 6), The Netherlands (n = 4), Hong Kong (n = 1), Germany (n = 1), China (n = 1), Sweden (n = 1), and Canada (n = 1). One study compared results for France, Germany, Spain, and Italy. 19 All but 3 studies 20-22 performed a cost-utility analysis where outcomes were measured in QALYs. The other 3 were cost-consequence analyses reporting the number of tests and biopsies performed and expected overall diagnostic costs. 20,21

Strategies compared-biomarkers
The novel diagnostic strategies that were compared with PSA-based testing alone included PHI, 20,22-24 PCA3, SelectMDx, 25,26 Stockholm3, 27 and urinary proteome analysis. 21 The definitions of these biomarker tests are given in Appendix Table S1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.07.002. Most studies considered only 1 novel test, except Sathianathen et al 28 who compared PHI, the 4Kscore, ExoDx Prostate(IntelliScore), and SelectMDx. 22,28 Of the 11 studies comparing different biomarkers, 9 referred to TRUS-guided biopsy to confirm diagnosis, 1 to multiparametric MRI, 22 and 1 did not report the biopsy method assumed. 23 Only 3 studies 23,24,27 that compared biomarkers modeled repeat PSA/biomarker testing, assuming annual 24 or 4-yearly screening. 23,27 These intervals were chosen in accordance with the American Urological Association 2009 recommendations (annual screening for men aged 40 years and older with shared decision making) 29 and the screening protocol used in ERSPC (4 yearly). 7

Strategies compared-biopsy methods
Men with a suspicion of prostate cancer indicated by a PSA test or other biomarker are generally referred for a TRUS-guided biopsy. The biopsy methods compared in the 7 studies identified included MRI-targeted methods and template mapping biopsy. 30 The definitions of biopsy methods are given in Appendix Table S2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.07.002. Different strategies were compared, including using MRI to decide whether a TRUS-guided biopsy is necessary and to target biopsy, and strategies starting with TRUS-guided biopsy and using MRI to decide whether a repeat biopsy is necessary. A total of 3 of the studies comparing biopsy methods 31-33 modeled repeat screening, assuming that men would be screened every 2 years based on the 2013 American Urological Association guideline 34 or every 4 years based on the ERSPC protocol. 7

Strategies compared-follow-up strategies in men with negative biopsies
A total of 3 studies 35-37 compared follow-up strategies for men with raised PSA and negative MRI, negative prostate biopsy, or negative MRI and negative biopsy. The strategies included various biomarkers (PSA, PSA velocity, PSA density, % free PSA, PSA doubling time, PSA density in transition zone, PCA3, PHI) and MRI techniques.

Accuracy data
All but 6 studies 24,27,33,36,38,39 explicitly reported the sensitivity and specificity of the tests. The assumed sensitivity of a standard biopsy ranged from 0.9 based on ERSPC data 23,40 to 0.46 based on de Rooij et al. 28,41,42 The biomarkers were generally assumed to be either particularly sensitive, that is, good at correctly identifying those with the disease, or particularly specific, that is, good at correctly identifying those without the disease. PHI at a threshold of 20, for example, had the highest reported sensitivity (1, but specificity of 0.08) and also the highest reported specificity (0.974, but sensitivity of 0.129). 20,43 The MRI-targeted biopsy methods generally had a better balance of sensitivity and specificity, ranging from a sensitivity of 0.965 (specificity of 0.597) for MRI using a Prostate Imaging-Reporting and Data System threshold of ≥3 32,44 to 0.770 (specificity 0.68) using fusion biopsy. 28,45,46 Appendix Table S3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2021.07.002 details the accuracy estimates used along with their evidence sources.

Quality of life
As detailed in Table 2, all but 3 studies assigned disutilities to various aspects associated with testing including screening attendance, the biopsy procedure, diagnosis of cancer, treatment, active surveillance, advanced or metastatic cancer, posttreatment or recovery, adverse events associated with biopsy and treatment, and palliative therapy. A total of 9 studies 6,19,23,25,26,32,33,36,46 sourced all utility estimates used in their model from Heijnsdijk et al. 6

Resource use
Most studies took a healthcare provider perspective for the analysis (only including costs incurred to the provider rather than any wider patient or societal costs). A total of 2 studies stated that a societal perspective was taken but did not detail the societal costs that were included. 23,24 Another 2 studies included productivity costs in terms of missed days of work when a patient undergoes a test or treatment. 27,42 No study gave a justification for the perspective taken. The main costs included were the cost of testing, biopsy, and subsequent management strategy. Thirteen studies included costs of complications arising from biopsy. 19 19,25,26,30,35,37,46

Modeling Methods
Table 3 details model characteristics including model type, time horizon, and cycle length. A total of 8 combined decision tree/Markov cohort models were identified. In 5 of these, the decision tree reflected the diagnostic process and the Markov model reflected treatment. 26,30,39,41 In the others, the decision tree captured both diagnosis and treatment and the Markov model was used for posttreatment states. 25,28,46 The treatment allocation assumed in the studies that modeled this is shown in Appendix Table S4 23 and the Prostata model, 27 and 4 decision tree models 20,22,36,42 were also identified. No study provided a justification for choosing one model type over another.

Model type
The decision trees generally used data on disease prevalence and accuracy of the tests to categorize men into true positives, false positives, true negatives, and false negatives 20,25,46 with some also incorporating the clinical significance of cancer. 25,26,30,41,42 The Markov models captured cancer progression and survival. All but 4 studies developed a de novo model. 23,27,32,33 Cycle length varied from 3 months to 1 year. The only study that reported a justification for the cycle length chosen was the NICE guideline, which stated that the guideline development committee confirmed that a cycle length of 3 months is sufficient to reflect possible clinical events a person with prostate cancer may experience. 35
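The first step of such a decision tree can be sketched as follows: a cohort is split by disease prevalence and then by test accuracy into true positives, false negatives, true negatives, and false positives. This is a minimal illustration and not the structure of any included model; the function name, the cohort size, and the prevalence of 0.25 are assumptions, whereas the sensitivity and specificity are the MRI values quoted in the accuracy data above.

```python
def classify_cohort(n_men, prevalence, sensitivity, specificity):
    """Split a cohort into test outcome groups using prevalence and test accuracy."""
    with_cancer = n_men * prevalence
    without_cancer = n_men - with_cancer

    true_positives = with_cancer * sensitivity        # cancer present, test positive
    false_negatives = with_cancer - true_positives    # cancer present, test negative
    true_negatives = without_cancer * specificity     # no cancer, test negative
    false_positives = without_cancer - true_negatives  # no cancer, test positive

    return {
        "TP": true_positives,
        "FN": false_negatives,
        "TN": true_negatives,
        "FP": false_positives,
    }


# Hypothetical cohort of 1000 men with an assumed prevalence of 0.25.
groups = classify_cohort(n_men=1000, prevalence=0.25,
                         sensitivity=0.965, specificity=0.597)
```

In a full model each of the four groups would then be assigned the costs and health outcomes of the pathway that follows, for example a biopsy for every positive result and a delayed diagnosis for false negatives.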

Sensitivity analyses
All studies conducted a deterministic sensitivity analysis where input parameters or sets of parameters were varied to see the impact on results. Half (11 of 22) of the studies also performed a probabilistic sensitivity analysis where repeated simulations sampled all parameters from their respective distributions to

Model structure
The structure of a model relates to how different health states are categorized and how patients move between health states. Related to this, the natural history of a disease refers to how a disease progresses in a person over time in the absence of treatment. 70 Only 7 of the included models 23,27,30-32,35,37 took account of how prostate cancer progresses through different health states and how the introduction of a new test might affect this, and all of these captured this progression differently. The health states included in the models are shown in Table 3 19,25,26,33,38,39,41,46 with no progression through health states. A total of 4 did not model beyond diagnosis. 14,16,22 In addition, the definition of clinically significant cancer varied across studies (Table 3). Of all the models, 6 did not consider stages or grade of cancer, only the presence or absence of cancer. 24,26,28,33,39,46

Reporting of overdiagnosis and mechanism of screening benefit
Overdiagnosis and overtreatment owing to the identification of cancers that would never progress to cause prostate cancer-related death or illness in a man's lifetime are key factors to consider when testing men for prostate cancer. Only 3 studies 23,27,33 provided estimates of the impact of screening on overdiagnosis. Both Heijnsdijk et al 23 77 defined as the probability that a PSA-detected case would have taken longer than the remaining lifetime to progress to clinical cancer. They found that MRI-first risk-stratified screening was associated with a 10.4% to 72.6% lower probability of overdiagnosis in screen-detected cases, depending on the 10-year absolute risk thresholds at which individuals were eligible for screening.
In addition, different approaches to measuring the benefit of screening in nonoverdiagnosed men are possible and the choice of method may affect results. Stage-shift screening models assume that the benefit associated with screening is due to a shift to a less advanced stage at diagnosis resulting in improved survival. Cure models assume that if cancers are detected earlier they can be treated and that curative treatment has the potential to prevent cancer-specific mortality. 78 Only 1 study, Heijnsdijk et al, 23 explicitly stated that the assumed mechanism of benefit of screening in their model was a cure proportion, which assumes that a percentage of men are cured owing to screening and therefore avoid a death from prostate cancer. The other studies neither considered overdiagnosis nor gave any detail on the assumed mechanism of benefit of screening.
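The idea of model structure discussed in this section, a set of health states plus rules for moving between them, can be illustrated with a small Markov cohort trace. The sketch below is purely illustrative: the three states, the 3-month cycle, and every transition probability are hypothetical assumptions and are not taken from any included model.

```python
# Hypothetical health states for an illustrative natural history model.
STATES = ["localized", "advanced", "dead"]

# Hypothetical transition probabilities per 3-month cycle; each row sums to 1.
# Row i gives the probabilities of moving from STATES[i] to each state.
TRANSITIONS = [
    [0.97, 0.02, 0.01],  # from localized
    [0.00, 0.90, 0.10],  # from advanced
    [0.00, 0.00, 1.00],  # dead is an absorbing state
]


def run_cohort(start, transitions, n_cycles):
    """Return the state occupancy of a cohort at each cycle (the Markov trace)."""
    trace = [list(start)]
    n_states = len(start)
    for _ in range(n_cycles):
        current = trace[-1]
        # Occupancy of state j next cycle = sum over i of occupancy_i * P(i -> j).
        nxt = [
            sum(current[i] * transitions[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        trace.append(nxt)
    return trace


# The whole cohort starts in the localized state; 40 cycles of 3 months is 10 years.
trace = run_cohort(start=[1.0, 0.0, 0.0], transitions=TRANSITIONS, n_cycles=40)
```

A model that records only the presence or absence of cancer has no equivalent of the progression from the first state to the second, which is why it cannot represent the effect of earlier diagnosis on later outcomes.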

Cost-Effectiveness Results
To aid comparison, all reported costs were inflated to the 2020 price year and converted to US dollars, taking purchasing power parities between countries into account. This was done using the web-based tool developed by the Campbell and Cochrane Economics Methods Group and the Evidence for Policy and Practice Information and Coordinating Center. 79 In reality, the costs are not comparable because different countries have different healthcare systems, care pathways, and negotiated prices. Therefore, original costs are also reported.
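The two-step adjustment described above, inflating a cost to the target price year and then converting it to US dollars using purchasing power parities, can be sketched as follows. This is an illustration of the arithmetic under stated assumptions and not the implementation of the web-based tool; the function name, the price index values, and the purchasing power parity rates are all hypothetical.

```python
def convert_cost(cost, index_source_year, index_target_year, ppp_source, ppp_target=1.0):
    """Inflate a cost to the target price year, then convert it using PPP rates.

    index_source_year, index_target_year: price index of the source country in the
        original price year and in the target price year (hypothetical values).
    ppp_source, ppp_target: purchasing power parity rates, expressed as local
        currency units per international dollar (1.0 for US dollars).
    """
    inflated = cost * index_target_year / index_source_year
    return inflated * ppp_target / ppp_source


# Hypothetical example: a cost of 1000 euros reported in an earlier price year,
# with an assumed price index rising from 100 to 108 and an assumed PPP rate of 0.8.
cost_2020_usd = convert_cost(1000.0, index_source_year=100.0,
                             index_target_year=108.0, ppp_source=0.8)
```

Under these assumed inputs the inflation step gives 1080 euros and the conversion step divides by 0.8. As the text notes, such adjusted figures aid comparison but do not make costs from different healthcare systems truly comparable.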

Biomarkers
Of the 11 studies that compared PSA testing with testing with a new biomarker, 6 studies found that introducing the new biomarker saves costs and increases QALYs 19,24-26,28,39 (Table 4). A total of 3 did not measure QALYs but found that diagnostic costs were reduced, 20-22 and one found that the introduction of a new test increased both costs and QALYs. 27 Of the studies that considered progression through stages or grades of cancer, Heijnsdijk et al 23 found that PSA + PHI testing saves costs compared with PSA testing alone and results in the same QALYs 23 and Karlsson et al 27 estimated an incremental cost-effectiveness ratio (ICER) of €5663 for screening using Stockholm3 when PSA values were above 2 ng/mL compared with PSA alone. The results from all studies were generally driven by a decrease in negative biopsies.

Biopsy methods
A total of 7 of the 8 studies that compared MRI-guided biopsy strategies to each other and to TRUS-guided biopsy found at least 1 MRI-guided strategy to be cost-effective (increased costs but also increased QALYs). The exception was Cerantola et al 38 who found that MRI-guided biopsy dominated TRUS-guided biopsy (reduced costs and increased QALYs). ICERs for MRI-guided biopsy methods compared with standard methods ranged from €323 per QALY in a study conducted from the perspective of The Netherlands 41 to $35108 per QALY in a US study, 31,32 both indicating cost-effectiveness according to the generally accepted cost-effectiveness thresholds in the respective countries. 80,81 The increased QALYs and reduced costs were generally owing to an avoidance of the adverse effects and resource use associated with overdiagnosis.
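The distinction drawn above between a strategy that is cost-effective (higher costs and higher QALYs, with an ICER below the threshold) and one that dominates (lower costs and higher QALYs) can be expressed as a short decision rule. The sketch below is illustrative; the function name and all cost, QALY, and threshold values in the example are hypothetical and are not results from the included studies.

```python
def compare_strategies(cost_new, qaly_new, cost_old, qaly_old, threshold):
    """Classify a new strategy against a comparator and return (verdict, ICER).

    The ICER is incremental cost divided by incremental QALYs. It is returned as
    None when one strategy dominates the other or when the two are equivalent.
    """
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old

    if d_cost == 0 and d_qaly == 0:
        return ("equivalent", None)
    if d_cost <= 0 and d_qaly >= 0:
        return ("dominant", None)      # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return ("dominated", None)     # no less costly and no more effective

    icer = d_cost / d_qaly
    if d_qaly > 0:
        # More costly and more effective: acceptable if the ICER is within the threshold.
        verdict = "cost-effective" if icer <= threshold else "not cost-effective"
    else:
        # Less costly and less effective: acceptable if savings per QALY lost are large enough.
        verdict = "cost-effective" if icer >= threshold else "not cost-effective"
    return (verdict, icer)


# Hypothetical example: the new strategy costs 200 more and yields 0.1 more QALYs.
verdict, icer = compare_strategies(1200.0, 10.1, 1000.0, 10.0, threshold=30000.0)
```

With these assumed inputs the ICER is approximately 2000 per QALY gained, well within the assumed threshold of 30 000.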

Follow-up strategies
A total of 2 of the studies comparing follow-up strategies in men with a previous negative biopsy did not identify a clear indication of cost-effectiveness for any strategy. The NICE guideline 35 concluded that PSA velocity, density, and % free PSA may be the best indicators to trigger further diagnostics in higher risk populations; however, the "no screening" strategy seemed optimal for the lowest-risk subpopulation who had MRI Likert scores of 1 or 2 (very unlikely/unlikely that the patient has prostate cancer that needs to be treated) and 2 previous negative biopsies. Nicholson et al 36 found no strategy to be cost-effective. Mowatt et al 37 found the base-case ICER for T2-MRI to be below the UK willingness to pay threshold (£30 000 per QALY) for all cohorts modeled.

Assessing uncertainty in cost-effectiveness results
A total of 5 studies found that the results were sensitive to the potential of the tests to identify cancer, particularly clinically significant cancer. 25,26,30,33,41,42 A total of 3 studies found results to be sensitive to the assumed prevalence of cancer and significant cancer. 41,42,46 The cost of the tests was also stated as an important factor in 5 of the studies. 20,27,28,30,46 Furthermore, studies found results to be sensitive to probabilities of cancer progression in undiagnosed cases, 31,32,35 increasing or decreasing survival rates in men treated for prostate cancer, 35,46 and QOL values used for diagnosed cancer states. 27,31,32,37 For example, the NICE guideline 35 stated that increasing the survival rate resulted in the strategy where all men receive an immediate TRUS and no subsequent follow-up to be optimal in the majority of subpopulations. In contrast, Venderink et al 46 found that if the yearly survival rate among patients with treated clinically significant prostate cancer were to decrease from 98.6% to 93.2%, TRUS-guided biopsy would be the most cost-effective strategy. Mowatt et al 37 assessed the impact of applying a utility decrement of 0.035 (half of the disutility associated with having moderate anxiety rather than no anxiety on the EQ-5D) to patients with undiagnosed cancer, to reflect potential disutility from increased anxiety associated with having a high PSA but no diagnosis. This resulted in systematic TRUS being the most cost-effective intervention.
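Where parameter uncertainty is assessed probabilistically, inputs are sampled repeatedly from distributions and the cost-effectiveness result is recomputed for each sample. The sketch below illustrates the idea in a deliberately simplified form: it samples incremental QALYs and incremental costs directly, rather than the underlying model parameters, and reports the proportion of simulations in which the incremental net monetary benefit is positive. The function name, the distributions, and all numerical values are hypothetical assumptions.

```python
import random


def probability_cost_effective(n_sims, threshold, seed=0):
    """Estimate the probability that a new strategy is cost-effective.

    Each simulation draws hypothetical incremental QALYs and costs and checks
    whether the incremental net monetary benefit,
    threshold * incremental QALYs - incremental cost, is positive.
    """
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    favourable = 0
    for _ in range(n_sims):
        d_qaly = rng.gauss(0.05, 0.03)          # assumed normal: mean 0.05, SD 0.03
        d_cost = rng.gammavariate(4.0, 100.0)   # assumed gamma: shape 4, scale 100
        if threshold * d_qaly - d_cost > 0:
            favourable += 1
    return favourable / n_sims


# Proportion of 5000 simulations favouring the new strategy at an assumed threshold.
p_at_threshold = probability_cost_effective(n_sims=5000, threshold=30000.0)
```

Repeating the calculation over a range of threshold values would trace out a cost-effectiveness acceptability curve. A full analysis would instead sample each model input, such as test sensitivity, utilities, and unit costs, and rerun the model for every draw.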

Quality of Included Studies
The overall mean percentage of the applicable Consolidated Health Economic Evaluation Reporting Standards criteria met by each study was calculated at 71%, with a range of 37% to 100% and a median of 68%. Only 1 study satisfied all applicable criteria (scoring 100%) (Appendix Table S5

Discussion
This review aimed to identify economic models evaluating new diagnostic tests for prostate cancer, determine the evidence base and cost-effectiveness results, provide an overview of the characteristics of these models and their data sources to aid in the development of future cost-effectiveness models in this area, and assess the limitations of available models, providing guidance on future improvements.
A total of 22 studies were identified, all published between 2011 and 2021. A total of 11 compared the cost-effectiveness of new urinary or blood biomarkers with each other or with the standard of care (a PSA test). Another 8 compared different approaches to prostate biopsy and 3 compared follow-up strategies in men who have a negative initial biopsy result. Most models used either a combined decision tree/Markov or purely Markov model structure with only 7 modeling progression through stages or grades of cancer. Substantial variability was seen in the model pathways of prostate cancer natural history, the data sources used to inform progression, treatment allocation assumed for high- and low-risk cancers, disutility values assigned to health states, and the assumed accuracy of the tests. All but 1 study 36 found the introduction of these novel tests to be cost-effective; nevertheless, in some cases, the benefits may be overestimated because of a failure to take account of overdiagnosis and the natural history of the disease in untested men.

Limitations of Included Models
The studies identified had several key limitations. Although they compared novel tests to diagnose prostate cancer, many failed to take into account the complexity of the disease, including stage or grade of cancer and how cancer progresses in diagnosed and undiagnosed cases. This calls into question the reliability of the results given that the cost-effectiveness of a new test may be overestimated if the cancers it identifies would never progress to cause symptoms or mortality if not identified. The purpose of screening and testing is to identify cancers at an early stage when they are more amenable to treatment. If cost-effectiveness models do not differentiate between cancer stages, it is difficult to measure the effects of early diagnosis. 13 Only half of the studies performed a probabilistic sensitivity analysis to fully account for the uncertainty in the model parameters. Of the 19 studies that included QALYs, 9 cited Heijnsdijk et al 6 as the source for their utility estimates; Heijnsdijk et al in turn obtained their estimates from studies in various countries and settings that used different evaluation techniques. Although this approach indicates that there is likely no alternative common source for these utility parameters, it is against best practice because the values cannot be considered to be equivalent when measured in different populations. 82 Notably, 13 of the 19 studies did not report uncertainty in their QALY estimates, suggesting that this uncertainty was not accounted for. 23,25,26,35,36,38,46 Given that QALY estimates can often have a substantial impact on the intervention considered most cost-effective, it is important that any underlying uncertainty in these estimates is fully accounted for.
A total of 3 studies used a time horizon of 3 years or less, modeling only up to biopsy, which is unlikely to be long enough to capture the impact of timely and accurate diagnosis of prostate cancer, because of its long-term nature. 20,21,36 Although most of the models represented the entire diagnostic pathway from test to treatment, the majority of these compared either new biomarkers or new MRI-guided biopsy methods with few comparing combinations of tests. Although both biomarker and imaging advancements are important, it seems worthwhile for them to be considered in combination given that this is how they may be used in practice. 83,84

Recommendations for Future Cost-Effectiveness Models
Any future model should consider the entire diagnostic pathway, which may include both new biomarkers and biopsy methods, to comprehensively assess the "true" cost-effectiveness of these tests within a diagnostic strategy for prostate cancer. When modeling the lifetime cost-effectiveness of a test to diagnose prostate cancer, it is important to consider the natural history of the disease and how a test may affect this, to ensure that the benefit of the test is accurately represented and overdiagnosis is considered. The studies identified in this review that modeled the natural history of prostate cancer all did so in different ways, suggesting a lack of clarity in the field. Any future model should consider this carefully with the help of clinical experts.
One potential approach to overcome future model disparity is comparative modeling, an approach taken by the Cancer Intervention and Surveillance Modeling Network, which uses statistical/simulation modeling to examine the impact of screening on cancer incidence and mortality. With comparative modeling, the same or a similar set of inputs is used across a range of models with all models then reporting the same intermediate and final outputs. 85 An approach such as this could be beneficial in the area of reflex testing in prostate cancer to enable a truer comparison between strategies.
Although a formal literature review to identify health state utility values has not been performed, the values used in previous models indicate a potential paucity of information on how prostate cancer treatment and adverse effects affect QOL. This should be considered carefully and uncertainty fully accounted for where it exists, given that this could greatly affect the results of a cost-utility model.

Strengths and Limitations of Review
The strength of this systematic review is that it has provided an overview of cost-effectiveness models published in the last 10 years, which have compared novel diagnostic methods in prostate cancer. It has offered insight into the data parameters that will be needed to populate a future cost-effectiveness model incorporating new tests and diagnostic strategies in prostate cancer and potential sources of information for these parameters. It has also highlighted the limitations of previous models. The results from the review have emphasized the importance of accurately estimating factors such as the sensitivity of tests, the prevalence of disease, and the progression of the disease.
A limitation is that this review cannot provide recommendations on the most cost-effective test or diagnostic strategy because the studies are too heterogeneous for the cost-effectiveness results to be compared. A further limitation is that, although the systematic review did not identify any relevant studies published between 2009 and 2011, the 2009 cutoff could potentially miss economic models of novel diagnostic methods published before 2009. Furthermore, 100% double screening and data extraction were not feasible because of a lack of resources. In line with guidelines, checks for accuracy were performed by comparing categorization of studies and data extraction with those of a second reviewer who independently screened 10% of randomly selected titles and abstracts and performed data extraction on 20% of the included studies. 16,17,86

Comparison With Previous Reviews
This is the first study to identify cost-effectiveness models focused on screening and diagnostic strategies beyond standard PSA-based testing. One recent systematic review assessed model-based economic evaluations of PSA-based screening strategies only. 82 This review also found a significant variation in model pathways to reflect cancer progression in the 10 included studies and limited and heterogeneous evidence on QOL. A total of 3 older reviews were also identified but all assessed PSA-based screening only. 87-89

Conclusion
The introduction of new biomarkers and MRI-guided biopsy methods in the studies identified in this review has been shown to lead to an improvement in health outcomes and a decrease or acceptable increase in costs. 20,25,26 Current concerns around implementing PSA-based prostate cancer screening strategies are due to overdiagnosis and overtreatment, 90 and these newer methods may lead to a reduction in these factors. This review has highlighted the substantial complexity involved in modeling the cost-effectiveness of diagnostic tests in prostate cancer to determine whether these strategies should be used at all and, if so, how and in what combination. To ensure the cost-effectiveness of any diagnostic strategy is assessed robustly, there is a need to ensure that disease progression in diagnosed and undiagnosed cases is accurately represented, uncertainty is fully accounted for, QOL estimates are measured as accurately as possible, and the possibility of repeat screening and testing in men with a negative diagnosis is considered.