Cost-Effectiveness Analysis of Treating Patients With NTRK-Positive Cancer With the Histology-Independent Therapy Entrectinib

Objectives: This study tackles several challenges of evaluating histology-independent treatments using entrectinib as an example. Histology-independent treatments are provided based on genetic marker(s) of tumors, regardless of the tumor type. We evaluated the lifetime cost-effectiveness of testing all patients for NTRK fusions and treating the positive cases with entrectinib compared with no testing and standard of care (SoC) for all patients. Methods: The health economic model consisted of a decision tree reflecting the NTRK testing phase followed by a microsimulation model reflecting treatment with either entrectinib or SoC. Efficacy of entrectinib was based on data from basket trials, whereas historical data from NTRK-negative patients were corrected for the prognostic value of NTRK fusions to model SoC. Results: "Testing" (testing for NTRK fusions, with subsequent entrectinib treatment in NTRK-positive patients and SoC in NTRK-negative patients) had higher per-patient quality-adjusted life-years (QALYs) and costs than "No testing" (SoC for all patients), with a difference of 0.0043 and €732, respectively. This corresponded to an incremental cost-effectiveness ratio (ICER) of €169 957/QALY and, using a cost-effectiveness threshold of €80 000/QALY, an incremental net monetary benefit of −€388. When excluding the costs of genetic testing for NTRK fusions, the ICER was reduced to €36 290/QALY and the incremental net monetary benefit increased to €188. Conclusions: When treatment requires the identification of a genetic marker, the associated costs and effects need to be accounted for. Because of the low prevalence of NTRK fusions, the number needed to test to identify patients eligible for entrectinib is large. Excluding the testing phase reduces the ICER substantially.


Introduction
Recently, the European Medicines Agency (EMA) approved the first histology-independent treatments, entrectinib and larotrectinib, for tumors with NTRK gene fusions. 1,2 Histology-independent (also called "tumor-agnostic") treatments are prescribed based on a genetic marker of the tumor, whereas most other oncology treatments are prescribed based on the tumor type. Evaluating the efficacy and cost-effectiveness of existing histology-independent treatments has proven challenging for various reasons. 3 First, clinical trials for entrectinib and larotrectinib were basket trials where patients with different tumor types were pooled together. [4][5][6] Because of the small number of patients per tumor type, tumor-specific effectiveness was not provided and marketing authorization for the pharmaceuticals was granted for all NTRK-positive (NTRK+) tumors, assuming similar efficacy across tumor types. Nevertheless, there may be heterogeneity in treatment effect across the tumor types. 7 Second, the trials were single-arm trials. The lack of randomized controlled trial (RCT) data for entrectinib and larotrectinib creates additional uncertainty around their effectiveness. Although historical data might be used to construct a synthetic control arm to the trial arm, it can be highly difficult to ascertain that the patient populations in the control and intervention arms are sufficiently comparable. 8 A key issue for histology-independent treatments is that all patients in the trial have tumors with a specific genetic marker, whereas the available historical data are likely for a mixed patient population with tumors with and without the genetic marker. Given that genetic markers may affect disease prognosis, historical data from patients without the genetic marker should be corrected for the prognostic value of the genetic marker. 9
Third, for most tumor types the standard of care (SoC) does not include testing for NTRK fusions, meaning that the introduction of TRK inhibitors would also require the introduction of NTRK testing. Evaluating the cost-effectiveness of a new treatment requires a comparison between the new situation, in which the intervention is implemented, and the current situation. 9 To accurately reflect the new situation, all changes to the care pathway that are needed to identify and treat eligible patients need to be accounted for. That is, the health and cost consequences associated with introducing NTRK testing should be included in the cost-effectiveness analysis of TRK inhibitors. Various topics are to be considered when modeling tests, including the expected testing procedures in clinical practice, test properties (eg, sensitivity and specificity), and mortality during the testing phase. 9
In this article, we estimate the cost-effectiveness of entrectinib compared with SoC in cancer patients in The Netherlands. We compare a strategy in which patients are tested for NTRK fusions and receive subsequent treatment (entrectinib for NTRK+ patients and SoC for NTRK-negative [NTRK−] patients) to a strategy in which no additional testing is used and all patients receive SoC. We illustrate how some of the challenges arising from single-arm trial data may be addressed (the second challenge mentioned above) and how testing pathways can be incorporated in cost-effectiveness analyses (third challenge).

Methods

Intervention and Comparator
The intervention comprised NTRK gene fusion testing for all patients with locally advanced or metastatic solid tumors followed by treatment with entrectinib in NTRK+ patients and SoC in NTRK− patients. Patients in the comparator arm were not tested for NTRK fusions and all patients received SoC. National tumor-specific treatment guidelines were used to identify the treatments provided in SoC for each tumor type (Appendix Table 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). We included treatments indicated for patients with locally advanced or metastatic solid tumors who had already received (at least) 1 systemic anticancer therapy. 2,10 Experimental treatments (ie, treatments outside current clinical guidelines) were excluded.
Larotrectinib also targets oncology patients with NTRK gene fusions and could be an alternative to entrectinib. Nevertheless, we were unable to include both pharmaceuticals in the model because of key differences in the trial populations, including the distribution of tumor types and the presence of pediatric patients in the larotrectinib trial but not the entrectinib trial. Publicly available data are insufficient to adjust the estimated effectiveness in the entrectinib and larotrectinib trials for differences in the trial populations. We opted to include entrectinib in our analysis because we had data available on the fitted distribution for overall survival (OS) and treatment discontinuation for entrectinib, whereas these data were not available for larotrectinib.

Study Population
The study population included adult patients with locally advanced or metastatic solid tumors who have received one or more lines of treatment and are willing to undergo further testing and treatment. Although entrectinib also received EMA approval for pediatric patients ≥12 years, there were no pediatric patients included in the ALKA-372-001, STARTRK-1, and STARTRK-2 trials for entrectinib (hereafter called entrectinib trials, N = 54), so we opted to focus on the adult population. Based on the patient characteristics in the trials, patients were assumed to be 58 years old upon entering the model, with 59% of patients being female. 6 The included cancers were breast, bile duct (ie, cholangiocarcinoma), colorectal, endometrial, ovarian, pancreatic, and thyroid cancer, as well as neuroendocrine tumor, non-small cell lung cancer, sarcoma, secretory carcinoma of the breast, and secretory carcinoma of the salivary gland.
To model the testing phase of patients, the tumor types were categorized into 3 groups based on a consensus report of Dutch experts, which outlines the envisioned NTRK testing policy in Dutch clinical practice 11 : (1) tumor types with high NTRK fusion prevalence (>90%), (2) tumor types with low NTRK fusion prevalence but wild-type TRK protein expression, and (3) tumor types with low NTRK fusion prevalence and no/very little wild-type TRK protein expression (Appendix Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006).
The assumed distribution of patients across tumor types in the model was based on the distribution in the pooled data set from the entrectinib trials. As the trials included only NTRK+ patients, the trial distribution of tumor types was combined with the tumor-specific NTRK prevalence to obtain the distribution of tumor types among the total group of patients eligible for NTRK fusion testing (ie, both NTRK+ and NTRK− patients; Appendix Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). Appendix Table 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006 also shows the tumor distribution in Dutch clinical practice among our study population, as observed in data from the Hartwig Medical Foundation (HMF), which cover 44 hospitals in The Netherlands. Nonetheless, because we are using the pooled effectiveness estimate from the entrectinib trials, we used the tumor distribution based on the entrectinib trials in our model.

Model Structure
The model consists of a decision tree (reflecting the testing phase) and a microsimulation (reflecting treatment) as shown in Figure 1. We used a lifetime time horizon and a cycle length of 1 month in the microsimulation. Analyses were performed from a Dutch societal perspective and effects and costs were discounted at 1.5% and 4%, respectively. 12
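The differential discounting of effects and costs can be illustrated with a minimal sketch. It assumes that a value accruing in monthly cycle t is discounted by 1/(1 + r)^(t/12), which is a common convention and not necessarily the exact implementation used in the model. The function names and the example cost and QALY streams are hypothetical.

```python
CYCLES_PER_YEAR = 12      # monthly cycle length, as in the microsimulation
RATE_EFFECTS = 0.015      # annual discount rate for effects (from the text)
RATE_COSTS = 0.04         # annual discount rate for costs (from the text)


def discount_factor(annual_rate, cycle):
    """Discount factor for a value accruing in a given monthly cycle (0-based)."""
    return 1.0 / (1.0 + annual_rate) ** (cycle / CYCLES_PER_YEAR)


def discounted_total(values_per_cycle, annual_rate):
    """Sum a per-cycle stream of values after discounting each cycle."""
    return sum(v * discount_factor(annual_rate, t)
               for t, v in enumerate(values_per_cycle))


# Hypothetical streams: 24 months of constant costs and QALYs.
monthly_costs = [1000.0] * 24
monthly_qalys = [0.05] * 24

print(discounted_total(monthly_costs, RATE_COSTS))
print(discounted_total(monthly_qalys, RATE_EFFECTS))
```

Because costs are discounted at a higher rate than effects, later cycles weigh less for costs than for QALYs.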

Decision Tree
Patients enter the model in a decision tree that compares "NTRK fusion testing" with a "no testing" strategy.
For patients receiving testing, the decision tree reflects the period from the decision to test for potential eligibility for entrectinib until the start of treatment. All patients were tested using immunohistochemistry (IHC) and/or next-generation sequencing of RNA (RNA-NGS). Patients with tumor types in groups 1 (high NTRK fusion prevalence) and 2 (wild-type TRK protein expression) were tested only through RNA-NGS. Patients in group 3 (low NTRK fusion prevalence, low TRK protein expression) first received an IHC test to identify those with elevated levels of TRK proteins. Patients who tested positive on the IHC test subsequently underwent confirmatory RNA-NGS testing. The latter strategy may save costs, because IHC tests tend to be much cheaper than RNA-NGS tests. Nevertheless, using IHC as a first screening tool has little added value in groups 1 and 2 because most patients will test positive on the IHC test, either because of the high prevalence of NTRK gene fusions (group 1) or wild-type TRK protein expression (ie, TRK protein expression that is not due to an NTRK fusion) (group 2). See Appendix Figures 1 and 2.
Various probabilities are captured in the decision tree, including the probability that a new tumor biopsy needs to be performed to enable NTRK testing. Because biopsies sometimes fail to include sufficient tumor tissue and laboratory processes can fail, we also included the probability that a rebiopsy is required. Given that IHC tests generally can be performed using small amounts of tissue, the probability of rebiopsies was included only for RNA-NGS tests. Because of a relatively high short-term mortality in our study population, we also incorporated the probability of death during the testing phase. We assumed that all patients received (tumor-specific) SoC during the testing period.
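The sequential strategy for group 3 (IHC first, RNA-NGS only after a positive IHC result) can be sketched as a small expected-value calculation. This is a simplified illustration, not the authors' decision tree: it ignores biopsies, rebiopsies, and death during testing, and it treats RNA-NGS as a perfect confirmatory test. All input values and names in the example are hypothetical.

```python
def sequential_testing(prevalence, ihc_sensitivity, ihc_specificity,
                       cost_ihc, cost_ngs):
    """Expected per-patient outcomes of IHC screening followed by RNA-NGS.

    Assumes RNA-NGS is a perfect confirmatory test and ignores biopsies,
    rebiopsies, and mortality during the testing phase.
    """
    true_pos = prevalence * ihc_sensitivity                # fusion present, IHC positive
    false_pos = (1 - prevalence) * (1 - ihc_specificity)   # no fusion, IHC positive
    p_ihc_positive = true_pos + false_pos                  # share sent on to RNA-NGS
    return {
        "p_ihc_positive": p_ihc_positive,
        "p_detected": true_pos,                            # confirmed by RNA-NGS
        "p_missed": prevalence * (1 - ihc_sensitivity),    # false negatives on IHC
        "expected_cost": cost_ihc + p_ihc_positive * cost_ngs,
    }


# Hypothetical inputs for a low-prevalence tumor type.
result = sequential_testing(prevalence=0.003, ihc_sensitivity=0.9,
                            ihc_specificity=0.8, cost_ihc=100.0, cost_ngs=1000.0)
print(result)
```

With low prevalence, almost all RNA-NGS tests are triggered by false-positive IHC results, which is why IHC specificity drives the expected testing cost in this sketch.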

Microsimulation Model
Patients who survived the testing period subsequently entered the microsimulation model, where they received entrectinib if the test results were positive and SoC if negative. We used an individual-level state-transition model with 3 health states: "alive and on treatment," "alive and off treatment," and "dead." Note that in group 3, patients with a negative test result on the IHC test did not receive any further testing. Patients with a false negative result (ie, undetected NTRK+ patients) were assigned the mortality rates for NTRK+ patients receiving SoC. Similarly, although all patients receive SoC in the "no testing" strategy, the NTRK status of patients was tracked in the model so that the appropriate probabilities for treatment discontinuation and survival could be applied.
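A minimal sketch of an individual-level state-transition model with the three health states named above is given below. It assumes constant monthly probabilities of death and treatment discontinuation that are sampled independently. The function names, the ordering of events within a cycle, and the input values are illustrative assumptions and do not reproduce the authors' R implementation.

```python
import random


def simulate_patient(p_death, p_discontinue, rng, max_cycles=600):
    """Simulate one patient; return (months alive, months on treatment).

    States: alive and on treatment, alive and off treatment, dead.
    Death is evaluated first in each monthly cycle; discontinuation is
    evaluated independently for patients who are still on treatment.
    """
    on_treatment = True
    months_alive = 0
    months_on_treatment = 0
    for _ in range(max_cycles):
        if rng.random() < p_death:
            break                        # transition to "dead"
        months_alive += 1
        if on_treatment:
            months_on_treatment += 1
            if rng.random() < p_discontinue:
                on_treatment = False     # transition to "alive and off treatment"
    return months_alive, months_on_treatment


def simulate_cohort(n_patients, p_death, p_discontinue, seed=1):
    """Average outcomes over a number of simulated patients."""
    rng = random.Random(seed)
    results = [simulate_patient(p_death, p_discontinue, rng)
               for _ in range(n_patients)]
    mean_alive = sum(r[0] for r in results) / n_patients
    mean_on = sum(r[1] for r in results) / n_patients
    return mean_alive, mean_on


# Hypothetical monthly probabilities.
print(simulate_cohort(5000, p_death=0.05, p_discontinue=0.10))
```

In a full model the probabilities would vary by cycle, tumor type, treatment, and NTRK status; costs and utilities would then be attached to the simulated time in each state.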

Model Parameters
The model input parameters are summarized in Table 1 6-14 and described in more detail below.

Decision tree
Estimates from the literature were used to obtain input values for various parameters, including the probabilities of patients needing biopsies (9.8%) and rebiopsies (15.9%) to enable testing 13 and the test properties of IHC tests. Tumor-specific sensitivity and specificity of the IHC test were used and ranged from 73% to 100% and from 50% to 100%, respectively (Appendix Table 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). The sensitivity and specificity of RNA-NGS were assumed to be 100%. 15 Waiting times per element of the testing pathway were estimated by 4 experts (3 clinical geneticists and 1 oncologist). The total waiting time in each arm was determined by the number of tests, biopsies, and rebiopsies performed and varied between 1 and 8 weeks. The probability of dying during the testing phase was based on the estimated waiting times combined with tumor- and NTRK status-specific estimates of weekly mortality rates. The latter were derived from the HMF database (Appendix Table 4 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006).
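One way to combine a weekly mortality rate with a waiting time is sketched below. It assumes a constant hazard during the testing phase, so that the probability of dying within w weeks is 1 − exp(−rate × w). Whether the model used exactly this conversion is not stated in the text, and the example rate is hypothetical.

```python
import math


def prob_death_during_testing(weekly_mortality_rate, waiting_weeks):
    """Probability of dying while waiting, assuming a constant weekly hazard."""
    return 1.0 - math.exp(-weekly_mortality_rate * waiting_weeks)


# Hypothetical weekly rate, evaluated over the 1 to 8 week range of waiting times.
for weeks in (1, 4, 8):
    print(weeks, prob_death_during_testing(0.01, weeks))
```

Longer pathways (for example, IHC followed by RNA-NGS with a rebiopsy) therefore carry a higher probability that a patient dies before treatment can start.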

Microsimulation model
OS and time-to-treatment discontinuation (TTD) in NTRK+ patients receiving entrectinib were based on the entrectinib trials.
Because of the small sample size of the entrectinib trials and the resulting lack of reliable tumor-specific estimates, we used a single parametric survival function for all tumor types. We used the best fitting distribution (exponential) and its coefficients from Roche's model for reimbursement submission. 16
As mentioned above, the entrectinib trials were single-arm trials. To be able to assess the relative effectiveness of entrectinib compared with SoC, we created a synthetic comparator arm. We used the HMF database, containing data from cancer patients with [...] (4) performing a tumor biopsy was safe according to the treating physician. 10 There are no publicly available patient-level data from the entrectinib trials, meaning that statistical methods to match the study populations from the HMF database and the entrectinib trials (eg, propensity score matching) could not be applied. 18
OS and TTD in NTRK− patients receiving SoC were based on data from 1596 NTRK− patients who received SoC (Appendix Fig. 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). For each tumor type, parametric distributions were fitted to data on OS and TTD, using the Akaike Information Criterion to determine the parametric distribution with the best fit (Appendix Table 5 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). Subsequently, tumor-specific monthly transition probabilities to "death" and "discontinuation of treatment" were extracted to be used in the model. The transition probabilities to discontinuation of treatment (entrectinib or SoC) and death were estimated independently.
To estimate transition probabilities to "death" and "treatment discontinuation" in NTRK+ patients receiving SoC, we applied a hazard ratio (HR) reflecting the prognostic value of carrying an oncogenic NTRK gene fusion to the OS and TTD estimates for NTRK− patients.
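The two steps described above (extracting monthly transition probabilities from a fitted survival function and then applying an HR) can be sketched as follows. The sketch assumes proportional hazards, under which a per-cycle probability p becomes 1 − (1 − p)^HR. The exponential rate in the example is hypothetical, and the HR of 1.44 is used only for illustration.

```python
import math


def exponential_survival(monthly_rate):
    """Survival function S(t) = exp(-rate * t), with t in months."""
    return lambda t: math.exp(-monthly_rate * t)


def monthly_transition_probabilities(survival, n_months):
    """Conditional probability of the event in each month: 1 - S(t+1)/S(t)."""
    return [1.0 - survival(t + 1) / survival(t) for t in range(n_months)]


def apply_hazard_ratio(probability, hazard_ratio):
    """Adjust a per-cycle probability under proportional hazards."""
    return 1.0 - (1.0 - probability) ** hazard_ratio


# Hypothetical monthly death rate for patients without the fusion.
baseline = monthly_transition_probabilities(exponential_survival(0.05), 12)
# Illustrative HR for patients with the fusion receiving the same treatment.
adjusted = [apply_hazard_ratio(p, 1.44) for p in baseline]

print(baseline[0], adjusted[0])
```

For an exponential distribution the monthly probability is constant; for other distributions (for example, Weibull or log-logistic) the same conversion yields cycle-specific probabilities.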

Prognostic value of NTRK fusions
We used the HMF database to estimate the prognostic value of an NTRK fusion. Patients in the database were classified into 2 cohorts: NTRK+ and NTRK−. A subgroup of the NTRK− cohort was matched to the patients in the NTRK+ cohort, using the optimal matching method. 19 Within each tumor type, patients were matched based on baseline patient characteristics, including age, sex, number of previous lines of therapy, and year of biopsy, using a ratio of 1:4 (NTRK+:NTRK−). A total of 24 NTRK+ patients were matched with 96 NTRK− patients, with a successful covariate balance between the 2 groups. OS and TTD analyses were performed using the Kaplan-Meier method (Fig. 2) and Cox regressions, with the date of the first postbiopsy treatment as the index date and age, sex, and number of previous lines of treatment as covariates. If a subject was known to be alive before the cutoff date, the subject was censored at the last known alive date.

Costs
All costs are expressed in 2020 euros.

Decision tree
For each tumor type, we calculated the mean costs of biopsy, IHC testing, and RNA-NGS testing. Cost variations across tumor types were caused by differences in the type of biopsy needed (eg, more resources are needed for a lung biopsy than for a skin biopsy) and differences in price-setting among the main treatment centers for the various tumor types. The cost of an IHC test varied between €67 and €328 and the cost of an RNA-NGS test varied between €870 and €2137 (Appendix Table 6 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006).

Microsimulation model
We included the costs of oncology drugs and their administration (including admissions at a day care unit if needed). For [...] Table 7 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). In addition, costs for the treatment of adverse events (AEs) were included. Because of a lack of data on AEs with entrectinib, we assumed that the occurrence of AEs when receiving entrectinib was equal to AE occurrence in SoC. The tumor-specific prevalence of AEs was multiplied by the cost of treating the AEs, with input values for both variables based on estimates from the literature (Appendix Table 8 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). The costs of AEs were applied for patients in the "alive and on treatment" health state.
As per the Dutch health technology assessment (HTA) guideline, we also included the costs of non-hospital care related to cancer (eg, psychosocial care, home nursing, and general practitioner check-ups), as well as hospital and non-hospital care unrelated to cancer. We obtained age-dependent cost estimates through the "Practical Application to Include Disease Costs" tool. 20 At baseline, the total annual other healthcare costs derived from "Practical Application to Include Disease Costs" (ie, excluding drugs, administration of drugs, and AE costs) were €4453. In the year before death, they increased to €58 064.
Regarding societal costs, we included informal care costs but no productivity costs. Because the patients in our study population are in an advanced stage of cancer, we assumed that they are already out of the workforce upon entering our model, so that no additional productivity losses occur (ie, the friction period has already passed). 21 Informal care estimates were provided by a regression analysis that estimated the impact of proximity to death on the use of informal care, correcting for covariates, such as age and sex, using the Survey of Health, Ageing, and Retirement in Europe data. 22 The analysis provided the probability of using informal care, as well as the number of hours used. We valued informal care at the standard rate recommended by the national HTA guideline (€14.77 per hour). 12

Utilities
Based on previous research findings that the quality of life decreases as patients approach death, we incorporated patient utility in our model as a function of time to death. 23 We used as model inputs the regression coefficients from a study that estimated the relationship between proximity to death and utility. 23 In the study, proximity-to-death values were obtained using OS functions and utility values based on the SF-6D method were used. Age and sex were also included as covariates, allowing us to include age- and sex-specific patient utility in our model. 23 We included the utility during the testing pathway, as well as during treatment.

Analyses
The decision tree and the microsimulation model were programmed in line with the Decision Analysis in R for Technologies in Health (DARTH) modeling framework in R 3.6.1 using RStudio 1.2.1335 (RStudio, Boston, MA). [24][25][26]

Base-Case Analysis
The base-case analysis reflects the cost-effectiveness of testing cancer patients with locally advanced or metastatic solid tumors for NTRK gene fusions and subsequently treating NTRK+ patients with entrectinib and NTRK− patients with SoC compared with no testing and treating all patients with SoC. The incremental net monetary benefit (INMB) was calculated using a cost-effectiveness threshold of €80 000 per quality-adjusted life-year (QALY), which is the recommended value by Dutch HTA guidelines, given the calculated disease burden. 27, 28 We opted to simulate 5000 patients after evaluating the stability of model outcomes at various

Scenario Analyses
The first scenario analysis aimed to evaluate the impact of testing costs on the cost-effectiveness of implementing entrectinib by excluding the costs of NTRK testing. The second scenario excluded the costs, as well as the health effects resulting from the testing pathway (ie, the probability of dying and the QALYs during the waiting time). This scenario reflects a setting in which the patients in our study population would not need any NTRK testing, for example, because RNA-NGS testing is part of standard practice and has already been done at an earlier stage. The third scenario takes the approach that is common in economic evaluations of targeted treatments, which is to only include the patients who carry the targeted genetic marker and to disregard the costs and health effects of testing. 9 We also performed the base-case analysis from a healthcare perspective.
As mentioned in section "Study Population", a stratified testing protocol (first IHC, then RNA-NGS for patients who test positive) is used for patients in group 3 (tumor types with low NTRK fusion prevalence and no/little wild-type TRK protein expression), which is the largest group. Nevertheless, RNA-NGS has much better test sensitivity and specificity than IHC. 29 If costs were not considered, providing RNA-NGS to all patients would therefore likely be preferred. RNA-NGS is a relatively new technology and in many settings is not yet part of standard care. As RNA-NGS becomes more widespread and perhaps further technological improvements are made to achieve efficiency gains, its cost may decrease. We therefore investigated at what price of RNA-NGS the provision of RNA-NGS to all patients would become cost-effective (ie, the price at which the INMB equals 0).

Sensitivity Analyses
Parameter uncertainty was tested using univariate sensitivity analysis and probabilistic sensitivity analysis (PSA). In the univariate analysis, model parameters were independently varied between the lower and upper bounds of the 95% CI and, when this was not available, by a 20% increase/decrease from the parameter value in the base case. In the PSA, all parameters were varied simultaneously according to predefined distributions (Table 1 6,12-14 ). The model was run with 1000 iterations while sampling 1000 patients.
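The general mechanics of a PSA can be sketched as follows: in each iteration all uncertain inputs are drawn from their distributions, the model is rerun, and the share of iterations with a positive INMB gives the probability of being cost-effective. The toy model, the distributions, and every parameter value below are hypothetical placeholders and are not the inputs of Table 1.

```python
import random

THRESHOLD = 80_000  # cost-effectiveness threshold per QALY (from the text)


def toy_model(prevalence, test_cost, qaly_gain_if_treated, extra_cost_if_treated):
    """Hypothetical stand-in for the full model: per-patient increments."""
    d_qaly = prevalence * qaly_gain_if_treated
    d_cost = test_cost + prevalence * extra_cost_if_treated
    return d_cost, d_qaly


def run_psa(n_iterations, seed=1):
    """Return (mean INMB, probability that INMB > 0) over all iterations."""
    rng = random.Random(seed)
    inmbs = []
    for _ in range(n_iterations):
        # Beta for probabilities, gamma for costs and QALY gains (all hypothetical).
        prevalence = rng.betavariate(3, 997)
        test_cost = rng.gammavariate(25, 20)
        qaly_gain = rng.gammavariate(16, 0.1)
        extra_cost = rng.gammavariate(25, 2000)
        d_cost, d_qaly = toy_model(prevalence, test_cost, qaly_gain, extra_cost)
        inmbs.append(THRESHOLD * d_qaly - d_cost)
    prob_cost_effective = sum(v > 0 for v in inmbs) / n_iterations
    return sum(inmbs) / n_iterations, prob_cost_effective


mean_inmb, prob_ce = run_psa(1000)
print(mean_inmb, prob_ce)
```

Repeating the last step over a range of thresholds would trace out a cost-effectiveness acceptability curve.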

Budget Impact Analysis
A 5-year budget impact was estimated by multiplying the annual incremental healthcare costs per patient tested in the first 5 years with the expected number of patients tested. The number of patients tested each year in The Netherlands was determined by multiplying the number of expected NTRK+ patients (N = 90) 30,31 by the number of patients that need to be tested to identify 1 NTRK+ patient (as derived from our model results).

Results

Base-Case Analysis
The results of the base-case analysis are presented in Table 2 and Appendix Table 9 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006. They showed that testing for NTRK fusions and treatment with entrectinib in NTRK+ patients and SoC in NTRK− patients was associated with a QALY gain of 0.0043 at an increased cost of €732 per patient as compared with no testing and SoC for all patients. The incremental cost-effectiveness ratio (ICER) was €169 957/QALY and the INMB was −€388.
Table 2 shows the results of scenario analyses with or without testing costs and consequences. Not including testing costs had a large impact, whereas not including mortality and QALYs during waiting time only had a minor impact on the cost-effectiveness results. In the third scenario analysis, in which only NTRK+ patients were considered and the testing phase was disregarded, the ICER of entrectinib versus SoC was €38 563/QALY. The results from the healthcare perspective were similar to the results in the base case (Appendix Table 10 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006). We estimated that the cost of RNA-NGS would have to be reduced by 90%, to €186, before testing all eligible patients with RNA-NGS would be cost-effective.
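The relationship between the reported increments, the ICER, and the INMB can be checked with a short calculation. The sketch uses the rounded per-patient increments from the text (0.0043 QALYs and 732 euros) and the threshold of 80 000 euros per QALY; the function names are illustrative. Because the published increments are rounded, the ICER computed here is close to, but not identical with, the reported value of 169 957 euros per QALY.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    return delta_cost / delta_qaly


def inmb(delta_cost, delta_qaly, threshold):
    """Incremental net monetary benefit: threshold * dQALY - dCost."""
    return threshold * delta_qaly - delta_cost


DELTA_COST = 732.0     # incremental cost per patient, rounded (from the text)
DELTA_QALY = 0.0043    # incremental QALYs per patient, rounded (from the text)
THRESHOLD = 80_000.0   # euros per QALY (from the text)

print(icer(DELTA_COST, DELTA_QALY))              # roughly 170 000 euros per QALY
print(inmb(DELTA_COST, DELTA_QALY, THRESHOLD))   # about -388 euros
```

A negative INMB at the chosen threshold is equivalent to an ICER above that threshold when the QALY gain is positive.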

Sensitivity Analyses
The results of the univariate analysis are presented in Appendix Figure 6 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006. The specificity of the IHC test had the largest impact on the INMB, followed by the cost of IHC testing and the HR for the prognostic value of NTRK fusions on OS. Improvements in the specificity of IHC tests (ie, fewer false positives), decreases in the costs for IHC tests, or increases in the HR for the prognostic value of NTRK fusions on OS (ie, worse prognosis of NTRK+) led to increased INMB.
The results of the PSA are presented in the cost-effectiveness plane in Figure 3 and the cost-effectiveness acceptability curve in Appendix Figure 7 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.08.006. Mean incremental costs of all iterations were €696 and mean incremental effects were 0.0037, resulting in an INMB of −€399 and an ICER of €187 681/QALY. The probability of "Testing + entrectinib/SoC" being cost-effective compared with "No testing + SoC" was 0.2% at a threshold of €80 000 per QALY.

Budget Impact Analysis
The proportion of patients who received entrectinib from all patients that were tested in our model was 0.28%. Combined with the expected number of patients treated with entrectinib (N = 90), this results in 31 630 patients being eligible for NTRK testing in The Netherlands annually. The 5-year incremental budget impact of testing and treatment with entrectinib in NTRK+ patients was €93 million, with testing costs making up 82% of the total budget impact (Table 3).
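The step from the proportion treated to the number of patients tested can be sketched as below. The function names are illustrative. With the rounded proportion of 0.28% the calculation gives roughly 32 100 patients, close to the reported 31 630, which presumably reflects the unrounded proportion from the model.

```python
def number_needed_to_test(proportion_treated):
    """Patients tested per patient who ends up receiving the targeted drug."""
    return 1.0 / proportion_treated


def patients_to_test(expected_treated, proportion_treated):
    """Annual number of patients tested to reach the expected number treated."""
    return expected_treated * number_needed_to_test(proportion_treated)


PROPORTION_TREATED = 0.0028   # 0.28%, rounded (from the text)
EXPECTED_TREATED = 90         # expected annual number treated (from the text)

print(number_needed_to_test(PROPORTION_TREATED))             # about 357
print(patients_to_test(EXPECTED_TREATED, PROPORTION_TREATED))  # about 32 100
```

Multiplying the annual number tested by the incremental healthcare cost per patient tested, year by year, then gives the budget impact.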

Discussion

Summary of Results
Our results showed that incorporating the consequences of NTRK testing has a large impact on the cost-effectiveness of implementing the histology-independent treatment entrectinib to the extent that it would alter the reimbursement decision. Excluding the costs and health effects associated with NTRK testing reduced the ICER from €169 957/QALY to €38 563/QALY. The difference is primarily owing to the rarity of NTRK gene fusions, meaning that a large number of patients need to be tested to identify the few patients that are eligible for entrectinib treatment (0.29%). This means that few patients experience QALY gains from NTRK testing (only the NTRK+ patients who experience better health outcomes with entrectinib treatment than SoC), while additional costs (for the tests) are added for all patients.
The estimated five-year budget impact of testing (€93 million) was based on the testing capacity that is needed when all eligible patients are tested. Nevertheless, the testing practice for NTRK fusion-positive cancers in The Netherlands is still in development. 11 Hence, fewer tests may be conducted in the earlier years of the implementation of NTRK testing and subsequent treatment, which would make the actual budget impact lower than what we estimated.

Strengths
When the introduction of a new drug requires a new diagnostic test to identify the target population, HTA bodies should be fully informed about the cost-effectiveness of the test-treatment combination. 9 Therefore, in this study we considered the entire patient population that would be affected (ie, all patients eligible for NTRK testing) and calculated the (downstream) costs and benefits of NTRK testing and subsequent treatment for patients who tested NTRK+ and NTRK−. Entrectinib was conditionally approved by the EMA and the Food and Drug Administration based on small single-arm basket trials showing a durable response and longer survival. 32 The single-arm nature of the trial data poses a great challenge to HTA bodies making decisions based on comparative effectiveness and cost-effectiveness. Briggs et al, 33 in a study about the estimation of the counterfactual for tumor-agnostic treatments with only single-arm trial data available, described 3 options: (1) historical controls reported in the literature, (2) previous line of therapy in intervention patients of basket trials, or (3) nonresponders in basket trials. As we did not have access to individual patient data from the entrectinib trials, we had to rely on historical data. Nevertheless, unlike Briggs et al, 33 we used genetic data to adjust the historical data for the prognostic value of carrying oncogenic NTRK gene fusions, thereby likely increasing the accuracy of the estimated comparative effectiveness. Indeed, 2 other studies that estimated an HR for the OS of NTRK+ patients found comparable values (1.44 34 and 1.6, 35 respectively) to the value we estimated (1.44). With our approach, we illustrated how a database with genomic and clinical data can be used to match patients with and without a specific genetic marker and to estimate HRs (for OS and TTD) for the patients with the marker. A similar approach could be used to estimate HRs for patients with different genetic markers.
Although residual confounding cannot be ruled out in our analysis because of the small sample size, lack of matching patients on all relevant covariates (eg, Eastern Cooperative Oncology Group performance status), and other limitations, more robust statistical estimation may be achieved with larger sample sizes (eg, for genetic markers that are less rare) and with data for more (clinical) variables. Nonetheless, RCT data are preferred over historical data for the assessment of comparative effectiveness, and the kind of matching exercise we conducted should only be done when no RCT data are available.
Most health economic models for cancer treatments include the health states "progression-free" and "progression," which require data on progression status in addition to OS. Because appropriate data on progression for patients receiving SoC were not available, we instead used a regression model that estimated the impact of proximity to death on the utility of cancer patients. Although this is not a conventional approach, we believe that linking utility to proximity to death is likely more appropriate (ie, closer to the lived experience of cancer patients) than using singular utility values for progression-free and progressed patients. We therefore consider this approach a strength of our study.

Limitations
Although we estimated tumor-specific effectiveness of SoC, we had to assume that the effectiveness of entrectinib was constant across tumor types, as the small number of observations per tumor type in the entrectinib trials precluded the estimation of reliable tumor-specific effectiveness. Nevertheless, Murphy et al 7 showed heterogeneity in clinical effects across histologies that should be accounted for, for example, by using Bayesian hierarchal models. We present a single ICER in this study, implying an all-ornothing decision regarding the reimbursement of entrectinib for NTRK1 patients, yet it might be more appropriate to differentiate between tumor types. That is, even when histology-independent therapies receive marketing authorization for all histologies, reimbursement might be warranted only for a subset of histologies because of heterogeneity in the treatment effect. If indeed there is heterogeneity in effectiveness across tumor types, the single ICER estimate we estimated based on the patient population in the entrectinib trials may be biased because the proportional distribution of tumor types in the trial is not fully representative of the distribution of tumor types among eligible patients in clinical practice. Additionally, because of the small number of NTRK1 observations per tumor type in the HMF database, we had to assume that the prognostic value of NTRK fusions was constant across tumor types.
Note that although our analysis does not account for heterogeneity across histologies in the treatment effect of entrectinib or in the HRs for NTRK gene fusions, it does account for differences in the treatment costs and effects of SoC (both TTD and OS) across tumor types and uses tumor-specific estimates of testing costs. We attempted to construct an appropriate comparator arm by estimating outcomes in a similar target population and adjusting for the prognostic value of NTRK fusions. That is, we only considered patients with locally advanced or metastatic disease who had received at least one previous line of treatment, in line with the patient population in the entrectinib trials. We also only considered patients who had one of the tumor types included in the entrectinib trials. We attempted to create a subgroup of HMF patients with a similar average age and percentage of females as those in the entrectinib trials, but the subgroup was too small to enable reliable statistical estimation. Despite our efforts, we cannot be sure that the patient populations in the respective arms are fully comparable because we did not have access to patient-level data from the entrectinib trials. Moreover, there may be unobserved differences between the patient populations.

Implications for Decision Making
Considering the impact of testing costs on the cost-effectiveness of entrectinib, payers should focus on policies supporting a reduction in the costs of testing. Furthermore, the uncertainty around the effectiveness of entrectinib leaves HTA bodies with 2 main choices: wait for more (tumor type-specific) evidence or provide coverage through a managed entry agreement between the pharmaceutical company and the healthcare payer. Considering the rarity of NTRK fusions, stronger evidence will likely not become available soon, meaning that waiting for more evidence would leave patients without access to entrectinib for a long time. Therefore, despite concerns about the feasibility of discontinuing reimbursement for medicines if further evidence fails to demonstrate their (cost-)effectiveness, 36,37 a managed entry agreement, for example, in the form of coverage with evidence development, may be the best option. 38 Through coverage with evidence development agreements, treatments are temporarily reimbursed while further evidence is collected. After a specified period, the cost-effectiveness of the treatment is reevaluated using the additional data. Ideally, data on final outcomes, such as survival and quality of life, are collected. When the follow-up time is limited, surrogate outcomes, such as progression-free survival and tumor response rates, are also used. It should be noted, however, that surrogate outcomes are not necessarily predictive of final outcomes. 39 In addition, we found that the uncertainty around the HR for NTRK gene fusions on OS has a large effect on cost-effectiveness outcomes. Because of the low prevalence of NTRK fusions, larger genomic databases (paired with clinical information) are needed to gather sufficiently large numbers of NTRK+ patients to obtain statistically significant results on the prognostic value of NTRK fusions.
It may therefore be valuable, for patients with other rare genetic markers as well, if decision-makers invest in expanding genomic data collection, possibly including policies that encourage the linking of existing databases. Finally, although our analysis has focused on entrectinib, larotrectinib has a similar target population, mechanism of action, and price-setting in The Netherlands. Our finding that the introduction of the TRK inhibitor entrectinib appears not to be cost-effective when considering the consequences of introducing NTRK testing, but might be cost-effective if the relevant tests became standard practice, likely also applies to larotrectinib. A recent study that investigated the cost-effectiveness of larotrectinib in The Netherlands and focused on NTRK+ patients without considering the testing phase, comparable with our scenario 3, found an ICER of €41 424/QALY, which is similar to the ICER of €38 563/QALY we estimated for entrectinib in scenario 3. 40
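
To make the decision rule behind these statements explicit, the following minimal sketch computes an ICER and an incremental net monetary benefit (iNMB) from per-patient increments. It is an illustration only, not the model used in this study: the function names are hypothetical, and the inputs are the rounded base-case figures reported in the abstract (0.0043 QALYs, €732, and a threshold of €80 000/QALY). Because the inputs are rounded, the ICER computed here differs slightly from the reported €169 957/QALY.

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


def incremental_nmb(delta_cost: float, delta_qaly: float, threshold: float) -> float:
    """Incremental net monetary benefit at a given willingness-to-pay threshold."""
    return threshold * delta_qaly - delta_cost


# Rounded base-case increments of "Testing" versus "No testing" (from the abstract).
DELTA_QALY = 0.0043   # incremental QALYs per patient
DELTA_COST = 732.0    # incremental costs per patient, in euros
THRESHOLD = 80_000.0  # cost-effectiveness threshold, in euros per QALY

base_icer = icer(DELTA_COST, DELTA_QALY)
base_inmb = incremental_nmb(DELTA_COST, DELTA_QALY, THRESHOLD)

# A strategy is considered cost-effective when its iNMB is positive, which
# (for a positive QALY gain) is equivalent to its ICER lying below the threshold.
print(f"ICER: EUR {base_icer:,.0f} per QALY")
print(f"iNMB: EUR {base_inmb:,.0f}")
print("Cost-effective at threshold:", base_inmb > 0)
```

With these inputs the iNMB is negative, in line with the reported iNMB of −€388 for the base case. The same two functions could be applied to the increments of any scenario, such as one that excludes testing costs, provided the corresponding cost and QALY differences are available.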

Conclusions
In conclusion, with the currently available evidence, entrectinib does not appear to be cost-effective compared with SoC. If genetic testing of cancer patients (including RNA-NGS panels that can identify NTRK gene fusions) becomes standard practice, however, entrectinib may be cost-effective. These results remain highly uncertain because of data limitations.

Supplemental Material
Supplementary data associated with this article can be found in the online version at https://doi.org/10.1016/j.jval.2022.08.006.