Predicting Patient-Level 3-Level Version of EQ-5D Index Scores From a Large International Database Using Machine Learning and Regression Methods

Open Access. Published: March 15, 2022. DOI: https://doi.org/10.1016/j.jval.2022.01.024

      Highlights

      • Despite the vast amount of real-life data accumulated in healthcare, EQ-5D index scores are frequently lacking for health economic analyses and have to be estimated.
      • We predicted 3-level version of EQ-5D (EQ-5D-3L) index scores in a large heterogenous data set of population surveys and clinical studies using eXtreme Gradient Boosting classification, eXtreme Gradient Boosting regression, and ordinary least squares regression. Regression methods outperformed classification in terms of prediction accuracy and bias. The performance of the 3 methods depended on the applied evaluation criteria, the target population, the included predictors, and the EQ-5D-3L index score range.
      • The prediction accuracy of individual EQ-5D-3L index scores was inadequate for the majority of respondents. For the evaluation of personalized health interventions, we encourage the systematic collection of patient-reported outcomes such as EQ-5D with the involvement of artificial intelligence experts and outcomes researchers to enhance the value of accumulating data in health systems.

      Abstract

      Objectives

      This study aimed to evaluate the performance of machine learning and regression methods in the prediction of 3-level version of EQ-5D (EQ-5D-3L) index scores from a large diverse data set.

      Methods

      A total of 30 studies from 3 countries were combined. Predictions were performed via eXtreme Gradient Boosting classification (XGBC), eXtreme Gradient Boosting regression (XGBR) and ordinary least squares (OLS) regression using 10-fold cross-validation and 80%/20% partition for training and testing. We evaluated 6 prediction scenarios using 3 samples (general population, patients, total) and 2 predictor sets: demographic and disease-related variables with/without patient-reported outcomes. Model performance was evaluated by mean absolute error and percent of predictions within clinically irrelevant error range and within correct health severity group (EQ-5D-3L index <0.45, 0.45-0.926, >0.926).
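
      The 3 evaluation criteria named above can be illustrated with a minimal Python sketch. The function names and the `tolerance` argument are illustrative assumptions: this excerpt does not state the width of the clinically irrelevant error range, so it is left as a parameter. The assignment of scores exactly equal to 0.45 or 0.926 to the middle group is inferred from the cut-offs as printed.

```python
from typing import Sequence


def severity_group(score: float) -> str:
    """Map an EQ-5D-3L index score to the severity groups named in the text."""
    if score < 0.45:
        return "<0.45"
    if score <= 0.926:
        return "0.45-0.926"
    return ">0.926"


def evaluate(observed: Sequence[float], predicted: Sequence[float], tolerance: float) -> dict:
    """Return MAE, percent within +/- tolerance, and percent in the correct severity group.

    `tolerance` is a hypothetical stand-in for the clinically irrelevant error range.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    n = len(observed)
    abs_errors = [abs(o - p) for o, p in zip(observed, predicted)]
    same_group = [severity_group(o) == severity_group(p) for o, p in zip(observed, predicted)]
    return {
        "mae": sum(abs_errors) / n,
        "pct_within_tolerance": 100.0 * sum(e <= tolerance for e in abs_errors) / n,
        "pct_correct_group": 100.0 * sum(same_group) / n,
    }
```

      For example, `evaluate(observed, predicted, tolerance=0.1)` returns the 3 figures for one model on one test set; the value 0.1 here is arbitrary and not taken from the study.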

      Results

      The data set involved 26 318 individuals (clinical settings n = 6214, general population n = 20 104) and 26 predictor variables plus diagnoses. Using all predictors and the total sample, mean absolute error values were 0.153, 0.126, and 0.131, percent of predictions within clinically irrelevant error range were 47.6%, 39.5%, and 37.4%, and within the correct health severity group were 56.3%, 64.9%, and 63.3% by XGBC, XGBR, and OLS, respectively. The performance of models depended on the applied evaluation criteria, the target population, the included predictors, and the EQ-5D-3L index score range.

      Conclusions

      Regression models (XGBR and OLS) outperformed XGBC, yet prediction errors were outside the clinically irrelevant error range for most respondents. Our results highlight the importance of systematic patient-reported outcome (EQ-5D) data collection. Dialogs between artificial intelligence and outcomes research experts are encouraged to enhance the value of accumulating data in health systems.

      Introduction

      Patient-reported outcomes (PROs) reflect disease burden or treatment effectiveness from the patients’ perspective. The value of PROs in improving health system performance and individual health outcomes has been demonstrated in multiple settings.
      • Berwick D.
      • Black N.
      • Cullen D.
      • et al.
      Recommendations to OECD Ministers of Health from the high level reflection group on the future of health statistics: strengthening the international comparison of health system performance through patient-reported indicators. OECD.
      Preference-based health measures such as the EQ-5D are widely used in health economic evaluations.
      Endpoints used for relative effectiveness assessment of pharmaceuticals. Health-related quality of life and utility measures. EUnetHTA.
      • Kennedy-Martin M.
      • Slaap B.
      • Herdman M.
      • et al.
      Which multi-attribute utility instruments are recommended for use in cost-utility analysis? A review of national health technology assessment (HTA) guidelines.
      EQ-5D is a recommended tool for use in cost-utility analyses around the globe.
      EuroQoL Group
      EuroQol — a new facility for the measurement of health-related quality of life.
      Although the monitoring of PROs has become a priority in many health systems, their organized collection at national level is still in its infancy.
      • Berwick D.
      • Black N.
      • Cullen D.
      • et al.
      Recommendations to OECD Ministers of Health from the high level reflection group on the future of health statistics: strengthening the international comparison of health system performance through patient-reported indicators. OECD.
      With the gradual implementation of electronic health records and harmonized statistical data collections (eg, European Health Survey), a large amount of administrative health data is being collected.
      EESZT National eHealth Infrastructure
      National Centre for Healthcare Services.
      • Offerman A.
      Slovenia moving HIS applications to central eHealth platform (eZdravje). European Commission. Joinup Web Site.
      • Czerw A.
      • Fronczak A.
      • Witczak K.
      • Juszczyk G.
      Implementation of electronic health records in Polish outpatient health care clinics — starting point, progress, problems, and forecasts.
      Smart devices, big data, and advanced analytic techniques are contributing to the personalization of healthcare.
      • Borges do Nascimento I.J.
      • Marcolino M.S.
      • Abdulazeem H.M.
      • et al.
      Impact of big data analytics on people’s health: overview of systematic reviews and recommendations for future studies.
      • Boccia S.
      • Pastorino R.
      • Giraldi L.
      • van den Bergen K.
      Digitalisation and Big Data: Implications for the Health Sector.
      • Davenport T.
      • Kalakota R.
      The potential for artificial intelligence in healthcare.
      • Garai A.
      • Pentek I.
      • Adamko A.
      Revolutionizing healthcare with IoT and cognitive, cloud-based telemedicine.
      • Cohoon T.J.
      • Bhavnani S.P.
      Toward precision health: applying artificial intelligence analytics to digital health biometric datasets.
      Nevertheless, because of varying rules of data sharing, standards of interoperability, available infrastructure, and levels of stakeholder collaboration, data sets in the health data ecosystem, which are usually collected at different time points for different purposes with different methods, are difficult to connect.
      • Iacob N.
      • Simonelli F.
      Towards a European health data ecosystem.
      For example, the Minimum European Health Module is a PRO measure that is collected regularly in Eurostat population surveys but hardly used in clinical trials. The EQ-5D questionnaire has been increasingly used in clinical trials, health surveys, and registries,
      • Devlin N.J.
      • Brooks R.
      EQ-5D and the EuroQol Group: past, present and future.
      ,
      • Ernstsson O.
      • Janssen M.F.
      • Heintz E.
      Collection and use of EQ-5D for follow-up, decision-making, and quality improvement in health care — the case of the Swedish National Quality Registries.
      but infrequently in general clinical practice or administrative health surveys.
      • Rencz F.
      • Gulacsi L.
      • Drummond M.
      • et al.
      EQ-5D in Central and Eastern Europe: 2000-2015.
      ,
      • Boros J.
      • Györke J.
      • Pásztorné Stokker E.
      • Szabó Z.K.
      Results of the 2014 Health Interview Survey - summary data.
      The accumulating big data are typically unstructured, heterogenous, and incomplete, which may hamper the analysis using standard regression methods, whereas novel machine learning (ML) approaches may offer advantages in such data sets.
      For calculating quality-adjusted life-years in health economic analyses, EQ-5D values are often missing and have to be estimated from other health measures.
      • Longworth L.
      • Rowen D.
      Mapping to obtain EQ-5D utility values for use in NICE health technology assessments.
      • Jia H.
      • Lubetkin E.I.
      Estimating EuroQol EQ-5D scores from Population Healthy Days data.
      • Tsiachristas A.
      • Potter C.M.
      • Rocks S.
      • et al.
      Estimating EQ-5D utilities based on the Short-Form Long Term Conditions Questionnaire (LTCQ-8).
      Therefore, the question arises whether EQ-5D index scores can be predicted from a large diverse data set combined from multiple sources and whether novel analytical methods offer advantages over conventional regression techniques.
      Over the past 15 years, we collected EQ-5D-3L data in 30 studies from 26 318 individuals in a variety of settings and designs.
      • Brodszky V.
      • Balint P.
      • Geher P.
      • et al.
      Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      • Balogh O.
      • Pentek M.
      • Gulacsi L.
      • et al.
      [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
      • Pentek M.
      • Kobelt G.
      • Czirjak L.
      • et al.
      Costs of rheumatoid arthritis in Hungary.
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      • Ersek K.
      • Kovacs T.
      • Wimo A.
      • et al.
      Costs of dementia in Hungary.
      • Brodszky V.
      • Péntek M.
      • Jelics N.
      • et al.
      Health-related costs of diabetes mellitus in adults treated with insulin. Cross-sectional survey of 480 patients in general practice and outpatient settings.
      • Simoens S.
      • Dunselman G.
      • Dirksen C.
      • et al.
      The burden of endometriosis: costs and quality of life of women with endometriosis and treated in referral centres.
      • Pentek M.
      • Gulacsi L.
      • Toth E.
      • Baji P.
      • Brodszky V.
      • Horvath C.
      Ten-year fracture risk by FRAX((R)) of women with osteoporosis attending osteoporosis care in Hungary.
      • Pulay A.J.
      • Bitter I.
      • Papp S.
      • et al.
      Exploring the relationship between quality of life (EQ-5D) and clinical measures in adult attention deficit hyperactivity disorder (ADHD).
      • Hever N.V.
      • Pentek M.
      • Ballo A.
      • et al.
      Health related quality of life in patients with bladder cancer: a cross-sectional survey and validation study of the Hungarian version of the Bladder Cancer Index.
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
      • Péntek M.
      • Bereczki D.
      • Gulácsi L.
      • et al.
      Survey of epilepsy in adults in Hungary: quality of life and costs.
      • Pentek M.
      • Gulacsi L.
      • Majoros A.
      • et al.
      Health related quality of life and productivity of women with overactive bladder.
      • Tamas G.
      • Gulacsi L.
      • Bereczki D.
      • et al.
      Quality of life and costs in Parkinson’s disease: a cross sectional study in Hungary.
      • Péntek M.
      • Harangozó J.
      • Égerházi A.
      • et al.
      Quality of life and disease burden of patients with schizophrenia in Hungary.
      • Pentek M.
      • Gulacsi L.
      • Rozsa C.
      • et al.
      Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
      • Rencz F.
      • Gulácsi L.
      • Brodszky V.
      • Golicki D.
      • Ruzsa G.
      • Péntek M.
      Pns401 the first parallel Eq-5d-3l and Eq-5d-5l composite time trade-off valuation study in Europe.
      • Baji P.
      • Brodszky V.
      • Rencz F.
      • Boncz I.
      • Gulacsi L.
      • Pentek M.
      Health status of the Hungarian population between 2000-2010.
      • Pentek M.
      • Brodszky V.
      • Biro Z.
      • et al.
      Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
      • Pentek M.
      • Brodszky V.
      • Gulacsi A.L.
      • et al.
      Subjective expectations regarding length and health-related quality of life in Hungary: results from an empirical investigation.
      • Donaldson C.
      • Baker R.
      • Mason H.
      • et al.
      European value of a quality adjusted life year — final publishable report.
      • Golicki D.
      • Jakubczyk M.
      • Niewada M.
      • Wrona W.
      • Busschbach J.J.
      Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
      • Golicki D.
      • Niewada M.
      General population reference values for 3-level EQ-5D (EQ-5D-3L) questionnaire in Poland.
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      • Golicki D.
      • Zawodnik S.
      • Janssen M.F.
      • Kiljan A.
      • Hermanowski T.
      Eq1 psychometric comparison of Eq-5d and Eq-5d-5l in student population.
      • Prevolnik Rupel V.
      • Srakar A.
      • Rand K.
      Valuation of EQ-5D-3l Health States in Slovenia: VAS based and TTO based value sets.
      • Rupel V.P.
      • Rebojl M.
      The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      • Pentek M.
      • Beretzky Z.
      • Brodszky V.
      • et al.
      Health-related productivity of the Hungarian population. A cross-sectional survey.
      As a model of the heterogenous sociodemographic and disease-related variables that can be obtained from real-world electronic health records, a combined anonymous data set was created by applying uniform data-management rules for standard sociodemographic and healthcare-related variables.
      This study aimed to evaluate the performance of ML and ordinary least squares (OLS) regression in the prediction of individual-level EQ-5D-3L index scores from variables routinely collected in observational studies in various patient populations and the general population.

      Methods

      EQ-5D-3L

      The EQ-5D-3L questionnaire consists of 2 parts. The descriptive system assesses self-reported health in 5 dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. In each dimension, respondents can describe their current health with one of the following 3 categories: no problems, some problems, and severe problems. The descriptive system defines 243 (3^5) distinct health states.
      EuroQoL Group
      EuroQol — a new facility for the measurement of health-related quality of life.
      The EQ-5D-3L index scores (utilities) attached to each health state are measured in valuation studies and reflect societal preferences. In this study, we applied the UK EQ-5D-3L index value set (range −0.594 to 1.000).
      • Prevolnik Rupel V.
      • Srakar A.
      • Rand K.
      Valuation of EQ-5D-3l Health States in Slovenia: VAS based and TTO based value sets.
      The EQ-5D-3L index score of 1 represents perfect health, 0 represents death, and negative values represent “worse than death” health states. The second part of the instrument is a 20-cm vertical EuroQol visual analog scale (EQ-VAS) for the measurement of current health ranging from 0 (worst imaginable health) to 100 (best imaginable health).
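
      The count of 243 health states follows directly from 5 dimensions with 3 levels each. A short Python sketch can enumerate them. The 5-digit profile strings (eg, 11111 for no problems in any dimension) and the names used below are conventions assumed for this illustration, and no value set is applied here.

```python
from itertools import product

DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3)  # 1 = no problems, 2 = some problems, 3 = severe problems


def enumerate_health_states() -> list[str]:
    """Return every EQ-5D-3L profile as a 5-digit string, one digit per dimension."""
    return ["".join(str(level) for level in profile)
            for profile in product(LEVELS, repeat=len(DIMENSIONS))]


states = enumerate_health_states()
print(len(states))            # 243, i.e. 3**5
print(states[0], states[-1])  # 11111 33333
```

      A value set such as the UK one would then assign an index score to each of these profiles.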

      Study Population

      Data were collected in Hungary, Poland, and Slovenia. These countries have EQ-5D-3L value sets
      • Rencz F.
      • Gulácsi L.
      • Brodszky V.
      • Golicki D.
      • Ruzsa G.
      • Péntek M.
      Pns401 the first parallel Eq-5d-3l and Eq-5d-5l composite time trade-off valuation study in Europe.
      ,
      • Golicki D.
      • Jakubczyk M.
      • Niewada M.
      • Wrona W.
      • Busschbach J.J.
      Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
      ,
      • Rupel V.P.
      • Rebojl M.
      The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
      and population norms.
      • Baji P.
      • Brodszky V.
      • Rencz F.
      • Boncz I.
      • Gulacsi L.
      • Pentek M.
      Health status of the Hungarian population between 2000-2010.
      ,
      • Golicki D.
      • Niewada M.
      General population reference values for 3-level EQ-5D (EQ-5D-3L) questionnaire in Poland.
      ,
      • Rupel V.P.
      • Rebojl M.
      The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
      Between 2000 and 2015, nearly three-quarters of EQ-5D-related studies in Central and Eastern Europe originated from these 3 countries.
      • Rencz F.
      • Gulacsi L.
      • Drummond M.
      • et al.
      EQ-5D in Central and Eastern Europe: 2000-2015.
      From Hungary, we involved 2421 outpatients with 18 chronic conditions including psoriatic arthritis (n = 177),
      • Brodszky V.
      • Balint P.
      • Geher P.
      • et al.
      Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
      plaque psoriasis (PP, n = 192),
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      peripheral arterial occlusive disease (n = 103),
      • Balogh O.
      • Pentek M.
      • Gulacsi L.
      • et al.
      [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
      age-related macular degeneration (n = 122),
      • Pentek M.
      • Brodszky V.
      • Biro Z.
      • et al.
      Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
      rheumatoid arthritis (n = 249),
      • Pentek M.
      • Kobelt G.
      • Czirjak L.
      • et al.
      Costs of rheumatoid arthritis in Hungary.
      systemic sclerosis (SSC, n = 80),
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      dementia (n = 86),
      • Ersek K.
      • Kovacs T.
      • Wimo A.
      • et al.
      Costs of dementia in Hungary.
      diabetes mellitus (n = 264),
      • Brodszky V.
      • Péntek M.
      • Jelics N.
      • et al.
      Health-related costs of diabetes mellitus in adults treated with insulin. Cross-sectional survey of 480 patients in general practice and outpatient settings.
      endometriosis (n = 79),
      • Simoens S.
      • Dunselman G.
      • Dirksen C.
      • et al.
      The burden of endometriosis: costs and quality of life of women with endometriosis and treated in referral centres.
      osteoporosis (n = 207),
      • Pentek M.
      • Gulacsi L.
      • Toth E.
      • Baji P.
      • Brodszky V.
      • Horvath C.
      Ten-year fracture risk by FRAX((R)) of women with osteoporosis attending osteoporosis care in Hungary.
      adult attention deficit-hyperactivity disorder (n = 75),
      • Pulay A.J.
      • Bitter I.
      • Papp S.
      • et al.
      Exploring the relationship between quality of life (EQ-5D) and clinical measures in adult attention deficit hyperactivity disorder (ADHD).
      urinary bladder cancer (n = 148),
      • Hever N.V.
      • Pentek M.
      • Ballo A.
      • et al.
      Health related quality of life in patients with bladder cancer: a cross-sectional survey and validation study of the Hungarian version of the Bladder Cancer Index.
      benign prostatic hyperplasia (BPH, n = 237),
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
      epilepsy (n = 96),
      • Péntek M.
      • Bereczki D.
      • Gulácsi L.
      • et al.
      Survey of epilepsy in adults in Hungary: quality of life and costs.
      overactive bladder (n = 61),
      • Pentek M.
      • Gulacsi L.
      • Majoros A.
      • et al.
      Health related quality of life and productivity of women with overactive bladder.
      Parkinson’s disease (n = 99),
      • Tamas G.
      • Gulacsi L.
      • Bereczki D.
      • et al.
      Quality of life and costs in Parkinson’s disease: a cross sectional study in Hungary.
      chronic schizophrenia (n = 78),
      • Péntek M.
      • Harangozó J.
      • Égerházi A.
      • et al.
      Quality of life and disease burden of patients with schizophrenia in Hungary.
      and multiple sclerosis (n = 68).
      • Pentek M.
      • Gulacsi L.
      • Rozsa C.
      • et al.
      Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
      Furthermore, we included 14 442 individuals from general population studies including a large representative health survey (HHU, n = 2019),
      • Pentek M.
      • Beretzky Z.
      • Brodszky V.
      • et al.
      Health-related productivity of the Hungarian population. A cross-sectional survey.
      the Hungarian EQ-5D-3L/5-level version of EQ-5D (EQ-5D-5L) valuation study (VHU, n = 1000)
      • Rencz F.
      • Gulácsi L.
      • Brodszky V.
      • Golicki D.
      • Ruzsa G.
      • Péntek M.
      Pns401 the first parallel Eq-5d-3l and Eq-5d-5l composite time trade-off valuation study in Europe.
      , a survey about health expectations among visitors of the largest online news portal (EXP, n = 9142),
      • Pentek M.
      • Brodszky V.
      • Gulacsi A.L.
      • et al.
      Subjective expectations regarding length and health-related quality of life in Hungary: results from an empirical investigation.
      and a representative survey aiming to measure the monetary value of a quality-adjusted life-year in Europe (n = 2281).
      • Baji P.
      • Brodszky V.
      • Rencz F.
      • Boncz I.
      • Gulacsi L.
      • Pentek M.
      Health status of the Hungarian population between 2000-2010.
      ,
      • Donaldson C.
      • Baker R.
      • Mason H.
      • et al.
      European value of a quality adjusted life year — final publishable report.
      From Poland, we included 504 patients from cohort studies involving measurements before, during, or after hospitalization because of stroke (cerebrovascular accident, n = 397)
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      and osteoporotic hip fracture (n = 107)
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      and 4704 individuals from the general population including students (STU, n = 443),
      • Golicki D.
      • Zawodnik S.
      • Janssen M.F.
      • Kiljan A.
      • Hermanowski T.
      Eq1 psychometric comparison of Eq-5d and Eq-5d-5l in student population.
      respondents from the Polish EQ-5D-3L valuation study (VPL, n = 320),
      • Golicki D.
      • Jakubczyk M.
      • Niewada M.
      • Wrona W.
      • Busschbach J.J.
      Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
      and respondents from the Polish EQ-5D-3L population norms study (n = 3941).
      • Golicki D.
      • Niewada M.
      General population reference values for 3-level EQ-5D (EQ-5D-3L) questionnaire in Poland.
      From Slovenia, 3290 patients were included from a cohort study comparing the outcomes of various health programs across hospitals including conditions such as gonarthrosis, coxarthrosis, intervertebral disc disease, urinary incontinence, carpal tunnel syndrome, inguinal hernia, varicose veins, osteosynthesis removal, and shoulder lesions (PSI, n = 3290),
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      and 958 respondents from the general population, including the Slovenian EQ-5D-3L VAS valuation study (n = 734)
      • Rupel V.P.
      • Rebojl M.
      The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
      and the Slovenian EQ-5D-3L time trade-off valuation study (n = 224).
      • Prevolnik Rupel V.
      • Srakar A.
      • Rand K.
      Valuation of EQ-5D-3l Health States in Slovenia: VAS based and TTO based value sets.
      Three studies involved multiple measurements of EQ-5D-3L from the same patient (eg, before and after hospitalization).
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      ,
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      ,
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      The detailed description of the involved study populations is provided in the reference publications. We involved patients from the original databases without any other restrictions if EQ-5D-3L values were available; therefore, the number of eligible patients for our study may have differed slightly from the reference publications.

      Database and Variables

      We partitioned the database into a general population
      • Rencz F.
      • Gulácsi L.
      • Brodszky V.
      • Golicki D.
      • Ruzsa G.
      • Péntek M.
      Pns401 the first parallel Eq-5d-3l and Eq-5d-5l composite time trade-off valuation study in Europe.
      ,
      • Baji P.
      • Brodszky V.
      • Rencz F.
      • Boncz I.
      • Gulacsi L.
      • Pentek M.
      Health status of the Hungarian population between 2000-2010.
      ,
      • Pentek M.
      • Brodszky V.
      • Gulacsi A.L.
      • et al.
      Subjective expectations regarding length and health-related quality of life in Hungary: results from an empirical investigation.
      ,
      • Golicki D.
      • Jakubczyk M.
      • Niewada M.
      • Wrona W.
      • Busschbach J.J.
      Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
      ,
      • Golicki D.
      • Niewada M.
      General population reference values for 3-level EQ-5D (EQ-5D-3L) questionnaire in Poland.
      ,
      • Golicki D.
      • Zawodnik S.
      • Janssen M.F.
      • Kiljan A.
      • Hermanowski T.
      Eq1 psychometric comparison of Eq-5d and Eq-5d-5l in student population.
      • Prevolnik Rupel V.
      • Srakar A.
      • Rand K.
      Valuation of EQ-5D-3l Health States in Slovenia: VAS based and TTO based value sets.
      • Rupel V.P.
      • Rebojl M.
      The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
      ,
      • Pentek M.
      • Beretzky Z.
      • Brodszky V.
      • et al.
      Health-related productivity of the Hungarian population. A cross-sectional survey.
      and patient population
      • Brodszky V.
      • Balint P.
      • Geher P.
      • et al.
      Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      • Balogh O.
      • Pentek M.
      • Gulacsi L.
      • et al.
      [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
      • Pentek M.
      • Kobelt G.
      • Czirjak L.
      • et al.
      Costs of rheumatoid arthritis in Hungary.
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      • Ersek K.
      • Kovacs T.
      • Wimo A.
      • et al.
      Costs of dementia in Hungary.
      • Brodszky V.
      • Péntek M.
      • Jelics N.
      • et al.
      Health-related costs of diabetes mellitus in adults treated with insulin. Cross-sectional survey of 480 patients in general practice and outpatient settings.
      • Simoens S.
      • Dunselman G.
      • Dirksen C.
      • et al.
      The burden of endometriosis: costs and quality of life of women with endometriosis and treated in referral centres.
      • Pentek M.
      • Gulacsi L.
      • Toth E.
      • Baji P.
      • Brodszky V.
      • Horvath C.
      Ten-year fracture risk by FRAX((R)) of women with osteoporosis attending osteoporosis care in Hungary.
      • Pulay A.J.
      • Bitter I.
      • Papp S.
      • et al.
      Exploring the relationship between quality of life (EQ-5D) and clinical measures in adult attention deficit hyperactivity disorder (ADHD).
      • Hever N.V.
      • Pentek M.
      • Ballo A.
      • et al.
      Health related quality of life in patients with bladder cancer: a cross-sectional survey and validation study of the Hungarian version of the Bladder Cancer Index.
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
      • Péntek M.
      • Bereczki D.
      • Gulácsi L.
      • et al.
      Survey of epilepsy in adults in Hungary: quality of life and costs.
      • Pentek M.
      • Gulacsi L.
      • Majoros A.
      • et al.
      Health related quality of life and productivity of women with overactive bladder.
      • Tamas G.
      • Gulacsi L.
      • Bereczki D.
      • et al.
      Quality of life and costs in Parkinson’s disease: a cross sectional study in Hungary.
      • Péntek M.
      • Harangozó J.
      • Égerházi A.
      • et al.
      Quality of life and disease burden of patients with schizophrenia in Hungary.
      • Pentek M.
      • Gulacsi L.
      • Rozsa C.
      • et al.
      Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
      ,
      • Pentek M.
      • Brodszky V.
      • Biro Z.
      • et al.
      Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
      ,
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      ,
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      ,
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      sample. The database structure is summarized by listing key study characteristics and nonmissing values for patient-level explanatory variables in Appendix Table 1 of Appendix 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024. The dependent variable was the EQ-5D-3L index score.
      • Dolan P.
      Modeling valuations for EuroQol health states.
      Patient-level predictor variables were organized into 2 groups.

      Demographic and disease-related variables

      Predictors in this group included age, gender, education, place of residence, family status, employment, relative income (net personal income as a percent of the study year’s national average net income), setting (outpatient, hospitalized, and postoperative in the case of documented surgery within 3-6 months), number of general practitioner visits, any general practitioner visit, number of specialist visits, and any specialist visits or hospitalizations in the past year. In the case of missing data, we assumed that specialist visits happened for patients recruited in outpatient specialist centers,
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      ,
      • Balogh O.
      • Pentek M.
      • Gulacsi L.
      • et al.
      [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
      ,
      • Pentek M.
      • Brodszky V.
      • Biro Z.
      • et al.
      Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
      and that both specialist visits and hospitalizations happened for patients whose EQ-5D data were collected during or after hospitalization.
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      ,
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      ,
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      We recorded whether patients were informal care recipients, weight, height, and body mass index. Physician-reported outcomes were transferred to a standard scale, where 0 represents the worst and 1 the best possible health status measurable with the given instrument. The included physician-reported instruments (acronym; score of worst health status—score of best health status) were the clinician-reported VAS (0-100)
      • McCormack H.M.
      • Horne D.J.
      • Sheather S.
      Clinical applications of visual analogue scales: a critical review.
      in the SSC and BPH studies,
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      ,
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
      the Mini-Mental State Examination (0-30)
      • Folstein M.F.
      • Folstein S.E.
      • McHugh P.R.
      Mini-mental state.
      in the dementia study,
      • Ersek K.
      • Kovacs T.
      • Wimo A.
      • et al.
      Costs of dementia in Hungary.
      the Clinical Global Impression (7-0) in the overactive bladder and schizophrenia studies,
      • Pentek M.
      • Gulacsi L.
      • Majoros A.
      • et al.
      Health related quality of life and productivity of women with overactive bladder.
      ,
      • Péntek M.
      • Harangozó J.
      • Égerházi A.
      • et al.
      Quality of life and disease burden of patients with schizophrenia in Hungary.
      the Expanded Disability Status Scale (10-0)
      • Kurtzke J.F.
      Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS).
      in the multiple sclerosis study,
      • Pentek M.
      • Gulacsi L.
      • Rozsa C.
      • et al.
      Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
      and the Psoriasis Area and Severity Index (72-0)
      • Fredriksson T.
      • Pettersson U.
      Severe psoriasis—oral therapy with a new retinoid.
      in the PP study.
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      Specific diseases were included as dummy variables indicating the main diagnosis of clinical populations
      • Brodszky V.
      • Balint P.
      • Geher P.
      • et al.
      Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      • Balogh O.
      • Pentek M.
      • Gulacsi L.
      • et al.
      [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
      • Pentek M.
      • Kobelt G.
      • Czirjak L.
      • et al.
      Costs of rheumatoid arthritis in Hungary.
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      • Ersek K.
      • Kovacs T.
      • Wimo A.
      • et al.
      Costs of dementia in Hungary.
      • Brodszky V.
      • Péntek M.
      • Jelics N.
      • et al.
      Health-related costs of diabetes mellitus in adults treated with insulin. Cross-sectional survey of 480 patients in general practice and outpatient settings.
      • Simoens S.
      • Dunselman G.
      • Dirksen C.
      • et al.
      The burden of endometriosis: costs and quality of life of women with endometriosis and treated in referral centres.
      • Pentek M.
      • Gulacsi L.
      • Toth E.
      • Baji P.
      • Brodszky V.
      • Horvath C.
      Ten-year fracture risk by FRAX® of women with osteoporosis attending osteoporosis care in Hungary.
      ,
      • Hever N.V.
      • Pentek M.
      • Ballo A.
      • et al.
      Health related quality of life in patients with bladder cancer: a cross-sectional survey and validation study of the Hungarian version of the Bladder Cancer Index.
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
      • Péntek M.
      • Bereczki D.
      • Gulácsi L.
      • et al.
      Survey of epilepsy in adults in Hungary: quality of life and costs.
      • Pentek M.
      • Gulacsi L.
      • Majoros A.
      • et al.
      Health related quality of life and productivity of women with overactive bladder.
      • Tamas G.
      • Gulacsi L.
      • Bereczki D.
      • et al.
      Quality of life and costs in Parkinson’s disease: a cross sectional study in Hungary.
      • Péntek M.
      • Harangozó J.
      • Égerházi A.
      • et al.
      Quality of life and disease burden of patients with schizophrenia in Hungary.
      • Pentek M.
      • Gulacsi L.
      • Rozsa C.
      • et al.
      Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
      ,
      • Pentek M.
      • Brodszky V.
      • Biro Z.
      • et al.
      Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
      ,
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      ,
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      ,
      • Rupel V.P.
      • Ogorevc M.
      Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
      ,
      • Pentek M.
      • Beretzky Z.
      • Brodszky V.
      • et al.
      Health-related productivity of the Hungarian population. A cross-sectional survey.
      ,
      • Rencz F.
      • Brodszky V.
      • Gulacsi L.
      • et al.
      Parallel valuation of the EQ-5D-3L and EQ-5D-5L by time trade-off in Hungary.
      or self-reported conditions in 3 surveys in the general population (HHU,
      • Pentek M.
      • Beretzky Z.
      • Brodszky V.
      • et al.
      Health-related productivity of the Hungarian population. A cross-sectional survey.
      VHU,
      • Rencz F.
      • Brodszky V.
      • Gulacsi L.
      • et al.
      Parallel valuation of the EQ-5D-3L and EQ-5D-5L by time trade-off in Hungary.
      VPL
      • Golicki D.
      • Jakubczyk M.
      • Niewada M.
      • Wrona W.
      • Busschbach J.J.
      Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
      ). An overall dummy variable indicated the presence of any disease. Appendix Table 2 of Appendix 1 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024 summarizes the number of patients in each study with conditions categorized under the International Classification of Diseases, Tenth Revision, codes and the International Classification of Diseases, Tenth Revision, main chapters.
      International Statistical Classification of Diseases and Related Health Problems 10th Revision.

      PRO

      The second predictor group comprised data obtained from PROs such as EQ-VAS,
      • Feng Y.
      • Parkin D.
      • Devlin N.J.
      Assessing the performance of the EQ-VAS in the NHS Proms programme.
      happiness measured on an 11-point (0-10) numeric scale,
      Happiness in Nations: Overview of Happiness Surveys Using Measure Type. Erasmus University Rotterdam, Happiness Economics Research Organisation.
      and items of the Minimum European Health Module
      Glossary: Minimum European Health Module (MEHM). Eurostat.
      ,
      • Cox B.
      • van Oyen H.
      • Cambois E.
      • et al.
      The reliability of the Minimum European Health Module.
      : self-rated health and the limitations because of health problems (Global Activity Limitation Indicator). Scores from PRO instruments were transferred to a standard scale, with 0 representing the worst and 1 the best health status measurable with the given instrument. The applied PRO instruments (acronym; score of worst health status—score of best health status) were the Health Assessment Questionnaire Disability Index (3-0)
      • Fries J.F.
      • Spitz P.W.
      • Young D.Y.
      The dimensions of health outcomes: the health assessment questionnaire, disability and pain scales.
      in the psoriatic arthritis, rheumatoid arthritis, and SSC studies
      • Brodszky V.
      • Balint P.
      • Geher P.
      • et al.
      Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
      ,
      • Pentek M.
      • Kobelt G.
      • Czirjak L.
      • et al.
      Costs of rheumatoid arthritis in Hungary.
      ,
      • Minier T.
      • Pentek M.
      • Brodszky V.
      • et al.
      Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
      ; the Dermatology Life Quality Index (30-0)
      • Finlay A.Y.
      • Khan G.K.
      Dermatology Life Quality Index (DLQI)—a simple practical measure for routine clinical use.
      in the PP study
      • Poor A.K.
      • Sardy M.
      • Cserni T.
      • et al.
      Assessment of health-related quality of life in psoriasis patients in Hungary.
      ; the Barthel Index (0-100)
      • Mahoney F.
      • Barthel D.
      Functional evaluation: the Barthel index.
      in the cerebrovascular accident study
      • Golicki D.
      • Niewada M.
      • Buczek J.
      • et al.
      Validity of EQ-5D-5L in stroke.
      ; the Functional Recovery Score (0%-100%)
      • Zuckerman J.D.
      • Koval K.J.
      • Aharonoff G.B.
      • Hiebert R.
      • Skovron M.L.
      A functional recovery score for elderly hip fracture patients: I. Development.
      in the hip fracture study
      • Golicki D.
      • Sliwka A.
      • Fijewski G.
      • Latek M.
      Pos14 quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
      ; the Bladder Cancer Index (0-100) in the BC study
      • Gilbert S.M.
      • Dunn R.L.
      • Hollenbeck B.K.
      • et al.
      Development and validation of the Bladder Cancer Index: a comprehensive, disease specific measure of health related quality of life in patients with localized bladder cancer.
      ; and the International Prostate Symptom Score (35-0)
      • Barry M.J.
      • Fowler F.J.
      • O’Leary M.P.
      • et al.
      The American Urological Association symptom index for benign prostatic hyperplasia.
      in the BPH study.
      • Rencz F.
      • Kovacs A.
      • Brodszky V.
      • et al.
      Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
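      The transfer of PRO and physician-reported scores to a standard 0-1 scale described above can be illustrated with a short sketch. It assumes a linear (min-max) transformation, which the text does not state explicitly; the function name `rescale_to_unit` and the example scores are hypothetical, and only the (worst, best) ranges are taken from the text.

```python
def rescale_to_unit(score, worst, best):
    """Map a raw score linearly so that `worst` becomes 0.0 and `best` becomes 1.0.

    Works for scales where higher is better (e.g. 0-100) and for
    reversed scales where lower is better (e.g. 3-0).
    """
    if worst == best:
        raise ValueError("worst and best scores must differ")
    return (score - worst) / (best - worst)


# (worst, best) ranges as listed in the text
HAQ_DI = (3, 0)      # Health Assessment Questionnaire Disability Index
BARTHEL = (0, 100)   # Barthel Index
IPSS = (35, 0)       # International Prostate Symptom Score

# Hypothetical raw scores, for illustration only
haq_scaled = rescale_to_unit(1.5, *HAQ_DI)
barthel_scaled = rescale_to_unit(75, *BARTHEL)
ipss_scaled = rescale_to_unit(7, *IPSS)
print(haq_scaled, barthel_scaled, ipss_scaled)
```

      Passing the worst and best scores explicitly lets the same function handle instruments whose best health status is the lowest score, without a separate reversal step.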

      Data Analysis

      Missing data

      Missing data were handled via the indicator method. We imputed zeros for all missing values and generated a dummy indicator for each predictor denoting missing values. The indicator was set as missing in those general population studies in which respondents were not asked about self-reported conditions. In contrast, the disease dummy was set as absent for those patients who were asked about the presence of a disease and responded negatively. Comorbidities were not recorded in patient populations; therefore, the disease dummy was set as missing except for the index conditions.
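      A minimal sketch of the indicator method follows, assuming records are held as dictionaries with `None` marking a missing value. The function name `indicator_impute`, the `_missing` suffix, and the example predictors are hypothetical and are not taken from the study's code.

```python
def indicator_impute(records, predictors):
    """Impute 0 for each missing value and add a 0/1 missingness dummy per predictor."""
    imputed = []
    for record in records:
        row = {}
        for name in predictors:
            value = record.get(name)
            is_missing = value is None
            row[name] = 0.0 if is_missing else float(value)
            row[name + "_missing"] = 1 if is_missing else 0
        imputed.append(row)
    return imputed


# Hypothetical records: the first lacks EQ-VAS, the second lacks age
records = [
    {"age": 54, "eq_vas": None},
    {"age": None, "eq_vas": 80},
]
rows = indicator_impute(records, ["age", "eq_vas"])
for row in rows:
    print(row)
```

      Because each imputed zero is paired with its own dummy, a model can distinguish an imputed zero from a true zero.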

      Prediction models

      EQ-5D-3L index scores were predicted by OLS regression, eXtreme Gradient Boosting (XGBoost) classification (XGBC), and XGBoost regression (XGBR). A regular winner of ML competitions, XGBoost is a highly scalable and computationally efficient implementation of gradient boosted trees. Boosted decision trees are an ensemble of decision trees added sequentially. Each additional tree is trained to correct the errors of the ensemble of previous trees until no further improvements can be made on a validation data set. Gradient boosting grows the best trees by optimizing a loss function that comprises prediction error and a regularization term, which describes the complexity of the trees. Depending on the loss function, XGBoost can run in classification and regression mode, which predict EQ-5D scores in 243 categories or as a continuous value, respectively.

      Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Paper presented at: KDD’16: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; New York; 2016.

      ,
      • Bentéjac C.
      • Csörgő A.
      • Martínez-Muñoz G.
      A comparative analysis of gradient boosting algorithms.
      Patients with multiple measurements were entered as unique records. No weights were applied. In the OLS model, age was split into 5-year categories, and a piecewise model was fit on EQ-VAS scores with different slopes for the 0 to 34 and 35 to 100 value ranges. We entered predictors without interactions but explored an OLS model with interaction between disease dummies, gender, and age. For XGBoost, default settings were retained for most parameters after initial exploration and monitoring of train and test errors. The learning rate parameter was set to 0.1, the number of trees was set to 20 to improve speed of classification, and the L1 regularization term of regression was set to 0.9 to avoid overfitting.
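      The non-default XGBoost settings named above could be written down as in the following configuration sketch. The key names follow XGBoost's scikit-learn-style wrapper (`learning_rate`, `n_estimators`, `reg_alpha`). The objectives are assumptions, as is the reading that the 20-tree limit applies to classification and the L1 term to regression. The study's actual settings are reported in its Appendix 2.

```python
# Number of EQ-5D-3L health states: 5 dimensions with 3 levels each
N_HEALTH_STATES = 3 ** 5  # 243 classes in classification mode

# Setting stated in the text for XGBoost in general
COMMON_PARAMS = {"learning_rate": 0.1}

# Classification mode (XGBC); the objective is an assumption
XGBC_PARAMS = {
    **COMMON_PARAMS,
    "n_estimators": 20,            # number of trees, limited for speed
    "objective": "multi:softprob",
}

# Regression mode (XGBR); the objective is an assumption
XGBR_PARAMS = {
    **COMMON_PARAMS,
    "reg_alpha": 0.9,              # L1 regularization term
    "objective": "reg:squarederror",
}

print(N_HEALTH_STATES, XGBC_PARAMS, XGBR_PARAMS)
```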
      We performed predictions in 6 scenarios involving the general population sample (“Pop”), the patient sample (“Pts”), and the entire sample (“Total”), by using only demographic and disease predictors (“Base”) and adding PRO predictors (“PRO”). Model training and evaluation were performed via 10-fold cross-validation, using a randomly selected 80% of the data set for training and 20% for testing. OLS coefficient estimates, XGBoost settings, and feature importance tables for the PRO scenarios are presented in Appendix 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024 (Appendix Tables 3-5 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024).
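      One possible reading of the validation scheme is 10 rounds of random 80%/20% partitioning, sketched below. This is an assumption: the text does not say whether the 10 test sets were disjoint folds or independent random draws, and the function name `random_splits` and the seed are hypothetical.

```python
import random


def random_splits(n_records, n_rounds=10, train_fraction=0.8, seed=0):
    """Yield (train_indices, test_indices) for repeated random partitions."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * n_records))
    for _ in range(n_rounds):
        indices = list(range(n_records))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]


splits = list(random_splits(100))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

      Each round would train a model on the training indices and compute the evaluation metrics on the held-out indices.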

      Evaluation of Model Performance

      Models were compared via the “mean absolute error” (MAE) of prediction as an intuitive and stable measure when comparing scenarios with different sample sizes.
      • Willmott C.J.
      • Matsuura K.
      Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance.
      Furthermore, assuming that prediction errors smaller than 0.074, the minimum clinically important difference (MCD) of EQ-5D-3L, are barely noticeable,
      • Walters S.J.
      • Brazier J.E.
      Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D.
      yet greater errors in either direction are undesirable, we calculated the percent of predictions within the “clinically irrelevant error range” (ie, predictions within the true ± MCD range). Third, we assessed prediction bias via observing mean prediction errors. Finally, we calculated the percent of predictions within the “correct health severity group.” For this, according to the trimodal distribution of EQ-5D-3L, index values > 0.926 (ie, 1 − MCD) denoted “full health,” values between 0.926 and 0.45 denoted “medium health,” and values < 0.45 denoted “low health.”
      • Parkin D.
      • Devlin N.
      • Feng Y.
      What determines the shape of an EQ-5D index distribution?.
      ,
      • Brazier J.
      • Roberts J.
      • Tsuchiya A.
      • Busschbach J.
      A comparison of the EQ-5D and SF-6D across seven patient groups.
      (In “full health,” true EQ-5D-3L index scores are equal to 1, whereas predictions can take values > 0.926.)
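      The 4 evaluation criteria can be sketched as follows. Two choices are assumptions: prediction error is taken as predicted minus true, so that overprediction gives a positive bias, and values exactly at a threshold are assigned to medium health. The function names and example scores are hypothetical.

```python
MCD = 0.074             # minimum clinically important difference of EQ-5D-3L
FULL_THRESHOLD = 0.926  # 1 - MCD
LOW_THRESHOLD = 0.45


def severity_group(index_score):
    """Assign an EQ-5D-3L index score to a health severity group."""
    if index_score > FULL_THRESHOLD:
        return "full"
    if index_score < LOW_THRESHOLD:
        return "low"
    return "medium"


def evaluate(true_scores, predicted_scores):
    """Return MAE, mean error, % within true +/- MCD, and % in the correct severity group."""
    n = len(true_scores)
    errors = [p - t for t, p in zip(true_scores, predicted_scores)]
    n_within = sum(1 for e in errors if abs(e) < MCD)
    n_same_group = sum(
        1 for t, p in zip(true_scores, predicted_scores)
        if severity_group(t) == severity_group(p)
    )
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mean_error": sum(errors) / n,
        "pct_within_mcd": 100.0 * n_within / n,
        "pct_correct_group": 100.0 * n_same_group / n,
    }


# Hypothetical true and predicted index scores
true_scores = [1.0, 0.8, 0.3, 0.6]
predicted_scores = [0.95, 0.7, 0.5, 0.62]
result = evaluate(true_scores, predicted_scores)
print(result)
```

      In this toy example the first prediction (0.95 for a true score of 1) counts as both within the error range and in the correct group, which mirrors the remark that predictions above 0.926 are treated as full health.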
      To evaluate the reliability of predictions, from the 10 cross-validation sets, we calculated 95% confidence intervals for the evaluation metrics using the following formula:
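      $$95\%\ \mathrm{CI} = \overline{CV} \pm 1.96 \cdot \frac{1}{\sqrt{k}} \cdot SD \qquad (1)$$

      where $\overline{CV}$ is the mean and SD is the standard deviation of the evaluation metric (eg, MAE) across the k cross-validation sets.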
      • Bramer M.
      Principles of Data Mining.
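      The confidence interval calculation can be sketched as below, reading the formula as mean ± 1.96 × SD/√k. Using the sample SD (n − 1 denominator) is an assumption, because the text does not specify it; the function name and the fold values are hypothetical.

```python
import math
import statistics


def cv_confidence_interval(fold_metrics, z=1.96):
    """Return (lower, upper) bounds of the 95% CI for a metric across k validation sets."""
    k = len(fold_metrics)
    mean = statistics.mean(fold_metrics)
    sd = statistics.stdev(fold_metrics)  # sample SD; an assumption
    half_width = z * sd / math.sqrt(k)
    return mean - half_width, mean + half_width


# Hypothetical MAE values from 4 validation sets
fold_maes = [0.10, 0.12, 0.11, 0.13]
lower, upper = cv_confidence_interval(fold_maes)
print(lower, upper)
```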
      Given that algorithmic bias is of particular concern in healthcare applications,
      • Panch T.
      • Mattie H.
      • Atun R.
      Artificial intelligence and algorithmic bias: implications for health systems.
      we evaluated whether individuals in different health statuses may be affected adversely because of prediction error. First, in addition to mean prediction error, we quantified the percent of predictions within the clinically irrelevant error range and the percent of predictions within the correct health severity group across the full range of the true EQ-5D index. (Lower values denote greater risk of flawed predictions.) Second, assuming that decisions would be based on predicted and not the unknown true values, we evaluated bias and the probability of accurate predictions across the range of predicted EQ-5D-3L index values (ie, positive predictive value).
      • Tohka J.
      • van Gils M.
      Evaluation of machine learning algorithms for health and wellness applications: a tutorial.

      Results

      Sample Characteristics

      The database contained 28 862 records of 26 318 individuals. Cross-sectional studies of the general population provided 20 104 records, whereas 8758 records were from cross-sectional and cohort studies involving 6214 patients (single measurement, n = 3753; 2 measurements, n = 2378; 3 measurements, n = 83). Most data originated from Hungary (16 862 records; 58.4%), followed by Slovenia (6507 records, 22.6%) and Poland (5493 records, 19.0%). Counting the diagnosis related dummies as one variable, the 28 predictor variables contained 64.1% missing values. There were 214 missing data patterns across observations with a range of missing variables from 7 to 23. There were no complete cases in the data set. The predictor variables and missing values are summarized in Table 1.
      Table 1. Summary of predictor variables and missing values.

      | Predictor group | Variable | Category | General population (“Pop”) | Patients (“Pts”) | Entire sample (“Total”) |
      |---|---|---|---|---|---|
      | “Base” | Age | Mean | 41.1 | 55.6 | 45.4 |
      | | | SD | 15.5 | 16.6 | 17.2 |
      | | | Missing (%) | 0.0 | 4.2 | 1.3 |
      | | Gender | Male (%) | 54.6 | 50.4 | 53.3 |
      | | | Female (%) | 45.4 | 49.6 | 46.7 |
      | | | Missing (%) | 0.0 | 2.8 | 0.9 |
      | | Education | Primary (%) | 10.8 | 23.3 | 14.2 |
      | | | Secondary (%) | 39.7 | 57.6 | 44.6 |
      | | | Tertiary (%) | 49.5 | 19.1 | 41.2 |
      | | | Missing (%) | 0.0 | 12.9 | 3.9 |
      | | Place of residence | Capital (%) | 18.1 | 6.2 | 13.5 |
      | | | City (%) | 51.5 | 45.2 | 49.0 |
      | | | Village (%) | 30.4 | 48.6 | 37.5 |
      | | | Missing (%) | 56.8 | 36.6 | 50.7 |
      | | Family status | Single (%) | 36.2 | 35.0 | 36.1 |
      | | | Married (%) | 63.8 | 65.0 | 63.9 |
      | | | Missing (%) | 26.6 | 77.0 | 41.9 |
      | | Employment | Paid employment (%) | 68.6 | 42.9 | 61.6 |
      | | | Student (%) | 9.2 | 12.8 | 10.2 |
      | | | Pensioner (%) | 15.0 | 33.7 | 20.1 |
      | | | Not working (%) | 5.0 | 6.7 | 5.5 |
      | | | Other employment (%) | 2.2 | 3.9 | 2.6 |
      | | | Missing (%) | 0.0 | 13.4 | 4.1 |
      | | Relative income (0-11.0) | Mean | 1.6 | 0.5 | 1.6 |
      | | | SD | 1.5 | 0.3 | 1.5 |
      | | | Missing (%) | 32.0 | 97.2 | 51.8 |
      | | Setting | General population (%) | 100.0 | - | 69.7 |
      | | | Outpatient (%) | - | 30.1 | 9.1 |
      | | | Hospitalized (%) | - | 39.2 | 11.9 |
      | | | Postoperative (%) | - | 30.7 | 9.3 |
      | | Number of GP visits at 12 months | Mean | - | 4.0 | 4.0 |
      | | | SD | - | 6.1 | 6.1 |
      | | | Missing (%) | 100.0 | 78.7 | 93.5 |
      | | Any GP visit past year | No (%) | - | 49.1 | 49.1 |
      | | | Yes (%) | - | 50.9 | 50.9 |
      | | | Missing (%) | 100.0 | 78.7 | 93.5 |
      | | Specialist visits past year | Mean | - | 5.8 | 5.8 |
      | | | SD | - | 7.5 | 7.5 |
      | | | Missing (%) | 100.0 | 80.9 | 94.2 |
      | | Any specialist visits past year | No (%) | - | 17.1 | 17.1 |
      | | | Yes (%) | - | 82.9 | 82.9 |
      | | | Missing (%) | 100.0 | 80.9 | 94.2 |
      | | Hospitalizations past year | Mean | 0.2 | 1.7 | 0.7 |
      | | | SD | 0.6 | 3.8 | 2.2 |
      | | | Missing (%) | 90.0 | 86.5 | 89.0 |
      | | Any hospitalization at 12 months | No (%) | 90.4 | 41.9 | 72.5 |
      | | | Yes (%) | 9.6 | 58.1 | 27.5 |
      | | | Missing (%) | 90.0 | 86.5 | 89.0 |
      | | Informal care recipient | No (%) | 92.2 | 70.9 | 82.4 |
      | | | Yes (%) | 7.8 | 29.1 | 17.6 |
      | | | Missing (%) | 90.0 | 80.3 | 87.1 |
      | | Weight, kg | Mean | 76.1 | 75.2 | 75.6 |
      | | | SD | 16.1 | 16.8 | 16.4 |
      | | | Missing (%) | 88.9 | 72.8 | 84.0 |
      | | Height, cm | Mean | 171.5 | 167.6 | 169.7 |
      | | | SD | 9.4 | 9.7 | 9.8 |
      | | | Missing (%) | 88.8 | 77.7 | 85.5 |
      | | BMI | Mean | 25.8 | 26.8 | 26.3 |
      | | | SD | 4.8 | 5.2 | 5.0 |
      | | | Missing (%) | 88.9 | 77.8 | 85.5 |
      | | DRO score* | Mean | - | 0.7 | 0.7 |
      | | | SD | - | 0.2 | 0.2 |
      | | | Missing (%) | 100.0 | 91.1 | 97.3 |
      | | Chronic morbidity | No (%) | 68.5 | 0.0 | 68.5 |
      | | | Yes (%) | 31.5 | 0.0 | 31.5 |
      | | | Missing (%) | 90.0 | 100.0 | 93.0 |
      | | Any disease | No (%) | 70.5 | 0.0 | 29.5 |
      | | | Yes (%) | 29.5 | 100.0 | 70.5 |
      | | | Missing (%) | 68.8 | 0.0 | 47.9 |
      | | Specific diagnoses* | Not included in the table | | | |
      | “PRO” | Happiness | Mean | 7.6 | - | 7.6 |
      | | | SD | 2.0 | - | 2.0 |
      | | | Missing (%) | 90.0 | 100.0 | 93.0 |
      | | Self-rated health | Very good (%) | 20.7 | 0.0 | 20.7 |
      | | | Good (%) | 45.3 | 0.0 | 45.3 |
      | | | Fair (%) | 26.9 | 0.0 | 26.9 |
      | | | Bad (%) | 6.2 | 0.0 | 6.2 |
      | | | Very Bad (%) | 0.9 | 0.0 | 0.9 |
      | | | Missing (%) | 90.0 | 100.0 | 93.0 |
      | | GALI | Severely limited (%) | 3.3 | 0.0 | 3.3 |
      | | | Limited, but not severely (%) | 16.8 | 0.0 | 16.8 |
      | | | Not limited (%) | 79.9 | 0.0 | 79.9 |
      | | | Missing (%) | 90.0 | 100.0 | 93.0 |
      | | PRO score* | Mean | - | 0.7 | 0.7 |
      | | | SD | - | 0.3 | 0.3 |
      | | | Missing (%) | 100.0 | 80.7 | 94.1 |
      | | EQ-VAS (0-100) | Mean | 77.0 | 65.4 | 73.3 |
      | | | SD | 18.9 | 22.3 | 20.7 |
      | | | Missing (%) | 11.5 | 6.3 | 9.9 |

      Note. Mean, SD, and percent (%) values refer to nonmissing data. The percent of missing data was calculated for the entire sample.
      BMI indicates body mass index; DRO, physician-reported outcome; EQ-VAS, EuroQol visual analog scale; GALI, Global Activity Limitation Indicator; GP, general practitioner; Pop, population; PRO, patient-reported outcome; Pts, patients.
      * Details of PRO instruments and DRO instruments and disease dummies are omitted from the table.
      Mean (SD) EQ-5D-3L index scores were 0.847 (0.198), 0.665 (0.317), and 0.792 (0.254) in the general population, patients, and the total sample, respectively. In the general population sample, 3.7%, 49.0%, and 47.3%; in the patient sample, 14.2%, 62.3%, and 23.1%; and in the total sample, 6.9%, 53.2%, and 39.9% had EQ-5D-3L index scores in the low, medium, and full health categories, respectively.

      The Distribution of True and Predicted EQ-5D-3L Index Scores

      The distribution of XGBC predictions resembled the trimodal distribution of true EQ-5D scores, yet predictions of full health were more frequent. The distributions of XGBR and OLS were unimodal with left skew and a peak below full health (Appendix Fig. 1 of Appendix 2 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024). The range of full health across all scenarios was 23.1% to 47.2% for true EQ-5D-3L values, 43.4% to 93.9% for XGBC, 1.0% to 26.2% for XGBR, and 1.9% to 19.2% for OLS. XGBR predictions exceeded the value of 1 less frequently than those of OLS.

      Accuracy of Predictions

      In all scenarios, MAE was greatest for XGBC and lowest for XGBR followed closely by OLS. In the PRO scenario, MAE was 0.126, 0.113, and 0.118 in the population sample; 0.200, 0.159, and 0.162 in the patient sample; and 0.153, 0.126, and 0.131 in the total sample using XGBC, XGBR, and OLS, respectively. Adding PROs to demographic and disease-related predictors decreased MAE on average by 0.022 (Fig. 1).
      Figure 1. MAE of predictions by scenario.
      CI indicates confidence interval; MAE, mean absolute error; MCD, minimum clinically important difference; OLS, ordinary least squares; Pop, population; PRO, patient-reported outcome; Pts, patients; XGBC, eXtreme Gradient Boosting classification; XGBR, eXtreme Gradient Boosting regression.
      On the contrary, the percent of predictions within the clinically irrelevant error range (true ± MCD) was highest for XGBC with 51.9%, 39.1%, and 47.6%, followed by XGBR with 41.7%, 34.2%, and 39.5% and OLS with 38.2%, 33.1%, and 37.4% in the PRO scenario for the general population, patients, and the total sample, respectively. Adding PROs to the base predictors increased the percent of predictions within the clinically irrelevant error range on average by 6.6% (Fig. 2). Although mean prediction error for XGBR and OLS predictions was nearly zero in all scenarios, XGBC showed positive bias with mean error exceeding the MCD in all scenarios (range 0.086-0.097) (Appendix Fig. 2 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024).
      Figure 2. Percentage of predictions within clinically irrelevant error range (true ± MCD).
      MCD indicates minimum clinically important difference; OLS, ordinary least squares; Pop, population; PRO, patient-reported outcome; Pts, patients; XGBC, eXtreme Gradient Boosting classification; XGBR, eXtreme Gradient Boosting regression.
      In terms of the percent of predictions within the correct health severity group, XGBR followed closely by OLS outperformed XGBC. In the PRO scenario for the general population, patient, and total samples, 57.2%, 58.1%, and 56.3% of predictions using XGBC; 63.2%, 68.5%, and 64.9% using XGBR; and 60.5%, 68.2%, and 63.1% using OLS fell within the correct health severity group, respectively (Fig. 3). The narrow 95% confidence interval ranges suggested that the predictions of all 3 methods were reliable through the cross-validation rounds. The performance of OLS models with or without interaction terms was rather similar (Appendix Fig. 3 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024).
      Figure 3. Percentage of predictions in the correct health severity group.
      EQ-5D-3L indicates 3-level version of EQ-5D; OLS, ordinary least squares; Pop, population; PRO, patient-reported outcome; Pts, patients; XGBC, eXtreme Gradient Boosting classification; XGBR, eXtreme Gradient Boosting regression.

      Patterns of Prediction Error

      XGBC often predicted full health across the entire range of true EQ-5D-3L index values. The scatterplots of predicted over true values suggested that adding PROs to base predictors improved prediction accuracy mainly in the low health region and in patients (Appendix Fig. 4 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024). Regression to the mean was observed with all methods with positive bias (range 0.59-1.39) in low health and slight negative bias (range −0.01 to −0.23) in full health (Appendix Fig. 5 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024).

      Accuracy of Predictions by True EQ-5D-3L Index Scores

      Figure 4 illustrates the percent of predictions within clinically irrelevant error range by true EQ-5D-3L index scores. XGBC predictions were most accurate in the full health range, whereas XGBR and OLS predictions were most accurate in medium health. All methods were least accurate in the low health range, which improved after adding PROs to the base predictors. The proportions of predictions within the correct health severity category are depicted in Appendix Figure 6 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024.
      Figure 4. Percentage of predictions within clinically irrelevant error range by scenario and true EQ-5D-3L index scores.
      EQ-5D-3L indicates 3-level version of EQ-5D; MCD, minimum clinically important difference; OLS, ordinary least squares; Pop, population; PRO, patient-reported outcome; Pts, patients; XGBC, eXtreme Gradient Boosting classification; XGBR, eXtreme Gradient Boosting regression.

      Accuracy of Predictions by Predicted EQ-5D-3L Index Scores

      If full health was predicted by XGBR or OLS, those values were mostly within, whereas full health predictions by XGBC were mostly outside the clinically irrelevant error range (Fig. 5). If medium health was predicted, the accuracy of the 3 methods was similar, albeit moderate. Low health predictions were the least accurate, which improved when PROs were added to base predictors, especially in patients. The pattern was similar for predictions in the correct health severity group. Although XGBC predictions were most accurate in the medium health range, the accuracy of XGBR and OLS improved from low health toward the full health range (Appendix Fig. 7 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024). Although mean prediction error of XGBR and OLS was approximately zero across the entire range, the bias of XGBC depended on the predicted EQ-5D-3L index values (Appendix Fig. 8 of Appendix 3 in Supplemental Materials found at https://doi.org/10.1016/j.jval.2022.01.024).
      Figure 5. Percentage of predictions within clinically irrelevant error range by scenario and predicted EQ-5D-3L index scores.
      EQ-5D-3L indicates 3-level version of EQ-5D; MCD, minimum clinically important difference; OLS, ordinary least squares; Pop, population; PRO, patient-reported outcome; Pts, patients; XGBC, eXtreme Gradient Boosting classification; XGBR, eXtreme Gradient Boosting regression.

      Discussion

      We predicted EQ-5D-3L index scores via XGBC, XGBR, and OLS regression in a large international database combining multiple studies among patients and the general population with diverse predictors and a large amount of missing data. Across scenarios involving patients and the general population with and without PRO predictors, the percent of predictions within the clinically irrelevant error (true ± MCD) range was highest for XGBC and lowest for OLS with XGBR coming close. Nevertheless, MAE of prediction was lowest for XGBR followed by OLS and XGBC. Predictions with XGBC were biased. The performance of the 3 methods depended on the evaluation criteria, the target population, the predictor variables, and the EQ-5D-3L index range. Adding PROs to the demographic and disease-related predictors improved the accuracy of predictions.
      Several studies have already used ML to predict EQ-5D values as a binary threshold (Huber et al, 2019; Lee et al, 2014) or as a continuous measure. Borchani et al (2012) predicted EQ-5D-3L index scores from the 39-item Parkinson’s Disease Questionnaire using multidimensional Bayesian network classifiers (MAE for OLS 0.350; MAE for multidimensional Bayesian network classifier 0.174). Gutacker et al (2017) predicted postoperative health gains via classification and regression tree methodology (MAE ≤ root mean square error 0.158-0.224). Gao et al (2021) mapped heart disease-specific quality of life to EQ-5D-5L index scores using econometric models and a deep neural network algorithm (MAE for OLS 0.090-0.129; MAE for deep neural network 0.076-0.105). Recently, Mlynczak et al (2021) applied random forest for assessing the construct validity of EQ-5D-5L (MAE ≤ root mean square error 0.121). In these studies, when compared, ML usually outperformed traditional econometric methods (Huber et al, 2019; Borchani et al, 2012; Gao et al, 2021). Advanced econometric models were also used to predict EQ-5D index values accommodating its multimodal distribution and upper limit at full health. In a rheumatoid arthritis data set, MAE was 0.1505 with linear regression, 0.1508 with a random effects Tobit model, 0.1508 with an adjusted limited variable model (treating EQ-5D index score predictions > 0.883 as 1), and 0.1438 with a random effects adjusted limited variable mixture model (Hernandez Alava et al, 2012).
      The strength of our study is that the analysis was performed on a large and diverse data set of multiple studies resembling real-world data connected in health data ecosystems (Iacob and Simonelli, 2020). MAE was comparable with previous studies using ML or regression methods. We evaluated prediction accuracy via the percent of predictions within the “clinically irrelevant error range” by splitting absolute error into “irrelevant” (≤MCD) and “relevant” (>MCD) values. We argue that there are no established criteria for further classifying errors into “large” or “acceptable” ones. Nevertheless, erroneous predictions in any direction of any magnitude may negatively affect decisions. Therefore, by conveying clinically relevant information about the shape of error distribution, this metric has merit in the evaluation of predictive models in healthcare.
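      The evaluation criteria discussed above can be sketched in a few lines. This is a minimal illustration, assuming observed and predicted index scores are available as numeric arrays; the function name, the example scores, and the threshold of 0.074 are placeholders chosen for the example and are not taken from this section.

```python
import numpy as np

def error_summary(y_true, y_pred, mcd):
    """Summarise prediction errors relative to a chosen MCD threshold."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_pred - y_true          # signed error; its mean indicates bias
    abs_error = np.abs(error)
    return {
        "mae": float(abs_error.mean()),
        "mean_error": float(error.mean()),
        # share of predictions whose absolute error is "irrelevant" (<= MCD)
        "pct_within_mcd": float((abs_error <= mcd).mean() * 100),
    }

# Hypothetical observed and predicted index scores (not study data)
observed = [1.00, 0.80, 0.50, 0.20]
predicted = [0.95, 0.70, 0.55, 0.50]
summary = error_summary(observed, predicted, mcd=0.074)  # 0.074 is an assumed MCD
```

      In this toy example a single large error raises the MAE, whereas the percent within the MCD range only records that 2 of the 4 predictions were clinically irrelevant misses, which is why the two criteria can rank methods differently.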
      Our study has limitations. Despite the full data matrices of individual studies, the joint database had a large proportion of missing data, which was handled via the missing indicator method. Although this method has been criticized for its biasedness (Groenwold et al, 2012), it has recently been advocated in predictive or epidemiological research (Sperrin et al, 2020; Song et al, 2021). In our study, mean prediction error using XGBR and OLS was close to zero in all scenarios, whereas XGBC predictions were positively biased. Nevertheless, the performance of XGBoost is usually not affected by the imputation of missing data (Rusdah and Murfi, 2020). To prevent leakage of information and interference with the prediction methods, we did not apply multiple imputation techniques. Therefore, the information contained in the data set probably could not be used to its full capacity. The potential effect of missing data structures on the predictive performance of the models deserves further exploration, along with the use of more advanced data imputation techniques such as multiple imputation (Sterne et al, 2009) or LASSO regression (Lavanya et al, 2018).
      Although more advanced regression models are available to accommodate the multimodal distribution and upper limit of EQ-5D index scores at full health, simple OLS models are commonly applied to predict individual utilities (Hernandez Alava et al, 2012; Brazier et al, 2010). Adding further interaction terms did not affect the performance of our OLS model. Nevertheless, variable selection via LASSO regression from many interacted predictors has been shown to improve the performance of OLS in predicting EQ-5D-5L index scores (Hay et al, 2021). In addition, predictions were performed on unweighted data, which, through the nonrepresentative proportions of patients in the sample, may have introduced bias to the prediction results. Therefore, the external validity of our prediction models is probably limited and the accuracy of predictions may be further improved. As a future area of research, alternative prediction techniques, a combination of methods based on their performance in various EQ-5D-3L ranges, imputation of missing data, and weights reflecting the structure of the average population and disease epidemiology could be applied to improve the accuracy of predictions of individual EQ-5D-3L index scores. Furthermore, the external validity of the prediction models should be tested on multiple study populations that were not included in the training phase.
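      The missing indicator method mentioned among the limitations can be illustrated generically. The sketch below assumes numeric predictors stored in an array with NaN marking missing values; the function name, the fill value of 0, and the example matrix are hypothetical, and the study's actual coding is not described in this section.

```python
import numpy as np

def add_missing_indicators(X, fill_value=0.0):
    """Replace missing entries with a constant and append one 0/1 flag per predictor."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)                    # True where a value is absent
    filled = np.where(missing, fill_value, X)
    return np.hstack([filled, missing.astype(float)])

# Hypothetical pooled matrix: two predictors, each collected in only some studies
X = np.array([[63.0, np.nan],
              [np.nan, 2.0],
              [45.0, 1.0]])
X_aug = add_missing_indicators(X)
```

      Because the flags are deterministic functions of each row, no information from other respondents enters the predictors, which is one reason the approach avoids the leakage concerns raised for imputation models fitted across training and test data.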

      Conclusions

      In a large database of EQ-5D-3L studies, prediction errors of EQ-5D-3L index scores using XGBC, XGBR, and OLS were greater than the MCD for most respondents and depended on the applied method, performance evaluation criteria, the target population, applied predictors, and the EQ-5D-3L range. The performance of XGBR slightly exceeded OLS in most evaluation measures. Regression methods outperformed XGBC in terms of prediction accuracy and bias.
      Our results warn against overoptimistic expectations and prompt for care when using ML for predicting individual patient-reported health outcomes. We recommend the systematic and widespread collection of real-world PRO data using standardized PRO measures, including EQ-5D. In addition, we encourage dialogs between artificial intelligence and outcomes research experts to enhance the value of accumulating data in health systems.

      Article and Author Information

      Author Contributions: Concept and design: Zrubka, Csabai, Hermann, Golicki, Prevolnik-Rupel, Ogorevc, Gulácsi, Péntek
      Acquisition of data: Golicki, Prevolnik-Rupel, Ogorevc, Gulácsi, Péntek
      Analysis and interpretation of data: Zrubka, Csabai, Hermann
      Drafting of the manuscript: Zrubka, Péntek
      Critical revision of the paper for important intellectual content: Zrubka, Csabai, Hermann, Golicki, Prevolnik-Rupel, Ogorevc, Gulácsi, Péntek
      Statistical analysis: Zrubka, Csabai, Hermann
      Provision of study materials or patients: Golicki, Prevolnik-Rupel, Ogorevc, Gulácsi, Péntek
      Obtaining funding: Zrubka, Gulácsi
      Supervision: Gulácsi, Péntek
      Conflict of Interest Disclosures: Drs Zrubka, Csabai, Hermann, Golicki, Prevolnik-Rupel, Gulácsi and Péntek reported receiving funding from project number TKP2020-NKA-02 implemented with the support provided from the National Research, Development, and Innovation Fund of Hungary, financed under the Tématerületi Kiválósági Program funding scheme. Dr Zrubka reported receiving grants or contracts from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement number 679681); reported receiving consulting fees from Roche Hungary; and reported serving as a member of the ISPOR Digital Health Special Interest Group as co-chair of a key scientific project. Dr Csabai reported receiving funding from the Ministry of Innovation and Technology National Research Development and Innovation Office in the framework of the Artificial Intelligence National Laboratory Program (MILAB). Dr Golicki reported receiving research grants and travel reimbursement from the EuroQol Group. Dr Péntek reported receiving funding from project number 2019-1.3.1-KK-2019-00007 implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the 2019-1.3.1-KK funding scheme during writing this study. Drs Golicki, Prevolnik-Rupel, and Péntek reported being members of the EuroQol Group. No other disclosures were reported.
      Funding/Support: Under project number TKP2020-NKA-02, this research has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the Tématerületi Kiválósági Program funding scheme.
      Role of the Funder/Sponsor: The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

      Supplemental Materials

      References

        • Berwick D.
        • Black N.
        • Cullen D.
        • et al.
        Recommendations to OECD Ministers of Health from the high level reflection group on the future of health statistics: strengthening the international comparison of health system performance through patient-reported indicators. OECD.
      1. Endpoints used for relative effectiveness assessment of pharmaceuticals. Health-related quality of life and utility measures. EUnetHTA.
        www.eunethta.eu
        Date accessed: June 30, 2021
        • Kennedy-Martin M.
        • Slaap B.
        • Herdman M.
        • et al.
        Which multi-attribute utility instruments are recommended for use in cost-utility analysis? A review of national health technology assessment (HTA) guidelines.
        Eur J Health Econ. 2020; 21: 1245-1257
      2. EQ-5D is a recommended tool for use in cost-utility analyses around the globe. EuroQoL Group.
        • EuroQoL Group
        EuroQol — a new facility for the measurement of health-related quality of life.
        Health Policy. 1990; 16: 199-208
        • EESZT National eHealth Infrastructure
        National Centre for Healthcare Services.
        https://www.eeszt.gov.hu/hu/nyito-oldal
        Date accessed: June 30, 2021
        • Offerman A.
        Slovenia moving HIS applications to central eHealth platform (eZdravje). European Commission. Joinup Web Site.
        • Czerw A.
        • Fronczak A.
        • Witczak K.
        • Juszczyk G.
        Implementation of electronic health records in Polish outpatient health care clinics — starting point, progress, problems, and forecasts.
        Ann Agric Environ Med. 2016; 23: 329-334
        • Borges do Nascimento I.J.
        • Marcolino M.S.
        • Abdulazeem H.M.
        • et al.
        Impact of big data analytics on people’s health: overview of systematic reviews and recommendations for future studies.
        J Med Internet Res. 2021; 23: e27275
        • Boccia S.
        • Pastorino R.
        • Giraldi L.
        • van den Bergen K.
        Digitalisation and Big Data: Implications for the Health Sector.
        European Parliament, Brussels, Belgium; 2018
        • Davenport T.
        • Kalakota R.
        The potential for artificial intelligence in healthcare.
        Future Healthc J. 2019; 6: 94-98
        • Garai A.
        • Pentek I.
        • Adamko A.
        Revolutionizing healthcare with IoT and cognitive, cloud-based telemedicine.
        Acta Polytech Hung. 2019; 16: 163-181
        • Cohoon T.J.
        • Bhavnani S.P.
        Toward precision health: applying artificial intelligence analytics to digital health biometric datasets.
        Per Med. 2020; 17: 307-316
        • Iacob N.
        • Simonelli F.
        Towards a European health data ecosystem.
        Eur J Risk Regul. 2020; 11: 884-893
        • Devlin N.J.
        • Brooks R.
        EQ-5D and the EuroQol Group: past, present and future.
        Appl Health Econ Health Policy. 2017; 15: 127-137
        • Ernstsson O.
        • Janssen M.F.
        • Heintz E.
        Collection and use of EQ-5D for follow-up, decision-making, and quality improvement in health care — the case of the Swedish National Quality Registries.
        J Patient Rep Outcomes. 2020; 4: 78
        • Rencz F.
        • Gulacsi L.
        • Drummond M.
        • et al.
        EQ-5D in Central and Eastern Europe: 2000-2015.
        Qual Life Res. 2016; 25: 2693-2710
        • Boros J.
        • Györke J.
        • Pásztorné Stokker E.
        • Szabó Z.K.
        Results of the 2014 Health Interview Survey - summary data.
        KSH, Budapest, Hungary; 2018
        • Longworth L.
        • Rowen D.
        Mapping to obtain EQ-5D utility values for use in NICE health technology assessments.
        Value Health. 2013; 16: 202-210
        • Jia H.
        • Lubetkin E.I.
        Estimating EuroQol EQ-5D scores from Population Healthy Days data.
        Med Decis Mak. 2008; 28: 491-499
        • Tsiachristas A.
        • Potter C.M.
        • Rocks S.
        • et al.
        Estimating EQ-5D utilities based on the Short-Form Long Term Conditions Questionnaire (LTCQ-8).
        Health Qual Life Outcomes. 2020; 18: 279
        • Brodszky V.
        • Balint P.
        • Geher P.
        • et al.
        Disease burden of psoriatic arthritis compared to rheumatoid arthritis, Hungarian experiment.
        Rheumatol Int. 2009; 30: 199-205
        • Poor A.K.
        • Sardy M.
        • Cserni T.
        • et al.
        Assessment of health-related quality of life in psoriasis patients in Hungary.
        Orv Hetil. 2018; 159: 837-846
        • Balogh O.
        • Pentek M.
        • Gulacsi L.
        • et al.
        [Quality of life and burden of disease in peripheral arterial disease: a study among Hungarian patients].
        Orv Hetil. 2013; 154: 464-470
        • Pentek M.
        • Kobelt G.
        • Czirjak L.
        • et al.
        Costs of rheumatoid arthritis in Hungary.
        J Rheumatol. 2007; 34: 1437
        • Minier T.
        • Pentek M.
        • Brodszky V.
        • et al.
        Cost-of-illness of patients with systemic sclerosis in a tertiary care centre.
        Rheumatol (Oxf Engl). 2010; 49: 1920-1928
        • Ersek K.
        • Kovacs T.
        • Wimo A.
        • et al.
        Costs of dementia in Hungary.
        J Nutr Health Aging. 2010; 14: 633-639
        • Brodszky V.
        • Péntek M.
        • Jelics N.
        • et al.
        Health-related costs of diabetes mellitus in adults treated with insulin. Cross-sectional survey of 480 patients in general practice and outpatient settings.
        Diabetologia Hungarica. 2010; 19: 37-44
        • Simoens S.
        • Dunselman G.
        • Dirksen C.
        • et al.
        The burden of endometriosis: costs and quality of life of women with endometriosis and treated in referral centres.
        Hum Reprod. 2012; 27: 1292-1299
        • Pentek M.
        • Gulacsi L.
        • Toth E.
        • Baji P.
        • Brodszky V.
        • Horvath C.
        Ten-year fracture risk by FRAX((R)) of women with osteoporosis attending osteoporosis care in Hungary.
        Orv Hetil. 2016; 157: 146-153
        • Pulay A.J.
        • Bitter I.
        • Papp S.
        • et al.
        Exploring the relationship between quality of life (EQ-5D) and clinical measures in adult attention deficit hyperactivity disorder (ADHD).
        Appl Res Qual Life. 2016; 12: 409-424
        • Hever N.V.
        • Pentek M.
        • Ballo A.
        • et al.
        Health related quality of life in patients with bladder cancer: a cross-sectional survey and validation study of the Hungarian version of the Bladder Cancer Index.
        Pathol Oncol Res. 2015; 21: 619-627
        • Rencz F.
        • Kovacs A.
        • Brodszky V.
        • et al.
        Cost of illness of medically treated benign prostatic hyperplasia in Hungary.
        Int Urol Nephrol. 2015; 47: 1241-1249
        • Péntek M.
        • Bereczki D.
        • Gulácsi L.
        • et al.
        Survey of epilepsy in adults in Hungary: quality of life and costs.
        Ideggyógyászati Sz. 2013; 66: 262
        • Pentek M.
        • Gulacsi L.
        • Majoros A.
        • et al.
        Health related quality of life and productivity of women with overactive bladder.
        Orv Hetil. 2012; 153: 1068-1076
        • Tamas G.
        • Gulacsi L.
        • Bereczki D.
        • et al.
        Quality of life and costs in Parkinson’s disease: a cross sectional study in Hungary.
        PLoS One. 2014; 9: e107704
        • Péntek M.
        • Harangozó J.
        • Égerházi A.
        • et al.
        Quality of life and disease burden of patients with schizophrenia in Hungary.
        Psychiatr Hung. 2012; 27: 4-17
        • Pentek M.
        • Gulacsi L.
        • Rozsa C.
        • et al.
        Health status and costs of ambulatory patients with multiple sclerosis in Hungary.
        Ideggyogy Sz. 2012; 65: 316-324
        • Rencz F.
        • Gulácsi L.
        • Brodszky V.
        • Golicki D.
        • Ruzsa G.
        • Péntek M.
        PNS401 The first parallel EQ-5D-3L and EQ-5D-5L composite time trade-off valuation study in Europe.
        Value Health. 2019; 22
        • Baji P.
        • Brodszky V.
        • Rencz F.
        • Boncz I.
        • Gulacsi L.
        • Pentek M.
        Health status of the Hungarian population between 2000-2010.
        Orv Hetil. 2015; 156: 2035-2044
        • Pentek M.
        • Brodszky V.
        • Biro Z.
        • et al.
        Subjective health expectations of patients with age-related macular degeneration treated with antiVEGF drugs.
        BMC Geriatr. 2017; 17: 233
        • Pentek M.
        • Brodszky V.
        • Gulacsi A.L.
        • et al.
        Subjective expectations regarding length and health-related quality of life in Hungary: results from an empirical investigation.
        Health Expect. 2014; 17: 696-709
        • Donaldson C.
        • Baker R.
        • Mason H.
        • et al.
        European value of a quality adjusted life year — final publishable report.
        (Newcastle); 2010
        • Golicki D.
        • Jakubczyk M.
        • Niewada M.
        • Wrona W.
        • Busschbach J.J.
        Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe.
        Value Health. 2010; 13: 289-297
        • Golicki D.
        • Niewada M.
        General population reference values for 3-level EQ-5D (EQ-5D-3L) questionnaire in Poland.
        Pol Arch Med Wewn. 2015; 125: 18-26
        • Golicki D.
        • Niewada M.
        • Buczek J.
        • et al.
        Validity of EQ-5D-5L in stroke.
        Qual Life Res. 2015; 24: 845-850
        • Golicki D.
        • Sliwka A.
        • Fijewski G.
        • Latek M.
        POS14 Quality of life according to EQ-5D after osteoporotic hip fracture in Poland.
        Value Health. 2006; 9: A382-A383
        • Golicki D.
        • Zawodnik S.
        • Janssen M.F.
        • Kiljan A.
        • Hermanowski T.
        EQ1 Psychometric comparison of EQ-5D and EQ-5D-5L in student population.
        Value Health. 2010; 13: A240
        • Prevolnik Rupel V.
        • Srakar A.
        • Rand K.
        Valuation of EQ-5D-3L Health States in Slovenia: VAS based and TTO based value sets.
        Slovenian Journal of Public Health. 2019; 59: 8-17
        • Rupel V.P.
        • Rebolj M.
        The Slovenian VAS tariff based on valuations of EQ-5D health states from the general population. EuroQol 17th Plenary Meeting.
        (Pamplona, Spain); 2000
        • Rupel V.P.
        • Ogorevc M.
        Use of the EQ-5D instrument and value scale in comparing health states of patients in four health care programs among health care providers.
        Value Health Reg Issues. 2014; 4: 95-99
        • Pentek M.
        • Beretzky Z.
        • Brodszky V.
        • et al.
        Health-related productivity of the Hungarian population. A cross-sectional survey.
        Orv Hetil. 2020; 161: 1522-1533
      3. Self-Reported Population Health: An International Perspective Based on EQ-5D. Springer, Dordrecht, The Netherlands; 2014
        • Dolan P.
        Modeling valuations for EuroQol health states.
        Med Care. 1997; 35: 1095-1108
        • McCormack H.M.
        • Horne D.J.
        • Sheather S.
        Clinical applications of visual analogue scales: a critical review.
        Psychol Med. 1988; 18: 1007-1019
        • Folstein M.F.
        • Folstein S.E.
        • McHugh P.R.
        Mini-mental state.
        J Psychiatr Res. 1975; 12: 189-198
        • Kurtzke J.F.
        Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS).
        Neurology. 1983; 33: 1444-1452
        • Fredriksson T.
        • Pettersson U.
        Severe psoriasis—oral therapy with a new retinoid.
        Dermatologica. 1978; 157: 238-244
        • Rencz F.
        • Brodszky V.
        • Gulacsi L.
        • et al.
        Parallel valuation of the EQ-5D-3L and EQ-5D-5L by time trade-off in Hungary.
        Value Health. 2020; 23: 1235-1245
      4. International Statistical Classification of Diseases and Related Health Problems 10th Revision.
        https://icd.who.int/browse10/2016/ens
        Date accessed: January 5, 2019
        • Feng Y.
        • Parkin D.
        • Devlin N.J.
        Assessing the performance of the EQ-VAS in the NHS Proms programme.
        Qual Life Res. 2014; 23: 977-989
      5. Happiness in Nations: Overview of Happiness Surveys Using Measure Type. Erasmus University Rotterdam, Happiness Economics Research Organisation.
        http://worlddatabaseofhappiness.eur.nl
        Date accessed: March 20, 2019
      6. Glossary: Minimum European Health Module (MEHM). Eurostat.
        • Cox B.
        • van Oyen H.
        • Cambois E.
        • et al.
        The reliability of the Minimum European Health Module.
        Int J Public Health. 2009; 54: 55-60
        • Fries J.F.
        • Spitz P.W.
        • Young D.Y.
        The dimensions of health outcomes: the health assessment questionnaire, disability and pain scales.
        J Rheumatol. 1982; 9: 789-793
        • Finlay A.Y.
        • Khan G.K.
        Dermatology Life Quality Index (DLQI)—a simple practical measure for routine clinical use.
        Clin Exp Dermatol. 1994; 19: 210-216
        • Mahoney F.
        • Barthel D.
        Functional evaluation: the Barthel index.
        Md State Med J. 1965; 14: 61-65
        • Zuckerman J.D.
        • Koval K.J.
        • Aharonoff G.B.
        • Hiebert R.
        • Skovron M.L.
        A functional recovery score for elderly hip fracture patients: I. Development.
        J Orthop Trauma. 2000; 14: 20-25
        • Gilbert S.M.
        • Dunn R.L.
        • Hollenbeck B.K.
        • et al.
        Development and validation of the Bladder Cancer Index: a comprehensive, disease specific measure of health related quality of life in patients with localized bladder cancer.
        J Urol. 2010; 183: 1764-1769
        • Barry M.J.
        • Fowler F.J.
        • O’Leary M.P.
        • et al.
        The American Urological Association symptom index for benign prostatic hyperplasia.
        J Urol. 1992; 148: 1549-1557
      7. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Paper presented at: KDD’16: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; New York; 2016.

        • Bentéjac C.
        • Csörgő A.
        • Martínez-Muñoz G.
        A comparative analysis of gradient boosting algorithms.
        Artif Intell Rev. 2020; 54: 1937-1967
        • Willmott C.J.
        • Matsuura K.
        Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance.
        Clim Res. 2005; 30: 79-82
        • Walters S.J.
        • Brazier J.E.
        Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D.
        Qual Life Res. 2005; 14: 1523-1532
        • Parkin D.
        • Devlin N.
        • Feng Y.
        What determines the shape of an EQ-5D index distribution?
        Med Decis Mak. 2016; 36: 941-951
        • Brazier J.
        • Roberts J.
        • Tsuchiya A.
        • Busschbach J.
        A comparison of the EQ-5D and SF-6D across seven patient groups.
        Health Econ. 2004; 13: 873-884
        • Bramer M.
        Principles of Data Mining.
        Springer, London, United Kingdom; 2007
        • Panch T.
        • Mattie H.
        • Atun R.
        Artificial intelligence and algorithmic bias: implications for health systems.
        J Glob Health. 2019; 9: 010318
        • Tohka J.
        • van Gils M.
        Evaluation of machine learning algorithms for health and wellness applications: a tutorial.
        Comput Biol Med. 2021; 132: 104324
        • Huber M.
        • Kurz C.
        • Leidl R.
        Predicting patient-reported outcomes following hip and knee replacement surgery using supervised machine learning.
        BMC Med Inform Decis Mak. 2019; 19: 3
        • Lee S.K.
        • Son Y.J.
        • Kim J.
        • et al.
        Prediction model for health-related quality of life of elderly with chronic diseases using machine learning techniques.
        Healthc Inform Res. 2014; 20: 125-134
        • Borchani H.
        • Bielza C.
        • Martínez-Martín P.
        • Larranaga P.
        Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39).
        J Biomed Inform. 2012; 45: 1175-1184
        • Gutacker N.
        • Street A.
        Use of large-scale HRQoL datasets to generate individualised predictions and inform patients about the likely benefit of surgery.
        Qual Life Res. 2017; 26: 2497-2505
        • Gao L.
        • Luo W.
        • Tonmukayakul U.
        • Moodie M.
        • Chen G.
        Mapping MacNew Heart Disease Quality of Life Questionnaire onto country-specific EQ-5D-5L utility scores: a comparison of traditional regression models with a machine learning technique.
        Eur J Health Econ. 2021; 22: 341-350
        • Mlynczak K.
        • Golicki D.
        Validity of the EQ-5D-5L questionnaire among the general population of Poland.
        Qual Life Res. 2021; 30: 817-829
        • Hernandez Alava M.
        • Wailoo A.J.
        • Ara R.
        Tails from the peak district: adjusted limited dependent variable mixture models of EQ-5D questionnaire health state utility values.
        Value Health. 2012; 15: 550-561
        • Groenwold R.H.
        • White I.R.
        • Donders A.R.
        • Carpenter J.R.
        • Altman D.G.
        • Moons K.G.
        Missing covariate data in clinical research: when and when not to use the missing-indicator method for analysis.
        CMAJ. 2012; 184: 1265-1269
        • Sperrin M.
        • Martin G.P.
        • Sisk R.
        • Peek N.
        Missing data should be handled differently for prediction than for description or causal explanation.
        J Clin Epidemiol. 2020; 125: 183-187
      8. Song M, Zhou X, Pazaris M, Spiegelman D. The missing covariate indicator method is nearly valid almost always. Preprint. Posted online October 30, 2021. arXiv:2111.00138. https://doi.org/10.48550/arXiv.2111.00138.

        • Rusdah D.A.
        • Murfi H.
        XGBoost in handling missing values for life insurance risk prediction.
        SN Appl Sci. 2020; 2: 1
        • Sterne J.A.
        • White I.R.
        • Carlin J.B.
        • et al.
        Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls.
        BMJ. 2009; 338: b2393
        • Lavanya K.
        • Reddy L.
        • Eswara Reddy B.
        Modeling of missing data imputation using additive lasso regression model in Microsoft Azure.
        J Eng Appl Sci. 2018; 13: 6324-6334
        • Brazier J.E.
        • Yang Y.
        • Tsuchiya A.
        • Rowen D.L.
        A review of studies mapping (or cross walking) non-preference based measures of health to generic preference-based measures.
        Eur J Health Econ. 2010; 11: 215-225
        • Hay J.W.
        • Gong C.L.
        • Jiao X.
        • et al.
        A US population health survey on the impact of COVID-19 using the EQ-5D-5L.
        J Gen Intern Med. 2021; 36: 1292-1301