Objectives
Work productivity loss among adults with depression is associated with multiple patient characteristics. The current study predicted total work impairment resulting from absenteeism and presenteeism using regularized linear regression and a decision-tree-based ensemble algorithm.
Methods
Data on employed US adults (18-64 years old) from the 2019 National Health and Wellness Survey were analyzed. The analysis sample included respondents who self-reported a diagnosis of depression or having experienced depression in the past 12 months. Work productivity loss was derived from the Work Productivity and Activity Impairment questionnaire. Group LASSO with Nesterov’s method and XGBoost regression were used separately to predict work impairment and to extract model feature importance. Given the count-like nature of productivity loss, a Poisson distribution was specified in both LASSO and XGBoost. Variable selection was based on the Akaike Information Criterion (AIC) model fit statistic for LASSO and on the gain in feature importance for XGBoost. Forty variables covering respondent demographics, health behavior (e.g., smoking and alcohol use), depression-related variables, comorbidities, and doctor visits were entered into both models. Data were split into training, validation, and testing datasets. Hyperparameters were tuned on the validation data. Root mean squared errors (RMSE) on the testing data were compared to assess model performance.
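As an illustration of how a total work impairment outcome can combine absenteeism and presenteeism, the following minimal sketch assumes the standard published WPAI scoring for overall work impairment. The abstract does not state the exact derivation used in the study, and the function and argument names here are hypothetical.

```python
def wpai_overall_impairment(hours_missed, hours_worked, productivity_rating):
    """Percent overall work impairment from WPAI-style items (illustrative).

    hours_missed: hours of work missed because of health in the past 7 days
    hours_worked: hours actually worked in the past 7 days
    productivity_rating: 0-10 rating of how much health affected productivity
                         while working (0 = no effect, 10 = completely prevented)
    """
    if not 0 <= productivity_rating <= 10:
        raise ValueError("productivity_rating must be between 0 and 10")
    if hours_missed < 0 or hours_worked < 0:
        raise ValueError("hours must be non-negative")
    total_hours = hours_missed + hours_worked
    if total_hours == 0:
        raise ValueError("no scheduled hours; impairment is undefined")

    # Absenteeism: share of scheduled work time missed because of health.
    absenteeism = hours_missed / total_hours
    # Presenteeism: impairment while at work, rescaled from 0-10 to 0-1.
    presenteeism = productivity_rating / 10
    # Overall impairment: time missed plus impaired share of time worked.
    overall = absenteeism + (1 - absenteeism) * presenteeism
    return 100 * overall


# Assumed example respondent: 8 hours missed, 32 hours worked, rating of 5.
example_score = wpai_overall_impairment(8, 32, 5)
```

Under these assumptions the example respondent scores about 60% overall impairment: 20% of time missed, plus half of the remaining 80% of time worked.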
Results
Among 11,478 working adults with depression, XGBoost made more accurate predictions than LASSO (RMSE=26.6 and 27.6, respectively). Overestimation of impairment was slightly greater with the LASSO model than with XGBoost (mean impairment=33% and 30%, respectively). The LASSO model selected more demographic and health behavior variables than XGBoost, which ranked comorbidity variables (arthritis, sleep conditions, migraine, liver or renal diseases) as the most important features in predicting productivity loss.
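The two comparison metrics reported here, test-set RMSE and the degree of overestimation, can be sketched as follows. This is a generic illustration rather than the study's code: the helper names are hypothetical and the numbers are invented toy values, not study data.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_bias(observed, predicted):
    """Mean prediction minus mean observation; positive means overestimation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(predicted) / len(predicted) - sum(observed) / len(observed)


# Invented toy values (percent impairment) for a held-out test set.
observed = [0, 20, 40, 60]
model_a_predictions = [10, 20, 40, 50]
model_b_predictions = [20, 30, 50, 60]

rmse_a = rmse(observed, model_a_predictions)
rmse_b = rmse(observed, model_b_predictions)
bias_a = mean_bias(observed, model_a_predictions)
bias_b = mean_bias(observed, model_b_predictions)
```

In this toy case the first model has the lower RMSE and no mean bias, while the second overestimates impairment by 10 percentage points on average.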
Conclusions
In a broadly representative US population of working adults with depression, the XGBoost model was found to predict productivity loss better than LASSO.
Article info
Copyright
© 2021 Published by Elsevier Inc.
Elsevier user license
Permitted
For non-commercial purposes:
- Read, print & download
- Text & data mine
- Translate the article
Not Permitted
- Reuse portions or extracts from the article in other works
- Redistribute or republish the final article
- Sell or re-use for commercial purposes