## Abstract

### Objectives

Many studies disregard the time dependence of nosocomial infection when examining length of hospital stay and the associated financial costs. This leads to the “time-dependent bias,” which biases multiplicative hazard ratios. We demonstrate the time-dependent bias on the additive scale of extra length of stay.

### Methods

To estimate the extra length of stay due to infection, we used a multistate model that accounted for the time of infection. For comparison we used a generalized linear model assuming a gamma distribution, a commonly used model that ignores the time of infection. We applied these two methods to a large prospective cohort of hospital admissions from Argentina, and compared the methods' performance using a simulation study.

### Results

For the Argentina data the extra length of stay due to nosocomial infection was 11.23 days when ignoring time dependence and only 1.35 days after accounting for the time of infection. The simulations showed that ignoring time dependence consistently overestimated the extra length of stay. This overestimation was similar for different rates of infection and even when an infection prolonged or shortened stay. We show examples where the time-dependent bias remains unchanged for the true discharge hazard ratios, but the bias for the extra length of stay is doubled because length of stay depends on the infection hazard.

### Conclusions

Ignoring the timing of nosocomial infection gives estimates that greatly overestimate its effect on the extra length of hospital stay.

## Introduction

Length of stay (LOS) in hospital is a key outcome when studying the health and economic impact of nosocomial infections (NIs) [1]. A patient with an NI is likely to stay longer in hospital, incurring extra cost. The cost arises because other patients are denied access to the hospital bed while it is used to treat the infected patient. There will be only small changes to financial expenditures from reducing NI because most are fixed within the cost structures of the hospital within the time frame of infection control decisions [2]. It is the short-run willingness to pay for the marginal “bed day” released by reducing risk of NI that represents the cost of bed days used to treat NI, and this value will vary by decision maker and jurisdiction.

Correctly quantifying the extra LOS (in days) due to NI is crucial for economic and policy decision making. However, this analysis is complicated by the fact that patients acquire NI during their hospital stay, so they have already spent some time in hospital before they become infected. This time requires specific consideration in the analysis by treating NI as a time-dependent exposure.

Many studies are prone to the “time-dependent bias” [3], including studies that fail to treat nosocomial infection as a time-dependent exposure [4]. This bias occurs when the risk sets dependent on time are not correctly addressed in the statistical analysis [5]. The bias has recently been rigorously studied by comparing the discharge hazard between infected and noninfected patients using hazard ratios (via Cox regression) [6, 7, 8]. However, policy makers prefer LOS because it has a simpler interpretation and is needed for estimating costs [2, 9].

In this article, we demonstrate the time-dependent bias in terms of LOS using a simulation study and real data. We show how to avoid the bias and give computer code to fit appropriate statistical models.

## Methods

### Ignoring the time of infection

Length of stay (in days) is often modeled using a generalized linear model assuming a gamma distribution because this helps to model its strongly skewed distribution [10]. Using an independent variable of infected (yes/no) assumes that a complete infection history is known at admission [4]. Therefore, estimates from this model will be subject to the time-dependent bias.

### Modeling the time of infection

New methods have been established that model the timing of NI and so avoid the time-dependent bias [11, 12]. These include a multistate model in which NI is an intermediate event between admission and discharge. Our model has three states: 0 = “no NI”, 1 = “NI”, and 2 = “discharged,” and we model the hazards among them (Fig. 1). The hazard is approximately equal to the probability of changing states in the next short time interval divided by the length of the interval, given that patients have been in the current state up to that time. We denote λ_{ij}(*t*) = λ_{ij} as the hazard for moving from state *i* to state *j*. An example hazard is

$$\lambda_{01}(t)\cdot\Delta t \approx P(\text{NI acquired by time } t + \Delta t \mid \text{no NI up to time } t).$$

The actual hazard λ_{01}(*t*) is obtained by taking limits as Δ*t* → 0.

For simplicity we assume that the hazards are constant in time, which allows us to demonstrate the key points concerning LOS. However, methods for the general case do exist, and we give references as appropriate.

Under a constant hazard assumption, one estimates the λ_{ij}'s using the usual ‘incidence rates’,

$$\widehat{\lambda}_{ij} = \frac{\text{number of } i \to j \text{ transitions}}{\text{person-time in state } i}$$

Using this model the mean extra LOS of an infected patient is 1/λ_{12}. The mean extra LOS of an uninfected patient (who may acquire infection later on) is the mean extra time in state 0 of ‘no NI’, 1/(λ_{01} + λ_{02}), plus the mean extra time in the infected state, 1/λ_{12}, times the probability of infection, λ_{01}/(λ_{01} + λ_{02}). The extra LOS due to an infection then is the difference in extra LOS for an infected and non-infected patient,

$$\text{Extra LOS (days)} = \left(\frac{\lambda_{02}}{\lambda_{12}} - 1\right)\frac{1}{\lambda_{01} + \lambda_{02}} \tag{1}$$
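As a quick numerical check of formula 1, the following sketch computes the extra LOS in two ways: as the difference between the two mean stays described above, and in closed form. The function names are illustrative and are not taken from the supplement; the hazards are those of the first “intensive care unit” scenario in Table 1.

```python
def mean_los_infected(lam12):
    """Mean remaining stay once infected: time in state 1 with exit hazard lam12."""
    return 1.0 / lam12


def mean_los_uninfected(lam01, lam02, lam12):
    """Mean remaining stay of a currently uninfected patient."""
    time_in_state0 = 1.0 / (lam01 + lam02)      # mean time before leaving state 0
    prob_infection = lam01 / (lam01 + lam02)    # chance that the exit is an infection
    return time_in_state0 + prob_infection * mean_los_infected(lam12)


def extra_los(lam01, lam02, lam12):
    """Closed form of formula 1."""
    return (lam02 / lam12 - 1.0) / (lam01 + lam02)


if __name__ == "__main__":
    lam01, lam02, lam12 = 0.01, 0.2, 0.12
    by_difference = mean_los_infected(lam12) - mean_los_uninfected(lam01, lam02, lam12)
    # Both routes give the same value, 3.17 days after rounding.
    print(round(by_difference, 2), round(extra_los(lam01, lam02, lam12), 2))
```

The two routes agree because the closed form is just the algebraic simplification of the difference.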

Three important facts follow from formula 1. First, the leading term ([(λ_{02})/(λ_{12})] − 1) is the ratio of the end-of-stay hazards ‘uninfected versus infected’ minus 1. This ratio determines whether infection prolongs or shortens LOS. The extra LOS is zero if λ_{02} = λ_{12}, is positive if λ_{02} > λ_{12}, and is negative if λ_{02} < λ_{12}. Second, it is a mathematical fact that not modeling the timing of infection will overestimate a genuine prolonging effect of infection on LOS in terms of the hazard ratio [6, 7, 8]. This happens because the risk set of infected patients is incorrectly increased and the risk set of uninfected patients is incorrectly decreased, so the hazard ratio (comparing infected with uninfected patients) is underestimated. Finally, the actual number of extra days following infection does not only depend on the end-of-stay hazards, but also on the infection hazard λ_{01}.

These three facts demonstrate that estimation of extra LOS will be biased if the timing of infection is *not* taken into account. In practice, bias is usually introduced by retrospectively stratifying patients into infected and noninfected. Under a constant hazards assumption, the extra LOS can be unbiasedly estimated by plugging in the incidence rates into formula 1. If the hazards are time dependent, however, then formula 1 will also be time dependent. A time dependent version of formula 1 can be created by giving more weight to those days when most infections occurred. Below, we have simulated data using constant hazards, but the analysis of both the simulated data and the real data did not rely on such an assumption. R code to run the time dependent version of formula 1 is given in the supplement (doi:10.1016/j.jval.2010.09.008).

The hazards for the multistate model were calculated using the R-package *mvna* [13]. The R-package *etm* [14] was used to estimate the mean extra length of stay and 95% confidence intervals for the mean. Both R-packages are available at: http://cran.r-project.org [15].

### Simulation study

We simulated individual LOS data (in days) using constant hazards and the multistate model in Figure 1. For each individual the time spent in state 0 was randomly generated with hazard λ_{01} + λ_{02} using an exponential distribution. The individual was randomly infected with probability λ_{01}/(λ_{01} + λ_{02}). For an infected individual the time spent in state 1 was randomly generated with hazard λ_{12}.

We generated 1,000 studies with 1,000 individuals for 18 scenarios (Table 1). These scenarios were chosen to mimic realistic lengths of stay from an intensive care unit (λ_{02} = 0.2) and longer average lengths of stay from a hospital (λ_{02} = 0.1). By varying λ_{12} we considered infections that: prolonged LOS (λ_{12} < λ_{02}), shortened LOS (λ_{12} > λ_{02}), and had no effect on LOS (λ_{12} = λ_{02}). A shorter LOS after infection is possible if the infection seriously harms the patient and so hastens their death. By varying λ_{01} we examined three infection rate levels: rare (5%), moderate (9%), and common (13%).

Table 1. Simulation results: estimated bias for the extra LOS (days).

| λ_{02} (setting) | λ_{01} | λ_{12} | Occurrence of infection | Effect of infection on LOS | True extra LOS (days) | Mean bias, gamma (days) | Mean bias, MSM (days) |
|---|---|---|---|---|---|---|---|
| 0.2 (“intensive care unit”) | 0.01 | 0.12 | Rare (5%) | Prolonging | 3.17 | 5.12 | −0.10 |
| 0.2 (“intensive care unit”) | 0.01 | 0.20 | Rare (5%) | No effect | 0.00 | 4.96 | −0.02 |
| 0.2 (“intensive care unit”) | 0.01 | 0.30 | Rare (5%) | Shortening | −1.59 | 4.93 | 0.05 |
| 0.2 (“intensive care unit”) | 0.02 | 0.12 | Moderate (9%) | Prolonging | 3.03 | 5.28 | −0.07 |
| 0.2 (“intensive care unit”) | 0.02 | 0.20 | Moderate (9%) | No effect | 0.00 | 5.02 | 0.01 |
| 0.2 (“intensive care unit”) | 0.02 | 0.30 | Moderate (9%) | Shortening | −1.52 | 4.86 | 0.03 |
| 0.2 (“intensive care unit”) | 0.03 | 0.12 | Common (13%) | Prolonging | 2.90 | 5.43 | −0.05 |
| 0.2 (“intensive care unit”) | 0.03 | 0.20 | Common (13%) | No effect | 0.00 | 5.01 | 0.01 |
| 0.2 (“intensive care unit”) | 0.03 | 0.30 | Common (13%) | Shortening | −1.45 | 4.79 | 0.02 |
| 0.1 (“hospital”) | 0.005 | 0.06 | Rare (5%) | Prolonging | 6.35 | 10.17 | −0.25 |
| 0.1 (“hospital”) | 0.005 | 0.10 | Rare (5%) | No effect | 0.00 | 9.97 | −0.01 |
| 0.1 (“hospital”) | 0.005 | 0.15 | Rare (5%) | Shortening | −3.17 | 9.81 | 0.10 |
| 0.1 (“hospital”) | 0.010 | 0.06 | Moderate (9%) | Prolonging | 6.06 | 10.56 | −0.14 |
| 0.1 (“hospital”) | 0.010 | 0.10 | Moderate (9%) | No effect | 0.00 | 9.96 | 0.05 |
| 0.1 (“hospital”) | 0.010 | 0.15 | Moderate (9%) | Shortening | −3.03 | 9.65 | 0.04 |
| 0.1 (“hospital”) | 0.015 | 0.06 | Common (13%) | Prolonging | 5.80 | 10.81 | −0.06 |
| 0.1 (“hospital”) | 0.015 | 0.10 | Common (13%) | No effect | 0.00 | 9.95 | −0.05 |
| 0.1 (“hospital”) | 0.015 | 0.15 | Common (13%) | Shortening | −2.90 | 9.56 | 0.03 |

The bias is the difference between the estimated and true extra LOS. There were 1000 studies for each scenario, and 1000 patients per study. The true extra LOS is from formula 1.

LOS, length of stay; MSM, multistate model.
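The simulation procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for Table 1. It makes three assumptions: it runs one large study instead of 1,000 studies of 1,000 patients, it uses the constant-hazard plug-in of formula 1 for the multistate estimate rather than the *etm* estimator, and it uses a plain difference in group means as a stand-in for the gamma model. All function names are hypothetical.

```python
import random


def simulate_study(n, lam01, lam02, lam12, rng):
    """Simulate n stays from the three-state model with constant hazards."""
    patients = []
    for _ in range(n):
        time0 = rng.expovariate(lam01 + lam02)               # time in state 0
        infected = rng.random() < lam01 / (lam01 + lam02)    # exit by infection?
        time1 = rng.expovariate(lam12) if infected else 0.0  # time in state 1
        patients.append((time0, time1, infected))
    return patients


def multistate_estimate(patients):
    """Plug incidence rates (transitions / person-time) into formula 1.

    Needs at least one infected patient, otherwise lam12 is undefined.
    """
    time_state0 = sum(t0 for t0, _, _ in patients)
    time_state1 = sum(t1 for _, t1, _ in patients)
    n_infected = sum(1 for _, _, inf in patients if inf)
    lam01 = n_infected / time_state0
    lam02 = (len(patients) - n_infected) / time_state0
    lam12 = n_infected / time_state1
    return (lam02 / lam12 - 1.0) / (lam01 + lam02)


def naive_estimate(patients):
    """Ignore timing: mean total LOS of ever-infected minus never-infected."""
    infected = [t0 + t1 for t0, t1, inf in patients if inf]
    uninfected = [t0 for t0, _, inf in patients if not inf]
    return sum(infected) / len(infected) - sum(uninfected) / len(uninfected)


if __name__ == "__main__":
    rng = random.Random(1)
    study = simulate_study(200_000, lam01=0.01, lam02=0.2, lam12=0.12, rng=rng)
    print("multistate:", round(multistate_estimate(study), 2))
    print("naive:     ", round(naive_estimate(study), 2))
```

In this sketch the naive comparison credits the whole stay of an infected patient, including the days before infection, to the infection, which is where the overestimation comes from.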

### Data from Buenos Aires, Argentina

We used prospectively collected data from an observational study of NI from the intensive care units of 11 hospitals in Buenos Aires, Argentina. The data were collected as part of the International Nosocomial Infection Control Consortium (INICC) [16]. INICC is an international non-profit, multicenter collaborative health-care-acquired infection control program that uses a surveillance system based on the US National Healthcare Safety Network (NHSN) [17]. The laboratory techniques used, training programs for data-collectors, definitions of infection, and surveillance activities were previously described [16]. All patients entering the intensive care unit for more than 24 hours were prospectively followed; detailed information was collected on each day. The following variables were used in this analysis: hospital name and location; date of admission, discharge, and death; date, type, and site of health-care acquired infection; Average Severity of Illness Score (ASIS); and age.

To investigate possible moderating or confounding effects of age and ASIS, we estimated the extra length of stay due to infection after stratifying on these variables.

## Results

### Simulation

The simulation results are shown in Table 1 and demonstrate the bias in realistic settings. For the “intensive care unit” scenario the gamma model overestimated the extra LOS by an average of around 5 days, and for the “hospital” scenario by around 10 days. In contrast, the multistate model estimated the correct extra LOS on average. The bias of the gamma model was consistent regardless of the rate of infection or the effect of infection. Changing the hazard for infection to discharge (λ_{12}) had a slight impact on the bias for the gamma model when the rate of infection was very high. A further simulation with λ_{01} = λ_{02} = 0.1 (infection rate = 50%) and λ_{12} = 0.06, 0.1 and 0.2, had a mean bias for the gamma model of 7.5, 10.0 and 13.3 days. The bias of the gamma model was always positive, even when an infection shortened the length of stay. So the gamma model did not even correctly estimate the direction of change, let alone its size.

Our results, ignoring the time of infection, were based on a parametric gamma model. As a sensitivity analysis, we instead used a non-parametric Kaplan-Meier model and found exactly the same biased estimated extra LOS due to infection (results not shown).
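One possible way to see why two naive analyses can agree: when every stay is observed to its end (no censoring), the area under a Kaplan-Meier curve equals the sample mean, so comparing Kaplan-Meier mean stays between groups reduces to comparing group means. The sketch below illustrates this on made-up stay lengths. It is not the sensitivity analysis reported here, and `km_mean` is a hypothetical helper.

```python
def km_mean(times):
    """Area under the Kaplan-Meier curve when every time is an observed event."""
    at_risk = len(times)
    survival = 1.0
    area = 0.0
    previous = 0.0
    for t in sorted(set(times)):
        events = times.count(t)
        area += survival * (t - previous)   # rectangle under the current step
        survival *= 1.0 - events / at_risk  # Kaplan-Meier product-limit update
        at_risk -= events
        previous = t
    return area


if __name__ == "__main__":
    stays = [1, 2, 2, 5, 10]  # made-up stay lengths in days, no censoring
    print(km_mean(stays), sum(stays) / len(stays))
```

With censored stays the two quantities would differ, so this equivalence is specific to complete follow-up.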

### Argentina

Summary statistics for the 9545 admissions are shown in Table 2. There were 826 admissions acquiring an NI (8.7%). The mean LOS was 5.8 days and the median was 4 days.

Table 2. Descriptive statistics for the intensive care unit data from Buenos Aires, Argentina, January 2003 to November 2008.

| Variable | Category | Statistic(s) |
|---|---|---|
| Admissions, n | | 9,545 |
| Age, mean (SD) | | 68.5 (17.8) |
| Gender, n (%) | Men | 5,018 (52.6) |
| ASIS, n (%) | A | 853 (9.7) |
| | B | 2,840 (32.4) |
| | C | 3,172 (36.2) |
| | D | 1,478 (16.9) |
| | E | 418 (4.8) |
| NI, n (%) | Yes | 826 (8.7) |
| LOS (days), mean (median) | | 5.8 (4.0) |

ASIS, average severity of illness score (see Table 4 for definitions); LOS, length of stay; NI, nosocomial infection.

Figure 2 shows the cumulative hazards for moving between three states for the multistate model. The cumulative infection hazard is roughly a straight line, meaning the slope is a constant infection hazard of about 0.02 (1 divided by 50 days). This means that each day an average of 2 patients out of 100 get an NI. The hazards of discharge are not straight lines, but the discharge hazard is consistently reduced for patients with an NI.
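To illustrate how a straight-line cumulative hazard corresponds to a constant hazard, the following sketch computes a Nelson-Aalen estimate of the cumulative infection hazard on simulated data and reads off its slope. It is a hand-written illustration under assumed constant hazards (λ_{01} = 0.02, λ_{02} = 0.2), not the *mvna* implementation used for Figure 2, and all function names are hypothetical.

```python
import random


def simulate_state0_exits(n, lam01, lam02, rng):
    """Return (exit time from state 0, exit was an infection) for n patients."""
    exits = []
    for _ in range(n):
        t = rng.expovariate(lam01 + lam02)
        infected = rng.random() < lam01 / (lam01 + lam02)
        exits.append((t, infected))
    return exits


def nelson_aalen_infection(exits, horizon):
    """Cumulative 0->1 hazard at `horizon`.

    Adds 1 / (number still in state 0) at each infection time; patients who
    are discharged uninfected only shrink the risk set.
    """
    at_risk = len(exits)
    cumulative = 0.0
    for t, infected in sorted(exits):
        if t > horizon:
            break
        if infected:
            cumulative += 1.0 / at_risk
        at_risk -= 1  # the patient leaves state 0 either way
    return cumulative


if __name__ == "__main__":
    rng = random.Random(1)
    exits = simulate_state0_exits(50_000, lam01=0.02, lam02=0.2, rng=rng)
    horizon = 5.0
    slope = nelson_aalen_infection(exits, horizon) / horizon
    print("estimated infection hazard per day:", round(slope, 3))
```

Dividing the cumulative hazard by the elapsed time recovers the daily hazard only when the curve is close to a straight line, as it is for infection in Figure 2.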

Using the multistate model and the real data from Argentina the expected length of stay on each day is shown for patients with and without an infection in Figure 3. The extra length of stay due to NI is greater for earlier days. The average extra length of stay over all days is calculated by weighting the difference in LOS on each day. This gives an estimated extra LOS of 1.35 days (95% confidence interval: 0.77–1.93 days). The gamma model estimated an extra LOS due to NI of 11.23 days (95% confidence interval: 10.10–12.44 days).

The extra lengths of stay estimated by age group and ASIS are shown in Table 3. Lengths of stay due to infection were longer in the youngest age group (under 60), with an average of close to 3 extra days. By ASIS there was an increasing extra length of stay with increasing morbidity (groups ‘A’ to ‘D’; Table 4), but there was no extra length of stay for the sickest patients who acquired an infection (group ‘E’). Infection may have hastened death in this group.

Table 3. Estimated extra length of stay by age group and ASIS for the intensive care unit data from Buenos Aires, Argentina, January 2003 to November 2008.

| Variable | Group | Admissions | Mean extra LOS (days) | 95% CI |
|---|---|---|---|---|
| Age group (years) | <60 | 2,013 | 2.98 | 1.44, 4.49 |
| | 60–79 | 4,655 | 0.76 | 0.07, 1.45 |
| | 80+ | 2,610 | 0.83 | 0.04, 1.62 |
| ASIS | A | 853 | −0.48 | −1.19, 0.22 |
| | B | 2,840 | 0.13 | −0.85, 1.12 |
| | C | 3,172 | 0.79 | 0.01, 1.58 |
| | D | 1,478 | 1.89 | 0.92, 2.87 |
| | E | 418 | −1.87 | −4.32, 0.58 |
| All admissions | | 9,545 | 1.35 | 0.77, 1.93 |

ASIS, average severity of illness score (see Table 4 for definitions); LOS, length of stay.

Table 4. Definitions for the five average severity of illness score categories.

| Letter | Definition |
|---|---|
| A | Surgical admissions who require routine postoperative observation only |
| B | Physiologically stable non-surgical patients who require overnight observation |
| C | Admissions who need continuous nursing care and monitoring |
| D | Physiologically unstable patients who require intensive nursing and medical care and need frequent reassessment and adjustment of therapy |
| E | Physiologically unstable admissions who are in a coma or shock and require cardiopulmonary resuscitation or intensive medical and nursing care with frequent reassessment |

## Discussion

Our results demonstrate the time-dependent bias in terms of extra LOS. The results show that the extra LOS is always overestimated when ignoring the time of infection. Using intensive care unit data from Argentina, the gamma model greatly overestimates the extra LOS due to infection. The consequences of this are compounded when extrapolating to economic costs.

The costs of NI expressed in monetary values are used by advocates to justify extra infection control investments. Spending money to save money is a powerful argument that will resonate with decision makers. Typically a dollar value is applied to each bed day lost to NI and the aggregate cost outcome disseminated to decision makers; the promise is that cost savings will be enjoyed if cases of infection are prevented. A value of $700 assigned to each hospital bed day lost to NI in Argentina implies cost savings of $7861 (95% confidence interval: $7070–$8708) using biased estimates and $945 (95% confidence interval: $602–$1288) using unbiased estimates. Two possible outcomes from using biased estimates to make the economic case are that too many resources are allocated to infection control, and that decision makers are disappointed with the actual savings. Unbiased estimates should be used to inform decisions, and decisions should include information on changes to all costs and changes to health benefits, such as reduced risk of mortality and reduced morbidity [18].

To examine the bias in a hospital setting with generally longer length of stay compared with an intensive care unit, we multiplied all three constant hazards by a half. This multiplication does not alter the true hazard ratio, but the bias in terms of extra length of stay is roughly doubled (Table 1). This shows the additional value of examining the time-dependent bias in terms of the additive length of stay compared with multiplicative hazard ratios.
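Formula 1 makes this scaling explicit for the true extra LOS. The following is a short worked rearrangement, added for illustration, in which $c$ is an assumed common multiplier applied to all three hazards:

```latex
\text{Extra LOS}(c\lambda_{01},\, c\lambda_{02},\, c\lambda_{12})
  = \left(\frac{c\lambda_{02}}{c\lambda_{12}} - 1\right)\frac{1}{c\lambda_{01} + c\lambda_{02}}
  = \frac{1}{c}\left(\frac{\lambda_{02}}{\lambda_{12}} - 1\right)\frac{1}{\lambda_{01} + \lambda_{02}}
```

With $c = 1/2$ the ratio λ_{02}/λ_{12} is unchanged while the true extra LOS doubles, for example from 3.17 days in the first “intensive care unit” row of Table 1 to 6.35 days in the first “hospital” row.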

This study has a few limitations. In our simulations we assumed constant hazards between states; however, the multistate model allows hazards to change over time. We did not distinguish between patients dying while in the hospital and being discharged alive. However, the multistate model can be extended to incorporate such competing events. This makes it possible to estimate extra LOS due to NI for patients who died, and a separate extra LOS for those who were discharged [19].

Nosocomial infections may develop during hospital stay, but may only become symptomatic and detected after discharge. If asymptomatic patients suffer the same morbidity as infected patients, then the methods used here will underestimate the increased length of stay due to infection because some patients in the non-infected group are misclassified. The size of underestimation will depend on how many patients are misclassified. An interesting approach to this problem is to model an imperfect sensitivity for the test of NI [20].

The methodology in this article is appropriate for studying statistical association. In various fields of epidemiology there has been increasing interest in statistical methodology for causal effects rather than association, typically motivated by time-dynamic treatment regimes. An investigation of the impact of time-dependent adverse events on LOS has been conducted [21]. Our study makes some headway into causation because the temporal sequence of events is accounted for. This is crucial to any concept of causality, because the cause has to precede the effect. It is precisely this temporal sequence that is ignored in the time-dependent bias. We have, however, refrained from a more in-depth treatment of the issue of causality, because the development of causal models for time-dependent exposures is ongoing.

Our findings are not limited to studies of infection; they are extendable to other settings with a time-dependent exposure; therefore, they are relevant for an appropriate risk-adjustment in other studies. Future studies should be aware of the time-dependent bias; ignoring it leads to overestimated lengths of stay and economic impacts. The bias applies to any method (Kaplan–Meier, Cox, and generalized linear [mixed] models with any type of distribution) unless the time of infection is taken into account.

## References

1. Economics and preventing hospital-acquired infection. *Emerging Infect Dis.* 2004; 10: 561-566
2. Estimating the cost of health care-associated infections: mind your p's and q's. *Clin Infect Dis.* 2010; 50: 1017-1021
3. Time-dependent bias was common in survival analyses published in leading clinical journals. *J Clin Epidemiol.* 2004; 57: 672-682
4. Bayesian analysis of nosocomial infection risk and length of stay in a department of general and digestive surgery. *Value Health.* 2010; 13: 431-439
5. Two pitfalls in survival analyses of time-dependent exposure: a case study in a cohort of Oscar nominees. *Am Stat.* 2010; 64: 205-211
6. An easy mathematical proof showed that time-dependent bias inevitably leads to biased effect estimation. *J Clin Epidemiol.* 2008; 61: 1216-1221
7. Efficient risk set sampling when a time-dependent exposure is present. *Methods Inf Med.* 2009; 48: 438-443
8. The impact of time-dependent bias in proportional hazards modelling. *Stat Med.* 2008; 27: 6439-6454
9. Estimation of extra hospital stay attributable to nosocomial infections: heterogeneity and timing of events. *J Clin Epidemiol.* 2000; 53: 409-417
10. Comparing alternative models: log vs Cox proportional hazard? *Health Econ.* 2004; 13: 749-765
11. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. *Infect Control Hosp Epidemiol.* 2006; 27: 493-499
12. Using a longitudinal model to estimate the effect of methicillin-resistant *Staphylococcus aureus* infection on length of stay in an intensive care unit. *Am J Epidemiol.* 2009; 170: 1186-1194
13. mvna: An R package for the Nelson-Aalen estimator in multistate models. *R News.* 2008; 8: 48-50
14. Empirical transition matrix of multistate models: the etm package. *J Stat Softw.* 2010; (To appear)
15. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2005
16. The International Nosocomial Infection Control Consortium (INICC): goals and objectives, description of surveillance methods, and operational activities. *Am J Infect Control.* 2008; 36: 1-12
17. National Healthcare Safety Network (NHSN) report, data summary for 2006, issued June 2007. *Am J Infect Control.* 2007; 35: 290-301
18. Economics and preventing hospital-acquired infection: broadening the perspective. *Infect Control Hosp Epidemiol.* 2007; 28: 178-184
19. Allignol A, Schumacher M, Beyersmann J. Estimating summary functionals in multistate models with an application to hospital infection data [online journal]. *Computational Statistics.* 6 June 2010; https://doi.org/10.1007/s00180-010-0200-x
20. An augmented data method for the analysis of nosocomial infection data. *Am J Epidemiol.* 2008; 168: 548-557
21. A simulation-based evaluation of methods to estimate the impact of an adverse event on hospital length of stay. *Med Care.* 2007; 45: S108-S115

## Article info

### Footnotes

Funding: Martin Wolkewitz, Jan Beyersmann and Arthur Allignol were supported by Deutsche Forschungsgemeinschaft (FOR 534). Martin Wolkewitz's visit to Brisbane was supported by the Institute of Health and Biomedical Innovation visiting researcher program.

### Copyright

© 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.
