Article · Wikipedia archive · Last revised Jun 14, 2026

Hurdle model

A hurdle model is a class of statistical models in which a random variable is modelled in two parts: the first gives the probability of attaining the value 0, and the second models the probabilities of the non-zero values. Hurdle models are often motivated by an excess of zeroes in the data that more standard statistical models do not adequately account for.

In a hurdle model, a random variable x is modelled as

$$\Pr(x=0)=\theta$$
$$\Pr(x\neq 0)=p_{x\neq 0}(x)$$

where $p_{x\neq 0}(x)$ is a probability distribution function truncated at 0.
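As an illustration, the two parts can be combined into a single probability mass function. The sketch below is not taken from the cited sources. It assumes a Poisson distribution for the counts and writes the normalisation explicitly, so that the non-zero values carry total mass $1-\theta$. The function names `poisson_pmf` and `hurdle_poisson_pmf` and the parameter values are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of count k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def hurdle_poisson_pmf(k, theta, lam):
    """Hurdle pmf: mass theta at zero, zero-truncated Poisson elsewhere.

    Assumption for illustration: the non-zero part is a Poisson
    distribution truncated at 0 and scaled by (1 - theta).
    """
    if k == 0:
        return theta
    # Zero-truncated Poisson: renormalise the Poisson mass over k >= 1.
    truncated = poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))
    # Scale by the probability of clearing the hurdle.
    return (1.0 - theta) * truncated

# The probabilities sum to 1 (up to a negligible tail beyond k = 59).
total = sum(hurdle_poisson_pmf(k, theta=0.6, lam=2.0) for k in range(60))
```

Here `theta` is set independently of `lam`, so the probability of a zero is not tied to the count distribution.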

Hurdle models were introduced by John G. Cragg in 1971,[1] where the non-zero values of x were modelled using a normal model, and a probit model was used to model the zeros. The probit part of the model was said to model the presence of "hurdles" that must be overcome for x to take a non-zero value, hence the name hurdle model. Hurdle models were later developed for count data, with Poisson, geometric,[2] and negative binomial[3] models for the non-zero counts.

Relationship with zero-inflated models

Hurdle models differ from zero-inflated models in that zero-inflated models describe the zeros using a two-component mixture model. With a mixture model, the probability of the variable being zero is determined by both the main distribution function $p(x=0)$ and the mixture weight $\pi$. Specifically, a zero-inflated model for a random variable x is

$$\Pr(x=0)=\pi+(1-\pi)\times p(x=0)$$
$$\Pr(x=h_{i})=(1-\pi)\times p(x=h_{i})$$

where $\pi$ is the mixture weight that determines the amount of zero-inflation. A zero-inflated model can only increase the probability $\Pr(x=0)$ relative to the main distribution, but this is not a restriction in hurdle models.[4]
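The restriction follows from the mixture form: for a weight $\pi$ between 0 and 1, $\Pr(x=0)-p(x=0)=\pi\,(1-p(x=0))\geq 0$. The sketch below checks this numerically. It assumes a Poisson main distribution with rate 2, and the function name and parameter values are hypothetical choices made for illustration.

```python
import math

def zero_inflated_zero_prob(pi, p0):
    """Pr(x = 0) under a zero-inflated model with mixture weight pi."""
    return pi + (1.0 - pi) * p0

# Assumed main distribution: Poisson with rate 2, so p(x = 0) = exp(-2),
# which is about 0.135.
p0 = math.exp(-2.0)

# For any mixture weight in [0, 1], the zero probability is at least p0.
weights = (0.0, 0.25, 0.5, 1.0)
zi_probs = [zero_inflated_zero_prob(pi, p0) for pi in weights]

# In a hurdle model the zero probability is a free parameter, so it can
# also sit below p0 (fewer zeros than the Poisson part alone would give).
hurdle_theta = 0.05
```

With these values every entry of `zi_probs` is at least `p0`, while `hurdle_theta` lies below it, a case the zero-inflated form cannot represent.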

References

  1. Cragg, John G. (1971). "Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods". Econometrica. 39 (5): 829–844. doi:10.2307/1909582. JSTOR 1909582.
  2. Mullahy, John (1986). "Specification and testing of some modified count data models". Journal of Econometrics. 33 (3): 341–365. doi:10.1016/0304-4076(86)90002-3.
  3. Welsh, A. H.; Cunningham, R. B.; Donnelly, C. F.; Lindenmayer, D. B. (1996). "Modelling the abundance of rare species: statistical models for counts with extra zeros". Ecological Modelling. 88 (1–3): 297–308. doi:10.1016/0304-3800(95)00113-1.
  4. Min, Yongyi; Agresti, Alan (2005). "Random effect models for repeated measures of zero-inflated count data". Statistical Modelling. 5 (1): 1–19. CiteSeerX 10.1.1.296.3503. doi:10.1191/1471082X05st084oa. S2CID 2400918.