Article · Wikipedia archive · Last revised Jun 1, 2026

Half-normal distribution

In probability theory and statistics, the half-normal distribution is a special case of the folded normal distribution.

Figure: Probability density function of the half-normal distribution, $\sigma = 1$
Figure: Cumulative distribution function of the half-normal distribution, $\sigma = 1$
Parameters: $\sigma > 0$ (scale)
Support: $x \in [0, \infty)$
PDF: $f(x;\sigma) = \frac{\sqrt{2}}{\sigma\sqrt{\pi}} \exp\left(-\frac{x^{2}}{2\sigma^{2}}\right), \quad x > 0$
CDF: $F(x;\sigma) = \operatorname{erf}\left(\frac{x}{\sigma\sqrt{2}}\right)$
Quantile: $Q(F;\sigma) = \sigma\sqrt{2}\,\operatorname{erf}^{-1}(F)$
Mean: $\frac{\sigma\sqrt{2}}{\sqrt{\pi}} \approx 0.797885\,\sigma$
Median: $\sigma\sqrt{2}\,\operatorname{erf}^{-1}(1/2) \approx 0.674490\,\sigma$
Mode: $0$
Variance: $\sigma^{2}\left(1 - \frac{2}{\pi}\right)$
Skewness: $\frac{\sqrt{2}\,(4 - \pi)}{(\pi - 2)^{3/2}} \approx 0.9952717$
Excess kurtosis: $\frac{8(\pi - 3)}{(\pi - 2)^{2}} \approx 0.869177$
Entropy: $\frac{1}{2}\log_{2}\left(2\pi e\sigma^{2}\right) - 1$
MGF: $\exp\left(\frac{\sigma^{2}t^{2}}{2}\right)\operatorname{erfc}\left(-\frac{\sigma t}{\sqrt{2}}\right)$
CF: $w\left(\frac{\sigma t}{\sqrt{2}}\right)$, where $w(x)$ is the Faddeeva function


Let $X$ follow an ordinary normal distribution, $N(0, \sigma^{2})$. Then $Y = |X|$ follows a half-normal distribution. Thus, the half-normal distribution is a fold at the mean of an ordinary normal distribution with mean zero.
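The folding construction can be illustrated with a short simulation. The following is a minimal sketch using only the Python standard library; the function name `sample_half_normal`, the seed, and the sample size are illustrative choices, not part of the definition.

```python
import math
import random


def sample_half_normal(sigma, n, seed=0):
    """Draw n half-normal samples by folding N(0, sigma^2) draws at zero."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return [abs(rng.gauss(0.0, sigma)) for _ in range(n)]


samples = sample_half_normal(sigma=2.0, n=200_000)
empirical_mean = sum(samples) / len(samples)
theoretical_mean = 2.0 * math.sqrt(2.0 / math.pi)  # sigma * sqrt(2 / pi)

print(f"empirical mean   = {empirical_mean:.4f}")
print(f"theoretical mean = {theoretical_mean:.4f}")
```

With a sample of this size the two printed means should agree to roughly two decimal places.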

Properties

Using the $\sigma$ parametrization of the normal distribution, the probability density function (PDF) of the half-normal is given by

$$f_{Y}(y;\sigma) = \frac{\sqrt{2}}{\sigma\sqrt{\pi}} \exp\left(-\frac{y^{2}}{2\sigma^{2}}\right), \quad y \geq 0,$$

where $E[Y] = \mu = \frac{\sigma\sqrt{2}}{\sqrt{\pi}}$.
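As a numerical sanity check on the density and its mean, the sketch below evaluates the PDF in the $\sigma$ parametrization and integrates it with a composite trapezoidal rule. The helper names, the value of `sigma`, the integration cutoff of 12σ, and the step count are assumptions made for illustration.

```python
import math


def half_normal_pdf(y, sigma):
    """Half-normal density in the sigma parametrization; zero for y < 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y < 0:
        return 0.0
    coeff = math.sqrt(2.0) / (sigma * math.sqrt(math.pi))
    return coeff * math.exp(-(y * y) / (2.0 * sigma * sigma))


def trapezoid(f, a, b, steps):
    """Composite trapezoidal rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


sigma = 1.5
upper = 12.0 * sigma  # assumed cutoff; the tail beyond it is negligible

total_mass = trapezoid(lambda y: half_normal_pdf(y, sigma), 0.0, upper, 20_000)
mean = trapezoid(lambda y: y * half_normal_pdf(y, sigma), 0.0, upper, 20_000)

print(total_mass)                             # close to 1
print(mean, sigma * math.sqrt(2.0 / math.pi))  # the two values should agree
```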

Alternatively, using a scaled precision (inverse of the variance) parametrization (to avoid issues if $\sigma$ is near zero), obtained by setting $\theta = \frac{\sqrt{\pi}}{\sigma\sqrt{2}}$, the probability density function is given by

$$f_{Y}(y;\theta) = \frac{2\theta}{\pi} \exp\left(-\frac{y^{2}\theta^{2}}{\pi}\right), \quad y \geq 0,$$

where $E[Y] = \mu = \frac{1}{\theta}$.
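The two parametrizations describe the same density. The sketch below converts σ to θ and checks that both forms give the same values and that 1/θ reproduces the mean; the function names and the test points are illustrative.

```python
import math


def theta_from_sigma(sigma):
    """theta = sqrt(pi) / (sigma * sqrt(2))."""
    return math.sqrt(math.pi) / (sigma * math.sqrt(2.0))


def pdf_sigma(y, sigma):
    """Half-normal density in the sigma parametrization, for y >= 0."""
    coeff = math.sqrt(2.0) / (sigma * math.sqrt(math.pi))
    return coeff * math.exp(-(y * y) / (2.0 * sigma * sigma))


def pdf_theta(y, theta):
    """Half-normal density in the theta parametrization, for y >= 0."""
    return (2.0 * theta / math.pi) * math.exp(-(y * y) * (theta * theta) / math.pi)


sigma = 0.7
theta = theta_from_sigma(sigma)

for y in (0.0, 0.3, 1.0, 2.5):
    print(y, pdf_sigma(y, sigma), pdf_theta(y, theta))

# In the theta parametrization the mean is simply 1 / theta.
print(1.0 / theta, sigma * math.sqrt(2.0 / math.pi))
```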

The cumulative distribution function (CDF) is given by

$$F_{Y}(y;\sigma) = \int_{0}^{y} \frac{1}{\sigma}\sqrt{\frac{2}{\pi}}\, \exp\left(-\frac{x^{2}}{2\sigma^{2}}\right) dx.$$

Using the change of variables $z = x/(\sqrt{2}\sigma)$, the CDF can be written as

$$F_{Y}(y;\sigma) = \frac{2}{\sqrt{\pi}} \int_{0}^{y/(\sqrt{2}\sigma)} \exp\left(-z^{2}\right) dz = \operatorname{erf}\left(\frac{y}{\sqrt{2}\sigma}\right),$$

where erf is the error function, a standard function in many mathematical software packages.
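In Python, for example, the error function is available as `math.erf`, so the CDF is a one-liner. The sketch below also compares it with the folded-normal identity P(|X| ≤ y) = 2Φ(y/σ) − 1, computed with `statistics.NormalDist`; the function name `half_normal_cdf` and the test values are illustrative.

```python
import math
from statistics import NormalDist


def half_normal_cdf(y, sigma):
    """CDF of the half-normal distribution: erf(y / (sqrt(2) * sigma)) for y > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y <= 0:
        return 0.0
    return math.erf(y / (math.sqrt(2.0) * sigma))


sigma = 2.0
normal = NormalDist(mu=0.0, sigma=sigma)

for y in (0.5, 1.0, 2.0, 4.0):
    folded = 2.0 * normal.cdf(y) - 1.0  # P(-y <= X <= y) for X ~ N(0, sigma^2)
    print(y, half_normal_cdf(y, sigma), folded)
```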

The quantile function (or inverse CDF) is written

$$Q(F;\sigma) = \sigma\sqrt{2}\,\operatorname{erf}^{-1}(F),$$

where $0 \leq F \leq 1$ and $\operatorname{erf}^{-1}$ is the inverse error function.
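The Python standard library has no inverse error function, so the sketch below substitutes the equivalent identity erf⁻¹(F) = Φ⁻¹((1 + F)/2)/√2, where Φ⁻¹ is the standard normal quantile provided by `statistics.NormalDist.inv_cdf`. This substitution, the function name `half_normal_quantile`, and the handling of F = 1 are choices made for this example rather than part of the formula above.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mu = 0 and sigma = 1


def half_normal_quantile(p, sigma):
    """Inverse CDF of the half-normal distribution.

    Uses sigma * sqrt(2) * erfinv(p) == sigma * Phi^{-1}((1 + p) / 2).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 1.0:
        return math.inf  # inv_cdf is only defined on the open interval (0, 1)
    return sigma * _STD_NORMAL.inv_cdf((1.0 + p) / 2.0)


median = half_normal_quantile(0.5, 1.0)
print(median)  # approximately 0.674490 for sigma = 1

# Round trip: applying the CDF to the quantile recovers the probability.
q = half_normal_quantile(0.9, 2.0)
print(math.erf(q / (math.sqrt(2.0) * 2.0)))
```

Feeding uniform random numbers on [0, 1) through this function gives an inverse-transform sampler for the half-normal distribution.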

The expectation is then given by

$$E[Y] = \sigma\sqrt{2/\pi}.$$

The variance is given by

$$\operatorname{var}(Y) = \sigma^{2}\left(1 - \frac{2}{\pi}\right).$$

Since this is proportional to the variance $\sigma^{2}$ of $X$, $\sigma$ can be seen as a scale parameter of the new distribution.
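The variance follows in two steps from the definition $Y = |X|$, because squaring removes the absolute value. A short derivation, using only the mean stated above:

```latex
\begin{align*}
E[Y^{2}] &= E[|X|^{2}] = E[X^{2}] = \sigma^{2}, \\
\operatorname{var}(Y) &= E[Y^{2}] - \left(E[Y]\right)^{2}
  = \sigma^{2} - \frac{2\sigma^{2}}{\pi}
  = \sigma^{2}\left(1 - \frac{2}{\pi}\right).
\end{align*}
```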

The differential entropy of the half-normal distribution is exactly one bit less than the differential entropy of a zero-mean normal distribution with the same second moment about 0. This can be understood intuitively: the magnitude operator removes one bit of information when the probability distribution at its input is even. Equivalently, since a half-normal random variable is never negative, the one bit it would take to record whether the underlying normal random variable is positive (say, a 1) or negative (say, a 0) is no longer necessary. Thus,

$$h(Y) = \frac{1}{2}\log_{2}\left(\frac{\pi e\sigma^{2}}{2}\right) = \frac{1}{2}\log_{2}\left(2\pi e\sigma^{2}\right) - 1.$$
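The one-bit gap is easy to confirm numerically. The sketch below evaluates both closed-form entropies in bits for a few values of σ; the function names are illustrative.

```python
import math


def normal_entropy_bits(sigma):
    """Differential entropy of N(0, sigma^2) in bits."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)


def half_normal_entropy_bits(sigma):
    """Differential entropy of the half-normal distribution in bits."""
    return 0.5 * math.log2(math.pi * math.e * sigma ** 2 / 2.0)


for sigma in (0.1, 1.0, 7.5):
    gap = normal_entropy_bits(sigma) - half_normal_entropy_bits(sigma)
    print(sigma, gap)  # the gap is 1 bit regardless of sigma
```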

Applications

The half-normal distribution is commonly used as a prior probability distribution for variance parameters in Bayesian inference applications.[1][2]

Parameter estimation

Given numbers $\{x_{i}\}_{i=1}^{n}$ drawn from a half-normal distribution, the unknown parameter $\sigma$ of that distribution can be estimated by the method of maximum likelihood, giving

$$\hat{\sigma} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_{i}^{2}}.$$

To leading order in $1/n$, the bias of this estimator is

$$b \equiv \operatorname{E}\left[\hat{\sigma}_{\mathrm{mle}} - \sigma\right] \approx -\frac{\sigma}{4n},$$

which yields the bias-corrected maximum likelihood estimator

$$\hat{\sigma}^{*}_{\text{mle}} = \hat{\sigma}_{\text{mle}} - \hat{b}.$$
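The estimator and its correction can be sketched in a few lines of Python. The text does not spell out how $\hat{b}$ is computed; the code below assumes the plug-in estimate obtained by substituting the maximum likelihood estimate for σ in the bias formula. The function names, the seed, and the simulated data are illustrative.

```python
import math
import random


def sigma_mle(xs):
    """Maximum likelihood estimate of sigma: root mean square of the data."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    return math.sqrt(sum(x * x for x in xs) / n)


def sigma_mle_bias_corrected(xs):
    """Bias-corrected estimate, sigma_hat - b_hat.

    Assumption: b_hat is the plug-in value -sigma_hat / (4 n).
    """
    n = len(xs)
    s = sigma_mle(xs)
    b_hat = -s / (4.0 * n)
    return s - b_hat


rng = random.Random(42)  # arbitrary seed for reproducibility
true_sigma = 3.0
data = [abs(rng.gauss(0.0, true_sigma)) for _ in range(50_000)]

print(sigma_mle(data))
print(sigma_mle_bias_corrected(data))
```

The correction is of relative size 1/(4n), so it matters mainly for small samples.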

References

  1. Gelman, A. (2006), "Prior distributions for variance parameters in hierarchical models", Bayesian Analysis, 1 (3): 515–534, doi:10.1214/06-ba117a
  2. Röver, C.; Bender, R.; Dias, S.; Schmid, C.H.; Schmidli, H.; Sturtz, S.; Weber, S.; Friede, T. (2021), "On weakly informative prior distributions for the heterogeneity parameter in Bayesian random-effects meta-analysis", Research Synthesis Methods, 12 (4): 448–474, arXiv:2007.08352, doi:10.1002/jrsm.1475, PMID 33486828, S2CID 220546288

External links

Half-normal distribution at MathWorld (note that MathWorld uses the parameter $\theta = \frac{1}{\sigma}\sqrt{\pi/2}$)