Error exponents in hypothesis testing

In statistical hypothesis testing, the error exponent of a hypothesis testing procedure is the rate at which the probabilities of type I and type II errors decay exponentially with the size of the sample used in the test. For example, if the probability of error $P_{\mathrm {error} }$ of a test decays as $e^{-n\beta }$, where $n$ is the sample size, then the error exponent is $\beta$.

Formally, the error exponent of a test is defined as the limit, as the sample size grows, of the negative logarithm of the error probability divided by the sample size: $\lim _{n\to \infty }{\frac {-\log P_{\text{error}}}{n}}$. Error exponents for different hypothesis tests are computed using Sanov's theorem and other results from large deviations theory. Several kinds of tests are used to show that an error exponent is achievable, including likelihood-ratio tests (which are known to be optimal in certain circumstances) and tests based on the empirical distribution¹. Error exponents are sometimes referred to as error rates, due to the connection between hypothesis testing and information theory².
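The definition can be illustrated numerically. The following is a minimal sketch under an assumed setup that does not come from the sources cited here: the observations are $N(0,1)$ under $H_{0}$ and $N(1,1)$ under $H_{1}$, and the test rejects $H_{0}$ when the sample mean exceeds $1/2$. In that case the error probability is an exact Gaussian tail, and the ratio $-\log P_{\text{error}}/n$ approaches $1/8$ as $n$ grows. The function names are hypothetical.

```python
import math

def type1_error(n):
    """Exact P(error | H0) for the test 'reject H0 if the sample mean exceeds 1/2'.

    Assumed setup (illustrative only): H0 is N(0, 1) and H1 is N(1, 1).
    Under H0 the sample mean is N(0, 1/n), so the error is a Gaussian tail.
    """
    x = 0.5 * math.sqrt(n)
    # Gaussian tail Q(x) written with erfc, which stays accurate far into the tail.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def empirical_exponent(n):
    """Finite-n value of -log(P_error) / n."""
    return -math.log(type1_error(n)) / n

# Sample sizes are kept below the point where the tail underflows in double precision.
sample_sizes = [10, 100, 1000, 4000]
exponents = [empirical_exponent(n) for n in sample_sizes]
for n, e in zip(sample_sizes, exponents):
    print(f"n = {n:5d}   -log(P_error)/n = {e:.4f}")
```

The printed ratios decrease towards $1/8$, with the gap shrinking roughly like $\log (n)/n$; the limit itself is only reached asymptotically.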

Error exponents in binary hypothesis testing

Consider a binary hypothesis testing problem in which observations are modeled as independent and identically distributed random variables under each hypothesis. Let $Y_{1},Y_{2},\ldots ,Y_{n}$ denote the observations. Let $f_{0}$ denote the probability density function of each observation $Y_{i}$ under the null hypothesis $H_{0}$ and let $f_{1}$ denote the probability density function of each observation $Y_{i}$ under the alternate hypothesis $H_{1}$.

In this case there are two possible error events. An error of type I, also called a false positive, occurs when the null hypothesis is true and is wrongly rejected. An error of type II, also called a false negative, occurs when the alternate hypothesis is true and the null hypothesis is not rejected. The probability of type I error is denoted $P(\mathrm {error} \mid H_{0})$ and the probability of type II error is denoted $P(\mathrm {error} \mid H_{1})$. In some fields, the type I error probability is denoted by $\alpha _{n}$ and the type II error probability by $\beta _{n}$.

Optimal error exponent for Neyman–Pearson testing

In the Neyman–Pearson³ version of binary hypothesis testing, one is interested in minimizing the probability of type II error $P({\text{error}}\mid H_{1})$ subject to the constraint that the probability of type I error $P({\text{error}}\mid H_{0})$ is less than or equal to a pre-specified level $\alpha$. In this setting, the optimal testing procedure is a likelihood-ratio test.⁴ Furthermore, the optimal test guarantees that the type II error probability decays exponentially in the sample size $n$ according to $\lim _{n\to \infty }{\frac {-\log P(\mathrm {error} \mid H_{1})}{n}}=D(f_{0}\parallel f_{1})$.⁵ The error exponent $D(f_{0}\parallel f_{1})$ is the Kullback–Leibler divergence between the probability distributions of the observations under the two hypotheses. This result is known as the Chernoff–Stein lemma, and the exponent is also referred to as the Chernoff–Stein exponent.
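As a rough numerical check of this limit, the sketch below assumes Bernoulli observations with success probability $0.5$ under $H_{0}$ and $0.8$ under $H_{1}$, and level $\alpha =0.05$; these values and all function names are illustrative assumptions. For this pair the likelihood ratio is increasing in the number of ones, so the test rejects $H_{0}$ when that count reaches a threshold. The sketch uses the non-randomized version of the test and computes binomial probabilities in log space to avoid underflow.

```python
import math

def kl_bernoulli(p, q):
    """Kullback-Leibler divergence D(Bern(p) || Bern(q)) in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def log_binom_pmf(k, n, p):
    """log P(K = k) for K ~ Binomial(n, p), computed via lgamma."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def type2_exponent(n, p0, p1, alpha):
    """-(1/n) log P(error | H1) for the threshold test on the count of ones.

    Rejects H0 when the count is >= t, where t is the smallest threshold whose
    type I error is at most alpha. Assumes p1 > p0 and 0 < alpha < 1.
    """
    tail = 0.0   # running value of P0(K >= k)
    t = n + 1    # threshold n + 1 means "never reject"
    for k in range(n, -1, -1):
        new_tail = tail + math.exp(log_binom_pmf(k, n, p0))
        if new_tail > alpha:
            break
        tail, t = new_tail, k
    # Type II error: H1 is true but the count stays below the threshold.
    log_beta = log_sum_exp([log_binom_pmf(k, n, p1) for k in range(t)])
    return -log_beta / n

p0, p1, alpha = 0.5, 0.8, 0.05
divergence = kl_bernoulli(p0, p1)
print(f"D(f0 || f1) = {divergence:.4f}")
for n in [200, 2000]:
    print(f"n = {n:5d}   -log P(error | H1)/n = {type2_exponent(n, p0, p1, alpha):.4f}")
```

The finite-sample ratios stay below $D(f_{0}\parallel f_{1})$ and approach it slowly, with a gap of order $1/{\sqrt {n}}$.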

Optimal error exponent for average error probability in Bayesian hypothesis testing

In the Bayesian version of binary hypothesis testing, one is interested in minimizing the average error probability over both hypotheses, assuming a prior probability for each hypothesis. Let $\pi _{0}$ denote the prior probability of hypothesis $H_{0}$. In this case the average error probability is given by $P_{\text{ave}}=\pi _{0}P({\text{error}}\mid H_{0})+(1-\pi _{0})P({\text{error}}\mid H_{1})$. In this setting a likelihood-ratio test is again optimal, and the optimal average error probability decays as $\lim _{n\to \infty }{\frac {-\log P_{\text{ave}}}{n}}=C(f_{0},f_{1})$, where $C(f_{0},f_{1})$ is the Chernoff information between the two distributions, defined as $C(f_{0},f_{1})=\max _{\lambda \in [0,1]}\left[-\log \int (f_{0}(x))^{\lambda }(f_{1}(x))^{(1-\lambda )}\,dx\right]$.⁵
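The maximization over $\lambda$ is one-dimensional and the objective is concave in $\lambda$, so it can be solved numerically. The following sketch assumes distributions on a finite alphabet, so that the integral becomes a sum, and uses a ternary search; the example distributions and function names are illustrative assumptions.

```python
import math

def neg_log_overlap(f0, f1, lam):
    """-log sum_x f0(x)^lam * f1(x)^(1 - lam) for two pmfs on the same finite alphabet."""
    return -math.log(sum(a ** lam * b ** (1.0 - lam) for a, b in zip(f0, f1)))

def chernoff_information(f0, f1, iterations=200):
    """Maximize neg_log_overlap over lam in [0, 1] by ternary search.

    The objective is concave in lam, so ternary search converges to the maximizer.
    Returns the pair (C, lam_star).
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if neg_log_overlap(f0, f1, m1) < neg_log_overlap(f0, f1, m2):
            lo = m1
        else:
            hi = m2
    lam_star = 0.5 * (lo + hi)
    return neg_log_overlap(f0, f1, lam_star), lam_star

# Illustrative pair of distributions on a two-letter alphabet.
f0 = [0.2, 0.8]
f1 = [0.8, 0.2]
C, lam_star = chernoff_information(f0, f1)
print(f"Chernoff information = {C:.6f} at lambda = {lam_star:.4f}")
```

For this symmetric pair the maximizer is $\lambda =1/2$ and the value is $-\log(2{\sqrt {0.2\cdot 0.8}})=-\log 0.8$, which gives a closed-form check on the search.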

Trade-off between type I and II error

A more explicit trade-off between the type I and type II errors is obtained when the type I error probability is constrained to decay exponentially and the type II error probability is minimized. If we require $P({\text{error}}\mid H_{0})<e^{-nr}$ for some $r<D(f_{1}\parallel f_{0})$, then the optimal type II error exponent is described by $\limsup _{n\to \infty }{\frac {1}{n}}\log P({\text{error}}\mid H_{1})=-H_{r}(f_{0}\parallel f_{1})$. Here $H_{r}(f_{0}\parallel f_{1})$ is the Hoeffding divergence⁶⁷² described by

$H_{r}(f_{0}\parallel f_{1})=\sup _{0<s\leq 1}{\frac {\Psi (s)-(1-s)r}{s}}$ (1)

where $\Psi (s)=-\log \int f_{0}(x)^{1-s}f_{1}(x)^{s}\,dx$.
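The trade-off curve can be traced numerically. The sketch below assumes distributions on a finite alphabet and takes $\Psi (s)$ to be the negative logarithm of $\sum _{x}f_{0}(x)^{1-s}f_{1}(x)^{s}$, the convention under which formula (1) tends to $D(f_{0}\parallel f_{1})$ as $r\to 0$. The optimization over $s$ is approximated on a uniform grid that excludes $s=0$; the example distributions, grid size, and function names are illustrative assumptions.

```python
import math

def psi(f0, f1, s):
    """Psi(s) = -log sum_x f0(x)^(1 - s) * f1(x)^s (assumed convention, see text)."""
    return -math.log(sum(a ** (1.0 - s) * b ** s for a, b in zip(f0, f1)))

def hoeffding_exponent(f0, f1, r, grid=10000):
    """Approximate the supremum over s in (0, 1] of (Psi(s) - (1 - s) * r) / s.

    Uses a uniform grid s = 1/grid, 2/grid, ..., 1, so the result is a lower
    bound on the supremum that tightens as the grid is refined.
    """
    best = -math.inf
    for i in range(1, grid + 1):
        s = i / grid
        best = max(best, (psi(f0, f1, s) - (1.0 - s) * r) / s)
    return best

# Illustrative pair of distributions; here D(f1 || f0) = 0.6 * log(4), about 0.83.
f0 = [0.2, 0.8]
f1 = [0.8, 0.2]
for r in [0.0, 0.1, 0.2, 0.4, 0.8]:
    print(f"r = {r:.1f}   type II exponent = {hoeffding_exponent(f0, f1, r):.4f}")
```

Two sanity checks follow from the formula: at $r=0$ the value approaches $D(f_{0}\parallel f_{1})$, recovering the Neyman–Pearson exponent, and when $r$ equals the Chernoff information the type II exponent equals the Chernoff information as well.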

Second-order analysis

The quantities considered above are sometimes described as first-order error exponents of hypothesis testing, meaning that they characterize the limit

$\lim _{n\to \infty }-{\frac {1}{n}}\log \mathbb {P} _{H_{1}}[{\text{error}}]$

However, it is also possible to analyze higher-order error exponents, for example the second-order error exponent. If the first-order error exponent is given by $I_{1}$, then the second-order error exponent is taken to be

$-\lim _{n\to \infty }{\frac {1}{\sqrt {n}}}\left(-\log \left(\mathbb {P} _{H_{1}}[{\text{error}}]\right)-nI_{1}\right)$
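Read as an expansion, and assuming the limit exists, this definition says that the negative log error probability has a leading term linear in $n$ and a correction of order ${\sqrt {n}}$. Writing $I_{2}$ for the second-order error exponent (a symbol introduced here only for illustration), the definition is equivalent to

$-\log \mathbb {P} _{H_{1}}[{\text{error}}]=nI_{1}-{\sqrt {n}}\,I_{2}+o({\sqrt {n}})$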

Second-order analysis has been carried out for several types of hypothesis testing problems: simple hypothesis testing⁸, one-sample universal hypothesis testing⁹, and two-sample universal problems¹⁰. The second-order error exponent is sometimes called the relative entropy variance⁸.

References

1. Hoeffding, Wassily (1965). "Asymptotically Optimal Tests for Multinomial Distributions". The Annals of Mathematical Statistics. 36 (2): 369–401. doi:10.1214/aoms/1177700150. ISSN 0003-4851. JSTOR 2238145.
2. Blahut, R. (1974). "Hypothesis testing and information theory". IEEE Transactions on Information Theory. 20 (4): 405–417. Bibcode:1974ITIT...20..405B. doi:10.1109/TIT.1974.1055254. ISSN 1557-9654.
3. Neyman, J.; Pearson, E. S. (1933). "On the problem of the most efficient tests of statistical hypotheses". Philosophical Transactions of the Royal Society of London A. 231 (694–706): 289–337. Bibcode:1933RSPTA.231..289N. doi:10.1098/rsta.1933.0009. JSTOR 91247.
4. Lehmann, E. L.; Romano, Joseph P. (2005). Testing Statistical Hypotheses (3 ed.). New York: Springer. ISBN 978-0-387-98864-1.
5. Cover, Thomas M.; Thomas, Joy A. (2006). Elements of Information Theory (2 ed.). New York: Wiley-Interscience.
6. Ogawa, Tomohiro; Hayashi, Masahito (2002). On Error Exponents in Quantum Hypothesis Testing. arXiv:quant-ph/0206151. Bibcode:2002quant.ph..6151O.
7. Hoeffding, Wassily (1994). Fisher, N. I.; Sen, P. K. (eds.). "On Probabilities of Large Deviations". The Collected Works of Wassily Hoeffding. Springer Series in Statistics. New York, NY: Springer. pp. 473–490. doi:10.1007/978-1-4612-0865-5_29. ISBN 978-1-4612-0865-5. Retrieved 2026-03-02.
8. Zhou, Lin; Tan, Vincent Y F; Motani, Mehul (22 January 2019). "Second-order asymptotically optimal statistical classification". Information and Inference: A Journal of the IMA. 9 (1). arXiv:1806.00739. doi:10.1093/imaiai/iay023. ISSN 2049-8764.
9. Harsha, K. V.; Ravi, Jithin; Koch, Tobias (November 2022). "Second-Order Asymptotics of Hoeffding-Like Hypothesis Tests". 2022 IEEE Information Theory Workshop (ITW). pp. 654–659. arXiv:2205.05631. Bibcode:2022itw..conf...35H. doi:10.1109/ITW54588.2022.9965931. hdl:10016/36612. ISBN 978-1-6654-8341-4.
10. Harsha, K. V.; Ravi, Jithin; Koch, Tobias (2026-01-14). Second-Order Asymptotics of Two-Sample Tests. arXiv:2601.09196.

