Article · Wikipedia archive · Last revised May 29, 2026

Indian buffet process

In the mathematical theory of probability, the Indian buffet process (IBP) is a stochastic process defining a probability distribution over sparse binary matrices with a finite number of rows and an infinite number of columns. This distribution is suitable to use as a prior for models with potentially infinite number of features. The form of the prior ensures that only a finite number of features will be present in any finite set of observations but more features may appear as more data points are observed.

Last revised
May 29, 2026
Read time
≈ 2 min
Length
491 w
Citations
Source

In the mathematical theory of probability, the Indian buffet process (IBP) is a stochastic process defining a probability distribution over sparse binary matrices with a finite number of rows and an infinite number of columns. This distribution is suitable to use as a prior for models with potentially infinite number of features. The form of the prior ensures that only a finite number of features will be present in any finite set of observations but more features may appear as more data points are observed.

Indian buffet process prior

Let Z {\displaystyle Z} be an N × K {\displaystyle N\times K} binary matrix indicating the presence or absence of a latent feature. The IBP places the following prior on Z {\displaystyle Z} :

p ( Z ) = α K + i = 1 N K 1 ( i ) ! exp { α H N } k = 1 K + ( N m k ) ! ( m k 1 ) ! N ! {\displaystyle p(Z)={\frac {\alpha ^{K^{+}}}{\prod _{i=1}^{N}K_{1}^{(i)}!}}\exp\{-\alpha H_{N}\}\prod _{k=1}^{K^{+}}{\frac {(N-m_{k})!(m_{k}-1)!}{N!}}}

where K + {\displaystyle {K^{+}}} is the number of non-zero columns in Z {\displaystyle Z} , m k {\displaystyle m_{k}} is the number of ones in column k {\displaystyle k} of Z {\displaystyle Z} , H N {\displaystyle H_{N}} is the N {\displaystyle N} -th harmonic number, and K 1 ( i ) {\displaystyle K_{1}^{(i)}} is the number of new dishes sampled by the i {\displaystyle i} -th customer. The parameter α {\displaystyle \alpha } controls the expected number of features present in each observation.

In the Indian buffet process, the rows of Z {\displaystyle Z} correspond to customers and the columns correspond to dishes in an infinitely long buffet. The first customer takes the first P o i s s o n ( α ) {\displaystyle \mathrm {Poisson} (\alpha )} dishes. The i {\displaystyle i} -th customer then takes dishes that have been previously sampled with probability m k / i {\displaystyle m_{k}/i} , where m k {\displaystyle m_{k}} is the number of people who have already sampled dish k {\displaystyle k} . He also takes P o i s s o n ( α / i ) {\displaystyle \mathrm {Poisson} (\alpha /i)} new dishes. Therefore, z n k {\displaystyle z_{nk}} is one if customer n {\displaystyle n} tried the k {\displaystyle k} -th dish and zero otherwise.

This process is infinitely exchangeable for an equivalence class of binary matrices defined by a left-ordered many-to-one function. lof ( Z ) {\displaystyle \operatorname {lof} (Z)} is obtained by ordering the columns of the binary matrix Z {\displaystyle Z} from left to right by the magnitude of the binary number expressed by that column, taking the first row as the most significant bit.

See also

See also

References

References