Article · Wikipedia archive · Last revised May 30, 2026

Overlap coefficient

The overlap coefficient,[note 1] or Szymkiewicz–Simpson coefficient,[3][4][5] is a similarity measure that quantifies the overlap between two finite sets. It is related to the Jaccard index and is defined as the size of the intersection divided by the size of the smaller of the two sets:

$$\operatorname{overlap}(A,B) = \frac{|A \cap B|}{\min(|A|,|B|)}$$

Note that $0 \leq \operatorname{overlap}(A,B) \leq 1$. If A is a subset of B, or B is a subset of A, then the intersection is the smaller set itself and the overlap coefficient is equal to 1.
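The definition can be illustrated with a minimal Python sketch. The function name `overlap_coefficient` and the example sets are chosen here for illustration only and do not come from any cited source. The formula divides by the size of the smaller set, so it is undefined when either set is empty; raising an error in that case is an assumption of this sketch, not part of the definition.

```python
def overlap_coefficient(a, b):
    """Return |a ∩ b| / min(|a|, |b|) for two finite, non-empty sets."""
    a, b = set(a), set(b)
    if not a or not b:
        # Assumption of this sketch: the coefficient is treated as
        # undefined (division by zero) when either set is empty.
        raise ValueError("overlap coefficient is undefined for an empty set")
    return len(a & b) / min(len(a), len(b))


a = {1, 2, 3}
b = {2, 3, 4, 5}

# Two shared elements divided by the size of the smaller set (3).
print(overlap_coefficient(a, b))

# {2, 3} is a subset of b, so the coefficient is 1.
print(overlap_coefficient({2, 3}, b))   # 1.0

# Disjoint sets share nothing, so the coefficient is 0.
print(overlap_coefficient(a, {7, 8}))   # 0.0
```

The three calls cover a partial overlap, the subset case, and the disjoint case, matching the bounds stated above.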

See also

Notes

  1. The use of the term "overlap" appears in the comment for formula #27 in Table 2 of McGill et al. (1979),[1] which references Sager & Lockemann (1976).[2]
References

  1. McGill, M.; Koll, M.; Noreault, T. (October 1979). An Evaluation of Factors Affecting Document Ranking by Information Retrieval Systems. Syracuse, NY: School of Information Studies, Syracuse University.
  2. Sager, W. K. H.; Lockemann, P. C. (1976). "Classification of Ranking Algorithms". International Forum on Information and Documentation. 1 (4): 12–25.
  3. Simpson, G. G. (January 1943). "Mammals and the Nature of Continents". American Journal of Science. 241 (1): 1–31.
  4. Simpson, G. G. (July 1947). "Holarctic Mammalian Faunas and the Continental Relationships During the Cenozoic". Bulletin of the Geological Society of America. 58 (7): 613–688.
  5. Simpson, G. G. (1960). "Notes on the Measurement of Faunal Resemblance". American Journal of Science. 258-A: 300–311.