Article · Wikipedia archive · Last revised Jun 11, 2026

Co-occurrence

In linguistics, co-occurrence or cooccurrence is an above-chance frequency of ordered occurrence of two adjacent terms in a text corpus. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or an idiomatic expression. Corpus linguistics and its statistical analyses can reveal patterns of co-occurrences within a language and enable the working out of typical collocations for its lexical items.

Last revised
Jun 11, 2026
Read time
≈ 1 min
Length
220 w
Citations
1
Source

In linguistics, co-occurrence or cooccurrence (in older texts often shown with diacritic as coöccurrence) is an above-chance frequency of ordered occurrence of two adjacent terms in a text corpus. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or an idiomatic expression. Corpus linguistics and its statistical analyses can reveal (regularity of) patterns of co-occurrences within a language and enable the working out of typical collocations for its lexical items.

A co-occurrence restriction is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language.1

Co-occurrence can be seen an extension of word counting in higher dimensions. Co-occurrence can be quantitatively described using measures like a massive correlation or mutual information.

Co-occurrence information and knowledge of co-occurring words may be relevant in analysis of language for the purposes of large language models, part of the emerging field of artificial intelligence, and helpful in word games such as scrabble.

See also

See also

References

References

  1. Kroeger, Paul (2005). Analyzing Grammar: An Introduction. Cambridge: Cambridge University Press. p. 20. ISBN 978-0-521-01653-7.
External links