Article · Wikipedia archive · Last revised May 28, 2026

Jsoup

Source
jsoup Java HTML Parser
Developer: Jonathan Hedley
Stable release: 1.22.2 / April 20, 2026 [1]
Written in: Java
Operating system: Cross-platform
Platform: Java (JVM)
Type: HTML parser
License: MIT license
Website: jsoup.org
Repository

jsoup is an open-source Java library designed to parse, extract, and manipulate data stored in HTML documents.
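
The three operations named above (parse, extract, manipulate) can be illustrated with a minimal sketch. It assumes the jsoup library is on the classpath. The class name `JsoupSketch`, its helper methods, and the sample markup are hypothetical names chosen for this illustration; only the `org.jsoup` calls belong to the library.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Illustrative sketch only: JsoupSketch and its methods are hypothetical helpers.
public class JsoupSketch {

    // Parse: build a Document from a string of HTML.
    // The parser repairs unclosed tags rather than rejecting the input.
    public static Document parse(String html) {
        return Jsoup.parse(html);
    }

    // Extract: collect the href attribute of every link using a CSS selector.
    public static List<String> linkTargets(Document doc) {
        List<String> hrefs = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    // Manipulate: set rel="nofollow" on every link, modifying the document in place.
    public static void markNoFollow(Document doc) {
        doc.select("a[href]").attr("rel", "nofollow");
    }

    public static void main(String[] args) {
        // Deliberately malformed input: neither <p> element is closed.
        String tagSoup = "<p>First<p>Second <a href='/docs'>docs</a>";
        Document doc = parse(tagSoup);

        System.out.println("Paragraphs: " + doc.select("p").size());
        System.out.println("Links: " + linkTargets(doc));

        markNoFollow(doc);
        System.out.println("Body after edit: " + doc.body().html());
    }
}
```

The selector syntax passed to `select` follows CSS conventions, so `a[href]` matches only anchor elements that carry an `href` attribute.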

History

jsoup was created in 2009 by Jonathan Hedley. It is distributed under the MIT License, a permissive free software license similar to the Creative Commons attribution license.

Hedley's avowed intention in writing jsoup was "to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup."

Projects powered by jsoup

jsoup is used in a number of current projects,[2] including the OpenRefine data-wrangling tool (formerly Google Refine).

References

  1. "jsoup Java HTML Parser release 1.22.2". Retrieved 2026-04-20.
  2. "Jsoup". MVNRepository / F. Rodriguez. 2015-03-08.