Package org.apache.commons.text.similarity

Provides algorithms for string similarity.

Package org.apache.commons.text.similarity Description

The algorithms that implement the EditDistance interface follow the same simple principle: the more similar (closer) two strings are, the lower the distance between them. For example, the words house and hose are closer to each other than house and trousers.
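The principle can be illustrated with a minimal, self-contained sketch of a Levenshtein-style edit distance. This is not the library's implementation: the class name EditDistanceSketch and the method levenshtein are hypothetical names chosen for this illustration, and the sketch assumes unit cost for each insertion, deletion, and substitution.

```java
// Hypothetical illustration; not part of org.apache.commons.text.similarity.
final class EditDistanceSketch {

    // Counts the minimum number of single-character insertions, deletions,
    // and substitutions needed to turn `left` into `right`.
    static int levenshtein(CharSequence left, CharSequence right) {
        int[] previous = new int[right.length() + 1];
        int[] current = new int[right.length() + 1];

        // Distance from the empty prefix of `left` to each prefix of `right`.
        for (int j = 0; j <= right.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= left.length(); i++) {
            current[0] = i; // deleting i characters reaches the empty prefix
            for (int j = 1; j <= right.length(); j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[right.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("house", "hose"));     // 1
        System.out.println(levenshtein("house", "trousers")); // 4
    }
}
```

Under these assumptions, house and hose differ by one deletion, while house and trousers need four edits, so the first pair is the closer one.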

The following algorithms are available at the moment:

The Cosine Distance utilises a regular expression tokenizer (\w+). The Levenshtein Distance's behavior can also be changed to take a threshold into consideration, that is, a maximum distance beyond which the strings are no longer compared.
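As a rough sketch of how a \w+ tokenizer can feed a cosine distance, the example below splits each input into word tokens, counts them, and returns one minus the cosine similarity of the two count vectors. It is an illustration under stated assumptions, not the library's code: CosineDistanceSketch, termCounts, and cosineDistance are hypothetical names, and returning 1.0 when either input has no tokens is a choice made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of org.apache.commons.text.similarity.
final class CosineDistanceSketch {

    // The same token pattern the package description mentions.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Tokenizes the text with \w+ and counts how often each token occurs.
    static Map<String, Integer> termCounts(CharSequence text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> vector) {
        double sum = 0.0;
        for (int count : vector.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine distance = 1 - cosine similarity of the two term-count vectors.
    static double cosineDistance(CharSequence left, CharSequence right) {
        Map<String, Integer> a = termCounts(left);
        Map<String, Integer> b = termCounts(right);

        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += (double) entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }

        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 1.0; // assumption of this sketch: no tokens means maximally distant
        }
        return 1.0 - dot / (normA * normB);
    }

    public static void main(String[] args) {
        System.out.println(cosineDistance("the quick fox", "the quick dog"));
        System.out.println(cosineDistance("apple", "orange"));
    }
}
```

Because tokens are whole words, punctuation and whitespace are ignored, and two texts with no words in common end up at the maximum distance of 1.0.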

Since:
1.0
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.