Overview
jellyfish is a library for approximate & phonetic matching of strings.
Source: https://github.com/jamesturk/jellyfish
Documentation: https://jamesturk.github.io/jellyfish/
Issues: https://github.com/jamesturk/jellyfish/issues
Included Algorithms
String comparison:
- Levenshtein Distance
- Damerau-Levenshtein Distance
- Jaro Distance
- Jaro-Winkler Distance
- Match Rating Approach Comparison
- Hamming Distance
Phonetic encoding:
- American Soundex
- Metaphone
- NYSIIS (New York State Identification and Intelligence System)
- Match Rating Codex
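To make the first and last of the string comparison measures concrete, here is a minimal pure-Python sketch of the textbook Levenshtein and Hamming distances. The function names `levenshtein` and `hamming` are illustrative only and are not jellyfish's API. Jellyfish's own implementations may differ in detail (for example, in how strings of unequal length are treated by the Hamming distance).

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially the empty prefix) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute (or keep) ca
                )
            )
        previous = current
    return previous[-1]


def hamming(a: str, b: str) -> int:
    """Textbook Hamming distance: the number of positions at which two
    equal-length strings differ. Rejecting unequal lengths is a choice made
    for this sketch, not a statement about jellyfish's behavior."""
    if len(a) != len(b):
        raise ValueError("textbook Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))


if __name__ == "__main__":
    print(levenshtein("jellyfish", "smellyfish"))
    print(hamming("karolin", "kathrin"))
```

The Levenshtein sketch keeps only two rows of the dynamic-programming table at a time, so memory use grows with the length of the second string rather than with the product of both lengths.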
Example Usage
>>> import jellyfish
>>> jellyfish.levenshtein_distance(u'jellyfish', u'smellyfish')
2
>>> jellyfish.jaro_distance(u'jellyfish', u'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance(u'jellyfish', u'jellyfihs')
1
>>> jellyfish.metaphone(u'Jellyfish')
'JLFX'
>>> jellyfish.soundex(u'Jellyfish')
'J412'
>>> jellyfish.nysiis(u'Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex(u'Jellyfish')
'JLLFSH'
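To show where a phonetic code such as 'J412' comes from, the following is a rough pure-Python sketch of the commonly described American Soundex rules. The `soundex_sketch` name and its handling of edge cases (empty input, non-letters) are assumptions made for illustration, and jellyfish's own implementation may behave differently on unusual input.

```python
# Digit groups used by American Soundex; vowels, Y, H, and W get no digit.
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex_sketch(name: str) -> str:
    """Illustrative American Soundex: first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""  # assumption for this sketch: empty input gives an empty code
    result = letters[0]
    last = _CODES.get(letters[0], "")  # the first letter's code still suppresses repeats
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(c, "")
        if code and code != last:
            result += code
        last = code  # a vowel resets this, so a repeated code after it counts again
    return (result + "000")[:4]  # pad with zeros, then truncate to four characters


if __name__ == "__main__":
    print(soundex_sketch("Jellyfish"))
```

For 'Jellyfish' the sketch keeps J, codes the doubled L once as 4, then F as 1 and S as 2, and skips the trailing H, which reproduces the 'J412' shown above.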