If one of the characters had a value of 128 or above, this would be treated as a signed char and would result in an array lookup with a negative index. The somewhat contrived test case given here -- comparing a space with a non-breaking space -- reproduces the segmentation fault prior to the fix. This also makes a Clang warning go away. Thanks, compiler! :-)
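The signed-char behaviour described above can be illustrated with a small Python sketch. It is only a model of the arithmetic, not the library's C code: the helper name `as_signed_char` is hypothetical, and it assumes an 8-bit two's-complement `char` and a Latin-1 encoded non-breaking space (byte 0xA0), neither of which is stated in the commit message.

```python
def as_signed_char(byte_value):
    """Interpret an 8-bit value as a signed, two's-complement char would be."""
    return int.from_bytes(bytes([byte_value]), 'little', signed=True)


# Assumption: both characters are encoded as single Latin-1 bytes.
space = ' '.encode('latin-1')[0]       # 0x20
nbsp = '\u00a0'.encode('latin-1')[0]   # 0xA0, i.e. 160 when read as unsigned

print(as_signed_char(space))  # 32: a valid, non-negative table index
print(as_signed_char(nbsp))   # -96: a negative index if used for an array lookup
```

Any byte of 128 or above maps to a negative number under this interpretation, which is why an ordinary space is harmless while a non-breaking space is not.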
* cjellyfish
* jellyfish
* .coveragerc
* .gitignore
* .travis.yml
* LICENSE
* MANIFEST.in
* README.rst
* porter-test.csv
* setup.py
* tox.ini
README.rst
=========
jellyfish
=========

.. image:: https://travis-ci.org/sunlightlabs/jellyfish.svg?branch=master
    :target: https://travis-ci.org/sunlightlabs/jellyfish

.. image:: https://coveralls.io/repos/sunlightlabs/jellyfish/badge.png?branch=master
    :target: https://coveralls.io/r/sunlightlabs/jellyfish

.. image:: https://pypip.in/version/jellyfish/badge.svg
    :target: https://pypi.python.org/pypi/jellyfish

.. image:: https://pypip.in/format/jellyfish/badge.svg
    :target: https://pypi.python.org/pypi/jellyfish

Jellyfish is a Python library for doing approximate and phonetic matching of strings.

jellyfish is a project of Sunlight Labs (c) 2014.
All code is released under a BSD-style license, see LICENSE for details.

Written by James Turk <jturk@sunlightfoundation.com> and Michael Stephens.

See https://github.com/sunlightlabs/jellyfish/graphs/contributors for contributors.

Source is available at http://github.com/sunlightlabs/jellyfish.

Included Algorithms
===================

String comparison:

* Levenshtein Distance
* Damerau-Levenshtein Distance
* Jaro Distance
* Jaro-Winkler Distance
* Match Rating Approach Comparison
* Hamming Distance

Phonetic encoding:

* American Soundex
* Metaphone
* NYSIIS (New York State Identification and Intelligence System)
* Match Rating Codex

Example Usage
=============

>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1
>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'
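
To show what the Levenshtein figure in the usage example measures, here is a minimal pure-Python sketch of the textbook dynamic-programming edit distance. It is not jellyfish's own implementation, and the name `levenshtein_sketch` is hypothetical, chosen only for this illustration.

```python
def levenshtein_sketch(a, b):
    """Count the fewest single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or match
            ))
        previous = current
    return previous[-1]


print(levenshtein_sketch('jellyfish', 'smellyfish'))  # 2
print(levenshtein_sketch('jellyfish', 'jellyfihs'))   # 2
```

The first result agrees with the `levenshtein_distance` example. The second shows why Damerau-Levenshtein is listed separately: plain Levenshtein counts the swapped `sh`/`hs` as two substitutions, whereas the Damerau-Levenshtein example reports 1 because it treats an adjacent transposition as a single edit.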