=========
jellyfish
=========

.. image:: https://travis-ci.org/jamesturk/jellyfish.svg?branch=master
    :target: https://travis-ci.org/jamesturk/jellyfish

.. image:: https://coveralls.io/repos/jamesturk/jellyfish/badge.png?branch=master
    :target: https://coveralls.io/r/jamesturk/jellyfish

.. image:: https://img.shields.io/pypi/v/jellyfish.svg
    :target: https://pypi.python.org/pypi/jellyfish

.. image:: https://readthedocs.org/projects/jellyfish/badge/?version=latest
    :target: https://readthedocs.org/projects/jellyfish/?badge=latest
    :alt: Documentation Status

.. image:: https://ci.appveyor.com/api/projects/status/9xeyl1f5sd5pl40h?svg=true
    :target: https://ci.appveyor.com/project/jamesturk/jellyfish/

Jellyfish is a Python library for approximate and phonetic matching of strings.

Written by James Turk <james.p.turk@gmail.com> and Michael Stephens.

See https://github.com/jamesturk/jellyfish/graphs/contributors for contributors.

Source is available at http://github.com/jamesturk/jellyfish.

Included Algorithms
===================

String comparison:

* Levenshtein Distance
* Damerau-Levenshtein Distance
* Jaro Distance
* Jaro-Winkler Distance
* Match Rating Approach Comparison
* Hamming Distance
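
To give a feel for what an edit distance measures, here is a minimal pure-Python
sketch of Levenshtein distance using the classic row-by-row dynamic programming
approach. The function name ``levenshtein_sketch`` is made up for illustration;
this is not Jellyfish's implementation, which may differ in details such as
type checking and speed.

.. code-block:: python

    def levenshtein_sketch(a, b):
        """Count the insertions, deletions, and substitutions needed to turn a into b."""
        # previous[j] is the distance between the prefix of a seen so far
        # (initially empty) and the first j characters of b.
        previous = list(range(len(b) + 1))
        for i, char_a in enumerate(a, start=1):
            current = [i]  # distance from a[:i] to the empty string
            for j, char_b in enumerate(b, start=1):
                cost = 0 if char_a == char_b else 1
                current.append(min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep a match
                ))
            previous = current
        return previous[-1]

    print(levenshtein_sketch(u'jellyfish', u'smellyfish'))

Only two rows of the table are kept at a time, so memory use grows with the
length of the second string rather than with the product of both lengths.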

Phonetic encoding:

* American Soundex
* Metaphone
* NYSIIS (New York State Identification and Intelligence System)
* Match Rating Codex
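
A phonetic encoding maps similar-sounding words to the same short code. As an
illustration, the following is a simplified sketch of American Soundex under
the commonly described rules: keep the first letter, map consonants to digits,
collapse adjacent repeats, and pad to four characters. The name
``soundex_sketch`` is hypothetical, and Jellyfish's own ``soundex`` may handle
edge cases (such as non-ASCII input) differently.

.. code-block:: python

    def soundex_sketch(name):
        """Return a four-character Soundex-style code for name."""
        # Build the consonant-to-digit table; letters not listed carry no digit.
        codes = {}
        groups = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                  ("l", "4"), ("mn", "5"), ("r", "6"))
        for letters, digit in groups:
            for letter in letters:
                codes[letter] = digit

        letters = [c for c in name.lower() if c.isalpha()]
        if not letters:
            return ""

        result = letters[0].upper()
        last = codes.get(letters[0], "")  # the first letter's digit still blocks repeats
        for letter in letters[1:]:
            if letter in "hw":
                continue  # h and w do not separate repeated digits
            digit = codes.get(letter, "")
            if digit and digit != last:
                result += digit
            last = digit  # a vowel resets this to "", allowing the digit again
        return (result + "000")[:4]

    print(soundex_sketch(u'Jellyfish'))

With this sketch, names such as ``Robert`` and ``Rupert`` share a code, which
is the property that makes phonetic encodings useful for matching misspelled
names.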

Example Usage
=============

>>> import jellyfish
>>> jellyfish.levenshtein_distance(u'jellyfish', u'smellyfish')
2
>>> jellyfish.jaro_distance(u'jellyfish', u'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance(u'jellyfish', u'jellyfihs')
1

>>> jellyfish.metaphone(u'Jellyfish')
'JLFX'
>>> jellyfish.soundex(u'Jellyfish')
'J412'
>>> jellyfish.nysiis(u'Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex(u'Jellyfish')
'JLLFSH'

Running Tests
=============

If you are interested in contributing to Jellyfish, you may want to
run tests locally. Jellyfish uses tox_ to run tests, which you can
set up and run as follows::

  pip install tox
  # cd jellyfish/
  tox

.. _tox: https://tox.readthedocs.io/en/latest/