# Overview

**jellyfish** is a library for approximate & phonetic matching of strings.

Source: [https://github.com/jamesturk/jellyfish](https://github.com/jamesturk/jellyfish)

Documentation: [https://jamesturk.github.io/jellyfish/](https://jamesturk.github.io/jellyfish/)

Issues: [https://github.com/jamesturk/jellyfish/issues](https://github.com/jamesturk/jellyfish/issues)

[![PyPI badge](https://badge.fury.io/py/jellyfish.svg)](https://badge.fury.io/py/jellyfish)
[![Test badge](https://github.com/jamesturk/jellyfish/workflows/Python%20package/badge.svg)](https://github.com/jamesturk/jellyfish/actions?query=workflow%3A%22Python+package)
[![Coveralls](https://coveralls.io/repos/jamesturk/jellyfish/badge.png?branch=master)](https://coveralls.io/r/jamesturk/jellyfish)
![Test Rust](https://github.com/jamesturk/rust-jellyfish/workflows/Test%20Rust/badge.svg)
## Included Algorithms
String comparison:
* Levenshtein Distance
* Damerau-Levenshtein Distance
* Jaro Distance
* Jaro-Winkler Distance
* Match Rating Approach Comparison
* Hamming Distance

Phonetic encoding:

* American Soundex
* Metaphone
* NYSIIS (New York State Identification and Intelligence System)
* Match Rating Codex
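
To give a sense of what a string comparison metric computes, here is a minimal pure-Python sketch of Levenshtein distance using the standard dynamic-programming recurrence. The name `levenshtein_sketch` is hypothetical and this is not the library's implementation; it is only meant to illustrate the idea.

``` python
def levenshtein_sketch(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn a into b (illustrative only)."""
    # previous[j] is the distance between the prefix of a processed
    # so far (before the current character) and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein_sketch('jellyfish', 'smellyfish'))
print(levenshtein_sketch('jellyfish', 'jellyfihs'))
```

Under plain Levenshtein rules a swapped pair of adjacent characters counts as two edits, whereas Damerau-Levenshtein treats a transposition as a single edit.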
## Example Usage
``` python
>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> jellyfish.jaro_distance('jellyfish', 'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1
>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'
```
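
For readers curious what a phonetic code such as the `soundex` result above represents, the following is a rough pure-Python sketch of the American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse repeated codes, and pad to four characters. The name `soundex_sketch` is hypothetical and the code is an illustration under those assumptions, not the library's implementation.

``` python
# Digit assigned to each consonant group; vowels, H, W, and Y have no code.
CODES = {}
for digit, group in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                     "4": "L", "5": "MN", "6": "R"}.items():
    for letter in group:
        CODES[letter] = digit


def soundex_sketch(name: str) -> str:
    """Return a four-character Soundex-style code (illustrative only)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    last = CODES.get(letters[0], "")  # the first letter's code also counts
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(c, "")
        if code and code != last:
            result += code
        last = code  # a vowel resets this, so a repeat after it is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex_sketch('Jellyfish'))
```

For `'Jellyfish'` the sketch yields `'J412'`, the same code shown in the example output above. Names that sound alike, such as `'Robert'` and `'Rupert'`, end up with the same code.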