Release v1.0.0 (#68)
- all normalized string_metrics can now be used as scorer for process.extract/extractOne
- Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
- increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future
- improved docstrings of functions
- Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
- Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation.
- Improved performance of `fuzz.partial_ratio`
-> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
- Improved performance of `process.extract` and `process.extractOne`
- the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0
These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.
- added normalized version of the hamming distance in `string_metric.normalized_hamming`
- process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff
- multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz
- fixed bug in `token_ratio`
- fixed bug in result normalisation causing zero division
2021-02-12 15:37:44 +00:00
Installation
============
While there are several ways of install RapidFuzz, the recommended methods
are either by using `pip` (the Python package manager) or
`conda` (an open-source, cross-platform, package manager)
using pip
---------
RapidFuzz can be installed with `pip` :
.. code-block :: sh
pip install rapidfuzz
There are pre-built binaries (wheels) of RapidFuzz for MacOS (10.9 and later), Linux x86_64 and Windows.
.. raw :: html
<div class="admonition error">
<p class="admonition-title">failure "ImportError: DLL load failed"</p>
<p>
If you run into this error on Windows the reason is most likely, that the
<a class="reference external" href="https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads">Visual C++ 2019 redistributable</a> is not installed, which is required to
find C++ Libraries (The C++ 2019 version includes the 2015, 2017 and 2019 version).</p>
</div>
using conda
-----------
RapidFuzz can be installed with `conda` :
.. code-block :: sh
conda install -c conda-forge rapidfuzz
from git
--------
RapidFuzz can be directly used from GitHub by cloning the
2021-09-26 14:36:05 +00:00
repository. This requires a C++14 capable compiler.
Release v1.0.0 (#68)
- all normalized string_metrics can now be used as scorer for process.extract/extractOne
- Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
- increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future
- improved docstrings of functions
- Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
- Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation.
- Improved performance of `fuzz.partial_ratio`
-> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
- Improved performance of `process.extract` and `process.extractOne`
- the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0
These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.
- added normalized version of the hamming distance in `string_metric.normalized_hamming`
- process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff
- multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz
- fixed bug in `token_ratio`
- fixed bug in result normalisation causing zero division
2021-02-12 15:37:44 +00:00
.. code-block :: sh
2021-08-31 11:51:37 +00:00
git clone --recursive https://github.com/maxbachmann/rapidfuzz.git
Release v1.0.0 (#68)
- all normalized string_metrics can now be used as scorer for process.extract/extractOne
- Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
- increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future
- improved docstrings of functions
- Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
- Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation.
- Improved performance of `fuzz.partial_ratio`
-> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
- Improved performance of `process.extract` and `process.extractOne`
- the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0
These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.
- added normalized version of the hamming distance in `string_metric.normalized_hamming`
- process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff
- multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz
- fixed bug in `token_ratio`
- fixed bug in result normalisation causing zero division
2021-02-12 15:37:44 +00:00
cd rapidfuzz
pip install .