12 Commits

Author SHA1 Message Date
Max Bachmann de0f6d8af3 add fallback implementation back to wheel 2022-06-23 12:54:43 +02:00
Max Bachmann 83d0a77f2a allow usage of system installed libs (#213)
System-installed versions of `rapidfuzz-cpp`, `jarowinkler-cpp` and `taskflow` are now used when a compatible version is available.
2022-04-17 20:21:34 +02:00
Max Bachmann 616329b5a5 Use scikit-build v0.13.0 2022-02-02 23:43:07 +01:00
Max Bachmann 5fc5ca7857 update documentation theme 2022-01-25 12:29:44 +01:00
Max Bachmann 29cb84fe38 avoid generating broken docs 2022-01-02 16:32:31 +01:00
Max Bachmann 239911bac9 cythonize while installing 2021-12-30 19:46:02 +01:00
Max Bachmann 0674083701 cythonize while packaging 2021-12-24 12:06:47 +01:00
Max Bachmann 6c5584a6c0 fix incorrect loop unrolling 2021-09-24 04:53:54 +02:00
Max Bachmann 0e6466d835 rename master branch to main 2021-02-14 15:00:57 +01:00
Max Bachmann 375c13e436 Release v1.0.0 (#68)
- all normalized string_metrics can now be used as a scorer for process.extract/extractOne
- Implementation of the C++ wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
- increased test coverage, which already helped to fix some bugs and helps to prevent regressions in the future
- improved docstrings of functions

- Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
- Added a specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, which is even faster than the bitparallel implementation.
- Improved performance of `fuzz.partial_ratio`
-> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
- Improved performance of `process.extract` and `process.extractOne`
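
To illustrate the bitparallel idea mentioned above for the weights (1,1,1), here is a minimal pure-Python sketch. It assumes the Myers/Hyyrö bit-vector formulation, which the commit message does not name, and the function name `levenshtein_bitparallel` is hypothetical. It is not the library's C++ code, only a sketch of the technique: each column of the edit-distance matrix is encoded as bit vectors of vertical deltas, so one text character is processed with a handful of integer operations.

```python
def levenshtein_bitparallel(s1, s2):
    """Uniform-cost Levenshtein distance using bit vectors (sketch)."""
    m = len(s1)
    if m == 0:
        return len(s2)

    # One bit mask per character: bit i is set if s1[i] == character.
    pattern_masks = {}
    for i, ch in enumerate(s1):
        pattern_masks[ch] = pattern_masks.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1   # keep only the m bits that belong to s1
    last = 1 << (m - 1)   # bit of the last row of the matrix
    vp = mask             # vertical deltas of +1 (first column is 0..m)
    vn = 0                # vertical deltas of -1
    dist = m              # distance between s1 and the empty prefix of s2

    for ch in s2:
        pm = pattern_masks.get(ch, 0)
        d0 = ((((pm & vp) + vp) ^ vp) | pm | vn) & mask  # diagonal delta is 0
        hp = (vn | ~(d0 | vp)) & mask                    # horizontal delta +1
        hn = d0 & vp                                     # horizontal delta -1
        # The last row tells how the distance changes for this column.
        if hp & last:
            dist += 1
        elif hn & last:
            dist -= 1
        hp = ((hp << 1) | 1) & mask
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0
    return dist


if __name__ == "__main__":
    print(levenshtein_bitparallel("kitten", "sitting"))
```

Python integers are unbounded, so the sketch needs no blocking across machine words; a native implementation would have to split longer patterns into word-sized blocks.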

- the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0
  These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.
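
The merge of the unweighted and weighted variants into one function can be pictured with a small dynamic-programming sketch. The name `weighted_levenshtein` and the weight order (insertion, deletion, substitution) are assumptions made for illustration; the commit message does not spell out the signature of `string_metric.levenshtein`.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Levenshtein distance with configurable edit costs (sketch).

    weights is assumed to be (insertion, deletion, substitution).
    """
    insert_cost, delete_cost, replace_cost = weights

    # Row for the empty prefix of s1: only insertions are needed.
    prev = [j * insert_cost for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, 1):
        # First cell: delete the first i characters of s1.
        curr = [i * delete_cost]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(
                prev[j] + delete_cost,                            # drop c1
                curr[j - 1] + insert_cost,                        # add c2
                prev[j - 1] + (0 if c1 == c2 else replace_cost),  # match or replace
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(weighted_levenshtein("lewenstein", "levenshtein"))
    print(weighted_levenshtein("lewenstein", "levenshtein", weights=(1, 1, 2)))
```

With weights (1,1,2) a substitution costs as much as a deletion plus an insertion, so the result equals the insertion/deletion-only distance.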

- added a normalized version of the hamming distance in `string_metric.normalized_hamming`
- added `process.extract_iter` as a generator that yields the similarity of all elements that have a similarity >= score_cutoff
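
The generator behaviour described for `process.extract_iter` can be sketched as follows. This is not the library's implementation: `extract_iter_sketch` and `ratio` are hypothetical names, the shape of the yielded tuples is an assumption, and `difflib.SequenceMatcher` from the standard library stands in for a RapidFuzz scorer so the example stays self-contained.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in scorer returning a similarity between 0 and 100."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def extract_iter_sketch(query, choices, scorer=ratio, score_cutoff=0.0):
    """Lazily yield (choice, score) for every choice scoring >= score_cutoff."""
    for choice in choices:
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score


if __name__ == "__main__":
    for choice, score in extract_iter_sketch(
        "apple", ["apple", "apples", "banana"], score_cutoff=80
    ):
        print(choice, round(score, 1))
```

Because results are produced one at a time, a caller can stop early or stream over a large collection of choices without building the full result list.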

- fixed multiple bugs in extractOne when used with a scorer that is not from RapidFuzz
- fixed bug in `token_ratio`
- fixed bug in result normalisation causing zero division
2021-02-12 16:48:10 +01:00
maxbachmann 53085be8f6 simplify workflow 2020-06-03 09:44:57 +02:00
maxbachmann bbf2de840e add documentation 2020-05-27 14:16:12 +02:00