## Changelog

### [2.14.0] -
#### Fixed
- improve handling of functions wrapped using `functools.wraps`

### [2.13.2] - 2022-11-05
#### Fixed
- fix incorrect results in `Hamming.normalized_similarity`
- fix incorrect score_cutoff handling in pure python implementation of
  `Postfix.normalized_distance` and `Prefix.normalized_distance`
- fix `Levenshtein.normalized_similarity` and `Levenshtein.normalized_distance`
  when used in combination with the process module
- `fuzz.partial_ratio` was not always symmetric when `len(s1) == len(s2)`

### [2.13.1] - 2022-11-02
#### Fixed
- fix bug in `normalized_similarity` of most scorers,
  leading to incorrect results when used in combination with the process module
- fix sse2 support
- fix bug in `JaroWinkler` and `Jaro` when used in the pure python process module
- forward kwargs in pure Python implementation of `process.extract`

### [2.13.0] - 2022-10-30
#### Fixed
- fix bug in `Levenshtein.editops` leading to crashes when used with `score_hint`

#### Changed
- moved capi from `rapidfuzz_capi` into `rapidfuzz`, since the installation of
  `rapidfuzz` always succeeds now that there is a pure Python mode
- add `score_hint` argument to process module
- add `score_hint` argument to Levenshtein module

### [2.12.0] - 2022-10-24
#### Changed
- drop support for Python 3.6

#### Added
- added `Prefix`/`Suffix` similarity
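
A minimal sketch of what a prefix/suffix similarity can look like, assuming the similarity is simply the length of the common prefix or suffix. The function names are hypothetical and this is not the library's implementation.

```python
def common_prefix_len(s1, s2):
    """Number of leading elements shared by both sequences."""
    count = 0
    for a, b in zip(s1, s2):
        if a != b:
            break
        count += 1
    return count


def common_suffix_len(s1, s2):
    """Number of trailing elements shared by both sequences."""
    # Reversing both inputs turns the suffix problem into a prefix problem.
    return common_prefix_len(s1[::-1], s2[::-1])


print(common_prefix_len("prefix", "pretext"))
print(common_suffix_len("testing", "running"))
```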

#### Fixed
- fixed packaging with pyinstaller

### [2.11.1] - 2022-10-05
#### Fixed
- Fix segmentation fault in `process.cdist` when used with an empty query sequence

### [2.11.0] - 2022-10-02
#### Changed
- move jarowinkler dependency into rapidfuzz to simplify maintenance

#### Performance
- add SIMD implementation for `fuzz.ratio`/`fuzz.QRatio`/`Levenshtein`/`Indel`/`LCSseq`/`OSA` to improve
  performance for short strings in cdist

### [2.10.3] - 2022-09-30
#### Fixed
- use `scikit-build=0.14.1` on Linux, since `scikit-build=0.15.0` fails to find the Python Interpreter
- work around gcc bug in template type deduction

### [2.10.2] - 2022-09-27
#### Fixed
- fix support for cmake versions below 3.17

### [2.10.1] - 2022-09-25
#### Changed
- modernize cmake build to fix most conda-forge builds

### [2.10.0] - 2022-09-18
#### Added
- add editops to hamming distance
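
For the Hamming distance, the edit operations can only be replacements, one per differing position. The sketch below illustrates this with plain tuples; the tuple layout `(tag, src_pos, dest_pos)` and the function name are assumptions for illustration, not the library's `Editops` type.

```python
def hamming_editops_sketch(s1, s2):
    """List the replacements that turn s1 into s2 (equal lengths only)."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires sequences of equal length")
    # Each mismatch at index i is a replacement of s1[i] by s2[i].
    return [("replace", i, i) for i, (a, b) in enumerate(zip(s1, s2)) if a != b]


ops = hamming_editops_sketch("karolin", "kathrin")
print(ops)
```

The number of operations equals the Hamming distance of the two inputs.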

#### Performance
- strip common affix in osa distance

#### Fixed
- ignore missing pandas in Python3.11 tests

### [2.9.0] - 2022-09-16
#### Added
- add optimal string alignment (OSA)
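
A textbook dynamic-programming sketch of the OSA distance, shown only to illustrate the metric. It extends the Levenshtein recurrence with a transposition of two adjacent elements, and it is not the optimized implementation shipped in the library.

```python
def osa_distance(s1, s2):
    """Optimal string alignment distance with unit costs."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of s1[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of s2[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Transposition of two adjacent elements.
            if i > 1 and j > 1 and s1[i - 1] == s2[j - 2] and s1[i - 2] == s2[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("ab", "ba"))
print(osa_distance("CA", "ABC"))
```

Unlike the unrestricted Damerau-Levenshtein distance, OSA never edits a substring more than once, which is why `"CA"` and `"ABC"` are 3 apart rather than 2.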

### [2.8.0] - 2022-09-11
#### Fixed
- `fuzz.partial_ratio` did not find the optimal alignment in some edge cases (#219)

#### Performance
- improve performance of `fuzz.partial_ratio`

#### Changed
- increased minimum C++ version to C++17 (see #255)

### [2.7.0] - 2022-09-11
#### Performance
- improve performance of `Levenshtein.distance`/`Levenshtein.editops` for
  long sequences.

#### Added
- add `score_hint` parameter to `Levenshtein.editops` which allows the use of a
  faster implementation

#### Changed
- all functions in the `string_metric` module now raise a deprecation warning.
  They are now only wrappers for their replacement functions, which makes them slower
  when used with the process module

### [2.6.1] - 2022-09-03
#### Fixed
- fix incorrect results of partial_ratio for long needles (#257)

### [2.6.0] - 2022-08-20
#### Fixed
- fix hashing for custom classes

#### Added
- add support for slicing in `Editops.__getitem__`/`Editops.__delitem__`
- add `DamerauLevenshtein` module

### [2.5.0] - 2022-08-14
#### Added
- added support for KeyboardInterrupt in the process module.
  It might still take a moment until the KeyboardInterrupt is registered, but
  the remaining text comparisons are no longer run after pressing `Ctrl + C`

#### Fixed
- fix default scorer used by cdist to use C++ implementation if possible

### [2.4.4] - 2022-08-12
#### Changed
- Added support for Python3.11

### [2.4.3] - 2022-08-08
#### Fixed
- fix value range of `jaro_similarity`/`jaro_winkler_similarity` in the pure Python mode
  for the string_metric module
- fix missing atomic symbol on arm 32 bit

### [2.4.2] - 2022-07-30
#### Fixed
- add missing symbol to the pure Python implementation, whose absence made it unusable

### [2.4.1] - 2022-07-29
#### Fixed
- fix version number

### [2.4.0] - 2022-07-29
#### Fixed
- fix banded Levenshtein implementation

#### Performance
- improve performance and memory usage of `Levenshtein.editops`
  - memory usage is reduced from O(NM) to O(N)
  - performance is improved for long sequences

### [2.3.0] - 2022-07-23
#### Added
- add `as_matching_blocks` to `Editops`/`Opcodes`
- add support for deletions from `Editops`
- add `Editops.apply`/`Opcodes.apply`
- add `Editops.remove_subsequence`

#### Changed
- merge adjacent similar blocks in `Opcodes`

#### Fixed
- fix usage of `eval(repr(Editop))`, `eval(repr(Editops))`, `eval(repr(Opcode))` and `eval(repr(Opcodes))`
- fix opcode conversion for empty source sequence
- fix validation for empty Opcode list passed into `Opcodes.__init__`

### [2.2.0] - 2022-07-19
#### Changed
- added in-tree build backend to install cmake and ninja only when they are not installed yet
  and only when wheels are available

### [2.1.4] - 2022-07-17
#### Changed
- changed internal implementation of cdist to remove the build dependency on numpy

#### Added
- added wheels for musllinux and manylinux ppc64le, s390x

### [2.1.3] - 2022-07-09
#### Fixed
- fix missing type stubs

### [2.1.2] - 2022-07-04
#### Changed
- change src layout to make package import from root directory possible

### [2.1.1] - 2022-06-30
#### Changed
- allow installation without the C++ extension if it fails to compile
- allow selection of implementation via the environment variable `RAPIDFUZZ_IMPLEMENTATION`
  which can be set to "cpp" or "python"
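
A short sketch of selecting the implementation from Python. It assumes the variable is read when `rapidfuzz` is first imported, so it is set beforehand; the import is guarded so the snippet also runs where the package is not installed.

```python
import os

# Assumption: the variable must be set before the first import of rapidfuzz.
os.environ["RAPIDFUZZ_IMPLEMENTATION"] = "python"

try:
    from rapidfuzz import fuzz
except ImportError:
    fuzz = None  # rapidfuzz is not available in this environment

if fuzz is not None:
    print(fuzz.ratio("this is a test", "this is a test!"))
```

Setting the variable in the shell before starting the interpreter achieves the same effect without touching the code.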

### [2.1.0] - 2022-06-29
#### Added
- added pure python fallback for all implementations with the following exceptions:
  - no support for sequences of hashables. Only strings supported so far
  - *.editops / *.opcodes functions not implemented yet
  - process.cdist does not support multithreading

#### Fixed
- fuzz.partial_ratio_alignment ignored the score_cutoff
- fix implementation of Hamming.normalized_similarity
- fix default score_cutoff of Hamming.similarity
- fix implementation of LCSseq.distance when used in the process module
- treat hash for -1 and -2 as different
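
Background for the last item: in CPython, `hash(-1)` evaluates to `-2`, the same value as `hash(-2)`, because `-1` is reserved as an error return value in the C API. Code that identifies elements by their hash alone therefore treats `-1` and `-2` as the same element. The helper names below are hypothetical and only illustrate the pitfall, not the actual fix.

```python
def same_by_hash_only(a, b):
    # Buggy shortcut: assumes equal hashes imply equal elements.
    return hash(a) == hash(b)


def same_element(a, b):
    # Safer: an equal hash is only a hint, equality decides.
    return hash(a) == hash(b) and a == b


print(hash(-1), hash(-2))
print(same_by_hash_only(-1, -2), same_element(-1, -2))
```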

### [2.0.15] - 2022-06-24
#### Fixed
- fix integer wraparound in partial_ratio/partial_ratio_alignment

### [2.0.14] - 2022-06-23
#### Fixed
- fix unlimited recursion in LCSseq when used in combination with the process module

#### Changed
- add fallback implementations of `taskflow`, `rapidfuzz-cpp` and `jarowinkler-cpp`
  back to wheel, since some package building systems like piwheels can't clone sources

### [2.0.13] - 2022-06-22
#### Changed
- use system version of cmake on arm platforms, since the cmake package fails to compile

### [2.0.12] - 2022-06-22
#### Changed
- add tests to sdist
- remove cython dependency for sdist

### [2.0.11] - 2022-04-23
#### Changed
- relax version requirements of dependencies to simplify packaging

### [2.0.10] - 2022-04-17
#### Fixed
- Do not include installations of jaro_winkler in wheels (regression from 2.0.7)

#### Changed
- Allow installation from system installed versions of `rapidfuzz-cpp`, `jarowinkler-cpp`
  and `taskflow`

#### Added
- Added PyPy3.9 wheels on Linux

### [2.0.9] - 2022-04-07
#### Fixed
- Add missing Cython code in sdist
- consider float imprecision in score_cutoff (see #210)

### [2.0.8] - 2022-04-07
#### Fixed
- fix incorrect score_cutoff handling in token_set_ratio and token_ratio

#### Added
- add longest common subsequence

### [2.0.7] - 2022-03-13
#### Fixed
- Do not include installations of jaro_winkler and taskflow in wheels

### [2.0.6] - 2022-03-06
#### Fixed
- fix incorrect population of sys.modules which led to submodules overshadowing
  other imports

#### Changed
- moved JaroWinkler and Jaro into a separate package

### [2.0.5] - 2022-02-25
#### Fixed
- fix signed integer overflow inside hashmap implementation

### [2.0.4] - 2022-02-21
#### Fixed
- fix binary size increase due to debug symbols
- fix segmentation fault in `Levenshtein.editops`

### [2.0.3] - 2022-02-18
#### Added
- Added fuzz.partial_ratio_alignment, which returns the result of fuzz.partial_ratio
  combined with the alignment this result stems from

#### Fixed
- Fix Indel distance returning incorrect result when using score_cutoff=1, when the strings
  are not equal. This affected other scorers like fuzz.WRatio, which use the Indel distance
  as well.

### [2.0.2] - 2022-02-12
#### Fixed
- fix type hints
- Add back transpiled cython files to the sdist to simplify builds in package builders
  like FreeBSD port build or conda-forge

### [2.0.1] - 2022-02-11
#### Fixed
- fix type hints
- Indel.normalized_similarity mistakenly used the implementation of Indel.normalized_distance

### [2.0.0] - 2022-02-09
#### Added
- added C-Api which can be used to extend RapidFuzz from different Python modules using any
  programming language which allows the usage of C-Apis (C/C++/Rust)
- added new scorers in `rapidfuzz.distance.*`
  - port existing distances to this new api
  - add Indel distance along with the corresponding editops function

#### Changed
- when the result of `string_metric.levenshtein` or `string_metric.hamming` exceeds `max`
  they now return `max + 1` instead of -1
- Build system moved from setuptools to scikit-build
- Stop including all modules in \_\_init\_\_.py, since they significantly slowed down import time
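
The `max + 1` convention from the first item can be sketched with a plain Hamming distance. The function and the parameter name `max_dist` are hypothetical (chosen to avoid shadowing the built-in `max`) and do not mirror the library's signature.

```python
def hamming_with_max(s1, s2, max_dist=None):
    """Hamming distance that is capped at max_dist + 1 when a limit is given."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires sequences of equal length")
    dist = sum(a != b for a, b in zip(s1, s2))
    if max_dist is not None and dist > max_dist:
        # The caller only learns that the limit was exceeded.
        return max_dist + 1
    return dist


print(hamming_with_max("karolin", "kathrin"))
print(hamming_with_max("karolin", "kathrin", max_dist=1))
```

Returning `max_dist + 1` keeps the result a valid non-negative number that still compares as "greater than the limit", so callers do not need a special case for -1.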

#### Removed
- remove the `rapidfuzz.levenshtein` module which was deprecated in v1.0.0 and scheduled for removal in v2.0.0
- dropped support for Python2.7 and Python3.5

#### Deprecated
- deprecate support to specify processor in form of a boolean (will be removed in v3.0.0)
  - new functions will not get support for this in the first place
- deprecate `rapidfuzz.string_metric` (will be removed in v3.0.0). Similar scorers are available
  in `rapidfuzz.distance.*`

#### Fixed
- process.cdist did raise an exception when used with a pure python scorer

#### Performance
- improve performance and memory usage of `rapidfuzz.string_metric.levenshtein_editops`
  - memory usage is reduced by 33%
  - performance is improved by around 10%-20%
- significantly improve performance of `rapidfuzz.string_metric.levenshtein` for `max <= 31`
  using a banded implementation

### [1.9.1] - 2021-12-13
#### Fixed
- fix bug in new editops implementation, causing it to SegFault on some inputs (see qurator-spk/dinglehopper#64)

### [1.9.0] - 2021-12-11
#### Fixed
- Fix some issues in the type annotations (see #163)

#### Performance
- improve performance and memory usage of `rapidfuzz.string_metric.levenshtein_editops`
  - memory usage is reduced by 10x
  - performance is improved from `O(N * M)` to `O([N / 64] * M)`

### [1.8.3] - 2021-11-19
#### Added
- Added missing wheels for Python3.6 on macOS and Windows (see #159)

### [1.8.2] - 2021-10-27
#### Added
- Add wheels for Python 3.10 on macOS

### [1.8.1] - 2021-10-22
#### Fixed
- Fix incorrect editops results (See #148)

### [1.8.0] - 2021-10-20
#### Changed
- Add wheels for Python3.10 on all platforms except macOS (see #141)
- Improve performance of `string_metric.jaro_similarity` and `string_metric.jaro_winkler_similarity` for strings with a length <= 64

### [1.7.1] - 2021-10-02
#### Fixed
- fixed incorrect results of fuzz.partial_ratio for long needles (see #138)

### [1.7.0] - 2021-09-27
#### Changed
- Added typing for process.cdist
- Added multithreading support to `process.cdist` using the argument `workers`
- Add dtype argument to `process.cdist` to set the dtype of the result numpy array (see #132)
- Use a better hash collision strategy in the internal hashmap, which improves the worst case performance

### [1.6.2] - 2021-09-15
#### Changed
- improved performance of fuzz.ratio
- only import process.cdist when numpy is available

### [1.6.1] - 2021-09-11
#### Changed
- Add back wheels for Python2.7

### [1.6.0] - 2021-09-10
#### Changed
- fuzz.partial_ratio uses a new implementation for short needles (<= 64). This implementation is
  - more accurate than the current implementation (it is guaranteed to find the optimal alignment)
  - significantly faster
- Add process.cdist to compare all elements of two lists (see #51)

### [1.5.1] - 2021-09-01
#### Fixed
- Fix out of bounds access in levenshtein_editops

### [1.5.0] - 2021-08-21
#### Changed
- all scorers now support similarity/distance calculations between any sequence of hashables, so it is possible to calculate e.g. the word error rate (WER) as:
```
>>> string_metric.levenshtein(["word1", "word2"], ["word1", "word3"])
1
```
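
To illustrate why sequences of hashables are enough for a word error rate, here is a self-contained sketch that does not use the library: a plain Levenshtein distance over arbitrary sequences, applied to lists of words. The function names are hypothetical.

```python
def levenshtein_seq(s1, s2):
    """Levenshtein distance between two sequences of comparable items."""
    prev = list(range(len(s2) + 1))
    for i, a in enumerate(s1, 1):
        cur = [i]
        for j, b in enumerate(s2, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (a != b),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein_seq(ref_words, hypothesis.split()) / len(ref_words)


print(levenshtein_seq(["word1", "word2"], ["word1", "word3"]))
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```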

#### Added
- Added type stub files for all functions
- added jaro similarity in `string_metric.jaro_similarity`
- added jaro winkler similarity in `string_metric.jaro_winkler_similarity`
- added Levenshtein editops in `string_metric.levenshtein_editops`

#### Fixed
- Fixed support for set objects in `process.extract`
- Fixed inconsistent handling of empty strings

### [1.4.1] - 2021-03-30
#### Performance
- improved performance of result creation in process.extract

#### Fixed
- Cython ABI stability issue (#95)
- fix missing decref in case of exceptions in process.extract

### [1.4.0] - 2021-03-29
#### Changed
- added processor support to `levenshtein` and `hamming`
- added distance support to extract/extractOne/extract_iter

#### Fixed
- incorrect results of `normalized_hamming` and `normalized_levenshtein` when used with `utils.default_process` as processor

### [1.3.3] - 2021-03-20
#### Fixed
- Fix a bug in the mbleven implementation of the uniform Levenshtein distance and cover it with fuzz tests

### [1.3.2] - 2021-03-20
#### Fixed
- some of the newly activated warnings caused build failures in the conda-forge build

### [1.3.1] - 2021-03-20
#### Fixed
- Fixed issue in LCS calculation for partial_ratio (see #90)
- Fixed incorrect results for normalized_hamming and normalized_levenshtein when the processor `utils.default_process` is used
- Fix many compiler warnings

### [1.3.0] - 2021-03-16
#### Changed
- add wheels for a lot of new platforms
- drop support for Python 2.7

#### Performance
- use `is` instead of `==` to compare functions directly by address

#### Fixed
- Fix another ref counting issue
- Fix some issues in the Levenshtein distance algorithm (see #92)

### [1.2.1] - 2021-03-08
#### Performance
- further improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64 (in many cases more than 50% faster)

### [1.2.0] - 2021-03-07
#### Changed
- add more benchmarks to documentation

#### Performance
- add bitparallel implementation to InDel Distance (Levenshtein with the weights 1,1,2) for strings with a length > 64
- improve bitparallel implementation of uniform Levenshtein distance for strings with a length > 64
- use the InDel Distance and uniform Levenshtein distance in more cases instead of the generic implementation
- Directly use the Levenshtein implementation in C++ instead of using it through Python in process.*
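
A sketch of the idea behind a bit-parallel InDel distance, assuming the well-known bit-vector recurrence for the length of the longest common subsequence (LCS). Python integers have arbitrary precision, so a single integer stands in for the chain of 64-bit blocks a C++ implementation would need for strings longer than 64 characters. This is an illustration, not necessarily the variant used in the library.

```python
def lcs_length_bitparallel(s1, s2):
    """Length of the longest common subsequence using bit-vector arithmetic."""
    m = len(s1)
    if m == 0:
        return 0
    mask = (1 << m) - 1

    # One bitmask per distinct element of s1: bit i is set when s1[i] == element.
    pattern_masks = {}
    for i, ch in enumerate(s1):
        pattern_masks[ch] = pattern_masks.get(ch, 0) | (1 << i)

    v = mask  # all ones: nothing matched yet
    for ch in s2:
        u = v & pattern_masks.get(ch, 0)
        v = ((v + u) | (v - u)) & mask
    # Every zero bit in v corresponds to one element of the LCS.
    return m - bin(v).count("1")


def indel_distance(s1, s2):
    """Levenshtein distance with weights (1, 1, 2), derived from the LCS length."""
    return len(s1) + len(s2) - 2 * lcs_length_bitparallel(s1, s2)


print(lcs_length_bitparallel("kitten", "sitting"))
print(indel_distance("kitten", "sitting"))
```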

### [1.1.2] - 2021-03-03
#### Fixed
- Fix reference counting in process.extract (see #81)

### [1.1.1] - 2021-02-23
#### Fixed
- Fix result conversion in process.extract (see #79)

### [1.1.0] - 2021-02-21
#### Changed
- string_metric.normalized_levenshtein supports now all weights
- when different weights are used for Insertion and Deletion the strings are not swapped inside the Levenshtein implementation anymore. So different weights for Insertion and Deletion are now supported.
- replace C++ implementation with a Cython implementation. This has the following advantages:
  - The implementation is less error prone, since a lot of the complex things are done by Cython
  - slightly faster than the current implementation (up to 10% for some parts)
  - about 33% smaller binary size
  - reduced compile time
- Added **kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
- Add max argument to hamming distance
- Add support for whole Unicode range to utils.default_process

#### Performance
- replaced Wagner Fischer usage in the normal Levenshtein distance with a bitparallel implementation

### [1.0.2] - 2021-02-19
#### Fixed
- The bitparallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
  The old algorithm is used again until this bug is fixed.

### [1.0.1] - 2021-02-17
#### Changed
- string_metric.normalized_levenshtein supports now the weights (1, 1, N) with N >= 1

#### Performance
- The Levenshtein distance with the weights (1, 1, >2) now uses the same implementation as the weights (1, 1, 2), since
  `Substitution > Insertion + Deletion` has no effect
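
The equivalence can be checked with a generic weighted Levenshtein sketch: once a substitution costs at least as much as a deletion plus an insertion, raising its cost further cannot change the result. The function below is a plain dynamic-programming illustration with a hypothetical name, not the library's implementation.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Levenshtein distance with (insertion, deletion, substitution) weights."""
    ins, dele, sub = weights
    prev = [j * ins for j in range(len(s2) + 1)]
    for i, a in enumerate(s1, 1):
        cur = [i * dele]
        for j, b in enumerate(s2, 1):
            cur.append(min(
                prev[j] + dele,                        # delete a from s1
                cur[j - 1] + ins,                      # insert b into s1
                prev[j - 1] + (0 if a == b else sub),  # match or substitute
            ))
        prev = cur
    return prev[-1]


print(weighted_levenshtein("kitten", "sitting", (1, 1, 2)))
print(weighted_levenshtein("kitten", "sitting", (1, 1, 5)))
```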

#### Fixed
- fix uninitialized variable in bitparallel Levenshtein distance with the weight (1, 1, 1)

### [1.0.0] - 2021-02-12
#### Changed
- all normalized string_metrics can now be used as scorer for process.extract/extractOne
- Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
- increased test coverage, which already helped to fix some bugs and helps to prevent regressions in the future
- improved docstrings of functions

#### Performance
- Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
- Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, which is even faster than the bit-parallel implementation.
- Improved performance of `fuzz.partial_ratio`.
  Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance.
- Improved performance of `process.extract` and `process.extractOne`
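
For readers unfamiliar with the bit-parallel approach to the (1,1,1) Levenshtein distance, the sketch below follows the bit-vector formulation commonly attributed to Myers and Hyyrö. It processes one column of the distance matrix per element of `s2` using integer operations. Python's arbitrary-precision integers replace the fixed 64-bit words of a native implementation, so this is an illustration of the technique rather than the library's code.

```python
def levenshtein_bitparallel(s1, s2):
    """Uniform Levenshtein distance (weights 1, 1, 1) using bit-vectors."""
    m = len(s1)
    if m == 0:
        return len(s2)

    # One bitmask per distinct element of s1: bit i is set when s1[i] == element.
    pattern_masks = {}
    for i, ch in enumerate(s1):
        pattern_masks[ch] = pattern_masks.get(ch, 0) | (1 << i)

    vp = (1 << m) - 1  # rows where the column increases by 1 going down
    vn = 0             # rows where the column decreases by 1 going down
    dist = m           # value in the last row of the current column
    last = 1 << (m - 1)

    for ch in s2:
        pm = pattern_masks.get(ch, 0)
        d0 = (((pm & vp) + vp) ^ vp) | pm | vn
        hp = vn | ~(d0 | vp)  # horizontal +1 deltas
        hn = d0 & vp          # horizontal -1 deltas
        if hp & last:
            dist += 1
        if hn & last:
            dist -= 1
        hp = (hp << 1) | 1
        hn <<= 1
        vp = hn | ~(d0 | hp)
        vn = hp & d0
    return dist


print(levenshtein_bitparallel("kitten", "sitting"))
```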

#### Deprecated
- the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0
  These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`.

#### Added
- added normalized version of the hamming distance in `string_metric.normalized_hamming`
- added process.extract_iter as a generator that yields the similarity of all elements with a similarity >= score_cutoff
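
The generator behaviour can be sketched without the library. The function name, the yielded tuple shape and the `difflib`-based stand-in scorer below are assumptions for illustration; they are not the signature or the scorer of `process.extract_iter`.

```python
from difflib import SequenceMatcher


def ratio_stand_in(a, b):
    """Stand-in scorer on a 0-100 scale, based on difflib."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def extract_iter_sketch(query, choices, scorer=ratio_stand_in, score_cutoff=0.0):
    """Lazily yield (choice, score) for every choice scoring >= score_cutoff."""
    for choice in choices:
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score


matches = extract_iter_sketch("apple", ["apple", "apply", "banana"], score_cutoff=70)
for choice, score in matches:
    print(choice, round(score, 1))
```

Because results are produced lazily, a caller can stop iterating as soon as it has seen enough matches.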

#### Fixed
- multiple bugs in extractOne when used with a scorer that is not from RapidFuzz
- fixed bug in `token_ratio`
- fixed bug in result normalization causing zero division


### [0.14.2] - 2020-12-31
#### Fixed
- utf8 usage in the copyright header caused problems with python2.7 on some platforms (see #70)

### [0.14.1] - 2020-12-13
#### Fixed
- when a custom processor like `lambda s: s` was used with any of the methods inside fuzz.* it always returned a score of 100. This release fixes this and adds a better test coverage to prevent this bug in the future.

### [0.14.0] - 2020-12-09
#### Added
- added hamming distance metric in the levenshtein module

#### Performance
- improved performance of default_process by using lookup table

### [0.13.4] - 2020-11-30
#### Fixed
- Add missing virtual destructor that caused a segmentation fault on macOS

### [0.13.3] - 2020-11-21
#### Added
- C++11 Support
- manylinux wheels

### [0.13.2] - 2020-11-21
#### Fixed
- Levenshtein was not imported from \_\_init\_\_
- The reference count of a Python Object inside process.extractOne was decremented too early

### [0.13.1] - 2020-11-17
#### Performance
- process.extractOne exits early when a score of 100 is found. This way the other strings do not have to be preprocessed anymore.
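
A minimal sketch of the early exit, assuming scores on a 0-100 scale. The function name, the result tuple and the call-recording scorer are hypothetical and only serve to show that later choices are never scored once a perfect match is found.

```python
def extract_one_sketch(query, choices, scorer):
    """Return (choice, score, index) of the best match, stopping at a score of 100."""
    best = None
    for index, choice in enumerate(choices):
        score = scorer(query, choice)
        if best is None or score > best[1]:
            best = (choice, score, index)
        if score >= 100:
            break  # nothing can beat a perfect score
    return best


scored = []  # records which choices were actually scored


def exact_match_scorer(a, b):
    scored.append(b)
    return 100.0 if a == b else 0.0


result = extract_one_sketch("b", ["a", "b", "c", "d"], exact_match_scorer)
print(result, scored)
```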

### [0.13.0] - 2020-11-16
#### Fixed
- string objects passed to scorers had to be strings even before preprocessing them. This was changed, so they only have to be strings after preprocessing similar to process.extract/process.extractOne

#### Performance
- process.extractOne is now implemented in C++ making it a lot faster
- When token_sort_ratio or partial_token_sort_ratio is used in process.extractOne the words in the query are only sorted once to improve the runtime

#### Changed
- process.extractOne/process.extract do now return the index of the match, when the choices are a list.

#### Removed
- process.extractIndices got removed, since the indices are now already returned by process.extractOne/process.extract

### [0.12.5] - 2020-10-26
#### Fixed
- fix documentation of process.extractOne (see #48)

### [0.12.4] - 2020-10-22
#### Added
- Added wheels for
  - CPython 2.7 on windows 64 bit
  - CPython 2.7 on windows 32 bit
  - PyPy 2.7 on windows 32 bit

### [0.12.3] - 2020-10-09
#### Fixed
- fix bug in partial_ratio (see #43)

### [0.12.2] - 2020-10-01
#### Fixed
- fix inconsistency with fuzzywuzzy in partial_ratio when using strings of equal length

### [0.12.1] - 2020-09-30
#### Fixed
- MSVC has a bug and therefore crashed on some of the templates used. This release simplifies the templates so that compiling with MSVC works again

### [0.12.0] - 2020-09-30
#### Performance
- partial_ratio is using the Levenshtein distance now, which is a lot faster. Since many of the other algorithms use partial_ratio, this helps to improve the overall performance

### [0.11.3] - 2020-09-22
#### Fixed
- fix partial_token_set_ratio returning 100 all the time

### [0.11.2] - 2020-09-12
#### Added
- added rapidfuzz.\_\_author\_\_, rapidfuzz.\_\_license\_\_ and rapidfuzz.\_\_version\_\_

### [0.11.1] - 2020-09-01
#### Fixed
- do not use auto junk when searching the optimal alignment for partial_ratio

### [0.11.0] - 2020-08-22
#### Changed
- added support for python 2.7 (see #40)
- add wheels for python2.7 (both pypy and cpython) on macOS and Linux

### [0.10.0] - 2020-08-17
#### Changed
- added wheels for Python3.9

#### Fixed
- tuple scores in process.extractOne are now supported #39