1) in the pure Python implementation the automatic junk heuristic of difflib is not deactivated. This heuristic improves the performance for long strings,
2) the accelerated version backtracks the Levenshtein matrix to find the same alignment found by the Python implementation. However the algorithm just uses
one of multiple optimal alignment. There is no guarantee for this alignment to include the longest common substring.
3) the optimal substring is assumed to start at one of these `matching_blocks`. However this is not guaranteed.
`RapidFuzz` uses a sliding window approach (with some optimizations to skip impossible alignments) to find the optimal alignment. This approach is guaranteed
to find the optimal alignment.
## differences in preprocessing
`fuzzywuzzy` provides the function `utils.full_process` to preprocess strings. This function is called `utils.default_process` in `RapidFuzz`. It behaves similar with the only exception
that it does not provide the optional argument `force_ascii` which removes any non ascii characters from a string.
## differences in scorers
`fuzzywuzzy` has the following scorers which preprocess strings by default:
-`fuzz.token_sort_ratio`
-`fuzz.token_set_ratio`
-`fuzz.partial_token_sort_ratio`
-`fuzz.partial_token_set_ratio`
-`fuzz.WRatio`
-`fuzz.QRatio`
-`fuzz.UWRatio`
-`fuzz.UQRatio`
With the exception `fuzz.UWRatio` and `fuzz.UQRatio` of all have `force_ascii` enabled forthe peprocessing function by default.
In `RapidFuzz` no scorer preprocesses strings by default to keep the interface consistent. However a preprocessing function can be provided using the `processor` argument. In addition the functions `fuzz.UWRatio` and `fuzz.UQRatio` do not exist, since they are the same as `fuzz.WRatio` / `fuzz.QRatio` with `force_ascii` disabled. Since in `RapidFuzz` the `force_ascii` argument does not exist these functions do not provide any value.
## differences in processor functions
In `fuzzywuzzy` the process module includes the following functions:
-`extractWithoutOrder` (generator over unsorted results)
-`extract` (find the N best matches in a sorted list)
-`extractBests` (same as extract but with an addition score_cutoff parameter to filter bad matches)
-`extractOne` (find best match)
-`dedupe` (deduplicate list)
In `RapidFuzz` these functions are sometimes available under different names:
-`extractWithoutOrder` is called `extract_iter`
-`extract` / `extractBests` are a single function called `extract` which povides the optional `score_cutoff` argument