RapidFuzz/docs/contributing.rst

Contributing
============

Interested in contributing to RapidFuzz? Want to report a bug?
Before you do, please read the following guidelines.

Submission context
-------------------

Got a question or problem?
^^^^^^^^^^^^^^^^^^^^^^^^^^

For questions you can reach us using
`github discussions <https://github.com/maxbachmann/RapidFuzz/discussions>`__.

Found a bug?
^^^^^^^^^^^^

If you found a bug in the source code, you can help us by submitting an issue
to the `issue tracker <https://github.com/maxbachmann/rapidfuzz/issues>`__
in our GitHub repository. Even better, you can submit a Pull Request with a fix.
However, before doing so, please read the :ref:`submission-guidelines`.

Missing a feature?
^^^^^^^^^^^^^^^^^^

You can request a new feature by submitting an issue to our GitHub Repository.
If you would like to implement a new feature, please submit an issue with a
proposal for your work first, to be sure that it is of use for everyone.
Please consider what kind of change it is:

* For a **major feature**, first open an issue and outline your proposal so
  that it can be discussed. This will also allow us to better coordinate our
  efforts, prevent duplication of work, and help you to craft the change so
  that it is successfully accepted into the project.

* **Small features and bugs** can be crafted and directly submitted as a Pull
  Request. However, there is no guarantee that your feature will make it into
  the `main`, as it's always a matter of opinion whether if benefits the
  overall functionality of the project.

.. _submission-guidelines:

Submission guidelines
---------------------

Submitting an issue
^^^^^^^^^^^^^^^^^^^

Before you submit an issue, please search the issue tracker, maybe an issue for
your problem already exists and the discussion might inform you of workarounds
readily available.
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Contributing`
			`============`
add documentation 2020-05-27 12:15:45 +00:00
			`Interested in contributing to RapidFuzz? Want to report a bug?`
			`Before you do, please read the following guidelines.`

Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Submission context`
			`-------------------`
add documentation 2020-05-27 12:15:45 +00:00
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Got a question or problem?`
			`^^^^^^^^^^^^^^^^^^^^^^^^^^`
add documentation 2020-05-27 12:15:45 +00:00
add levenshtein_editops 2021-08-21 00:53:32 +00:00			`For questions you can reach us using`
			`github discussions <https://github.com/maxbachmann/RapidFuzz/discussions>`__.
add documentation 2020-05-27 12:15:45 +00:00
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Found a bug?`
			`^^^^^^^^^^^^`
add documentation 2020-05-27 12:15:45 +00:00
			`If you found a bug in the source code, you can help us by submitting an issue`
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			to the `issue tracker <https://github.com/maxbachmann/rapidfuzz/issues>`__
			`in our GitHub repository. Even better, you can submit a Pull Request with a fix.`
			However, before doing so, please read the :ref:`submission-guidelines`.
add documentation 2020-05-27 12:15:45 +00:00
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Missing a feature?`
			`^^^^^^^^^^^^^^^^^^`
add documentation 2020-05-27 12:15:45 +00:00
			`You can request a new feature by submitting an issue to our GitHub Repository.`
			`If you would like to implement a new feature, please submit an issue with a`
			`proposal for your work first, to be sure that it is of use for everyone.`
			`Please consider what kind of change it is:`

			`* For a major feature, first open an issue and outline your proposal so`
			`that it can be discussed. This will also allow us to better coordinate our`
			`efforts, prevent duplication of work, and help you to craft the change so`
			`that it is successfully accepted into the project.`

			`* Small features and bugs can be crafted and directly submitted as a Pull`
			`Request. However, there is no guarantee that your feature will make it into`
rename master branch to main 2021-02-14 13:59:12 +00:00			the `main`, as it's always a matter of opinion whether if benefits the
add documentation 2020-05-27 12:15:45 +00:00			`overall functionality of the project.`

Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`.. _submission-guidelines:`

			`Submission guidelines`
			`---------------------`
add documentation 2020-05-27 12:15:45 +00:00
Release v1.0.0 (#68) - all normalized string_metrics can now be used as scorer for process.extract/extractOne - Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future. - increased test coverage, that already helped to fix some bugs and help to prevent regressions in the future - improved docstrings of functions - Added bitparallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2). - Added specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, that is even faster, than the bitparallel implementation. - Improved performance of `fuzz.partial_ratio` -> Since `fuzz.ratio` and `fuzz.partial_ratio` are used in most scorers, this improves the overall performance. - Improved performance of `process.extract` and `process.extractOne` - the `rapidfuzz.levenshtein` module is now deprecated and will be removed in v2.0.0 These functions are now placed in `rapidfuzz.string_metric`. `distance`, `normalized_distance`, `weighted_distance` and `weighted_normalized_distance` are combined into `levenshtein` and `normalized_levenshtein`. - added normalized version of the hamming distance in `string_metric.normalized_hamming` - process.extract_iter as a generator, that yields the similarity of all elements, that have a similarity >= score_cutoff - multiple bugs in extractOne when used with a scorer, thats not from RapidFuzz - fixed bug in `token_ratio` - fixed bug in result normalisation causing zero division 2021-02-12 15:37:44 +00:00			`Submitting an issue`
			`^^^^^^^^^^^^^^^^^^^`
add documentation 2020-05-27 12:15:45 +00:00
			`Before you submit an issue, please search the issue tracker, maybe an issue for`
			`your problem already exists and the discussion might inform you of workarounds`
			`readily available.`