project: pySBD - Python Sentence Boundary Disambiguation (#4455)

*   project: pySBD - Python Sentence Boundary Disambiguation

* 📝  Update links and description

* 🐛  Fix missing comma

* Update universe.json

pysbd as a spacy component through entrypoints

* 🚨  Fix universe.json

* 📝  Update code_example
Nipun Sadvilkar 2019-10-30 16:43:29 +05:30 committed by Ines Montani
parent c2f5f9f572
commit 2a5e71232b
1 changed file with 26 additions and 0 deletions
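
The "Fix missing comma" and "Fix universe.json" commits above suggest the entry was hand-edited until the JSON parsed. As a minimal sketch (not part of this commit), a check along the following lines could catch such problems before pushing. The `REQUIRED_FIELDS` tuple and both helper functions are hypothetical names chosen for illustration, and joining `code_example` with newlines is an assumption about how the list is rendered, not something this commit confirms.

```python
import json

# Abridged copy of the entry added by this commit (description and author
# fields omitted, code_example shortened). json.loads raises
# json.JSONDecodeError on syntax slips such as a missing comma.
ENTRY = json.loads("""
{
  "id": "python-sentence-boundary-disambiguation",
  "github": "nipunsadvilkar/pySBD",
  "pip": "pysbd",
  "category": ["scientific"],
  "tags": ["sentence segmentation"],
  "code_example": [
    "nlp = spacy.blank('en')",
    "nlp.add_pipe(PySBDFactory(nlp))"
  ]
}
""")

# Hypothetical list of fields to insist on; this is an assumption for the
# sketch, not an official schema.
REQUIRED_FIELDS = ("id", "github", "pip", "category", "tags", "code_example")


def check_entry(entry):
    """Return a list of problems found in an entry (empty if none)."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry
    ]
    for name in ("category", "tags", "code_example"):
        if name in entry and not isinstance(entry[name], list):
            problems.append(f"{name} must be a list")
    return problems


def render_code_example(entry):
    """Join the code_example lines into one snippet (assumed rendering)."""
    return "\n".join(entry["code_example"])


problems = check_entry(ENTRY)
snippet = render_code_example(ENTRY)
print(problems)
print(snippet)
```

Keeping `code_example` as a list with one string per line keeps each JSON line short, which makes a stray or missing comma easier to spot in a diff.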

@@ -1801,6 +1801,32 @@
"github": "microsoft"
}
},
{
"id": "python-sentence-boundary-disambiguation",
"title": "pySBD - Python Sentence Boundary Disambiguation",
"slogan": "Rule-based sentence boundary detection that works out of the box",
"github": "nipunsadvilkar/pySBD",
"description": "pySBD is a 'real-world' sentence segmenter that extracts reasonable sentences when the format and domain of the input text are unknown. It is a rule-based algorithm built on [The Golden Rules](https://s3.amazonaws.com/tm-town-nlp-resources/golden_rules.txt) - a set of tests, developed by the [TM-Town](https://www.tm-town.com/) dev team, that check a segmenter's accuracy on edge-case scenarios. pySBD is a Python port of the Ruby gem [Pragmatic Segmenter](https://github.com/diasks2/pragmatic_segmenter).",
"pip": "pysbd",
"category": ["scientific"],
"tags": ["sentence segmentation"],
"code_example": [
"import spacy",
"from pysbd.utils import PySBDFactory",
"nlp = spacy.blank('en')",
"nlp.add_pipe(PySBDFactory(nlp))",
"",
"doc = nlp('My name is Jonas E. Smith. Please turn to p. 55.')",
"print(list(doc.sents))",
"# [My name is Jonas E. Smith., Please turn to p. 55.]"
],
"author": "Nipun Sadvilkar",
"author_links": {
"twitter": "nipunsadvilkar",
"github": "nipunsadvilkar",
"website": "https://nipunsadvilkar.github.io"
}
},
{
"id": "cookiecutter-spacy-fastapi",
"title": "cookiecutter-spacy-fastapi",