1607 Commits

Author SHA1 Message Date
Sofie Van Landeghem 1137420840 Small doc fixes (#5250)
* fix link

* torchtext instead tochtext
2020-04-03 13:01:43 +02:00
Nikhil Saldanha d1ddfa1cb7 update docs for EntityRecognizer.predict
return type was wrongly written as a tuple, changed to syntax.StateClass
2020-03-28 18:13:02 +01:00
Tiljander e53232533b Describing priority rules for overlapping matches (#5197)
* Describing priority rules for overlapping matches

* Create Tiljander.md

* Describing priority rules for overlapping matches

* Update website/docs/api/entityruler.md

Co-authored-by: Ines Montani <ines@ines.io>
2020-03-26 13:13:22 +01:00
adrianeboyd d88a377bed Remove Vectors.from_glove (#5209) 2020-03-26 10:45:47 +01:00
Baciccin 3b53617a69 Add Ligurian language 2020-03-19 21:37:01 -07:00
Ines Montani 80e7e1347e Update universe.json [ci skip] 2020-03-17 22:21:34 +01:00
Ines Montani eda6eff8b1 Update universe.json [ci skip] 2020-03-17 22:19:29 +01:00
Ines Montani 16e7301d34 Merge pull request #5161 from pmbaumgartner/master
add gobbli to spacy-universe 🥳
2020-03-17 22:18:30 +01:00
Peter B b04057c204 add mentions of spaCy use 2020-03-17 15:03:43 -04:00
Ines Montani b2b01a5c8b Update universe.json [ci skip] 2020-03-17 19:53:31 +01:00
Peter B d2ffb406ad add gobbli to spacy-universe 🥳 2020-03-17 08:30:29 -04:00
Ines Montani 17bd9ed84f Merge pull request #5153 from pinealan/fix/website-docs
Fix website typos and weird sentences
2020-03-16 15:03:01 +01:00
Ines Montani 2044216bd5 Merge pull request #5150 from sloev/master
add spacy_syllables to universe
2020-03-16 15:02:12 +01:00
Alan Chan 2124be100d Tweak run-on sentence 2020-03-15 03:45:20 +08:00
Alan Chan 7c3a4ce933 Missing word in api/cli doc 2020-03-15 03:45:20 +08:00
Alan Chan 36e3532475 Remove unfinished sentence 2020-03-15 03:45:17 +08:00
nihil 9cde7eb08c add spacy_syllables to universe + sign contributor agreement 2020-03-13 18:09:42 +01:00
Mark Abraham a0ffa346c0 Fix broken link in docs 2020-03-13 14:07:26 +01:00
Ines Montani c669435c62 Merge pull request #5125 from renaud/patch-1
small typo in code sample
2020-03-12 11:19:12 +01:00
svlandeg 1724a4f75b additional information if doc is empty 2020-03-09 18:08:18 +01:00
Renaud Richardet eccf6b1686 small typo in code sample 2020-03-09 14:49:11 +01:00
Ines Montani 1d6aec805d Fix formatting and update docs for v2.2.4 2020-03-09 11:17:20 +01:00
David Pollack 80004930ed fix typo in svg file 2020-03-05 17:04:33 +01:00
Ines Montani acb4e3c7ba Merge pull request #5039 from adrianeboyd/typo/website-token-api-shape
Fix formatting in Token API
2020-02-25 14:57:25 +01:00
Ines Montani 4890db6339 Auto-format and fix image [ci skip] 2020-02-23 13:56:50 +01:00
Sofie Van Landeghem 479bd8d09f add lemma option to displacy 'dep' visualiser (#5041)
* add lemma option to displacy 'dep' visualiser

* more compact list comprehension

* add option to doc

* fix test and add lemmas to util.get_doc

* fix capital

* remove lemma from get_doc

* cleanup
2020-02-22 14:11:51 +01:00
Adriane Boyd 3853d385fa Fix formatting in Token API 2020-02-20 13:41:24 +01:00
Kabir Khan f6ed07b85c Use nlp.pipe in EntityRuler for phrase patterns in add_patterns (#4931)
* Fix ent_ids and labels properties when id attribute used in patterns

* use set for labels

* sort ent_ids for comparison in entity_ruler tests

* fixing entity_ruler ent_ids test

* add to set

* Run make_doc optimistically if using phrase matcher patterns.

* remove unused coveragerc I was testing with

* format

* Refactor EntityRuler.add_patterns to use nlp.pipe for phrase patterns. Improves speed substantially.

* Removing old add_patterns function

* Fixing spacing

* Make sure token_patterns loaded as well, before generator was being emptied in from_disk
2020-02-16 18:17:47 +01:00
nlptechbook 979a3fd1f5 Update universe.json (#5022)
e-book is available from https://nostarch.com/NLPPython
2020-02-15 15:44:55 +01:00
Julin S 479e81bafc fix link (#4977) 2020-02-10 20:31:26 -05:00
Ines Montani 9c08d9baa3 Remove old sections [ci skip] (closes #4961) 2020-02-03 13:10:46 +01:00
Ines Montani abd5c06374 Adjust formatting [ci skip] 2020-02-03 13:00:02 +01:00
Martin A. Kayser 02a44c5be2 Adding a note on retrieving the string rep of the match_id (#4904)
Stolen from here: https://stackoverflow.com/questions/47638877/using-phrasematcher-in-spacy-to-find-multiple-match-types
2020-02-03 12:58:58 +01:00
Omri Mendels 6ff947e1f9 Added presidio-research to universe.json (#4950)
* Added presidio-research to universe.json

Added a reference to Presidio Research, the data-science toolbox for Microsoft Presidio.

* Updated url
2020-02-03 12:57:55 +01:00
Paco Nathan 49fefb6139 Submitting `PyTextRank` for inclusion in the spaCy uniVerse (#4942)
* submitting PyTextRank for consideration of including in the spaCy uniVerse

* including SCA
2020-01-28 11:37:54 +01:00
adrianeboyd 7ad000fce7 Update docs for train CLI --use_gpu option (#4927) 2020-01-20 17:02:47 +01:00
Bram Vanroy 718704022a Changes to spacy_conll in universe (#4914)
* Update information on spacy_conll

* Typo fix
2020-01-16 01:56:39 +01:00
Preston Badeer b216ff43c9 Update vectors-similarity.md (#4889)
These links are broken on the website, due to quotes around the URLs.
2020-01-08 16:49:40 +01:00
Geoffrey Gordon Ashbrook 53929138d7 remove extra word typo (#4875)
"let you find you"
2020-01-06 12:37:42 +01:00
Ines Montani 400257a802 Update index.md [ci skip] 2020-01-04 01:52:18 +01:00
Ivan Echevarria ef13e0c038 Add n_process to Language.pipe documentation (#4842) [ci skip]
* Add n_process to documentation

* Auto-format and add default [ci skip]

Co-authored-by: Ines Montani <ines@ines.io>
2019-12-29 14:23:33 +01:00
Ines Montani 1b838d1313 Divide models into core and starters [ci skip] 2019-12-21 14:10:22 +01:00
Sofie Van Landeghem 8ebbb85117 Documentation for PhraseMatcher constructor (#4826)
* add max_length as argument for init PhraseMatcher

* improve error message too
2019-12-20 23:00:04 +01:00
Ines Montani c466e02466 Update universe [ci skip] 2019-12-13 15:57:39 +01:00
Thiago Lages de Alencar a067ded495 Update doc.md (#4796) 2019-12-11 18:21:40 +01:00
Tclack88 ab8dc2732c Update token.md (#4767)
* Update token.md

documentation is confusing: A '?' is a right punct, but '¿' is a left punct

* Update token.md

add quotations around parentheses in `is_left_punct` and `is_right_punct` for clarification, ensuring the question mark that follows is not perceived as an example of left and right punctuation

* Move quotes into code block [ci skip]
2019-12-06 19:22:02 +01:00
Ines Montani bf611ebca7 Document jsonl option on converter [ci skip] 2019-12-06 19:17:45 +01:00
Nicolai Bjerre Pedersen de5453cdcb Fix link to user hooks in docs (#4778)
* Fix link to user hooks in docs

* Update mr_bjerre.md

Mistake in contributor agreement

* Apparently hard to get it right (wrong name of sca)
2019-12-06 19:17:12 +01:00
Ines Montani cbacb0f1a4 Update shape docs and examples (resolves #4615) [ci skip] 2019-11-23 17:16:55 +01:00
Paul O'Leary McCann f0e3e606a6 Replace mecab-python3 with fugashi for Japanese (#4621)
* Switch from mecab-python3 to fugashi

mecab-python3 has been the best MeCab binding for a long time but it's
not very actively maintained, and since it's based on old SWIG code
distributed with MeCab there's a limit to how effectively it can be
maintained.

Fugashi is a new Cython-based MeCab wrapper I wrote. Since it's not
based on the old SWIG code it's easier to keep it current and make small
deviations from the MeCab C/C++ API where that makes sense.

* Change mecab-python3 to fugashi in setup.cfg

* Change "mecab tags" to "unidic tags"

The tags come from MeCab, but the tag schema is specified by Unidic, so
it's more proper to refer to it that way.

* Update conftest

* Add fugashi link to external deps list for Japanese
2019-11-23 14:31:04 +01:00