Commit Graph

1419 Commits

Author SHA1 Message Date
Ines Montani aedae8b4c5 Update universe.json [ci skip] 2019-08-28 11:59:06 +02:00
Björn Böing bae0455f91 Fix visualizer options linking for displaCy. (#4202) 2019-08-27 14:04:28 +02:00
Ines Montani 8114933f01 Fix universe.json [ci skip] 2019-08-27 12:13:42 +02:00
Ines Montani 48385552c6 Update languages.json [ci skip] 2019-08-27 11:52:51 +02:00
yanaiela 5d7bc26735 new universe project - the numeric fused-head (#4192)
* new universe project

* Update website/meta/universe.json

Co-Authored-By: Ines Montani <ines@ines.io>

* Update website/meta/universe.json

Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-25 17:25:28 +02:00
Christos Aridas 61f5c007a0 DOC Fix pipeline functions examples (#4189) 2019-08-23 19:15:32 +02:00
Ines Montani b072c13017 Update universe with videos [ci skip] 2019-08-21 21:35:37 +02:00
Pavle Vidanović 4fe9329bfb Serbian language code update "rs" -> "sr" (#4159)
* Serbian stopwords added. (cyrillic alphabet)

* spaCy Contribution agreement included.

* Test initialize updated

* Serbian language code update. --bugfix
2019-08-21 19:57:37 +02:00
adrianeboyd 8fe7bdd0fa Improve token pattern checking without validation (#4105)
* Fix typo in rule-based matching docs

* Improve token pattern checking without validation

Add more detailed token pattern checks without full JSON pattern validation and
provide more detailed error messages.

Addresses #4070 (also related: #4063, #4100).

* Check whether top-level attributes in patterns and attr for PhraseMatcher are
  in token pattern schema

* Check whether attribute value types are supported in general (as opposed to
  per attribute with full validation)

* Report various internal error types (OverflowError, AttributeError, KeyError)
  as ValueError with standard error messages

* Check for tagger/parser in PhraseMatcher pipeline for attributes TAG, POS,
  LEMMA, and DEP

* Add error messages with relevant details on how to use validate=True or nlp()
  instead of nlp.make_doc()

* Support attr=TEXT for PhraseMatcher

* Add NORM to schema

* Expand tests for pattern validation, Matcher, PhraseMatcher, and EntityRuler

* Remove unnecessary .keys()

* Rephrase error messages

* Add another type check to Matcher

Add another type check to Matcher for more understandable error messages
in some rare cases.

* Support phrase_matcher_attr=TEXT for EntityRuler

* Don't use spacy.errors in examples and bin scripts

* Fix error code

* Auto-format

Also try get Azure pipelines to finally start a build :(

* Update errors.py


Co-authored-by: Ines Montani <ines@ines.io>
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2019-08-21 14:00:37 +02:00
Ines Montani 3134a9b6e0 Add section on expanding regex match to token boundaries (see #4158) [ci skip] 2019-08-21 12:53:31 +02:00
Ines Montani 072860fcd0 Auto-format [ci skip] 2019-08-20 14:46:41 +02:00
Andrei-Marius Avram 199589228e Added RONEC to spaCy Universe (#4151)
* Added RONEC to spaCy Universe

* Added contributor file

* Corrected date from .github/contributors/avramandrei.md

* Convert tabs to spaces

* Remove duplicate keys

Can only have one GitHub link unfortunately

* Also add models category

* Adjust ID

This is used to generate the URL, so a simpler string is better
2019-08-20 14:46:07 +02:00
Ines Montani fe230c8776 Fix typo [ci skip] 2019-08-20 13:02:05 +02:00
Daniel Bourke b0a28fd0de fix PhraseMatcher link typo (#4150)
/api/phtasematcher -> /api/phrasematcher
2019-08-20 13:01:43 +02:00
Ines Montani ce4c3e5204 Document force flag on set_extension (closes #4148) 2019-08-19 19:22:07 +02:00
Ines Montani 66aba2d676 Improve regex matching docs [ci skip] 2019-08-19 13:59:41 +02:00
Sofie Van Landeghem cc66f47893 Make enabling/disabling jupyter mode more explicit (#4144)
* make enabling/disabling jupyter mode more explicit

* markup fix
2019-08-19 11:53:34 +02:00
Ines Montani e520eb3f6c Make visualized NER examples more clear (closes #4104) [ci skip] 2019-08-18 16:29:29 +02:00
Jeno 91441f169c Update universe.json to include negspacy (#4132) 2019-08-16 17:48:17 +02:00
Ines Montani 1362f793cf Improve docs on phrase pattern attributes (closes #4100) [ci skip] 2019-08-11 11:13:49 +02:00
Ines Montani 1f4d8bf77e Update universe.json [ci skip] 2019-08-09 17:42:37 +02:00
ICLR&D 87e40b17a0 Add entry for Blackstone in universe.json (#4101)
* Add entry for Blackstone in universe.json

Add an entry for the Blackstone project. Checked JSON is valid.

* Create ICLRandD.md

* Fix indentation (tabs to spaces)

It looks like during validation, the JSON file automatically changed spaces to tabs. This caused the diff to show *everything* as changed, which is obviously not true. This hopefully fixes that.

* Try to fix formatting for diff

* Fix diff


Co-authored-by: Ines Montani <ines@ines.io>
2019-08-09 17:16:51 +02:00
Ines Montani a2ac2e873f Update Binder version [ci skip] 2019-08-08 13:03:45 +02:00
Ines Montani 3e60afacf9 Add Serbian to languages [ci skip] 2019-08-07 13:38:25 +02:00
Ines Montani 1dc28a9ecb Update Binder version [ci skip] 2019-08-07 13:38:12 +02:00
Ines Montani 8b4a0fabbb Adjust docs example [ci skip] 2019-08-07 00:46:47 +02:00
adrianeboyd 69aca7d839 Add validate option to EntityRuler (#4089)
* Add validate option to EntityRuler

* Add validate to EntityRuler, passed to Matcher and PhraseMatcher

* Add validate to usage and API docs

* Update website/docs/usage/rule-based-matching.md

Co-Authored-By: Ines Montani <ines@ines.io>

* Update website/docs/usage/rule-based-matching.md

Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-07 00:40:53 +02:00
Ines Montani 4ae320e5c2 Use consistent casing for entity ruler patterns (see #4063) [ci skip] 2019-08-06 12:20:22 +02:00
Ines Montani 223bde5cf6 Improve docs on matcher attributes [ci skip] (closes #4063) 2019-08-06 12:13:42 +02:00
Ines Montani 2bfae0b167 Auto-format 2019-08-06 12:13:31 +02:00
Ines Montani 7f3212e2f5
💫 Sync branches (#4084) [ci skip]
* Update from master

* Re-added Universe readme (#3688) (closes #3680)

* Fix typo

* Add version tag to `--base-model` argument (closes #3720)

* fixing regex matcher examples (#3708) (#3719)

* Improve Token.prob and Lexeme.prob docs (resolves #3701)

* Fix DependencyParser.predict docs (resolves #3561)

* Update languages.json


Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Aaron Kub <aaronkub@gmail.com>
2019-08-05 14:32:54 +02:00
Ines Montani 0f740fad1a Update universe.json [ci skip] 2019-08-05 14:30:07 +02:00
Ines Montani 0f76e0022d Update .tensor docs [ci skip] 2019-08-01 18:37:09 +02:00
Ines Montani 3072eb28c2 Support and render Markdown in model meta [ci skip] 2019-08-01 18:33:10 +02:00
Björn Böing a83c0add2e Add links to tokenizer API docs to refer relevant information. (#4064)
* Add links to tokenizer API docs to refer relevant information.

* Add suggested changes

Co-Authored-By: Ines Montani <ines@ines.io>
2019-08-01 14:28:38 +02:00
Ejar 2cdf7d39e7 Corrected imported fucntion (#4062)
The example showed an incorrected import
2019-08-01 12:43:36 +02:00
Mohammed Daudali 23ec07debd Correct typo for AllenAI url on homepage (#4050)
* Typo fix for AllenAI url

Changed incorrect home page url for AllenAI from appenai.org to allenai.org

* Sign contributor agreement

* Change date format
2019-07-31 00:16:33 +02:00
Ines Montani fcd2f7f656 Fix version introducing Span.ents (closes #4045) [ci skip] 2019-07-30 10:32:33 +02:00
Ines Montani fc69da0acb
💫 Support simple training format in nlp.evaluate and add tests (#4033)
* Support simple training format in nlp.evaluate and add tests

* Update docs [ci skip]
2019-07-27 17:30:18 +02:00
Ines Montani bd39e5e630 Add "Processing text" section [ci skip] 2019-07-25 17:38:03 +02:00
Ines Montani a5e3d2f318 Improve section on disabling pipes [ci skip] 2019-07-25 14:25:34 +02:00
Ines Montani 02e444ec7c Add section on special tokenizer component [ci skip] 2019-07-25 14:25:03 +02:00
Ines Montani 1fa6d6ba55 Improve consistency of docs examples [ci skip] 2019-07-25 14:24:56 +02:00
adrianeboyd 784a5f4284 Update GoldParse attributes in API docs (#4023)
* add `words`
* update name of entity list to `ner`

I think it might be a bit more consistent to have `ner` named `entities`
or `ents` (and `ents` is actually set somewhere to `None`, which is a
bit confusing), but it looks like renaming it would be a non-trivial
decision.
2019-07-25 12:14:02 +02:00
Adriane Boyd 6c5044ed2a Update annotation docs for German
- minor formatting fixes
- remove STTS tags not used in Tiger
- update list of dependency relations to match tiger2dep
2019-07-22 11:59:03 +02:00
adrianeboyd d2c474cbb7 Fix initial example in EntityRuler API docs (#3999) 2019-07-22 11:18:55 +02:00
Ines Montani 1167c303a0 Fix typos [ci skip] 2019-07-19 13:08:18 +02:00
BreakBB 6d9a7c0749 Add '--silent' argument to bash example of CLI Info 2019-07-19 10:00:45 +02:00
BreakBB c8ba0f690d Fix --force parameter of CLI package 2019-07-19 10:00:45 +02:00
Ines Montani a0acb1b3cd Also add infobox to API docs [ci skip] 2019-07-17 16:26:41 +02:00