Commit Graph

14735 Commits

Author SHA1 Message Date
Ines Montani 3a701d3645
Merge pull request #8841 from adrianeboyd/docs/ent-id-sep [ci skip]
Fix formatting of ent_id_sep in EntityRuler API docs
2021-07-30 09:09:25 +10:00
Ines Montani f08be084fb
Merge pull request #8844 from thomashacker/bugfix/fix-doc-transformer-typo [ci skip]
Fix typo in Tok2VecTransformer example config
2021-07-30 09:08:59 +10:00
thomashacker 02258916c8 Fix example config typo for transformer architecture 2021-07-29 11:19:40 +02:00
Adriane Boyd 15b12f3e35 Fix formatting of ent_id_sep in EntityRuler API docs 2021-07-29 10:10:12 +02:00
Ines Montani 0a1e299d30
Merge pull request #8814 from polm/docs/migrate-lexeme-tables [ci skip] 2021-07-29 17:18:02 +10:00
Paul O'Leary McCann 8867e60fbb
Update website/docs/usage/v3.md
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-29 14:56:56 +09:00
Adriane Boyd 8547514aa4
Remove labels from textcat component config example (#8815) 2021-07-27 13:14:38 +02:00
Paul O'Leary McCann 76ac95923a Add note to migration guide about lexeme tables (fix #7290)
This just adds the resolution from #6388 to the docs.
2021-07-27 19:19:25 +09:00
Paul O'Leary McCann 67ecdcc3ac
Update subset/superset docs (#8795)
* Update subset/superset docs

* Update website/docs/usage/rule-based-matching.md

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2021-07-27 12:08:46 +02:00
Ines Montani 7f21c7dfa2
Merge pull request #8794 from explosion/autoblack
Auto-format code with black
2021-07-27 12:17:15 +10:00
Ines Montani 34c401f04f
Merge pull request #8801 from polm/fix/respect-no-skip (fixes #8796)
Respect the no_skip value
2021-07-27 12:16:47 +10:00
Ines Montani 134cb06af3
Merge pull request #8808 from kevinlu1248/master [ci skip]
Changed a CLI command in data-formats.md due to erroneous information
2021-07-27 12:15:16 +10:00
Ines Montani 9bf0d6f2fd
Merge pull request #8806 from Ledenel/master [ci skip]
fix typo
2021-07-27 12:14:22 +10:00
Kevin Lu 4a8e9e4e4e
Update data-formats.md 2021-07-25 22:58:53 -07:00
Ledenel 413f745c68 fix broken example in spaCy universe Chatterbot 2021-07-25 15:53:32 +00:00
Paul O'Leary McCann 284b530c63 Respect the no_skip value
Seems like the logic for this was just left out. See #8796.
2021-07-24 15:31:17 +09:00
explosion-bot a58ab6ea22 Auto-format code with black 2021-07-23 08:04:09 +00:00
Adriane Boyd 6bbc2b1956
Reload train corpus in debug data after initialize (#8776) 2021-07-21 22:38:40 +02:00
Adriane Boyd d48c01a6f7
Remove extraneous grc test file (#8768) 2021-07-20 15:51:15 +02:00
Sofie Van Landeghem ffaead8fe0
bump to 3.1.1 2021-07-19 14:48:27 +02:00
Sofie Van Landeghem 83e27d262e
negative tag annotation (#8731)
* unit test to unlearn tag via negative annotation

* bump thinc to 8.0.8
2021-07-19 14:39:11 +02:00
Adriane Boyd 0e4b96c97e
Update lexeme ranks for loaded vectors (#8640)
Update the ranks for any lexemes that have been added to the vocab
before the vectors are added to the model.
2021-07-19 18:25:54 +10:00
Adriane Boyd e532c69475
Update Language.replace_pipe for disabled components (#8729)
* Fix the index where the replacement in inserted to account for
disabled components
* Allow `Language.replace_pipe` to replace disabled components
2021-07-19 18:06:12 +10:00
Paul O'Leary McCann d717593eb7
Merge pull request #8754 from KennethEnevoldsen/patch-1
[minor] removed outdated spacy version for spacymoji
2021-07-18 19:17:33 +09:00
Paul O'Leary McCann ac67639eaf
Merge pull request #8755 from KennethEnevoldsen/patch-2
fixed GitHub link and thumbnail
2021-07-18 19:14:57 +09:00
Kenneth Enevoldsen 5d6aed0773
fixed GitHub link and thumbnail
Sorry, I seem to have misunderstood that the GitHub reference shouldn't be a link.
2021-07-18 10:22:00 +02:00
Ines Montani f90482d077 Tidy up and auto-format 2021-07-18 15:44:56 +10:00
Ines Montani 313f55e560 Fix JSON [ci skip] 2021-07-18 13:21:33 +10:00
Ines Montani 51e5903d6f
Merge pull request #8702 from KennethEnevoldsen/master [ci skip] 2021-07-18 13:18:42 +10:00
Kenneth Enevoldsen 8546948fba
removed outdated spacy version for spacymoji
From the documentation of spacymoji (and the requirements.txt) it seems like it is not only for version 2.
2021-07-17 15:19:43 +02:00
Kenneth Enevoldsen a0e0ccdb46
Update website/meta/universe.json
Co-authored-by: Ines Montani <ines@ines.io>
2021-07-17 07:14:46 +02:00
Ines Montani c0f436efbc
Merge pull request #8735 from explosion/autoblack 2021-07-17 13:46:17 +10:00
Ines Montani 483f3175cb Tidy up [ci skip] 2021-07-17 13:43:15 +10:00
Ines Montani 15e6578f7d
Adjust formatting 2021-07-17 10:49:13 +10:00
Mario Šaško 1ba2e8a646
Add TakeLab/spacy-udpipe to Universe (#8698)
* Add TakeLab/spacy-udpipe to universe

* Add SCA

* Sign SCA
2021-07-16 11:15:52 +02:00
explosion-bot eff3d1088b Auto-format code with black 2021-07-16 08:03:36 +00:00
Adriane Boyd f5acc48111
Remove TrainablePipe as base class for Lemmatizer in API docs (#8725) 2021-07-15 16:41:36 +02:00
Adriane Boyd ac45c7c045
Add pre-commit to ignored requirements (#8728) 2021-07-15 16:41:15 +02:00
jmyerston 993b0fab0e
Added ancient Greek language support (#8606)
* Add ancient Greek language support

Initial commit

* Contributor Agreement

* grc tokenizer test added  and files formatted with black, unnecessary import removed

Co-Authored-By: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Commas in lists fixed. __init__py added to test

* Update lex_attrs.py

* Update stop_words.py

* Update stop_words.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-15 10:27:17 +02:00
Sofie Van Landeghem 77859beb99
spacy.ngram_range_suggester.v1 (#8699) 2021-07-15 10:01:22 +02:00
Julien Rossi e117573822
Adding noun_chunks to the DUTCH language model (nl) (#8529)
*  implement noun_chunks for dutch language

* copy/paste FR and SV syntax iterators to accomodate UD tags
* added tests with dutch text
* signed contributor agreement

* 🐛 fix noun chunks generator

* built from scratch
* define noun chunk as a single Noun-Phrase
* includes some corner cases debugging (incorrect POS tagging)
* test with provided annotated sample (POS, DEP)

*  fix failing test

* CI pipeline did not like the added sample file
* add the sample as a pytest fixture

* Update spacy/lang/nl/syntax_iterators.py

* Update spacy/lang/nl/syntax_iterators.py

Code readability

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/tests/lang/nl/test_noun_chunks.py

correct comment

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* finalize code

* change "if next_word" into "if next_word is not None"

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2021-07-14 14:01:02 +02:00
Ines Montani 2a8eeed5da
Merge pull request #8703 from thomashacker/update/spacy-stanza [ci skip]
Update spacy-stanza universe.json
2021-07-13 19:03:42 +10:00
thomashacker aafb89df78 Update universe.json code_example 2021-07-13 10:22:49 +02:00
KennethEnevoldsen e5127992a0 added agreement 2021-07-13 10:11:02 +02:00
Kenneth Enevoldsen 94ce904e10
added missing comma 2021-07-13 09:59:34 +02:00
Kenneth Enevoldsen a81fcc81b0
added dacy to universe 2021-07-13 09:54:08 +02:00
Adriane Boyd f9fd2889b7
Use 0-vector for OOV lexemes (#8639) 2021-07-13 14:48:12 +10:00
Edward 8233359225
Fix preservation of spacy package meta (#8663)
* update package meta with existing_meta and nlp_meta

* Add spaCy contributor agreement

* Added more info when creating readme
2021-07-12 11:18:52 +02:00
Paul O'Leary McCann 1c70c87daf
Fix autoblack
The conditional needs double equals.
2021-07-10 16:02:39 +09:00
Ines Montani 616f4de034
Merge pull request #8674 from polm/fix/autoblack-no-forks [ci skip]
Make the autoblack job not run on forks
2021-07-10 16:41:59 +10:00