Commit Graph

10099 Commits

Author SHA1 Message Date
Nipun Sadvilkar 1f13005751 Incorrect Token attribute ent_iob_ description (#3800)
* Incorrect Token attribute ent_iob_ description

* Add spaCy contributor agreement
2019-05-31 16:50:45 +02:00
Ramanan Balakrishnan 26c37c5a4d fix all references to BILUO annotation format (#3797) 2019-05-31 12:19:19 +02:00
Ines Montani a7fd42d937 Make jsonschema dependency optional (#3784) 2019-05-30 14:34:58 +02:00
mak 89379a7fa4 Corrected example model URL in requirements.txt (#3786)
The URL used to show how to add a model to the requirements.txt had the old release path (excl. explosion).
2019-05-29 10:51:55 +02:00
Ines Montani a8416c46f7 Use string name in setup.py
Hopefully this will trick GitHub's parser into recognising it as a Python package and show us the dependents / "used by" statistics 🤞
2019-05-28 17:11:39 +02:00
Ujwal Narayan ed7be3f64c Update norm_exceptions.py (#3778)
* Update norm_exceptions.py

Extended the Currency set to include Franc, Indian Rupee, Bangladeshi Taka, Korean Won, Mexican Dollar, and Egyptian Pound

* Fix formatting [ci skip]
2019-05-27 11:52:52 +02:00
estr4ng7d 604acb6ace Marathi Language Support (#3767)
* Adding Marathi language details and folder to it

* Adding few changes and running tests

* Adding few changes and running tests

* Update __init__.py

mh -> mr

* Rename spacy/lang/mh/__init__.py to spacy/lang/mr/__init__.py

* mh -> mr
2019-05-24 14:29:42 +02:00
Ines Montani 7634812172 Document Language.evaluate 2019-05-24 14:06:36 +02:00
Ines Montani 45e6855550 Update Language.update docs 2019-05-24 14:06:26 +02:00
Ines Montani b78a8dc1d2 Update Scorer and add API docs 2019-05-24 14:06:04 +02:00
Ujwal Narayan 4d550a3055 Enhancing Kannada language Resources (#3755)
* Updated stop_words.py

Added more stopwords

* Create ujwal-narayan.md

Enhancing Kannada language resources
2019-05-20 12:56:10 +02:00
Ines Montani 321c9f5acc Fix lex_id docs (closes #3743) 2019-05-16 23:15:58 +02:00
BreakBB ed18a6efbd Add check for callable to 'Language.replace_pipe' to fix #3737 (#3741) 2019-05-14 16:59:31 +02:00
Ines Montani 8baff1c7c0
💫 Improve introspection of custom extension attributes (#3729)
* Add custom __dir__ to Underscore (see #3707)

* Make sure custom extension methods keep their docstrings (see #3707)

* Improve tests

* Prepend note on partial to docstring (see #3707)

* Remove print statement

* Handle cases where docstring is None
2019-05-12 00:53:11 +02:00
Ines Montani f96af8526a Merge branch 'spacy.io' [ci skip] 2019-05-11 23:03:56 +02:00
Matthew Honnibal 3aceeeaaeb Set version to v2.1.4 2019-05-11 22:57:53 +02:00
Ines Montani aea1c93a05 Replace cytoolz.partition_all with util.minibatch 2019-05-11 21:12:09 +02:00
Ines Montani 0bf6441863 Fix .iob converter (closes #3620) 2019-05-11 19:15:26 +02:00
Matthew Honnibal f6e9394aa5 Fix push-tag script 2019-05-11 19:04:35 +02:00
Matthew Honnibal a5159ddcf5 Set version to v2.1.4.dev1 2019-05-11 19:03:51 +02:00
Ines Montani 7534f7cb44 Fix return value of Language.update (closes #3692) 2019-05-11 18:40:19 +02:00
Ines Montani 503b8c85f1 Add TWiML podcast to universe [ci skip] 2019-05-11 17:48:22 +02:00
Ines Montani 0daf2422a3 Auto-format 2019-05-11 17:48:07 +02:00
Ines Montani 6b3a79ac96 Call rmtree and copytree with strings (closes #3713) 2019-05-11 15:48:35 +02:00
devforfu 21af12eb53 Make "text" key in JSONL format optional when "tokens" key is provided (#3721)
* Fix issue with forcing text key when it is not required

* Extending the docs to reflect the new behavior
2019-05-11 15:41:29 +02:00
Ines Montani 6cfa1e1f47 Fix DependencyParser.predict docs (resolves #3561) 2019-05-11 15:37:54 +02:00
Ines Montani 25f5592d57 Improve Token.prob and Lexeme.prob docs (resolves #3701) 2019-05-11 15:23:41 +02:00
Aaron Kub 719a15f23d fixing regex matcher examples (#3708) (#3719) 2019-05-10 14:23:52 +02:00
Luca Dorigo 82d034f976 Update glossary.py to match information found in documentation (#3704) (closes ##3679)
* Update glossary.py to match information found in documentation

I used regexes to add any dependency tag that was in the documentation but not in the glossary. Solves #3679 👍

* Adds forgotten colon
2019-05-10 14:23:20 +02:00
Wannaphong Phatthiyaphaibun 5a14a13f64 fix thai bug (#3693)
fix tokenize for pythainlp
2019-05-10 14:21:34 +02:00
Luca Dorigo 2663f4133c Submit contributor agreement (#3705) 2019-05-10 14:19:18 +02:00
Ines Montani 65b55f1aaa Add version tag to `--base-model` argument (closes #3720) 2019-05-10 14:06:47 +02:00
richardpaulhudson a1e07f0d14 Request to include Holmes in spaCy Universe (#3685)
* Request to add Holmes to spaCy Universe

Dear spaCy team, I would be grateful if you would consider my Python library Holmes for inclusion in the spaCy Universe. Holmes transforms the syntactic structures delivered by spaCy into semantic structures that, together with various other techniques including ontological matching and word embeddings, serve as the basis for information extraction. Holmes supports several use cases including chatbot, structured search, topic matching and supervised document classification. I had the basic idea for Holmes around 15 years ago and now spaCy has made it possible to build an implementation that is stable and fast enough to actually be of use - thank you! At present Holmes supports English and German (I am based in Munich) but could easily be extended to support any other language with a spaCy model.

* Added
2019-05-08 02:42:03 +02:00
Ines Montani 505c9e0e19 Add util.filter_spans helper (#3686) 2019-05-08 02:33:40 +02:00
F0rge1cE dd1e6b0bc6 Fix offset bug in loading pre-trained word2vec. (#3689)
* Fix offset bug in loading pre-trained word2vec.

* add contributor agreement
2019-05-06 23:00:38 +02:00
Bram Vanroy 8e6f8deaf6 Re-added Universe readme (#3688) (closes #3680) 2019-05-06 21:08:01 +02:00
Ines Montani 78cb807a9a Auto-format [ci skip] 2019-05-06 16:58:29 +02:00
Ines Montani dd153b2b33 Simplify helper (see #3681) [ci skip] 2019-05-06 15:13:10 +02:00
Ines Montani f8fce6c03c Fix typo (see #3681) 2019-05-06 15:02:11 +02:00
Ines Montani f2a56c1b56 Rewrite example to use Retokenizer (resolves #3681)
Also add helper to filter spans
2019-05-06 14:51:18 +02:00
Brad Jascob 955b95cb8b Fix inconsistant lemmatizer issue #3484 (#3646)
* Fix inconsistant lemmatizer issue #3484

* Remove test case
2019-05-04 18:16:03 +02:00
Ines Montani b4d142e3c4 Adjust wording and formatting [ci skip] 2019-05-03 12:00:31 +02:00
Ines Montani 04658ebbb2 Relax jsonschema pin (closes #3628) 2019-05-03 11:58:58 +02:00
d5555 ba4bcbf285 Update universe.json (#3653) [ci skip]
* Update universe.json

* Update universe.json
2019-05-03 11:50:12 +02:00
Dobita21 f95ecedd83 Add Thai lex_attrs (#3655)
* test sPacy commit to git fri 04052019 10:54

* change Data format from my format to master format

* ทัทั้งนี้ ---> ทั้งนี้

* delete stop_word translate from Eng

* Adjust formatting and readability

* add Thai norm_exception

* Add Dobita21 SCA

* editรึ : หรือ,

* Update Dobita21.md

* Auto-format

* Integrate norms into language defaults

* add acronym and some norm exception words

* add lex_attrs

* Add lexical attribute getters into the language defaults

* fix LEX_ATTRS


Co-authored-by: Donut <dobita21@gmail.com>
Co-authored-by: Ines Montani <ines@ines.io>
2019-05-01 12:03:14 +02:00
张晓飞 ba1ff00370 update response after calling add_pipe (#3661)
* update response after calling add_pipe

component:print_info is appened in the last, so need show it at the end of  pipeline

* Create henry860916.md
2019-05-01 12:02:18 +02:00
BreakBB 8952004dfc Update French example sents and add two German stop words (#3662)
* Update french example sentences

* Add 'anderem' and 'ihren' to German stop words
2019-05-01 12:01:35 +02:00
Ramiro Gómez 8ee4100f8f Remove dangling M (#3657)
I assume this is a typo. Sorry if it has a meaning that I'm not aware of.
2019-04-29 19:44:43 +02:00
Amit Chaudhary 167d63af31 Fix broken link to Dive Into Python 3 website (#3656)
* Fix broken link to Dive Into Python 3 website

* Sign spaCy Contributor Agreement
2019-04-29 19:44:00 +02:00
Ramiro Gómez e7e5999ddc Create yaph.md so I can contribute (#3658) 2019-04-29 19:43:06 +02:00