Ines Montani
296446a1c8
Tidy up and improve docs and docstrings ( #3370 )
...
## Description
* tidy up and adjust Cython code to code style
* improve docstrings and make calling `help()` nicer
* add URLs to new docs pages to docstrings wherever possible, mostly to user-facing objects
* fix various typos and inconsistencies in docs
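As a rough illustration of the docstring points above, the sketch below shows a docstring that ends with a link to its docs page, so the link appears when calling `help()`. The class `Widget`, the helper `docs_url`, the `DOCS:` line convention as written here, and the URLs are hypothetical placeholders for illustration, not taken from the changed code.

```python
class Widget:
    """A container for a single label.

    DOCS: https://example.com/api/widget
    """

    def __init__(self, label):
        """Create a widget.

        label (str): The label to store.

        DOCS: https://example.com/api/widget#init
        """
        self.label = label


def docs_url(obj):
    """Return the URL from the first "DOCS:" line of a docstring, or None."""
    # Objects without a docstring have __doc__ set to None.
    for line in (obj.__doc__ or "").splitlines():
        line = line.strip()
        if line.startswith("DOCS:"):
            return line[len("DOCS:"):].strip()
    return None
```

Calling `help(Widget)` prints both docstrings, including the `DOCS:` lines, which is the kind of output the changes above aim to make nicer.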
### Types of change
enhancement, docs
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-03-08 11:42:26 +01:00
Ines Montani
daaeeb7a2b
Merge branch 'master' into develop
2019-03-07 22:07:31 +01:00
Adrien Ball
88909a9adb
Fix egg fragments in direct download ( #3369 )
...
## Description
The egg fragment in the URL must be of the form `#egg=package_name==version` instead of `#egg=package_name-version`.
With a wrong egg fragment, `pip` does not recognize the package and its version properly, so it re-downloads the package on every run.
I'm not sure how this should be tested properly.
Here is what I had before the fix when running the same direct download twice:
```
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm-2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 1.6MB/s
Generating metadata for package en-core-web-sm-2.0.0 produced metadata for project name en-core-web-sm. Fix your #egg=en-core-web-sm-2.0.0 fragments.
Installing collected packages: en-core-web-sm
Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm-2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 919kB/s
Generating metadata for package en-core-web-sm-2.0.0 produced metadata for project name en-core-web-sm. Fix your #egg=en-core-web-sm-2.0.0 fragments.
Requirement already satisfied (use --upgrade to upgrade): en-core-web-sm from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm-2.0.0 in ./venv3/lib/python3.6/site-packages
```
And after the fix:
```
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 1.1MB/s
Installing collected packages: en-core-web-sm
Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0
$ python -m spacy download en_core_web_sm-2.0.0 --direct
Looking in indexes: https://pypi.python.org/simple/
Requirement already satisfied: en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0 in ./venv3/lib/python3.6/site-packages (2.0.0)
```
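The fragment format described above can be sketched as follows. This is a minimal illustration and not the actual spaCy implementation: the helper names `split_model_id` and `build_direct_url` are hypothetical, and the sketch assumes the version is whatever follows the last hyphen in the model identifier.

```python
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def split_model_id(model_id):
    """Split "name-version" into (name, version) at the last hyphen."""
    name, _, version = model_id.rpartition("-")
    if not name or not version:
        raise ValueError("expected 'name-version', got %r" % model_id)
    return name, version


def build_direct_url(model_id):
    """Build a download URL whose egg fragment is "name==version"."""
    name, version = split_model_id(model_id)
    # The release path keeps the hyphenated identifier; only the egg
    # fragment uses "==" so that pip can match the installed version.
    return "{base}/{mid}/{mid}.tar.gz#egg={name}=={version}".format(
        base=BASE_URL, mid=model_id, name=name, version=version
    )


print(build_direct_url("en_core_web_sm-2.0.0"))
```

For `en_core_web_sm-2.0.0`, the resulting URL ends in `#egg=en_core_web_sm==2.0.0`, matching the fragment shown in the output after the fix.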
### Types of change
This is an enhancement, as it avoids unnecessary downloads of (potentially big) spaCy models that have already been downloaded.
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2019-03-07 21:07:19 +01:00
Ines Montani
96b91a8898
Fix noqa [ci skip]
2019-03-07 12:25:00 +01:00
Ines Montani
fa7314b221
Clarify train_path and dev_path format (see #3366 ) [ci skip]
2019-03-07 12:23:27 +01:00
Ines Montani
9d6ca18a10
Tidy up and only use self.vector once
2019-03-07 01:06:12 +01:00
Ines Montani
a8f1efd2f5
Merge branch 'master' into develop
2019-03-07 00:56:31 +01:00
Daniel King
5f40229397
Don't use numpy directly for similarity ( #3362 )
...
* Don't use numpy directly for similarity
* Contributor agreement
2019-03-06 22:58:38 +00:00
Ines Montani
e9babd9973
Update hyperparameters section (see #3352 )
2019-03-06 14:40:30 +01:00
Ines Montani
6bd34e9d54
Expose Japanese stop words ( closes #3346 )
2019-03-06 14:21:15 +01:00
Ines Montani
85deb96278
Fix whitespace
2019-03-06 14:20:34 +01:00
Ines Montani
48a206a95f
Fix displaCy visualizations in docs ( closes #3357 ) [ci skip]
2019-03-06 13:20:44 +01:00
Ines Montani
5eadf61327
Update pretraining docs on file format ( closes #3354 )
2019-03-04 16:30:13 +00:00
Ines Montani
23f6ebf0f3
Add missing " ( closes #3343 )
2019-02-27 16:37:03 +01:00
Ines Montani
533b580c19
Add test for stray print statements in languages (see #3342 )
2019-02-27 16:04:30 +01:00
Ines Montani
48a2046d1c
Remove stray print statement ( closes #3342 )
2019-02-27 15:35:04 +01:00
Ines Montani
07d7c0a1af
Fix whitespace
2019-02-27 15:34:21 +01:00
Ines Montani
9b62639d19
Auto-format [ci skip]
2019-02-27 14:24:55 +01:00
Matthew Honnibal
656edcb984
Set version to v2.1.0a10
2019-02-27 12:26:13 +01:00
Ines Montani
1d4ba7678f
Auto-format [ci skip]
2019-02-27 12:07:35 +01:00
Matthew Honnibal
f1d77eb140
💫 Improve handling of missing NER tags ( closes #2603 ) ( #3341 )
...
* Improve handling of missing NER tags
GoldParse can accept missing NER tags, if entities is provided
in BILUO format (rather than as spans). Missing tags can be provided
as None values.
Fix bug that occurred when first tag was a None value. Closes #2603 .
* Document specification of missing NER tags.
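To make the data shape concrete, here is a plain-Python sketch of per-token BILUO tags with missing values. It does not call spaCy; the example sentence, the labels, and the helper `annotated_indices` are assumptions for illustration only.

```python
words = ["In", "Berlin", ",", "Apple", "opened", "an", "office"]

# One tag per token. None means the tag is missing (unknown), whereas "O"
# asserts that the token is outside any entity. The first tag is None here,
# the case the bug fix above refers to.
biluo_tags = [None, "U-GPE", "O", "U-ORG", "O", None, None]


def annotated_indices(tags):
    """Return the indices of tokens that carry a known tag."""
    return [i for i, tag in enumerate(tags) if tag is not None]


assert len(words) == len(biluo_tags)
print([words[i] for i in annotated_indices(biluo_tags)])
```

A list like `biluo_tags` is the kind of value the message describes passing as `entities`, in place of character-offset spans.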
2019-02-27 12:06:32 +01:00
Ines Montani
c478a2ccb6
Update backwards incompat [ci skip]
2019-02-27 11:56:56 +01:00
Ines Montani
e359bdd0e3
Auto-format
2019-02-27 11:56:45 +01:00
Ines Montani
d7217513c9
Merge branch 'spacy.io' into develop [ci skip]
2019-02-27 11:42:10 +01:00
Matthew Honnibal
4a3371acd5
Make doc[0].is_sent_start == True ( closes #2869 ) ( #3340 )
...
* Make doc[0] have sent_start True. Closes #2869
* Document that doc[0].is_sent_start defaults True.
2019-02-27 11:17:17 +01:00
Matthew Honnibal
2d3ce89b78
Improve matcher tests re issue #3328
2019-02-27 10:25:56 +01:00
Matthew Honnibal
8d6954e0e7
Fix matcher bug #3328
2019-02-27 10:25:39 +01:00
Ines Montani
cb481aa1fe
Merge branch 'spacy.io' into develop [ci skip]
2019-02-26 16:51:22 +01:00
Ines Montani
aadf586789
Add xfailing test for #3331
2019-02-25 22:33:30 +01:00
Matthew Honnibal
002c24d8ea
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2019-02-25 21:55:43 +01:00
Matthew Honnibal
3cdd3eb518
Set version to v2.1.0a9
2019-02-25 21:55:19 +01:00
Ines Montani
2579ecbb63
Merge branch 'spacy.io' into develop [ci skip]
2019-02-25 21:41:51 +01:00
Matthew Honnibal
b449be0f04
Add comment re issue #3170
2019-02-25 21:24:03 +01:00
Matthew Honnibal
29fb7b4a16
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2019-02-25 21:22:02 +01:00
Matthew Honnibal
9ccd6a3062
Fix head-outside-sentence bug. Fixes #3170
2019-02-25 21:21:44 +01:00
Ines Montani
3379ebcaa4
Fix default prop [ci skip]
2019-02-25 20:29:11 +01:00
Ines Montani
e711969e3b
Add more human-readable class names [ci skip]
2019-02-25 20:22:40 +01:00
Ines Montani
162bd4d75b
💫 Add Algolia DocSearch ( #3332 )
...
* Add Algolia DocSearch
* Add human-readable selector for teaser
2019-02-25 20:11:11 +01:00
Matthew Honnibal
f2fae1f186
Add batch size argument to Language.evaluate(). Closes #3263
2019-02-25 19:30:33 +01:00
Ines Montani
f135d663f7
Update conftest.py
2019-02-25 15:55:29 +01:00
Ines Montani
76ce8b2662
Merge branch 'master' into develop
2019-02-25 15:54:55 +01:00
Julia Makogon
f1c3108d52
Fixing pymorphy2 dependency issue ( #3329 ) ( closes #3327 )
...
* Classes for Ukrainian; small fix in Russian.
* Contributor agreement
* pymorphy2 initialization split for ru and uk (#3327 )
* stop-words fixed
* Unit-tests updated
2019-02-25 15:48:17 +01:00
Ines Montani
1a735e0f1f
Add regression test for #3328
2019-02-25 10:12:58 +01:00
Ines Montani
1b6238101a
Add table explaining training metrics [ closes #2644 ]
2019-02-25 10:03:43 +01:00
Ines Montani
1981b194cc
Fix recomputing of :target [ci skip]
...
Prevents additional history entry
2019-02-25 10:03:20 +01:00
Ines Montani
55bb570f51
Add [ja] to extras_require
2019-02-25 09:37:05 +01:00
Ines Montani
dfbed07d3b
Remove unused temp errors
2019-02-24 22:26:08 +01:00
Ines Montani
d0b3af9222
Fix remaining inaccuracies in API docs ( closes #2329 )
2019-02-24 22:21:25 +01:00
Ines Montani
49d0938038
Update version [ci skip]
2019-02-24 22:01:47 +01:00
Ines Montani
62b558ab72
💫 Support lexical attributes in retokenizer attrs ( closes #2390 ) ( #3325 )
...
* Fix formatting and whitespace
* Add support for lexical attributes (closes #2390 )
* Document lexical attribute setting during retokenization
* Assign variable outside of nested loop
2019-02-24 21:13:51 +01:00