Commit Graph

325 Commits

Author SHA1 Message Date
ines d84b13e02c Merge branch 'master' into develop 2018-07-18 18:57:00 +02:00
Ole Henrik Skogstrøm 6e2930a4a2 Conll(u)-bio converter (#2525)
* Started simple conllxbiluo converter

* Fix missing BIO to BILUO conversion
2018-07-18 18:55:42 +02:00
Matthew Honnibal 8ae1bec8bf Fix init_model 2018-07-05 14:02:06 +02:00
Matthew Honnibal dee8bdb900 Fix init-model for npz vectors 2018-07-04 02:29:48 +02:00
Matthew Honnibal 59d655e8d0 Fix model init from jsonl 2018-07-04 01:30:40 +02:00
Matthew Honnibal 1e38bea6e9 Save vectors init 2018-07-03 23:55:04 +02:00
Matthew Honnibal 6692833887 Fix init_model 2018-07-03 23:24:11 +02:00
Matthew Honnibal 4a38a26cb5 Fix init_model 2018-07-03 22:57:11 +02:00
Matthew Honnibal 019d09e3c3 Fix init model 2018-07-03 22:16:44 +02:00
Matthew Honnibal 2543f8c93a Support .npz vectors in init-model command 2018-07-03 21:42:16 +02:00
Matthew Honnibal 86aad11939 Fix init_model arg 2018-07-03 17:00:42 +02:00
Matthew Honnibal eff42d36e3 Fix init model command 2018-07-03 16:32:23 +02:00
Matthew Honnibal 6a89faf12e Add support for jsonl-formatted lexical attributes to init-model command. 2018-07-03 12:22:56 +02:00
Matthew Honnibal c83fccfe2a Fix output of best model 2018-06-25 23:05:56 +02:00
Matthew Honnibal 69c900f003 Fix init-model if no vectors provided 2018-06-25 18:26:02 +02:00
Matthew Honnibal 664f89327a Fix init-model if no vectors provided 2018-06-25 17:58:45 +02:00
Matthew Honnibal c4698f5712 Don't collate model unless training succeeds 2018-06-25 16:36:42 +02:00
Matthew Honnibal 24dfbb8a28 Fix model collation 2018-06-25 14:35:24 +02:00
Matthew Honnibal 62237755a4 Import shutil 2018-06-25 13:40:17 +02:00
Matthew Honnibal a040fca99e Import json into cli.train 2018-06-25 11:50:37 +02:00
Matthew Honnibal 2c703d99c2 Fix collation of best models 2018-06-25 01:21:34 +02:00
Matthew Honnibal 2c80b7c013 Collate best model after training 2018-06-24 23:39:52 +02:00
ines 330c039106 Merge branch 'master' into develop 2018-05-26 18:30:52 +02:00
James Messinger 4515e96e90 Better formatting for `spacy train` CLI (#2357)
* Better formatting for `spacy train` CLI

Changed to use fixed-spaces rather than tabs to align table headers and data.

### Before:
```
Itn.    P.Loss  N.Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %
0       4618.857        2910.004        76.172  79.645  67.987  88.732  88.261  100.000 4436.9  6376.4
1       4671.972        3764.812        74.481  78.046  62.374  82.680  88.377  100.000 4672.2  6227.1
2       4742.756        3673.473        71.994  77.380  63.966  84.494  90.620  100.000 4298.0  5983.9
```

### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```

* Added contributor file
2018-05-25 13:08:45 +02:00
Matthew Honnibal ce458c2428 Fix spacy requirement constraint in package template 2018-05-22 20:50:46 +02:00
Matthew Honnibal f3b4f6a4ec Merge setup.py 2018-05-20 23:21:00 +02:00
Ines Montani d4cc736b7c 💫 Improve model downloads: check for existing install, customise pip and use requests library again (#2346)
* Go back to using requests instead of urllib (closes #2320)

Fewer dependencies are good, but this one was simply causing too many other problems around SSL verification and Python 2/3 compatibility. requests is a popular enough package that it's okay for spaCy to depend on it – and this will hopefully make model downloads less flakey.

* Only download model if not installed (see #1456)

Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience.

* Pass additional options to pip when installing model (resolves #1456)

Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example:

python -m spacy download en --user

* Add CLI option to enable installing model package dependencies

* Revert "Add CLI option to enable installing model package dependencies"

This reverts commit 9336ffe695.

* Update documentation
2018-05-20 20:26:56 +02:00
Matthew Honnibal 74d5c625b3 Use rising beam update prob 2018-05-16 20:11:59 +02:00
Matthew Honnibal dc1a479fbd Merge branch 'develop' into feature/refactor-parser 2018-05-15 18:39:21 +02:00
Matthew Honnibal 546dd99cdf Merge master into develop -- mostly Arabic and website 2018-05-15 18:14:28 +02:00
Matthew Honnibal a6ae1ee6f7 Don't modify Token in global scope 2018-05-09 00:43:00 +02:00
Matthew Honnibal f94f721f40 Avoid importing fused token symbol in ud-run-test, untl that's added 2018-05-09 00:28:03 +02:00
Matthew Honnibal 659ec5b975 Avoid importing fused token symbol in ud-run-test, untl that's added 2018-05-08 19:40:33 +02:00
Matthew Honnibal fc4dd49b77 Support oracle segmentation in ud-train CLI command 2018-05-08 13:47:45 +02:00
ines 7a3599c21a Fix formatting and consistency 2018-05-07 23:02:11 +02:00
Matthew Honnibal eddc0e0c74 Set gold.sent_starts in ud_train 2018-05-07 15:52:47 +02:00
G.Pruvost cc8e804648 #2211 - Support for ssl certs config on download command (#2212)
* Add support for SSL/Certs customization on download CLI

* Add a note on SSL options for the 'download' CLI in the README

* Add contributor agreement
2018-05-03 18:37:02 +02:00
Matthew Honnibal 723b328062 Add script to run UD test 2018-04-29 15:50:25 +02:00
Matthew Honnibal 17af6aa3a4 Update ud_train script 2018-04-29 15:49:32 +02:00
Matthew Honnibal 2c4a6d66fa Merge master into develop. Big merge, many conflicts -- need to review 2018-04-29 14:49:26 +02:00
ines 3c80f69ff5 Return data in cli.info and add silent option (resolves #2196) 2018-04-29 01:59:44 +02:00
ines 0299d5fac8 Update argument annotations and formatting 2018-04-10 21:45:11 +02:00
ines 49b1e48bf5 Fix syntax error 2018-04-10 21:44:59 +02:00
ines 70052e46e9 Fix formatting [ci skip] 2018-04-10 21:42:46 +02:00
Matthew Honnibal 0ddb152be0 Improve error message when reading vectors 2018-04-10 21:26:50 +02:00
Matthew Honnibal db50ac524e Support zipped vector files in init-model 2018-04-10 21:21:00 +02:00
ines 270fcfd925 Fix typo in package command message (closes #2200) 2018-04-10 19:14:31 +02:00
ines 24d8bf348d Revert "Add support for .zip to init_model"
This reverts commit 7ee880a0ad.
2018-04-10 19:08:06 +02:00
Matthew Honnibal 7ee880a0ad Add support for .zip to init_model 2018-04-10 14:30:04 +00:00
Ines Montani 3141e04822
💫 New system for error messages and warnings (#2163)
* Add spacy.errors module

* Update deprecation and user warnings

* Replace errors and asserts with new error message system

* Remove redundant asserts

* Fix whitespace

* Add messages for print/util.prints statements

* Fix typo

* Fix typos

* Move CLI messages to spacy.cli._messages

* Add decorator to display error code with message

An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.

* Remove unused link in spacy.about

* Update errors for invalid pipeline components

* Improve error for unknown factories

* Add displaCy warnings

* Update formatting consistency

* Move error message to spacy.errors

* Update errors and check if doc returned by component is None
2018-04-03 15:50:31 +02:00