Commit Graph

77 Commits

Author SHA1 Message Date
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal 3bf4a28d8d Use tag in CoNLL converter, not POS 2017-05-17 12:04:33 +02:00
Matthew Honnibal 8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal 5211645af3 Get data flowing through pipeline. Needs redesign 2017-05-16 11:21:59 +02:00
Matthew Honnibal a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
ines 9d85cda8e4 Fix models error message and use about.__docs_models__ (see #1051) 2017-05-13 13:05:47 +02:00
ines 4eefb288e3 Port over PR #1055 2017-05-13 03:25:32 +02:00
ines 95edd9e896 Let parse_package_meta take full path 2017-05-08 15:30:48 +02:00
ines 59c3b9d4dd Tidy up CLI and fix print functions 2017-05-07 23:25:29 +02:00
ines 527d51ac9a Fetch shortcuts from GitHub and improve error handling 2017-04-26 18:00:28 +02:00
Matthew Honnibal 4f9657b42b Fix reporting if no dev data with train 2017-04-23 22:27:10 +02:00
ines 3a9710f356 Pass dev_scores to print_progress correctly (resolves #1008)
Only read scores attribute if command is used with dev_data, otherwise
default dev_scores to empty dict.
2017-04-23 15:58:40 +02:00
ines 25c70b4cc5 Move fix_text to spacy.compat (see #1002) 2017-04-20 15:47:17 +02:00
Gyorgy Orosz 4a06a2572c Using ftfy for handling broken encoded strings. 2017-04-20 13:34:51 +02:00
ines 48da244058 Use spacy.compat.json_dumps for Python 2/3 compatibility (resolves #991) 2017-04-19 11:50:36 +02:00
ines 82f5f1f98f Replace str with compat.unicode_ 2017-04-17 01:29:54 +02:00
Matthew Honnibal 17c9fffb9e Fix naked except 2017-04-16 15:28:16 -05:00
ines 6145b7c153 Remove redundant Path 2017-04-16 20:53:25 +02:00
Matthew Honnibal 89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines 8191e33cf1 Update link error message with info on permissions 2017-04-16 13:32:31 +02:00
ines a3ddbc0444 Add note about --force flag to error message 2017-04-16 13:14:36 +02:00
ines e3de035814 Add meta validation to check for required settings
Complain if no "lang", "name" or "version" is found (those settings are
used in directory / package names). Package will still build without,
but it'll inevitably fail somewhere down the line.
2017-04-16 13:13:17 +02:00
ines a7574b7572 Add more options to read in meta data in package command
Add meta option to supply path to meta.json. If no meta path is set,
check if meta.json exists in input directory and use it. Otherwise,
prompt for details on the command line.
2017-04-16 13:06:02 +02:00
ines 13c8a42d2b Fix typos 2017-04-16 13:03:58 +02:00
ines 35fb4febe2 Fix whitespace 2017-04-15 12:13:45 +02:00
ines c05ec4b89a Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
ines 84341c2975 Only compile list of models if data_path exists 2017-04-14 16:48:02 +02:00
Gyorgy Orosz dd3244c08a Made json dump to produce unicode strings in py2 2017-04-13 23:30:47 +02:00
Gyorgy Orosz a9469c8173 Fixed typo 2017-04-13 15:24:14 +02:00
ines 41037f0f07 Remove unused imports 2017-04-13 13:52:11 +02:00
ines 1b92c8d5d5 Use unicode paths on Windows/Python 2 and catch other errors (resolves #970)
try/except here is quite dirty, but it'll at least make sure users see
an error message that explains what's going on
2017-04-10 17:49:51 +02:00
ines 7ea1673072 Fix whitespace 2017-04-07 13:28:48 +02:00
ines 255650dbc2 Add connlu2json converter from explosion/spacy-dev-resources/#11 2017-04-07 13:05:12 +02:00
ines 789ce8a45e Add convert command 2017-04-07 13:04:17 +02:00
ines 9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
ines dcf8ab0c47 Merge branch 'develop' 2017-04-07 12:00:09 +02:00
Joshua Reeter 564daf6dec Issue #934 symlink should not convert paths as_posix under windows. 2017-03-30 23:47:45 -05:00
ines 4759fd437d Merge branch 'master' into develop 2017-03-29 10:37:13 +02:00
Grégory Howard 9c2996b27f correction of package.py (encoding on open instead of write) 2017-03-29 09:11:02 +02:00
ines 7198cf1c8a Remove unused import 2017-03-26 20:56:05 +02:00
ines 7ceaa1614b Add experimental model init command 2017-03-26 20:51:40 +02:00
Matthew Honnibal 2efdbc08ff Make training work with directories 2017-03-26 08:46:44 -05:00
Matthew Honnibal 9dcb58aaaf Merge CLI changes 2017-03-26 07:30:45 -05:00
Matthew Honnibal 6b7f7a2060 Connect parser L1 option to train CLI 2017-03-26 07:24:07 -05:00
Matthew Honnibal dec5571bf3 Update train CLI 2017-03-26 07:16:52 -05:00
ines 53cf2f1c0e Make dev data optional 2017-03-26 11:48:17 +02:00
Matthew Honnibal 5eac089fbe Merge branch 'master' into develop 2017-03-26 04:45:43 -05:00
ines 97814f8da6 Update Windows Python 2 link workaround to use helper functions 2017-03-25 14:04:27 +01:00