Commit Graph

79 Commits

Author SHA1 Message Date
Matthew Honnibal ec63f4fe7b Add option to control how missing entities are handled when getting NER tags 2017-07-29 21:58:37 +02:00
Matthew Honnibal 9bae0ddc50 Fix minibatching 2017-07-22 20:14:49 +02:00
Matthew Honnibal ed6c85fa3c Fix loading of text categories in GoldParse 2017-07-22 20:04:03 +02:00
Matthew Honnibal 7ea50182a5 Add support for text-classification labels to GoldParse 2017-07-20 00:17:47 +02:00
Matthew Honnibal ebb6c49cd5 Make alignment case-insensitive for gold 2017-06-04 20:26:42 -05:00
Matthew Honnibal fc4dd62e84 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-06-04 20:19:05 -05:00
Matthew Honnibal a053b1218e Fix item counting during training 2017-06-04 20:18:20 -05:00
Matthew Honnibal 9bc4a26213 Add option of data augmentation noise 2017-06-04 20:16:57 -05:00
Matthew Honnibal f6955a459c Fix prev commit 2017-06-03 14:38:37 -05:00
Matthew Honnibal 468ca6c760 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-06-03 14:33:51 -05:00
Matthew Honnibal c647a0d33e Fix training counter for gold preprocessing 2017-06-03 14:33:39 -05:00
Matthew Honnibal e62f46d39f Clarify gold.pyx slightly 2017-06-03 13:28:52 -05:00
Matthew Honnibal be4a640f0c Fix arc eager label costs for uint64 2017-05-30 20:37:58 +02:00
Matthew Honnibal 84e66ca6d4 WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
Matthew Honnibal d06f235fc9 Fix conflict on convert.py 2017-05-26 11:33:29 -05:00
Matthew Honnibal 2e587c6417 Export iob_to_biluo utility 2017-05-26 11:32:55 -05:00
Matthew Honnibal daac3e3573 Always shuffle gold data, and support length cap 2017-05-26 11:30:52 -05:00
Matthew Honnibal 3a6e59cc53 Add minibatch function in spacy.gold 2017-05-25 17:15:09 -05:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency"" 2017-05-23 03:06:53 -05:00
    This reverts commit 532afef4a8.
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency" 2017-05-23 03:05:25 -05:00
    This reverts commit bdaac7ab44.
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
Matthew Honnibal c9760b2104 Support sentence limits in GoldCorpus 2017-05-22 10:40:46 -05:00
ines 54f04a9fe0 Update API docs with changes in spacy.gold and spacy.language 2017-05-22 12:29:30 +02:00
Matthew Honnibal 2a5eb9f61e Make nonproj methods top-level functions, instead of class methods 2017-05-22 04:51:08 -05:00
Matthew Honnibal 025d9bbc37 Fix handling of non-projective deps 2017-05-22 04:51:08 -05:00
Matthew Honnibal f13d6c7359 Support gold preprocessing and single gold files 2017-05-22 04:51:08 -05:00
Matthew Honnibal 5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
Matthew Honnibal 432b3499b3 Fix memory leak 2017-05-21 13:38:46 -05:00
Matthew Honnibal 4803b3b69e Add GoldCorpus class, to manage data streaming 2017-05-21 09:06:17 -05:00
ines 075f5ff87a Update docstrings and API docs for GoldParse 2017-05-21 13:53:46 +02:00
Matthew Honnibal fc8d3a112c Add util.env_opt support: Can set hyper params through environment variables. 2017-05-18 04:36:53 -05:00
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network 2017-05-17 12:04:50 +02:00
    * Integrate models into pipeline
    * Add basic serialization (maybe incorrect)
    * Fix pickle on vocab
Matthew Honnibal 89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines e1efd589c3 Fix json imports and use ujson 2017-04-15 12:13:34 +02:00
ines 958b12dec8 Use pathlib instead of os.path 2017-04-15 12:13:00 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
Raphaël Bournhonesque f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
Matthew Honnibal 2611ac2a89 Fix scorer bug for NER, related to ambiguity between missing annotations and misaligned tokens 2017-03-16 09:38:28 -05:00
Matthew Honnibal 3d4e389d23 Whitespace 2017-03-15 09:29:42 -05:00
Matthew Honnibal 159e8c46e1 Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
Matthew Honnibal cc7e607a8a Fix gold.pyx for 1.0 2016-11-25 08:57:59 -06:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal f5fe4f595b Fix json loading, for Python 3. 2016-10-20 21:23:26 +02:00
Matthew Honnibal 52b48b415e Fix GoldParse class 2016-10-16 11:41:36 +02:00
Matthew Honnibal 0317cea0ad Fix GoldParse 2016-10-15 23:55:07 +02:00
Matthew Honnibal a48aa15384 Improve the API for the GoldParse class. 2016-10-15 23:53:29 +02:00
Matthew Honnibal e07fe92b27 Draft a refactored init for the GoldParse class 2016-10-15 22:09:52 +02:00
Matthew Honnibal 86ae665c78 Add function for entity->biluo transformation 2016-10-15 21:51:04 +02:00
Matthew Honnibal 645d99523a Move merge_sents method into spacy.gold 2016-10-13 03:24:29 +02:00