spaCy/spacy/training
Matthew Honnibal 8656a08777
Add beam_parser and beam_ner components for v3 (#6369)
* Get basic beam tests working

* Get basic beam tests working

* Compile _beam_utils

* Remove prints

* Test beam density

* Beam parser seems to train

* Draft beam NER

* Upd beam

* Add hypothesis as dev dependency

* Implement missing is-gold-parse method

* Implement early update

* Fix state hashing

* Fix test

* Fix test

* Default to non-beam in parser constructor

* Improve oracle for beam

* Start refactoring beam

* Update test

* Refactor beam

* Update nn

* Refactor beam and weight by cost

* Update ner beam settings

* Update test

* Add __init__.pxd

* Upd test

* Fix test

* Upd test

* Fix test

* Remove ring buffer history from StateC

* WIP change arc-eager transitions

* Add state tests

* Support ternary sent start values

* Fix arc eager

* Fix NER

* Pass oracle cut size for beam

* Fix ner test

* Fix beam

* Improve StateC.clone

* Improve StateClass.borrow

* Work directly with StateC, not StateClass

* Remove print statements

* Fix state copy

* Improve state class

* Refactor parser oracles

* Fix arc eager oracle

* Fix arc eager oracle

* Use a vector to implement the stack

* Refactor state data structure

* Fix alignment of sent start

* Add get_aligned_sent_starts method

* Add test for ae oracle when bad sentence starts

* Fix sentence segment handling

* Avoid Reduce that inserts illegal sentence

* Update preset SBD test

* Fix test

* Remove prints

* Fix sent starts in Example

* Improve python API of StateClass

* Tweak comments and debug output of arc eager

* Upd test

* Fix state test

* Fix state test
2020-12-13 09:08:32 +08:00
converters     fix E902 and E903 numbering                                                2020-10-05 13:43:32 +02:00
__init__.pxd   Renaming gold & annotation_setter (#6042)                                  2020-09-09 10:31:03 +02:00
__init__.py    Replace pytokenizations with internal alignment (#6293)                    2020-11-03 16:24:38 +01:00
align.pyx      Fix alignment for 1-to-1 tokens and lowercasing (#6476)                    2020-12-08 14:25:16 +08:00
alignment.py   Replace pytokenizations with internal alignment (#6293)                    2020-11-03 16:24:38 +01:00
augment.py     Auto-format [ci skip]                                                      2020-10-05 21:58:18 +02:00
batchers.py    Renaming gold & annotation_setter (#6042)                                  2020-09-09 10:31:03 +02:00
corpus.py      Integrate file readers                                                     2020-10-02 01:36:06 +02:00
example.pxd    Make a pre-check to speed up alignment cache (#6139)                       2020-09-24 18:13:39 +02:00
example.pyx    Add beam_parser and beam_ner components for v3 (#6369)                     2020-12-13 09:08:32 +08:00
gold_io.pyx    Use null raw for has_unknown_spaces in docs_to_json                        2020-10-15 09:57:54 +02:00
initialize.py  Add specific error when StaticVectors can't read the vectors data (#6450)  2020-12-09 06:16:07 +08:00
iob_utils.py   Merge pull request #6089 from adrianeboyd/feature/doc-ents-v3-2            2020-09-24 14:44:42 +02:00
loggers.py     Make console logger table more compact                                     2020-10-11 12:55:46 +02:00
loop.py        Make training.loop return nlp object and path (#6520)                      2020-12-08 14:55:55 +08:00
pretrain.py    pretrain architectures (#6451)                                             2020-12-08 14:41:03 +08:00