76 Commits

Author SHA1 Message Date
Matthew Honnibal f6a6c39ce8 * Add warning for models not found in parser 2015-07-08 19:52:30 +02:00
Matthew Honnibal bb522496dd * Rename Tokens to Doc 2015-07-08 18:53:00 +02:00
Matthew Honnibal ff885e8511 * Add ParserFactory convenience function 2015-07-08 12:35:46 +02:00
Matthew Honnibal 5af500909c * Remove unused directve from parser.pyx 2015-06-28 06:20:21 +02:00
Matthew Honnibal ab110be125 * Remove debugging in parser.pyx 2015-06-16 23:37:25 +02:00
Matthew Honnibal f66228f253 * Add some more features, esp for labels 2015-06-14 21:18:02 +02:00
Matthew Honnibal ea8a103007 * Fix import of TransitionSystem in parser.pyx 2015-06-14 19:01:26 +02:00
Matthew Honnibal 75289b4761 * Don't refuse to parse single token sentences, incase some transition system needs them, e.g. single word entity. Instead fix error in _init_state. 2015-06-13 22:55:55 +02:00
Matthew Honnibal 15e177d7a1 * Fixes to unshift/fast-forward strategy. Getting 91.55 greedy on NW dev, gold preproc 2015-06-12 01:50:23 +02:00
Matthew Honnibal 4575e7a60f * Fix beam search with new StateClass 2015-06-10 06:33:39 +02:00
Matthew Honnibal 04b1cd9b8c * Greedy parsing working with new StateClass. Beam parsing broken 2015-06-10 04:20:23 +02:00
Matthew Honnibal 6a94b64eca * Remove State* from parser.pyx entirely, switching over to StateClass. Beam parsing still untested. 2015-06-10 02:03:38 +02:00
Matthew Honnibal f14a1526aa * Remove version of fill_context that takes State* 2015-06-10 01:39:07 +02:00
Matthew Honnibal d68c686ec1 * Move StateClass into interface of transition functions 2015-06-10 01:35:28 +02:00
Matthew Honnibal 4b98b3e9c8 * Cost functions now take StateClass argument, instead of State*. 2015-06-10 00:40:43 +02:00
Matthew Honnibal e0cf61f591 * Move StateClass into the interface for is_valid 2015-06-09 23:23:28 +02:00
Matthew Honnibal 0895d454fb * Prepare to switch to using state class, instead of state struct 2015-06-09 21:20:14 +02:00
Matthew Honnibal c7e3dfc1dc * Don't automatically push words when stack is empty, as it messes up beam parsing. Add hash method to beam state. 2015-06-08 14:49:04 +02:00
Matthew Honnibal 6e2564239d * Bug fixes to beam parser. Search still broken on non-gold sentences 2015-06-07 19:12:59 +02:00
Matthew Honnibal 88ac5c6e98 * Send beam_width < 0 to greedy parser 2015-06-05 17:12:06 +02:00
Matthew Honnibal 6bf35cecc3 * Refactor transition system to use classes with staticmethods. 2015-06-05 02:27:17 +02:00
Matthew Honnibal 4433396005 * Impove efficiency of dynamic oracle, making beam training faster 2015-06-04 21:15:14 +02:00
Matthew Honnibal a513ec500f * Have oracle functions take a struct instead of a Python object 2015-06-02 20:01:06 +02:00
Matthew Honnibal d1b55310a1 * Refactor _advance_beam function 2015-06-02 18:38:41 +02:00
Matthew Honnibal e822df0867 * Fix bugs in new greedy/beam parser 2015-06-02 02:01:33 +02:00
Matthew Honnibal 66dfa95847 * Revise greedy_parse/beam_parse ownership goof 2015-06-02 01:34:19 +02:00
Matthew Honnibal 75658b2ed3 * Remove use of new beam.loss property, to maintain compatibility with older versions of thinc for now. 2015-06-02 00:57:09 +02:00
Matthew Honnibal 58d5ac0944 * Add beam search capabilities to Parser. Rename GreedyParser to Parser. 2015-06-02 00:28:02 +02:00
Matthew Honnibal 4010b9b6d9 * Pass parameter for regularization in parser.pyx 2015-05-27 03:18:50 +02:00
Matthew Honnibal fc75210941 * Move spacy.syntax.conll to spacy.gold 2015-05-24 21:35:02 +02:00
Matthew Honnibal 03a6626545 * Tmp commit 2015-05-12 20:27:56 +02:00
Matthew Honnibal fb8d50b3d5 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-04-30 12:45:15 +02:00
Matthew Honnibal b3fd48c97b * Fix missing root labels bug identified in Issue #57 2015-04-28 20:45:51 +02:00
Jordan Suchow 3a8d9b37a6 Remove trailing whitespace 2015-04-19 13:01:38 -07:00
Matthew Honnibal db5a43318c * Improve print_state debug printer 2015-03-27 17:29:58 +01:00
Matthew Honnibal 1705eccbbe * Remove whitespace 2015-03-27 15:22:39 +01:00
Matthew Honnibal e854ba0a13 * Remove support for force_gold flag from GreedyParser, since it's not so useful, and it's clutter 2015-03-26 16:44:47 +01:00
Matthew Honnibal 6a6085f8b9 * Clean up GreedyParser.train function a bit 2015-03-26 16:44:47 +01:00
Matthew Honnibal b3157927e6 * Clean up unused feature templates 2015-03-26 16:44:47 +01:00
Matthew Honnibal 71648205d9 * Add support for debug feature set. Just use unigrams for this. 2015-03-26 16:44:47 +01:00
Matthew Honnibal 05d6065e2e * Add assertion 2015-03-26 16:44:46 +01:00
Matthew Honnibal 31fad99518 * Use StringStore to encode label names, instead of label_ids 2015-03-26 16:44:45 +01:00
Matthew Honnibal b9b695fb1b * Remove debug word list 2015-03-26 16:44:45 +01:00
Matthew Honnibal 8057a95f20 * NER seems to be working, scoring 69 F. Need to add decision-history features --- currently only use current word, 2 words context. Need refactoring. 2015-03-26 16:44:44 +01:00
Matthew Honnibal ae235e07b9 * Refactoring working for parser, but now need to rig up features for NER, and then debug oracle etc. 2015-03-26 16:44:44 +01:00
Matthew Honnibal f321b2b2eb * Remove TODO comment 2015-03-26 16:44:43 +01:00
Matthew Honnibal 10ed738df2 * Tmp commit 2015-03-26 16:44:43 +01:00
Matthew Honnibal 8c883cef58 * Refactored transition system code now compiling. Still need to hook up label oracle, and test 2015-03-26 16:44:43 +01:00
Matthew Honnibal dc986dbc0b * Work on refactored parser, where TransitionSystem can be easily subclassed 2015-03-26 16:44:42 +01:00
Matthew Honnibal 135756ac3d * Tmp commit of NER refactoring 2015-03-26 16:44:42 +01:00