122 Commits

Author SHA1 Message Date
Matthew Honnibal aec130af56 Use util.Package class for io
The previous Sputnik integration caused an API change: Vocab, Tagger, etc.
were loaded via a from_package classmethod that required a
sputnik.Package instance. This forced users to first create a
sputnik.Sputnik() instance in order to acquire a Package via
sp.pool().

Instead, I've created a small file-system shim, util.Package, which
allows classes to have a .load() classmethod that accepts either
util.Package objects or strings. We can later gut the internals
of this and make it a proxy for Sputnik if we need more functionality
that should live in the Sputnik library.

Sputnik is now only used to download and install the data, in
spacy.en.download
2015-12-29 18:00:48 +01:00
Matthew Honnibal 6f47074214 * Make constructor of ParserModel and TaggerModel the same as AveragedPerceptron, for easy pickling. 2015-11-07 18:25:17 +11:00
Matthew Honnibal 888c05a7fa * Fix variable naming in StepwiseState, for thinc 4.0 2015-11-07 11:02:44 +11:00
Matthew Honnibal fc2185bfe3 * Fix variable naming in StepwiseState, for thinc 4.0 2015-11-07 10:48:31 +11:00
Matthew Honnibal 954442a807 * Fix variable naming in StepwiseState, for thinc 4.0 2015-11-07 10:30:45 +11:00
Matthew Honnibal 19136b0e7d * Add better debug message for illegal move 2015-11-07 05:34:37 +11:00
Matthew Honnibal 3c162dcac3 * Refactor away from the _ml module, to use thinc 4.0. Still some work needs to be done, e.g. to add __reduce__ to the models, more testing, etc. 2015-11-07 03:24:30 +11:00
Matthew Honnibal b9991fbd20 * Update to use thinc 3.0 2015-11-06 00:25:59 +11:00
Matthew Honnibal 68f479e821 * Rename Doc.data to Doc.c 2015-11-04 00:15:14 +11:00
Matthew Honnibal 20fd36a0f7 * Very scrappy, likely buggy first-cut pickle implementation, to work on Issue #125: allow pickle for Apache Spark. The current implementation sends stuff to temp files, and does almost nothing to ensure all modifiable state is actually preserved. The Language() instance is a deep tree of extension objects, and if pickling during training, some of the C-data state is hard to preserve. 2015-10-13 13:44:41 +11:00
Matthew Honnibal 86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal 5edac11225 * Wrap self.parse in nogil, and break if an invalid move is predicted. The invalid break is a work-around that papers over likely bugs, but we can't easily break in the nogil block, and otherwise we'll get an infinite loop. Need to set this as an error flag. 2015-09-06 04:15:00 +02:00
Matthew Honnibal a3d5e6c0dd * Reform constructor and save/load workflow in parser model 2015-08-26 19:19:01 +02:00
Matthew Honnibal bf38b3b883 * Hack on l/r reversal bug 2015-08-10 05:58:43 +02:00
Matthew Honnibal 6116413b47 * Fix label prediction in StepwiseState 2015-08-10 05:05:31 +02:00
Matthew Honnibal 9de98f5a6f * Add Parser.stepthrough method, with context manager 2015-08-10 00:08:46 +02:00
Matthew Honnibal 9c090945e0 * Add Parser.predict method, and clean up Parser.get_state 2015-08-09 02:29:58 +02:00
Matthew Honnibal 04fccfb984 * Fix get_state for parser prediction 2015-08-09 02:11:22 +02:00
Matthew Honnibal 55fde0e240 * Fix get_state 2015-08-09 01:45:30 +02:00
Matthew Honnibal f0f4fa9838 * Fix Parser.get_state 2015-08-09 01:40:13 +02:00
Matthew Honnibal 18331dca89 * Add continue_for argument to parser 'partial' function, which is now renamed to get_state 2015-08-09 01:31:54 +02:00
Matthew Honnibal 9de218b7ba * Fix Parser.partial function 2015-08-08 23:45:18 +02:00
Matthew Honnibal 3af938365f * Add function partial to Parser 2015-08-08 23:32:15 +02:00
Matthew Honnibal 823ef4a00b * Remove profile declarations 2015-07-25 18:13:06 +02:00
Matthew Honnibal aa28e2e01d * Release the GIL around parse function 2015-07-24 04:53:27 +02:00
Matthew Honnibal fb0a641a2d * Don't release the gil around Parser.parse. Does this indicate thread problems? 2015-07-17 23:07:37 +02:00
Matthew Honnibal e29daea85f * Fix bint/int typing problem in TransitionSystem. In C++ bint* means bool*, but in C it means int*. So, type-casting to bint* is unsafe. 2015-07-17 22:37:24 +02:00
Matthew Honnibal 45ae1ce428 * Remove unused declaration in parser 2015-07-16 01:27:11 +02:00
Matthew Honnibal 9a8db9743c * Remove gil from parser.call 2015-07-14 23:47:33 +02:00
Matthew Honnibal 38ca0c33f5 Merge branch 'neuralnet' into refactor
Mostly refactors parser, to use new thinc3.2 Example class.
Aim is to remove use of shared memory, so that we can parallelize
over documents easily.

Conflicts:
	setup.py
	spacy/syntax/parser.pxd
	spacy/syntax/parser.pyx
	spacy/syntax/stateclass.pyx
2015-07-14 14:13:47 +02:00
Matthew Honnibal 6eef0bf9ab * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx 2015-07-13 20:20:58 +02:00
Matthew Honnibal adb868bdad * Add warning for models not found in parser 2015-07-08 20:04:55 +02:00
Matthew Honnibal 05b28ec9eb * Add warning for models not found in parser 2015-07-08 20:02:13 +02:00
Matthew Honnibal ef700401a6 * Add warning for models not found in parser 2015-07-08 20:00:46 +02:00
Matthew Honnibal 6218d8b389 * Add warning for models not found in parser 2015-07-08 19:59:16 +02:00
Matthew Honnibal f6a6c39ce8 * Add warning for models not found in parser 2015-07-08 19:52:30 +02:00
Matthew Honnibal bb522496dd * Rename Tokens to Doc 2015-07-08 18:53:00 +02:00
Matthew Honnibal ff885e8511 * Add ParserFactory convenience function 2015-07-08 12:35:46 +02:00
Matthew Honnibal e20106fdff * Begin reorganizing neuralnet work 2015-06-30 14:26:32 +02:00
Matthew Honnibal f4986d5d3c * Use new Example class 2015-06-28 22:36:03 +02:00
Matthew Honnibal 735f1af91f * Fix neural net stuff 2015-06-28 11:44:58 +02:00
Matthew Honnibal e7003f1cf3 * Remove hard-coding of vector lengths 2015-06-28 11:37:17 +02:00
Matthew Honnibal 897dd0dd0b * Merge changes, and adjust Example to use memoryview 2015-06-28 11:36:11 +02:00
Matthew Honnibal 9282a8e72c * Prepare for new models to be plugged in by using Example class 2015-06-28 11:02:35 +02:00
Matthew Honnibal 75aeccc064 * Rejig parser interface to use new thinc.api.Example class, in prep of theano model. Comment out beam search 2015-06-28 11:02:34 +02:00
Matthew Honnibal 5af500909c * Remove unused directive from parser.pyx 2015-06-28 06:20:21 +02:00
Matthew Honnibal ed40a8380e * Remove hard-coding of vector lengths 2015-06-27 04:18:47 +02:00
Matthew Honnibal f8bb43475e * Bridge to Theano working. Very disorganised. Using thinc adb60aba966ed2 2015-06-27 02:39:18 +02:00
Matthew Honnibal 2fe98b8a9a * Prepare for new models to be plugged in by using Example class 2015-06-26 13:51:39 +02:00
Matthew Honnibal 6896455884 * Rejig parser interface to use new thinc.api.Example class, in prep of theano model. Comment out beam search 2015-06-26 06:25:36 +02:00
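
The message for commit aec130af56 describes a file-system shim, util.Package, that lets classes expose a .load() classmethod accepting either a Package object or a string. A minimal sketch of that pattern follows. It is not the code from the commit: the method names file_path and has_file, the helper as_package, the vocab/words.txt layout, and the stand-in Vocab class are all hypothetical and chosen only for illustration.

```python
import os
import tempfile


class Package:
    """Hypothetical file-system shim wrapping a data directory."""

    def __init__(self, data_dir):
        self.data_dir = str(data_dir)

    def file_path(self, *parts):
        # Join path components under the package's data directory.
        return os.path.join(self.data_dir, *parts)

    def has_file(self, *parts):
        return os.path.isfile(self.file_path(*parts))


def as_package(pkg_or_path):
    """Return a Package, whether given a Package or a plain string path."""
    if isinstance(pkg_or_path, Package):
        return pkg_or_path
    return Package(pkg_or_path)


class Vocab:
    """Stand-in loadable component (assumption: not spaCy's real Vocab)."""

    def __init__(self, words):
        self.words = words

    @classmethod
    def load(cls, pkg_or_path):
        # Callers may pass either a Package or a string; normalise first.
        package = as_package(pkg_or_path)
        path = package.file_path("vocab", "words.txt")  # hypothetical layout
        with open(path, encoding="utf8") as f:
            return cls([line.strip() for line in f if line.strip()])


def demo():
    """Load the same data once from a string path and once from a Package."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "vocab"))
        with open(os.path.join(tmp, "vocab", "words.txt"), "w", encoding="utf8") as f:
            f.write("the\ncat\n")
        from_str = Vocab.load(tmp)
        from_pkg = Vocab.load(Package(tmp))
        return from_str.words, from_pkg.words
```

Normalising the argument in one helper keeps each .load() method short, and it leaves room to swap the internals of Package for a proxy later, as the commit message anticipates, without changing callers.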