442 Commits

Author SHA1 Message Date
Matthew Honnibal b532f4eaa2 * Ensure serialize is packaged. 2015-07-27 01:51:37 +02:00
Matthew Honnibal 62da5eb338 * Inc version 2015-07-26 22:22:54 +02:00
Matthew Honnibal 65f3ce6c52 * Require preshed 0.41 2015-07-25 22:36:43 +02:00
Matthew Honnibal 287d90e792 * Use thinc 3.3 2015-07-24 04:52:50 +02:00
Matthew Honnibal 06eac32610 * Add cfile.pyx 2015-07-23 01:10:36 +02:00
Matthew Honnibal a9149fdcbd * Compile attrs.pyx 2015-07-17 16:39:25 +02:00
Matthew Honnibal db9dfd2e23 * Major refactor of serialization. Nearly complete now. 2015-07-17 01:27:54 +02:00
Matthew Honnibal 38ca0c33f5 Merge branch 'neuralnet' into refactor
Mostly refactors parser, to use new thinc3.2 Example class.
Aim is to remove use of shared memory, so that we can parallelize
over documents easily.

Conflicts:
	setup.py
	spacy/syntax/parser.pxd
	spacy/syntax/parser.pyx
	spacy/syntax/stateclass.pyx
2015-07-14 14:13:47 +02:00
Matthew Honnibal d87d71caf4 * Compile the new modules after refactor 2015-07-13 22:29:33 +02:00
Matthew Honnibal 703ca40420 * Inc version 2015-07-08 20:07:23 +02:00
Matthew Honnibal 1e8dd0e2c5 * Compile senses.pyx 2015-07-01 18:49:15 +02:00
Matthew Honnibal 90e2059200 * Include spacy.munge in the built library 2015-06-30 18:35:39 +02:00
Matthew Honnibal 5d595b5a8c * Inc versions 2015-06-30 18:11:06 +02:00
Matthew Honnibal 8e7ffd2cdd * Use thinc 3.1 2015-06-29 02:13:23 +02:00
Matthew Honnibal 9282a8e72c * Prepare for new models to be plugged in by using Example class 2015-06-28 11:02:35 +02:00
Matthew Honnibal 4944d3ba20 * Update requirement to thinc 3.0 2015-06-28 06:21:20 +02:00
Matthew Honnibal dc10aa2518 * Increment version 2015-06-24 04:52:15 +02:00
Matthew Honnibal 34c0ef2ee8 * Don't compile the orig_arc_eager and tree_arc_eager modules used for the EMNLP paper 2015-06-23 05:38:17 +02:00
Matthew Honnibal a5ae98a543 * Add tree_arc_eager to setup.py 2015-06-15 08:22:59 +02:00
Matthew Honnibal bcfdf126a4 * Add toggle for OrigArcEager system 2015-06-14 20:28:14 +02:00
Matthew Honnibal e2f9a80713 * Remove old _state imports 2015-06-10 07:09:17 +02:00
Matthew Honnibal d70304b7dd * Require newer thinc 2015-06-10 04:20:42 +02:00
Matthew Honnibal 09617a4638 * Whitespace 2015-06-09 21:20:33 +02:00
Matthew Honnibal 00a0dfcb59 * Avoid shipping the spacy.munge package 2015-06-08 00:54:13 +02:00
Matthew Honnibal 22f1ad012e * Add spacy.munge to list of packages 2015-06-07 22:28:13 +02:00
Matthew Honnibal ce8e524825 * Fix requirements in setup.py 2015-06-07 22:24:21 +02:00
Matthew Honnibal 48bc4122d8 * Upd version in setup.py 2015-06-07 19:05:28 +02:00
Matthew Honnibal cc7439a16b * Don't use alignment.pyx file, move functionality to spacy.gold 2015-05-24 21:51:15 +02:00
Matthew Honnibal fc75210941 * Move spacy.syntax.conll to spacy.gold 2015-05-24 21:35:02 +02:00
Matthew Honnibal bfeb29ebd1 * Tmp commit 2015-05-24 02:50:14 +02:00
Matthew Honnibal 03ebf70a66 * Inc version to 0.84 2015-05-12 02:38:51 +02:00
Jordan Suchow 3a8d9b37a6 Remove trailing whitespace 2015-04-19 13:01:38 -07:00
Jordan Suchow 5f0f940a1f Remove unused imports 2015-04-19 01:05:22 -07:00
Matthew Honnibal 716ba06711 * Inc version 2015-04-16 04:28:15 +02:00
Matthew Honnibal 05d0f078bb * Inc version 2015-04-13 22:29:31 +02:00
Matthew Honnibal ab53855dfe * Bump version 2015-04-13 06:08:22 +02:00
Matthew Honnibal 11c4794e56 * Bump version number 2015-04-12 07:17:32 +02:00
Matthew Honnibal 8f68b864c4 * Move Span/Spans to separate files. Currently duplicates lots of Tokens functionality. Should probably be integrated into Tokens 2015-03-26 16:44:48 +01:00
Matthew Honnibal e99f19dd6c * Fix clean function 2015-03-26 16:44:44 +01:00
Matthew Honnibal 357dcdcc01 * Fix clean function 2015-03-26 16:44:44 +01:00
Matthew Honnibal 8da53cbe3c * Fix setup.py, so that when compiling, only the necessary files are compiled 2015-03-26 16:44:43 +01:00
Matthew Honnibal 6e86790a4e * Add new syntax modules to setup.py 2015-03-26 16:44:42 +01:00
Matthew Honnibal c341bfb0a2 * Inc version 2015-03-03 05:46:14 -05:00
Matthew Honnibal 827a2337b0 * Inc version 2015-02-27 03:56:54 -05:00
Matthew Honnibal 74015da94b * Inc version 2015-02-23 15:40:41 -05:00
Matthew Honnibal 6102360111 * Add -Wno-strict-prototypes, to suppress warning 2015-02-21 20:04:37 -05:00
Matthew Honnibal ba1d3ddd7f * Move -lc++ link arg to only be used if darwin is OS. Should actually check whether GCC is compiler 2015-02-18 06:10:43 -05:00
Matthew Honnibal 59b46e4c2f * Move libc++ argument back under check for darwin. This assumes that extensions on OSX will be built with clang, but OSX GCC builds are also possible. Need to detect compiler and disable this flag 2015-02-18 06:03:45 -05:00
Matthew Honnibal aa475673ee * Tweak compile args for OSX 2015-02-18 05:41:11 -05:00
Matthew Honnibal b4edd1d907 * Make new compile args conditional on darwin, as they're invalid on Linux 2015-02-18 05:09:50 -05:00
Matthew Honnibal e885903dc6 * Add compile args to fix conda compilation on OSX, and increment version 2015-02-18 05:01:27 -05:00
Matthew Honnibal 69d27d55b0 * Inc version, with new orphan-token bug fix 2015-02-16 16:52:54 -05:00
Matthew Honnibal 789a6fe462 * Inc version --- 0.63 seems to have been packaged incorrectly, to not include a bug fix to tokens.pyx to transfer ownership to Token objects 2015-02-16 11:56:14 -05:00
Matthew Honnibal 773d209405 * Inc version to 0.63 2015-02-11 18:39:41 -05:00
Matthew Honnibal f0a9d2cb9c * Inc version 2015-02-11 14:20:57 -05:00
Matthew Honnibal 5ff2b5c8f0 * Inc version 2015-02-10 10:16:09 -05:00
Matthew Honnibal 29bdf0d05a * Inc version 2015-02-09 10:22:06 -05:00
Matthew Honnibal 407bb5da8b * Increment version 2015-02-09 09:46:20 -05:00
Matthew Honnibal 933c188eb5 * Inc version 2015-02-07 13:14:27 -05:00
Matthew Honnibal ef795aece8 * Upd release 2015-02-07 12:26:34 -05:00
Matthew Honnibal 330b1a7a3d * Inc version 2015-02-07 11:32:13 -05:00
Matthew Honnibal bfe1bcc02d * Rename 0.4.0 to 0.40 2015-02-01 18:32:01 +11:00
Matthew Honnibal dea1245311 * Require advanced version of cymem 2015-02-01 17:04:59 +11:00
Matthew Honnibal ac20a53509 * Change version to 0.4.0 2015-02-01 16:54:13 +11:00
Matthew Honnibal d1a5091052 * Require six 2015-02-01 16:24:50 +11:00
Matthew Honnibal 754f4aed8e * Inc version 2015-01-31 22:49:46 +11:00
Matthew Honnibal a3955fd8d5 * Require plac 2015-01-31 13:50:53 +11:00
Matthew Honnibal 6c081dd1fc * Handle failure when numpy headers are already installed correctly 2015-01-30 19:48:19 +11:00
Matthew Honnibal f0bbffca8d * Fix the way numpy headers are installed during compilation from source 2015-01-30 18:14:45 +11:00
Matthew Honnibal 781dd712dc * Fix numpy commit problem 2015-01-28 14:00:20 +11:00
Matthew Honnibal 8cd5a91063 * Inc version, and add wget as requirement 2015-01-25 23:00:54 +11:00
Matthew Honnibal 419fef7627 * Inc version 2015-01-25 22:15:47 +11:00
Matthew Honnibal 77a61a8970 * Inc version 2015-01-25 18:51:35 +11:00
Matthew Honnibal 7a750983b9 * Don't package word vectors in source dist 2015-01-25 16:58:38 +11:00
Matthew Honnibal 845bd2e50d * Add parts_of_speech to setup 2015-01-25 16:32:48 +11:00
Matthew Honnibal 7588adf5e7 * Add numpy to install requires 2015-01-25 14:49:10 +11:00
Matthew Honnibal 0250f39741 * Inc version 2015-01-25 02:25:16 +11:00
Matthew Honnibal b183dff72d * Remove stray print statement from setup 2015-01-22 02:06:42 +11:00
Matthew Honnibal e579dd39ca * Load numpy headers 2015-01-17 16:19:54 +11:00
Matthew Honnibal 9818d7419e * Inc version 2015-01-09 05:14:29 +11:00
Matthew Honnibal a0eb450e82 * Inc version 2015-01-08 01:19:57 +11:00
Matthew Honnibal 03a10e6cf2 * Inc version --- last didn't pack the correct cpp files. 2015-01-08 01:08:17 +11:00
Matthew Honnibal c096fe84f7 * Inc version 2015-01-08 00:10:31 +11:00
Matthew Honnibal 2f9884a2d5 * Rename 2015-01-06 13:05:43 +11:00
Matthew Honnibal b91d0cb584 * Increment version 2015-01-06 12:35:11 +11:00
Matthew Honnibal def7e98bd3 * Add monkey-patch to fix pypy compilation 2015-01-06 12:34:55 +11:00
Matthew Honnibal cda5f7aeae * Fix setup 2015-01-06 03:25:08 +11:00
Matthew Honnibal 3306ae1488 * Inc version 2015-01-05 19:11:23 +11:00
Matthew Honnibal 87fe01612a * Remove dependency on numpy and ujson 2015-01-05 19:11:12 +11:00
Matthew Honnibal dd5a6be171 * Compile spacy.orth 2015-01-05 17:55:15 +11:00
Matthew Honnibal 1dd663ea03 * Inc version 2015-01-05 13:18:12 +11:00
Matthew Honnibal f841a32ff7 * Inc version 2015-01-05 13:02:03 +11:00
Matthew Honnibal 0217df5779 * Increment version 2015-01-05 12:51:58 +11:00
Matthew Honnibal 170b93e89a * Inc version 2015-01-05 11:54:52 +11:00
Matthew Honnibal 454bec86dc * Increment version 2015-01-05 05:41:15 +11:00
Matthew Honnibal 83fa1850e2 * Refactor setup 2015-01-05 05:30:56 +11:00
Matthew Honnibal e0c85371d1 * Increment version 2015-01-04 21:21:03 +11:00
Matthew Honnibal 0cd7652545 * Use headers_workaround to avoid having install dependencies, given setuptools bug 209. 2015-01-04 21:14:07 +11:00
Matthew Honnibal 27c737a80f * Specify murmurhash in the setup_requires field 2015-01-04 01:59:14 +11:00
Matthew Honnibal a6f3c0c329 * Bump version number 2015-01-03 23:12:43 +11:00
Matthew Honnibal a179f1fc52 * Fix setup.py 2015-01-03 21:02:10 +11:00
Matthew Honnibal 9b5cef8d4a * Move around data files 2015-01-03 01:59:43 +11:00
Matthew Honnibal d48f90fbab * Write some more metadata in setup.py 2015-01-02 21:56:43 +11:00
Matthew Honnibal a04e164a37 * Move tagger.pyx to _ml.pyx 2014-12-30 21:20:55 +11:00
Matthew Honnibal ed0ff63c09 * Compile attrs and parser in setup 2014-12-23 15:18:20 +11:00
Matthew Honnibal 2a89d70429 * Add vocab.pyx to setup, and ensure we can import spacy.en.lang 2014-12-21 06:03:53 +11:00
Matthew Honnibal 87e9487d76 * Work on parser 2014-12-17 21:10:12 +11:00
Matthew Honnibal 7831b06610 * Compile morphology.pyx file 2014-12-10 08:09:13 +11:00
Matthew Honnibal ef4398b204 * Rearrange POS stuff, so that language-specific stuff can live in language-specific modules 2014-12-07 23:52:41 +11:00
Matthew Honnibal 91e8d9ea1c * Compile context.pyx and tagger.pyx modules 2014-12-07 15:29:54 +11:00
Matthew Honnibal a14f9eaf63 * Add index.pyx to setup 2014-12-04 22:14:11 +11:00
Matthew Honnibal d0d812c548 * Hack setup.py to exclude tagger stuff 2014-12-03 11:06:57 +11:00
Matthew Honnibal b934bf1c69 * Compile IOB 2014-11-12 23:21:40 +11:00
Matthew Honnibal d5e9dce039 * Compile NER code 2014-11-11 21:10:22 +11:00
Matthew Honnibal dbbb914480 * Upd setup 2014-11-05 20:45:44 +11:00
Matthew Honnibal 67c8c8019f * Update lexeme serialization, using a binary file format 2014-10-30 01:01:00 +11:00
Matthew Honnibal 5ebe14f353 * Add greedy pos tagger 2014-10-22 10:17:26 +11:00
Matthew Honnibal aba4a7c7ea * Remove ptb3 file from setup 2014-09-25 18:41:25 +02:00
Matthew Honnibal b15619e170 * Use PointerHash instead of locally provided _hashing module 2014-09-25 18:23:35 +02:00
Matthew Honnibal ac522e2553 * Switch from own memory class to cymem, in pip 2014-09-17 23:09:24 +02:00
Matthew Honnibal 5a20dfc03e * Add memory management code 2014-09-17 20:02:06 +02:00
Matthew Honnibal 0447279c57 * PointerHash working, efficiency is good. 6-7 mins 2014-09-13 16:43:59 +02:00
Matthew Honnibal b488224c09 * Restoring Lexeme-as-struct 2014-09-10 20:41:37 +02:00
Matthew Honnibal e80d3b9784 * Compile tokens in setup 2014-09-10 19:41:19 +02:00
Matthew Honnibal 7dac9b9ccb * Fix setup script 2014-09-01 23:41:59 +02:00
Matthew Honnibal 68bae2fec6 * More refactoring 2014-08-25 16:42:22 +02:00
Matthew Honnibal 3b793cf4f7 * Tests passing for new Word object version 2014-08-24 18:13:53 +02:00
Matthew Honnibal 89d6faa9c9 * Move en_ptb to ptb3 2014-08-22 04:24:05 +02:00
Matthew Honnibal d42cdbb446 * Compile orthography.latin.pyx 2014-08-20 17:03:19 +02:00
Matthew Honnibal 01469b0888 * Refactor spacy so that chunks return arrays of lexemes, so that there is properly one lexeme per word. 2014-08-18 19:14:00 +02:00
Matthew Honnibal 865cacfaf7 * Remove dependence on murmurhash 2014-08-16 17:37:09 +02:00
Matthew Honnibal 7fd9b2f1f8 * Add murmurhash to setup while we figure out cython includes 2014-08-15 23:56:57 +02:00
Matthew Honnibal 365a2af756 * Restore happax. Commit uncommitted work 2014-08-02 21:27:03 +01:00
Matthew Honnibal 18fb76b2c4 * Removed happax. Not sure if good idea. 2014-08-02 20:53:35 +01:00
Matthew Honnibal d4b8bc07ce * Use FixedTable to control index size 2014-08-01 07:27:48 +01:00
Matthew Honnibal a235804730 * Fix setup.py 2014-07-31 02:03:53 +01:00
Matthew Honnibal 5461399924 * Fix setup.py 2014-07-31 02:03:10 +01:00
Matthew Honnibal b9016c4633 * Switch to using sparsehash and murmurhash libraries out of pip 2014-07-25 15:47:27 +01:00
Matthew Honnibal 1c5ab3b49a * Add tokens module to setup 2014-07-07 12:51:23 +02:00
Matthew Honnibal 648d1fe3ed * Compile en_ptb 2014-07-07 05:10:28 +02:00
Matthew Honnibal 0c1be7effe * Compile string_tools module 2014-07-07 04:24:00 +02:00
Matthew Honnibal ca7045f3f2 * Add build/setup stuff 2014-07-05 20:49:34 +02:00