Commit Graph

515 Commits

Author SHA1 Message Date
Matthew Honnibal c059fcb0ba Update thinc requirement 2018-03-25 19:29:36 +02:00
Matthew Honnibal bede11b67c
Improve label management in parser and NER (#2108)
This patch does a few smallish things that tighten up the training workflow a little, and allow memory use during training to be reduced by letting the GoldCorpus stream data properly.

Previously, the parser and entity recognizer read and saved labels as lists, with extra labels noted separately. Lists were used because ordering is very important, to ensure that the label-to-class mapping is stable.

We now manage labels as nested dictionaries, first keyed by the action, and then keyed by the label. Values are frequencies. The trick is, how do we save new labels? We need to make sure we iterate over these in the same order they're added. Otherwise, we'll get different class IDs, and the model's predictions won't make sense.

To allow stable sorting, we map the new labels to negative values. If we have two new labels, they'll be noted as having "frequency" -1 and -2. The next new label will then have "frequency" -3. When we sort by (frequency, label), we then get a stable sort.
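The negative-frequency trick can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper names `add_label` and `class_order`, the example action and label names, and the choice to sort by descending frequency are all assumptions made for the example.

```python
def add_label(labels, action, label):
    """Register `label` under `action` in a nested {action: {label: freq}} dict.

    Hypothetical helper: a label that is not already present gets the next
    unused negative "frequency" (-1, -2, -3, ...), so the order in which new
    labels were added can be recovered by sorting.
    """
    by_label = labels.setdefault(action, {})
    if label in by_label:
        return by_label[label]
    # Lowest value seen so far across all actions; real counts are positive,
    # so clamp at 0 before stepping down by one.
    lowest = min((f for d in labels.values() for f in d.values()), default=0)
    by_label[label] = min(lowest, 0) - 1
    return by_label[label]


def class_order(labels):
    """Return (action, label) pairs in a reproducible order.

    Assumption: within an action, labels are ordered by descending frequency,
    with the label string as tie-breaker. Observed labels therefore come
    first, followed by new labels in the order they were added.
    """
    order = []
    for action in sorted(labels):
        items = sorted(labels[action].items(), key=lambda kv: (-kv[1], kv[0]))
        order.extend((action, label) for label, _ in items)
    return order


labels = {"LEFT": {"nsubj": 120, "dobj": 80}}
add_label(labels, "LEFT", "xcomp")   # stored as -1
add_label(labels, "LEFT", "advcl")   # stored as -2
add_label(labels, "RIGHT", "amod")   # stored as -3
print(class_order(labels))
```

Because the sort key depends only on the stored values, saving and reloading the dictionary gives the same class order, even though "advcl" would sort before "xcomp" alphabetically.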

Storing frequencies then allows us to make the next nice improvement. Previously we had to iterate over the whole training set, to pre-process it for the deprojectivisation. This led to storing the whole training set in memory. This was most of the required memory during training.

To prevent this, we now store the frequencies as we stream in the data, and deprojectivize as we go. Once we've built the frequencies, we can then apply a frequency cut-off when we decide how many classes to make.
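A rough sketch of counting while streaming and applying the cut-off afterwards is shown here. The document representation (a dict with a "labels" list), the function names, and the `min_freq` parameter are hypothetical. Folding rare labels into 'dep' follows the back-off mentioned in the commit list below, but how the patch actually does this is not shown in the message.

```python
from collections import Counter


def count_labels(doc_stream):
    """Accumulate label frequencies from an iterable of documents.

    Each document is consumed once and can be discarded afterwards, so the
    whole training set never has to be held in memory.
    """
    freqs = Counter()
    for doc in doc_stream:
        freqs.update(doc["labels"])  # "labels" is an assumed field name
    return freqs


def apply_cutoff(freqs, min_freq, backoff="dep"):
    """Keep labels seen at least `min_freq` times; fold the rest into `backoff`."""
    kept = {label: n for label, n in freqs.items() if n >= min_freq}
    dropped = sum(n for n in freqs.values() if n < min_freq)
    if dropped:
        kept[backoff] = kept.get(backoff, 0) + dropped
    return kept


# A generator stands in for training data streamed from disk.
stream = ({"labels": labels} for labels in [
    ["nsubj", "dobj"],
    ["nsubj", "rare"],
    ["nsubj", "dobj"],
])
freqs = count_labels(stream)
print(apply_cutoff(freqs, min_freq=2))
```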

Finally, to allow proper data streaming, we also have to have some way of shuffling the iterator. This is awkward if the training files have multiple documents in them. To solve this, the GoldCorpus class now writes the training data to disk in msgpack files, one per document. We can then shuffle the data by shuffling the paths.
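The shuffle-by-paths idea might look roughly like this. To keep the sketch runnable with only the standard library, it writes JSON files rather than msgpack; the function names, file naming and `seed` parameter are assumptions for illustration and do not reflect the actual GoldCorpus API.

```python
import json
import random
import tempfile
from pathlib import Path


def write_docs(docs, directory):
    """Write each document to its own file and return the list of paths."""
    paths = []
    for i, doc in enumerate(docs):
        path = Path(directory) / f"{i}.json"
        path.write_text(json.dumps(doc), encoding="utf8")
        paths.append(path)
    return paths


def iter_shuffled(paths, seed=None):
    """Yield documents in random order, loading one file at a time.

    Only the list of paths is shuffled, so memory use stays proportional to
    the number of documents rather than to their total size.
    """
    paths = list(paths)
    random.Random(seed).shuffle(paths)
    for path in paths:
        yield json.loads(path.read_text(encoding="utf8"))


with tempfile.TemporaryDirectory() as tmp_dir:
    doc_paths = write_docs([{"id": i} for i in range(5)], tmp_dir)
    for doc in iter_shuffled(doc_paths, seed=0):
        print(doc["id"])
```

Calling `iter_shuffled` again with a different seed gives a new order for each epoch without rewriting any files.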

This is a squash merge, as I made a lot of very small commits. Individual commit messages below.

* Simplify label management for TransitionSystem and its subclasses

* Fix serialization for new label handling format in parser

* Simplify and improve GoldCorpus class. Reduce memory use, write to temp dir

* Set actions in transition system

* Require thinc 6.11.1.dev4

* Fix error in parser init

* Add unicode declaration

* Fix unicode declaration

* Update textcat test

* Try to get model training on less memory

* Print json loc for now

* Try rapidjson to reduce memory use

* Remove rapidjson requirement

* Try rapidjson for reduced mem usage

* Handle None heads when projectivising

* Stream json docs

* Fix train script

* Handle projectivity in GoldParse

* Fix projectivity handling

* Add minibatch_by_words util from ud_train

* Minibatch by number of words in spacy.cli.train

* Move minibatch_by_words util to spacy.util

* Fix label handling

* More hacking at label management in parser

* Fix encoding in msgpack serialization in GoldParse

* Adjust batch sizes in parser training

* Fix minibatch_by_words

* Add merge_subtokens function to pipeline.pyx

* Register merge_subtokens factory

* Restore use of msgpack tmp directory

* Use minibatch-by-words in train

* Handle retokenization in scorer

* Change back-off approach for missing labels. Use 'dep' label

* Update NER for new label management

* Set NER tags for over-segmented words

* Fix label alignment in gold

* Fix label back-off for infrequent labels

* Fix int type in labels dict key

* Fix int type in labels dict key

* Update feature definition for 8 feature set

* Update ud-train script for new label stuff

* Fix json streamer

* Print the line number if conll eval fails

* Update children and sentence boundaries after deprojectivisation

* Export set_children_from_heads from doc.pxd

* Render parses during UD training

* Remove print statement

* Require thinc 6.11.1.dev6. Try adding wheel as install_requires

* Set different dev version, to flush pip cache

* Update thinc version

* Update GoldCorpus docs

* Remove print statements

* Fix formatting and links [ci skip]
2018-03-19 02:58:08 +01:00
Matthew Honnibal 318c23d318 Increment thinc 2018-03-16 13:12:53 +01:00
Matthew Honnibal 39c50225e8 Update thinc 2018-03-16 03:57:47 +01:00
Matthew Honnibal 7be561c8be Fix thinc requirement 2018-03-16 03:34:12 +01:00
Matthew Honnibal 53df6d867b Require new thinc 2018-03-16 03:20:01 +01:00
Matthew Honnibal f2fa8481c4 Require thinc v6.11 2018-03-13 13:59:35 +01:00
ines 9c8a0f6eba Version-lock msgpack-python (see #2015) 2018-02-22 19:42:03 +01:00
ines f5f4de98d1 Version-lock msgpack-python (see #2015) 2018-02-22 16:02:32 +01:00
Matthew Honnibal f46bf2a7e9 Build _align.pyx 2018-02-20 17:32:13 +01:00
ines 6bba1db4cc Drop six and related hacks as a dependency 2018-02-18 13:29:56 +01:00
ines 002ee80ddf Add html5lib to setup.py to fix six error (see #1924) 2018-02-02 20:32:08 +01:00
Matthew Honnibal 2e449c1fbf Fix compiler flags, addressing #1591 2018-01-14 14:34:36 +01:00
Matthew Honnibal 04a92bd75e Pin msgpack-numpy requirement 2017-12-06 03:24:24 +01:00
Hugo aa898ab4e4 Drop support for EOL Python 2.6 and 3.3 2017-11-26 19:46:24 +02:00
Matthew Honnibal 716ccbb71e Require thinc 6.10.1 2017-11-15 14:59:34 +01:00
Matthew Honnibal 314f5b9cdb Require thinc 6.10.0 2017-10-28 18:20:10 +00:00
Matthew Honnibal 64e4ff7c4b Merge 'tidy-up' changes into branch. Resolve conflicts 2017-10-28 13:16:06 +02:00
ines 7946464742 Remove spacy.tagger (now in pipeline) 2017-10-27 19:45:04 +02:00
Matthew Honnibal 531142a933 Merge remote-tracking branch 'origin/develop' into feature/better-parser 2017-10-27 12:34:48 +00:00
Matthew Honnibal 642eb28c16 Don't compile with OpenMP by default 2017-10-27 10:16:58 +00:00
Matthew Honnibal 90d1d9b230 Remove obsolete parser code 2017-10-26 13:22:45 +02:00
Matthew Honnibal 79fcf8576a Compile with march=native 2017-10-18 21:46:34 +02:00
Matthew Honnibal 2eb0fe4957 Fix setup.py 2017-10-03 21:40:04 +02:00
Matthew Honnibal b49cc8153a Require correct thinc 2017-09-26 10:00:18 -05:00
ines 68f66aebf8 Use pkg_resources instead of pip for is_package (resolves #1293) 2017-09-16 20:27:59 +02:00
Matthew Honnibal 07cdbd1219 Require thinc 6.8.1, for Windows 2017-09-15 22:47:53 +02:00
Matthew Honnibal 96a4a9070b Compile _beam_utils 2017-08-18 21:56:19 +02:00
Matthew Honnibal f9ae86b01c Fix requirement 2017-08-18 20:56:53 +02:00
Matthew Honnibal 69bcacdc09 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-18 20:47:13 +02:00
Matthew Honnibal de7f3509d2 Compile CFile, for vector loading 2017-08-18 20:46:41 +02:00
Matthew Honnibal 426f84937f Resolve conflicts when merging new beam parsing stuff 2017-08-18 13:38:32 -05:00
Matthew Honnibal 60d8111245 Require thinc 6.8.1 2017-08-15 03:12:26 -05:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal b353e4d843 Work on parser beam training 2017-08-12 14:47:45 -05:00
ines 495e042429 Add entry point-style auto alias for "spacy"
Simplest way to run commands as spacy xxx instead of python -m spacy
xxx, while avoiding environment conflicts
2017-08-09 12:17:30 +02:00
Matthew Honnibal ff7418b0d9 Update requirements 2017-07-25 18:58:15 +02:00
Matthew Honnibal b4cdd05466 Add vectors.pyx in setup 2017-06-05 12:45:29 +02:00
Matthew Honnibal c811790095 Register vectors.pyx in setup 2017-06-05 12:32:22 +02:00
ines 152dc018a6 Remove syntax iterators from setup.py 2017-06-05 12:30:22 +02:00
Matthew Honnibal a4dcc96c54 Require thinc bugfix 2017-06-05 04:02:52 -05:00
ines 71954d5fe7 Update Thinc version 2017-06-03 10:32:53 +02:00
ines f45cd174bf Update Thinc version 2017-06-02 18:48:16 +02:00
Matthew Honnibal ae8010b526 Move weight serialization to Thinc 2017-06-01 02:56:12 -05:00
Matthew Honnibal 2e364f7ecd Require msgpack 2017-05-29 13:47:29 +02:00
ines 3cc6fe1484 Add pip to requirements.txt and setup.py 2017-05-17 12:04:03 +02:00
Matthew Honnibal 48de4ed49f Require thinc 6.6, and compile the nn_parser module 2017-05-14 01:20:28 +02:00
Matthew Honnibal 825c6403d8 Remove serializer 2017-05-09 17:28:30 +02:00
ines 564939391a Remove spacy.orth 2017-05-09 01:21:47 +02:00
ines 229b8c3974 Tidy up 2017-05-07 18:36:35 +02:00
ines a793174ae9 Use setuptools.find_packages() 2017-05-03 20:11:02 +02:00
Yasuaki Uechi c8f83aeb87 Add basic japanese support 2017-05-03 13:56:21 +09:00
Ines Montani 7da9cefd25 Merge pull request #1022 from luvogels/master
Initial support for Norwegian Bokmål
2017-04-27 11:16:06 +02:00
Ines Montani 417f430d23 Relax version constraint 2017-04-20 15:39:24 +02:00
Gyorgy Orosz 4a06a2572c Using ftfy for handling broken encoded strings. 2017-04-20 13:34:51 +02:00
luvogels ff900ffd7c Update setup.py
added nb
2017-04-19 21:02:26 +02:00
Matthew Honnibal e482c369eb Package converters module 2017-04-07 18:51:48 +02:00
Matthew Honnibal cc24b6d8d5 Fix setup.py 2017-04-07 17:53:22 +02:00
Matthew Honnibal eedafd8d82 Fix regex version pin 2017-04-07 17:47:11 +02:00
ines c691caa9d3 Fix requests version 2017-04-07 17:35:35 +02:00
Matthew Honnibal a001365c42 Require regex library 2017-04-07 15:43:34 +02:00
ines 7e4befec88 Add Hebrew to init and setup.py 2017-03-29 10:34:57 +02:00
Matthew Honnibal 9c17fb472f Add tag for Python 3.6 compatibility 2017-03-19 01:40:24 +01:00
Matthew Honnibal 5941fb9e92 Make spacy/data a package 2017-03-18 20:04:22 +01:00
Matthew Honnibal aa8ff9257f Add spacy.en.lemmatizer to setup.py 2017-03-18 19:02:33 +01:00
Matthew Honnibal afb94e5702 Add cli to setup.py 2017-03-18 19:00:39 +01:00
ines 387e34a3c5 Update plac version in requirements and setup 2017-03-18 15:14:02 +01:00
ines 4c53eed35a Remove sputnik from dependencies and docs 2017-03-15 17:39:25 +01:00
ines b62322d602 Add requests to requirements 2017-03-15 17:39:08 +01:00
Matthew Honnibal cb39b6e337 Require recent thinc 2017-03-11 12:45:22 -06:00
Matthew Honnibal 93ab888d1d Require recent preshed 2017-03-11 12:33:56 -06:00
Matthew Honnibal 0ed2afde89 Compile beam parser 2017-03-10 11:22:22 -06:00
ines ffe0f0c6c4 Add dill to requirements 2017-03-08 14:11:54 +01:00
Aniruddha Adhikary 5a4fc09576 add basic Bengali support 2017-02-28 07:48:37 +06:00
Matthew Honnibal c744ce4b6d Fix bad change to cythonize.py script, re subprocess call 2017-02-16 19:01:25 +01:00
Matthew Honnibal 0836cbe064 Pass shell to cythonize.py. See Issue #791 2017-02-17 01:06:06 +11:00
Michael Wallin 73f66ec570 Add preliminary support for Finnish 2017-02-04 13:54:10 +02:00
Raphaël Bournhonesque 0c2e5539ce Specify version number for ujson and plac
The required version was specified for plac in requirements.txt but not in setup.py, which could cause a conflicting version error.
Similarly, set the version of ujson in requirements.txt to be the same as in setup.py
2017-01-28 18:38:14 +01:00
Matthew Honnibal 48c712f1c1 Merge branch 'master' of ssh://github.com/explosion/spaCy 2017-01-16 13:18:06 +01:00
Matthew Honnibal d4e6d4c1c4 Use new thinc 2017-01-16 13:17:14 +01:00
Ines Montani a308703f47 Remove old tests 2017-01-13 01:34:48 +01:00
Ines Montani f8803808ce Remove old unused tests and conftest files 2017-01-12 15:09:05 +01:00
Ines Montani 26d018d874 Add tests for StringStore 2017-01-12 15:07:31 +01:00
Ines Montani ffcaba9017 Remove old and/or redundant tests 2017-01-12 02:10:18 +01:00
Ines Montani 33800c9367 Rename "tokens" tests to "doc" 2017-01-11 18:59:01 +01:00
Matthew Honnibal c9fdd9917c Require older thinc 2017-01-09 10:12:41 -06:00
Matthew Honnibal 7108ad9d80 Require thinc 6.1 2017-01-09 14:37:00 +01:00
Matthew Honnibal e4862d1dab Merge branch 'develop' 2017-01-09 13:36:01 +01:00
Ines Montani d87ca84028 Remove old website example tests from setup.py 2017-01-08 22:42:54 +01:00
Matthew Honnibal af81ac8bb0 Use thinc 6.0 2016-12-29 11:58:42 +01:00
Gyorgy Orosz 35aa54765d Hungarian module is exposed in spacy. 2016-12-21 20:45:36 +01:00
Magnus Burton db5a077d2b Initial commit for Swedish 2016-12-20 11:05:06 +01:00
Matthew Honnibal 0c7720e162 Remove unit and integration test packages 2016-12-19 00:26:56 +01:00
Matthew Honnibal 6c0c43c267 Add comment 2016-12-19 00:20:16 +01:00
Matthew Honnibal b2cebdcca7 List more test packages in the setup.py 2016-12-19 00:15:11 +01:00
Matthew Honnibal 97521c95b3 List the language_data package in the setup.py 2016-12-19 00:14:09 +01:00
dafnevk d8c7ac203a Added nl module for dutch 2016-11-24 16:39:49 +01:00
Matthew Honnibal 36bcd46244 Integrate patch from @mikepb re building OpenMP-supporting wheels for macOS / OSX. I'm running blind on this, so this commit might not be 100%. Rollback if there are any problems. See Issue #267. 2016-11-06 11:58:50 +01:00
Matthew Honnibal bc8d04abc0 Package alpha es, fr, it and pt directories. 2016-11-04 20:02:53 +01:00
Adam Ever Hadani 452b766d82 added ujson dependency to setup.py 2016-10-20 14:57:18 -07:00
Matthew Honnibal b5a74f8ad2 Don't automatically include a data/ directory. 2016-10-20 20:50:32 +02:00
Matthew Honnibal 811dc4da75 Fix setup.py script 2016-10-19 00:27:57 +02:00
Matthew Honnibal 818dc83e26 Fix encoding error in setup.py 2016-10-19 00:05:53 +02:00
Matthew Honnibal 509b30834f Add a pipeline module, to collect and wrap processes for annotation 2016-10-16 01:47:12 +02:00
Matthew Honnibal 53d5bd62ee Add the data/ directory as package data 2016-10-15 14:34:33 +02:00
Matthew Honnibal 2f998f8ed0 Require pathlib 2016-10-13 14:19:57 +02:00
Matthew Honnibal 7c5fe84b80 Require older preshed, for thinc compatibility. 2016-10-09 12:25:53 +02:00
Matthew Honnibal d61feffe24 Require new preshed 2016-09-30 18:41:01 +02:00
Matthew Honnibal 24337175df * Register zh package in setup.py 2016-05-03 14:36:59 +02:00
Henning Peters 2bf34687ea add stdint.h fallback (vs 2008) 2016-04-28 22:10:43 +02:00
Henning Peters bb3238bcdd pin numpy to >=1.7, ship headers 2016-04-19 19:50:42 +02:00
Henning Peters 6215272786 remove ujson as default non-dev dependency (still works as fallback if installed), because ujson doesn't ship wheels 2016-04-12 11:28:07 +02:00
Henning Peters 5f699883dd make openmp on windows optional 2016-04-12 10:12:57 +02:00
SJ 91b3f1c12f Enable OpenMP compiler option for MSVC
Enable OpenMP compiler option for MSVC to support Multi-Threading for nlp.pipe()
2016-04-09 15:22:17 -07:00
Henning Peters 29ad621825 add de 2016-04-08 14:52:29 +02:00
Matthew Honnibal 872695759d Merge pull request #306 from wbwseeker/german_noun_chunks
add German noun chunk functionality
2016-04-08 00:54:24 +10:00
Wolfgang Seeker 5e2e8e951a add baseclass DocIterator for iterators over documents
add classes for English and German noun chunks

the respective iterators are set for the document when created by the parser
as they depend on the annotation scheme of the parsing model
2016-03-16 15:53:35 +01:00
Henning Peters 54f3447b5f cleanup 2016-03-14 01:46:33 +01:00
Henning Peters 1fe29c6919 cleanup 2016-03-13 18:12:32 +01:00
Henning Peters 49f499ca1c cleanup 2016-03-12 14:30:24 +01:00
Henning Peters 5701686272 cleanup 2016-03-12 13:47:10 +01:00
Wolfgang Seeker 03fb498dbe introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Wolfgang Seeker d9312bc9ea add new files npchunks.{pyx,pxd} to hold noun phrase chunk generators 2016-03-09 16:18:48 +01:00
Henning Peters 5b3b3ebc8e upgrade to latest sputnik 2016-03-08 15:30:17 +01:00
Matthew Honnibal fcaa0ad7ce Merge pull request #280 from wbwseeker/german_parser
German parser
2016-03-04 03:27:42 +11:00
Wolfgang Seeker 3448cb40a4 integrated pseudo-projective parsing into parser
- nonproj.pyx holds a class PseudoProjectivity which currently holds
  all functionality to implement Nivre & Nilsson 2005's pseudo-projective
  parsing using the HEAD decoration scheme
- changed lefts/rights in Token to account for possible non-projective
  structures
2016-03-01 10:09:08 +01:00
Henning Peters 12d58a7099 remove text-unidecode dependency 2016-02-24 08:01:59 +01:00
Henning Peters 9cc4f8d5b3 avoid shadowing __name__ 2016-02-15 01:33:39 +01:00
Henning Peters 4c9e3c7911 upgrade sputnik, enforce data api via model version constraints 2016-02-14 16:03:17 +01:00
Henning Peters 3b5f1e753b py26 compatibility 2016-02-10 14:32:54 +01:00
Henning Peters c00dd43fe0 add sun data 2016-02-09 16:42:55 +01:00
Matthew Honnibal 860fd11e98 * Don't import include files --- use the repository 2016-02-06 23:59:47 +01:00
Matthew Honnibal 8bd16ce8f7 * Try to fix win32 compilation 2016-02-05 14:43:52 +01:00
Matthew Honnibal add8f07f61 * Conditionally link against openmp, on not-darwin 2016-02-05 12:19:51 +01:00
Matthew Honnibal c9aa91041d * Don't expect openmp in options 2016-02-02 13:50:25 +01:00
Matthew Honnibal 490ba65398 * Use openmp in parser 2016-02-01 03:08:42 +01:00
Matthew Honnibal 9c34ca9e5d * Add _stack to mod_names 2016-02-01 03:00:53 +01:00
Matthew Honnibal bc0f0d284c * Require different thinc version 2016-01-30 20:29:24 +01:00
Henning Peters 65aeac24cb remove package version constraint 2016-01-21 17:40:51 +01:00
Henning Peters 211913d689 add about.py, adapt setup.py 2016-01-15 18:57:01 +01:00
Henning Peters ccd87ad7fb add default_model to about 2016-01-15 18:12:01 +01:00
Henning Peters 780cb847c9 add default_model to about 2016-01-15 18:07:15 +01:00
Henning Peters 788f734513 refactored data_dir->via, add zip_safe, add spacy.load() 2016-01-15 18:01:02 +01:00
Henning Peters bc229790ac integrate with sputnik 2016-01-13 19:46:17 +01:00
Matthew Honnibal e38205a838 * Pin versions to ranges, to escape version lock 2015-12-31 02:09:55 +01:00
Henning Peters 1c4352c42e bump version 2015-12-28 13:53:26 +01:00
Henning Peters a404bfec38 bump preshed version 2015-12-22 22:38:25 +01:00
Henning Peters 46fe3a7327 bump thinc version 2015-12-22 13:21:24 +01:00
Henning Peters 1643e63c31 bump preshed version 2015-12-22 11:23:25 +01:00
Henning Peters 4a1d843682 bump murmurhash version 2015-12-21 21:59:11 +01:00
Henning Peters 74dc02a0e6 fix windows readme 2015-12-21 21:58:53 +01:00
Henning Peters c17ce6c119 (re-)include cython sources, murmurhash header discovery 2015-12-21 12:40:44 +01:00
Henning Peters b667020e81 refactor setup.py 2015-12-13 23:39:29 +01:00
Henning Peters 4f4b1d8f3d refactor setup.py 2015-12-13 23:32:23 +01:00
Henning Peters eaadca2bf2 get buildbot running 2015-12-13 14:13:46 +01:00
Henning Peters 73674a4afb try using system-wide headers 2015-12-13 12:51:23 +01:00
Henning Peters b2f66f7b8d try using system-wide headers 2015-12-13 12:45:30 +01:00
Henning Peters 63d74ae8f3 try using system-wide headers 2015-12-13 12:41:46 +01:00
Henning Peters 92fabd0114 wrap virtualenv around cythonize 2015-12-13 12:32:22 +01:00
Henning Peters ac318b568c new approach to dependency headers 2015-12-13 11:49:17 +01:00
Matthew Honnibal 65413ad7b3 Merge pull request #186 from henningpeters/master
website build was broken for me, fixed it
2015-11-29 15:36:52 +11:00
Henning Peters abe6162e7b avoid redirect 2015-11-24 20:01:43 +01:00
Henning Peters 4e98ea4e41 bump version 2015-11-21 19:04:57 +01:00
Matthew Honnibal d8c52560d1 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-11-19 11:00:11 +01:00
Matthew Honnibal 44e563d4e5 * Pin version of murmurhash 2015-11-19 10:59:51 +01:00
Matthew Honnibal 73d47c3010 Merge pull request #185 from henningpeters/sputnik
integrate sputnik
2015-11-19 20:59:09 +11:00
Matthew Honnibal 1e166eb9cd * Upgrade spacy version 2015-11-18 17:42:56 +01:00
Henning Peters 919a4f0b04 change data path, add repository 2015-11-18 11:40:46 +01:00
Henning Peters 12de895e60 fix version 2015-11-15 16:38:16 +01:00
Matthew Honnibal 6dd37c5ee4 * Fix requirement of preshed 2015-11-08 18:09:21 +01:00
Matthew Honnibal f9d20b1318 * Require updated thinc 2015-11-08 21:32:21 +11:00
Matthew Honnibal 3c162dcac3 * Refactor away from the _ml module, to use thinc 4.0. Still some work needs to be done, e.g. to add __reduce__ to the models, more testing, etc. 2015-11-07 03:24:30 +11:00
Matthew Honnibal c339783bbe * Fix reference to tests.span in setup 2015-11-07 03:23:14 +11:00
Matthew Honnibal 802ad3d71a * Avoid compiling theano module for now 2015-11-06 00:24:43 +11:00
Matthew Honnibal 3ddea19b2b * Rename spans.pyx to span.pyx 2015-11-04 00:14:40 +11:00
Matthew Honnibal 9482d616bc * Rename spans.pyx to span.pyx 2015-11-03 23:51:05 +11:00
Matthew Honnibal f81389abe0 * Pin to specific cymem, preshed and thinc versions. 2015-11-03 23:12:13 +11:00
Matthew Honnibal 7adef3f831 * Increment version 2015-11-03 07:58:59 +01:00
Matthew Honnibal 64531d5a3a * Define package_data in one place 2015-11-03 17:07:43 +11:00
Matthew Honnibal 5ca31e05fb * Prune down package data, as models are distributed entirely within the data download. 2015-11-03 13:30:37 +11:00
Matthew Honnibal f56209ef2e * Update requirements 2015-11-03 02:40:01 +11:00
Matthew Honnibal 09e0b15629 * Package tests, for distribution on PyPI 2015-10-26 00:30:33 +11:00
Matthew Honnibal b0ba534d4a * Fix license descriptor in setup.py 2015-10-26 00:16:37 +11:00
Matthew Honnibal 9ee1ddab7e * Increment version 2015-10-23 02:04:48 +02:00
Matthew Honnibal 108138366f * Ensure .pxd files are packaged 2015-10-23 01:57:03 +02:00
Matthew Honnibal 2348a08481 * Load/dump strings with a json file, instead of the hacky strings file we were using. 2015-10-22 21:13:03 +11:00
Matthew Honnibal 579670e4c7 * Fix uget 2015-10-19 17:23:33 +11:00
Matthew Honnibal 984775e5e2 * Fix setup of uget 2015-10-19 17:19:05 +11:00
Matthew Honnibal e25adce54d Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-10-19 17:17:33 +11:00
Matthew Honnibal 382cbc8cab * Add uget to setup.py 2015-10-19 17:15:40 +11:00
Matthew Honnibal a43777cef8 * Inc version 2015-10-19 07:46:42 +02:00
Henning Peters bfde91fa49 add custom download tool (uget), replace wget with uget 2015-10-18 12:35:04 +02:00
Matthew Honnibal fc261195f7 * Fix compilation for OSX 2015-10-18 17:19:07 +11:00
Matthew Honnibal 710e8fb168 * Fix platform condition re Issue #138 2015-10-15 20:46:08 +11:00
maxirmx 1b8fd329b8 Merge remote-tracking branch 'refs/remotes/honnibal/master' 2015-10-13 11:28:17 +03:00
Matthew Honnibal d74a1e51d7 * Add cloudpickle requirement 2015-10-13 19:05:20 +11:00
maxirmx 3dbec0902f Merge remote-tracking branch 'refs/remotes/honnibal/master'
Conflicts -- pushing preshed 0.42
	requirements.txt
	setup.py
2015-10-13 10:16:16 +03:00
maxirmx 237db7f519 Appveyor build #5
Added Wordnet download
2015-10-13 10:11:56 +03:00
Matthew Honnibal 41cbbdefe3 Merge branch 'attrs' 2015-10-13 05:03:25 +02:00
Matthew Honnibal 1ca1beff4b * Allow preshed v0.42 in setup.py 2015-10-13 13:55:50 +11:00