Commit Graph

16 Commits

Author SHA1 Message Date
Ines Montani dd153b2b33 Simplify helper (see #3681) [ci skip] 2019-05-06 15:13:10 +02:00
Ines Montani f8fce6c03c Fix typo (see #3681) 2019-05-06 15:02:11 +02:00
Ines Montani f2a56c1b56 Rewrite example to use Retokenizer (resolves #3681)
Also add helper to filter spans
2019-05-06 14:51:18 +02:00
Ines Montani 399987c216 Test and update examples [ci skip] 2019-03-16 14:15:49 +01:00
Ines Montani 5d0b60999d Merge branch 'master' into develop 2019-02-07 20:54:07 +01:00
mak 8fc6aaf134 Updated main to make use of lang variable (#3220)
Updated main to make use of language variable when initializing spacy.
2019-01-31 23:43:22 +01:00
Ines Montani f37863093a 💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003)
Remove hacks and wrappers, keep code in sync across our libraries and move spaCy a few steps closer to only depending on packages with binary wheels 🎉

See here: https://github.com/explosion/srsly

    Serialization is hard, especially across Python versions and multiple platforms. After dealing with many subtle bugs over the years (encodings, locales, large files) our libraries like spaCy and Prodigy have steadily grown a number of utility functions to wrap the multiple serialization formats we need to support (especially json, msgpack and pickle). These wrapping functions ended up duplicated across our codebases, so we wanted to put them in one place.

    At the same time, we noticed that having a lot of small dependencies was making maintenance harder, and making installation slower. To solve this, we've made srsly standalone, by including the component packages directly within it. This way we can provide all the serialization utilities we need in a single binary wheel.

    srsly currently includes forks of the following packages:

        ujson
        msgpack
        msgpack-numpy
        cloudpickle

* WIP: replace json/ujson with srsly

* Replace ujson in examples

Use regular json instead of srsly to make code easier to read and follow

* Update requirements

* Fix imports

* Fix typos

* Replace msgpack with srsly

* Fix warning
2018-12-03 01:28:22 +01:00
Ines Montani 40b57ea4ac Format example 2018-12-02 04:28:34 +01:00
Ines Montani 45798cc53e Auto-format examples 2018-12-02 04:26:26 +01:00
himkt 57311d5d47 replace janome with mecab in the documentation and the test (#2415)
* Add links to Reddit data (see #2401)

* replace janome with mecab in the documentation and the test

* add the assignment
2018-06-11 00:33:13 +02:00
Ines Montani 3f2e3cbd27 Add links to Reddit data (see #2401) 2018-05-31 16:22:43 +02:00
ines 1a38575de3 Make example Python 2 compatible (see #1617) 2017-11-20 13:57:51 +01:00
ines 173b1551af Update examples 2017-11-07 01:22:30 +01:00
ines 1b1c9105b4 Update example compatibility statements 2017-11-07 01:11:45 +01:00
ines 4b196fdf7f Fix formatting 2017-11-01 00:43:22 +01:00
ines daed7ff8fe Update information extraction examples 2017-10-26 18:46:11 +02:00