Commit Graph

7336 Commits

Author SHA1 Message Date
Sofie Van Landeghem de6a32315c
debug-model script (#5749)
* adding debug-model to print the internals for debugging purposes

* expand debug-model script with 4 stages: before, init, train, predict

* avoid enforcing to have a seed in the train script

* small fixes
2020-07-10 19:47:53 +02:00
Ines Montani a3667394b4 Integrate with latest Thinc and config overrides 2020-07-10 19:47:05 +02:00
Ines Montani 5cfc3edcaa Update CLI tests 2020-07-10 18:21:01 +02:00
Ines Montani 3583ea84d8 Update arg parsing 2020-07-10 18:20:52 +02:00
Ines Montani 73332ddb67 Update CLI commands to use one shared util file 2020-07-10 17:57:40 +02:00
Ines Montani 240e0a62ca Update with WIP 2020-07-10 13:31:27 +02:00
Ines Montani a60562f208
Update project CLI hashes, directories, skipping (#5741)
* Update project CLI hashes, directories, skipping

* Improve clone success message

* Remove unused context args

* Move project-specific utils to project utils

The hashing/checksum functions may not end up being general-purpose functions and are more designed for the projects, so they shouldn't live in spacy.util

* Improve run help and add workflows

* Add note re: directory checksum speed

* Fix cloning from subdirectories and output messages

* Remove hard-coded dirs
2020-07-09 23:51:18 +02:00
Adriane Boyd 0a62098c5f
Fix lemmatizer is_base_form for python2.7 (#5734)
* Fix lemmatizer init args for python2.7

* Move English is_base_form to a class method

* Skip test pickling PhraseMatcher for python2
2020-07-09 22:11:24 +02:00
Adriane Boyd 923affd091
Remove is_base_form from French lemmatizer (#5733)
Remove English-specific is_base_form from French lemmatizer.
2020-07-09 22:11:13 +02:00
Matthew Honnibal 552d1ad226 Hack at tests 2020-07-09 20:25:51 +02:00
Matthew Honnibal eb064c59cd Try to fix textcat test 2020-07-09 20:24:53 +02:00
Ines Montani 018319a640 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-09 19:44:41 +02:00
Ines Montani 05e182e421 Update CLI args and docstrings 2020-07-09 19:44:28 +02:00
Sofie Van Landeghem dd207a28be
cleanup components API (#5726)
* add keyword separator for update functions and drop unused "state"

* few more Example tests and various small fixes

* consistently return losses after update call

* eliminate unused tensors field across pipe components

* fix name

* fix arg name
2020-07-09 19:43:39 +02:00
Adriane Boyd ac4297ee39
Minor refactor to conversion of output docs (#5718)
Minor refactor of conversion of docs to output format to avoid
duplicate conversion steps.
2020-07-09 19:42:32 +02:00
Sofie Van Landeghem c1ea55307b
Fixing reproducible training (#5735)
* Add initial reproducibility tests

* failing test for default_text_classifier (WIP)

* track trouble to underlying tok2vec layer

* add regression test for Issue 5551

* tests go green with https://github.com/explosion/thinc/pull/359

* update test

* adding fixed seeds to HashEmbed layers, seems to fix the reproducibility issue

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-07-09 19:39:31 +02:00
Matthew Honnibal 1827f22f56 Set version to v3.0.0a3 2020-07-09 19:38:04 +02:00
Matthw Honnibal 7010f1a2be Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-09 19:34:11 +02:00
Matthw Honnibal 77af0a6bb4 Offer option of padding-sensitive batching 2020-07-09 14:50:20 +02:00
Matthw Honnibal 3a7f275c02 Add extra batch util 2020-07-09 14:38:41 +02:00
Matthw Honnibal eb0798c421 Add __len__ method for Example 2020-07-09 14:38:26 +02:00
Ines Montani 8f9552d9e7
Refactor project CLI (#5732)
* Make project command a submodule

* Update with WIP

* Add helper for joining commands

* Update docstrings, formatting and types

* Update assets and add support for copying local files

* Fix type

* Update success messages
2020-07-09 01:42:51 +02:00
Adriane Boyd ad15499b3b
Fix get_loss for values outside of labels in senter (#5730)
* Fix get_loss for None alignments in senter

When converting the `sent_start` values back to `SentenceRecognizer`
labels, handle `None` alignments.

* Handle SENT_START as -1

Handle SENT_START as -1 (or -1 converted to uint64) by treating any
values other than 1 the same as 0 in `SentenceRecognizer.get_loss`.
2020-07-09 01:41:58 +02:00
Matthw Honnibal 1b20ffac38 batch_by_words by default 2020-07-08 21:37:06 +02:00
Matthw Honnibal 93e50da46a Remove auto 'set_annotation' in training to address GPU memory 2020-07-08 21:36:51 +02:00
Matthw Honnibal fb8a5967c1 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-08 15:27:50 +02:00
Ines Montani 0a3d41bb1d
Deprecate model shortcuts and simplify download (#5722) 2020-07-08 14:00:07 +02:00
Adriane Boyd c9f0f75778
Update get_loss for senter and morphologizer (#5724)
* Update get_loss for senter

Update `SentenceRecognizer.get_loss` to keep it similar to `Tagger`.

* Update get_loss for morphologizer

Update `Morphologizer.get_loss` to keep it similar to `Tagger`.
2020-07-08 13:59:28 +02:00
Matthw Honnibal ca989f4cc4 Improve cutting logic in parser 2020-07-08 11:27:54 +02:00
Matthw Honnibal 42e1109def Support option to not batch by number of words 2020-07-08 11:26:54 +02:00
Ines Montani 8cb7f9ccff
Improve assets and DVC handling (#5719)
* Improve assets and DVC handling

* Remove outdated comment [ci skip]
2020-07-07 20:51:50 +02:00
Sofie Van Landeghem a39a110c4e
Few more Example unit tests (#5720)
* small fixes in Example, UX

* add gold tests for aligned_spans and get_aligned_parse

* sentencizer unnecessary
2020-07-07 18:46:00 +02:00
Matthw Honnibal 433dc3c9c9 Simplify PrecomputableAffine slightly 2020-07-07 17:22:47 +02:00
Matthw Honnibal a4164f67ca Don't normalize gradients 2020-07-07 17:21:58 +02:00
Matthw Honnibal 8177f25b6c Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-07 17:21:10 +02:00
Ines Montani fa00a85828
Merge pull request #5715 from explosion/chore/tidy-regression-tests 2020-07-07 11:22:07 +02:00
Matthw Honnibal d1fd3438c3 Add dropout to parser hidden layer 2020-07-07 01:38:15 +02:00
Matthw Honnibal f25761e513 Dont randomize cuts in parser 2020-07-06 17:51:25 +02:00
Matthw Honnibal 709fc5e4ad Clarify dropout and seed in Tok2Vec 2020-07-06 17:50:21 +02:00
Matthew Honnibal 19d42f42de Set version to v3.0.0a2 2020-07-06 17:43:12 +02:00
Matthew Honnibal cc477be952
Improve gold-standard alignment (#5711)
* Remove previous alignment

* Implement better alignment, using ragged data structure

* Use pytokenizations for alignment

* Fixes

* Fixes

* Fix overlapping entities in alignment

* Fix align split_sents

* Update test

* Commit align.py

* Try to appease setuptools

* Fix flake8

* use realistic entities for testing

* Update tests for better alignment

* Improve alignment heuristic

Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
2020-07-06 17:39:31 +02:00
Mike Izbicki 7a2ca00794
fix bug in Korean language, resulting in 100x speedup by reducing overhead of mecab (#5701)
* speed up Korean nlp 100x by stopping mecab from reloading on each doc

* add contributor agreement

* rename variables to improve code readability
2020-07-06 17:03:33 +02:00
Ines Montani b6deef80f8 Fix class so pickling works as expected 2020-07-06 16:43:45 +02:00
Ines Montani fa261d09e8 Add alternative CLI option 2020-07-06 15:57:38 +02:00
Adriane Boyd c67fc6aa5b
Make `docs_to_json` backwards-compatible with v2 (#5714)
* In `spacy convert -t json` output the JSON docs wrapped in a list

* Add back token-level `ner` alongside the doc-level `entities`
2020-07-06 14:15:00 +02:00
Ines Montani 5b7b2a498d Tidy up and merge regression tests 2020-07-06 14:05:59 +02:00
Ines Montani 412dbb1f38
Remove dead and/or deprecated code (#5710)
* Remove dead and/or deprecated code

* Remove n_threads

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-07-06 13:06:25 +02:00
Sofie Van Landeghem fcbf899b08
Feature/example only (#5707)
* remove _convert_examples

* fix test_gold, raise TypeError if tuples are used instead of Example's

* throwing proper errors when the wrong type of objects are passed

* fix deprecated format in tests

* fix deprecated format in parser tests

* fix tests for NEL, morph, senter, tagger, textcat

* update regression tests with new Example format

* use make_doc

* more fixes to nlp.update calls

* few more small fixes for rehearse and evaluate

* only import ml_datasets if really necessary
2020-07-06 13:02:36 +02:00
graue70 9860b8399e
Fix typo in test function docstring (#5696) 2020-07-05 15:49:06 +02:00
Matthew Honnibal 3e78e82a83
Experimental character-based pretraining (#5700)
* Use cosine loss in Cloze multitask

* Fix char_embed for gpu

* Call resume_training for base model in train CLI

* Fix bilstm_depth default in pretrain command

* Implement character-based pretraining objective

* Use chars loss in ClozeMultitask

* Add method to decode predicted characters

* Fix number characters

* Rescale gradients for mlm

* Fix char embed+vectors in ml

* Fix pipes

* Fix pretrain args

* Move get_characters_loss

* Fix import

* Fix import

* Mention characters loss option in pretrain

* Remove broken 'self attention' option in pretrain

* Revert "Remove broken 'self attention' option in pretrain"

This reverts commit 56b820f6af.

* Document 'characters' objective of pretrain
2020-07-05 15:48:39 +02:00
Matthw Honnibal 3f6f087113 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-04 23:52:12 +02:00
Matthw Honnibal 5642507823 Fix has_unknown_spaces in Doc.copy 2020-07-04 23:52:02 +02:00
Matthw Honnibal 8870a6ded7 Specify seeds in HashEmbed 2020-07-04 23:51:49 +02:00
Ines Montani 37c3bb35e2 Auto-format 2020-07-04 16:25:34 +02:00
Ines Montani abd173937f Auto-format and update URL 2020-07-04 14:23:44 +02:00
Ines Montani 99aff16d60 Make argument shortcut consistent 2020-07-04 14:23:32 +02:00
Matthew Honnibal 2bd1bf81f1
Refactor pretrain and support character-based objective for v3 (#5706)
* Start adding character-based stuff

* Start adding character-based objective

* Start adding character-based stuff

* Start adding character-based objective

* Remove outdated comment

* Update pretraining models

* Add/fix character-based multi-task models

* Refactor pretrain and support character-based objective

* Update pretrain config

* Remove unused

* Fix flake8 errors

* Clean up imports

* Format

* Format

* Update Thinc version

* Raise error if vectors objective but no vectors
2020-07-03 17:57:28 +02:00
Ines Montani 84fb3a3fb3 Auto-format and fix tuple 2020-07-03 15:20:10 +02:00
Adriane Boyd 86d13a9fb8
Set version to 2.3.1 (#5705) 2020-07-03 13:38:41 +02:00
Matthew Honnibal e1b3e8ee11 Set version to v3.0.0a1 2020-07-03 13:21:08 +02:00
Matthew Honnibal a902b5f217
Record whether Doc objects are built from known spacing (#5697)
* Tell convert CLI to store user data for Doc

* Remove assert

* Add has_unknown_spaces flag on Doc

* Do not tokenize docs with unknown spaces in Corpus

* Handle conversion of unknown spaces in Example

* Fixes

* Fixes

* Draft has_known_spaces support in DocBin

* Add test for serialize has_unknown_spaces

* Fix DocBin serialization when has_unknown_spaces

* Use serialization in test
2020-07-03 12:58:16 +02:00
Adriane Boyd abad56db7d
Add conllu2docs converter (#5704)
Add conllu2docs converter adapted from conllu2json converter
2020-07-03 12:54:32 +02:00
Jan Jessewitsch e4dcac4a4b
Merging multiple docs into one (#5032)
* Add static method to Doc to allow merging of multiple docs.

* Add error description for the error that occurs if docs with different
vocabs (from different languages) are merged in Doc.from_docs().

* Add test for Doc.from_docs() implementation.

* Fix using numpy's concatenate in Doc.from_docs.

* Replace typing's type annotations in from_docs.

* Simply remove type annotations in from_docs.

* Add documentation for Doc.from_docs to api.

* Simplify from_docs, its test and the api doc for codebase consistency.

* Fix merging of Doc objects that end with whitespaces (Achieved by simply not setting the SPACY attribute on whitespace tokens). Remove two unnecessary imports of attributes.

* Add merging of user data from Doc objects in from_docs. Add user data test case to corresponding test. Add applicable warning messages.

* Fix incorrect setting of tokens idx by using concatenated spaces (again). Add test case to corresponding test.

* Add MORPH to attrs

* Update warnings calls

* Remove out-dated error from merge

* Rename space_delimiter to ensure_whitespace

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2020-07-03 11:32:42 +02:00
Sofie Van Landeghem 41b65fd0f8
fix to pretrain script (#5699)
* fix to pretrain script

* remove unnecessary import
2020-07-02 21:48:01 +02:00
Adriane Boyd a723fa02a1
DocBin: add version number, missing attributes and strings (#5685)
* Add version number to DocBin

Add a version number to DocBin for future use.

* Add POS to all attributes in DocBin

* Add morph string to strings in DocBin

* Update DocBin API

* Add string for ENT_KB_ID in DocBin
2020-07-02 17:41:50 +02:00
Adriane Boyd a77c4c3465
Add strings and ENT_KB_ID to Doc serialization (#5691)
* Add strings for all writeable Token attributes to `Doc.to/from_bytes()`.
* Add ENT_KB_ID to default attributes.
2020-07-02 17:11:57 +02:00
Adriane Boyd 971826a96d
Include git commit in package and model meta (#5694)
* Include git commit in package and model meta

* Rewrite to read file in setup

* Fix file handle
2020-07-02 17:10:27 +02:00
Ines Montani d36632553a
Merge pull request #5688 from explosion/remove-deprecated
Remove deprecated methods: Doc.print_tree, Doc.merge, Span.merge
2020-07-02 15:10:30 +02:00
Ines Montani 8a5b9a6d5f
Merge pull request #5693 from svlandeg/bugfix/nel-v3 2020-07-02 14:45:46 +02:00
Ines Montani ee8a830248
Merge pull request #5687 from svlandeg/bugfix/init-model
Fixing init_model
2020-07-02 14:10:28 +02:00
svlandeg 04ed4d60a8 raise error when links are not aligned to tokens 2020-07-02 13:57:35 +02:00
svlandeg f503817623 fix parsing entity links in new gold format 2020-07-02 13:48:11 +02:00
Ines Montani 60c2695131 Remove deprecated methods 2020-07-01 22:33:39 +02:00
Ines Montani fe4cfd0632 Start updating website for v3 [ci skip] 2020-07-01 21:26:39 +02:00
svlandeg a30bc77415 bugfixing prune_vectors and vectors_loc 2020-07-01 21:00:47 +02:00
Matthw Honnibal 94a0cf46fd Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-01 18:45:45 +02:00
Matthw Honnibal 6a0a27e5c2 Fix max_steps 2020-07-01 18:08:14 +02:00
Ines Montani 8d90e44d74 Fix title 2020-07-01 15:38:01 +02:00
Ines Montani 8fb574900a Update parent package and version 2020-07-01 15:35:23 +02:00
Matthew Honnibal 0ada186dda Set version to v3.0.0.dev14 2020-07-01 15:31:04 +02:00
Matthw Honnibal cb51bb637b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-01 15:17:27 +02:00
Matthw Honnibal 7734cbc34d Set batch size in begin_training 2020-07-01 15:16:59 +02:00
Matthw Honnibal 1f7709e9a6 Improve max length check in corpus 2020-07-01 15:16:43 +02:00
Matthw Honnibal 2fa56484b2 Fix eval batch size 2020-07-01 15:16:25 +02:00
Matthw Honnibal c5d12d1a22 Allow batch size to be set for evaluation in spacy train 2020-07-01 15:04:36 +02:00
Matthw Honnibal f5532757a3 Filter out 0-length examples in Corpus 2020-07-01 15:02:37 +02:00
Ines Montani bc87ba97e0
Merge pull request #5681 from svlandeg/bugfix/exec-cwd 2020-07-01 14:13:19 +02:00
Matthw Honnibal 52338a07bb Set version to v3.0.0.dev13 2020-07-01 02:49:17 +02:00
Matthw Honnibal fa6d473390 Fix parser maxout_pieces=1 2020-07-01 02:48:58 +02:00
Matthw Honnibal 35af5819e0 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-07-01 01:03:39 +02:00
Matthw Honnibal 0d6edf5397 Clean up debug code in transition_system 2020-07-01 01:03:20 +02:00
Matthw Honnibal a1b6add4c8 Fix parser gold cutting and gradient normalization 2020-07-01 01:02:58 +02:00
Matthw Honnibal 8c5a88e777 Fix per-epoch shuffling 2020-07-01 01:02:35 +02:00
svlandeg a7d547c65e small fix 2020-06-30 21:56:17 +02:00
svlandeg 8eca7e995e add try-except to git commands to get an informative warning 2020-06-30 21:53:40 +02:00
Ines Montani b032943c34 Fix funny printing again 2020-06-30 21:33:41 +02:00
Matthw Honnibal d525552979 Fix efficiency of parser backprop_nonlinearity 2020-06-30 21:22:54 +02:00
Ines Montani d64644d9d1 Adjust auto-formatting 2020-06-30 20:36:30 +02:00
Ines Montani 6da3500728 Fix command substitution 2020-06-30 20:35:51 +02:00
svlandeg e7aff9c5fc bugfix exec usage in dvc.yaml 2020-06-30 18:51:20 +02:00
svlandeg 60f97bc519 add custom warning when run_command fails 2020-06-30 17:28:43 +02:00
svlandeg 39953c7c60 fix print_run_help with new arg order 2020-06-30 17:28:09 +02:00
svlandeg cd632d8ec2 move folder for exec argument one up 2020-06-30 17:19:36 +02:00
svlandeg 1ae6fa2554 move subcommand one place up as project_dir has default 2020-06-30 16:04:53 +02:00
svlandeg a46b76f188 use current working dir as default throughout 2020-06-30 15:39:24 +02:00
svlandeg b228111925 fix funny printing 2020-06-30 14:54:45 +02:00
Ines Montani 8e20505970 Resolve within working_dir context manager 2020-06-30 13:29:45 +02:00
Ines Montani 72175b5c60 Update project command 2020-06-30 13:17:26 +02:00
Ines Montani c5e31acb06 Make working_dir yield absolute cwd path 2020-06-30 13:17:14 +02:00
Ines Montani 3aca404735 Make run_command take string and list 2020-06-30 13:17:00 +02:00
Ines Montani 7584fdafec Fix typo 2020-06-30 12:59:13 +02:00
svlandeg 140c4896a0 split_command util function 2020-06-30 12:54:15 +02:00
Matthw Honnibal 57e09747dc Improve efficiency of get_oracle_sequences 2020-06-30 11:50:48 +02:00
Matthw Honnibal 233945bfe0 Fix init for padding 2020-06-30 11:50:24 +02:00
svlandeg d23be563eb remove redundant setting of no_args_is_help 2020-06-30 11:23:35 +02:00
svlandeg b311ce982f Merge remote-tracking branch 'upstream/develop' into fix/small-edits
# Conflicts:
#	spacy/cli/project.py
2020-06-30 11:17:31 +02:00
svlandeg 7e4cbda89a fix project_init for relative path 2020-06-30 11:09:53 +02:00
Matthw Honnibal 85ed5730a2 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-06-30 01:14:16 +02:00
Ines Montani e8033df81e Also handle python3 and pip3 2020-06-29 20:30:42 +02:00
Ines Montani c874dde66c Show help on "spacy project" 2020-06-29 20:11:34 +02:00
Ines Montani 1d2c646e57 Fix init and remove .dvc/plots 2020-06-29 20:07:21 +02:00
Matthw Honnibal 5bed6fc431 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-06-29 19:55:24 +02:00
svlandeg 1176783310 fix one more shlex.split 2020-06-29 18:37:42 +02:00
svlandeg ff233d5743 print details on error msg (e.g. PermissionError on specific file) 2020-06-29 18:22:33 +02:00
svlandeg 894b8e7ff6 throw warning (instead of crashing) when temp dir can't be cleaned 2020-06-29 18:16:39 +02:00
svlandeg efe7eb71f2 create subfolder in working dir 2020-06-29 17:46:08 +02:00
svlandeg 3487214ba1 fix shlex.split for non-posix 2020-06-29 17:45:47 +02:00
Ines Montani 126050f259 Improve asset fetching
Get all paths first and run dvc add once so it only shows one progress bar and one combined git command (if repo is git repo)
2020-06-29 16:55:24 +02:00
Ines Montani 7c08713baa Improve error messages 2020-06-29 16:54:47 +02:00
Ines Montani 24664efa23 Import project_run_all function 2020-06-29 16:54:19 +02:00
svlandeg f8dddeda27 print help msg when just calling 'project' without args 2020-06-29 16:38:15 +02:00
svlandeg bf43ebbf61 fix typos 2020-06-29 16:32:25 +02:00
Matthew Honnibal 67928036f2 Set version to v3.0.0.dev12 2020-06-29 14:45:43 +02:00
Matthew Honnibal 2d715451a2
Revert "Convert custom user_data to token extension format for Japanese tokenizer (#5652)" (#5665)
This reverts commit 1dd38191ec.
2020-06-29 14:34:15 +02:00
Sofie Van Landeghem 8d3c0306e1
refactor fixes (#5664)
* fixes in ud_train, UX for morphs

* update pyproject with new version of thinc

* fixes in debug_data script

* cleanup of old unused error messages

* remove obsolete TempErrors

* move error messages to errors.py

* add ENT_KB_ID to default DocBin serialization

* few fixes to simple_ner

* fix tags
2020-06-29 14:33:00 +02:00
Adriane Boyd 1dd38191ec
Convert custom user_data to token extension format for Japanese tokenizer (#5652)
* Convert custom user_data to token extension format

Convert the user_data values so that they can be loaded as custom token
extensions for `inflection`, `reading_form`, `sub_tokens`, and `lemma`.

* Reset Underscore state in ja tokenizer tests
2020-06-29 14:20:26 +02:00
Adriane Boyd 167df42cb6
Move lemmatizer is_base_form to language settings (#5663)
Move `Lemmatizer.is_base_form` to the language settings so that each
language can provide a language-specific method as
`LanguageDefaults.is_base_form`.

The existing English-specific `Lemmatizer.is_base_form` is moved to
`EnglishDefaults`.
2020-06-29 14:16:57 +02:00
Sofie Van Landeghem fc3cb1fa9e
NER align tests (#5656)
* one_to_many works better. misalignment doesn't yet.

* fix tests

* restore example

* xfail alignment tests
2020-06-29 13:59:17 +02:00
Matthew Honnibal 2d9604d39c Set version to v3.0.0.dev11 2020-06-29 13:56:46 +02:00
Matthw Honnibal da50473701 Tweak efficiency of arc_eager.set_costs 2020-06-29 12:17:41 +02:00
Ines Montani bac8a8d766 Merge branch 'feature/project-cli' into develop 2020-06-29 10:49:05 +02:00
Matthew Honnibal e14bf9decb Set version to v3.0.0.dev9 2020-06-28 23:58:10 +02:00
Matthew Honnibal 58c8f731bd Set version to v3.0.0.dev9 2020-06-28 23:53:14 +02:00
Ines Montani 569376e34e Replace curl with requests 2020-06-28 16:25:53 +02:00
Ines Montani dbe86b3453 Update project.py 2020-06-28 15:45:19 +02:00
Ines Montani dbfa292ed3 Output more stats in evaluate 2020-06-28 15:34:28 +02:00
Ines Montani 90b7fa8fed Run DVC command in project dir 2020-06-28 15:33:53 +02:00
Ines Montani 2f6ee0d018 Tidy up, document and add custom clone logic 2020-06-28 15:08:35 +02:00
Matthew Honnibal dc7a9be9f8 Merge branch 'feature/project-cli' of https://github.com/explosion/spaCy into feature/project-cli 2020-06-28 14:07:53 +02:00
Matthew Honnibal e08257d401 Add example of how to do sparse-checkout 2020-06-28 14:07:32 +02:00