Sofie Van Landeghem
6bfb1b3a29
Fix sparse checkout for 'spacy project' ( #6008 )
...
* exit if cloning fails
* UX
* rewrite http link to git protocol, don't use stdin
* fixes to sparse checkout
* formatting
2020-09-01 19:49:01 +02:00
Matthew Honnibal
4cce32f090
Fix tagger initialization
2020-09-01 16:38:34 +02:00
Matthew Honnibal
046c38bd26
Remove 'cleanup' of strings ( #6007 )
...
A long time ago we went to some trouble to try to clean up "unused"
strings, to avoid the `StringStore` growing in long-running processes.
This never really worked reliably, and I think it was a really wrong
approach. It's much better to let the user reload the `nlp` object as
necessary, now that the string encoding is stable (in v1, the string IDs
were sequential integers, making reloading the NLP object really
annoying.)
The extra book-keeping does make some performance difference, and the
feature is unused, so it's past time we killed it.
2020-09-01 16:12:15 +02:00
Ines Montani
70b226f69d
Support ignore marker in project document [ci skip]
2020-09-01 12:49:04 +02:00
Ines Montani
a4c51f0f18
Add v3 info to project docs [ci skip]
2020-09-01 12:36:21 +02:00
Ines Montani
ef9005273b
Update fill-config command and add silent mode [ci skip]
2020-09-01 12:07:04 +02:00
Matthew Honnibal
ec660e3131
Fix use_pytorch_for_gpu_memory
2020-09-01 00:41:38 +02:00
Adriane Boyd
9130094199
Prevent Tagger model init with 0 labels ( #5984 )
...
* Prevent Tagger model init with 0 labels
Raise an error before trying to initialize a tagger model with 0 labels.
* Add dummy tagger label for test
* Remove tagless tagger model initialization
* Fix error number after merge
* Add dummy tagger label to test
* Fix formatting
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-31 21:24:33 +02:00
Matthw Honnibal
c38298b8fa
Merge branch 'develop' of https://github.com/explosion/spaCy into develop
2020-08-31 19:55:55 +02:00
Matthw Honnibal
fe298fa50a
Shuffle on first epoch of train
2020-08-31 19:55:22 +02:00
Ines Montani
9af82f3f11
Merge pull request #6003 from explosion/feature/matcher-as-spans
2020-08-31 17:50:56 +02:00
Ines Montani
add9de5487
Deprecate (Phrase)Matcher.pipe
2020-08-31 17:01:24 +02:00
Ines Montani
83aff38c59
Make argument keyword-only
...
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-31 15:39:03 +02:00
Ines Montani
6340d1c63d
Add as_spans to Matcher/PhraseMatcher
2020-08-31 14:53:22 +02:00
svlandeg
13ee742fb4
example of custom logger
2020-08-31 14:24:41 +02:00
svlandeg
c18eb63483
Merge remote-tracking branch 'upstream/develop' into feature/vectors-docs
...
# Conflicts:
# website/docs/usage/embeddings-transformers.md
2020-08-31 13:21:36 +02:00
Sofie Van Landeghem
ec14744ee4
Rename Transformer listener ( #6001 )
...
* rename to spacy-transformers.TransformerListener
* add some more tok2vec tests
* use select_pipes
* fix docs - annotation setter was not changed in the end
2020-08-31 12:41:39 +02:00
Adriane Boyd
216efaf5f5
Restrict tokenizer exceptions to ORTH and NORM
2020-08-31 09:55:01 +02:00
Matthew Honnibal
9341cbc013
Set version to v3.0.0a13
2020-08-30 23:10:43 +02:00
Ines Montani
45f46a5c85
Merge pull request #5993 from explosion/feature/disabled-components
2020-08-29 15:58:41 +02:00
Ines Montani
34146750d4
Use frozen list with custom errors
...
We don't want to break backwards compatibility too much but we also want to provide the best possible UX
2020-08-29 15:20:11 +02:00
Ines Montani
744f432420
Merge pull request #5994 from explosion/feature/idempotent-component-decorator
2020-08-29 13:17:13 +02:00
Ines Montani
5de3f8604d
Update spacy/util.py
...
Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-08-29 13:17:06 +02:00
Ines Montani
091a9b522a
Remove unused variable [ci skip]
2020-08-29 13:11:26 +02:00
Ines Montani
2bc31e15c9
Tidy up and auto-format [ci skip]
2020-08-29 13:01:10 +02:00
Ines Montani
6520d1a1df
Work around set order in Language.disabled
2020-08-29 12:58:22 +02:00
Ines Montani
f45095a666
Merge pull request #5995 from adrianeboyd/bugfix/attribute-ruler-bugfixes
2020-08-29 12:38:30 +02:00
Ines Montani
e0b4984aa4
Make deprecated disable_pipes call into select_pipes
2020-08-29 12:08:46 +02:00
Ines Montani
15d73f4dc3
Make user-facing Language.disabled return list
...
More consistent with all the other properties
2020-08-29 12:08:33 +02:00
Matthew Honnibal
58f19421b1
Return empty batch from tok2vec listener if no doc.tensor
2020-08-29 03:46:50 +02:00
svlandeg
5230529de2
add loggers registry & logger docs sections
2020-08-28 21:44:04 +02:00
Ines Montani
0687d7148e
Rename user-facing API
2020-08-28 21:04:02 +02:00
Adriane Boyd
0104bd1600
Sort the AttributeRuler matches by rule order
...
Sort the returned matches by rule order (the `match_id`) so that the
rules are applied in the order they were added. This is necessary, for
instance, if the `AttributeRuler` is used for the tag map and later
rules require POS tags.
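
A minimal sketch of the ordering idea described above, not the actual `AttributeRuler` code. It assumes each match is a plain `(match_id, start, end)` tuple and that `match_id` is the index of the rule in the order it was added. The helper name `sort_by_rule_order` is hypothetical.

```python
from typing import List, Tuple

# Hypothetical stand-in for matcher output: (match_id, start, end).
# Assumption: match_id is the position of the rule in insertion order.
Match = Tuple[int, int, int]


def sort_by_rule_order(matches: List[Match]) -> List[Match]:
    """Return matches ordered by the rule that produced them."""
    # sorted() is stable, so matches produced by the same rule keep
    # their original relative order.
    return sorted(matches, key=lambda match: match[0])


# Matches as they might come back, not grouped by rule.
matches = [(2, 0, 1), (0, 3, 4), (1, 0, 1), (0, 1, 2)]
ordered = sort_by_rule_order(matches)

for match_id, start, end in ordered:
    # A later rule (higher match_id) is applied after earlier ones, so it
    # can rely on attributes that earlier rules have already set.
    print(f"apply rule {match_id} to tokens [{start}:{end}]")
```

Sorting on the rule index alone, rather than on the full tuple, leaves the token order within a single rule untouched.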
2020-08-28 21:01:06 +02:00
Ines Montani
6a999c9303
Remove outdated component attr check
2020-08-28 20:59:19 +02:00
Adriane Boyd
8674b17651
Serialize AttributeRuler.patterns
...
Serialize `AttributeRuler.patterns` instead of the individual lists to
simplify the serialized format and so that patterns are reloaded exactly as
they were originally provided (preserving `_attrs_unnormed`).
2020-08-28 20:44:45 +02:00
Ines Montani
10da74382f
Raise if disabled components are removed before DisabledPipes.restore
2020-08-28 20:35:26 +02:00
Ines Montani
1e0363290e
Remove todos and update docstrings
2020-08-28 20:34:46 +02:00
Ines Montani
cad988da7f
Allow component decorators to re-run with same function
2020-08-28 16:27:22 +02:00
Ines Montani
3ce5be4b76
Allow loaded but disabled components
2020-08-28 15:20:14 +02:00
Ines Montani
89f692bc8a
Merge pull request #5992 from svlandeg/feature/wandb-restrict-config
2020-08-28 15:05:29 +02:00
Ines Montani
9c4049b57f
Merge pull request #5986 from explosion/fix/language-config-interpolate-disk-bytes
2020-08-28 15:03:52 +02:00
Ines Montani
adc050cdc5
Fix code style in test [ci skip]
2020-08-28 15:03:21 +02:00
svlandeg
05a1bafa15
fix type
2020-08-28 14:08:33 +02:00
svlandeg
33883aa764
rename field
2020-08-28 14:06:23 +02:00
svlandeg
1d8c4070aa
add disable_fields to wandb_logger
2020-08-28 13:55:32 +02:00
Ines Montani
a51b4f3a19
Merge branch 'develop' into fix/language-config-interpolate-disk-bytes
2020-08-28 13:21:17 +02:00
Ines Montani
03dde511b4
Merge pull request #5987 from explosion/feature/debug-config [ci skip]
2020-08-28 11:30:18 +02:00
Ines Montani
62e9967228
Merge branch 'develop' into fix/language-config-interpolate-disk-bytes
2020-08-28 11:19:36 +02:00
Ines Montani
4ca2698f85
Merge branch 'develop' into feature/debug-config
2020-08-28 11:19:17 +02:00
svlandeg
9a8255ffd5
two tests because of different exit type
2020-08-28 10:50:26 +02:00