Commit Graph

16184 Commits

Author SHA1 Message Date
Matthew Honnibal 725ccbac39 Format 2024-10-01 12:38:02 +02:00
Matthew Honnibal a8837beab7 Set version to v3.8.1 2024-10-01 12:37:11 +02:00
Matthew Honnibal 3a0aadcf86 Update spacy[apple] thinc-apple-ops pin for numpy v2 compatibility 2024-10-01 10:16:35 +02:00
Matthew Honnibal 114b4894fb Fix --require-parent default 2024-09-29 15:50:31 +02:00
Matthew Honnibal dec13b4258 Fix inverted cli arg 2024-09-29 15:50:05 +02:00
Matthew Honnibal c03f060527 Allow positive option --require-parent 2024-09-29 14:30:14 +02:00
Matthew Honnibal 6255cb985f Include version constraint in parent package requirement 2024-09-29 14:22:21 +02:00
Matthew Honnibal 3b165a8716 Simplify setting to require parent package 2024-09-29 14:19:10 +02:00
Matthew Honnibal 969832f5d6 Fix package 2024-09-29 14:00:11 +02:00
Matthew Honnibal 8ce53a6bbe Syntax 2024-09-29 13:51:44 +02:00
Matthew Honnibal 6fa0d709d5 Support option to not depend on parent package in spacy package 2024-09-29 13:51:04 +02:00
Matthew Honnibal 5010fcbd3a Fix numpy constant 2024-09-14 13:13:11 +02:00
Matthew Honnibal de4f19f3a3 Fix version 2024-09-14 13:12:44 +02:00
Matthew Honnibal 3d03565498 Replace numpy floats in evaluate and update 2024-09-14 12:55:53 +02:00
Matthew Honnibal 0576a1ff56 Fix numpy floats in meta.json 2024-09-14 12:54:08 +02:00
Matthew Honnibal 2f1e7ed09a Lint 2024-09-14 11:36:27 +02:00
Matthew Honnibal e2dc9b79e1 Format 2024-09-14 11:29:40 +02:00
Matthew Honnibal 3c3d75015b Set version to v3.7.7 2024-09-14 11:27:32 +02:00
Matthew Honnibal 50aa3b5cbe Merge branch 'master' of https://github.com/explosion/spaCy 2024-09-14 11:09:44 +02:00
Matthew Honnibal 8266031454 Merge numpy version update 2024-09-14 11:08:35 +02:00
Matthew Honnibal 8dcc4b8daf Skip running tests on PRs 2024-09-14 11:07:23 +02:00
Matthew Honnibal 3a635d2c94 Try skipping 686 2024-09-14 00:12:49 +02:00
Matthew Honnibal a0ce61f55a Fix thinc pin 2024-09-13 14:21:03 +02:00
Matthew Honnibal 83b4015b36 Remove aarch 2024-09-13 12:35:50 +02:00
Matthew Honnibal 419bfaf6e7 Update cibuildwheel 2024-09-13 10:44:48 +02:00
Matthew Honnibal 69ecb85fad Set version to v3.8.1 2024-09-13 10:43:40 +02:00
Matthew Honnibal b427597fc8 Set version to v3.8.0 2024-09-11 21:32:26 +02:00
Matthew Honnibal 1869a197c9 Try enabling macos-14 for arm builds 2024-09-11 16:06:57 +02:00
Matthew Honnibal c068e1de1b Fix dependencies 2024-09-11 15:57:52 +02:00
Matthew Honnibal 184e508d9c Update numpy pin 2024-09-11 15:57:17 +02:00
William Mattingly 30f1f33e78
Added Date spaCy to universe (#13415) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:29:03 +02:00
William Mattingly f1a5ff9dba
added spacy whisper to universe (#13418) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:28:00 +02:00
William Mattingly c80dacd046
added spacy annoy to universe (#13416) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:26:21 +02:00
William Mattingly 7fbbb2002a
updated universe for number spacy (#13424) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:25:23 +02:00
William Mattingly 89c1774d43
added bagpipes-spacy to universe (#13425) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:24:06 +02:00
thjbdvlt 081e4e385d
universe-project-presque (#13515) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:21:41 +02:00
thjbdvlt 0190e669c5
universe-package-quelquhui (#13514) [ci skip]
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:17:33 +02:00
Oren Halvani 54dc4ee8fb
Added: Constituent-Treelib to: universe.json (#13432) [ci skip]
Co-authored-by: Halvani <>
2024-09-10 14:13:36 +02:00
William Mattingly 5a7ad5572c
added gliner-spacy to universe (#13417) [ci skip]
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Ines Montani <ines@ines.io>
2024-09-10 14:12:52 +02:00
marinelay b18cc94451
Delete unnecessary method (#13441)
Co-authored-by: marinelay <marinelay@gmail.com>
2024-09-09 20:57:13 +02:00
Matthew Honnibal 4cc3ebe74e Format 2024-09-09 20:56:01 +02:00
Matthew Honnibal a019315534 Fix memory zones 2024-09-09 13:49:41 +02:00
Matthew Honnibal 59ac7e6bdb Format 2024-09-09 11:22:52 +02:00
Matthew Honnibal b65491b641 Set version to v3.8.0.dev0 2024-09-09 11:20:23 +02:00
Matthew Honnibal 1b8d560d0e
Support 'memory zones' for user memory management (#13621)
Add a context manager, nlp.memory_zone(), which will begin
memory_zone() blocks on the vocab, string store, and potentially
other components.

Example usage:

```
with nlp.memory_zone():
    for doc in nlp.pipe(texts):
        do_something(doc)
# do_something(doc) <-- Invalid: the memory zone has expired
```

Once the memory_zone() block expires, spaCy will free any shared
resources that were allocated for the text-processing that occurred
within the memory_zone. If you create Doc objects within a memory
zone, it's invalid to access them once the memory zone has expired.

The purpose of this is that spaCy creates and stores Lexeme objects
in the Vocab that can be shared between multiple Doc objects. It also
interns strings. Normally, spaCy can't know when all Doc objects using
a Lexeme are out-of-scope, so new Lexemes accumulate in the vocab,
causing memory pressure.

Memory zones solve this problem by telling spaCy "okay, none of the
documents allocated within this block will be accessed again". This
lets spaCy free all new Lexeme objects and other data that were
created during the block.
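
The idea can be illustrated with a toy interning table. This is only a
sketch of the concept, not spaCy's implementation: the class
ToyStringStore and its attributes are hypothetical names, and it
assumes that entries created inside a zone are tracked separately so
they can be dropped when the zone exits.

```python
from contextlib import contextmanager


class ToyStringStore:
    """Toy interning table (hypothetical; NOT spaCy's StringStore)."""

    def __init__(self):
        self._permanent = set()  # strings that live as long as the store
        self._transient = set()  # strings first seen inside the active zone
        self._in_zone = False

    def add(self, text):
        # Strings that are already permanent stay permanent.
        if text in self._permanent:
            return
        target = self._transient if self._in_zone else self._permanent
        target.add(text)

    def __contains__(self, text):
        return text in self._permanent or text in self._transient

    def __len__(self):
        return len(self._permanent) + len(self._transient)

    @contextmanager
    def memory_zone(self):
        # Assumption of this sketch: zones do not nest.
        if self._in_zone:
            raise RuntimeError("nested zones are not supported in this sketch")
        self._in_zone = True
        try:
            yield self
        finally:
            # Free everything that was allocated inside the zone.
            self._transient.clear()
            self._in_zone = False


store = ToyStringStore()
store.add("the")             # added outside a zone: kept permanently
with store.memory_zone():
    store.add("zymurgy")     # added inside the zone: freed on exit
    inside = len(store)
after = len(store)
```

Here the store holds two strings while the zone is active and only the
permanent one afterwards, which mirrors why objects that depend on
zone-allocated data must not be used after the block.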

The mechanism is general, so memory_zone() context managers can be
added to other components that could benefit from them, e.g. pipeline
components.

I experimented with adding memory zone support to the tokenizer as well,
for its cache. However, this seems unnecessarily complicated. It makes
more sense to just stick a limit on the cache size. This lets spaCy
make better use of the cache's efficiency advantage, because we can
maintain a (bounded) cache even if only small batches of documents
are being processed.
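
A size limit of this kind might look like the sketch below. The class
BoundedCache is a hypothetical name, and the eviction policy (least
recently used) is an assumption made for illustration; the commit
message does not say how the tokenizer cache decides what to drop.

```python
from collections import OrderedDict


class BoundedCache:
    """Size-limited cache (hypothetical); LRU eviction is an assumption."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # drop the least recently used entry

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_size=2)
cache.put("don't", ["do", "n't"])
cache.put("U.K.", ["U.K."])
cache.get("don't")           # touch "don't" so "U.K." becomes the oldest entry
cache.put("e.g.", ["e.g."])  # exceeds the limit, so "U.K." is evicted
```

Because the bound is independent of batch size, the cache stays useful
across many small batches without growing indefinitely.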
2024-09-09 11:19:39 +02:00
ykyogoku 608f65ce40
add Tibetan (#13510) 2024-09-09 11:18:03 +02:00
Muzaffer Cikay acbf2a428f
Add Kurdish Kurmanji language (#13561)
* Add Kurdish Kurmanji language

* Add lex_attrs
2024-09-09 11:15:40 +02:00
Mark Liberko 55db9c2e87
Added gd language folder (#13570)
Implemented a foundational Scottish Gaelic (gd) language option with tokenizer_exceptions and stop_words files.
2024-09-09 11:14:09 +02:00
Matthew Honnibal 319e02545c Set version to 3.7.6 2024-08-20 12:16:08 +02:00
Matthew Honnibal a8accc3396
Use cibuildwheel to build wheels (#13603)
* Add workflow files for cibuildwheel

* Add config for cibuildwheel

* Set version for experimental prerelease

* Try updating cython

* Skip 32-bit windows builds

* Revert "Try updating cython"

This reverts commit c1b794ab5c.

* Try to import cibuildwheel settings from previous setup
2024-08-20 12:15:05 +02:00