spaCy/website/docs/usage/v3.md

106 lines
3.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: What's New in v3.0
teaser: New features, backwards incompatibilities and migration guide
menu:
- ['Summary', 'summary']
- ['New Features', 'features']
- ['Backwards Incompatibilities', 'incompat']
- ['Migrating from v2.x', 'migrating']
- ['Migrating plugins', 'plugins']
---
## Summary {#summary}
## New Features {#features}
## Backwards Incompatibilities {#incompat}
### Removed deprecated methods, attributes and arguments {#incompat-removed}
The following deprecated methods, attributes and arguments were removed in v3.0.
Most of them have been deprecated for quite a while now and many would
previously raise errors. Many of them were also mostly internals. If you've been
working with more recent versions of spaCy v2.x, it's unlikely that your code
relied on them.
| Class | Removed |
| --------------------- | ------------------------------------------------------- |
| [`Doc`](/api/doc) | `Doc.tokens_from_list`, `Doc.merge` |
| [`Span`](/api/span) | `Span.merge`, `Span.upper`, `Span.lower`, `Span.string` |
| [`Token`](/api/token) | `Token.string` |
<!-- TODO: complete (see release notes Dropbox Paper doc) -->
## Migrating from v2.x {#migrating}
## Migration notes for plugin maintainers {#plugins}
Thanks to everyone who's been contributing to the spaCy ecosystem by developing
and maintaining one of the many awesome [plugins and extensions](/universe).
We've tried to keep breaking changes to a minimum and make it as easy as
possible for you to upgrade your packages for spaCy v3.
### Custom pipeline components
The most common use case for plugins is providing pipeline components and
extension attributes.
- Use the [`@Language.factory`](/api/language#factory) decorator to register
your component and assign it a name. This allows users to refer to your
components by name and serialize pipelines referencing them. Remove all manual
entries to the `Language.factories`.
- Make sure your component factories take at least two **named arguments**:
`nlp` (the current `nlp` object) and `name` (the instance name of the added
component so you can identify multiple instances of the same component).
- Update all references to [`nlp.add_pipe`](/api/language#add_pipe) in your docs
to use **string names** instead of the component functions.
```python
### {highlight="1-5"}
from spacy.language import Language
@Language.factory("my_component", default_config={"some_setting": False})
def create_component(nlp: Language, name: str, some_setting: bool):
return MyCoolComponent(some_setting=some_setting)
class MyCoolComponent:
def __init__(self, some_setting):
self.some_setting = some_setting
def __call__(self, doc):
# Do something to the doc
return doc
```
> #### Result in config.cfg
>
> ```ini
> [components.my_component]
> factory = "my_component"
> some_setting = true
> ```
```diff
import spacy
from your_plugin import MyCoolComponent
nlp = spacy.load("en_core_web_sm")
- component = MyCoolComponent(some_setting=True)
- nlp.add_pipe(component)
+ nlp.add_pipe("my_component", config={"some_setting": True})
```
<Infobox title="Important note on registering factories" variant="warning">
The [`@Language.factory`](/api/language#factory) decorator takes care of letting
spaCy know that a component of that name is available. This means that your
users can add it to the pipeline using its **string name**. However, this
requires the decorator to be executed so users will still have to **import
your plugin**. Alternatively, your plugin could expose an
[entry point](/usage/saving-loading#entry-points), which spaCy can read from.
This means that spaCy knows how to initialize `my_component`, even if your
package isn't imported.
</Infobox>