mirror of https://github.com/explosion/spaCy.git
128 lines
4.8 KiB
Plaintext
include ../../_includes/_mixins

+h(2, "101") Serialization 101

include _spacy-101/_serialization

+infobox("Important note")
    | In spaCy v2.0, the API for saving and loading has changed: it now uses
    | only the four methods listed above, consistently across objects and
    | classes. For an overview of the changes, see
    | #[+a("/docs/usage/v2#incompat") this table] and the notes on
    | #[+a("/docs/usage/v2#migrating-saving-loading") migrating].

+h(2, "models") Saving models

p
    | After training your model, you'll usually want to save its state, and load
    | it back later. You can do this with the
    | #[+api("language#to_disk") #[code Language.to_disk()]]
    | method:

+code.
    nlp.to_disk('/home/me/data/en_example_model')

p
    | The directory will be created if it doesn't exist, and the whole pipeline
    | will be written out. To make the model more convenient to deploy, we
    | recommend wrapping it as a Python package.

+h(3, "models-generating") Generating a model package

+infobox("Important note")
    | The model packages are #[strong not suitable] for the public
    | #[+a("https://pypi.python.org") pypi.python.org] directory, which is not
    | designed for binary data and files over 50 MB. However, if your company
    | is running an #[strong internal installation] of PyPI, publishing your
    | models there can be a convenient way to share them with your team.

p
    | spaCy comes with a handy CLI command that will create all required files
    | and walk you through generating the metadata. You can also create the
    | meta.json manually and place it in the model data directory, or supply a
    | path to it using the #[code --meta] flag. For more info on this, see
    | the #[+api("cli#package") #[code package]] docs.

+aside-code("meta.json", "json").
    {
        "name": "example_model",
        "lang": "en",
        "version": "1.0.0",
        "spacy_version": ">=2.0.0,<3.0.0",
        "description": "Example model for spaCy",
        "author": "You",
        "email": "you@example.com",
        "license": "CC BY-SA 3.0"
    }

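p
    | If you'd rather create the meta.json manually, it can also be written
    | from Python. The following is a minimal sketch using only the standard
    | library. The helper name #[code write_meta] and the set of keys it
    | treats as required are assumptions made for illustration, not part of
    | spaCy's API.

```python
import json
import os
import tempfile

# Keys taken from the example meta.json above. Treating these four as
# required is an assumption made for this sketch.
REQUIRED_KEYS = ("name", "lang", "version", "spacy_version")


def write_meta(model_dir, meta):
    """Check `meta` for the assumed required keys and write meta.json."""
    missing = [key for key in REQUIRED_KEYS if key not in meta]
    if missing:
        raise ValueError("meta.json is missing keys: " + ", ".join(missing))
    os.makedirs(model_dir, exist_ok=True)
    meta_path = os.path.join(model_dir, "meta.json")
    with open(meta_path, "w", encoding="utf8") as f:
        json.dump(meta, f, indent=4)
    return meta_path


example_meta = {
    "name": "example_model",
    "lang": "en",
    "version": "1.0.0",
    "spacy_version": ">=2.0.0,<3.0.0",
    "description": "Example model for spaCy",
    "author": "You",
    "email": "you@example.com",
    "license": "CC BY-SA 3.0",
}

# A temporary directory stands in for the real model data directory here.
model_dir = os.path.join(tempfile.mkdtemp(), "en_example_model")
meta_path = write_meta(model_dir, example_meta)

# Read the file back to confirm it round-trips as valid JSON.
with open(meta_path, encoding="utf8") as f:
    loaded_meta = json.load(f)
```

p
    | The resulting file can be left in the model data directory, or its path
    | can be passed to the #[code package] command via the #[code --meta] flag.
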
+code(false, "bash").
    python -m spacy package /home/me/data/en_example_model /home/me/my_models

p This command will create a model package directory that should look like this:

+code("Directory structure", "yaml").
    └── /
        ├── MANIFEST.in                   # to include meta.json
        ├── meta.json                     # model metadata
        ├── setup.py                      # setup file for pip installation
        └── en_example_model              # model directory
            ├── __init__.py               # init for pip installation
            └── en_example_model-1.0.0    # model data

p
    | You can also find templates for all files in our
    | #[+src(gh("spacy-dev-resources", "templates/model")) spaCy dev resources].
    | If you're creating the package manually, keep in mind that the directories
    | need to be named according to the naming conventions of
    | #[code [language]_[name]] and #[code [language]_[name]-[version]]. The
    | #[code lang] setting in the meta.json is also used to create the
    | respective #[code Language] class in spaCy, which will later be returned
    | by the model's #[code load()] method.

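p
    | The naming conventions above can be sketched as a small helper that
    | derives both directory names from the meta.json fields. The function
    | name #[code package_dir_names] is hypothetical and only illustrates
    | the convention; it is not something spaCy provides.

```python
def package_dir_names(meta):
    """Return (package directory, data directory) names for a model.

    Follows the [language]_[name] and [language]_[name]-[version]
    conventions described above.
    """
    package_name = "{}_{}".format(meta["lang"], meta["name"])
    data_name = "{}-{}".format(package_name, meta["version"])
    return package_name, data_name


meta = {"lang": "en", "name": "example_model", "version": "1.0.0"}
package_name, data_name = package_dir_names(meta)
print(package_name)  # en_example_model
print(data_name)     # en_example_model-1.0.0
```
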
p
    | To #[strong build the package], run the following command from within the
    | package directory. This will create a #[code .tar.gz] archive in the
    | #[code /dist] directory. For more information on building Python
    | packages, see the
    | #[+a("https://setuptools.readthedocs.io/en/latest/") Python Setuptools documentation].

+code(false, "bash").
    python setup.py sdist

+h(2, "loading") Loading a custom model package

p
    | To load a model from a data directory, you can use
    | #[+api("spacy#load") #[code spacy.load()]] with the local path:

+code.
    import spacy
    nlp = spacy.load('/path/to/model')

p
    | If you have generated a model package, you can also install it by
    | pointing pip to the model's #[code .tar.gz] archive – this is pretty
    | much exactly what spaCy's #[+api("cli#download") #[code download]]
    | command does under the hood.

+code(false, "bash").
    pip install /path/to/en_example_model-1.0.0.tar.gz

+aside-code("Custom model names", "bash").
    # optional: assign custom name to model
    python -m spacy link en_example_model my_cool_model

p
    | You'll then be able to load the model via spaCy's loader, or by importing
    | it as a module. For larger code bases, we usually recommend native
    | imports, as this will make it easier to integrate models with your
    | existing build process, continuous integration workflow and testing
    | framework.

+code.
    # option 1: import model as module
    import en_example_model
    nlp = en_example_model.load()

    # option 2: use spacy.load()
    import spacy
    nlp = spacy.load('en_example_model')
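
p
    | If the package name is only known at runtime, for example because it
    | comes from a config file, option 1 can be approximated with
    | #[code importlib]. This is a minimal sketch: the helper name
    | #[code load_model_package] is hypothetical, and it assumes the model
    | package is installed and exposes a #[code load()] function as
    | described above.

```python
import importlib


def load_model_package(package_name):
    """Import an installed model package by name and call its load()."""
    module = importlib.import_module(package_name)
    return module.load()


# Usage, assuming the en_example_model package has been installed with pip:
# nlp = load_model_package('en_example_model')
```

p
    | An #[code ImportError] here means the package isn't installed in the
    | current environment.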