spaCy/website/docs/usage/saving-loading.jade

include ../../_includes/_mixins
+h(2, "101") Serialization 101
include _spacy-101/_serialization
+infobox("Important note")
| In spaCy v2.0, the API for saving and loading has changed to only use the
| four methods listed above consistently across objects and classes. For an
| overview of the changes, see #[+a("/docs/usage/v2#incompat") this table]
| and the notes on #[+a("/docs/usage/v2#migrating-saving-loading") migrating].
+h(2, "models") Saving models
p
| After training your model, you'll usually want to save its state, and load
| it back later. You can do this with the
| #[+api("language#to_disk") #[code Language.to_disk()]]
| method:
+code.
nlp.to_disk('/home/me/data/en_example_model')
p
| The directory will be created if it doesn't exist, and the whole pipeline
| will be written out. To make the model more convenient to deploy, we
| recommend wrapping it as a Python package.
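p
| Before packaging, it can be worth checking that the saved directory loads
| back cleanly. The following is a minimal sketch of such a round trip, not
| an official recipe: the helper names #[code save_and_reload] and
| #[code demo] are hypothetical, and a blank English pipeline stands in for
| a model you have actually trained.

```python
import tempfile
from pathlib import Path


def save_and_reload(nlp, model_dir):
    """Write the whole pipeline to model_dir, then load it back from there.

    Hypothetical helper for illustration; not part of spaCy itself.
    """
    # Imported lazily so the helper can be defined even without spaCy installed.
    import spacy

    model_dir = Path(model_dir)
    nlp.to_disk(str(model_dir))        # the directory is created if missing
    return spacy.load(str(model_dir))  # returns a new Language object


def demo():
    """Round-trip a blank English pipeline through a temporary directory."""
    import spacy

    nlp = spacy.blank("en")  # assumption: stand-in for your trained model
    with tempfile.TemporaryDirectory() as tmp:
        reloaded = save_and_reload(nlp, Path(tmp) / "en_example_model")
    return nlp.lang, reloaded.lang
```

p
| If the language codes returned by #[code demo()] match, the directory was
| written and read back as expected.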
+h(3, "models-generating") Generating a model package
+infobox("Important note")
| The model packages are #[strong not suitable] for the public
| #[+a("https://pypi.python.org") pypi.python.org] directory, which is not
| designed for binary data and files over 50 MB. However, if your company
| is running an #[strong internal installation] of PyPI, publishing your
| models on there can be a convenient way to share them with your team.
p
| spaCy comes with a handy CLI command that will create all required files,
| and walk you through generating the meta data. You can also create the
| meta.json manually and place it in the model data directory, or supply a
| path to it using the #[code --meta] flag. For more info on this, see
| the #[+api("cli#package") #[code package]] docs.
+aside-code("meta.json", "json").
{
    "name": "example_model",
    "lang": "en",
    "version": "1.0.0",
    "spacy_version": ">=2.0.0,<3.0.0",
    "description": "Example model for spaCy",
    "author": "You",
    "email": "you@example.com",
    "license": "CC BY-SA 3.0"
}
+code(false, "bash").
python -m spacy package /home/me/data/en_example_model /home/me/my_models
p This command will create a model package directory that should look like this:
+code("Directory structure", "yaml").
└── /
    ├── MANIFEST.in                   # to include meta.json
    ├── meta.json                     # model meta data
    ├── setup.py                      # setup file for pip installation
    └── en_example_model              # model directory
        ├── __init__.py               # init for pip installation
        └── en_example_model-1.0.0    # model data
p
| You can also find templates for all files in our
| #[+src(gh("spacy-dev-resources", "templates/model")) spaCy dev resources].
| If you're creating the package manually, keep in mind that the directories
| need to be named according to the naming conventions of
| #[code [language]_[name]] and #[code [language]_[name]-[version]]. The
| #[code lang] setting in the meta.json is also used to create the
| respective #[code Language] class in spaCy, which will later be returned
| by the model's #[code load()] method.
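p
| If you are laying out the package by hand, the naming conventions above can
| be derived from the meta.json fields. The sketch below assumes the
| #[code lang], #[code name] and #[code version] keys shown in the example
| meta.json; the functions #[code package_dir_names] and
| #[code write_skeleton] are hypothetical helpers, and placing meta.json at
| the package root simply mirrors the directory listing above.

```python
import json
from pathlib import Path


def package_dir_names(meta):
    """Return (package_name, data_dir_name) for a meta.json dict.

    Follows the [language]_[name] and [language]_[name]-[version] conventions.
    """
    package_name = "{}_{}".format(meta["lang"], meta["name"])
    data_dir_name = "{}-{}".format(package_name, meta["version"])
    return package_name, data_dir_name


def write_skeleton(meta, root):
    """Create the nested model directories and write meta.json under root.

    Assumption: this only lays out directories. setup.py, MANIFEST.in and
    __init__.py still have to come from the templates or the package command.
    """
    package_name, data_dir_name = package_dir_names(meta)
    data_dir = Path(root) / package_name / data_dir_name
    data_dir.mkdir(parents=True, exist_ok=True)
    meta_path = Path(root) / "meta.json"
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf8")
    return data_dir
```

p
| For the example meta.json, #[code package_dir_names()] yields
| #[code en_example_model] and #[code en_example_model-1.0.0], matching the
| directory structure shown earlier.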
p
| To #[strong build the package], run the following command from within the
| package directory (the one containing #[code setup.py]). This will create
| a #[code .tar.gz] archive in the subdirectory #[code /dist]. For more
| information on building Python packages, see the
| #[+a("https://setuptools.readthedocs.io/en/latest/") Python Setuptools documentation].
+code(false, "bash").
python setup.py sdist
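p
| Since #[code MANIFEST.in] exists to make sure meta.json ends up in the
| archive, you may want to confirm that it did. This is a rough check using
| only the standard library; #[code archive_has_meta] is a hypothetical
| helper, and the archive path in the usage note is an assumption based on
| the example model name and version.

```python
import tarfile
from pathlib import Path


def archive_has_meta(archive_path):
    """Return True if any member of the .tar.gz archive is named meta.json."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        # Member names include their parent directories, so compare basenames.
        return any(Path(name).name == "meta.json" for name in tar.getnames())
```

p
| For the example model, this could be called as
| #[code archive_has_meta('dist/en_example_model-1.0.0.tar.gz')].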
+h(2, "loading") Loading a custom model package
p
| To load a model from a data directory, you can use
| #[+api("spacy#load") #[code spacy.load()]] with the local path:
+code.
import spacy
nlp = spacy.load('/path/to/model')
p
| If you have generated a model package, you can also install it by
| pointing pip to the model's #[code .tar.gz] archive. This is essentially
| what spaCy's #[+api("cli#download") #[code download]] command does under
| the hood.
+code(false, "bash").
pip install /path/to/en_example_model-1.0.0.tar.gz
+aside-code("Custom model names", "bash").
# optional: assign custom name to model
python -m spacy link en_example_model my_cool_model
p
| You'll then be able to load the model via spaCy's loader, or by importing
| it as a module. For larger code bases, we usually recommend native
| imports, as this will make it easier to integrate models with your
| existing build process, continuous integration workflow and testing
| framework.
+code.
# option 1: import model as module
import en_example_model
nlp = en_example_model.load()
# option 2: use spacy.load()
import spacy
nlp = spacy.load('en_example_model')
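p
| When the package name comes from configuration rather than being
| hard-coded, the native import in option 1 can be done dynamically. The
| sketch below relies only on the fact that an installed model package
| exposes a #[code load()] method; #[code load_model_package] is a
| hypothetical helper, not a spaCy function.

```python
import importlib


def load_model_package(package_name):
    """Import an installed model package by name and call its load() method.

    Raises ImportError if the package is not installed, which surfaces a
    missing model early (for example in a test suite or CI run).
    """
    module = importlib.import_module(package_name)
    return module.load()
```

p
| Calling #[code load_model_package('en_example_model')] would then be
| equivalent to importing the module and calling its #[code load()] method.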