DOC Add tutorial for adding a new package (#1829)

Co-authored-by: Hood Chatham <roberthoodchatham@gmail.com>
Roman Yurchak 2021-09-14 09:05:44 +02:00 committed by GitHub
parent 1fe84bb68b
commit 3b93e3d1bb
3 changed files with 255 additions and 162 deletions


@ -0,0 +1,149 @@
(meta-yaml-spec)=
# The meta.yaml specification
Packages are defined by writing a `meta.yaml` file. The format of these files is
based on the `meta.yaml` files used to build [Conda
packages](https://conda.io/docs/user-guide/tasks/build-packages/define-metadata.html),
though it is much more limited. The most important limitation is that Pyodide
assumes there will only be one version of a given library available, whereas
Conda allows the user to specify the versions of each package that they want to
install. Despite the limitations, it is recommended to use existing conda
package definitions as a starting point to create Pyodide packages. In
general, however, one should not
expect Conda packages to "just work" with Pyodide; see {pr}`795`.
The supported keys in the `meta.yaml` file are described below.
## `package`
### `package/name`
The name of the package. It must match the name of the package used when
expanding the tarball, which is sometimes different from the name of the package
in the Python namespace when installed. It must also match the name of the
directory in which the `meta.yaml` file is placed. It can only contain
alpha-numeric characters, `-`, and `_`.
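
The character restriction above can be checked mechanically. The following is a minimal sketch, not the build system's own validation; the helper name `is_valid_package_name` is hypothetical, and it assumes "alpha-numeric" means ASCII letters and digits.

```python
import re

# Assumption: "alpha-numeric" is read as ASCII letters and digits only.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_package_name(name: str) -> bool:
    """Return True if `name` contains only letters, digits, `-`, and `_`."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_package_name("scikit-learn"))  # True
print(is_valid_package_name("my.package"))    # False: "." is not allowed
```

Note that this only covers the allowed characters; matching the tarball and directory names still has to be checked by hand.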
### `package/version`
The version of the package.
## `source`
### `source/url`
The URL of the source tarball.
The tarball may be in any of the formats supported by Python's
`shutil.unpack_archive`: `tar`, `gztar`, `bztar`, `xztar`, and `zip`.
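
To check locally that a tarball is in a supported format, it can be unpacked with the same standard library function. This is a small sketch under the assumption that the archive has already been downloaded; the helper name `unpack_source` is hypothetical and is not part of `pyodide-build`.

```python
import shutil
from pathlib import Path


def unpack_source(archive: Path, dest: Path) -> list:
    """Unpack `archive` into `dest` and return the top-level entry names."""
    # The format is inferred from the file extension (.tar.gz, .zip, ...).
    shutil.unpack_archive(str(archive), str(dest))
    return sorted(entry.name for entry in dest.iterdir())
```

If the returned list contains a single directory, that directory name is the value to compare against `source/extract_dir`.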
### `source/extract_dir`
The top level directory name of the contents of the source tarball (i.e. once
you extract the tarball, all the contents are in the directory named
`source/extract_dir`). This defaults to the tarball name (sans extension).
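
The default described above (tarball name without its extension) can be illustrated as follows. This is only a sketch of the rule as stated here, not the code used by the build system; the function name and the list of suffixes are assumptions.

```python
# Assumption: these are the usual suffixes for the formats listed under
# `source/url`; longer suffixes come first so ".tar.gz" wins over ".gz"-like tails.
_ARCHIVE_SUFFIXES = (
    ".tar.gz", ".tar.bz2", ".tar.xz",
    ".tgz", ".tbz2", ".txz",
    ".tar", ".zip",
)


def default_extract_dir(tarball_name: str) -> str:
    """Strip a known archive suffix from `tarball_name`, if there is one."""
    for suffix in _ARCHIVE_SUFFIXES:
        if tarball_name.endswith(suffix):
            return tarball_name[: -len(suffix)]
    return tarball_name


print(default_extract_dir("example-1.0.tar.gz"))  # example-1.0
```

If the directory inside the tarball is named differently, `source/extract_dir` needs to be set explicitly.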
### `source/path`
Alternatively to `source/url`, a relative or absolute path can be specified
as package source. This is useful for local testing or building packages which
are not available online in the required format.
If a path is specified, any provided checksums are ignored.
### `source/md5`
The MD5 checksum of the tarball. It is recommended to use SHA256 instead of MD5.
At most one checksum entry should be provided per package.
### `source/sha256`
The SHA256 checksum of the tarball. It is recommended to use SHA256 instead of MD5.
At most one checksum entry should be provided per package.
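
When writing a `meta.yaml` by hand, the value for `source/sha256` can be computed from a downloaded tarball with the standard library. A minimal sketch; the helper name `sha256_of_file` is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is the 64-character hexadecimal digest to paste into the `source/sha256` field.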
### `source/patches`
A list of patch files to apply after expanding the tarball. These are applied
using `patch -p1` from the root of the source tree.
### `source/extras`
Extra files to add to the source tree. This should be a list where each entry is
a pair of the form `(src, dst)`. The `src` path is relative to the directory in
which the `meta.yaml` file resides. The `dst` path is relative to the root of
source tree (the expanded tarball).
## `build`
### `build/skip_host`
Skip building C extensions for the host environment. Default: `True`.
Setting this to `False` will result in ~2x slower builds for packages that
include C extensions. It should only be needed when a package is a build
time dependency for other packages. For instance, numpy is imported during
installation of matplotlib, and importing numpy also imports its C extensions,
so numpy is built for both host and target.
### `build/cflags`
Extra arguments to pass to the compiler when building for WebAssembly.
(This key is not in the Conda spec).
### `build/cxxflags`
Extra arguments to pass to the compiler when building C++ files for WebAssembly.
Note that both `cflags` and `cxxflags` will be used when compiling C++ files.
A common example would be to use `-std=c++11` for code that makes use of C++11 features.
(This key is not in the Conda spec).
### `build/ldflags`
Extra arguments to pass to the linker when building for WebAssembly.
(This key is not in the Conda spec).
### `build/library`
Should be set to true for library packages. Library packages are packages that are needed for other packages but are not Python packages themselves. For library packages, the script specified in the `build/script` section is run to compile the library. See the [zlib meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/zlib/meta.yaml) for an example of a library package specification.
### `build/sharedlibrary`
Should be set to true for shared library packages. Shared library packages are packages that are needed for other packages, but are loaded dynamically when Pyodide is run. For shared library packages, the script specified in the `build/script` section is run to compile the library. The script should build the shared library and copy it into a subfolder of the source folder called `install`. Files or folders in this install folder will be packaged to make the Pyodide package. See the [CLAPACK meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/CLAPACK/meta.yaml) for an example of a shared library specification.
### `build/script`
The script section is required for a library package (`build/library` set to true). For a Python package this section is optional. If it is specified for a Python package, the script section will be run before the build system runs `setup.py`. This script is run by `bash` in the directory where the tarball was extracted.
### `build/post`
Shell commands to run after building the library. These are run inside of
`bash`, and there are two special environment variables defined:
- `$SITEPACKAGES`: The `site-packages` directory into which the package has been installed.
- `$PKGDIR`: The directory in which the `meta.yaml` file resides.
(This key is not in the Conda spec).
### `build/replace-libs`
A list of strings of the form `<old_name>=<new_name>`, used to rename libraries when linking. This
might be necessary, in particular, when using Emscripten ports.
For instance, `png16=png` is currently used in matplotlib.
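
To make the `<old_name>=<new_name>` form concrete, the sketch below parses such entries and applies them to a list of linker arguments. It assumes the renaming targets `-l<name>` flags; that is an illustration of the idea, not a description of how the Pyodide build system implements it, and both function names are hypothetical.

```python
def parse_replace_libs(entries):
    """Turn ["png16=png", ...] into {"png16": "png", ...}."""
    mapping = {}
    for entry in entries:
        old, sep, new = entry.partition("=")
        if not sep or not old or not new:
            raise ValueError(f"expected '<old_name>=<new_name>', got {entry!r}")
        mapping[old] = new
    return mapping


def rename_link_flags(args, mapping):
    """Rewrite `-l<old_name>` flags; leave every other argument unchanged."""
    renamed = []
    for arg in args:
        # Assumption: only `-l<name>` style flags are subject to renaming.
        if arg.startswith("-l") and arg[2:] in mapping:
            arg = "-l" + mapping[arg[2:]]
        renamed.append(arg)
    return renamed


mapping = parse_replace_libs(["png16=png"])
print(rename_link_flags(["-lpng16", "-lz", "-O2"], mapping))
# ['-lpng', '-lz', '-O2']
```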
## `requirements`
### `requirements/run`
A list of required packages.
(Unlike conda, this only supports package names, not versions).
## `test`
### `test/imports`
List of imports to test after the package is built.
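
An import test of this kind amounts to importing each listed module and recording failures. The following sketch shows the idea with the standard library; the helper name `failed_imports` is hypothetical, and the actual test harness may work differently.

```python
import importlib


def failed_imports(module_names):
    """Try to import each module; return {name: error} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError is the common case
            failures[name] = repr(exc)
    return failures


print(failed_imports(["json", "os.path"]))  # {}
```

An empty result means every module in the list could be imported in the current environment.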


@ -2,6 +2,101 @@
# Creating a Pyodide package
## Quickstart
If you wish to use a package in Pyodide that is not already included, first you
need to determine whether it is necessary to package it for Pyodide. Ideally, you
should start this process with package dependencies.
### 1. Determining if creating a Pyodide package is necessary
Most pure Python packages can be installed directly from PyPI with
{func}`micropip.install` if they have a pure Python wheel. Check if this is the
case by going to the `pypi.org/project/<package-name>` URL and checking if the
"Download files" tab contains a file whose name ends with `py3-none-any.whl`.
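
The filename check described above is easy to script once the list of release files is known. A minimal sketch, assuming the filenames were copied from the "Download files" tab; the function name and the example filenames are hypothetical.

```python
def has_pure_python_wheel(filenames):
    """Return True if any filename looks like a pure Python 3 wheel."""
    return any(name.endswith("py3-none-any.whl") for name in filenames)


# Hypothetical release files for two imaginary packages.
print(has_pure_python_wheel([
    "example_pkg-1.0.tar.gz",
    "example_pkg-1.0-py3-none-any.whl",
]))  # True
print(has_pure_python_wheel([
    "example_ext-1.0.tar.gz",
    "example_ext-1.0-cp39-cp39-manylinux1_x86_64.whl",
]))  # False
```

Universal wheels tagged `py2.py3-none-any` also match, since their names end with the same suffix.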
If the wheel is not on PyPI, but nevertheless you believe there is nothing
preventing it (it is a Python package without C extensions):
- you can create the wheel yourself by running,
```
python -m pip install build
python -m build
```
from within the package folder where the `setup.py`
is located. See the [Python packaging
guide](https://packaging.python.org/tutorials/packaging-projects/#generating-distribution-archives)
for more details.
Then upload the wheel file somewhere (not to PyPI) and install it with
micropip via its URL.
- you can also open an issue in the package repository asking the
authors to upload the wheel.
If however the package has C extensions or its code requires patching, then
continue to the next steps.
```{note}
To determine if a package has C extensions, check if its `setup.py` contains
any compilation commands.
```
### 2. Creating the meta.yaml
Once you have determined that you need to create a new package for Pyodide, the
easiest place to start is with the {ref}`mkpkg tool <pyodide-mkpkg>`. If your
package is on PyPI, run:
`pyodide-build mkpkg <package-name>`
This will generate a `meta.yaml` under `packages/<package-name>/` (see
{ref}`meta-yaml-spec`) that should work out of the box for many simple Python
packages. This tool will populate the latest version, download link and sha256
hash by querying PyPI. It doesn't currently handle package dependencies, so you
will need to specify those yourself.
You can also use the `meta.yaml` of other Pyodide packages in the `packages/`
folder as a starting point.
```{note}
To reliably determine build and runtime dependencies, including for non-Python
libraries, it is often useful to check whether the package was already built on
[conda-forge](https://conda-forge.org/) and open the corresponding `meta.yaml`
file. This can be done either by checking if the URL
`https://github.com/conda-forge/<package-name>-feedstock/blob/master/recipe/meta.yaml`
exists, or by searching the [conda-forge GitHub
org](https://github.com/conda-forge/) for the package name.
The `meta.yaml` in Pyodide was inspired by the one in conda, however it is
not strictly compatible.
```
### 3. Building the package and investigating issues
Once the `meta.yaml` is ready, we build the package with,
```
PYODIDE_PACKAGES="<package-name>" make
```
and see if there are any errors. The detailed build log can be found under
`packages/<package-name>/<package-name>.log`.
If there are errors you might need to,
- patch the package by adding `.patch` files to `packages/<package-name>/patches`
- add the patch files to the `source/patches` field in the `meta.yaml` file
then re-start the build.
In general, it is recommended to look into how other similar packages are built in Pyodide.
If you still encounter difficulties in building your package, open a [new Pyodide
issue](https://github.com/pyodide/pyodide/issues).
To learn more about how packages are built in Pyodide, read the following
sections.
## Build pipeline
Pyodide includes a toolchain to make it easier to add new third-party Python
libraries to the build. We automate the following steps:
@ -26,166 +121,6 @@ Lastly, a `packages.json` file is output containing the dependency tree of all
packages, so {any}`pyodide.loadPackage` can
load a package's dependencies automatically.
## mkpkg
If you wish to create a new package for Pyodide, the easiest place to start is
with the {ref}`mkpkg tool <pyodide-mkpkg>`. If your package is on PyPI, just run:
`pyodide-build mkpkg $PACKAGE_NAME`
This will generate a `meta.yaml` (see below) that should work out of the box
for many pure Python packages. This tool will populate the latest version, download
link and sha256 hash by querying PyPI. It doesn't currently handle package
dependencies, so you will need to specify those yourself.
## The meta.yaml file
Packages are defined by writing a `meta.yaml` file. The format of these files is
based on the `meta.yaml` files used to build [Conda
packages](https://conda.io/docs/user-guide/tasks/build-packages/define-metadata.html),
though it is much more limited. The most important limitation is that Pyodide
assumes there will only be one version of a given library available, whereas
Conda allows the user to specify the versions of each package that they want to
install. Despite the limitations, keeping the file format as close as possible
to conda's should make it easier to use existing conda package definitions as a
starting point to create Pyodide packages. In general, however, one should not
expect Conda packages to "just work" with Pyodide. (In the longer term, Pyodide
may use conda as its packaging system, and this should hopefully ease that
transition.)
The supported keys in the `meta.yaml` file are described below.
### `package`
#### `package/name`
The name of the package. It must match the name of the package used when
expanding the tarball, which is sometimes different from the name of the package
in the Python namespace when installed. It must also match the name of the
directory in which the `meta.yaml` file is placed. It can only contain
alpha-numeric characters and `-`, `_`.
#### `package/version`
The version of the package.
### `source`
#### `source/url`
The URL of the source tarball.
The tarball may be in any of the formats supported by Python's
`shutil.unpack_archive`: `tar`, `gztar`, `bztar`, `xztar`, and `zip`.
### `source/extract_dir`
The top level directory name of the contents of the source tarball (i.e. once
you extract the tarball, all the contents are in the directory named
`source/extract_dir`). This defaults to the tarball name (sans extension).
#### `source/path`
Alternatively to `source/url`, a relative or absolute path can be specified
as package source. This is useful for local testing or building packages which
are not available online in the required format.
If a path is specified, any provided checksums are ignored.
#### `source/md5`
The MD5 checksum of the tarball. It is recommended to use SHA256 instead of MD5.
At most one checksum entry should be provided per package.
#### `source/sha256`
The SHA256 checksum of the tarball. It is recommended to use SHA256 instead of MD5.
At most one checksum entry should be provided per package.
#### `source/patches`
A list of patch files to apply after expanding the tarball. These are applied
using `patch -p1` from the root of the source tree.
#### `source/extras`
Extra files to add to the source tree. This should be a list where each entry is
a pair of the form `(src, dst)`. The `src` path is relative to the directory in
which the `meta.yaml` file resides. The `dst` path is relative to the root of
source tree (the expanded tarball).
### `build`
#### `build/skip_host`
Skip building C extensions for the host environment. Default: `True`.
Setting this to `False` will result in ~2x slower builds for packages that
include C extensions. It should only be needed when a package is a build
time dependency for other packages. For instance, numpy is imported during
installation of matplotlib, importing numpy also imports included C extensions,
therefore it is built both for host and target.
#### `build/cflags`
Extra arguments to pass to the compiler when building for WebAssembly.
(This key is not in the Conda spec).
#### `build/cxxflags`
Extra arguments to pass to the compiler when building C++ files for WebAssembly. Note that both clfags and cxxflags will be used when compiling C++ files. A common example would be to use -std=c++11 for code that makes use of C++11 features.
(This key is not in the Conda spec).
#### `build/ldflags`
Extra arguments to pass to the linker when building for WebAssembly.
(This key is not in the Conda spec).
#### `build/library`
Should be set to true for library packages. Library packages are packages that are needed for other packages but are not Python packages themselves. For library packages, the script specified in the `build/script` section is run to compile the library. See the [zlib meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/zlib/meta.yaml) for an example of a library package specification.
#### `build/sharedlibrary`
Should be set to true for shared library packages. Shared library packages are packages that are needed for other packages, but are loaded dynamically when Pyodide is run. For shared library packages, the script specified in the `build/script` section is run to compile the library. The script should build the shared library and copy into into a subfolder of the source folder called `install`. Files or folders in this install folder will be packaged to make the Pyodide package. See the [CLAPACK meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/CLAPACK/meta.yaml) for an example of a shared library specification.
#### `build/script`
The script section is required for a library package (`build/library` set to true). For a Python package this section is optional. If it is specified for a Python package, the script section will be run before the build system runs `setup.py`. This script is run by `bash` in the directory where the tarball was extracted.
#### `build/post`
Shell commands to run after building the library. These are run inside of
`bash`, and there are two special environment variables defined:
- `$SITEPACKAGES`: The `site-packages` directory into which the package has been installed.
- `$PKGDIR`: The directory in which the `meta.yaml` file resides.
(This key is not in the Conda spec).
#### `build/replace-libs`
A list of strings of the form `<old_name>=<new_name>`, to rename libraries when linking. This in particular
might be necessary when using emscripten ports.
For instance, `png16=png` is currently used in matplotlib.
### `requirements`
#### `requirements/run`
A list of required packages.
(Unlike conda, this only supports package names, not versions).
### `test`
#### `test/imports`
List of imports to test after the package is built.
## C library dependencies
Some Python packages depend on certain C libraries, e.g. `lxml` depends on
@ -220,8 +155,8 @@ of the Python package. See e.g. `matplotlib` for an example.
## Structure of a Pyodide package
This section describes the structure of a pure Python package, and how our
build system creates it (In general, it is not recommended, to construct these
by hand; instead create a Python wheel and install it with micropip)
build system creates it. In general, it is not recommended to construct these
by hand; instead create a Python wheel and install it with micropip.
Pyodide is obtained by compiling CPython into WebAssembly. As such, it loads
packages the same way as CPython --- it looks for relevant `.py` files in
@ -269,3 +204,10 @@ The arguments can be explained as follows:
- `--exclude *__pycache__*` to omit the pycache directories
- `--use-preload-plugins` says to [automatically decode files based on their
extension](https://emscripten.org/docs/porting/files/packaging_files.html#preloading-files)
```{eval-rst}
.. toctree::
:hidden:
meta-yaml.md
```


@ -27,6 +27,8 @@ releases](https://github.com/pyodide/pyodide/releases)
(`pyodide-build-*.tar.bz2` file) serve them yourself, as explained in the
following section.
(serving_pyodide_packages)=
## Serving Pyodide packages
If you built your Pyodide distribution or downloaded the release tarball