This moves unisolation into a package key. `cross-build-env: true` means the package
is part of the cross build environment and should be unisolated. `cross-build-files`
gives a list of files that should be copied from the built copy of the package into the host
copy of the package.
This will allow us to construct a cross build environment automatically as part of building
packages. If we have these files and the Python include directory, this is sufficient for
cross-building binary packages.
This resolves#2189.
> build isolation would be a bit difficult to use in our case, as for instance
> when building scipy we need the patched numpy on the host and not the numpy
> version specified in pyproject.toml (which would be unpatched)
This is indeed the case, certain packages cannot be isolated. My strategy is to
make a list of packages that shouldn't be isolated and add symlinks from the
isolated build environment into the `.artifacts` directory to "unisolate" them.
Then we remove the unisolated package requirements from the list of packages to
install, in case pesky constraints aren't satisfied. In particular, packages
that expect to be used with `pypa/build` often feel free to put very specific
constraints on their build dependencies (often asking them to be == to a
particular version). Specific version constraints is good for build
reproducibility and with build isolation doesn't cost anything. So we just
ignore the constraints. Hopefully nothing goes wrong.
In particular, any package that does stuff both at build time and at runtime and
requires synchronization between the build time and run time environments needs
the unisolation. This includes cffi with `_cffi_backend.so`, and of course numpy
and scipy. pycparser needs to be unisolated because it is a dependency of cffi.
Currently I have also unisolated pythran and cython, though these are build time
only tools and do not really need to be unisolated. Cython I unisolated
specifically because numcodecs needs it but it isn't in the numcodecs build
dependencies. Pythran I unisolated because of a problem with the scipy build
which I don't fully understand (some problem with long double feature
detection).
Our package build process currently has a significant flaw: we first run setup.py, recording all compilation commands, then we rewrite these compilation commands to invoke emcc and replay them, and then we pray that the cross compiled executables ended up in the right place to go into the wheel. This is not a good strategy because the build script is allowed to implement arbitrary logic, and if it moves, renames, etc any of the output files then we lose track of them. This has repeatedly caused difficulty for us.
However, we also make no particularly significant use of the two pass approach. We can just do the simpler thing: capture the compiler commands as they occur, modify them as needed, and then run the fixed command.
I also added a patch to fix the numpy feature detection for wasm so that we don't have to include _npyconfig.h and config.h, numpy can generate them in the way it would for a native build. I opened a numpy PR that would fix the detection for us upstream:
numpy/numpy#21154
This clears the way for us to switch to using pypa/build (as @henryiii has suggested) by removing our dependence on specific setuptools behavior.
This is on top of #2238.
This patch makes everything in numpy, scipy and CLAPACK that began life as a fortran subroutine return int, whereas before they were a smattering of void plus lots of int, with a bunch of conflicts where the same function was defined as int in some and void in others. This matters because on upstream emscripten, these packages will not build due to linker conflicts, as the wasm linker can't link a function returning int to a function returning void.
Annoyingly, this has to be int return not void, because whereas a normal fortran subroutine doesn't return anything, there's an obscure feature of fortran 77 called alternative returns which allows subroutines to say they want to return by jumping to some other place in the code. f2c handles this with integer return values.