We do several filesystem operations before loading Python.
- Mount _node_mounts
- Create default directories
- Register NativeFS file system
This PR organizes all those operations into a single function.
This is another split-off from #3582.
Resolves https://github.com/pyodide/pyodide/issues/3337
In Firefox, if one types anything containing spaces and then tries to copy-paste the input into a standard Python REPL, one gets:
SyntaxError: invalid non-printable character U+00A0
This is because spaces are replaced by the non-breaking space character.
This patch replaces non-breaking space characters with normal space characters in the REPL.
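The substitution itself is small. As a minimal Python sketch of the idea (the patch itself lives in the REPL's own code, and the name `normalize_spaces` is hypothetical):

```python
NBSP = "\u00a0"  # non-breaking space, as found in the copied text


def normalize_spaces(source: str) -> str:
    """Replace every non-breaking space with an ordinary space."""
    return source.replace(NBSP, " ")


pasted = "x\u00a0=\u00a01"
cleaned = normalize_spaces(pasted)

# The cleaned text compiles; the pasted text would raise SyntaxError.
compile(cleaned, "<repl>", "exec")
```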
This creates a new `pyodide.ffi` submodule and adds a bunch of new subclasses of
`PyProxy` to it.
There are three stages in which we are concerned with the behavior of the
objects we define:
1. at time of static typechecks
2. at execution time
3. when generating docs
Prior to this PR, the subtypes of PyProxy only work well for static type checks,
they work acceptably at runtime (just don't let the user access them), and the
docs don't look that great. This PR is primarily intended to improve the docs
for PyProxy, but it also makes execution time checks work better: you can now
say `obj instanceof pyodide.ffi.PyCallable` instead of `obj.isCallable()`, which
is easier to understand and to cross reference against the documentation. I am
marking `isCallable` as deprecated.
I also made a bunch of edits and improvements to the docs.
I have deprecated `PyProxyCallable` in favor of `pyodide.ffi.PyCallable` and
`PyProxy.isCallable` in favor of `obj instanceof pyodide.ffi.PyCallable`.
`PyBuffer` has been renamed to `pyodide.ffi.PyBufferView` and a new `PyBuffer`
has been created which is a subtype of `PyProxy`.
This PR adds a `package_type` field to repodata.json and uses it to create a list of
unvendored standard libraries. After this we don't need to manage a hard-coded
list of unvendored stdlib modules in pyodide-py.
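A rough sketch of how such a list could be derived from the new field. The JSON excerpt is made up for illustration, and both the `"cpython_module"` value and the helper name `unvendored_stdlibs` are assumptions rather than something stated in this PR:

```python
import json

# Hypothetical excerpt of repodata.json; the real file has more fields.
REPODATA = json.loads("""
{
  "packages": {
    "numpy": {"name": "numpy", "package_type": "package"},
    "ssl": {"name": "ssl", "package_type": "cpython_module"},
    "sqlite3": {"name": "sqlite3", "package_type": "cpython_module"}
  }
}
""")


def unvendored_stdlibs(repodata: dict) -> list[str]:
    """Return the sorted names of packages tagged as stdlib modules."""
    return sorted(
        name
        for name, meta in repodata["packages"].items()
        # Assumed tag value for unvendored standard library modules.
        if meta.get("package_type") == "cpython_module"
    )


print(unvendored_stdlibs(REPODATA))
```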
This adds a short helper script which shows the gzip- and brotli-compressed sizes of a file,
and uses it in CI to check the compressed size of pyodide.asm.* after the build, in addition to
the original file size.
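For illustration only, a helper of this kind might look like the sketch below. It is not the script added by this PR: the function names are hypothetical, and the third-party `brotli` package is treated as optional so the gzip part still works without it.

```python
import gzip
from pathlib import Path

try:
    import brotli  # third-party package; optional in this sketch
except ImportError:
    brotli = None


def compressed_sizes(data: bytes) -> dict:
    """Return the raw and compressed sizes of `data` in bytes."""
    sizes = {"raw": len(data), "gzip": len(gzip.compress(data))}
    if brotli is not None:
        sizes["brotli"] = len(brotli.compress(data))
    return sizes


def report(path: str) -> None:
    """Print one line per size for the file at `path`."""
    for kind, size in compressed_sizes(Path(path).read_bytes()).items():
        print(f"{path} ({kind}): {size} bytes")
```

Calling `report` on each `pyodide.asm.*` file after the build would give the extra numbers next to the original file size.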
This leads to more consistent rendering (functions and methods get parens after
them) and reduces chances of warnings about getting the wrong link. It is also
possible to use `~fully.qualified.name` to just show `name` if we use a specific
reference type, but it doesn't work with `any` for some reason.
Removes / unvendors some Python modules:
- Remove `_aix_support.py`, which is for supporting the IBM AIX OS.
- Unvendor `_pydecimal.py`.
  - `_pydecimal` is a pure Python implementation of the `decimal` module. [Importing `decimal` falls back](https://github.com/python/cpython/blob/main/Lib/decimal.py) to `_pydecimal` if the C implementation `_decimal` is not available. In our case, `_decimal` is available, so `_pydecimal` will not normally be used.
- Unvendor `pydoc_data`.
  - `pydoc_data` contains [a large (~700KB) dictionary](https://github.com/python/cpython/blob/main/Lib/pydoc_data/topics.py) for explaining Python builtins. This is mostly used when `help("...")` is called.
This is work towards unvendoring the Pyodide foreign function interface.
Prior to this point, we included a large amount of critical functionality with `--pre-js`.
So we could create an archive called `libpyodide.a` with the object files but to use it
you would have to pass `--pre-js _pyodide.out.js` at link time. This embeds all of this
stuff in an object file called `pyodide_pre.o` which goes in our archive so you get all
the needed js runtime by linking it.
Of course someone trying to use this still has to get the Python code onto the import
path, either using `--preload-file`, using Python to unpack it as a zip archive as we now
do, with zipimporter, or otherwise. They also will have to link `libpython.a` (is CPython
going to start distributing an Emscripten libpython?) and probably various other things.
We have to use a hack to inject the JavaScript code into the object files. The normal
`EM_JS` macro cannot handle arbitrary JavaScript code -- for example it fails with many
regular expressions. Instead we generate a C source file that does what we need using
`xxd`. The generated C code is similar to what `EM_JS` generates, but it uses an array
initializer rather than a string initializer for the characters, avoiding the C preprocessor /
compiler's strange opinions about strings.
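To show why an array initializer sidesteps the quoting problem, here is a small Python sketch that turns JavaScript source into a C byte array, roughly in the spirit of `xxd -i`. It is not the build step used in this PR: the symbol name `pyodide_js_init` and the trailing NUL byte are assumptions, and the extra attributes that `EM_JS` attaches to its data are omitted.

```python
def js_to_c_array(js_source: str, symbol: str = "pyodide_js_init") -> str:
    """Render JavaScript source as a C char array initializer."""
    # Assumption: append a NUL byte so the array is usable as a C string.
    data = js_source.encode("utf-8") + b"\x00"
    rows = []
    for start in range(0, len(data), 12):
        chunk = data[start:start + 12]
        rows.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk))
    body = ",\n".join(rows)
    return f"const char {symbol}[] = {{\n{body}\n}};\n"


# Quotes and backslashes in a regex need no escaping in this form.
print(js_to_c_array(r'/"\d+\\"/.test(s)'))
```

Because every character is emitted as a hex byte, nothing in the JavaScript can be misread as a C string escape or a preprocessor token.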
Instead of putting stuff behind `IN_SPHINX`, define functions and call them from the `setup` function.
In these functions, if we want to expose variables as part of the config we have to assign to `app.config.some_var` which is more explicit.
We still have to make the path change at top level. To improve this, in the future we should:
1. rename the sphinx_pyodide folder to sphinx-pyodide
2. add a `pyproject.toml` and `setup.py` so we can `pip install -e` it
3. instead of modifying the path, source the virtual environment
Up to this point, we've used this dynamic subclassing method for
producing JsProxies for everything but errors. For errors, we make
a wrapper which is not a JsProxy that inherits from Exception and
give the wrapper a "js_error" attribute that points to an actual
error. We also make python2js know about this wrapper so it can
unwrap it. But the raw js_error object is a bit weird. There isn't
anything terrible about this situation but it is mildly unsatisfying.
This changes it so that errors subclass both JsProxy and Exception.
To do this we need to:
1. ensure that they have compatible memory layouts, and
2. convince Python that they have compatible memory layouts
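CPython checks layout compatibility itself whenever a class with several bases is created, which is why the second step is needed. As a pure-Python illustration of that check (not the C code from this PR; the helper `try_subclass` is hypothetical):

```python
def try_subclass(*bases):
    """Return a new class deriving from `bases`, or the error raised."""
    try:
        return type("Combined", bases, {})
    except TypeError as exc:
        return exc


class Plain:
    """An ordinary Python class with no special C-level layout."""


# int and BaseException each have their own C struct layout, so
# CPython refuses to combine them and raises TypeError.
conflict = try_subclass(int, BaseException)
print(type(conflict).__name__)

# A plain Python class adds no conflicting layout, so this succeeds.
ok = try_subclass(Plain, BaseException)
print(ok.__mro__)
```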
I switched to using a union for the different subtypes of JsProxy
that need extra space: JsCallable, JsBuffer, and JsError. We need
js to be at the end so it won't get in the way of the BaseException
memory layout. I added _Static_asserts to double check that the
memory layouts do in fact agree.
To convince Python that they have compatible memory layouts, we have
to temporarily tell it that JsProxy is a subtype of BaseException.
To do this, we just set JsProxy.__mro__ = (BaseException,) before
creating the type and then restore it afterward.
This prints a better set of error messages in case someone calls os._exit() or C code calls exit().
In the future we might like to do something better here, but for now at least we can print a clear
error message.