mirror of https://github.com/python/cpython.git
Improve pickle's documentation.
There is still much to be done, but I am committing my changes incrementally to avoid losing them again (for a third time now).
This commit is contained in:
parent
87eee631fb
commit
758bca6e36
|
@ -92,11 +92,9 @@ advantage that there are no restrictions imposed by external standards such as
|
|||
XDR (which can't represent pointer sharing); however it means that non-Python
|
||||
programs may not be able to reconstruct pickled Python objects.
|
||||
|
||||
By default, the :mod:`pickle` data format uses a printable ASCII representation.
|
||||
This is slightly more voluminous than a binary representation. The big
|
||||
advantage of using printable ASCII (and of some other characteristics of
|
||||
:mod:`pickle`'s representation) is that for debugging or recovery purposes it is
|
||||
possible for a human to read the pickled file with a standard text editor.
|
||||
By default, the :mod:`pickle` data format uses a compact binary representation.
|
||||
The module :mod:`pickletools` contains tools for analyzing data streams
|
||||
generated by :mod:`pickle`.
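
As a rough illustration of the kind of analysis :mod:`pickletools` allows, the sketch below lists the opcodes of a small pickle with :func:`pickletools.genops` and shrinks it with :func:`pickletools.optimize`. The sample value and the variable names are made up for the example.

```python
import pickle
import pickletools

value = {"spam": [1, 2, 3]}  # arbitrary sample data for the demo
blob = pickle.dumps(value, 2)

# genops() yields one (opcode, argument, position) triple per opcode.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(blob)]

# optimize() drops PUT opcodes whose memo entries are never read back.
optimized = pickletools.optimize(blob)
```

The optimized pickle still loads to an equal object; it is merely no longer than the original.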
|
||||
|
||||
There are currently 4 different protocols which can be used for pickling.
|
||||
|
||||
|
@ -110,17 +108,15 @@ There are currently 4 different protocols which can be used for pickling.
|
|||
efficient pickling of :term:`new-style class`\es.
|
||||
|
||||
* Protocol version 3 was added in Python 3.0. It has explicit support for
|
||||
bytes and cannot be unpickled by Python 2.x pickle modules.
|
||||
bytes and cannot be unpickled by Python 2.x pickle modules. This is
|
||||
the currently recommended protocol; use it whenever possible.
|
||||
|
||||
Refer to :pep:`307` for more information.
|
||||
|
||||
If a *protocol* is not specified, protocol 3 is used. If *protocol* is
|
||||
If a *protocol* is not specified, protocol 3 is used. If *protocol* is
|
||||
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
|
||||
protocol version available will be used.
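
A minimal sketch of protocol selection, assuming only the standard :mod:`pickle` API; the sample dictionary is arbitrary. It checks that a negative value and :const:`HIGHEST_PROTOCOL` produce the same stream.

```python
import pickle

data = {"answer": 42}  # arbitrary sample data for the demo

# A negative protocol number selects the highest protocol available.
highest = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
negative = pickle.dumps(data, -1)
same_stream = highest == negative

# Pickles written with protocol 2 or later begin with the PROTO opcode
# (0x80) followed by one byte holding the protocol number.
version_byte = highest[1]
```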
|
||||
|
||||
A binary format, which is slightly more efficient, can be chosen by specifying a
|
||||
*protocol* version >= 1.
|
||||
|
||||
|
||||
Usage
|
||||
-----
|
||||
|
@ -146,152 +142,210 @@ an unpickler, then you call the unpickler's :meth:`load` method. The
|
|||
as line terminators and therefore will look "funny" when viewed in Notepad or
|
||||
other editors which do not support this format.
|
||||
|
||||
.. data:: DEFAULT_PROTOCOL
|
||||
|
||||
The default protocol used for pickling. May be less than HIGHEST_PROTOCOL.
|
||||
Currently the default protocol is 3, a backward-incompatible protocol
|
||||
designed for Python 3.0.
|
||||
|
||||
|
||||
The :mod:`pickle` module provides the following functions to make the pickling
|
||||
process more convenient:
|
||||
|
||||
|
||||
.. function:: dump(obj, file[, protocol])
|
||||
|
||||
Write a pickled representation of *obj* to the open file object *file*. This is
|
||||
equivalent to ``Pickler(file, protocol).dump(obj)``.
|
||||
Write a pickled representation of *obj* to the open file object *file*. This
|
||||
is equivalent to ``Pickler(file, protocol).dump(obj)``.
|
||||
|
||||
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is
|
||||
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
|
||||
protocol version will be used.
|
||||
The optional *protocol* argument tells the pickler to use the given protocol;
|
||||
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
|
||||
backward-incompatible protocol designed for Python 3.0.
|
||||
|
||||
*file* must have a :meth:`write` method that accepts a single string argument.
|
||||
It can thus be a file object opened for writing, a :mod:`StringIO` object, or
|
||||
any other custom object that meets this interface.
|
||||
|
||||
|
||||
.. function:: load(file)
|
||||
|
||||
Read a string from the open file object *file* and interpret it as a pickle data
|
||||
stream, reconstructing and returning the original object hierarchy. This is
|
||||
equivalent to ``Unpickler(file).load()``.
|
||||
|
||||
*file* must have two methods, a :meth:`read` method that takes an integer
|
||||
argument, and a :meth:`readline` method that requires no arguments. Both
|
||||
methods should return a string. Thus *file* can be a file object opened for
|
||||
reading, a :mod:`StringIO` object, or any other custom object that meets this
|
||||
interface.
|
||||
|
||||
This function automatically determines whether the data stream was written in
|
||||
binary mode or not.
|
||||
Specifying a negative protocol version selects the highest protocol version
|
||||
supported. The higher the protocol used, the more recent the version of
|
||||
Python needed to read the pickle produced.
|
||||
|
||||
The *file* argument must have a write() method that accepts a single bytes
|
||||
argument. It can thus be a file object opened for binary writing, an
|
||||
io.BytesIO instance, or any other custom object that meets this interface.
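
For instance, a round trip through an in-memory binary buffer might look like the following sketch. The record contents are invented; any object with a suitable ``write()`` method could stand in for the :class:`io.BytesIO` instance.

```python
import io
import pickle

record = {"name": "spam", "values": [1, 2, 3]}  # invented sample data

buffer = io.BytesIO()           # binary file-like object with a write() method
pickle.dump(record, buffer, 3)  # explicitly request protocol 3

buffer.seek(0)                  # rewind before reading the pickle back
restored = pickle.load(buffer)
```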
|
||||
|
||||
.. function:: dumps(obj[, protocol])
|
||||
|
||||
Return the pickled representation of the object as a :class:`bytes`
|
||||
object, instead of writing it to a file.
|
||||
|
||||
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol*
|
||||
is specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
|
||||
protocol version will be used.
|
||||
The optional *protocol* argument tells the pickler to use the given protocol;
|
||||
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
|
||||
backward-incompatible protocol designed for Python 3.0.
|
||||
|
||||
Specifying a negative protocol version selects the highest protocol version
|
||||
supported. The higher the protocol used, the more recent the version of
|
||||
Python needed to read the pickle produced.
|
||||
|
||||
.. function:: load(file, [\*, encoding="ASCII", errors="strict"])
|
||||
|
||||
Read a pickled object representation from the open file object *file* and
|
||||
return the reconstituted object hierarchy specified therein. This is
|
||||
equivalent to ``Unpickler(file).load()``.
|
||||
|
||||
The protocol version of the pickle is detected automatically, so no protocol
|
||||
argument is needed. Bytes past the pickled object's representation are
|
||||
ignored.
|
||||
|
||||
The argument *file* must have two methods, a read() method that takes an
|
||||
integer argument, and a readline() method that requires no arguments. Both
|
||||
methods should return bytes. Thus *file* can be a binary file object opened
|
||||
for reading, a BytesIO object, or any other custom object that meets this
|
||||
interface.
|
||||
|
||||
Optional keyword arguments are encoding and errors, which are used to decode
|
||||
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
|
||||
'strict', respectively.
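
Because :func:`load` stops reading at the end of one pickle, several pickles written to the same file can be read back one call at a time. The sketch below assumes an in-memory :class:`io.BytesIO` stream and invented sample values.

```python
import io
import pickle

stream = io.BytesIO()
pickle.dump("first", stream, 3)
pickle.dump({"second": 2}, stream, 3)

stream.seek(0)
first = pickle.load(stream)   # reads only the first pickle
second = pickle.load(stream)  # continues where the first call stopped

try:
    pickle.load(stream)       # nothing left to read
    exhausted = False
except EOFError:
    exhausted = True
```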
|
||||
|
||||
.. function:: loads(bytes_object, [\*, encoding="ASCII", errors="strict"])
|
||||
|
||||
Read a pickled object hierarchy from a :class:`bytes` object and return the
|
||||
reconstituted object hierarchy specified therein.
|
||||
|
||||
The protocol version of the pickle is detected automatically, so no protocol
|
||||
argument is needed. Bytes past the pickled object's representation are
|
||||
ignored.
|
||||
|
||||
Optional keyword arguments are encoding and errors, which are used to decode
|
||||
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
|
||||
'strict', respectively.
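
The sketch below exercises the three points above: protocol detection, ignored trailing bytes, and the *encoding* argument. The ``py2_pickle`` bytes are written by hand to imitate what Python 2.x would emit for an 8-bit string; they are an assumption made for the demonstration, not captured output.

```python
import pickle

value = ["spam", 1, 2.5, None]  # arbitrary sample data

# loads() needs no protocol argument, whichever protocol wrote the data.
round_trips = [pickle.loads(pickle.dumps(value, protocol))
               for protocol in range(4)]

# Bytes after the end of the pickle are ignored.
with_trailing = pickle.loads(pickle.dumps(value, 3) + b"trailing bytes")

# Hand-written protocol 0 pickle of the 8-bit string 'caf\xe9'.
py2_pickle = b"S'caf\\xe9'\n."
as_text = pickle.loads(py2_pickle, encoding="latin-1")

try:
    pickle.loads(py2_pickle)  # defaults: encoding="ASCII", errors="strict"
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```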
|
||||
|
||||
|
||||
.. function:: loads(bytes_object)
|
||||
|
||||
Read a pickled object hierarchy from a :class:`bytes` object.
|
||||
Bytes past the pickled object's representation are ignored.
|
||||
|
||||
The :mod:`pickle` module also defines three exceptions:
|
||||
|
||||
The :mod:`pickle` module defines three exceptions:
|
||||
|
||||
.. exception:: PickleError
|
||||
|
||||
A common base class for the other exceptions defined below. This inherits from
|
||||
Common base class for the other pickling exceptions. It inherits from
|
||||
:exc:`Exception`.
|
||||
|
||||
|
||||
.. exception:: PicklingError
|
||||
|
||||
This exception is raised when an unpicklable object is passed to the
|
||||
:meth:`dump` method.
|
||||
|
||||
Error raised when an unpicklable object is encountered by :class:`Pickler`.
|
||||
It inherits from :exc:`PickleError`.
|
||||
|
||||
.. exception:: UnpicklingError
|
||||
|
||||
This exception is raised when there is a problem unpickling an object. Note that
|
||||
other exceptions may also be raised during unpickling, including (but not
|
||||
necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
|
||||
:exc:`ImportError`, and :exc:`IndexError`.
|
||||
Error raised when there is a problem unpickling an object, such as data
|
||||
corruption or a security violation. It inherits from :exc:`PickleError`.
|
||||
|
||||
The :mod:`pickle` module also exports two callables, :class:`Pickler` and
|
||||
Note that other exceptions may also be raised during unpickling, including
|
||||
(but not necessarily limited to) AttributeError, EOFError, ImportError, and
|
||||
IndexError.
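
A small sketch of how these exceptions can surface in practice. The helper ``try_loads`` is hypothetical and exists only for this example.

```python
import pickle

def try_loads(data):
    """Return the name of the exception raised by pickle.loads(), or None."""
    try:
        pickle.loads(data)
    except pickle.UnpicklingError:
        return "UnpicklingError"
    except EOFError:
        return "EOFError"
    return None

corrupt_result = try_loads(b"\x00 not a pickle")  # invalid first opcode
empty_result = try_loads(b"")                     # no data at all
```

Code that only wants to know whether unpickling failed may therefore need to catch more than :exc:`PickleError`.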
|
||||
|
||||
|
||||
The :mod:`pickle` module exports two classes, :class:`Pickler` and
|
||||
:class:`Unpickler`:
|
||||
|
||||
|
||||
.. class:: Pickler(file[, protocol])
|
||||
|
||||
This takes a file-like object to which it will write a pickle data stream.
|
||||
This takes a binary file for writing a pickle data stream.
|
||||
|
||||
If the *protocol* parameter is omitted, protocol 3 is used. If *protocol* is
|
||||
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
|
||||
protocol version will be used.
|
||||
The optional *protocol* argument tells the pickler to use the given protocol;
|
||||
supported protocols are 0, 1, 2, and 3. The default protocol is 3, a
|
||||
backward-incompatible protocol designed for Python 3.0.
|
||||
|
||||
*file* must have a :meth:`write` method that accepts a single string argument.
|
||||
It can thus be an open file object, a :mod:`StringIO` object, or any other
|
||||
custom object that meets this interface.
|
||||
|
||||
:class:`Pickler` objects define one (or two) public methods:
|
||||
Specifying a negative protocol version selects the highest protocol version
|
||||
supported. The higher the protocol used, the more recent the version of
|
||||
Python needed to read the pickle produced.
|
||||
|
||||
The *file* argument must have a write() method that accepts a single bytes
|
||||
argument. It can thus be a file object opened for binary writing, an
|
||||
io.BytesIO instance, or any other custom object that meets this interface.
|
||||
|
||||
.. method:: dump(obj)
|
||||
|
||||
Write a pickled representation of *obj* to the open file object given in the
|
||||
constructor. Either the binary or ASCII format will be used, depending on the
|
||||
value of the *protocol* argument passed to the constructor.
|
||||
Write a pickled representation of *obj* to the open file object given in
|
||||
the constructor.
|
||||
|
||||
.. method:: persistent_id(obj)
|
||||
|
||||
Do nothing by default. This exists so a subclass can override it.
|
||||
|
||||
If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
|
||||
other value causes :class:`Pickler` to emit the returned value as a
|
||||
persistent ID for *obj*. The meaning of this persistent ID should be
|
||||
defined by :meth:`Unpickler.persistent_load`. Note that the value
|
||||
returned by :meth:`persistent_id` cannot itself have a persistent ID.
|
||||
|
||||
See :ref:`pickle-persistent` for details and examples of uses.
|
||||
|
||||
.. method:: clear_memo()
|
||||
|
||||
Clears the pickler's "memo". The memo is the data structure that remembers
|
||||
which objects the pickler has already seen, so that shared or recursive objects
|
||||
are pickled by reference and not by value. This method is useful when re-using
|
||||
picklers.
|
||||
Deprecated. Use the :meth:`clear` method on the :attr:`memo`. Clear the
|
||||
pickler's memo, useful when reusing picklers.
|
||||
|
||||
.. attribute:: fast
|
||||
|
||||
Enable fast mode if set to a true value. The fast mode disables the usage
|
||||
of memo, therefore speeding the pickling process by not generating
|
||||
superfluous PUT opcodes. It should not be used with self-referential
|
||||
objects; doing otherwise will cause :class:`Pickler` to recurse
|
||||
infinitely.
|
||||
|
||||
Use :func:`pickletools.optimize` if you need more compact pickles.
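
The effect of :attr:`fast` on the opcode stream can be observed with :func:`pickletools.genops`. This is only a sketch: the helper ``put_count`` and the sample list are invented for the example.

```python
import io
import pickle
import pickletools

def put_count(fast):
    """Count the PUT-style opcodes emitted for a small nested list."""
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer, 3)
    pickler.fast = fast
    pickler.dump(["spam", ["eggs"]])  # no self-references, so fast mode is safe
    return sum(1 for opcode, arg, pos in pickletools.genops(buffer.getvalue())
               if "PUT" in opcode.name)

puts_in_fast_mode = put_count(True)
puts_in_normal_mode = put_count(False)
```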
|
||||
|
||||
.. attribute:: memo
|
||||
|
||||
Dictionary holding previously pickled objects to allow shared or
|
||||
recursive objects to be pickled by reference as opposed to by value.
|
||||
|
||||
|
||||
It is possible to make multiple calls to the :meth:`dump` method of the same
|
||||
:class:`Pickler` instance. These must then be matched to the same number of
|
||||
calls to the :meth:`load` method of the corresponding :class:`Unpickler`
|
||||
instance. If the same object is pickled by multiple :meth:`dump` calls, the
|
||||
:meth:`load` will all yield references to the same object. [#]_
|
||||
:meth:`load` will all yield references to the same object.
|
||||
|
||||
:class:`Unpickler` objects are defined as:
|
||||
Please note, this is intended for pickling multiple objects without intervening
|
||||
modifications to the objects or their parts. If you modify an object and then
|
||||
pickle it again using the same :class:`Pickler` instance, the object is not
|
||||
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
|
||||
return the old value, not the modified one.
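
The caveat can be reproduced with a short sketch, assuming an in-memory stream and a throwaway list.

```python
import io
import pickle

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer, 3)

items = [1, 2]
pickler.dump(items)
items.append(3)      # modify the object between the two dumps
pickler.dump(items)  # only a memo reference is written this time

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()  # resolves to the object created by the first load
```

Here ``second`` is the very same list as ``first`` and does not contain the appended ``3``.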
|
||||
|
||||
|
||||
.. class:: Unpickler(file)
|
||||
.. class:: Unpickler(file, [\*, encoding="ASCII", errors="strict"])
|
||||
|
||||
This takes a file-like object from which it will read a pickle data stream.
|
||||
This class automatically determines whether the data stream was written in
|
||||
binary mode or not, so it does not need a flag as in the :class:`Pickler`
|
||||
factory.
|
||||
This takes a binary file for reading a pickle data stream.
|
||||
|
||||
*file* must have two methods, a :meth:`read` method that takes an integer
|
||||
argument, and a :meth:`readline` method that requires no arguments. Both
|
||||
methods should return a string. Thus *file* can be a file object opened for
|
||||
reading, a :mod:`StringIO` object, or any other custom object that meets this
|
||||
The protocol version of the pickle is detected automatically, so no
|
||||
protocol argument is needed.
|
||||
|
||||
The argument *file* must have two methods, a read() method that takes an
|
||||
integer argument, and a readline() method that requires no arguments. Both
|
||||
methods should return bytes. Thus *file* can be a binary file object opened
|
||||
for reading, a BytesIO object, or any other custom object that meets this
|
||||
interface.
|
||||
|
||||
:class:`Unpickler` objects have one (or two) public methods:
|
||||
|
||||
Optional keyword arguments are encoding and errors, which are used to decode
|
||||
8-bit string instances pickled by Python 2.x. These default to 'ASCII' and
|
||||
'strict', respectively.
|
||||
|
||||
.. method:: load()
|
||||
|
||||
Read a pickled object representation from the open file object given in
|
||||
the constructor, and return the reconstituted object hierarchy specified
|
||||
therein.
|
||||
therein. Bytes past the pickled object's representation are ignored.
|
||||
|
||||
This method automatically determines whether the data stream was written
|
||||
in binary mode or not.
|
||||
.. method:: persistent_load(pid)
|
||||
|
||||
Raise an :exc:`UnpicklingError` by default.
|
||||
|
||||
.. method:: noload()
|
||||
If defined, :meth:`persistent_load` should return the object specified by
|
||||
the persistent ID *pid*. On errors, such as if an invalid persistent ID is
|
||||
encountered, an :exc:`UnpicklingError` should be raised.
|
||||
|
||||
This is just like :meth:`load` except that it doesn't actually create any
|
||||
objects. This is useful primarily for finding what's called "persistent
|
||||
ids" that may be referenced in a pickle data stream. See section
|
||||
:ref:`pickle-protocol` below for more details.
|
||||
See :ref:`pickle-persistent` for details and examples of uses.
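
A minimal sketch of how :meth:`Pickler.persistent_id` and :meth:`persistent_load` cooperate. The ``Record`` class, the ``DATABASE`` dictionary, and the two subclasses are hypothetical names invented for this example.

```python
import io
import pickle

class Record:
    """Stand-in for an object that lives outside the pickle."""
    def __init__(self, key):
        self.key = key

DATABASE = {"r1": Record("r1")}  # hypothetical external store

class DBPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Record):
            return ("Record", obj.key)  # emitted instead of the object
        return None                     # pickle everything else as usual

class DBUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, key = pid
        if tag == "Record" and key in DATABASE:
            return DATABASE[key]
        raise pickle.UnpicklingError("unsupported persistent ID: %r" % (pid,))

buffer = io.BytesIO()
DBPickler(buffer, 3).dump(["payload", DATABASE["r1"]])
buffer.seek(0)
restored = DBUnpickler(buffer).load()
```

The ``Record`` instance itself never enters the stream; only its persistent ID does, and the unpickler maps that ID back to the live object.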
|
||||
|
||||
.. method:: find_class(module, name)
|
||||
|
||||
Import *module* if necessary and return the object called *name* from it.
|
||||
Subclasses may override this to gain control over what type of objects can
|
||||
be loaded, potentially reducing security risks.
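
One possible way to override :meth:`find_class` is sketched below. The whitelist ``SAFE_BUILTINS``, the class ``RestrictedUnpickler``, and the helper ``restricted_loads`` are illustrative names, and the whitelist is an arbitrary choice rather than a recommended policy.

```python
import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Allow only a few harmless builtins; refuse every other global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    """Like pickle.loads(), but with the restricted find_class()."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

allowed = restricted_loads(pickle.dumps([1, 2, range(15)], 3))

try:
    # Hand-written pickle that would look up os.system when loaded normally.
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```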
|
||||
|
||||
|
||||
What can be pickled and unpickled?
|
||||
|
@ -506,6 +560,8 @@ The registered constructor is deemed a "safe constructor" for purposes of
|
|||
unpickling as described above.
|
||||
|
||||
|
||||
.. _pickle-persistent:
|
||||
|
||||
Pickling and unpickling external objects
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
|
@ -747,14 +803,6 @@ the same process or a new process. ::
|
|||
|
||||
.. [#] Don't confuse this with the :mod:`marshal` module
|
||||
|
||||
.. [#] *Warning*: this is intended for pickling multiple objects without intervening
|
||||
modifications to the objects or their parts. If you modify an object and then
|
||||
pickle it again using the same :class:`Pickler` instance, the object is not
|
||||
pickled again --- a reference to it is pickled and the :class:`Unpickler` will
|
||||
return the old value, not the modified one. There are two problems here: (1)
|
||||
detecting changes, and (2) marshalling a minimal set of changes. Garbage
|
||||
Collection may also become a problem here.
|
||||
|
||||
.. [#] The exception raised will likely be an :exc:`ImportError` or an
|
||||
:exc:`AttributeError` but it could be something else.
|
||||
|
||||
|
|