\chapter{Memory Management \label{memory}}
\sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr}


\section{Overview \label{memoryOverview}}

Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is ensured internally by the \emph{Python memory manager}. The
Python memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.

At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data
by interacting with the memory manager of the operating system. On top
of the raw memory allocator, several object-specific allocators
operate on the same heap and implement distinct memory management
policies adapted to the peculiarities of every object type. For
example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers imply different
storage requirements and speed/space tradeoffs. The Python memory
manager thus delegates some of the work to the object-specific
allocators, but ensures that the latter operate within the bounds of
the private heap.

It is important to understand that the management of the Python heap
is performed by the interpreter itself and that the user has no
control over it, even if she regularly manipulates object pointers to
memory blocks inside that heap. The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.

To avoid memory corruption, extension writers should never try to
operate on Python objects with the functions exported by the C
library: \cfunction{malloc()}\ttindex{malloc()},
\cfunction{calloc()}\ttindex{calloc()},
\cfunction{realloc()}\ttindex{realloc()} and
\cfunction{free()}\ttindex{free()}. Doing so results in
mixed calls between the C allocator and the Python memory manager
with fatal consequences, because they implement different algorithms
and operate on different heaps. However, one may safely allocate and
release memory blocks with the C library allocator for purposes
private to the extension, as shown in the following example:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    free(buf); /* malloc'ed */
    return res;
\end{verbatim}

In this example, the memory request for the I/O buffer is handled by
the C library allocator. The Python memory manager is involved only
in the allocation of the string object returned as a result.

In most situations, however, it is recommended to allocate memory from
the Python heap, precisely because the latter is under the control of
the Python memory manager. For example, this is required when the
interpreter is extended with new object types written in C. Another
reason for using the Python heap is the desire to \emph{inform} the
Python memory manager about the memory needs of the extension module.
Even when the requested memory is used exclusively for internal,
highly-specific purposes, delegating all memory requests to the Python
memory manager gives the interpreter a more accurate image of
its memory footprint as a whole. Consequently, under certain
circumstances, the Python memory manager may decide to trigger
appropriate actions, like garbage collection, memory compaction or
other preventive procedures. Note that by using the C library
allocator as shown in the previous example, the allocated memory for
the I/O buffer completely escapes the Python memory manager.


\section{Memory Interface \label{memoryInterface}}

The following function sets, modeled after the ANSI C standard,
but specifying behavior when requesting zero bytes,
are available for allocating and releasing memory from the Python heap:


\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n}
  Allocates \var{n} bytes and returns a pointer of type \ctype{void*}
  to the allocated memory, or \NULL{} if the request fails.
  Requesting zero bytes returns a distinct non-\NULL{} pointer if
  possible, as if \cfunction{PyMem_Malloc(1)} had been called instead.
  The memory will not have been initialized in any way.
\end{cfuncdesc}

\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n}
  Resizes the memory block pointed to by \var{p} to \var{n} bytes.
  The contents will be unchanged up to the minimum of the old and the
  new sizes. If \var{p} is \NULL, the call is equivalent to
  \cfunction{PyMem_Malloc(\var{n})}; else if \var{n} is equal to zero, the
  memory block is resized but is not freed, and the returned pointer
  is non-\NULL. Unless \var{p} is \NULL, it must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Free}{void *p}
  Frees the memory block pointed to by \var{p}, which must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}. Otherwise, or if
  \cfunction{PyMem_Free(p)} has been called before, undefined
  behavior occurs. If \var{p} is \NULL, no operation is performed.
\end{cfuncdesc}
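
The three functions above are often combined to grow a buffer on
demand. The following is only a minimal sketch of that pattern, not
part of the Python/C API: \cfunction{append_bytes()} and
\cfunction{build_example()} are hypothetical helpers, and the three
macro definitions at the top are stand-ins over the C library so that
the sketch builds without \file{Python.h}; in an extension module,
include \file{Python.h} and delete them. The sketch also assumes that,
as with the C library \cfunction{realloc()}, a failed
\cfunction{PyMem_Realloc()} leaves the original block valid, which is
why the result is first stored in a temporary. The caller releases the
buffer returned by \cfunction{build_example()} with
\cfunction{PyMem_Free()}.

\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Stand-ins (an assumption made for illustration only) so that the
   sketch builds without Python.h.  In an extension module, include
   Python.h and delete these three macros. */
#define PyMem_Malloc(n)      malloc((n) ? (n) : 1)
#define PyMem_Realloc(p, n)  realloc((p), (n) ? (n) : 1)
#define PyMem_Free(p)        free(p)

/* Hypothetical helper: append len bytes from src to a growable
   buffer.  Returns 0 on success, -1 if memory runs out; on failure
   *buf is left untouched and must still be released by the caller. */
static int
append_bytes(char **buf, size_t *used, size_t *cap,
             const char *src, size_t len)
{
    if (*used + len > *cap) {
        size_t new_cap = *cap ? *cap * 2 : 16;
        char *tmp;

        while (new_cap < *used + len)
            new_cap *= 2;
        /* Keep the old pointer until the resize is known to work. */
        tmp = (char *) PyMem_Realloc(*buf, new_cap);
        if (tmp == NULL)
            return -1;
        *buf = tmp;
        *cap = new_cap;
    }
    memcpy(*buf + *used, src, len);
    *used += len;
    return 0;
}

/* Hypothetical usage: builds the string "spam, eggs" in a buffer
   that the caller must release with PyMem_Free().  Returns NULL if
   memory runs out. */
static char *
build_example(void)
{
    char *buf = NULL;  /* NULL: first resize acts like PyMem_Malloc() */
    size_t used = 0, cap = 0;

    if (append_bytes(&buf, &used, &cap, "spam, ", 6) < 0 ||
        append_bytes(&buf, &used, &cap, "eggs", 5) < 0) { /* 5: with '\0' */
        PyMem_Free(buf);   /* no operation if buf is still NULL */
        return NULL;
    }
    return buf;
}
\end{verbatim}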

The following type-oriented macros are provided for convenience. Note
that \var{TYPE} refers to any C type.

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n}
  Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} *
  sizeof(\var{TYPE}))} bytes of memory. Returns a pointer cast to
  \ctype{\var{TYPE}*}. The memory will not have been initialized in
  any way.
\end{cfuncdesc}

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n}
  Same as \cfunction{PyMem_Realloc()}, but the memory block is resized
  to \code{(\var{n} * sizeof(\var{TYPE}))} bytes. Returns a pointer
  cast to \ctype{\var{TYPE}*}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Del}{void *p}
  Same as \cfunction{PyMem_Free()}.
\end{cfuncdesc}
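
As an illustration of the type-oriented macros, the following sketch
allocates an array of structures and clears it explicitly, since the
memory returned by \cfunction{PyMem_New()} is not initialized. The
\ctype{record} type and the \cfunction{new_records()} helper are
hypothetical, and the two macro definitions at the top are stand-ins
over the C library so that the sketch builds without \file{Python.h};
in an extension module, include \file{Python.h} and delete them. The
caller releases the array with \cfunction{PyMem_Del()}.

\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Stand-ins (an assumption made for illustration only) so that the
   sketch builds without Python.h.  In an extension module, include
   Python.h and delete these two macros. */
#define PyMem_New(type, n) \
    ((type *) malloc((n) ? (n) * sizeof(type) : 1))
#define PyMem_Del(p)  free(p)

typedef struct {        /* hypothetical record type */
    int id;
    double weight;
} record;

/* Hypothetical helper: allocates n records and zero-fills them,
   because PyMem_New() leaves the block uninitialized.  Returns NULL
   if the request fails; release the result with PyMem_Del(). */
static record *
new_records(size_t n)
{
    record *recs = PyMem_New(record, n);   /* already a record* */

    if (recs == NULL)
        return NULL;
    memset(recs, 0, n * sizeof(record));
    return recs;
}
\end{verbatim}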

In addition, the following macro sets are provided for calling the
Python memory allocator directly, without involving the C API functions
listed above. However, note that their use does not preserve binary
compatibility across Python versions and is therefore deprecated in
extension modules.

\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}.

\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}.


\section{Examples \label{memoryExamples}}

Here is the example from section \ref{memoryOverview}, rewritten so
that the I/O buffer is allocated from the Python heap by using the
first function set:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Free(buf); /* allocated with PyMem_Malloc */
    return res;
\end{verbatim}

The same code using the type-oriented function set:

\begin{verbatim}
    PyObject *res;
    char *buf = PyMem_New(char, BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Del(buf); /* allocated with PyMem_New */
    return res;
\end{verbatim}

Note that in the two examples above, the buffer is always
manipulated via functions belonging to the same set. Indeed, the
same memory API family must be used for a given memory block, so
that the risk of mixing different allocators is reduced to a
minimum. The following code sequence contains two errors, one of
which is labeled as \emph{fatal} because it mixes two different
allocators operating on different heaps.

\begin{verbatim}
    char *buf1 = PyMem_New(char, BUFSIZ);
    char *buf2 = (char *) malloc(BUFSIZ);
    char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
    ...
    PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */
    free(buf2);       /* Right -- allocated via malloc() */
    free(buf1);       /* Fatal -- should be PyMem_Del()  */
\end{verbatim}
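
For comparison, a corrected version of this sequence might look like
the sketch below, in which every buffer is released by the function
that matches its allocator. The \cfunction{paired_release()} wrapper
is hypothetical, and the four macro definitions at the top are
stand-ins over the C library so that the sketch builds without
\file{Python.h}; in an extension module, include \file{Python.h} and
delete them. With the stand-ins all calls end up in the C library, so
the sketch only illustrates the pairing, not the separate heaps.

\begin{verbatim}
#include <stdio.h>    /* BUFSIZ */
#include <stdlib.h>

/* Stand-ins (an assumption made for illustration only) so that the
   sketch builds without Python.h.  In an extension module, include
   Python.h and delete these four macros. */
#define PyMem_Malloc(n)  malloc((n) ? (n) : 1)
#define PyMem_Free(p)    free(p)
#define PyMem_New(type, n) \
    ((type *) malloc((n) ? (n) * sizeof(type) : 1))
#define PyMem_Del(p)     free(p)

/* Hypothetical wrapper: returns 0 if all three allocations
   succeeded, -1 otherwise.  Each release function accepts NULL, so
   the cleanup is the same in both cases. */
static int
paired_release(void)
{
    char *buf1 = PyMem_New(char, BUFSIZ);
    char *buf2 = (char *) malloc(BUFSIZ);
    char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
    int status = (buf1 && buf2 && buf3) ? 0 : -1;

    /* ...use the buffers... */

    PyMem_Free(buf3);  /* allocated with PyMem_Malloc() */
    free(buf2);        /* allocated with malloc() */
    PyMem_Del(buf1);   /* allocated with PyMem_New() */
    return status;
}
\end{verbatim}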

In addition to the functions aimed at handling raw memory blocks from
the Python heap, objects in Python are allocated and released with
\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
\cfunction{PyObject_Del()}, or with their corresponding macros
\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
\cfunction{PyObject_DEL()}.

These will be explained in the next chapter on defining and
implementing new object types in C.