\documentclass{howto}

\title{What's New in Python 1.6}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{amk1@bigfoot.com}}
\begin{document}
\maketitle\tableofcontents
\section{Introduction}

Python 1.6 is scheduled for release some time this summer. Alpha
versions are already available from
\url{http://www.python.org/1.6/}. This article covers the exciting
new features in 1.6, highlights some other useful changes, and points
out a few incompatible changes that may require rewriting code.

Python's development never ceases, and a steady flow of bug fixes and
improvements is always being submitted. A host of minor bug fixes, a
few optimizations, additional docstrings, and better error messages
went into 1.6; listing them all here would be impractical, but taken
together they're certainly significant. Consult the publicly
available CVS logs if you want to see the full list.
% ======================================================================
\section{Unicode}

XXX

Unicode support: Unicode strings are written as \code{u"string"}, and
there is support for arbitrary encoders and decoders.
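The two points in this note can be sketched as follows. This is only
an illustration written for a current Python interpreter; the sample
text and the choice of the UTF-8 codec are assumptions, not taken from
the 1.6 sources.

\begin{verbatim}
import codecs

# A Unicode string literal, marked with the u prefix.
text = u"caf\u00e9"

# Encode with a named codec, then decode the result back again.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

# codecs.lookup() finds the encoder/decoder functions for a codec name.
# UTF-8 is just one example; other registered codecs work the same way.
codec = codecs.lookup("utf-8")
roundtrip, consumed = codec.decode(encoded)
\end{verbatim}

The decoder returns both the decoded string and the number of bytes it
consumed, which is why the last line unpacks two values.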

Added a \code{-U} command line option. With the option enabled, the
Python compiler interprets all \code{"..."} strings as \code{u"..."}
(and likewise \code{r"..."} as \code{ur"..."}). (Is this just for
experimenting?)

% ======================================================================
\section{Distribution Utilities}

XXX

% ======================================================================
\section{String Methods}
% ======================================================================
\section{Porting to 1.6}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough (often because they fix design decisions that
turned out to be mistakes) that breaking backward compatibility in
subtle ways can't always be avoided. This section lists the changes
in Python 1.6 that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()}, \method{.insert()},
\method{.remove()}, and \method{.count()}.
%
% XXX did anyone ever call the last 2 methods with multiple args?
%
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 1.6 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
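A short sketch of the fix, written for a current Python interpreter.
The list contents are made up for illustration, and the exact wording
of the error message differs between versions, so it isn't checked
here.

\begin{verbatim}
L = []

# Pass the pair as a single tuple argument.
L.append((1, 2))

# The multi-argument form is rejected with a TypeError.
try:
    L.append(3, 4)
except TypeError:
    rejected = True
else:
    rejected = False
\end{verbatim}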

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
1.6 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
1.6 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a single tuple that
holds the host name and port number, but
\function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 1.6alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people had written code which would break. So for
the \module{socket} module, the documentation was fixed and the
multiple argument form is simply marked as deprecated; it'll be
removed in a future Python version.

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 1.6, long integers can be used to multiply
or slice a sequence, and they behave as you'd intuitively expect;
\code{3L * 'abc'} produces 'abcabcabc', and
\code{(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be
used in various new places where previously only integers were
accepted, such as in the \method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 1.6, but code which assumes the 'L' is
there, and does \code{str(longval)[:-1]}, will now lose the final
digit.
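One way to write such code so that it works whether or not the
trailing 'L' is present is to strip the character only when it is
actually there. This is a sketch, not a recommended idiom from the
Python sources; the helper name \function{long_digits()} is
hypothetical.

\begin{verbatim}
def long_digits(value):
    """Return the decimal digits of an integer, without any 'L' suffix."""
    text = str(value)
    # Only drop the last character if it really is the 'L' marker.
    if text.endswith('L'):
        text = text[:-1]
    return text

offset = long_digits(2 ** 40)
\end{verbatim}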

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
the ``\%.17g'' format string for C's \function{sprintf()}, while
\function{str()} uses ``\%.12g'' as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.

XXX need example value here to demonstrate problem.
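The two format strings can be compared directly with the
\operator{\%} operator, which follows the same conventions as C's
\function{sprintf()}. The value 0.1 below is an illustrative choice,
not one taken from the Python sources; it has no exact binary
representation, so the extra digits expose the rounding error.

\begin{verbatim}
value = 0.1

# The precision now used by repr() for floats.
wide = '%.17g' % value      # '0.10000000000000001'

# The precision still used by str().
narrow = '%.12g' % value    # '0.1'
\end{verbatim}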
% ======================================================================
\section{Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

A change to syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \builtin{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. Thanks to a patch from
Greg Ewing, 1.6 adds \code{f(*\var{args}, **\var{kw})} as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
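The calling side can be sketched as follows. The function
\function{describe()} and its arguments are invented for illustration.

\begin{verbatim}
def describe(*args, **kw):
    # Return the arguments unchanged so the call can be inspected.
    return args, kw

args = (1, 2)
kw = {'sep': '-'}

# New syntax: unpack the tuple and the dictionary directly in the
# call, instead of writing apply(describe, args, kw).
result = describe(*args, **kw)
\end{verbatim}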

A new format style is available when using the \operator{\%} operator.
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.

The \builtin{int()} and \builtin{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
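These cases can be checked directly. The sketch below assumes a
current Python interpreter and deliberately doesn't test the wording
of the error message, which may vary between versions.

\begin{verbatim}
decimal_value = int('123', 10)   # 123
hex_value = int('123', 16)       # 291

# An explicit base is only accepted together with a string argument.
try:
    int(123, 16)
except TypeError:
    base_rejected = True
else:
    base_rejected = False
\end{verbatim}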

Previously there was no way to implement a class that overrode
Python's built-in \operator{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \operator{in}.
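A minimal sketch of a class that uses the new method. The class
\class{EvenNumbers} is hypothetical; it represents a set too large to
index, so answering the membership test directly is the only
practical approach.

\begin{verbatim}
class EvenNumbers:
    """Hypothetical container: every even integer is a member."""

    def __contains__(self, obj):
        # Called for "obj in instance"; no indexing is attempted.
        return obj % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens
has_seven = 7 in evens
\end{verbatim}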

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.
\footnote{See the thread ``trashcan and PR\#7'' in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly, for
complicated reasons \code{sys.platform} is still \code{'win32'} on
Win64.) PythonWin also supports Windows CE; see the Python CE page at
\url{http://www.python.net/crew/mhammond/ce/} for more information.

XXX UnboundLocalError is raised when a local variable is undefined
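A sketch of the situation this exception covers; the function
\function{counter()} is hypothetical. Because \code{total} is
assigned inside the function, it is treated as a local variable, and
reading it before the assignment fails.

\begin{verbatim}
def counter():
    # 'total' is local because it is assigned on this line, so the
    # read on the right-hand side happens before it has a value.
    total = total + 1
    return total

try:
    counter()
except UnboundLocalError:
    caught = True
else:
    caught = False
\end{verbatim}

\exception{UnboundLocalError} is a subclass of \exception{NameError},
so existing handlers that catch \exception{NameError} still catch it.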

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in 1.6a2 \code{sys.version_info} is
\code{(1, 6, 0, 'alpha', 2)}. \var{level} is a string such as
"alpha", "beta", or '' for a final release.
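Because the value is a tuple, version checks can be written as
ordinary tuple comparisons rather than by parsing the
\code{sys.version} string. A minimal sketch; the variable names are
illustrative.

\begin{verbatim}
import sys

# Unpack the five fields described above.
major, minor, micro, level, serial = sys.version_info

# Tuples compare element by element, so this tests for 1.6 or later.
at_least_1_6 = sys.version_info[:2] >= (1, 6)
\end{verbatim}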
% ======================================================================
\section{Extending/embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules, or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganized by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/mymalloc.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI % XXX what is GUSI?
development environment for MacOS support POSIX threads. Therefore,
POSIX threads are now supported on the Macintosh too. Threading
support using the user-space GNU pth library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 1.6
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.
% ======================================================================
\section{Module changes}

\module{re} -- changed to be a frontend to \module{sre}

\module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib}, \module{aifc},
\module{chunk}, \module{wave}, \module{random}, \module{shelve},
\module{nntplib} -- minor enhancements

\module{socket}, \module{httplib}, \module{urllib} -- optional OpenSSL
support

\module{_tkinter} -- support for 8.1, 8.2, 8.3 (support for versions
older than 8.0 has been dropped). Supports Unicode
(\file{Lib/lib-tk/Tkinter.py} has a test)

\module{curses} -- changed to use ncurses
% ======================================================================
\section{New modules}

\module{winreg} -- Windows registry interface.

\module{Distutils} -- tools for distributing Python modules

\module{PyExpat} -- interface to Expat XML parser

\module{robotparser} -- parse a robots.txt file (for writing web spiders)

\module{linuxaudio} -- audio for Linux

\module{mmap} -- treat a file as a memory buffer

\module{filecmp} -- supersedes the old cmp.py and dircmp.py modules

\module{tabnanny} -- check Python sources for tab-width dependence

\module{sre} -- regular expressions (fast, supports Unicode)

\module{unicode} -- support for Unicode

\module{codecs} -- support for Unicode encoders/decoders
% ======================================================================
\section{IDLE Improvements}

XXX IDLE -- complete overhaul; what are the changes?

% ======================================================================
\section{Deleted and Deprecated Modules}

stdwin

\end{document}