\documentclass{howto}
\title{What's New in Python 1.6}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{amk1@bigfoot.com}}
\begin{document}
\maketitle\tableofcontents
\section{Introduction}
A new release of Python, version 1.6, is due some time this
summer. Alpha versions are already available from
\url{http://www.python.org/1.6/}. This article talks about the
exciting new features in 1.6, highlights some other useful changes, and
points out a few incompatible changes that may require rewriting code.

Python's development never ceases, and a steady flow of bug fixes and
improvements is always being submitted. A host of minor bug-fixes, a
few optimizations, additional docstrings, and better error messages
went into 1.6; to list them all would be impossible, but they're
certainly significant. Consult the publicly-available CVS logs if you
want to see the full list.
% ======================================================================
\section{Unicode}
XXX
Unicode support: Unicode strings are written as \code{u"string"}, and
there is support for arbitrary encoders/decoders.

A \code{-U} command-line option was added. With the option enabled, the
Python compiler interprets all \code{"..."} strings as \code{u"..."}
(and likewise \code{r"..."} as \code{ur"..."}). (Is this just for
experimenting?)
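
As a rough sketch of the \code{u"..."} notation mentioned in this
section: the \method{encode()} method and the \code{'utf-8'} codec name
used here are assumptions, not something the notes above spell out, and
the sample text is purely illustrative.

\begin{verbatim}
# A Unicode string holding four characters; \u00e9 is e-acute.
text = u"caf\u00e9"
assert len(text) == 4

# Encoding to UTF-8 needs two bytes for the accented character,
# so the encoded form is one unit longer than the Unicode string.
encoded = text.encode('utf-8')
assert len(encoded) == 5
\end{verbatim}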
% ======================================================================
\section{Distribution Utilities}
XXX
% ======================================================================
\section{String Methods}
% ======================================================================
\section{Porting to 1.6}
New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough (often fixing design decisions that were
initially bad) that breaking backward compatibility in subtle ways
can't always be avoided. This section lists the changes in Python 1.6
that may cause old Python code to break.
The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{append()}, \method{insert()},
\method{remove()}, and \method{count()}.
%
% XXX did anyone ever call the last 2 methods with multiple args?
%
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 1.6 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
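
The following sketch shows the stricter behaviour and the fix. The
list name \code{L} follows the text above; the code only checks that a
\exception{TypeError} is raised and doesn't depend on the exact wording
of the message.

\begin{verbatim}
L = []

# Two separate arguments are now rejected.
try:
    L.append(1, 2)
    rejected = 0
except TypeError:
    rejected = 1
assert rejected == 1

# The fix: pass one argument that happens to be a tuple.
L.append((1, 2))
assert L == [(1, 2)]
\end{verbatim}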
The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
1.6 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
1.6 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.
Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an address (a host name and a port number), but
\function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 1.6alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break. So for
the \module{socket} module, the documentation was fixed and the
multiple argument form is simply marked as deprecated; it'll be
removed in a future Python version.
Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 1.6, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it to;
\code{3L * 'abc'} produces \code{'abcabcabc'}, and
\code{(0,1,2,3)[2L:4L]} produces \code{(2,3)}. Long integers can also be
used in various new places where previously only integers were
accepted, such as in the \method{seek()} method of file objects.
The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 1.6, but code which assumes the 'L' is
there and does \code{str(longval)[:-1]} will now lose the final
digit.
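
One way to write code that doesn't care whether the 'L' is present is
to strip it only when it's actually there. This is just a sketch; the
helper name \function{digits_of()} is made up for illustration.

\begin{verbatim}
def digits_of(value):
    # Hypothetical helper: return the decimal digits of an integer
    # or long integer, dropping a trailing 'L' only if one exists.
    s = repr(value)
    if s[-1:] == 'L':
        s = s[:-1]
    return s

assert digits_of(123) == '123'
assert digits_of(-45) == '-45'
\end{verbatim}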
Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
the ``\%.17g'' format string for C's \function{sprintf()}, while
\function{str()} uses ``\%.12g'' as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for numbers that can't be represented exactly in
binary floating point.
XXX need example value here to demonstrate problem.
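
The two format strings can be compared directly with the
\operator{\%} operator. The sketch below uses 0.1, chosen here as an
illustrative value that has no exact binary representation, and applies
the formats itself rather than calling \function{repr()} and
\function{str()}, so it shows what the two precisions do to the same
number.

\begin{verbatim}
x = 0.1

# 12 significant digits hide the representation error ...
short_form = '%.12g' % x
assert short_form == '0.1'

# ... while 17 significant digits expose it.
long_form = '%.17g' % x
assert long_form == '0.10000000000000001'
\end{verbatim}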
% ======================================================================
\section{Core Changes}
Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.
A change to syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \builtin{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. Thanks to a patch from
Greg Ewing, 1.6 adds \code{f(*\var{args}, **\var{kw})} as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:
\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
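
The calling side can be sketched like this; the function
\function{report()} and its arguments are invented for the example.

\begin{verbatim}
def report(name, count, sep=': '):
    return name + sep + str(count)

args = ('spam', 3)
kw = {'sep': ' = '}

# The 1.5 spelling would be: apply(report, args, kw)
result = report(*args, **kw)
assert result == 'spam = 3'

# Either part can be used on its own.
assert report(*args) == 'spam: 3'
\end{verbatim}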
A new format style is available when using the \operator{\%} operator.
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
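
A place where '\%r' is likely to be handy is in error or debugging
messages, where you want to see quotes and escaped characters. A small
sketch, with a made-up message format:

\begin{verbatim}
value = 'abc\n'

# %s inserts the string unchanged, newline and all.
assert '%s' % value == 'abc\n'

# %r inserts whatever repr() would return for the same object.
message = 'bad input: %r' % (value,)
assert message == 'bad input: ' + repr(value)
\end{verbatim}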
The \builtin{int()} and \builtin{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
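
The same conversions, written out as a sketch that can be run; the
extra bases 16 and 8 with other digit strings are additional
illustrative cases, not taken from the text above.

\begin{verbatim}
assert int('123', 10) == 123
assert int('123', 16) == 291
assert int('ff', 16) == 255
assert int('777', 8) == 511

# A base only makes sense when converting from a string.
try:
    int(123, 16)
    raised = 0
except TypeError:
    raised = 1
assert raised == 1
\end{verbatim}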
Previously there was no way to implement a class that overrode
Python's built-in \operator{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \operator{in}.
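
As an illustration, here is a minimal class that answers \operator{in}
without being a sequence at all. The \class{Interval} class is
hypothetical and exists only for this sketch.

\begin{verbatim}
class Interval:
    # Represents every number between low and high, inclusive.
    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __contains__(self, value):
        # Called for "value in interval"; no indexing is involved.
        return self.low <= value and value <= self.high

span = Interval(1, 10)
assert 5 in span
assert 2.5 in span
assert 11 not in span
\end{verbatim}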
Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:
\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}
the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.
\footnote{See the thread ``trashcan and PR\#7'' in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}
Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly, for
complicated reasons \code{sys.platform} is still \code{'win32'} on
Win64.) PythonWin also supports Windows CE; see the Python CE page at
\url{http://www.python.net/crew/mhammond/ce/} for more information.
XXX UnboundLocalError is raised when a local variable is undefined
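
A minimal sketch of a function that triggers the new exception; the
function and variable names are invented for the example.

\begin{verbatim}
def broken():
    # 'count' is assigned later in this function, which makes it a
    # local variable; reading it before the assignment fails.
    total = count + 1
    count = 0
    return total

try:
    broken()
    outcome = 'no error'
except UnboundLocalError:
    outcome = 'UnboundLocalError'
assert outcome == 'UnboundLocalError'
\end{verbatim}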
A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in 1.6a2 \code{sys.version_info} is
\code{(1, 6, 0, 'alpha', 2)}. \var{level} is a string such as
\code{'alpha'}, \code{'beta'}, or \code{'final'} for a final release.
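
Because it's a tuple, \code{sys.version_info} can be unpacked or
compared directly, which is simpler than parsing the
\code{sys.version} string. A sketch, in which the flag name
\code{new_enough} is made up:

\begin{verbatim}
import sys

major, minor, micro, level, serial = sys.version_info

# Tuples compare element by element, so a version check is one line.
if sys.version_info[:2] >= (1, 6):
    new_enough = 1
else:
    new_enough = 0
assert new_enough == 1
\end{verbatim}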
% ======================================================================
\section{Extending/Embedding Changes}
Some of the changes are under the covers, and will only be apparent to
people writing C extension modules, or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.
Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.
The \file{Python/importdl.c} file, which was a mass of \code{\#ifdef}s to
support dynamic loading on many different platforms, was cleaned up
and reorganized by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files.
Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/mymalloc.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.
Recent versions of the GUSI % XXX what is GUSI?
development environment for MacOS support POSIX threads. Therefore,
POSIX threads are now supported on the Macintosh too. Threading
support using the user-space GNU pth library was also contributed.
Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 1.6
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.
% ======================================================================
\section{Module Changes}

re - changed to be a frontend to sre

readline, ConfigParser, cgi, calendar, posix, xmllib, aifc, chunk,
wave, random, shelve, nntplib - minor enhancements

socket, httplib, urllib - optional OpenSSL support

\module{_tkinter} - support for 8.1, 8.2, 8.3 (support for versions
older than 8.0 has been dropped). Supports Unicode
(\file{Lib/lib-tk/Tkinter.py} has a test)

curses - changed to use ncurses
% ======================================================================
\section{New Modules}

winreg - Windows registry interface

Distutils - tools for distributing Python modules

PyExpat - interface to Expat XML parser

robotparser - parse a \file{robots.txt} file (for writing web spiders)

linuxaudio - audio for Linux

mmap - treat a file as a memory buffer

filecmp - supersedes the old \file{cmp.py} and \file{dircmp.py} modules

tabnanny - check Python sources for tab-width dependence

sre - regular expressions (fast, supports Unicode)

unicode - support for Unicode

codecs - support for Unicode encoders/decoders
% ======================================================================
\section{IDLE Improvements}
XXX IDLE -- complete overhaul; what are the changes?
% ======================================================================
\section{Deleted and Deprecated Modules}
stdwin
\end{document}