\documentclass{howto}

\title{What's New in Python 1.6}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{amk1@bigfoot.com}}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

A new version of Python, 1.6, will be released some time this
summer.  Alpha versions are already available from
\url{http://www.python.org/1.6/}.  This article describes the major
new features in 1.6, highlights some useful smaller changes, and
points out a few incompatible changes that may require rewriting code.

Python's development never ceases, and a steady flow of bug fixes and
improvements is always being submitted.  A host of minor bug fixes, a
few optimizations, additional docstrings, and better error messages
went into 1.6; to list them all would be impossible, but they're
certainly significant.  Consult the publicly available CVS logs if you
want to see the full list.

% ======================================================================
\section{Unicode}

XXX unicode support: Unicode strings are marked with u"string", and there
is support for arbitrary encoders/decoders

Added -U command line option. With the option enabled the Python
compiler interprets all "..." strings as u"..." (same with r"..." and
ur"..."). (Is this just for experimenting?)

% ======================================================================
\section{Distribution Utilities}

XXX

% ======================================================================
\section{String Methods}

% ======================================================================
\section{Porting to 1.6}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good.  However, some changes are
considered useful enough (often fixing design decisions that were
initially bad) that breaking backward compatibility in subtle ways
can't always be avoided.  This section lists the changes in Python 1.6
that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods.  Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()}, \method{.insert()},
\method{.remove()}, and \method{.count()}.
%
% XXX did anyone ever call the last 2 methods with multiple args?
%
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list.  In Python 1.6 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'.  The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
1.6 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors.  If you absolutely must use
1.6 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way.  For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a single tuple that
represents an Internet address (host and port), but
\function{socket.connect( 'hostname', 25 )} also works.
\function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going.  1.6alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break.  So for the
\module{socket} module, the documentation was fixed and the multiple
argument form is simply marked as deprecated; it'll be removed in a
future Python version.
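
The stricter calling convention described above can be sketched as
follows.  This is only a minimal illustration, not code from the
Python distribution: the list name \code{L} follows the text, the flag
\code{rejected} is a hypothetical name introduced here, and the socket
call appears only in a comment with the placeholder host and port used
above.  It assumes an interpreter that enforces the strict argument
checking.

```python
# Minimal sketch of the stricter list-method argument checking.
L = []

# Correct form: pass the pair as ONE tuple argument.
L.append((1, 2))

# Old multi-argument form: rejected with TypeError once the
# arguments are checked strictly.
try:
    L.append(1, 2)
    rejected = 0      # hypothetical flag, for illustration only
except TypeError:
    rejected = 1

# For sockets the address is likewise a single tuple, e.g.
#     s.connect(('hostname', 25))
# rather than the deprecated s.connect('hostname', 25).
```

After this runs, \code{L} holds the single tuple \code{(1, 2)}, and
the two-argument call has left the list unchanged.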

Some work has been done to make integers and long integers a bit more
interchangeable.  In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2~GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer.  Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}.  In 1.6, long integers can be used to multiply
or slice a sequence, and they behave as you'd intuitively expect;
\code{3L * 'abc'} produces 'abcabcabc', and \code{ (0,1,2,3)[2L:4L]}
produces (2,3).  Long integers can also be used in various new places
where previously only integers were accepted, such as in the
\method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it.  The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character.  This
is no longer a problem in 1.6, but code which assumes the 'L' is
there, and does \code{str(longval)[:-1]}, will now lose the final
digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}.  \function{repr()} uses
the ``\%.17g'' format string for C's \function{sprintf()}, while
\function{str()} uses ``\%.12g'' as before.  The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for numbers
XXX need example value here to demonstrate problem.

% ======================================================================
\section{Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions.  None of the changes are very far-reaching, but they're
handy conveniences.

A change to syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \builtin{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}.  Thanks to a patch from
Greg Ewing, 1.6 adds \code{f(*\var{args}, **\var{kw})} as a shorter
and clearer way to achieve the same effect.  This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}

A new format style is available when using the \operator{\%} operator.
'\%r' will insert the \function{repr()} of its argument.  This was
also added for the sake of symmetry, this time with the existing
'\%s' format style, which inserts the \function{str()} of its
argument.  For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.

The \builtin{int()} and \builtin{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291.  \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.

Previously there was no way to implement a class that overrode
Python's built-in \operator{in} operator and implemented a custom
version.  \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered.  Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \operator{in}.
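
The conveniences described so far in this section can be tried out
with a short sketch.  It is an illustration only: the function
\function{f()} with parameters \code{a}, \code{b}, and \code{c}, the
class \class{EvenNumbers}, and the variable names are all hypothetical,
chosen for this example rather than taken from the Python
distribution.

```python
# Hypothetical function used to show the new call syntax.
def f(a, b, c=0):
    return (a, b, c)

args = (1, 2)
kw = {'c': 3}
# Same effect as apply(f, args, kw) in Python 1.5 and earlier.
result = f(*args, **kw)

# '%r' inserts the repr() of its argument, '%s' the str().
formatted = '%r %s' % ('abc', 'abc')

# int() accepts a base when the first argument is a string.
decimal_value = int('123', 10)
hex_value = int('123', 16)

# A non-string first argument with an explicit base is an error.
try:
    int(123, 16)
    base_error = 0
except TypeError:
    base_error = 1


class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, obj):
        # Called to evaluate "obj in instance"; no indexing is tried.
        return obj % 2 == 0


evens = EvenNumbers()
has_four = 4 in evens
has_seven = 7 in evens
```

Because \class{EvenNumbers} defines \method{__contains__}, the
\operator{in} tests never index the object, which is what lets a
conceptually infinite container answer membership queries.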

Earlier versions of Python used a recursive algorithm for deleting
objects.  Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem.  On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead.  For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.
\footnote{See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState.  (Confusingly, for
complicated reasons \code{sys.platform} is still \code{'win32'} on
Win64.)  PythonWin also supports Windows CE; see the Python CE page at
\url{http://www.python.net/crew/mhammond/ce/} for more information.

XXX UnboundLocalError is raised when a local variable is undefined

A new variable holding more detailed version information has been
added to the \module{sys} module.  \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}.  For example, in 1.6a2 \code{sys.version_info} is
\code{(1, 6, 0, 'alpha', 2)}.  \var{level} is a string such as
\code{'alpha'}, \code{'beta'}, or \code{''} for a final release.

% ======================================================================
\section{Extending/embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules, or embedding a Python interpreter
in a larger application.  If you aren't dealing with Python's C API,
you can safely skip this section.

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganized by Greg Stein.  \file{importdl.c} is now quite small,
and platform-specific code has been moved into a set of
\file{Python/dynload_*.c} files.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}.  For documentation, read
the comments in \file{Include/mymalloc.h} and
\file{Include/objimpl.h}.  For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI % XXX what is GUSI?
development environment for MacOS support POSIX threads.  Therefore,
POSIX threads are now supported on the Macintosh too.  Threading
support using the user-space GNU pth library was also contributed.

Threading support on Windows was enhanced, too.  Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster.  A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 1.6
changes, the difference is only 10\%.  These improvements were
contributed by Yakov Markovitch.

% ======================================================================
\section{Module changes}

re - changed to be a frontend to sre

readline, ConfigParser, cgi, calendar, posix, xmllib, aifc, chunk,
wave, random, shelve, nntplib - minor enhancements

socket, httplib, urllib - optional OpenSSL support

\module{_tkinter} - support for Tcl/Tk 8.1, 8.2, and 8.3 (support for
versions older than 8.0 has been dropped).  Supports Unicode
(Lib/lib-tk/Tkinter.py has a test)

curses - changed to use ncurses

% ======================================================================
\section{New modules}

winreg - Windows registry interface.

Distutils - tools for distributing Python modules

PyExpat - interface to Expat XML parser

robotparser - parse a robots.txt file (for writing web spiders)

linuxaudio - audio for Linux

mmap - treat a file as a memory buffer

filecmp - supersedes the old cmp.py and dircmp.py modules

tabnanny - check Python sources for tab-width dependence

sre - regular expressions (fast, supports unicode)

unicode - support for unicode

codecs - support for Unicode encoders/decoders

% ======================================================================
\section{IDLE Improvements}

XXX IDLE -- complete overhaul; what are the changes?

% ======================================================================
\section{Deleted and Deprecated Modules}

stdwin

\end{document}