% Format this file with latex.

\documentstyle[11pt,myformat]{report}

\title{\bf
	Python Reference Manual \\
	{\em Incomplete Draft}
}

\author{
	Guido van Rossum \\
	Dept. CST, CWI, Kruislaan 413 \\
	1098 SJ Amsterdam, The Netherlands \\
	E-mail: {\tt guido@cwi.nl}
}

\begin{document}

\pagenumbering{roman}

\maketitle

\begin{abstract}

\noindent
Python is a simple, yet powerful, interpreted programming language
that bridges the gap between C and shell programming, and is thus
ideally suited for ``throw-away programming'' and rapid prototyping.
Its syntax is put together from constructs borrowed from a variety of
other languages; most prominent are influences from ABC, C, Modula-3
and Icon.

The Python interpreter is easily extended with new functions and data
types implemented in C.  Python is also suitable as an extension
language for highly customizable C applications such as editors or
window managers.

Python is available for various operating systems, amongst which
several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,
and MS-DOS.

This reference manual describes the syntax and ``core semantics'' of
the language.  It is terse, but attempts to be exact and complete.
The semantics of non-essential built-in object types and of the
built-in functions and modules are described in the {\em Python
Library Reference}.  For an informal introduction to the language, see
the {\em Python Tutorial}.

\end{abstract}

\pagebreak

{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}

\chapter{Introduction}

This reference manual describes the Python programming language.
It is not intended as a tutorial.

While I am trying to be as precise as possible, I chose to use English
rather than formal specifications for everything except syntax and
lexical analysis.  This should make the document easier to understand
for the average reader, but will leave room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things and in
fact you would be implementing quite a different language.
On the other hand, if you are using
Python and wonder what the precise rules about a particular area of
the language are, you should definitely be able to find them here.

It is dangerous to add too many implementation details to a language
reference document -- the implementation may change, and other
implementations of the same language may work differently.  On the
other hand, there is currently only one Python implementation, and
its particular quirks are sometimes worth mentioning, especially
where the implementation imposes additional limitations.

Every Python implementation comes with a number of built-in and
standard modules.  These are not documented here, but in the separate
{\em Python Library Reference} document.  A few built-in modules are
mentioned when they interact in a significant way with the language
definition.

\section{Warning}

This version of the manual is incomplete.  Sections that still need to
be written or need considerable work are marked with ``XXX''.

\section{Notation}

The descriptions of lexical analysis and syntax use a modified BNF
grammar notation.  This uses the following style of definition:

\begin{verbatim}
name:           lc_letter (lc_letter | "_")*
lc_letter:      "a"..."z"
\end{verbatim}

The first line says that a \verb\name\ is an \verb\lc_letter\ followed by
a sequence of zero or more \verb\lc_letter\s and underscores.  An
\verb\lc_letter\ in turn is any of the single characters `a' through `z'.
(This rule is actually adhered to for the names defined in syntax and
grammar rules in this document.)

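
As an illustration only, the two rules above can be transcribed by
hand into a regular expression.  The following sketch assumes a
present-day Python 3 interpreter and its standard \verb\re\ module,
neither of which is described by this manual; the names
\verb\NAME_RULE\ and \verb\is_name\ are invented for the example.

\begin{verbatim}
import re

# name:       lc_letter (lc_letter | "_")*
# lc_letter:  "a"..."z"
NAME_RULE = re.compile(r"[a-z][a-z_]*")

def is_name(text):
    # fullmatch() requires the whole string to satisfy the rule
    return NAME_RULE.fullmatch(text) is not None

print(is_name("lc_letter"))
print(is_name("_name"))
print(is_name("Name"))
\end{verbatim}

Under these assumptions the first call reports a match, while the
other two do not, because a name must begin with a lower case letter.
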

Each rule begins with a name (which is the name defined by the rule)
and a colon.  A vertical bar (\verb\|\) is used to separate
alternatives; it is the least binding
operator in this notation.  A star (\verb\*\) means zero or more
repetitions of the preceding item; likewise, a plus (\verb\+\) means
one or more repetitions, and a question mark (\verb\?\) zero or one
(in other words, the preceding item is optional).  These three
operators bind as tightly as possible; parentheses are used for
grouping.  Literal strings are enclosed in double quotes.  White space
is only meaningful to separate tokens.  Rules are normally contained
on a single line; rules with many alternatives may be formatted
alternatively with each line after the first beginning with a
vertical bar.

In lexical definitions (such as the example above), two more conventions
are used: Two literal characters separated by three dots mean a choice
of any single character in the given (inclusive) range of ASCII
characters.  A phrase between angle brackets (\verb\<...>\) gives an
informal description of the symbol defined; e.g., this could be used
to describe the notion of `control character' if needed.

Even though the notation used is almost the same, there is a big
difference between the meaning of lexical and syntactic definitions:
a lexical definition operates on the individual characters of the
input source, while a syntax definition operates on the stream of
tokens generated by the lexical analysis.

\chapter{Lexical analysis}

A Python program is read by a {\em parser}.  Input to the parser is a
stream of {\em tokens}, generated by the {\em lexical analyzer}.  This
chapter describes how the lexical analyzer breaks a file into tokens.

\section{Line structure}

A Python program is divided into a number of logical lines.  The end of
a logical line is represented by the token NEWLINE.  Statements cannot
cross logical line boundaries except where NEWLINE is allowed by the
syntax (e.g., between statements in compound statements).

\subsection{Comments}

A comment starts with a hash character (\verb\#\) that is not part of
a string literal, and ends at the end of the physical line.  A comment
always signifies the end of the logical line.  Comments are ignored by
the syntax.

\subsection{Line joining}

Two or more physical lines may be joined into logical lines using
backslash characters (\verb/\/), as follows: when a physical line ends
in a backslash that is not part of a string literal or comment, it is
joined with the following line, forming a single logical line, deleting
the backslash and the following end-of-line character.  For example:
%
\begin{verbatim}
month_names = ['Januari', 'Februari', 'Maart',     \
               'April',   'Mei',      'Juni',      \
               'Juli',    'Augustus', 'September', \
               'Oktober', 'November', 'December']
\end{verbatim}

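
The joining rule can be sketched as a small function.  This is a
simplified illustration written for a present-day Python 3
interpreter, not the lexical analyzer's actual code: the name
\verb\join_lines\ is invented here, the physical lines are assumed to
be given without their end-of-line characters, and the exception for
a backslash that ends a comment is deliberately left out.

\begin{verbatim}
def join_lines(physical_lines):
    """Join physical lines ending in a backslash into logical lines."""
    logical = []
    pending = ""
    for line in physical_lines:
        if line.endswith("\\"):
            pending += line[:-1]      # drop the backslash itself
        else:
            logical.append(pending + line)
            pending = ""
    if pending:                       # file ended on a joined line
        logical.append(pending)
    return logical

print(join_lines(["x = 1 + \\", "2", "y = 3"]))
\end{verbatim}
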
\subsection{Blank lines}

A logical line that contains only spaces, tabs, and possibly a
comment, is ignored (i.e., no NEWLINE token is generated), except that
during interactive input of statements, an entirely blank logical line
terminates a multi-line statement.

\subsection{Indentation}

Leading whitespace (spaces and tabs) at the beginning of a logical
line is used to compute the indentation level of the line, which in
turn is used to determine the grouping of statements.

First, tabs are replaced (from left to right) by one to eight spaces
such that the total number of characters up to there is a multiple of
eight (this is intended to be the same rule as used by UNIX).  The
total number of spaces preceding the first non-blank character then
determines the line's indentation.  Indentation cannot be split over
multiple physical lines using backslashes.

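
The tab rule can be made concrete with a short sketch.  It is written
for a present-day Python 3 interpreter and is only an illustration of
the rule as stated above; the function name \verb\indentation\ and
its \verb\tabsize\ parameter are invented for this example.

\begin{verbatim}
def indentation(line, tabsize=8):
    """Return the indentation of a logical line, expanding tabs."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # advance to the next multiple of tabsize (1 to 8 spaces)
            column = (column // tabsize + 1) * tabsize
        else:
            break                     # first non-blank character
    return column

print(indentation("\tx"))       # a tab alone counts as 8
print(indentation("   \tx"))    # 3 spaces and a tab also give 8
print(indentation("\t  x"))     # a tab and 2 spaces give 10
\end{verbatim}
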

The indentation levels of consecutive lines are used to generate
INDENT and DEDENT tokens, using a stack, as follows.

Before the first line of the file is read, a single zero is pushed on
the stack; this will never be popped off again.  The numbers pushed on
the stack will always be strictly increasing from bottom to top.  At
the beginning of each logical line, the line's indentation level is
compared to the top of the stack.  If it is equal, nothing happens.
If it is larger, it is pushed on the stack, and one INDENT token is
generated.  If it is smaller, it {\em must} be one of the numbers
occurring on the stack; all numbers on the stack that are larger are
popped off, and for each number popped off a DEDENT token is
generated.  At the end of the file, a DEDENT token is generated for
each number remaining on the stack that is larger than zero.

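
The stack discipline just described can be sketched as follows.  The
code is an illustration for a present-day Python 3 interpreter, not
the interpreter's own tokenizer; the function name
\verb\indent_tokens\ is invented, and the string \verb\LINE\ merely
stands in for the tokens of each logical line.

\begin{verbatim}
def indent_tokens(levels):
    """Map the indentation level of each logical line to tokens."""
    stack = [0]               # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            if level not in stack:
                raise ValueError("inconsistent indentation")
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        tokens.append("LINE")
    # at the end of the file, close every level that is still open
    tokens.extend("DEDENT" for n in stack if n > 0)
    return tokens

print(indent_tokens([0, 4, 4, 0]))
\end{verbatim}

With the levels 0, 4, 4, 0 the sketch yields one INDENT before the
second line and one DEDENT before the fourth.  A sequence such as
0, 8, 4 raises an error, because 4 does not occur on the stack.
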

Here is an example of a correctly (though confusingly) indented piece
of Python code:

\begin{verbatim}
def perm(l):
        # Compute the list of all permutations of l

    if len(l) <= 1:
                  return [l]
    r = []
    for i in range(len(l)):
             s = l[:i] + l[i+1:]
             p = perm(s)
             for x in p:
              r.append(l[i:i+1] + x)
    return r
\end{verbatim}


The following example shows various indentation errors:

\begin{verbatim}
 def perm(l):                       # error: first line indented
for i in range(len(l)):             # error: not indented
    s = l[:i] + l[i+1:]
        p = perm(l[:i] + l[i+1:])   # error: unexpected indent
        for x in p:
                r.append(l[i:i+1] + x)
            return r                # error: inconsistent indent
\end{verbatim}

(Actually, the first three errors are detected by the parser; only the
last error is found by the lexical analyzer -- the indentation of
\verb\return r\ does not match a level popped off the stack.)

\section{Other tokens}

Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
exist: identifiers, keywords, literals, operators, and delimiters.
Spaces and tabs are not tokens, but serve to delimit tokens.  Where
ambiguity exists, a token comprises the longest possible string that
forms a legal token, when read from left to right.

\section{Identifiers}

Identifiers are described by the following regular expressions:

\begin{verbatim}
identifier:     (letter|"_") (letter|digit|"_")*
letter:         lowercase | uppercase
lowercase:      "a"..."z"
uppercase:      "A"..."Z"
digit:          "0"..."9"
\end{verbatim}

Identifiers are unlimited in length.  Case is significant.

\subsection{Keywords}

The following identifiers are used as reserved words, or {\em
keywords} of the language, and cannot be used as ordinary
identifiers.  They must be spelled exactly as written here:

\begin{verbatim}
and        del        for        in         print
break      elif       from       is         raise
class      else       global     not        return
continue   except     if         or         try
def        finally    import     pass       while
\end{verbatim}

% # This Python program sorts and formats the above table
% import string
% l = []
% try:
%     while 1:
%         l = l + string.split(raw_input())
% except EOFError:
%     pass
% l.sort()
% for i in range((len(l)+4)/5):
%     for j in range(i, len(l), 5):
%         print string.ljust(l[j], 10),
%     print

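
The identifier rule and the keyword table can be combined into a
small classifier.  The sketch below is an illustration for a
present-day Python 3 interpreter; the names \verb\IDENTIFIER\,
\verb\KEYWORDS\ and \verb\classify\ are invented, and the keyword set
is copied from the table above rather than taken from the interpreter
that runs the example.

\begin{verbatim}
import re

# identifier: (letter|"_") (letter|digit|"_")*
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# The reserved words listed in the table above.
KEYWORDS = frozenset("""
    and del for in print break elif from is raise
    class else global not return continue except if or try
    def finally import pass while
""".split())

def classify(token):
    if IDENTIFIER.fullmatch(token) is None:
        return "not an identifier"
    return "keyword" if token in KEYWORDS else "identifier"

for token in ["print", "Print", "_x9", "9x"]:
    print(token, classify(token))
\end{verbatim}

Because case is significant, \verb\print\ is classified as a keyword
but \verb\Print\ as an ordinary identifier.
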
\section{Literals}

\subsection{String literals}

String literals are described by the following regular expressions:

\begin{verbatim}
stringliteral:  "'" stringitem* "'"
stringitem:     stringchar | escapeseq
stringchar:     <any ASCII character except newline or "\" or "'">
escapeseq:      "\" <any ASCII character except newline>
\end{verbatim}


String literals cannot span physical line boundaries.  Escape
sequences in strings are actually interpreted according to rules
similar to those used by Standard C.  The recognized escape sequences
are:

\begin{center}
\begin{tabular}{|l|l|}
\hline
\verb/\\/	& Backslash (\verb/\/) \\
\verb/\'/	& Single quote (\verb/'/) \\
\verb/\a/	& ASCII Bell (BEL) \\
\verb/\b/	& ASCII Backspace (BS) \\
%\verb/\E/	& ASCII Escape (ESC) \\
\verb/\f/	& ASCII Formfeed (FF) \\
\verb/\n/	& ASCII Linefeed (LF) \\
\verb/\r/	& ASCII Carriage Return (CR) \\
\verb/\t/	& ASCII Horizontal Tab (TAB) \\
\verb/\v/	& ASCII Vertical Tab (VT) \\
\verb/\/{\em ooo}	& ASCII character with octal value {\em ooo} \\
\verb/\x/{\em xx...}	& ASCII character with hex value {\em xx...} \\
\hline
\end{tabular}
\end{center}

1992-01-16 17:49:21 +00:00
|
|
|
In strict compatibility with in Standard C, up to three octal digits are
|
1991-11-25 17:26:57 +00:00
|
|
|
accepted, but an unlimited number of hex digits is taken to be part of
|
|
|
|
the hex escape (and then the lower 8 bits of the resulting hex number
|
1992-01-16 17:49:21 +00:00
|
|
|
are used in all current implementations...).
|
1991-11-25 17:26:57 +00:00
|
|
|
|
1992-01-16 17:49:21 +00:00
|
|
|
All unrecognized escape sequences are left in the string unchanged,
|
|
|
|
i.e., {\em the backslash is left in the string.} (This rule is
|
1991-11-25 17:26:57 +00:00
|
|
|
useful when debugging: if an escape sequence is mistyped, the
|
1992-01-07 16:43:53 +00:00
|
|
|
resulting output is more easily recognized as broken. It also helps a
|
|
|
|
great deal for string literals used as regular expressions or
|
|
|
|
otherwise passed to other modules that do their own escape handling --
|
|
|
|
but you may end up quadrupling backslashes that must appear literally.)
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\subsection{Numeric literals}

There are three types of numeric literals: plain integers, long
integers, and floating point numbers.

Integers and long integers are described by the following regular expressions:

\begin{verbatim}
longinteger:    integer ("l"|"L")
integer:        decimalinteger | octinteger | hexinteger
decimalinteger: nonzerodigit digit* | "0"
octinteger:     "0" octdigit+
hexinteger:     "0" ("x"|"X") hexdigit+

nonzerodigit:   "1"..."9"
octdigit:       "0"..."7"
hexdigit:        digit|"a"..."f"|"A"..."F"
\end{verbatim}


Although both lower case `l' and upper case `L' are allowed as suffix
for long integers, it is strongly recommended to always use `L', since
the letter `l' looks too much like the digit `1'.

Plain integer decimal literals must be at most $2^{31} - 1$ (i.e., the
largest positive integer, assuming 32-bit arithmetic); octal and
hexadecimal literals may be as large as $2^{32} - 1$.  There is no limit
for long integer literals.

Some examples of plain and long integer literals:

\begin{verbatim}
7     2147483647                        0177    0x80000000
3L    79228162514264337593543950336L    0377L   0100000000L
\end{verbatim}

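
The integer rules above translate fairly directly into a regular
expression.  The following sketch is an illustration for a
present-day Python 3 interpreter and its standard \verb\re\ module;
the names \verb\INTEGER\ and \verb\kind\ are invented, and the sketch
checks only the spelling of a literal, not the size limits for plain
integers.

\begin{verbatim}
import re

INTEGER = re.compile(r"""
    (?: [1-9][0-9]*         # decimalinteger: nonzerodigit digit*
      | 0[xX][0-9a-fA-F]+   # hexinteger
      | 0[0-7]+             # octinteger
      | 0                   # decimalinteger: a single zero
    )
    [lL]?                   # a suffix makes it a longinteger
""", re.VERBOSE)

def kind(text):
    """Return 'plain', 'long', or None if text is not an integer."""
    if INTEGER.fullmatch(text) is None:
        return None
    return "long" if text[-1] in "lL" else "plain"

for text in ["7", "0177", "0x80000000", "0377L", "08"]:
    print(text, kind(text))
\end{verbatim}

The last literal, \verb\08\, is rejected because 8 is not an octal
digit and a decimal integer may not start with a zero.
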

Floating point numbers are described by the following regular expressions:

\begin{verbatim}
floatnumber:    pointfloat | exponentfloat
pointfloat:     [intpart] fraction | intpart "."
exponentfloat:  (intpart | pointfloat) exponent
intpart:        digit+
fraction:       "." digit+
exponent:       ("e"|"E") ["+"|"-"] digit+
\end{verbatim}

The allowed range of floating point literals is
implementation-dependent.

Some examples of floating point literals:

\begin{verbatim}
3.14    10.    .001    1e100    3.14e-10
\end{verbatim}

Note that numeric literals do not include a sign; a phrase like
\verb\-1\ is actually an expression composed of the operator
\verb\-\ and the literal \verb\1\.

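
The floating point rules can be checked against the examples in the
same way.  Again this is only a sketch for a present-day Python 3
interpreter; the names \verb\FLOATNUMBER\ and \verb\is_float\ are
invented for the example.

\begin{verbatim}
import re

FLOATNUMBER = re.compile(r"""
    (?: [0-9]* \. [0-9]+    # pointfloat: optional intpart, fraction
      | [0-9]+ \.           # pointfloat: intpart and a bare point
      | [0-9]+ (?= [eE] )   # intpart, legal only before an exponent
    )
    (?: [eE] [+-]? [0-9]+ )?    # exponent
""", re.VERBOSE)

def is_float(text):
    return FLOATNUMBER.fullmatch(text) is not None

for text in ["3.14", "10.", ".001", "1e100", "3.14e-10", "10", "-1"]:
    print(text, is_float(text))
\end{verbatim}

The five example literals are accepted.  The text \verb\10\ is
rejected because it is an integer, and \verb\-1\ is rejected because
the sign is an operator, not part of the literal.
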
\section{Operators}

The following tokens are operators:

\begin{verbatim}
+       -       *       /       %
<<      >>      &       |       ^       ~
<       ==      >       <=      <>      !=      >=
\end{verbatim}

The comparison operators \verb\<>\ and \verb\!=\ are alternate
spellings of the same operator.

\section{Delimiters}

The following tokens serve as delimiters or otherwise have a special
meaning:

\begin{verbatim}
(       )       [       ]       {       }
;       ,       :       .       `       =
\end{verbatim}

The following printing ASCII characters are not used in Python (except
in string literals and in comments).  Their occurrence is an
unconditional error:

\begin{verbatim}
!       @       $       "       ?
\end{verbatim}

They may be used by future versions of the language though!

\chapter{Execution model}

\section{Objects, values and types}

I won't try to define rigorously here what an object is, but I'll give
some properties of objects that are important to know about.

Every object has an identity, a type and a value.  An object's {\em
identity} never changes once it has been created; think of it as the
object's (permanent) address.  An object's {\em type} determines the
operations that an object supports (e.g., does it have a length?) and
also defines the ``meaning'' of the object's value.  The type also
never changes.  The {\em value} of some objects can change; whether
this is possible is a property of its type.


Objects are never explicitly destroyed; however, when they become
unreachable they may be garbage-collected.  An implementation is
allowed to delay garbage collection or omit it altogether -- it is a
matter of implementation quality how garbage collection is
implemented, as long as no objects are collected that are still
reachable.  (Implementation note: the current implementation uses a
reference-counting scheme which collects most objects as soon as they
become unreachable, but never collects garbage containing circular
references.)

Note that the use of the implementation's tracing or debugging
facilities may keep objects alive that would normally be collectable.

(Some objects contain references to ``external'' resources such as
open files.  It is understood that these resources are freed when the
object is garbage-collected, but since garbage collection is not
guaranteed, such objects also provide an explicit way to release the
external resource (e.g., a \verb\close\ method).  Programs are strongly
advised to use it.)


Some objects contain references to other objects.  These references
are part of the object's value; in most cases, when such a
``container'' object is compared to another (of the same type), the
comparison applies to the {\em values} of the referenced objects (not
their identities).

Types affect almost all aspects of objects.
Even object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed.  E.g., after

\begin{verbatim}
a = 1; b = 1; c = []; d = []
\end{verbatim}

\verb\a\ and \verb\b\ may or may not refer to the same object, but
\verb\c\ and \verb\d\ are guaranteed to refer to two different, unique,
newly created lists.

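
The difference between identity and value can be observed with the
\verb\is\ operator.  The lines below are a small illustration written
for a present-day Python 3 interpreter (hence the function-call form
of \verb\print\, which this manual does not describe).

\begin{verbatim}
a = 1; b = 1; c = []; d = []

print(c is d)     # False: two distinct list objects
print(c == d)     # True: the two lists have equal values
print(a == b)     # True; whether "a is b" holds is left open
c.append(0)       # changing c ...
print(d)          # ... leaves d untouched: []
\end{verbatim}
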
\section{Execution frames, name spaces, and scopes}

XXX code blocks, scopes, name spaces, name binding, exceptions

\chapter{The standard type hierarchy}

The following types are built into Python.  Extension modules
written in C can define additional types.  Future versions of Python
may also add types to the type hierarchy (e.g., rational or complex
numbers, lists of efficiently stored integers, etc.).

\begin{description}

\item[None]
This type has a single value.  There is a single object with this value.
This object is accessed through the built-in name \verb\None\.
It is returned from functions that don't explicitly return an object.

\item[Numbers]
These are created by numeric literals and returned as results
by arithmetic operators and arithmetic built-in functions.
Numeric objects are immutable; once created their value never changes.
Python numbers are of course strongly related to mathematical numbers,
but subject to the limitations of numerical representation in computers.

Python distinguishes between integers and floating point numbers:

\begin{description}
\item[Integers]
These represent elements from the mathematical set of whole numbers.

There are two types of integers:

\begin{description}

\item[Plain integers]
These represent numbers in the range $-2^{31}$ through $2^{31}-1$.
(The range may be larger on machines with a larger natural word
size, but not smaller.)
When the result of an operation falls outside this range, the
exception \verb\OverflowError\ is raised.
For the purpose of shift and mask operations, integers are assumed to
have a binary, 2's complement notation using 32 or more bits, and
hiding no bits from the user (i.e., all $2^{32}$ different bit
patterns correspond to different values).

\item[Long integers]
These represent numbers in an unlimited range, subject to available
(virtual) memory only.  For the purpose of shift and mask operations,
a binary representation is assumed, and negative numbers are
represented in a variant of 2's complement which gives the illusion of
an infinite string of sign bits extending to the left.

\end{description} % Integers

The rules for integer representation are intended to give the most
meaningful interpretation of shift and mask operations involving
negative integers and the least surprises when switching between the
plain and long integer domains.  For any operation except left shift,
if it yields a result in the plain integer domain without causing
overflow, it will yield the same result in the long integer domain or
when using mixed operands.

\item[Floating point numbers]
These represent machine-level double precision floating point numbers.
You are at the mercy of the underlying machine architecture and
C implementation for the accepted range and handling of overflow.

\end{description} % Numbers

\item[Sequences]
These represent finite ordered sets indexed by natural numbers.
The built-in function \verb\len()\ returns the number of elements
of a sequence.  When this number is $n$, the index set contains
the numbers $0, 1, \ldots, n-1$.  Element \verb\i\ of sequence
\verb\a\ is selected by \verb\a[i]\.

Sequences also support slicing: \verb\a[i:j]\ selects all elements
with index $k$ such that $i \le k < j$.  When used as an expression,
a slice is a sequence of the same type -- this implies that the
index set is renumbered so that it starts at 0 again.

Sequences are distinguished according to their mutability:

\begin{description}
%
\item[Immutable sequences]
An object of an immutable sequence type cannot change once it is
created.  (If the object contains references to other objects,
these other objects may be mutable and may be changed; however
the collection of objects directly referenced by an immutable object
cannot change.)

The following types are immutable sequences:

\begin{description}

\item[Strings]
The elements of a string are characters.  There is no separate
character type; a character is represented by a string of one element.
Characters represent (at least) 8-bit bytes.  The built-in
functions \verb\chr()\ and \verb\ord()\ convert between characters
and nonnegative integers representing the byte values.
Bytes with the values 0-127 represent the corresponding ASCII values.

(On systems whose native character set is not ASCII, strings may use
EBCDIC in their internal representation, provided the functions
\verb\chr()\ and \verb\ord()\ implement a mapping between ASCII and
EBCDIC, and string comparisons preserve the ASCII order.
Or perhaps someone can propose a better rule?)

\item[Tuples]
The elements of a tuple are arbitrary Python objects.
Tuples of two or more elements are formed by comma-separated lists
of expressions.  A tuple of one element can be formed by affixing
a comma to an expression (an expression by itself of course does
not create a tuple).  An empty tuple can be formed by enclosing
`nothing' in parentheses.

\end{description} % Immutable sequences

\item[Mutable sequences]
Mutable sequences can be changed after they are created.
The subscript and slice notations can be used as the target
of assignment and \verb\del\ (delete) statements.

There is currently a single mutable sequence type:

\begin{description}

\item[Lists]
The elements of a list are arbitrary Python objects.
Lists are formed by placing a comma-separated list of expressions
in square brackets.  (Note that there are no special cases for lists
of length 0 or 1.)

\end{description} % Mutable sequences

\end{description} % Sequences

\item[Mapping types]
These represent finite sets of objects indexed by arbitrary index sets.
The subscript notation \verb\a[k]\ selects the element indexed
by \verb\k\ from the mapping \verb\a\; this can be used in
expressions and as the target of assignments or \verb\del\ statements.
The built-in function \verb\len()\ returns the number of elements
in a mapping.

There is currently a single mapping type:

\begin{description}

\item[Dictionaries]
These represent finite sets of objects indexed by strings.
Dictionaries are created by the \verb\{...}\ notation (see section
\ref{dict}).  (Implementation note: the strings used for indexing must
not contain null bytes.)

\end{description} % Mapping types

\item[Callable types]
|
|
|
|
These are the types to which the function call operation can be applied:
|
|
|
|
|
|
|
|
\begin{description}
|
|
|
|
\item[User-defined functions]
|
|
|
|
XXX
|
|
|
|
\item[Built-in functions]
|
|
|
|
XXX
|
|
|
|
\item[User-defined methods]
|
|
|
|
XXX
|
|
|
|
\item[Built-in methods]
|
|
|
|
XXX
|
|
|
|
\item[User-defined classes]
|
|
|
|
XXX
|
|
|
|
\end{description}
|
|
|
|
|
|
|
|
\item[Modules]
|
|
|
|
XXX
|
|
|
|
|
|
|
|
\item[Class instances]
|
|
|
|
XXX
|
|
|
|
|
|
|
|
\item[Files]
|
|
|
|
XXX
|
|
|
|
|
|
|
|
\item[Internal types]
|
|
|
|
A few types used internally by the interpreter are exposed to the user.
|
|
|
|
Their definition may change with future versions of the interpreter,
|
|
|
|
but they are mentioned here for completeness.
|
|
|
|
|
|
|
|
\begin{description}
|
|
|
|
\item[Code objects]
|
|
|
|
XXX
|
|
|
|
\item[Traceback objects]
|
|
|
|
XXX
|
|
|
|
\end{description} % Internal types
|
|
|
|
|
|
|
|
\end{description} % Types
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
\chapter{Expressions and conditions}
|
|
|
|
|
1992-01-07 16:43:53 +00:00
|
|
|
From now on, extended BNF notation will be used to describe syntax,
|
|
|
|
not lexical analysis.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
This chapter explains the meaning of the elements of expressions and
|
|
|
|
conditions. Conditions are a superset of expressions, and a condition
|
1992-01-17 14:03:20 +00:00
|
|
|
may be used wherever an expression is required by enclosing it in
|
|
|
|
parentheses. The only places where expressions are used in the syntax
|
|
|
|
instead of conditions are in expression statements and on the
|
|
|
|
right-hand side of assignments; this catches some nasty bugs like
|
|
|
|
accidentally writing \verb\x == 1\ instead of \verb\x = 1\.
|
1992-01-07 16:43:53 +00:00
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
The comma has several roles in Python's syntax. It is usually an
|
1992-01-07 16:43:53 +00:00
|
|
|
operator with a lower precedence than all others, but occasionally
|
1992-01-17 14:03:20 +00:00
|
|
|
serves other purposes as well; e.g., it separates function arguments,
|
|
|
|
is used in list and dictionary constructors, and has special semantics
|
|
|
|
in \verb\print\ statements.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
When (one alternative of) a syntax rule has the form
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
name: othername
|
|
|
|
\end{verbatim}
|
|
|
|
|
1991-11-25 17:26:57 +00:00
|
|
|
and no semantics are given, the semantics of this form of \verb\name\
|
|
|
|
are the same as for \verb\othername\.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Arithmetic conversions}
|
|
|
|
|
|
|
|
When a description of an arithmetic operator below uses the phrase
|
|
|
|
``the numeric arguments are converted to a common type'',
|
|
|
|
this means both that if either argument is not a number, a
|
1992-01-20 17:10:21 +00:00
|
|
|
\verb\TypeError\ exception is raised, and that otherwise
|
1991-11-21 13:53:03 +00:00
|
|
|
the following conversions are applied:
|
|
|
|
|
|
|
|
\begin{itemize}
|
1992-01-20 17:10:21 +00:00
|
|
|
\item first, if either argument is a floating point number,
|
1991-11-21 13:53:03 +00:00
|
|
|
the other is converted to floating point;
|
|
|
|
\item else, if either argument is a long integer,
|
|
|
|
the other is converted to long integer;
|
1992-01-17 14:03:20 +00:00
|
|
|
\item otherwise, both must be plain integers and no conversion
|
1991-11-21 13:53:03 +00:00
|
|
|
is necessary.
|
|
|
|
\end{itemize}
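
The conversion rules above can be observed directly. The following is
a minimal sketch, assuming a current Python 3 interpreter, in which
plain and long integers have been merged into the single type
\verb\int\; all names are illustrative only.

\begin{verbatim}
# Assumes Python 3: plain and long integers are both `int` there.
int_sum = 2 + 3          # both integers: no conversion is necessary
mixed_sum = 2 + 3.5      # the integer is converted to floating point
big_sum = 2**70 + 1.0    # a large integer is converted as well

def add_non_number():
    # An argument that is not a number raises TypeError.
    try:
        return 1 + None
    except TypeError:
        return "TypeError"
\end{verbatim}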
|
|
|
|
|
|
|
|
\section{Atoms}
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
Atoms are the most basic elements of expressions. Forms enclosed in
|
|
|
|
reverse quotes or in parentheses, brackets or braces are also
|
|
|
|
categorized syntactically as atoms. The syntax for atoms is:
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-17 14:03:20 +00:00
|
|
|
atom: identifier | literal | enclosure
|
|
|
|
enclosure: parenth_form | list_display | dict_display | string_conversion
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
\subsection{Identifiers (Names)}
|
|
|
|
|
|
|
|
An identifier occurring as an atom is a reference to a local, global
|
1992-01-17 14:03:20 +00:00
|
|
|
or built-in name binding. If a name can be assigned to anywhere in a
|
|
|
|
code block, and is not mentioned in a \verb\global\ statement in that
|
|
|
|
code block, it refers to a local name throughout that code block.
|
1991-11-21 13:53:03 +00:00
|
|
|
Otherwise, it refers to a global name if one exists, else to a
|
|
|
|
built-in name.
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
When the name is bound to an object, evaluation of the atom yields
|
|
|
|
that object. When a name is not bound, an attempt to evaluate it
|
1992-01-20 17:10:21 +00:00
|
|
|
raises a \verb\NameError\ exception.
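
The name resolution rules can be illustrated with a short sketch. It
assumes a current Python 3 interpreter; the function and variable
names are hypothetical.

\begin{verbatim}
# Assumes Python 3; all names here are illustrative.
counter = 10                  # a global binding

def read_global():
    return counter            # never assigned here: the global name

def shadow():
    counter = 99              # assigned here: local throughout shadow()
    return counter

def rebind():
    global counter            # named in a global statement: stays global
    counter = counter + 1
    return counter

def unbound():
    try:
        return no_such_name   # bound nowhere
    except NameError:
        return "NameError"

results = (read_global(), shadow(), counter, rebind(), counter, unbound())
\end{verbatim}

The call to \verb\shadow()\ leaves the global binding untouched, while
\verb\rebind()\ changes it.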
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\subsection{Literals}
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
Python knows string and numeric literals:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
literal: stringliteral | integer | longinteger | floatnumber
|
|
|
|
\end{verbatim}
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
Evaluation of a literal yields an object of the given type
|
|
|
|
(string, integer, long integer, floating point number)
|
|
|
|
with the given value.
|
|
|
|
The value may be approximated in the case of floating point literals.
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
All literals correspond to immutable data types, and hence the
|
|
|
|
object's identity is less important than its value. Multiple
|
|
|
|
evaluations of literals with the same value (either the same
|
|
|
|
occurrence in the program text or a different occurrence) may obtain
|
|
|
|
the same object or a different object with the same value.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
(In the original implementation, all literals in the same code block
|
|
|
|
with the same type and value yield the same object.)
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
\subsection{Parenthesized forms}
|
1992-01-17 14:03:20 +00:00
|
|
|
|
|
|
|
A parenthesized form is an optional condition list enclosed in
|
|
|
|
parentheses:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
parenth_form: "(" [condition_list] ")"
|
|
|
|
\end{verbatim}
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
A parenthesized condition list yields whatever that condition list
|
|
|
|
yields.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
An empty pair of parentheses yields an empty tuple object. Since
|
|
|
|
tuples are immutable, the rules for literals apply here.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
(Note that tuples are not formed by the parentheses, but rather by use
|
|
|
|
of the comma operator. The exception is the empty tuple, for which
|
|
|
|
parentheses {\em are} required -- allowing unparenthesized ``nothing''
|
|
|
|
in expressions would cause ambiguities and allow common typos to
|
|
|
|
pass uncaught.)
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\subsection{List displays}
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
A list display is a possibly empty series of conditions enclosed in
|
|
|
|
square brackets:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
list_display: "[" [condition_list] "]"
|
|
|
|
\end{verbatim}
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
A list display yields a new list object.
|
|
|
|
|
|
|
|
If it has no condition list, the list object has no items.
|
|
|
|
Otherwise, the elements of the condition list are evaluated
|
|
|
|
from left to right and inserted in the list object in that order.
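
As a small illustration of the evaluation order, the sketch below
records the order in which the elements are evaluated. It assumes a
current Python 3 interpreter, and \verb\note()\ is a hypothetical
helper introduced only for this example.

\begin{verbatim}
# Assumes Python 3; note() is a hypothetical helper that logs its calls.
order = []

def note(x):
    order.append(x)
    return x * 2

empty = []                              # no condition list: no items
doubled = [note(1), note(2), note(3)]   # evaluated left to right

first = []
second = []                             # each display yields a new list
\end{verbatim}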
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
\subsection{Dictionary displays} \label{dict}
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
A dictionary display is a possibly empty series of key/datum pairs
|
|
|
|
enclosed in curly braces:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
dict_display: "{" [key_datum_list] "}"
|
|
|
|
key_datum_list: key_datum ("," key_datum)* [","]
|
|
|
|
key_datum: condition ":" condition
|
|
|
|
\end{verbatim}
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
A dictionary display yields a new dictionary object.
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
The key/datum pairs are evaluated from left to right to define the
|
|
|
|
entries of the dictionary: each key object is used as a key into the
|
|
|
|
dictionary to store the corresponding datum.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
Keys must be strings; otherwise, a \verb\TypeError\ exception is raised.
|
1992-01-17 14:03:20 +00:00
|
|
|
Clashes between duplicate keys are not detected; the last datum
|
|
|
|
(textually rightmost in the display) stored for a given key value
|
|
|
|
prevails.
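
The treatment of duplicate keys can be seen in a brief sketch,
assuming a current Python 3 interpreter; the dictionary contents are
made up for the example.

\begin{verbatim}
# Assumes Python 3; keys are strings, as this manual requires.
ages = {"ann": 31, "bob": 27, "ann": 45}   # "ann" occurs twice
empty = {}                                 # an empty display
\end{verbatim}

Only two entries remain in \verb\ages\, and the rightmost datum for
\verb\"ann"\ is the one that is kept.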
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\subsection{String conversions}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
A string conversion is a condition list enclosed in reverse (or
|
1992-01-17 14:03:20 +00:00
|
|
|
backward) quotes:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
string_conversion: "`" condition_list "`"
|
|
|
|
\end{verbatim}
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
A string conversion evaluates the contained condition list and converts the
|
|
|
|
resulting object into a string according to rules specific to its type.
|
|
|
|
|
1991-11-25 17:26:57 +00:00
|
|
|
If the object is a string, a number, \verb\None\, or a tuple, list or
|
1992-01-17 14:03:20 +00:00
|
|
|
dictionary containing only objects whose type is one of these, the
|
|
|
|
resulting string is a valid Python expression which can be passed to
|
|
|
|
the built-in function \verb\eval()\ to yield an object with the
|
1991-11-21 13:53:03 +00:00
|
|
|
same value (or an approximation, if floating point numbers are
|
|
|
|
involved).
|
|
|
|
|
|
|
|
(In particular, converting a string adds quotes around it and converts
|
|
|
|
``funny'' characters to escape sequences that are safe to print.)
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
It is illegal to attempt to convert recursive objects (e.g., lists or
|
|
|
|
dictionaries that contain a reference to themselves, directly or
|
|
|
|
indirectly).
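
The round trip through \verb\eval()\ can be sketched as follows. The
sketch assumes a current Python 3 interpreter, where the reverse quote
notation is no longer available and the built-in function
\verb\repr()\ performs the corresponding conversion; the sample value
is illustrative.

\begin{verbatim}
# Assumes Python 3, where repr(x) takes the place of the `x` notation.
value = ["tab\there", 42, 2.5, None, ("nested", [1, 2])]
text = repr(value)       # a string holding a valid Python expression
restored = eval(text)    # evaluating it rebuilds an equal object
quoted = repr("it's")    # quotes are added around a string
\end{verbatim}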
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Primaries}
|
|
|
|
|
|
|
|
Primaries represent the most tightly bound operations of the language.
|
|
|
|
Their syntax is:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-17 14:03:20 +00:00
|
|
|
primary: atom | attributeref | subscription | slicing | call
|
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
\subsection{Attribute references}
|
|
|
|
|
|
|
|
An attribute reference is a primary followed by a period and a name:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
attributeref: primary "." identifier
|
1992-01-17 14:03:20 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
The primary must evaluate to an object of a type that supports
|
|
|
|
attribute references, e.g., a module or a list. This object is then
|
|
|
|
asked to produce the attribute whose name is the identifier. If this
|
|
|
|
attribute is not available, the exception \verb\AttributeError\ is
|
|
|
|
raised. Otherwise, the type and value of the object produced are
|
|
|
|
determined by the object. Multiple evaluations of the same attribute
|
|
|
|
reference may yield different objects.
|
|
|
|
|
|
|
|
\subsection{Subscriptions}
|
|
|
|
|
|
|
|
A subscription selects an item of a sequence or mapping object:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
subscription: primary "[" condition "]"
|
1992-01-17 14:03:20 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
The primary must evaluate to an object of a sequence or mapping type.
|
|
|
|
|
|
|
|
If it is a mapping, the condition must evaluate to an object whose
|
|
|
|
value is one of the keys of the mapping, and the subscription selects
|
|
|
|
the value in the mapping that corresponds to that key.
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
If it is a sequence, the condition must evaluate to a plain integer.
|
|
|
|
If this value is negative, the length of the sequence is added to it
|
|
|
|
(so that, e.g., \verb\x[-1]\ selects the last item of \verb\x\).
|
|
|
|
The resulting value must be a nonnegative integer smaller than the
|
|
|
|
number of items in the sequence, and the subscription selects the item
|
|
|
|
whose index is that value (counting from zero).
|
1992-01-17 14:03:20 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
A string's items are characters. A character is not a separate data
|
1992-01-17 14:03:20 +00:00
|
|
|
type but a string of exactly one character.
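
A few subscriptions are sketched below, assuming a current Python 3
interpreter. The exception raised for an index outside the valid range
is not named in this section; current interpreters raise
\verb\IndexError\, which the sketch relies on.

\begin{verbatim}
# Assumes Python 3; the data are illustrative.
x = [10, 20, 30]
last = x[-1]              # same item as x[len(x) - 1]
ch = "abc"[1]             # a string of exactly one character
ages = {"ann": 31}
ann = ages["ann"]         # mapping: selects the value for that key

def out_of_range():
    try:
        return x[3]       # 3 is not smaller than len(x)
    except IndexError:    # assumption: not named in this section
        return "IndexError"
\end{verbatim}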
|
|
|
|
|
|
|
|
\subsection{Slicings}
|
|
|
|
|
|
|
|
A slicing selects a range of items in a sequence object:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
slicing: primary "[" [condition] ":" [condition] "]"
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The primary must evaluate to a sequence object. The lower and upper
|
|
|
|
bound expressions, if present, must evaluate to plain integers;
|
|
|
|
defaults are zero and the sequence's length, respectively. If either
|
|
|
|
bound is negative, the sequence's length is added to it. The slicing
|
|
|
|
now selects all items with index $k$ such that $i \leq k < j$ where $i$
|
|
|
|
and $j$ are the specified lower and upper bounds. This may be an
|
|
|
|
empty sequence. It is not an error if $i$ or $j$ lies outside the
|
|
|
|
range of valid indexes (such items don't exist so they aren't
|
|
|
|
selected).
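
The selection rule can be restated as a small function and compared
with the built-in slicing. This is only a sketch, assuming a current
Python 3 interpreter; \verb\select()\ is a hypothetical helper, not
part of the language.

\begin{verbatim}
# Assumes Python 3; select() is a hypothetical restatement of the rule.
def select(seq, i=None, j=None):
    n = len(seq)
    if i is None:
        i = 0             # default lower bound
    if j is None:
        j = n             # default upper bound
    if i < 0:
        i = i + n         # negative bounds count from the end
    if j < 0:
        j = j + n
    # Keep the items whose index k satisfies i <= k < j.
    return [seq[k] for k in range(n) if i <= k < j]

letters = ["a", "b", "c", "d", "e"]
middle = letters[1:3]
tail = letters[-2:]
beyond = letters[3:100]   # bounds outside the valid range are fine
nothing = letters[4:2]    # the result may be an empty sequence
\end{verbatim}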
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\subsection{Calls}
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
A call calls a function with a possibly empty series of arguments:
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
\begin{verbatim}
|
|
|
|
call: primary "(" [condition_list] ")"
|
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The primary must evaluate to a callable object (user-defined
|
|
|
|
functions, built-in functions, methods of built-in objects, class
|
|
|
|
objects, and methods of class instances are callable). If it is a
|
|
|
|
class, the argument list must be empty.
|
1992-01-17 14:03:20 +00:00
|
|
|
|
|
|
|
XXX explain what happens on function call
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Factors}
|
|
|
|
|
|
|
|
Factors represent the unary numeric operators.
|
|
|
|
Their syntax is:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
factor: primary | "-" factor | "+" factor | "~" factor
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The unary \verb\"-"\ operator yields the negative of its
|
|
|
|
numeric argument.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The unary \verb\"+"\ operator yields its numeric argument unchanged.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The unary \verb\"~"\ operator yields the bit-wise negation of its
|
|
|
|
plain or long integer argument. The bit-wise negation of
|
|
|
|
\verb\x\ is defined as \verb\-(x+1)\.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
In all three cases, if the argument does not have the proper type,
|
1992-01-20 17:10:21 +00:00
|
|
|
a \verb\TypeError\ exception is raised.
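
The three unary operators are sketched below, assuming a current
Python 3 interpreter; the values are arbitrary.

\begin{verbatim}
# Assumes Python 3.
negated = -5
unchanged = +5
inverted = ~5             # defined as -(5 + 1)
checks = [~x == -(x + 1) for x in (-3, -1, 0, 1, 2**40)]

def invert_float():
    try:
        return ~1.5       # bit-wise negation needs an integer
    except TypeError:
        return "TypeError"
\end{verbatim}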
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Terms}
|
|
|
|
|
|
|
|
Terms represent the most tightly binding binary operators:
|
1992-01-20 17:10:21 +00:00
|
|
|
%
|
1991-11-21 13:53:03 +00:00
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
term: factor | term "*" factor | term "/" factor | term "%" factor
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
1992-01-20 17:10:21 +00:00
|
|
|
%
|
|
|
|
The \verb\"*"\ (multiplication) operator yields the product of its
|
1992-01-17 14:03:20 +00:00
|
|
|
arguments. The arguments must either both be numbers, or one argument
|
|
|
|
must be a plain integer and the other must be a sequence. In the
|
|
|
|
former case, the numbers are converted to a common type and then
|
|
|
|
multiplied together. In the latter case, sequence repetition is
|
1992-01-20 17:10:21 +00:00
|
|
|
performed; a negative repetition factor yields an empty sequence.
|
1992-01-17 14:03:20 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The \verb\"/"\ (division) operator yields the quotient of its
|
1992-01-17 14:03:20 +00:00
|
|
|
arguments. The numeric arguments are first converted to a common
|
1992-01-20 17:10:21 +00:00
|
|
|
type. Plain or long integer division yields an integer of the same
|
|
|
|
type; the result is that of mathematical division with the `floor'
|
|
|
|
function applied to the result. Division by zero raises the
|
|
|
|
\verb\ZeroDivisionError\ exception.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The \verb\"%"\ (modulo) operator yields the remainder from the
|
1992-01-17 14:03:20 +00:00
|
|
|
division of the first argument by the second. The numeric arguments
|
1992-01-20 17:10:21 +00:00
|
|
|
are first converted to a common type. A zero right argument raises the
|
|
|
|
\verb\ZeroDivisionError\ exception. The arguments may be floating point
|
|
|
|
numbers, e.g., \verb\3.14 % 0.7\ equals \verb\0.34\. The modulo operator
|
1992-01-17 14:03:20 +00:00
|
|
|
always yields a result with the same sign as its second operand (or
|
|
|
|
zero); the absolute value of the result is strictly smaller than the
|
|
|
|
absolute value of the second operand.
|
|
|
|
|
|
|
|
The integer division and modulo operators are connected by the
|
1992-01-20 17:10:21 +00:00
|
|
|
following identity: \verb\x == (x/y)*y + (x%y)\.
|
|
|
|
Integer division and modulo are also connected with the built-in
|
|
|
|
function \verb\divmod()\: \verb\divmod(x, y) == (x/y, x%y)\.
|
|
|
|
These identities don't hold for floating point numbers; there a
|
|
|
|
similar identity holds where \verb\x/y\ is replaced by
|
|
|
|
\verb\floor(x/y)\.
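
The identities for integer division and modulo can be checked with a
short sketch. It assumes a current Python 3 interpreter, which spells
integer floor division as \verb\//\ (there, \verb\/\ applied to
integers yields a floating point result); the operand pairs are
arbitrary.

\begin{verbatim}
# Assumes Python 3, which spells integer floor division as //.
pairs = [(7, 2), (-7, 2), (7, -2), (-7, -2)]
quotients = [x // y for x, y in pairs]
remainders = [x % y for x, y in pairs]    # sign follows the second operand
identity = all(x == (x // y) * y + (x % y) for x, y in pairs)
same_as_divmod = all(divmod(x, y) == (x // y, x % y) for x, y in pairs)
float_rem = 3.14 % 0.7                    # approximately 0.34
repeated = "ab" * 3                       # sequence repetition
emptied = "ab" * -1                       # negative factor: empty sequence

def divide_by_zero():
    try:
        return 1 // 0
    except ZeroDivisionError:
        return "ZeroDivisionError"
\end{verbatim}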
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Arithmetic expressions}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
arith_expr: term | arith_expr "+" term | arith_expr "-" term
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-17 14:03:20 +00:00
|
|
|
The \verb|"+"| operator yields the sum of its arguments. The
|
1992-01-20 17:10:21 +00:00
|
|
|
arguments must either both be numbers, or both sequences of the same
|
|
|
|
type. In the former case, the numbers are converted to a common type
|
|
|
|
and then added together. In the latter case, the sequences are
|
|
|
|
concatenated.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-07 16:43:53 +00:00
|
|
|
The \verb|"-"| operator yields the difference of its arguments.
|
1991-11-21 13:53:03 +00:00
|
|
|
The numeric arguments are first converted to a common type.
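
Both uses of \verb|"+"| are sketched below, assuming a current
Python 3 interpreter; the values are illustrative.

\begin{verbatim}
# Assumes Python 3.
total = 2 + 3.5               # numbers: converted, then added
joined = [1, 2] + [3]         # sequences of the same type: concatenated
text = "ab" + "cd"
difference = 10 - 2.5

def mixed_sequences():
    try:
        return (1, 2) + [3]   # a tuple and a list: not the same type
    except TypeError:
        return "TypeError"
\end{verbatim}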
|
|
|
|
|
|
|
|
\section{Shift expressions}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-20 17:10:21 +00:00
|
|
|
shift_expr: arith_expr | shift_expr ( "<<" | ">>" ) arith_expr
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
These operators accept plain or long integers as arguments. The
|
|
|
|
arguments are converted to a common type. They shift the first
|
|
|
|
argument to the left or right by the number of bits given by the
|
|
|
|
second argument.
|
|
|
|
|
|
|
|
A right shift by $n$ bits is defined as division by $2^n$. A left
|
|
|
|
shift by $n$ bits is defined as multiplication by $2^n$ without
|
|
|
|
overflow check; for plain integers this drops bits if the result is
|
|
|
|
not less than $2^{31}$ in absolute value.
|
|
|
|
|
|
|
|
Negative shift counts raise a \verb\ValueError\ exception.
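
The definitions of the shift operators can be checked with a short
sketch. It assumes a current Python 3 interpreter, in which integers
do not overflow, so the remark about dropped bits does not apply
there; division is written \verb\//\ to obtain the floor of the
quotient.

\begin{verbatim}
# Assumes Python 3, where integers do not overflow.
values = (20, 21, -20, -21)
right = [x >> 2 for x in values]
by_division = [x // 2**2 for x in values]   # floor division by 2**n
left = 5 << 3
by_multiplication = 5 * 2**3

def negative_count():
    try:
        return 1 << -1
    except ValueError:
        return "ValueError"
\end{verbatim}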
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Bitwise AND expressions}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
and_expr: shift_expr | and_expr "&" shift_expr
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
This operator yields the bitwise AND of its arguments, which must be
|
|
|
|
plain or long integers. The arguments are converted to a common type.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Bitwise XOR expressions}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
xor_expr: and_expr | xor_expr "^" and_expr
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
This operator yields the bitwise exclusive OR of its arguments, which
|
|
|
|
must be plain or long integers. The arguments are converted to a
|
|
|
|
common type.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Bitwise OR expressions}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
or_expr: xor_expr | or_expr "|" xor_expr
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
This operator yields the bitwise OR of its arguments, which must be
|
|
|
|
plain or long integers. The arguments are converted to a common type.
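
The three bitwise operators described in this and the two preceding
sections are sketched together below, assuming a current Python 3
interpreter; the operands are arbitrary.

\begin{verbatim}
# Assumes Python 3.
a = 0b1100
b = 0b1010
both = a & b       # bits set in both arguments:         0b1000
differ = a ^ b     # bits set in exactly one argument:   0b0110
either = a | b     # bits set in at least one argument:  0b1110
\end{verbatim}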
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Comparisons}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-20 17:10:21 +00:00
|
|
|
comparison: or_expr (comp_operator or_expr)*
|
1992-01-07 16:43:53 +00:00
|
|
|
comp_operator: "<"|">"|"=="|">="|"<="|"<>"|"!="|"is" ["not"]|["not"] "in"
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
Comparisons yield integer values: 1 for true, 0 for false.
|
|
|
|
|
|
|
|
Comparisons can be chained arbitrarily,
|
|
|
|
e.g., $x < y <= z$ is equivalent to
|
1992-01-20 17:10:21 +00:00
|
|
|
$x < y$ \verb\and\ $y <= z$, except that $y$ is evaluated only once
|
1991-11-21 13:53:03 +00:00
|
|
|
(but in both cases $z$ is not evaluated at all when $x < y$ is
|
|
|
|
found to be false).
|
|
|
|
|
|
|
|
Formally, $e_0 op_1 e_1 op_2 e_2 ...e_{n-1} op_n e_n$ is equivalent to
|
1992-01-20 17:10:21 +00:00
|
|
|
$e_0 op_1 e_1$ \verb\and\ $e_1 op_2 e_2$ \verb\and\ ... \verb\and\
|
1991-11-21 13:53:03 +00:00
|
|
|
$e_{n-1} op_n e_n$, except that each expression is evaluated at most once.
|
|
|
|
|
|
|
|
Note that $e_0 op_1 e_1 op_2 e_2$ does not imply any kind of comparison
|
|
|
|
between $e_0$ and $e_2$, e.g., $x < y > z$ is perfectly legal.
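
The evaluation of a chained comparison can be traced with a short
sketch. It assumes a current Python 3 interpreter, where comparisons
yield \verb\True\ and \verb\False\ (which compare equal to 1 and 0);
\verb\y()\ and \verb\z()\ are hypothetical helpers that log their
calls.

\begin{verbatim}
# Assumes Python 3; y() and z() are hypothetical helpers that log calls.
calls = []

def y():
    calls.append("y")
    return 5

def z():
    calls.append("z")
    return 9

chained_true = 1 < y() <= z()     # y() is evaluated only once
calls_when_true = list(calls)

del calls[:]
chained_false = 7 < y() <= z()    # z() is not evaluated at all
calls_when_false = list(calls)

unrelated = 1 < 5 > 3             # no comparison between 1 and 3
\end{verbatim}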
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The forms \verb\<>\ and \verb\!=\ are equivalent; for consistency with
|
|
|
|
C, \verb\!=\ is preferred; where \verb\!=\ is mentioned below,
|
|
|
|
\verb\<>\ is also implied.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The operators {\tt "<", ">", "==", ">=", "<="}, and {\tt "!="} compare
|
1991-11-21 13:53:03 +00:00
|
|
|
the values of two objects. The objects needn't have the same type.
|
1992-01-20 17:10:21 +00:00
|
|
|
If both are numbers, they are converted to a common type. Otherwise,
|
|
|
|
objects of different types {\em always} compare unequal, and are
|
|
|
|
ordered consistently but arbitrarily.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
(This unusual
|
|
|
|
definition of comparison was chosen to simplify the definition of
|
1991-11-25 17:26:57 +00:00
|
|
|
operations like sorting and the \verb\in\ and \verb\not in\ operators.)
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
Comparison of objects of the same type depends on the type:
|
|
|
|
|
|
|
|
\begin{itemize}
|
1992-01-20 17:10:21 +00:00
|
|
|
|
|
|
|
\item
|
|
|
|
Numbers are compared arithmetically.
|
|
|
|
|
|
|
|
\item
|
|
|
|
Strings are compared lexicographically using the numeric equivalents
|
|
|
|
(the result of the built-in function \verb\ord()\) of their characters.
|
|
|
|
|
|
|
|
\item
|
|
|
|
Tuples and lists are compared lexicographically using comparison of
|
|
|
|
corresponding items.
|
|
|
|
|
|
|
|
\item
|
|
|
|
Mappings (dictionaries) are compared through lexicographic
|
|
|
|
comparison of their sorted (key, value) lists.%
|
|
|
|
\footnote{This is expensive since it requires sorting the keys first,
|
|
|
|
but it is about the only sensible definition. An earlier attempt compared
|
|
|
|
dictionaries using the following rules, but this gave surprises in
|
|
|
|
cases like \verb|if d == {}: ...|.}
|
|
|
|
|
|
|
|
\item
|
|
|
|
Most other types compare unequal unless they are the same object;
|
|
|
|
the choice whether one object is considered smaller or larger than
|
|
|
|
another one is made arbitrarily but consistently within one
|
|
|
|
execution of a program.
|
|
|
|
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{itemize}
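
Some of the rules in this list are sketched below, assuming a current
Python 3 interpreter. Current interpreters no longer order
dictionaries or objects of different types, so for those the sketch
tests equality only.

\begin{verbatim}
# Assumes Python 3.
numbers = 2 < 2.5                  # compared arithmetically
strings = "abc" < "abd"            # lexicographic, character by character
by_ord = "Z" < "a"                 # ord("Z") is 90, ord("a") is 97
tuples = (1, 2, 3) < (1, 3)        # decided by the second items
lists = [1, 2] < [1, 2, 0]         # a proper prefix compares smaller
dicts = {"a": 1, "b": 2} == {"b": 2, "a": 1}
unequal_types = (1 == "1")         # different types compare unequal
\end{verbatim}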
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The operators \verb\in\ and \verb\not in\ test for sequence
|
|
|
|
membership: if $y$ is a sequence, $x ~\verb\in\~ y$ is true if and
|
|
|
|
only if there exists an index $i$ such that $x = y[i]$.
|
|
|
|
$x ~\verb\not in\~ y$ yields the inverse truth value. The exception
|
|
|
|
\verb\TypeError\ is raised when $y$ is not a sequence, or when $y$ is
|
|
|
|
a string and $x$ is not a string of length one.%
|
|
|
|
\footnote{The latter restriction is sometimes a nuisance.}
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
The operators \verb\is\ and \verb\is not\ compare object identity:
|
1992-01-20 17:10:21 +00:00
|
|
|
$x ~\verb\is\~ y$ is true if and only if $x$ and $y$ are the same
|
|
|
|
object. $x ~\verb\is not\~ y$ yields the inverse truth value.
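
The difference between membership, equality of value and identity is
sketched below, assuming a current Python 3 interpreter; the names are
illustrative.

\begin{verbatim}
# Assumes Python 3.
a = [1, 2, 3]
b = [1, 2, 3]                   # equal value, but a different object
c = a                           # another name for the same object

found = 2 in a
missing = 9 not in a
equal_value = (a == b)
same_object = (a is c)
distinct_objects = (a is not b)

def not_a_sequence():
    number = 5
    try:
        return 1 in number      # the right argument is not a sequence
    except TypeError:
        return "TypeError"
\end{verbatim}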
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Boolean operators}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
condition: or_test
|
1992-01-07 16:43:53 +00:00
|
|
|
or_test: and_test | or_test "or" and_test
|
|
|
|
and_test: not_test | and_test "and" not_test
|
|
|
|
not_test: comparison | "not" not_test
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
In the context of Boolean operators, and also when conditions are used
|
|
|
|
by control flow statements, the following values are interpreted as
|
|
|
|
false: \verb\None\, numeric zero of all types, empty sequences
|
|
|
|
(strings, tuples and lists), and empty mappings (dictionaries). All
|
|
|
|
other values are interpreted as true.
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
The operator \verb\not\ yields 1 if its argument is false, 0 otherwise.
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The condition $x ~\verb\and\~ y$ first evaluates $x$; if $x$ is false,
|
1991-11-21 13:53:03 +00:00
|
|
|
$x$ is returned; otherwise, $y$ is evaluated and returned.
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
The condition $x ~\verb\or\~ y$ first evaluates $x$; if $x$ is true,
|
1991-11-21 13:53:03 +00:00
|
|
|
$x$ is returned; otherwise, $y$ is evaluated and returned.
|
|
|
|
|
|
|
|
(Note that \verb\and\ and \verb\or\ do not restrict the value and type
|
|
|
|
they return to 0 and 1, but rather return the last evaluated argument.
|
1992-01-20 17:10:21 +00:00
|
|
|
This is sometimes useful, e.g., if \verb\s\ is a string that should be
|
|
|
|
replaced by a default value if it is empty, \verb\s or 'foo'\
|
1991-11-21 13:53:03 +00:00
|
|
|
returns the desired value. Because \verb\not\ has to invent a value
|
|
|
|
anyway, it does not bother to return a value of the same type as its
|
1992-01-20 17:10:21 +00:00
|
|
|
argument, so \verb\not 'foo'\ yields \verb\0\, not \verb\''\.)
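
The values returned by the Boolean operators are sketched below,
assuming a current Python 3 interpreter, where \verb\not\ yields
\verb\True\ or \verb\False\ (which compare equal to 1 and 0).

\begin{verbatim}
# Assumes Python 3.
s = ""
name = s or "foo"            # empty string is false: yields "foo"
kept = "bar" or "foo"        # first argument is true: yields "bar"
stopped = [] and "unused"    # first argument is false: yields []
passed = [0] and "used"      # first argument is true: yields "used"
inverted = not "foo"         # a truth value, never ""
falsy = [bool(v) for v in (None, 0, 0.0, "", (), [], {})]
\end{verbatim}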
|
|
|
|
|
|
|
|
\section{Expression lists and condition lists}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
expression_list: or_expr ("," or_expr)* [","]
|
|
|
|
condition_list: condition ("," condition)* [","]
|
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
The only difference between expression lists and condition lists is
|
|
|
|
the lowest priority of operators that can be used in them without
|
|
|
|
being enclosed in parentheses; condition lists allow all operators,
|
|
|
|
while expression lists don't allow comparisons and Boolean operators
|
|
|
|
(they do allow bitwise and shift operators though).
|
|
|
|
|
|
|
|
Expression lists are used in expression statements and assignments;
|
|
|
|
condition lists are used everywhere else.
|
|
|
|
|
|
|
|
An expression (condition) list containing at least one comma yields a
|
|
|
|
tuple. The length of the tuple is the number of expressions
|
|
|
|
(conditions) in the list. The expressions (conditions) are evaluated
|
|
|
|
from left to right.
|
|
|
|
|
|
|
|
The trailing comma is required only to create a one-item tuple (a.k.a. a
|
|
|
|
{\em singleton}); it is optional in all other cases. A single
|
|
|
|
expression (condition) without a trailing comma doesn't create a
|
|
|
|
tuple, but rather yields the value of that expression (condition).
|
|
|
|
|
|
|
|
To create an empty tuple, use an empty pair of parentheses: \verb\()\.
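
The role of the comma in forming tuples is sketched below, assuming a
current Python 3 interpreter; the names are illustrative.

\begin{verbatim}
# Assumes Python 3.
pair = 1, 2                  # the comma creates the tuple
single = 1,                  # trailing comma required for one item
plain = (1)                  # parentheses alone do not create a tuple
trailing = (1, 2,)           # trailing comma optional here
empty = ()                   # the empty tuple needs the parentheses
\end{verbatim}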
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\chapter{Simple statements}
|
|
|
|
|
|
|
|
A simple statement is contained within a single logical line.
|
1992-01-20 17:10:21 +00:00
|
|
|
Several simple statements may occur on a single line separated
|
1991-11-21 13:53:03 +00:00
|
|
|
by semicolons. The syntax for simple statements is:
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-07 16:43:53 +00:00
|
|
|
stmt_list: simple_stmt (";" simple_stmt)* [";"]
|
1991-11-21 13:53:03 +00:00
|
|
|
simple_stmt: expression_stmt
|
|
|
|
| assignment
|
|
|
|
| pass_stmt
|
|
|
|
| del_stmt
|
|
|
|
| print_stmt
|
|
|
|
| return_stmt
|
|
|
|
| raise_stmt
|
|
|
|
| break_stmt
|
|
|
|
| continue_stmt
|
|
|
|
| import_stmt
|
1992-01-07 16:43:53 +00:00
|
|
|
| global_stmt
|
1991-11-21 13:53:03 +00:00
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
\section{Expression statements}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
|
|
|
expression_stmt: expression_list
|
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
An expression statement evaluates the expression list (which may
|
|
|
|
be a single expression).
|
|
|
|
If the value is not \verb\None\, it is converted to a string
|
|
|
|
using the rules for string conversions, and the resulting string
|
|
|
|
is written to standard output on a line by itself.
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
(The exception for \verb\None\ is made so that procedure calls, which
|
|
|
|
are syntactically equivalent to expressions, do not cause any output.
|
|
|
|
A tuple with only \verb\None\ items is written normally.)
|
1991-11-21 13:53:03 +00:00
|
|
|
|
|
|
|
\section{Assignments}
|
|
|
|
|
|
|
|
\begin{verbatim}
|
1992-01-20 17:10:21 +00:00
|
|
|
assignment: (target_list "=")+ expression_list
|
1992-01-07 16:43:53 +00:00
|
|
|
target_list: target ("," target)* [","]
|
|
|
|
target: identifier | "(" target_list ")" | "[" target_list "]"
|
1991-11-21 13:53:03 +00:00
|
|
|
| attributeref | subscription | slicing
|
|
|
|
\end{verbatim}
|
|
|
|
|
1992-01-20 17:10:21 +00:00
|
|
|
(See the section on primaries for the syntax definition of the last
|
1991-11-21 13:53:03 +00:00
|
|
|
three symbols.)
|
|
|
|
|
|
|
|
An assignment evaluates the expression list (remember that this can
|
|
|
|
be a single expression or a comma-separated list,
|
|
|
|
the latter yielding a tuple)
|
|
|
|
and assigns the single resulting object to each of the target lists,
|
|
|
|
from left to right.

Assignment is defined recursively depending on the form of the target.
When a target is part of a mutable object (an attribute reference,
subscription or slicing), the mutable object must ultimately perform
the assignment and decide about its validity, and may raise an
exception if the assignment is unacceptable. The rules observed by
various types and the exceptions raised are given with the definition
of the object types (some of which are defined in the library
reference).

Assignment of an object to a target list is recursively
defined as follows.

\begin{itemize}
\item
If the target list contains no commas (except in nested constructs):
the object is assigned to the single target contained in the list.

\item
If the target list contains commas (that are not in nested constructs):
the object must be a tuple with the same number of items
as the list contains targets, and the items are assigned, from left
to right, to the corresponding targets.

\end{itemize}
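
The two cases for target lists can be sketched as follows. The names
are illustrative only. The draft does not say which exception signals
a mismatch between the number of targets and the number of items, so
the sketch catches any exception; this is an assumption about current
interpreters, not a statement about the original implementation.

\begin{verbatim}
# Illustrative names; not taken from the manual.
point = (3, 4)

# Two targets and a tuple of two items: assigned from left to right.
x, y = point

# The expression list is evaluated into a tuple before any target is
# assigned, so this exchanges the two values.
x, y = y, x

# Three items cannot be assigned to two targets.
try:
    a, b = (1, 2, 3)
    mismatch_detected = False
except Exception:       # exact exception type not specified by the draft
    mismatch_detected = True
\end{verbatim}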

Assignment of an object to a (non-list)
target is recursively defined as follows.

\begin{itemize}

\item
If the target is an identifier (name):
\begin{itemize}
\item
If the name does not occur in a \verb\global\ statement in the current
code block: the object is bound to that name in the current local
name space.
\item
Otherwise: the object is bound to that name in the current global name
space.
\end{itemize}
A previous binding of the same name in the same name space is undone.

\item
If the target is a target list enclosed in parentheses:
the object is assigned to that target list.

\item
If the target is a target list enclosed in square brackets:
the object must be a list with the same number of items
as the target list contains targets,
and the list's items are assigned, from left to right,
to the corresponding targets.

\item
If the target is an attribute reference:
The primary expression in the reference is evaluated.
It should yield an object with assignable attributes;
if this is not the case, \verb\TypeError\ is raised.
That object is then asked to assign the assigned object
to the given attribute; if it cannot perform the assignment,
it raises an exception.

\item
If the target is a subscription: The primary expression in the
reference is evaluated. It should yield either a mutable sequence
(list) object or a mapping (dictionary) object. Next, the subscript
expression is evaluated.

If the primary is a sequence object, the subscript must yield a plain
integer. If it is negative, the sequence's length is added to it.
The resulting value must be a nonnegative integer less than the
sequence's length, and the sequence is asked to assign the assigned
object to its item with that index. If the index is out of range,
\verb\IndexError\ is raised (assignment to a subscripted sequence
cannot add new items to a list).

If the primary is a mapping object, the subscript must have a type
compatible with the mapping's key type, and the mapping is then asked
to create a key/datum pair which maps the subscript to the assigned
object. This can either replace an existing key/value pair with the
same key value, or insert a new key/value pair (if no key with the
same value existed).

\item
If the target is a slicing: The primary expression in the reference is
evaluated. It should yield a mutable sequence (list) object. The
assigned object should be a sequence object of the same type. Next,
the lower and upper bound expressions are evaluated, insofar as they are
present; defaults are zero and the sequence's length. The bounds
should evaluate to (small) integers. If either bound is negative, the
sequence's length is added to it. The resulting bounds are clipped to
lie between zero and the sequence's length, inclusive. Finally, the
sequence object is asked to replace the items indicated by the slice
with the items of the assigned sequence. This may change the
sequence's length, if it allows it.

\end{itemize}
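
The subscription and slicing rules above can be tried out with a list
and a dictionary. This is only a sketch: the names \verb\items\ and
\verb\table\ are illustrative, and the code is written so that it also
runs on current Python interpreters.

\begin{verbatim}
# Illustrative names; not taken from the manual.
items = [10, 20, 30]
items[-1] = 33              # negative index: the length (3) is added,
                            # so this assigns to items[2]

try:
    items[3] = 40           # out of range: no new item is added
    out_of_range = False
except IndexError:
    out_of_range = True

table = {}
table['spam'] = 1           # inserts a new key/value pair
table['spam'] = 2           # replaces the pair with the same key

items[1:2] = [21, 22, 23]   # slice assignment may change the length
items[-100:1] = []          # lower bound is clipped to zero, so this
                            # replaces items[0:1] by nothing
\end{verbatim}

After these statements \verb\items\ holds four items and \verb\table\
holds a single pair.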

(In the original implementation, the syntax for targets is taken
to be the same as for expressions, and invalid syntax is rejected
during the code generation phase, causing less detailed error
messages.)

\section{The \verb\pass\ statement}

\begin{verbatim}
pass_stmt: "pass"
\end{verbatim}

\verb\pass\ is a null operation -- when it is executed, nothing
happens. It is useful as a placeholder when a statement is
required syntactically, but no code needs to be executed, for example:

\begin{verbatim}
def f(arg): pass    # a no-op function

class C: pass       # an empty class
\end{verbatim}

\section{The \verb\del\ statement}

\begin{verbatim}
del_stmt: "del" target_list
\end{verbatim}

Deletion is defined recursively, in much the same way as assignment.
Rather than spelling it out in full detail, here are some
hints.

Deletion of a target list recursively deletes each target,
from left to right.

Deletion of a name removes the binding of that name (which must exist)
from the local or global name space, depending on whether the name
occurs in a \verb\global\ statement in the same code block.

Deletion of attribute references, subscriptions and slicings
is passed to the primary object involved; deletion of a slicing
is in general equivalent to assignment of an empty slice of the
right type (but even this is determined by the sliced object).
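
The hints above can be illustrated with a short sketch. All names are
illustrative only. That a reference to a deleted name raises
\verb\NameError\ is the behavior of current interpreters and is an
assumption here, since the draft does not state it.

\begin{verbatim}
# Illustrative names; not taken from the manual.
letters = ['a', 'b', 'c', 'd', 'e']
copy = letters[:]

del letters[1:3]        # deleting a slice ...
copy[1:3] = []          # ... and assigning an empty list to it

same_result = (letters == copy)

ages = {'tom': 3, 'ann': 5}
del ages['tom']         # passed on to the mapping object

temp = 1
del temp                # removes the binding of the name
try:
    temp
    still_bound = True
except NameError:
    still_bound = False
\end{verbatim}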

\section{The \verb\print\ statement}

\begin{verbatim}
print_stmt: "print" [ condition ("," condition)* [","] ]
\end{verbatim}

\verb\print\ evaluates each condition in turn and writes the resulting
object to standard output (see below). If an object is not a string,
it is first converted to a string using the rules for string
conversions. The (resulting or original) string is then written. A
space is written before each object is (converted and) written, unless
the output system believes it is positioned at the beginning of a
line. This is the case: (1) when no characters have yet been written
to standard output; or (2) when the last character written to standard
output is \verb/\n/; or (3) when the last write operation on standard
output was not a \verb\print\ statement. (In some cases it may be
useful to write an empty string to standard output for this
reason.)

A \verb/"\n"/ character is written at the end, unless the \verb\print\
statement ends with a comma. This is the only action if the statement
contains just the keyword \verb\print\.

Standard output is defined as the file object named \verb\stdout\
in the built-in module \verb\sys\. If no such object exists,
or if it is not a writable file, a \verb\RuntimeError\ exception is raised.
(The original implementation attempts to write to the system's original
standard output instead, but this is not safe, and should be fixed.)

\section{The \verb\return\ statement}

\begin{verbatim}
return_stmt: "return" [condition_list]
\end{verbatim}

\verb\return\ may only occur syntactically nested in a function
definition, not within a nested class definition.

If a condition list is present, it is evaluated, else \verb\None\
is substituted.

\verb\return\ leaves the current function call with the condition
list (or \verb\None\) as return value.

When \verb\return\ passes control out of a \verb\try\ statement
with a \verb\finally\ clause, that finally clause is executed
before really leaving the function.
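
The interaction between \verb\return\ and a \verb\finally\ clause can
be observed by recording the order of events. This is a sketch only;
the function names \verb\lookup\ and \verb\no_value\ and the list
\verb\events\ are illustrative.

\begin{verbatim}
# Illustrative names; not taken from the manual.
events = []

def lookup(table, key):
    try:
        events.append('try')
        return table[key]           # control starts to leave here ...
    finally:
        events.append('finally')    # ... but this runs first

def no_value():
    return                          # no condition list: None is used

result = lookup({'a': 1}, 'a')
events.append('after call')
nothing = no_value()
\end{verbatim}

The \verb\finally\ entry is recorded before the caller continues, and
the return value computed in the \verb\try\ clause is still delivered.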

\section{The \verb\raise\ statement}

\begin{verbatim}
raise_stmt: "raise" condition ["," condition]
\end{verbatim}

\verb\raise\ evaluates its first condition, which must yield
a string object. If there is a second condition, this is evaluated,
else \verb\None\ is substituted.

It then raises the exception identified by the first object,
with the second one (or \verb\None\) as its parameter.

\section{The \verb\break\ statement}

\begin{verbatim}
break_stmt: "break"
\end{verbatim}

\verb\break\ may only occur syntactically nested in a \verb\for\
or \verb\while\ loop, not nested in a function or class definition.

It terminates the nearest enclosing loop, skipping the optional
\verb\else\ clause if the loop has one.

If a \verb\for\ loop is terminated by \verb\break\, the loop control
target keeps its current value.

When \verb\break\ passes control out of a \verb\try\ statement
with a \verb\finally\ clause, that finally clause is executed
before really leaving the loop.
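
A small sketch shows the three effects described above: the
\verb\else\ clause is skipped, the loop control target keeps its
value, and a \verb\finally\ clause still runs. The names \verb\log\
and \verb\last_value\ are illustrative only.

\begin{verbatim}
# Illustrative names; not taken from the manual.
log = []

for n in [2, 4, 7, 8]:
    try:
        if n % 2 == 1:
            break               # leaves the loop, skipping the else clause
    finally:
        log.append(n)           # also runs for the iteration that breaks
else:
    log.append('no odd number') # reached only if the loop was not broken

last_value = n                  # the loop target keeps its current value
\end{verbatim}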

\section{The \verb\continue\ statement}

\begin{verbatim}
continue_stmt: "continue"
\end{verbatim}

\verb\continue\ may only occur syntactically nested in a \verb\for\ or
\verb\while\ loop, not nested in a function or class definition, and
not nested in the \verb\try\ clause of a \verb\try\ statement with a
\verb\finally\ clause (it may occur nested in an \verb\except\ or
\verb\finally\ clause of a \verb\try\ statement though).

It continues with the next cycle of the nearest enclosing loop.
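
As a brief illustration (the name \verb\evens\ is illustrative only),
\verb\continue\ skips the remainder of the loop body for the current
item and proceeds with the next one:

\begin{verbatim}
# Illustrative names; not taken from the manual.
evens = []
for n in [1, 2, 3, 4, 5, 6]:
    if n % 2 == 1:
        continue        # skip the rest of the body for odd numbers
    evens.append(n)
\end{verbatim}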

\section{The \verb\import\ statement}

\begin{verbatim}
import_stmt: "import" identifier ("," identifier)*
        | "from" identifier "import" identifier ("," identifier)*
        | "from" identifier "import" "*"
\end{verbatim}

Import statements are executed in two steps: (1) find a module, and
initialize it if necessary; (2) define a name or names in the local
name space. The first form (without \verb\from\) repeats these steps
for each identifier in the list.

The system maintains a table of modules that have been initialized,
indexed by module name. (The current implementation makes this table
accessible as \verb\sys.modules\.) When a module name is found in
this table, step (1) is finished. If not, a search for a module
definition is started. This first looks for a built-in module
definition, and if no built-in module of the given name is found, it
searches a user-specified list of directories for a file whose name is
the module name with extension \verb\".py"\. (The current
implementation uses the list of strings \verb\sys.path\ as the search
path; it is initialized from the shell environment variable
\verb\$PYTHONPATH\, with an installation-dependent default.)
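
The module table can be inspected directly. The sketch below assumes
that a module named \verb\string\ providing the name \verb\digits\ is
available, as it is in current distributions; any other importable
module would serve equally well. The variable names are illustrative.

\begin{verbatim}
import sys

# Step (1) finds and, if necessary, initializes the module;
# step (2) binds the name 'string' in the local name space.
import string

in_table = 'string' in sys.modules
same_module = sys.modules['string'] is string

# A second import finds the module in the table, so step (1) is
# finished at once and the name is bound to the same module object.
first = string
import string
reused = (string is first)

# The "from" form binds the listed names instead of the module name.
from string import digits
\end{verbatim}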

If a built-in module is found, its built-in initialization code is
executed and step (1) is finished. If no matching file is found,
\verb\ImportError\ is raised (and step (2) is never started). If a file is
found, it is parsed. If a syntax error occurs, HIRO

\section{The \verb\global\ statement}

\begin{verbatim}
global_stmt: "global" identifier ("," identifier)*
\end{verbatim}

(XXX To be done.)

\chapter{Compound statements}

(XXX The semantic definitions of this chapter are still to be done.)

\begin{verbatim}
statement: stmt_list NEWLINE | compound_stmt
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | funcdef | classdef
suite: statement | NEWLINE INDENT statement+ DEDENT
\end{verbatim}

\section{The \verb\if\ statement}

\begin{verbatim}
if_stmt: "if" condition ":" suite
        ("elif" condition ":" suite)*
        ["else" ":" suite]
\end{verbatim}

\section{The \verb\while\ statement}

\begin{verbatim}
while_stmt: "while" condition ":" suite ["else" ":" suite]
\end{verbatim}

\section{The \verb\for\ statement}

\begin{verbatim}
for_stmt: "for" target_list "in" condition_list ":" suite
        ["else" ":" suite]
\end{verbatim}

\section{The \verb\try\ statement}

\begin{verbatim}
try_stmt: "try" ":" suite
        ("except" condition ["," condition] ":" suite)*
        ["finally" ":" suite]
\end{verbatim}

\section{Function definitions}

\begin{verbatim}
funcdef: "def" identifier "(" [parameter_list] ")" ":" suite
parameter_list: parameter ("," parameter)*
parameter: identifier | "(" parameter_list ")"
\end{verbatim}

\section{Class definitions}

\begin{verbatim}
classdef: "class" identifier [inheritance] ":" suite
inheritance: "(" expression ("," expression)* ")"
\end{verbatim}

XXX Syntax for scripts, modules
XXX Syntax for interactive input, eval, exec, input
XXX New definition of expressions (as conditions)

\end{document}