mirror of https://github.com/python/cpython.git
Just another intermediate version...
parent 1c462adaa8
commit 7b632a6073
128
Doc/ref.tex
128
Doc/ref.tex
@@ -1,5 +1,5 @@
 % Format this file with latex.
 
 \documentstyle[myformat]{report}
 
 \title{\bf
@@ -65,17 +65,18 @@ rather than formal specifications for everything except syntax and
 lexical analysis. This should make the document better understandable
 to the average reader, but will leave room for ambiguities.
 Consequently, if you were coming from Mars and tried to re-implement
-Python from this document alone, you might in fact be implementing
-quite a different language. On the other hand, if you are using
+Python from this document alone, you might have to guess things and in
+fact you would be implementing quite a different language.
+On the other hand, if you are using
 Python and wonder what the precise rules about a particular area of
-the language are, you should be able to find it here.
+the language are, you should definitely be able to find it here.
 
 It is dangerous to add too many implementation details to a language
 reference document -- the implementation may change, and other
 implementations of the same language may work differently. On the
 other hand, there is currently only one Python implementation, and
-particular quirks of it are sometimes worth mentioning, especially
-where it differs from the ``ideal'' specification.
+its particular quirks are sometimes worth being mentioned, especially
+where the implementation imposes additional limitations.
 
 Every Python implementation comes with a number of built-in and
 standard modules. These are not documented here, but in the separate
@@ -93,20 +94,20 @@ name: lcletter (lcletter | "_")*
 lcletter: "a"..."z"
 \end{verbatim}
 
-The first line says that a \verb\name\ is a \verb\lcletter\ followed by
-a sequence of zero or more \verb\lcletter\s and underscores. A
+The first line says that a \verb\name\ is an \verb\lcletter\ followed by
+a sequence of zero or more \verb\lcletter\s and underscores. An
 \verb\lcletter\ in turn is any of the single characters `a' through `z'.
 (This rule is actually adhered to for the names defined in syntax and
 grammar rules in this document.)
 
 Each rule begins with a name (which is the name defined by the rule)
-followed by a colon. Each rule is wholly contained on one line. A
-vertical bar (\verb\|\) is used to separate alternatives, it is the
-least binding operator in this notation. A star (\verb\*\) means zero
-or more repetitions of the preceding item; likewise, a plus (\verb\+\)
-means one or more repetitions and a question mark (\verb\?\) zero or
-one (in other words, the preceding item is optional). These three
-operators bind as tight as possible; parentheses are used for
+and a colon, and is wholly contained on one line. A vertical bar
+(\verb\|\) is used to separate alternatives; it is the least binding
+operator in this notation. A star (\verb\*\) means zero or more
+repetitions of the preceding item; likewise, a plus (\verb\+\) means
+one or more repetitions, and a question mark (\verb\?\) zero or one
+(in other words, the preceding item is optional). These three
+operators bind as tightly as possible; parentheses are used for
 grouping. Literal strings are enclosed in double quotes. White space
 is only meaningful to separate tokens.
 
@@ -117,7 +118,7 @@ characters. A phrase between angular brackets (\verb\<...>\) gives an
 informal description of the symbol defined; e.g., this could be used
 to describe the notion of `control character' if needed.
 
-Although the notation used is almost the same, there is a big
+Even though the notation used is almost the same, there is a big
 difference between the meaning of lexical and syntactic definitions:
 a lexical definition operates on the individual characters of the
 input source, while a syntax definition operates on the stream of
@@ -131,22 +132,22 @@ chapter describes how the lexical analyzer breaks a file into tokens.
 
 \section{Line structure}
 
-A Python program is divided in a number of logical lines. Statements
-do not straddle logical line boundaries except where explicitly
-indicated by the syntax (i.e., for compound statements). To this
-purpose, the end of a logical line is represented by the token
-NEWLINE.
+A Python program is divided in a number of logical lines. The end of
+a logical line is represented by the token NEWLINE. Statements cannot
+cross logical line boundaries except where NEWLINE is allowed by the
+syntax (e.g., between statements in compound statements).
 
 \subsection{Comments}
 
 A comment starts with a hash character (\verb\#\) that is not part of
-a string literal, and ends at the end of the physical line. Comments
-are ignored by the syntax.
+a string literal, and ends at the end of the physical line. A comment
+always signifies the end of the logical line. Comments are ignored by
+the syntax.
 
 \subsection{Line joining}
 
 Two or more physical lines may be joined into logical lines using
-backslash characters (\verb/\/), as follows: When physical line ends
+backslash characters (\verb/\/), as follows: when a physical line ends
 in a backslash that is not part of a string literal or comment, it is
 joined with the following forming a single logical line, deleting the
 backslash and the following end-of-line character.
@@ -160,13 +161,14 @@ terminates a multi-line statement.
 
 \subsection{Indentation}
 
-Spaces and tabs at the beginning of a logical line are used to compute
-the indentation level of the line, which in turn is used to determine
-the grouping of statements.
+Leading whitespace (spaces and tabs) at the beginning of a logical
+line is used to compute the indentation level of the line, which in
+turn is used to determine the grouping of statements.
 
-First, each tab is replaced by one to eight spaces such that the total
-number of spaces up to that point is a multiple of eight. The total
-number of spaces preceding the first non-blank character then
+First, tabs are replaced (from left to right) by one to eight spaces
+such that the total number of characters up to there is a multiple of
+eight (this is intended to be the same rule as used by UNIX). The
+total number of spaces preceding the first non-blank character then
 determines the line's indentation. Indentation cannot be split over
 multiple physical lines using backslashes.
 
@@ -185,6 +187,38 @@ popped off, and for each number popped off a DEDENT token is
 generated. At the end of the file, a DEDENT token is generated for
 each number remaining on the stack that is larger than zero.
 
+Here is an example of a correctly (though confusingly) indented piece
+of Python code:
+
+\begin{verbatim}
+def perm(l):
+    if len(l) <= 1:
+                  return [l]
+    r = []
+    for i in range(len(l)):
+             s = l[:i] + l[i+1:]
+             p = perm(s)
+             for x in p:
+              r.append(l[i:i+1] + x)
+    return r
+\end{verbatim}
+
+The following example shows various indentation errors:
+
+\begin{verbatim}
+     def perm(l):                       # error: first line indented
+    for i in range(len(l)):             # error: not indented
+        s = l[:i] + l[i+1:]
+            p = perm(l[:i] + l[i+1:])   # error: unexpected indent
+            for x in p:
+                    r.append(l[i:i+1] + x)
+                return r                # error: inconsistent indent
+\end{verbatim}
+
+(Actually, the first three errors are detected by the parser; only the
+last error is found by the lexical analyzer -- the indentation of
+\verb\return r\ does not match a level popped off the stack.)
+
 \section{Other tokens}
 
 Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
@@ -205,12 +239,13 @@ uppercase: "A"..."Z"
 digit: "0"..."9"
 \end{verbatim}
 
-Identifiers are unlimited in length. Case is significant.
+Identifiers are unlimited in length. Case is significant. Keywords
+are not identifiers.
 
 \section{Keywords}
 
 The following identifiers are used as reserved words, or {\em
-keywords} of the language, and may not be used as ordinary
+keywords} of the language, and cannot be used as ordinary
 identifiers. They must be spelled exactly as written here:
 
 \begin{verbatim}
@@ -260,7 +295,7 @@ are:
 \verb/\'/ & Single quote (\verb/'/) \\
 \verb/\a/ & ASCII Bell (BEL) \\
 \verb/\b/ & ASCII Backspace (BS) \\
-\verb/\E/ & ASCII Escape (ESC) \\
+%\verb/\E/ & ASCII Escape (ESC) \\
 \verb/\f/ & ASCII Formfeed (FF) \\
 \verb/\n/ & ASCII Linefeed (LF) \\
 \verb/\r/ & ASCII Carriage Return (CR) \\
@@ -272,13 +307,13 @@ are:
 \end{tabular}
 \end{center}
 
-For compatibility with in Standard C, up to three octal digits are
+In strict compatibility with in Standard C, up to three octal digits are
 accepted, but an unlimited number of hex digits is taken to be part of
 the hex escape (and then the lower 8 bits of the resulting hex number
-are used...).
+are used in all current implementations...).
 
-All unrecognized escape sequences are left in the string {\em
-unchanged}, i.e., the backslash is left in the string. (This rule is
+All unrecognized escape sequences are left in the string unchanged,
+i.e., {\em the backslash is left in the string.} (This rule is
 useful when debugging: if an escape sequence is mistyped, the
 resulting output is more easily recognized as broken. It also helps a
 great deal for string literals used as regular expressions or
@@ -313,6 +348,18 @@ fraction: "." digit+
 exponent: ("e"|"E") ["+"|"-"] digit+
 \end{verbatim}
 
+Some examples of numeric literals:
+
+\begin{verbatim}
+1 1234567890 0177777 0x80000
+
+
+\end{verbatim}
+
+Note that the definitions for literals do not include a sign; a phrase
+like \verb\-1\ is actually an expression composed of the operator
+\verb\-\ and the literal \verb\1\.
+
 \section{Operators}
 
 The following tokens are operators:
@@ -336,13 +383,16 @@ meaning:
 ; , : . ` =
 \end{verbatim}
 
-The following printing ASCII characters are currently not used;
-their occurrence is an unconditional error:
+The following printing ASCII characters are not used in Python (except
+in string literals and in comments). Their occurrence is an
+unconditional error:
 
 \begin{verbatim}
 ! @ $ " ?
 \end{verbatim}
 
+They may be used by future versions of the language though!
+
 \chapter{Execution model}
 
 (XXX This chapter should explain the general model of the execution of
128
Doc/ref/ref.tex
128
Doc/ref/ref.tex
(The changes to this file are identical to those shown for Doc/ref.tex above.)
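The indentation rules in the Indentation subsection of this diff (tab expansion to a multiple of eight, then a stack of indentation levels that yields INDENT and DEDENT tokens) can be sketched in Python as below. This is only an illustrative sketch, not the tokenizer shipped with the interpreter. The function names `indentation_width` and `indent_tokens` are hypothetical. The push step for INDENT is not visible in the hunks above and is assumed here, as is the choice to skip blank lines and to report a mismatch as `SyntaxError`.

```python
def indentation_width(line, tabsize=8):
    """Return the column of the first non-blank character of `line`.

    Tabs are expanded left to right: each tab advances the column to the
    next multiple of `tabsize`, i.e. it counts as one to eight spaces.
    """
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            col = (col // tabsize + 1) * tabsize
        else:
            break
    return col


def indent_tokens(lines):
    """Yield 'INDENT', 'DEDENT' and 'NEWLINE' markers for logical lines.

    Assumption: the stack starts with a single zero and a larger
    indentation is pushed (this part of the text is outside the diff).
    """
    stack = [0]
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines generate no tokens
        width = indentation_width(line)
        if width > stack[-1]:
            stack.append(width)
            yield "INDENT"
        else:
            # Pop every level larger than the current indentation;
            # each number popped off generates one DEDENT.
            while width < stack[-1]:
                stack.pop()
                yield "DEDENT"
            if width != stack[-1]:
                # The indentation matches no level left on the stack.
                raise SyntaxError("inconsistent indent: %r" % line)
        yield "NEWLINE"
    # At end of file: one DEDENT per remaining number larger than zero.
    while stack[-1] > 0:
        stack.pop()
        yield "DEDENT"
```

For example, `list(indent_tokens(["def f():", "\tif x:", "\t    y = 1", "z = 2"]))` sees indentation widths 0, 8, 12 and 0, so the last line pops two levels and produces two DEDENT markers before its NEWLINE.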