\chapter{Introduction}

This reference manual describes the Python programming language.
It is not intended as a tutorial.

While I am trying to be as precise as possible, I chose to use English
rather than formal specifications for everything except syntax and
lexical analysis.  This should make the document more understandable
to the average reader, but will leave room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things and in
fact you would probably end up implementing quite a different language.
On the other hand, if you are using
Python and wonder what the precise rules about a particular area of
the language are, you should definitely be able to find them here.
If you would like to see a more formal definition of the language,
maybe you could volunteer your time, or invent a cloning machine
:-).

It is dangerous to add too many implementation details to a language
reference document: the implementation may change, and other
implementations of the same language may work differently.  On the
other hand, there is currently only one Python implementation in
widespread use (although a second one now exists!), and
its particular quirks are sometimes worth mentioning, especially
where the implementation imposes additional limitations.  Therefore,
you'll find short ``implementation notes'' sprinkled throughout the
text.

Every Python implementation comes with a number of built-in and
standard modules.  These are not documented here, but in the separate
{\em Python Library Reference} document.  A few built-in modules are
mentioned when they interact in a significant way with the language
definition.
\section{Notation}

The descriptions of lexical analysis and syntax use a modified BNF
grammar notation.  This uses the following style of definition:
\index{BNF}
\index{grammar}
\index{syntax}
\index{notation}

\begin{verbatim}
name: lc_letter (lc_letter | "_")*
lc_letter: "a"..."z"
\end{verbatim}

The first line says that a \code{name} is an \code{lc_letter} followed by
a sequence of zero or more \code{lc_letter}s and underscores.  An
\code{lc_letter} in turn is any of the single characters \character{a}
through \character{z}.  (The names defined in the lexical and grammar
rules of this document actually follow this rule.)
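
As an illustration only, the two example rules can be approximated by a
regular expression.  The sketch below assumes Python's \code{re} module;
the helper \code{is_name} and the constant \code{NAME} are made up for
this example and are not part of the language definition.

```python
import re

# Assumed translation of the two example rules into a regular expression:
#   lc_letter: "a"..."z"                 ->  [a-z]
#   name: lc_letter (lc_letter | "_")*   ->  [a-z][a-z_]*
NAME = re.compile(r"[a-z][a-z_]*")


def is_name(text):
    """Return True if the whole string matches the `name` rule."""
    return NAME.fullmatch(text) is not None
```

Under this reading, \code{lc_letter} itself is a valid \code{name},
while a string that starts with an underscore or contains an upper case
letter is not.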

Each rule begins with a name (which is the name defined by the rule)
and a colon.  A vertical bar (\code{|}) is used to separate
alternatives; it is the least binding operator in this notation.  A
star (\code{*}) means zero or more repetitions of the preceding item;
likewise, a plus (\code{+}) means one or more repetitions, and a
phrase enclosed in square brackets (\code{[ ]}) means zero or one
occurrence (in other words, the enclosed phrase is optional).  The
\code{*} and \code{+} operators bind as tightly as possible;
parentheses are used for grouping.  Literal strings are enclosed in
quotes.  White space is meaningful only as a separator between tokens.
Rules are normally contained on a single line; rules with many
alternatives may instead be formatted with each line after the
first beginning with a vertical bar.
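
The operators described above behave much like their counterparts in
regular expressions, which gives a quick way to experiment with them.
The following is a rough sketch, not a definition: the rules
\code{digit} and \code{version} are hypothetical and do not appear in
the Python grammar, and the helper \code{matches} is invented for the
example.

```python
import re

# Hypothetical rules, written in the manual's notation:
#   digit:   "0"..."9"
#   version: ["v"] digit+ ("." digit+)*
# [ ] becomes ?, + and * keep their meaning, ( ) groups.
VERSION = re.compile(r"v?[0-9]+(?:\.[0-9]+)*")

# Binding strength: in   "a" | "b" "c"*   the star applies only to "c",
# and the bar separates "a" from the whole sequence   "b" "c"*.
BINDING = re.compile(r"a|bc*")


def matches(pattern, text):
    """Return True if the whole of text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, \code{matches(BINDING, "bcc")} succeeds but
\code{matches(BINDING, "ac")} does not, because the vertical bar binds
less tightly than the sequence to its right.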

In lexical definitions (such as the example above), two more conventions
are used: two literal characters separated by three dots mean a choice
of any single character in the given (inclusive) range of \ASCII{}
characters.  A phrase between angle brackets (\code{<...>}) gives an
informal description of the symbol defined; e.g., this could be used
to describe the notion of `control character' if needed.
\index{lexical definitions}
\index{ASCII@\ASCII{}}
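
To make the inclusive range convention concrete, here is a small sketch
that expands a range such as \code{"a"..."z"} into the characters it
allows.  The function name \code{char_range} is an assumption made for
this example only.

```python
def char_range(first, last):
    """Expand  first...last  into the list of single characters it allows."""
    # Both endpoints are included, matching the notation's inclusive range.
    return [chr(code) for code in range(ord(first), ord(last) + 1)]


# The characters permitted by the rule   lc_letter: "a"..."z"
lc_letters = char_range("a", "z")
```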

Even though the notation used is almost the same, there is a big
difference between the meaning of lexical and syntactic definitions:
a lexical definition operates on the individual characters of the
input source, while a syntax definition operates on the stream of
tokens generated by the lexical analysis.  All uses of BNF in the next
chapter (``Lexical Analysis'') are lexical definitions; uses in
subsequent chapters are syntactic definitions.
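
One way to see the two levels side by side is to look at the token
stream that lexical analysis produces from a line of source text.  The
sketch below assumes the standard \code{tokenize} module of a current
Python interpreter; the helper \code{token_stream} is made up for this
illustration.

```python
import io
import tokenize


def token_stream(source):
    """Return (token type name, token text) pairs for a source string."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


# The input is a sequence of characters; the result is a sequence of tokens.
pairs = token_stream("x = 1 + 2\n")
```

The lexical definitions decide how the characters of the input are
grouped into the tokens \code{x}, \code{=}, \code{1}, \code{+} and
\code{2}; the syntactic definitions then apply to that sequence of
tokens, never to the individual characters.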