mirror of https://github.com/python/cpython.git
Text from Tim & Guido discussing floating point arithmetic and what users
need to understand about binary and decimal floating point, so that the
representation weirdness is documented somewhere. This makes it easier to
respond to "bug" reports caused by user confusion and ignorance of the
issues. This closes SF patch #426208.
This commit is contained in:
parent 1dc98c44fe
commit 417d667dd9
265 Doc/tut/tut.tex
@@ -4085,4 +4085,269 @@ check (or even suggest) matching parentheses, quotes, etc., would also
be useful.

\chapter{Floating Point Arithmetic: Issues and Limitations
\label{fp-issues}}

Floating-point numbers are represented in computer hardware as
base 2 (binary) fractions. For example, the decimal fraction

\begin{verbatim}
0.125
\end{verbatim}

has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction

\begin{verbatim}
0.001
\end{verbatim}

has value 0/2 + 0/4 + 1/8. These two fractions have identical values,
the only real difference being that the first is written in base 10
fractional notation, and the second in base 2.

Unfortunately, most decimal fractions cannot be represented exactly as
binary fractions. A consequence is that, in general, the decimal
floating-point numbers you enter are only approximated by the binary
floating-point numbers actually stored in the machine.
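
For example, adding two decimal literals need not produce the stored
binary approximation of their decimal sum. A small illustrative
snippet (the comparison result is guaranteed only on IEEE-754
hardware):

\begin{verbatim}
# 0.1 and 0.2 are each stored as binary approximations, so their
# sum need not equal the stored approximation of 0.3.
close_but_not_equal = (0.1 + 0.2 == 0.3)   # False on IEEE-754 doubles
\end{verbatim}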

The problem is easier to understand at first in base 10. Consider the
fraction 1/3. You can approximate that as a base 10 fraction:

\begin{verbatim}
0.3
\end{verbatim}

or, better,

\begin{verbatim}
0.33
\end{verbatim}

or, better,

\begin{verbatim}
0.333
\end{verbatim}

and so on. No matter how many digits you're willing to write down, the
result will never be exactly 1/3, but will be an increasingly better
approximation to 1/3.

In the same way, no matter how many base 2 digits you're willing to
use, the decimal value 0.1 cannot be represented exactly as a base 2
fraction. In base 2, 1/10 is the infinitely repeating fraction

\begin{verbatim}
0.0001100110011001100110011001100110011001100110011...
\end{verbatim}
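
If you want to reproduce this expansion yourself, base 2 long division
is easy to sketch. The helper below is purely illustrative (it is not
part of Python or of this tutorial's examples); it assumes only exact
integer arithmetic:

\begin{verbatim}
# Illustrative helper: the first n binary digits of the fraction p/q,
# found by repeated doubling, exactly as in long division.
def binary_digits(p, q, n):
    digits = []
    for _ in range(n):
        p *= 2                       # shift one binary place left
        digits.append(str(p // q))   # the next digit is the integer part
        p %= q                       # carry the remainder forward
    return "".join(digits)

first_twenty = "0." + binary_digits(1, 10, 20)
# first_twenty == "0.00011001100110011001"
\end{verbatim}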

Stop at any finite number of bits, and you get an approximation. This
is why you see things like:

\begin{verbatim}
>>> 0.1
0.10000000000000001
\end{verbatim}

On most machines today, that is what you'll see if you enter 0.1 at
a Python prompt. You may not, though, because the number of bits
used by the hardware to store floating-point values can vary across
machines, and Python only prints a decimal approximation to the true
decimal value of the binary approximation stored by the machine. On
most machines, if Python were to print the true decimal value of
the binary approximation stored for 0.1, it would have to display

\begin{verbatim}
>>> 0.1
0.1000000000000000055511151231257827021181583404541015625
\end{verbatim}

instead! The Python prompt (implicitly) uses the builtin
\function{repr()} function to obtain a string version of everything it
displays. For floats, \code{repr(\var{float})} rounds the true
decimal value to 17 significant digits, giving

\begin{verbatim}
0.10000000000000001
\end{verbatim}

\code{repr(\var{float})} produces 17 significant digits because it
turns out that's enough (on most machines) so that
\code{eval(repr(\var{x})) == \var{x}} exactly for all finite floats
\var{x}, but rounding to 16 digits is not enough to make that true.
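
You can exercise that round-trip property directly; the loop below is
a sketch (it assumes a 754-conforming platform with best-possible
string conversions, as discussed above):

\begin{verbatim}
# eval(repr(x)) should recover x exactly for any finite float x.
for x in (0.1, 1.0 / 3.0, 2.0 ** 0.5, 1e300, -7.25):
    assert eval(repr(x)) == x
\end{verbatim}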

Note that this is in the very nature of binary floating-point: this is
not a bug in Python, it is not a bug in your code either, and you'll
see the same kind of thing in all languages that support your
hardware's floating-point arithmetic.

Python's builtin \function{str()} function produces only 12
significant digits, and you may wish to use that instead. It's
unusual for \code{eval(str(\var{x}))} to reproduce \var{x}, but the
output may be more pleasant to look at:

\begin{verbatim}
>>> print str(0.1)
0.1
\end{verbatim}

It's important to realize that this is, in a real sense, an illusion:
the value in the machine is not exactly 1/10, you're simply rounding
the \emph{display} of the true machine value.

Other surprises follow from this one. For example, after seeing

\begin{verbatim}
>>> 0.1
0.10000000000000001
\end{verbatim}

you may be tempted to use the \function{round()} function to chop it
back to the single digit you expect. But that makes no difference:

\begin{verbatim}
>>> round(0.1, 1)
0.10000000000000001
\end{verbatim}

The problem is that the binary floating-point value stored for ``0.1''
was already the best possible binary approximation to 1/10, so trying
to round it again can't make it better: it was already as good as it
gets.

Another consequence is that since 0.1 is not exactly 1/10, adding 0.1
to itself 10 times may not yield exactly 1.0, either:

\begin{verbatim}
>>> sum = 0.0
>>> for i in range(10):
...     sum += 0.1
...
>>> sum
0.99999999999999989
\end{verbatim}
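
When such accumulated error matters, the usual idiom is to compare
against a tolerance rather than demand exact equality. A sketch (the
tolerance \code{1e-9} is an arbitrary illustrative choice):

\begin{verbatim}
total = 0.0
for i in range(10):
    total += 0.1

exactly_one = (total == 1.0)            # False: rounding error accumulated
close_to_one = abs(total - 1.0) < 1e-9  # True: well within tolerance
\end{verbatim}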

Binary floating-point arithmetic holds many surprises like this. The
problem with ``0.1'' is explained in precise detail below, in the
``Representation Error'' section. See
\citetitle[http://www.lahey.com/float.htm]{The Perils of Floating
Point} for a more complete account of other common surprises.

As that says near the end, ``there are no easy answers.'' Still,
don't be unduly wary of floating-point! The errors in Python float
operations are inherited from the floating-point hardware, and on most
machines are on the order of no more than 1 part in 2**53 per
operation. That's more than adequate for most tasks, but you do need
to keep in mind that it's not decimal arithmetic, and that every float
operation can suffer a new rounding error.

While pathological cases do exist, for most casual use of
floating-point arithmetic you'll see the result you expect in the end
if you simply round the display of your final results to the number of
decimal digits you expect. \function{str()} usually suffices, and for
finer control see the discussion of Python's \code{\%} format
operator: the \code{\%g}, \code{\%f} and \code{\%e} format codes
supply flexible and easy ways to round float results for display.
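
Applied to the drifting sum shown earlier, display rounding gives the
answer you expect (a sketch; the particular format codes chosen are
arbitrary):

\begin{verbatim}
total = 0.0
for i in range(10):
    total += 0.1                # total is slightly less than 1.0

as_f = "%.1f" % total           # '1.0'       fixed-point, 1 digit
as_g = "%g" % total             # '1'         general format
as_e = "%.2e" % total           # '1.00e+00'  scientific notation
\end{verbatim}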

\section{Representation Error
\label{fp-error}}

This section explains the ``0.1'' example in detail, and shows how
you can perform an exact analysis of cases like this yourself. Basic
familiarity with binary floating-point representation is assumed.

\dfn{Representation error} refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2)
fractions. This is the chief reason why Python (or Perl, C, \Cpp,
Java, Fortran, and many others) often won't display the exact decimal
number you expect:

\begin{verbatim}
>>> 0.1
0.10000000000000001
\end{verbatim}

Why is that? 1/10 is not exactly representable as a binary fraction.
Almost all machines today (November 2000) use IEEE-754 floating point
arithmetic, and almost all platforms map Python floats to IEEE-754
``double precision''. 754 doubles contain 53 bits of precision, so on
input the computer strives to convert 0.1 to the closest fraction it can
of the form \var{J}/2**\var{N} where \var{J} is an integer containing
exactly 53 bits. Rewriting

\begin{verbatim}
1 / 10 ~= J / (2**N)
\end{verbatim}

as

\begin{verbatim}
J ~= 2**N / 10
\end{verbatim}

and recalling that \var{J} has exactly 53 bits (is \code{>= 2**52} but
\code{< 2**53}), the best value for \var{N} is 56:

\begin{verbatim}
>>> 2L**52
4503599627370496L
>>> 2L**53
9007199254740992L
>>> 2L**56/10
7205759403792793L
\end{verbatim}

That is, 56 is the only value for \var{N} that leaves \var{J} with
exactly 53 bits. The best possible value for \var{J} is then that
quotient rounded:

\begin{verbatim}
>>> q, r = divmod(2L**56, 10)
>>> r
6L
\end{verbatim}

Since the remainder is more than half of 10, the best approximation is
obtained by rounding up:

\begin{verbatim}
>>> q+1
7205759403792794L
\end{verbatim}

Therefore the best possible approximation to 1/10 in 754 double
precision is that over 2**56, or

\begin{verbatim}
7205759403792794 / 72057594037927936
\end{verbatim}
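
The derivation can also be cross-checked mechanically. The snippet
below assumes a Python recent enough to ship the \module{fractions}
module (a later addition to the standard library, not available when
this section was written):

\begin{verbatim}
# Fraction(0.1) recovers the exact binary value stored for 0.1.
from fractions import Fraction

stored = Fraction(0.1)
assert stored == Fraction(7205759403792794, 2 ** 56)
\end{verbatim}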

Note that since we rounded up, this is actually a little bit larger than
1/10; if we had not rounded up, the quotient would have been a little
bit smaller than 1/10. But in no case can it be \emph{exactly} 1/10!

So the computer never ``sees'' 1/10: what it sees is the exact
fraction given above, the best 754 double approximation it can get:

\begin{verbatim}
>>> .1 * 2L**56
7205759403792794.0
\end{verbatim}

If we multiply that fraction by 10**30, we can see the (truncated)
value of its 30 most significant decimal digits:

\begin{verbatim}
>>> 7205759403792794L * 10L**30 / 2L**56
100000000000000005551115123125L
\end{verbatim}

meaning that the exact number stored in the computer is approximately
equal to the decimal value 0.100000000000000005551115123125. Rounding
that to 17 significant digits gives the 0.10000000000000001 that Python
displays (well, will display on any 754-conforming platform that does
best-possible input and output conversions in its C library --- yours may
not!).
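
The same exact decimal value can be recovered in one step on a Python
recent enough for the \module{decimal} module to accept a float
(again, a later addition to the standard library):

\begin{verbatim}
# Decimal(0.1) converts the stored binary value exactly, no rounding.
from decimal import Decimal

exact = str(Decimal(0.1))
# exact == "0.1000000000000000055511151231257827021181583404541015625"
\end{verbatim}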
\end{document}