Common Lisp the Language, 2nd Edition
In general, computations with floating-point numbers are only approximate. The precision of a floating-point number is not necessarily correlated at all with the accuracy of that number. For instance, 3.142857142857142857 is a more precise approximation to π than 3.14159, but the latter is more accurate. The precision refers to the number of bits retained in the representation. When an operation combines a short floating-point number with a long one, the result will be a long floating-point number. This rule is made to ensure that as much accuracy as possible is preserved; however, it is by no means a guarantee. Common Lisp numerical routines do assume, however, that the accuracy of an argument does not exceed its precision. Therefore when two small floating-point numbers are combined, the result will always be a small floating-point number. This assumption can be overridden by first explicitly converting a small floating-point number to a larger representation. (Common Lisp never converts automatically from a larger size to a smaller one.)
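The format-contagion rule described above is easy to observe at a read-eval-print loop. The following sketch assumes a typical implementation in which single-float and double-float are distinct formats:

```lisp
;; A single-float combined with a double-float yields a double-float:
(+ 0.5f0 0.25d0)       ; => 0.75d0

;; To give a small float more working precision, convert it explicitly
;; before combining (FLOAT with a float second argument, or COERCE):
(float 0.5f0 1.0d0)    ; => 0.5d0
(coerce 1/3 'double-float)

;; Common Lisp never converts automatically to a smaller format:
(+ 0.5d0 0.25f0)       ; => 0.75d0, never 0.75f0
```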
Rational computations cannot overflow in the usual sense (though of course there may not be enough storage to represent one), as integers and ratios may in principle be of any magnitude. Floating-point computations may get exponent overflow or underflow; this is an error.
X3J13 voted in June 1989 (FLOAT-UNDERFLOW) to address certain problems
relating to floating-point overflow and underflow, but certain parts of
the proposed solution were not adopted, namely to add the macro
without-floating-underflow-traps
to the language and to
require certain behavior of floating-point overflow and underflow. The
committee agreed that this area of the language requires more discussion
before a solution is standardized.
For the record, the proposal that was considered and rejected (for
the nonce) introduced a macro
without-floating-underflow-traps
that would execute its
body in such a way that, within its dynamic extent, a floating-point
underflow must not signal an error but instead must produce either a
denormalized number or zero as the result. The rejected proposal also
specified the following treatment of overflow and underflow:

A floating-point computation that overflows should signal an error of type floating-point-overflow.

Unless the computation occurs within the dynamic extent of a use of without-floating-underflow-traps, a floating-point computation that underflows should signal an error of type floating-point-underflow. A result that can be represented only in denormalized form must be considered an underflow in implementations that support denormalized floating-point numbers.

These points refer to the conditions floating-point-overflow and floating-point-underflow that were approved by X3J13 and are described in section 29.5.
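Although the required trapping behavior was not adopted, the condition types themselves are standard, so portable code can at least be prepared for them. A hedged sketch, assuming an implementation that signals floating-point-overflow rather than quietly producing an infinity (some implementations do the latter):

```lisp
;; Attempt a computation that overflows double-float range; handle the
;; standard condition if the implementation signals it.
(handler-case
    (* most-positive-double-float most-positive-double-float)
  (floating-point-overflow ()
    :overflow-trapped)
  (arithmetic-error ()
    :some-other-arithmetic-error))
```

In an implementation that does not trap, the call simply returns a floating-point infinity and neither handler runs.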
When rational and floating-point numbers are compared or combined by
a numerical function, the rule of floating-point contagion is
followed: when a rational meets a floating-point number, the rational is
first converted to a floating-point number of the same format. For
functions such as +
that take more than two arguments, it
may be that part of the operation is carried out exactly using rationals
and then the rest is done using floating-point arithmetic.
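The combining rule can be seen directly; in this sketch the printed values assume the usual default single-float read format:

```lisp
;; The rational 1/2 is first converted to a single-float:
(+ 1/2 0.5)     ; => 1.0

;; The integer 2 is converted to a double-float to match 3.5d0:
(* 2 3.5d0)     ; => 7.0d0
```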
X3J13 voted in January 1989 (CONTAGION-ON-NUMERICAL-COMPARISONS) to
apply the rule of floating-point contagion stated above to the case of
combining rational and floating-point numbers. For
comparing, the following rule is to be used instead: When a
rational number and a floating-point number are to be compared by a
numerical function, in effect the floating-point number is first
converted to a rational number as if by the function rational, and then an exact comparison of two rational numbers is performed. It is of course valid to use a more efficient implementation than actually calling the function rational, as long as the result of the comparison is the same. In the case of complex numbers, the real and imaginary parts are handled separately.
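The practical effect of the exact-comparison rule shows up with a value such as 0.1, which has no exact binary floating-point representation. A sketch, assuming IEEE-style binary floats:

```lisp
;; The comparison converts the float to the exact rational it denotes:
(rational 0.1)   ; a ratio close to, but not equal to, 1/10

;; So the comparison is exact, and fails:
(= 1/10 0.1)     ; => NIL

;; Combining, by contrast, obeys floating-point contagion:
(+ 1/10 0.1)     ; a single-float near 0.2
```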
Rationale: In general, accuracy cannot be preserved
in combining operations, but it can be preserved in comparisons, and
preserving it makes that part of Common Lisp algebraically a bit more
tractable. In particular, this change prevents the breakdown of
transitivity. Let a be the result of (/ 10.0 single-float-epsilon), and let j be the result of (floor a). (Note that (= a (+ a 1.0)) is true, by the definition of single-float-epsilon.) Under the old rules, all of (<= a j), (< j (+ j 1)), and (<= (+ j 1) a) would be true; transitivity would then imply that (< a a) ought to be true, but of course it is false, and therefore transitivity fails. Under the new rule, however, (<= (+ j 1) a) is false.
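The rationale's scenario can be reproduced directly; the outcomes noted in the comments are those the text claims under the new comparison rule:

```lisp
(let* ((a (/ 10.0 single-float-epsilon))
       (j (floor a)))
  (list (= a (+ a 1.0))    ; T   -- by the definition of single-float-epsilon
        (< j (+ j 1))      ; T   -- exact integer comparison
        (<= (+ j 1) a)))   ; NIL -- exact comparison; transitivity is preserved
```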
For functions that are mathematically associative (and possibly commutative), a Common Lisp implementation may process the arguments in any manner consistent with associative (and possibly commutative) rearrangement. This does not affect the order in which the argument forms are evaluated, of course; that order is always left to right, as in all Common Lisp function calls. What is left loose is the order in which the argument values are processed. The point of all this is that implementations may differ in which automatic coercions are applied because of differing orders of argument processing. As an example, consider this expression:
(+ 1/3 2/3 1.0D0 1.0 1.0E-15)
One implementation might process the arguments from left to right, first adding 1/3 and 2/3 to get 1, then converting that to a double-precision floating-point number for combination with 1.0D0, then successively converting and adding 1.0 and 1.0E-15. Another implementation might process the arguments from right to left, first performing a single-precision floating-point addition of 1.0 and 1.0E-15 (and probably losing some accuracy in the process!), then converting the sum to double precision and adding 1.0D0, then converting 2/3 to double-precision floating-point and adding it, and then converting 1/3 and adding that. A third implementation might first scan all the arguments, process all the rationals first to keep that part of the computation exact, then find an argument of the largest floating-point format among all the arguments and add that, and then add in all other arguments, converting each in turn (all in a perhaps misguided attempt to make the computation as accurate as possible). In any case, all three strategies are legitimate. The user can of course control the order of processing explicitly by writing several calls; for example:
(+ (+ 1/3 2/3) (+ 1.0D0 1.0E-15) 1.0)
The user can also control all coercions simply by writing calls to coercion functions explicitly.
In general, then, the type of the result of a numerical function is a floating-point number of the largest format among all the floating-point arguments to the function; but if the arguments are all rational, then the result is rational (except for functions that can produce mathematically irrational results, in which case a single-format floating-point number may result).
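These result-type rules can be observed directly. A few illustrative calls; the printed values assume a typical implementation whose default float format is single-float:

```lisp
(+ 1/3 1/6)          ; => 1/2    -- all-rational arguments give a rational result
(+ 1/2 1.5 1.0d0)    ; => 3.0d0  -- the largest float format among the arguments wins
(sqrt 2)             ; a single-float near 1.4142135 -- an irrational result
                     ; from rational arguments comes back as a float
```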
There is a separate rule of complex contagion. As a rule, complex numbers never result from a numerical function unless one or more of the arguments is complex. (Exceptions to this rule occur among the irrational and transcendental functions, specifically expt, log, sqrt, asin, acos, acosh, and atanh; see section 12.5.) When a non-complex number meets a complex number, the non-complex number is in effect first converted to a complex number by providing an imaginary part of zero.
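Both halves of the complex contagion rule can be seen in a short transcript:

```lisp
;; The integer 1 is in effect treated as #C(1 0):
(+ 1 #c(2 3))    ; => #C(3 3)

;; One of the exceptions: a complex result from a non-complex argument:
(sqrt -1.0)      ; => #C(0.0 1.0)
```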
If any computation produces a result that is a ratio of two integers such that the denominator evenly divides the numerator, then the result is immediately converted to the equivalent integer. This is called the rule of rational canonicalization.
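Rational canonicalization is visible whenever a division comes out even:

```lisp
(/ 10 5)     ; => 2, never the ratio 10/5
(/ 10 4)     ; => 5/2, since 4 does not evenly divide 10
(+ 1/3 2/3)  ; => 1, the ratio 3/3 canonicalized to an integer
```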
If the result of any computation would be a complex rational with a zero imaginary part, the result is immediately converted to a non-complex rational number by taking the real part. This is called the rule of complex canonicalization. Note that this rule does not apply to complex numbers whose components are floating-point numbers. Whereas #C(5 0) and 5 are not distinct values in Common Lisp (they are always eql), #C(5.0 0.0) and 5.0 are always distinct values in Common Lisp (they are never eql, although they are equalp).