Floating point represents real numbers in a fixed number of bits. Because it cannot represent every real number exactly, it is subject to two types of floating point error.

**1. Rounding error (linear)**

**2. Cancellation error (exponential)**

Performing complex mathematical computations requires numerous floating point operations, each of which is likely to make the result less accurate. Error builds up invisibly. **Current standards** for floating point have **no means of measuring and/or recording floating point rounding and cancellation error.**
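The two error types can be seen in a few lines of Python. This is a minimal sketch, assuming IEEE 754 double precision (what Python's `float` uses on typical hardware); the specific values are illustrative choices, not taken from the text above.

```python
# Rounding error: 0.1, 0.2 and 0.3 are not exactly representable in binary,
# so the computed sum differs slightly from the stored value of 0.3.
rounding_gap = (0.1 + 0.2) - 0.3

# Cancellation error: subtracting nearly equal values removes the leading
# digits that agree, so earlier rounding dominates what is left.
x = 1e-8
naive = (1.0 + x) - 1.0          # mathematically equal to x
relative_error = abs(naive - x) / x

print("rounding gap:", rounding_gap)
print("relative error after cancellation:", relative_error)
```

The rounding gap is on the order of one unit in the last place, while the relative error after cancellation is many orders of magnitude larger, and nothing in the result itself signals the difference.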

# Real Consequences

Since the invention of computers, real number calculations have produced hidden, unreported errors, sometimes catastrophically.

### Patriot Missile Failure

The most notorious floating point catastrophe was the Patriot missile failure at Dhahran, Saudi Arabia, on February 25, 1991, when a Patriot missile failed to destroy a Scud missile and 128 U.S. soldiers were killed or wounded as a result. This was the greatest combat loss in an Army unit since Vietnam. After 100 hours of operation the system clock held 3,600,000 tenths of a second, and converting that count to floating point introduced an undetected error that caused the missile guidance software to locate the Scud missile incorrectly.
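The size of the clock error can be reproduced with exact rational arithmetic. The sketch below follows the widely cited analysis of the incident, which assumes the constant 1/10 was chopped to 23 fractional bits in a 24-bit register; `FRACTION_BITS` and the variable names are assumptions of this sketch, not details given above.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumption: fractional bits kept after chopping 1/10

tenth = Fraction(1, 10)
scale = 2 ** FRACTION_BITS

# int() truncates toward zero, which models chopping rather than rounding.
stored_tenth = Fraction(int(tenth * scale), scale)
error_per_tick = tenth - stored_tenth

ticks = 100 * 60 * 60 * 10          # 100 hours counted in tenths of a second
drift_seconds = float(error_per_tick * ticks)

print("ticks:", ticks)
print("clock drift in seconds:", drift_seconds)
```

Under these assumptions the accumulated drift is about a third of a second, which is a long time when tracking a missile travelling at well over a kilometer per second.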

### Ariane 5 Rocket, Flight 501

On June 4, 1996, about 40 seconds into flight and at an altitude of 3.7 kilometers, the maiden launch of the Ariane 5 rocket, Flight 501, ended in a RUD (colloquially, Rapid Unplanned Disassembly). Estimates of the loss of the rocket and cargo run as high as $500M. The cause of the failure was an inappropriate floating point conversion.
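The failure is generally attributed to converting a 64-bit floating point value to a 16-bit signed integer that could not hold it. The following Python sketch illustrates that class of problem only; the function names and the sample value are hypothetical and do not reproduce the flight software.

```python
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1

def to_int16_checked(value: float) -> int:
    """Convert to a signed 16-bit integer, raising if the value does not fit."""
    result = int(value)  # drops the fractional part
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result

def to_int16_wrapping(value: float) -> int:
    """Unchecked conversion that silently wraps modulo 2**16."""
    return (int(value) + 2**15) % 2**16 - 2**15

sample = 40000.0  # hypothetical out-of-range value
print("wrapped:", to_int16_wrapping(sample))
try:
    to_int16_checked(sample)
except OverflowError as exc:
    print("rejected:", exc)
```

Either outcome is a problem if the surrounding code does not expect it: the wrapped value is silently wrong, and the raised error must be handled somewhere.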

### Vancouver Stock Exchange

In January 1982 the Vancouver Stock Exchange introduced a stock index that accumulated the total stock value of all 1,400 stocks listed on the exchange. The sum was truncated (rounded down) each time it was recalculated, up to 3,000 times per day. As a result the index lost about 25 points per month for about 23 months, until it read 524.811 when the actual value was 1098.892.
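The effect of truncating on every update can be simulated. This is a rough sketch, not a reconstruction of the exchange's software: the starting value of 1000, the size of each random change, and the seed are all assumptions chosen for illustration.

```python
import random
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

STEP = Decimal("0.001")  # the index is kept to three decimal places

def simulate(updates: int, seed: int = 0):
    """Apply the same random changes to an exact, truncated and rounded index."""
    rng = random.Random(seed)
    exact = truncated = rounded = Decimal("1000")
    for _ in range(updates):
        # Hypothetical small change per recalculation, centered on zero.
        delta = Decimal(str(rng.uniform(-0.01, 0.01)))
        exact += delta
        truncated = (truncated + delta).quantize(STEP, rounding=ROUND_DOWN)
        rounded = (rounded + delta).quantize(STEP, rounding=ROUND_HALF_EVEN)
    return exact, truncated, rounded

exact, truncated, rounded = simulate(3000)  # roughly one trading day of updates
print("loss from truncation:", exact - truncated)
print("drift from rounding: ", exact - rounded)
```

Each truncation discards half of the last kept digit on average, so the truncated index falls steadily behind, while rounding to nearest produces errors that largely cancel.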

# Standard Floating Point

### Why Floating Point?

Floating point was created because computers needed to represent real numbers in a fixed number of digits and to perform operations on them. One of the earliest programmable computers, the Zuse Z3, employed floating point in 1941.

### When did it become standard?

Floating point numbers, and the operations on them, are defined by an international standard, IEEE 754, first published in 1985.

### What is Floating Point Representation?

Computers represent real numbers (numbers with a fractional part) with a scheme similar to scientific notation. Floating point numbers have a fixed number of digits, so just as 1/3 (0.33333...) cannot be written exactly in a finite number of decimal digits, most real numbers with a fractional part are stored in a computer with some error.
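To make the representation concrete, the sketch below splits a double into its sign, exponent and fraction fields and shows the value actually stored for 0.1. It assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits); the helper name `double_fields` is introduced here for illustration.

```python
import struct
from decimal import Decimal
from fractions import Fraction

def double_fields(x: float):
    """Split a binary64 value into (sign, unbiased exponent, fraction bits)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023   # remove the exponent bias
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

sign, exponent, fraction = double_fields(0.1)
print(sign, exponent, hex(fraction))

# Decimal(float) converts exactly, revealing the value that is really stored.
print(Decimal(0.1))

# The representation error of 0.1 as an exact rational number.
representation_error = Fraction(0.1) - Fraction(1, 10)
print(float(representation_error))
```

The fraction field for 0.1 is a repeating bit pattern that had to be cut off and rounded, in the same way 0.33333... has to be cut off in decimal.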

### Why is there Error?

There is a flaw in the floating point standard. In most cases floating point numbers, and the operations on them, introduce an error which is undetected and can accumulate, even catastrophically, over a sequence of operations. As the calculation complexity increases, the small errors in the many floating point operations build up and are carried forward. The **floating point standard** provides **no means of detecting or measuring this error**.
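A short Python sketch of how error accumulates over a sequence of operations. It adds 0.1 one million times and measures the result against the exact rational sum of the stored values; the count and the use of `math.fsum` as a comparison are choices made for this illustration.

```python
import math
from fractions import Fraction

n = 1_000_000

naive = 0.0
for _ in range(n):
    naive += 0.1                     # every addition may round the running total

# Exact sum of the stored value of 0.1, computed without any rounding.
exact = Fraction(0.1) * n

naive_error = abs(Fraction(naive) - exact)
fsum_error = abs(Fraction(math.fsum([0.1] * n)) - exact)

print("error of the simple loop:", float(naive_error))
print("error of math.fsum:      ", float(fsum_error))
```

The simple loop returns an ordinary-looking number with no indication of how much error it carries. `math.fsum` reduces the error for this one operation, but it does not report the error either.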

### What devices are affected?

Nearly all general purpose computers constructed today have hardware or software implementations of mathematical operations on floating point numbers. Failure to obtain a correct result occurs on your cell phone, your tablet, your notebook, and your personal computer. Even Google Calculator.

### Prove to me that there is error

Open the calculator on your phone or computer. Calculate the square root of π and then square that result. The display shows π. Now subtract π, like this:

(√π)² − π =

The result should be zero... but it is not.
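The same experiment can be run in Python. This is a minimal sketch assuming IEEE 754 double precision; the exact residual depends on the platform and on how many digits a given calculator carries internally, so the value is printed rather than quoted here. The helper name is introduced for illustration.

```python
import math

def round_trip_residual(x: float) -> float:
    """Return (sqrt(x))**2 - x as evaluated in double precision."""
    root = math.sqrt(x)      # rounded to the nearest double
    return root * root - x   # squaring rounds again

residual_pi = round_trip_residual(math.pi)
residual_two = round_trip_residual(2.0)

print("(sqrt(pi))^2 - pi =", residual_pi)
print("(sqrt(2))^2 - 2   =", residual_two)
```

For inputs such as 2, the residual is a tiny nonzero number rather than zero, even though every individual operation was rounded correctly.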

### Is this a program bug?

No! It is in the very design of the computer hardware. The international floating point standard specifies that computers should be designed this way.

# Current Approaches to Floating Point Error

### Static Error Analysis

**Static error analysis** requires significant mathematical analysis and cannot determine actual error in real time. This work must be done by highly skilled mathematician programmers and is only applied to critical projects because of the greatly increased cost and time required.
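As an illustration of what such an analysis produces, consider summing $n$ numbers from left to right. The notation below ($u$ for the unit roundoff, $\hat{s}_n$ for the computed sum) is assumed here and follows common textbook usage. Under the standard model, each operation satisfies

$$
\operatorname{fl}(a \circ b) = (a \circ b)(1 + \delta), \qquad |\delta| \le u,
$$

and working this through the $n - 1$ additions gives the a priori bound

$$
|\hat{s}_n - s_n| \le \gamma_{n-1} \sum_{i=1}^{n} |x_i|, \qquad \gamma_k = \frac{k u}{1 - k u} \quad (k u < 1).
$$

This is a worst-case bound derived on paper before the program runs. It says nothing about the error actually present in a particular result.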

### Avoidance

**Avoidance** of some error can be achieved by teaching programmers to be alert and avoid practices that amplify floating point errors. But this provides no assurance of accuracy.
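One example of such a practice is computing e^x − 1 for small x. The sketch below compares the direct formula with Python's `math.expm1`, which is designed to avoid the cancellation; the input value and the short series used as a reference are assumptions of this illustration.

```python
import math

x = 1e-12

naive = math.exp(x) - 1.0     # exp(x) is close to 1, so the subtraction cancels
careful = math.expm1(x)       # library routine that avoids the cancellation

# For very small x the first terms of the series are an accurate reference.
reference = x + x * x / 2.0 + x ** 3 / 6.0

naive_rel_error = abs(naive - reference) / reference
careful_rel_error = abs(careful - reference) / reference

print("relative error, exp(x) - 1:", naive_rel_error)
print("relative error, expm1(x):  ", careful_rel_error)
```

The rewrite helps only because the programmer recognized the problem in advance. Neither version reports how accurate its result is.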

### Dynamic error analysis

**Dynamic error analysis** by means of error injection requires multiple executions of an algorithm. It is of little use with adaptive algorithms or when error information is required in real time.
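One possible form of error injection is sketched below: a small random relative error is applied after each operation, the computation is repeated many times, and the spread of the results is used to estimate how many digits can be trusted. The function names, the test expression (1 − cos x)/x², the perturbation size and the run count are all assumptions of this sketch rather than a description of any particular tool.

```python
import math
import random
import statistics

EPS = 2.0 ** -52  # spacing of doubles just above 1.0

def perturb(value: float, rng: random.Random) -> float:
    """Inject a random relative error of about one unit in the last place."""
    return value * (1.0 + rng.uniform(-EPS, EPS))

def one_minus_cos_over_x2(x: float, rng: random.Random) -> float:
    """Evaluate (1 - cos x) / x**2 with a perturbation after each step."""
    c = perturb(math.cos(x), rng)
    numerator = perturb(1.0 - c, rng)
    denominator = perturb(x * x, rng)
    return perturb(numerator / denominator, rng)

def estimated_digits(x: float, runs: int = 200, seed: int = 0) -> float:
    """Estimate trustworthy decimal digits from the spread of repeated runs."""
    rng = random.Random(seed)
    samples = [one_minus_cos_over_x2(x, rng) for _ in range(runs)]
    spread = statistics.pstdev(samples)
    if spread == 0.0:
        return float("inf")
    return -math.log10(spread / abs(statistics.fmean(samples)))

digits_benign = estimated_digits(1.0)
digits_cancelling = estimated_digits(1e-5)
print("x = 1.0 :", digits_benign)
print("x = 1e-5:", digits_cancelling)
```

The estimate exposes the cancellation at x = 1e-5, but only at the cost of 200 evaluations per input, which is the limitation described above.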

### Interval arithmetic

**Interval arithmetic** provides a means of computing bounds for floating point computations, but it requires greatly increased computation time and at least twice as much storage.
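A minimal interval arithmetic sketch in Python follows. Each value is stored as a lower and an upper bound, and every result is widened outward by one representable step with `math.nextafter` (Python 3.9 or later) so that it still encloses the true answer. The `Interval` class is hypothetical and supports only addition, subtraction and multiplication.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    @staticmethod
    def _widen(lo: float, hi: float) -> "Interval":
        # Move each bound outward by one double to absorb rounding.
        return Interval(math.nextafter(lo, -math.inf),
                        math.nextafter(hi, math.inf))

    def __add__(self, other: "Interval") -> "Interval":
        return Interval._widen(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        return Interval._widen(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        products = [a * b for a in (self.lo, self.hi)
                    for b in (other.lo, other.hi)]
        return Interval._widen(min(products), max(products))

    @property
    def width(self) -> float:
        return self.hi - self.lo

# An interval that encloses the real number 1/10, which is not a double.
tenth = Interval(math.nextafter(0.1, 0.0), math.nextafter(0.1, 1.0))

total = Interval(0.0, 0.0)
for _ in range(10):
    total = total + tenth

print(total, "width:", total.width)
```

The final interval is guaranteed to contain the true sum of 1, and its width bounds the error. The cost is visible in the code: two stored values and at least two operations for every one in the original computation.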

### Additional bits

**Using additional bits** to represent real numbers is a commonly proposed solution. It significantly increases computation time and required storage. Though the result is likely to be more accurate, it can still be in error, and nothing indicates whether the value has any significance at all: the result may have lost all significant bits!
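The limitation can be demonstrated with Python's `decimal` module, which allows the working precision to be chosen. The sketch evaluates (10^k + 1) − 10^k, whose true value is 1, at different precisions; the function name and the chosen exponents and precisions are assumptions made for this illustration.

```python
from decimal import Decimal, localcontext

def big_plus_one_minus_big(exponent: int, digits: int) -> Decimal:
    """Evaluate (10**exponent + 1) - 10**exponent using `digits` of precision."""
    with localcontext() as ctx:
        ctx.prec = digits
        big = Decimal(10) ** exponent
        return (big + 1) - big   # the true answer is always 1

for exponent, digits in [(20, 16), (20, 34), (40, 34)]:
    result = big_plus_one_minus_big(exponent, digits)
    print(f"10^{exponent} with {digits} digits ->", result)
```

Raising the precision from 16 to 34 digits fixes the first case, yet a larger input defeats the higher precision in the same way, and in both failing cases the result is returned without any warning that it has no correct digits.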