** Is your ice cream float bigger than mine
:PROPERTIES:
:categories: math
:date: 2013/05/27 07:46:36
:updated: 2013/05/28 08:59:01
:END:
Float numbers (i.e. the ones with decimals) cannot be perfectly represented in a computer. This can lead to some artifacts when you have to compare float numbers that on paper should be the same, but in silico are not. Let us look at some examples. In this example, we do some simple math that should result in an answer of 1, and then see if the answer is "equal" to one.
#+BEGIN_SRC python
print 3.0 * (1.0/3.0)
print 1.0 == 3.0 * (1.0/3.0)
#+END_SRC
#+RESULTS:
: 1.0
: True
Everything looks fine. Now, consider this example.
#+BEGIN_SRC python
print 49.0 * (1.0/49.0)
print 1.0 == 49.0 * (1.0/49.0)
#+END_SRC
#+RESULTS:
: 1.0
: False
The first line looks like everything is find, but the equality fails!
#+RESULTS:
: 1.0
: False
You can see here why the equality statement fails. We will print the two numbers to sixteen decimal places.
#+BEGIN_SRC python
print '{0:1.16f}'.format(49.0 * (1.0/49.0) )
print '{0:1.16f}'.format(1.0)
print 1 - 49.0 * (1.0/49.0)
#+END_SRC
#+RESULTS:
: 0.9999999999999999
: 1.0000000000000000
: 1.11022302463e-16
The two numbers actually are not equal to each other because of float math. They are /very, very/ close to each other, but not the same.
This leads to the idea of asking if two numbers are equal to each other within some tolerance. The question of what tolerance to use requires thought. Should it be an absolute tolerance? a relative tolerance? How large should the tolerance be? We will use the distance between 1 and the nearest floating point number (this is =eps= in Matlab). =numpy= can tell us this number with the =np.spacing= command.
Below, we implement a comparison function from [[https://doi.org/10.1107/S010876730302186X][doi:10.1107/S010876730302186X]] that allows comparisons with tolerance.
#+BEGIN_SRC python
# Implemented from Acta Crystallographica A60, 1-6 (2003). doi:10.1107/S010876730302186X
import numpy as np
print np.spacing(1)
def feq(x, y, epsilon):
'x == y'
return not((x < (y - epsilon)) or (y < (x - epsilon)))
print feq(1.0, 49.0 * (1.0/49.0), np.spacing(1))
#+END_SRC
#+RESULTS:
: 2.22044604925e-16
: True
For completeness, here are the other float comparison operators from that paper. We also show a few examples.
#+BEGIN_SRC python
import numpy as np
def flt(x, y, epsilon):
'x < y'
return x < (y - epsilon)
def fgt(x, y, epsilon):
'x > y'
return y < (x - epsilon)
def fle(x, y, epsilon):
'x <= y'
return not(y < (x - epsilon))
def fge(x, y, epsilon):
'x >= y'
return not(x < (y - epsilon))
print fge(1.0, 49.0 * (1.0/49.0), np.spacing(1))
print fle(1.0, 49.0 * (1.0/49.0), np.spacing(1))
print fgt(1.0 + np.spacing(1), 49.0 * (1.0/49.0), np.spacing(1))
print flt(1.0 - 2 * np.spacing(1), 49.0 * (1.0/49.0), np.spacing(1))
#+END_SRC
#+RESULTS:
: True
: True
: True
: True
As you can see, float comparisons can be tricky. You have to give a lot of thought to how to make the comparisons, and the functions shown above are not the only way to do it. You need to build in testing to make sure your comparisons are doing what you want.