Have you ever come across a problem in Python where it seems like 2 doesn’t equal 2 anymore?
You may encounter this problem when you add two floating point numbers:
0.35 + 0.1
Python returns 0.44999999999999996. This doesn’t seem to make sense! The result should be 0.45! Or what about when we try to calculate a 12-period average in a Pandas dataframe?
This dataframe calculates the 12-period average of the “Numbers” column, and asks whether “Numbers” is greater than or equal to that 12-period average. Look at the last row:
- “Numbers” = 3.9
- “Average” = 3.9
- So why is “above or equal to 12 period average” False? It should be true! (3.9 = 3.9)
If you’ve ever stumbled across problems like this while coding, fear not. It’s not because Python is broken, has a bug, or 2 no longer equals 2. It happens because Python, like most programming languages, stores these numbers using binary floating point arithmetic rather than decimal math. Let me explain:
What is the problem, specifically?
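Computer hardware represents floating point numbers as binary fractions. This means that the fractional part of a number x is stored as a sum a/2 + b/4 + c/8 + d/16 + …, where each of a, b, c, d, … is either 0 or 1. Only a limited number of these terms can be kept (on typical hardware a Python float has 53 significant bits), so any number that needs more terms than that gets rounded.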
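To make the a/2 + b/4 + c/8 + … idea concrete, here is a minimal sketch that works out those 0/1 coefficients for a fraction between 0 and 1. The function name binary_fraction_bits and the max_bits parameter are made up for this illustration. It uses Python's fractions module so that the sketch itself doesn't suffer from rounding.

```python
from fractions import Fraction


def binary_fraction_bits(x, max_bits=20):
    """Return the first coefficients a, b, c, ... of x = a/2 + b/4 + c/8 + ...

    Hypothetical helper for illustration; expects an exact Fraction with 0 <= x < 1.
    """
    bits = []
    for _ in range(max_bits):
        if x == 0:          # nothing left over: the expansion terminates exactly
            break
        x *= 2              # shift the next binary digit in front of the point
        bit = int(x >= 1)   # that digit is 1 if we reached or passed 1, else 0
        bits.append(bit)
        x -= bit            # keep only the remaining fractional part
    return bits


half_bits = binary_fraction_bits(Fraction(1, 2))
print(half_bits)            # [1] -> 0.5 is exactly 1/2

some_bits = binary_fraction_bits(Fraction(45, 100), max_bits=8)
print(some_bits)            # [0, 1, 1, 1, 0, 0, 1, 1] -> and it keeps going
```

For 0.5 the expansion stops after one term. For 0.45 it never stops, no matter how large you make max_bits, so the computer has to cut it off somewhere.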
Here are 2 examples. The first has no rounding error, whereas the second does.
No rounding error
0.5 is represented perfectly by the binary fraction 1/2, as shown by the .as_integer_ratio() method. This method returns a pair of integers whose ratio is equal to the float.
Meanwhile, a number such as 0.45 cannot be represented exactly by a binary fraction. In decimal it is simply 45/100, but in binary the closest available approximation is 8106479329266893/18014398509481984 (the denominator is 2^54).
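Both cases can be checked directly. The sketch below calls .as_integer_ratio() on the two floats and then uses the fractions module (my addition, not something the examples above rely on) to compare each stored value against the decimal fraction we meant to write.

```python
from fractions import Fraction

half_ratio = (0.5).as_integer_ratio()
print(half_ratio)                      # (1, 2)

approx_ratio = (0.45).as_integer_ratio()
print(approx_ratio)                    # (8106479329266893, 18014398509481984)

# Fraction(float) converts the stored binary value exactly, with no extra rounding.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
approx_is_exact = Fraction(0.45) == Fraction(45, 100)
print(half_is_exact, approx_is_exact)  # True False
```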
Since decimal fractions (e.g. 45/100) have to be approximated by binary fractions (e.g. 8106479329266893/18014398509481984), the stored value is usually not exact. That is why, going back to my previous example, 3.9 does not equal 3.9.
*The 3.9 in “Average” column was a 12 period average of the “Numbers” column
If we look at the 12-period average for the “Numbers” column more closely, we get 3.9000000000000004, even though in reality (decimal math) the 12-period average should be exactly 3.9 (without the additional 0.0000000000000004).
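Why do both columns show 3.9 if the values differ? Because the display only shows a few significant digits, while the comparison uses the full stored value. Here is a minimal sketch using the two values from the example as hard-coded literals. The ".6g" format is an assumption standing in for whatever rounding your dataframe display applies.

```python
number = 3.9                   # last value in "Numbers"
average = 3.9000000000000004   # last value in "Average", as reported above

# What a short display shows: both values look identical.
shown_number = format(number, ".6g")
shown_average = format(average, ".6g")
print(shown_number, shown_average)   # 3.9 3.9

# What the comparison actually sees: two different floats.
is_above_or_equal = number >= average
print(is_above_or_equal)             # False
print(repr(average))                 # 3.9000000000000004
```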
As you can see, this isn’t a big problem if your code doesn’t involve a lot of math where precision is necessary. But precision is necessary in many Python-for-finance use cases (e.g. the above example gave me False when I asked “is 3.9 >= 3.9”, which should have returned True). If you build quantitative trading models, making sure your calculations are accurate is very important. Otherwise your trading model might suggest the wrong trade!
So how do we deal with this? There are 2 solutions – an easy one, and a more complicated one.
Python rounding error solution #1: .round() function from the Pandas library, if applied to a Pandas series
The easiest way to deal with this problem is to use the .round() method from the Pandas library. This particular method is called on a Pandas series (i.e. a column in a Pandas dataframe); for a plain Python float you would use the built-in round() function instead.
Going back to our previous example, the last element in column “Numbers” (3.9) does not equal the last element in column “Average” (displayed as 3.9) because the last element in column “Average” isn’t exactly 3.9.
We can add .round() to any calculation that produces a Pandas series. Doing so will round the result to a fixed number of decimal places. How many decimals will .round() round to? Just specify it as the argument.
- If you type .round(3), your calculation will be rounded to 3 decimal places
- If you type .round(5), your calculation will be rounded to 5 decimal places
- If you type .round(10), your calculation will be rounded to 10 decimal places
Personally I like to round to 5 decimal places. Here’s how we can use .round() to solve the above Python rounding error:
As you can see, we called the .round() function on the column “Average” which calculates a 12 period average of the “Numbers” column. The final element now rounds to 3.90000. That’s why the last element in “Above or equal to 12 period average” column is now True (whereas before it was False). 3.9 = 3.90000
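The original dataframe isn’t reproduced here, so the sketch below uses made-up data: twelve hypothetical values whose exact (decimal) average is 3.9, with 3.9 as the last value. The column names follow the example above. Whether the unrounded comparison fails for this particular data depends on how the floating point sums happen to round, so the sketch only shows the rounded version.

```python
import pandas as pd

# Hypothetical data: twelve values whose exact decimal mean is 3.9.
df = pd.DataFrame({"Numbers": [3.0, 3.2, 3.4, 3.6, 3.8, 4.0,
                               4.2, 4.4, 4.6, 4.8, 3.9, 3.9]})

# 12-period average, rounded to 5 decimal places to drop the tiny binary error.
df["Average"] = df["Numbers"].rolling(12).mean().round(5)

# Compare each number with the rounded average.
# The first 11 rows have no 12-period average yet (NaN), so they compare as False.
df["Above or equal to 12 period average"] = df["Numbers"] >= df["Average"]

print(df.tail(1))
```

Rounding only the “Average” column is enough here because the values in “Numbers” were typed in with one decimal place. If both sides of a comparison come out of calculations, it is safer to round both.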
Python rounding error solution #2: decimal module
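What if your number or calculation needs to be so precise that you cannot round the number to a handful of decimal places? No need to worry, you can use the decimal module. It stores decimal fractions such as 0.45 exactly, and it lets you choose how many significant digits to keep (28 by default). Results that cannot be written out in full, such as 2 divided by 23, are still rounded, but to that chosen precision rather than to the nearest binary fraction.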
First, import the decimal module:
import decimal
from decimal import Decimal
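To create an accurate calculation, wrap the numbers that are part of your calculation in Decimal(). Pass whole numbers as integers and fractional values as strings (e.g. Decimal("0.45")), because Decimal(0.45) would simply copy the float’s binary rounding error. For example, if you want to divide 2 by 23, here’s the code you should write: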
import decimal
from decimal import Decimal

# decimal.getcontext() shows the current settings (28 significant digits by default)
calculation = Decimal(2) / Decimal(23)
print(calculation)
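Two details are worth seeing in action: how you build a Decimal matters, and division is still rounded to the context’s precision. The sketch below revisits the 0.35 + 0.1 example from the start of the article. The variable names and the precision of 10 are picked purely for illustration.

```python
from decimal import Decimal, localcontext

# Build Decimals from strings so the values are exactly 0.35, 0.1 and 0.45.
exact_sum = Decimal("0.35") + Decimal("0.1")
sums_match = exact_sum == Decimal("0.45")
print(exact_sum, sums_match)          # 0.45 True

# Building from a float copies the float's binary approximation instead.
float_copy = Decimal(0.45)
print(float_copy == Decimal("0.45"))  # False

# Division is rounded to the context's precision (10 digits, only inside this block).
with localcontext() as ctx:
    ctx.prec = 10
    short_quotient = Decimal(2) / Decimal(23)
print(short_quotient)                 # 0.08695652174
```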