Precision reading and writing data frames to csv
python at lucidity.plus.com
Sat Mar 11 18:29:35 EST 2017
On 11/03/17 22:01, Paulo da Silva wrote:
> I have a dir with lots of csv files. These files are updated +- once a
> day. I could see that some values are converted, during output, to very
> close values but with lots of digits. I understand that is caused by the
> internal bits' representation of the float/double values.
> Now my question is: Is it possible the occurrence of successive
> cumulative errors? I mean reading a file, adding lines or change few
> ones but keeping the most of the other lines untouched and, even so,
> existent untouched lines keep always changing?
Firstly, if the values are changing, then they were originally written by
a process that used an exact decimal representation, and are then being
read in and re-written by something with a less exact representation (I'm
guessing that by "float/double values" you mean IEEE 754 binary floating
point).
Because IEEE 754 can't represent all values *exactly*, some input values
will change to something close as you have seen.
However, if the input value matches something that IEEE 754 *can*
represent exactly then it will not change.
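To see the difference, compare 0.1 (which has no finite binary expansion)
with 0.25 (which is 1/4, an exact sum of powers of two):

```python
# 0.1 cannot be represented exactly in IEEE 754 binary floating point,
# so the stored value differs slightly from the decimal string:
print(f"{0.1:.20f}")   # 0.10000000000000000555

# 0.25 is a power of two (1/4), so it is stored exactly:
print(f"{0.25:.20f}")  # 0.25000000000000000000
```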
Whether you will see _cumulative_ errors depends on whether the output
stage of the first pass truncates the output with a field width
specifier or similar. If not, then you should see the initial change
you've noticed and then nothing more after that for that particular datum.
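A quick sketch of why the change is one-shot when full precision is
written out (using a hypothetical field value "7.3"): Python's repr()
produces the shortest string that round-trips, so float(repr(x)) == x
for every finite float, and further read/write cycles are stable.

```python
# Hypothetical value as it might appear in a CSV field.
original = "7.3"

# First read/write cycle: parse to float, write back at full precision.
first_pass = repr(float(original))

# Second cycle: the value no longer changes.
second_pass = repr(float(first_pass))
assert first_pass == second_pass   # no cumulative drift
```

If instead the output were truncated (e.g. formatted with "%.4f"), each
cycle could discard information and the stored value might keep moving.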
Having said all that: if absolute decimal precision is what you need,
you will be better off using Python's decimal.Decimal type instead of
float/double when processing your files.
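A minimal illustration of the difference: construct Decimal values from
the field's text (not from a float), and arithmetic stays exact where
binary floats drift.

```python
from decimal import Decimal

# Construct from the string read out of the CSV field, so the exact
# decimal text "0.1" is preserved rather than rounded to binary.
value = Decimal("0.1")

print(value + value + value)   # 0.3 -- exact
print(0.1 + 0.1 + 0.1)         # 0.30000000000000004 -- binary float drift
```

With the csv module you would apply Decimal() to each numeric field as
you read it, and str() the Decimal when writing, so untouched rows
round-trip byte-for-byte.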