On Thu, Aug 08, 2019 at 10:22:49AM -0000, Richard Musil wrote:
I have found myself in an awkward situation with the current (Python 3.7) JSON module. Basically it boils down to how it handles floats. I was bitten by this particular case:
    In : float(0.6441726684570313)
    Out: 0.6441726684570312
but I guess it really does not matter.
I think it does matter, because there is no such float as ``0.6441726684570313``.
The call to float() there is a no-op, because the literal 0.64417... is already compiled to a float before the function is called. Calling float() on a float does nothing. So the problem here lies *before* you call float(): floats only have finite precision, and they use base 2, not 10, so there are many decimal numbers they cannot represent. And 0.6441726684570313 is one of those numbers.
So even though you typed 0.64417...313, that gets rounded to the nearest base-2 number, which is 0.64417...312. You can see this by using the hex representation:
    py> float.fromhex('0x1.49d1000000000p-1')
    0.6441726684570312
    py> float.fromhex('0x1.49d1000000001p-1')
    0.6441726684570314
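As a cross-check from the decimal side (a small sketch, not part of the original session; the variable names are only illustrative), converting the float to decimal.Decimal exposes the exact binary value that both decimal spellings land on:

```python
from decimal import Decimal

typed = 0.6441726684570313   # the literal as typed
shown = 0.6441726684570312   # what the interpreter echoes back

# Both literals round to the same double, so they compare equal.
print(typed == shown)

# Decimal(float) converts exactly, so this prints the value actually stored.
exact = Decimal(typed)
print(exact)

# The hex form matches the first fromhex() call above.
print(typed.hex())
```

The stored value is 0.64417266845703125, which sits exactly halfway between ...312 and ...313 when cut to 16 digits. That may be why two programs printing the same double can disagree on the last digit, though that is a guess rather than something verified against the C++ side.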
What matters is that I did not find a way to fix it with the standard `json` module. I have a JSON file generated by another program (C++ code, which uses the nlohmann/json library), which serializes one of the floats to the value above.
I am very curious how the C++ code is generating that value, because Python floats ought to be identical to C doubles. Perhaps the C++ code is using an extended precision float with more bits? Or a decimal?
Then, when reading this JSON file in my Python code, I can get either a decimal.Decimal object (when specifying `parse_float=decimal.Decimal`) or a float. If I use the latter, the least significant digit is lost in deserialization.
As above, that's unavoidable for Python floats.
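To make the two reading modes concrete, here is a minimal sketch using only the stdlib json module. The one-field document and the key name "value" are made up for illustration and merely stand in for the C++ program's output:

```python
import json
from decimal import Decimal

# Hypothetical document standing in for the file written by the C++ program.
text = '{"value": 0.6441726684570313}'

# Default behaviour: the number token is parsed into a Python float.
as_float = json.loads(text)["value"]

# With parse_float, the raw number text is handed to Decimal instead.
as_decimal = json.loads(text, parse_float=Decimal)["value"]

print(repr(as_float))   # a float; the digits as typed are not kept
print(as_decimal)       # a Decimal; every digit from the text is kept
```

The float path loses nothing numerically (it is the same double either way), only the original spelling of the digits.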
If I use Decimal, the value is preserved, but there seems to be no way to "serialize it back". Writing a custom serializer:
Alas, this is beyond my knowledge of JSON, but if you are correct that there's no way to serialise Decimals back to JSON, that seems like a major missing piece to me. Perhaps you can help jump-start this:
    import decimal
    import json

    class DecimalEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, decimal.Decimal):
                return str(o)  # <- This becomes quoted in the serialized output
            return super().default(o)
I don't know a lot about writing JSON encoders, but a half-hearted and cursory glance at other custom encoders on Stack Overflow suggests to me that you probably want this instead:
    return (str(o),)  # return a length-1 tuple
but don't quote me or ask me to explain why that does or doesn't work.
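For what it is worth, both variants are easy to try side by side. This is only a sketch: the class names StrEncoder and TupleEncoder and the key "value" are invented here for the comparison, and nothing beyond the stdlib json module is assumed:

```python
import decimal
import json


class StrEncoder(json.JSONEncoder):
    """Variant from the original post: return the Decimal as a str."""

    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return str(o)
        return super().default(o)


class TupleEncoder(json.JSONEncoder):
    """Variant suggested above: return a length-1 tuple."""

    def default(self, o):
        if isinstance(o, decimal.Decimal):
            return (str(o),)
        return super().default(o)


data = {"value": decimal.Decimal("0.6441726684570313")}

as_str = json.dumps(data, cls=StrEncoder)
as_tuple = json.dumps(data, cls=TupleEncoder)

print(as_str)    # {"value": "0.6441726684570313"}
print(as_tuple)  # {"value": ["0.6441726684570313"]}
```

Neither output is a bare JSON number: whatever default() returns is encoded again as an ordinary Python value, so the str comes out quoted and the tuple comes out as a one-element array. As far as this check goes, the tuple version does not solve the problem either.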