I guess we can wait for your patch, but having read through this whole thread, I have no idea exactly what you are actually proposing. I suggest the first step would be to lay that out and see what the core devs think.

In the meantime, I hope it will be helpful for me to summarize what I got out of this discussion, and then make my own proposal, which may or may not be similar to Richard's:

TL;DR: I propose that Python's JSON encoder encode the Decimal type in a way that maintains full precision.

1) The original post was inspired by a particular problem the OP is trying to solve, and a suggested solution that I suspect the OP thought was the least disruptive and maybe most general solution. However, what I think that did was throw some red herrings into the conversation. Still, it made me go read the RFC to see what the JSON spec really says about numbers, and think about whether the Python json module does as well as it could in transcoding numbers. And I think we've found a limitation.

2) To be clear about vocabulary: a "float" is a binary floating point number, for all intents and purposes an IEEE754 float -- or at the very least, a Python float, which is compatible across many computer systems.
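To make the vocabulary concrete, here is a small sketch of the difference between a binary float and a decimal value, using only the standard library. The variable name `exact` is just for illustration.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, so it shows
# what the float written as 0.1 really holds.
exact = Decimal(0.1)
print(exact)

# The binary value is close to, but not equal to, the decimal value 0.1.
print(exact == Decimal("0.1"))  # False

# repr() hides this by printing the shortest text that maps back to the same float.
print(repr(0.1))
```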

JSON does not have a "float" specification. 

JSON has a "number" specification -- and it's a textual representation that can (only) represent the subset of rational numbers that can be represented in base ten with a finite number of digits. It does not specify a maximum number of digits, but it does allow implementations to set a limit. The RFC addresses this issue with:

   This specification allows implementations to set limits on the range
   and precision of numbers accepted.  Since software that implements
   IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
   generally available and widely used, good interoperability can be
   achieved by implementations that expect no more precision or range
   than these provide, in the sense that implementations will
   approximate JSON numbers within the expected precision.  A JSON
   number such as 1E400 or 3.141592653589793238462643383279 may indicate
   potential interoperability problems, since it suggests that the
   software that created it expects receiving software to have greater
   capabilities for numeric magnitude and precision than is widely
   available.

OK -- so this means: if you want to be generally interoperable, then limit yourself to numbers that can be represented by IEEE-754.

But it does not prohibit greater precision, or different binary representation when decoded.
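As a quick illustration, this is how the two example numbers from the RFC text behave in Python's json module, first with the default float decoding and then with `parse_float=Decimal`. This is only a sketch of current stdlib behavior; the variable names are mine.

```python
import json
import math
from decimal import Decimal

text = "3.141592653589793238462643383279"

# Default decoding: the number is approximated as a binary64 float.
as_float = json.loads(text)
# Opt-in decoding: every digit of the JSON text is kept.
as_decimal = json.loads(text, parse_float=Decimal)

print(as_float)
print(as_decimal)

# 1E400 is outside the binary64 range, so the float conversion overflows
# to infinity, while Decimal holds the value without trouble.
big_float = json.loads("1E400")
big_decimal = json.loads("1E400", parse_float=Decimal)
print(math.isinf(big_float), big_decimal)
```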

Python's json module, like I imagine most JSON decoders, takes the very practical approach of using (IEEE-754) float as a default for JSON numbers with a fractional part. But it also allows you to decode a JSON number as a Decimal type instead. What it does not have is a way to losslessly encode a Python Decimal as JSON.
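The asymmetry can be seen directly with the standard library. In this sketch, `price` and `lossy` are illustrative names, and converting through float is just one common workaround, not a recommendation.

```python
import json
from decimal import Decimal

# Decoding to Decimal is supported and keeps all the digits.
price = json.loads("0.1000000000000000000001", parse_float=Decimal)
print(repr(price))

# Encoding a Decimal is not supported by the default encoder.
try:
    json.dumps(price)
    encode_error = None
except TypeError as exc:
    encode_error = exc
    print("cannot encode:", exc)

# Going through float produces valid JSON but silently drops the extra digits.
lossy = json.dumps(float(price))
print(lossy)
```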

Since the JSON spec does in fact allow lossless representation of a Python Decimal, it seems that for completeness' sake, the json default encoding of Decimal should maintain the full precision. This would provide round tripping in the sense that a Python Decimal encoded and then decoded as JSON would get back the same value. (But a given JSON number would not necessarily get the exact same text back when round-tripped through Decimal.)
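Here is a rough sketch of the round trip I have in mind. Note that `encode_decimal` and `decode_decimal` are hypothetical helpers written for this message, not part of the json module, and the sketch assumes that `str()` of a finite Decimal is acceptable JSON number text, which holds for the values shown.

```python
import json
from decimal import Decimal

def encode_decimal(value):
    """Hypothetical helper: the JSON number text a lossless encoder might emit."""
    if not value.is_finite():
        # NaN and Infinity have no representation as a JSON number.
        raise ValueError("cannot represent %r as a JSON number" % (value,))
    return str(value)

def decode_decimal(text):
    # parse_int is needed as well: JSON numbers without a fraction or
    # exponent would otherwise be decoded as int.
    return json.loads(text, parse_float=Decimal, parse_int=Decimal)

# Value round trip: Decimal -> JSON text -> Decimal gives back an equal value.
samples = [
    Decimal("0.1"),
    Decimal("3.141592653589793238462643383279"),
    Decimal("1E+400"),
    Decimal("42"),
]
for original in samples:
    assert decode_decimal(encode_decimal(original)) == original

# Text round trip: the value survives, but the spelling may change.
source_text = "1.0e2"
round_tripped_text = encode_decimal(decode_decimal(source_text))
print(source_text, "->", round_tripped_text)
```

The last two lines show the caveat in the parenthetical: the number is the same, but the text that comes back is Decimal's own spelling of it.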

And it would provide compatibility with any hypothetical other JSON implementation that fully supports Decimal numbers.

Note that this *might* solve the OP's problem in this particular case, but not in the general case -- it relies on the Python user to know how some *other* JSON encoder is encoding its floats. But it would provide a consistent encoding of Decimal that should be compatible with other *decimal* numeric types.

Final points:

I fully concur with many posters that byte for byte consistency of JSON is NOT a reasonable goal.

I also fully agree that the Python JSON encoder should not EVER generate invalid JSON, so the OP's idea of a "raw" encoder seems like a bad idea. I can't think of, and I don't think anyone else has come up with, any examples other than Decimal that require this "raw" encoding. And if anyone finds any others, then those should be addressed properly.

The fundamental problem here is not that we don't allow a raw encoding, but that the JSON spec is based on decimal numbers, and Python also supports Decimal numbers, but there was that one missing piece of how to "properly" encode the Decimal type -- it is clear in the JSON spec how best to do that, so Python should support it.

Richard: if your proposal is different, I'd love to hear what it is, and why you think Python needs something else.


Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython