[Python-Dev] Dropping bytes "support" in json

Bob Ippolito bob at redivi.com
Mon Apr 27 17:07:04 CEST 2009

On Mon, Apr 27, 2009 at 7:25 AM, Damien Diederen <dd at crosstwine.com> wrote:
> Antoine Pitrou <solipsis at pitrou.net> writes:
>> Hello,
>> We're in the process of forward-porting the recent (massive) json
>> updates to 3.1, and we are also thinking of dropping remnants of
>> support of the bytes type in the json library (in 3.1, again). This
>> bytes support almost didn't work at all, but there was a lot of C and
>> Python code for it nevertheless. We're also thinking of dropping the
>> "encoding" argument in the various APIs, since it is useless.
> I had a quick look into the module on both branches, and at Antoine's
> latest patch (json_py3k-3).  The current situation on trunk is indeed
> not very pretty in terms of code duplication, and I agree it would be
> nice not to carry that forward.
> I couldn't figure out a way to get rid of it short of multi-#including
> "templates" and playing with the C preprocessor, however, and have the
> nagging feeling the latter would be frowned upon by the maintainers.
> There is a precedent with xmltok.c/xmltok_impl.c, though, so maybe I'm
> wrong about that.  Should I give it a try, and see how "clean" the
> result can be made?
>> Under the new situation, json would only ever allow str as input, and
>> output str as well. By posting here, I want to know whether anybody
>> would oppose this (knowing, once again, that bytes support is already
>> broken in the current py3k trunk).
> Provided one of the alternatives is dropped, wouldn't it be better to do
> the opposite, i.e., have the decoder take bytes as input, and the
> encoder produce bytes—and layer the str functionality on top of that?  I
> guess the answer depends on how the (most common) lower layers are
> structured, but it would be nice to allow a straight bytes path to/from
> the underlying transport.
> (I'm willing to have a go at the conversion in case somebody is
> interested.)
> Bob, would you have an idea of which lower layers are most commonly used
> with the json module, and whether people are more likely to expect strs
> or bytes in Python 3.x?  Maybe that data could be inferred from some bug
> tracking system?

I don't know what Python 3.x users expect. As far as I know, none of
the lower layers of the json package are used directly. They're
certainly not supposed to be, and they aren't documented as public.

My use case for dumps is typically bytes output because we push it
straight to and from IO. Some people embed JSON in other documents
(e.g. HTML) where you would want it to be text. I'm pretty sure that
the IO case is more common.
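
To make the two cases concrete, here is a minimal sketch, assuming the
str-only API proposed in this thread (dumps returns str, loads takes
str). The helper names dumps_bytes and loads_bytes are hypothetical and
are not part of the json module; UTF-8 is likewise an assumption about
the transport.

```python
import json

def dumps_bytes(obj, encoding="utf-8"):
    # Hypothetical helper: serialize to str, then encode for the transport.
    return json.dumps(obj).encode(encoding)

def loads_bytes(data, encoding="utf-8"):
    # Hypothetical helper: decode bytes from the transport, then parse.
    return json.loads(data.decode(encoding))

# IO case: bytes go straight to and from a socket or file.
payload = dumps_bytes({"name": "caf\u00e9", "n": 1})
roundtrip = loads_bytes(payload)

# Embedding case: the str result is spliced into a larger text document.
# (Real HTML embedding would also need escaping, which is omitted here.)
snippet = "<script>var data = %s;</script>" % json.dumps({"n": 1})
```

With a str-only json module, the IO case costs one extra encode or
decode step at the boundary, while the embedding case needs no
conversion at all.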

