[Web-SIG] Time for a JSON parser in the standard library?

Bob Ippolito bob at redivi.com
Sun Mar 23 22:49:41 CET 2008


On Thu, Mar 20, 2008 at 8:38 PM, Robert Brewer <fumanchu at aminus.org> wrote:
>
> Bob Ippolito wrote:
>  > On Thu, Mar 20, 2008 at 5:50 PM, Robert Brewer <fumanchu at aminus.org>
>  > wrote:
>  > > Deron Meranda wrote:
>  > > > And even then, we're not just talking about a JSON parser.
>  > > > We're all doing more than that; we're mapping Python to JSON.
>  > > > And there is no definitive spec for that.  Just look at my
>  > > > numbers tests; there are a lot of differences in how numeric
>  > > > mappings are done, but yet many of them can be arguably
>  > > > "correct" while still doing things differently.
>  > >
>  > >  ...which IMO argues that any json implementation that goes
>  > > in the stdlib needs to at least allow access to the raw bytes
>  > > in both directions. For example, if you really want JSON
>  > > numerals to become Python decimals, you shouldn't be forced
>  > > to lose information just because the json decoder was only
>  > > designed to hand you a float. Arbitrary converter plugins would
>  > > be icing on the cake. A built in decimal converter would be
>  > > heaven. :)
>  >
>  > That can be easily done, but at the expense of speed or clarity in the
>  > implementation... I'd be willing to add some hooks to simplejson that
>  > allow people to pass in their own functions that turn JSON terms (as
>  > strings) into Python objects.
>
>  That'd be great! I expect a speed penalty of course, and IMO most of that should be pushed onto anyone passing in functions, rather than making everyone pay.

OK, so I made these changes (parse_float/parse_int/parse_constant), plus
a few others, for simplejson 1.8.1 and moved the project to Google Code.

http://code.google.com/p/simplejson/
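
As a rough illustration of the new hooks, here is a minimal sketch. It
assumes the stdlib json module, which derives from simplejson and accepts
the same parse_float/parse_int/parse_constant keyword arguments; with
simplejson itself only the import should need to change. The sample
document and the reject_constant helper are made up for the example.

```python
import json  # assumption: stdlib json; simplejson uses the same hook names
from decimal import Decimal

doc = '{"price": 1.10, "count": 3}'

# Each hook receives the raw JSON numeral as a string, so no precision is
# lost before the conversion happens.
data = json.loads(doc, parse_float=Decimal, parse_int=int)


def reject_constant(name):
    # Hypothetical helper: called with 'NaN', 'Infinity' or '-Infinity',
    # none of which are valid in strict JSON.
    raise ValueError("non-standard JSON constant: %s" % name)


try:
    json.loads("[NaN]", parse_constant=reject_constant)
    constant_rejected = False
except ValueError:
    constant_rejected = True
```

Here data["price"] is a Decimal built from the text "1.10" rather than a
float, and callers who pass no hooks get the default behaviour.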

Other changes:

 * No longer escapes / by default. If you're embedding JSON in HTML
you'll have to escape it yourself; I got really tired of looking at
URLs with all of the /s escaped.
 * Optional scanstring C speedup for decoding
 * Correct Unicode surrogate pair decoding
 * Can be used from the command line to validate and pretty-print
JSON ("curl http://json/ | python -msimplejson")
 * Bug fix for ensure_ascii=False decoding
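
A short sketch of what the slash and surrogate-pair items look like in
practice. It again assumes the stdlib json module rather than simplejson
1.8.1, and dumps_for_html is a hypothetical helper, not something
simplejson provides; it shows one way to do the escaping yourself when
embedding in HTML.

```python
import json  # assumption: stdlib json, which behaves the same way here

# Slashes are left alone by default, so URLs stay readable.
encoded_url = json.dumps("http://example.com/a/b")


def dumps_for_html(obj):
    # Hypothetical helper: escape "</" so a string such as "</script>"
    # cannot close an enclosing <script> element. "\/" is a valid JSON
    # escape, so the output still decodes to the original value.
    return json.dumps(obj).replace("</", "<\\/")


embedded = dumps_for_html("</script>")
round_tripped = json.loads(embedded)

# A \uXXXX surrogate pair decodes to a single code point (U+1D11E).
clef = json.loads('"\\ud834\\udd1e"')
```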

If anyone else has complaints, bug reports, or feature requests they'd
like addressed, they should speak up soon, either here or on the issue
tracker. I think it's more or less ready to go into the stdlib at this
point.

-bob

