[Web-SIG] Time for a JSON parser in the standard library?
Bob Ippolito
bob at redivi.com
Sun Mar 23 22:49:41 CET 2008
On Thu, Mar 20, 2008 at 8:38 PM, Robert Brewer <fumanchu at aminus.org> wrote:
>
> Bob Ippolito wrote:
> > On Thu, Mar 20, 2008 at 5:50 PM, Robert Brewer <fumanchu at aminus.org>
> > wrote:
> > > Deron Meranda wrote:
> > > > And even then, we're not just talking about a JSON parser.
> > > > We're all doing more than that; we're mapping Python to JSON.
> > > > And there is no definitive spec for that. Just look at my
> > > > numbers tests; there are a lot of differences in how numeric
> > > > mappings are done, but yet many of them can be arguably
> > > > "correct" while still doing things differently.
> > >
> > > ...which IMO argues that any json implementation that goes
> > > in the stdlib needs to at least allow access to the raw bytes
> > > in both directions. For example, if you really want JSON
> > > numerals to become Python decimals, you shouldn't be forced
> > > to lose information just because the json decoder was only
> > > designed to hand you a float. Arbitrary converter plugins would
> > > be icing on the cake. A built in decimal converter would be
> > > heaven. :)
> >
> > That can be easily done, but at the expense of speed or clarity in the
> > implementation... I'd be willing to add some hooks to simplejson that
> > allow people to pass in their own functions that turn JSON terms (as
> > strings) into Python objects.
>
> That'd be great! I expect a speed penalty of course, and IMO most of that should be pushed onto anyone passing in functions, rather than making everyone pay.
Ok, so I made these changes (parse_float/parse_int/parse_constant) and
a few others for simplejson 1.8.1 and moved it to google code.
http://code.google.com/p/simplejson/
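A minimal sketch of how these hooks might be used to get decimals instead of floats. It assumes the standard library json module that later grew out of simplejson, which accepts the same parse_float/parse_int/parse_constant keyword arguments. The helper names loads_exact and reject_constant are made up for illustration.

```python
import json
from decimal import Decimal


def reject_constant(name):
    # parse_constant is called with the strings 'NaN', 'Infinity' or
    # '-Infinity'; this hypothetical policy refuses all of them.
    raise ValueError("non-standard JSON constant: %s" % name)


def loads_exact(text):
    # parse_float receives the raw numeral as a string, so Decimal sees
    # "1.10" exactly instead of a float that has already been rounded.
    return json.loads(text, parse_float=Decimal, parse_constant=reject_constant)


result = loads_exact('{"price": 1.10, "count": 3}')
print(result["price"], result["count"])
```

Only callers that pass a hook pay for the extra function call per number; the default float path is unchanged.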
Other changes:
* No longer escapes / by default. If you're embedding in HTML then
you'll have to escape that yourself; I got really tired of looking at
URLs with all of the /s escaped.
* Optional scanstring C speedup for decoding
* correct unicode surrogate pair decoding
* can be used from the command-line now to validate and pretty-print
JSON ("curl http://json/ | python -msimplejson")
* bug fix for ensure_ascii=False decoding
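Two of the items above can be illustrated roughly as follows. This is a sketch under assumptions, not simplejson's own code: it uses the standard library json module (which also leaves / unescaped by default), and dumps_for_html is a hypothetical helper showing one way to escape for HTML embedding yourself.

```python
import json


def dumps_for_html(obj):
    # Hypothetical helper: "\/" is a legal JSON escape for "/", so
    # rewriting "</" as "<\/" keeps the output valid JSON while making
    # sure a "</script>" inside a string cannot close a script element.
    return json.dumps(obj).replace("</", "<\\/")


payload = {"url": "http://example.com/a/b", "html": "</script>"}
text = dumps_for_html(payload)
print(text)

# A \uXXXX surrogate pair in the JSON text should decode to a single
# character outside the Basic Multilingual Plane (here U+1D11E).
clef = json.loads('"\\ud834\\udd1e"')
```

Plain URLs stay readable in the output, and decoding the escaped text gives back the original payload.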
If anyone else has any complaints, bug reports, or feature requests
they'd like addressed, they should speak up soon, either here or on the
issue tracker. I think it's more or less ready to go into the stdlib
at this point.
-bob