[Python-ideas] iterator/stream-based JSON API

Robert Kern robert.kern at gmail.com
Mon Jun 3 15:14:16 CEST 2013


On 2013-06-03 13:37, Daniel Holth wrote:
> On Mon, Jun 3, 2013 at 5:12 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Sun, 2 Jun 2013 22:11:23 -0400
>> Daniel Holth <dholth at gmail.com> wrote:
>>> When will the stdlib get a decent iterator/stream-based JSON API? For
>>> example automated packaging tools may be parsing a lot of JSON but
>>> ignoring most of it, and it would be lovely to say "if key not in
>>> interesting: skip_without_parsing()".
>
> No offense meant. The existing JSON API is quite good.
>
>> Do you think that would make a significant difference? I'm not sure
>> what "parsing a lot of JSON" means, but I suppose packaging metadata is
>> usually quite small.
>
> I don't know whether it would matter for packaging but it would be
> very useful sometimes.
>
> The jsonpull API looks like:
> http://code.google.com/p/jsonpull/source/browse/trunk/Example.java
>
> A bit like:
>
> parser = Json(text)
> parser.eat('{') # expect an object
> for element in parser.objectElements():
>      parser.eat(Json.KEY)
>      key = parser.getString()
>      if key == "name":
>          name = parser.getStringValue()
>      elif key == "contact":
>
>
> You can ask it what the next token is, seek ahead (never behind) to a
> named key in an object, or iterate over all the keys in an object
> without necessarily iterating over child objects. Once you get to an
> interesting sub-object you can get an iterator for that sub-object and
> perhaps pass it to a child constructor.

Unless your JSON file has dict values of, say, megabytes in size, I doubt
that writing such code is going to be much more efficient than just building the
whole dict and ignoring the keys that you don't want.

I suspect that most of the use cases could be satisfied by being able to either 
whitelist or blacklist top-level keys. This would be a relatively simple 
modification to _json.c, I think, if you wanted to pursue it.
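
For illustration only, here is a rough pure-Python sketch of what such a
filter might look like from the caller's side. The helper name
loads_top_level and its include/exclude parameters are made up for this
example and are not part of the json module. It still parses every value
before discarding it, so it shows only the shape of the API; skipping the
work for unwanted keys would need the change inside the decoder itself.

```python
import json


def loads_top_level(text, include=None, exclude=None):
    """Parse a JSON object, then keep or drop top-level keys.

    Hypothetical helper, not part of the stdlib json module.
    The whole document is parsed first; filtering happens afterwards.
    """
    if include is not None and exclude is not None:
        raise ValueError("pass include or exclude, not both")
    data = json.loads(text)
    if not isinstance(data, dict):
        raise TypeError("top-level JSON value must be an object")
    if include is not None:
        wanted = set(include)
        return {k: v for k, v in data.items() if k in wanted}
    if exclude is not None:
        unwanted = set(exclude)
        return {k: v for k, v in data.items() if k not in unwanted}
    return data


# Made-up sample document for the example.
doc = '{"name": "example", "version": "1.0", "contact": {"email": "a@example.com"}}'
print(loads_top_level(doc, include=["name", "version"]))
print(loads_top_level(doc, exclude=["contact"]))
```

Only top-level keys are filtered; nested objects such as "contact" are kept
or dropped as a whole.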

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco


