On Mon, Jun 3, 2013 at 8:37 AM, Daniel Holth <dholth@gmail.com> wrote:
On Mon, Jun 3, 2013 at 5:12 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sun, 2 Jun 2013 22:11:23 -0400 Daniel Holth <dholth@gmail.com> wrote:
When will the stdlib get a decent iterator/stream-based JSON API? For example automated packaging tools may be parsing a lot of JSON but ignoring most of it, and it would be lovely to say "if key not in interesting: skip_without_parsing()".
No offense meant. The existing JSON API is quite good.
Do you think that would make a significant difference? I'm not sure what "parsing a lot of JSON" means, but I suppose packaging metadata is usually quite small.
I don't know whether it would matter for packaging but it would be very useful sometimes.
The jsonpull API looks like: http://code.google.com/p/jsonpull/source/browse/trunk/Example.java
A bit like:
    parser = Json(text)
    parser.eat('{')  # expect an object
    for element in parser.objectElements():
        parser.eat(Json.KEY)
        key = parser.getString()
        if key == "name":
            name = parser.getStringValue()
        elif key == "contact":
You can ask it what the next token is, seek ahead (never behind) to a named key in an object, or iterate over all the keys in an object without necessarily iterating over child objects. Once you get to an interesting sub-object you can get an iterator for that sub-object and perhaps pass it to a child constructor.
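To make the "skip without parsing" idea concrete, here is a rough stdlib-only sketch. It is not the jsonpull API and not a proposal for the json module; the names tokens, consume and pick are made up for illustration, and it takes the whole text as a string rather than a stream. Skipped values are still scanned token by token, but no Python objects are built for them.

```python
import json
import re

# Hypothetical minimal tokenizer: punctuation, quoted strings, other scalars.
TOKEN = re.compile(r'\s*([{}\[\]:,]|"(?:[^"\\]|\\.)*"|[^\s{}\[\]:,"]+)')


def tokens(text):
    """Yield raw JSON tokens as strings, without decoding them."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad JSON near offset %d" % pos)
        pos = match.end()
        yield match.group(1)


def consume(tok, first, keep):
    """Consume one value whose first token is `first`.

    Returns the raw text of the value if `keep` is true, otherwise None.
    Containers are consumed by counting bracket depth.
    """
    parts = [first] if keep else None
    depth = 1 if first in ("{", "[") else 0
    while depth:
        token = next(tok)
        if token in ("{", "["):
            depth += 1
        elif token in ("}", "]"):
            depth -= 1
        if keep:
            parts.append(token)
    return "".join(parts) if keep else None


def pick(text, interesting):
    """Decode only the top-level keys named in `interesting`."""
    tok = tokens(text)
    if next(tok) != "{":
        raise ValueError("expected an object")
    result = {}
    for token in tok:
        if token == "}":
            break
        if token == ",":
            continue
        key = json.loads(token)  # decode the key string
        if next(tok) != ":":
            raise ValueError("expected ':' after key %r" % key)
        first = next(tok)
        if key in interesting:
            result[key] = json.loads(consume(tok, first, keep=True))
        else:
            consume(tok, first, keep=False)  # skip without decoding
    return result


doc = '{"name": "wheel", "contact": {"email": "x@example.com", "tags": [1, 2]}, "version": "0.1"}'
print(pick(doc, {"name", "version"}))
```

Here the "contact" sub-object is walked over but never turned into a dict. A real pull parser would read from a file object incrementally and validate the grammar properly; this only shows the shape of the idea.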
The ijson API yields a stream of events, each containing the full path to the item in the parsed JSON and an event name like "start_map", "end_map", "start_array", ...

    list(ijson.parse(StringIO.StringIO("""{ "a": { "b": "c" } }""")))

    [('', 'start_map', None),
     ('', 'map_key', 'a'),
     ('a', 'start_map', None),
     ('a', 'map_key', 'b'),
     ('a.b', 'string', u'c'),
     ('a', 'end_map', None),
     ('', 'end_map', None)]

It also has a higher-level API yielding only the objects under a certain prefix. Pass "a.b" and you would get only "c".

Besides saving memory, this kind of thing makes it much easier to know which level of the JSON structure you are in compared to the existing object_pairs_hook.

I kind of like the pull API because you can "choose your own adventure", deciding whether to do higher or lower level parsing depending on where in the JSON you are. But you could easily get lost and do things that aren't permitted based on the parser state.

Daniel Holth