[docs] [issue12134] json.dump much slower than dumps
Terry J. Reedy
report at bugs.python.org
Sat May 28 21:42:09 CEST 2011
Terry J. Reedy <tjreedy at udel.edu> added the comment:
>From just reading the docs, it appears that json.dump(obj,fp) == fp.write(json.dumps(obj)) and it is easy to wonder why .dump even exists, as it seems a trivial abbreviation (and why not .dump and .dumpf instead). Since, '_one_shot' and 'c_make_encoder' are not mentioned in the doc, there is no hint from these either. So I think a doc addition is needed.
The benchmark is not completely fair as the .dumps timing omits the write call. For the benchmark, that would be trivial. But in real use on multitasking systems with slow (compared to cpu speed) output channels, the write time might dominate. I can even imagine .dump sometimes winning by getting chunks into a socket buffer and possibly out on the wire, almost immediately, instead of waiting to compute the entire output, possibly interrupted by task swaps. So I presume *this* is at least part of the reason for the incremental .dump.
I changed 'pass' to 'print(bug)' in class writable and verified that .dump is *very* incremental. Even '[' and ']' are separate outputs.
DOC suggestion: (limited to CPython since spec does not prohibit naive implementation of .dump given above) After current .dumps line, add
"In CPython, json.dumps(o), by itself, is faster than json.dump(o,f), at the expense of using more space, because it creates the entire string at once, instead of incrementally writing each piece of o to f. However, f.write(json.dumps(o)) may not be faster."
Python tracker <report at bugs.python.org>
More information about the docs