New submission from James Hutchison firstname.lastname@example.org:
The Unzip module is always unbuffered (tested v.3.1.2 Windows XP, 32-bit). This means that if one has to do many small reads it is a lot slower than reading a chunk of data to a buffer and then reading from that buffer. It seems logical that the unzip module should default to buffered reading and/or have a buffered argument. Likewise, the documentation should clarify that there is no buffering involved when doing a read, which runs contrary to the default behavior of a normal read.
start Zipfile read done 27432 reads done took 0.859 seconds start buffered Zipfile read done 27432 reads done took 0.072 seconds start normal read (default buffer) done 27432 reads done took 0.139 seconds start buffered normal read done 27432 took 0.137 seconds
---------- assignee: docs@python components: Documentation, IO, Library (Lib) messages: 120871 nosy: Jimbofbx, docs@python priority: normal severity: normal status: open title: ZipFile unzip is unbuffered type: performance versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1