[Python-Dev] Prefetching on buffered IO files

Hagen Fürstenau hagen at zhuliguan.net
Wed Sep 29 10:06:57 CEST 2010


> Ow... I've always assumed that seek() is essentially free, because
> that's how a typical OS kernel implements it. If seek() is bad on
> GzipFile, how hard would it be to fix this?

I'd imagine that there's no easy way to make arbitrary seeks on a
GzipFile fast. But wouldn't it be enough to optimize small relative
(backwards) seeks?
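
To make the idea concrete, here is a rough sketch of what "optimizing small backward seeks" could look like: keep the last few kilobytes of decompressed output around and serve short rewinds from that tail instead of re-decompressing. The class name `RewindableReader`, its `rewind()` method and the `keep` parameter are all made up for illustration; this is not how GzipFile works, just one possible approach under the assumption that rewinds stay within a small window.

```python
import io


class RewindableReader:
    """Hypothetical wrapper: serves small backward seeks from a tail buffer."""

    def __init__(self, raw, keep=4096):
        self._raw = raw      # any forward-only stream with read(n)
        self._keep = keep    # number of recent bytes to retain (must be > 0)
        self._tail = b""     # the most recently read bytes
        self._back = 0       # how far we are currently rewound into _tail

    def read(self, n):
        out = b""
        if self._back:
            # Serve the rewound bytes from the tail first.
            start = len(self._tail) - self._back
            out = self._tail[start:start + n]
            self._back -= len(out)
            n -= len(out)
        if n > 0:
            # Then continue reading from the underlying stream.
            fresh = self._raw.read(n)
            self._tail = (self._tail + fresh)[-self._keep:]
            out += fresh
        return out

    def rewind(self, n):
        # Only rewinds that fit inside the retained tail are cheap.
        if self._back + n > len(self._tail):
            raise ValueError("rewind exceeds retained tail")
        self._back += n


reader = RewindableReader(io.BytesIO(b"abcdefgh"), keep=4)
first = reader.read(5)   # consumes b"abcde", retains the last 4 bytes
reader.rewind(2)         # step back over b"de"
second = reader.read(4)  # b"de" from the tail, then b"fg" from the stream
```

Anything beyond the retained window would still need the slow path, so this only helps the "read a bit too much, then step back" pattern.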

> How common is the use case where you need to read a gzipped pickle
> *and* you need to leave the unzipped stream positioned exactly at the
> end of the pickle?

Not uncommon, I think. You need this for unpickling objects which were
dumped one after another into a GzipFile, right?
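
A minimal sketch of that use case, assuming an in-memory `BytesIO` stands in for the file on disk and the three sample objects are arbitrary:

```python
import gzip
import io
import pickle

# Dump several objects one after another into a single gzip stream.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    for obj in ({"a": 1}, [1, 2, 3], "third"):
        pickle.dump(obj, gz)

# Read them back. Each load() must leave the decompressed stream
# positioned exactly at the start of the next pickle.
buf.seek(0)
loaded = []
with gzip.GzipFile(fileobj=buf, mode="rb") as gz:
    while True:
        try:
            loaded.append(pickle.load(gz))
        except EOFError:
            break  # no more pickles in the stream
```

If the unpickler read ahead past the end of one pickle without giving the bytes back, the next `pickle.load()` would start in the wrong place, which is why the stream position matters here.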

ISTM that the immediate performance issue can be solved by the present
patch, and there's room for future improvement by optimizing GzipFile
seeks and/or extending the IO API.

Cheers,
Hagen


