can we have a filecmp.cmp() that accepts a different buffer size?

Marcus Alanen marcus at infa.abo.fi
Thu Jun 12 05:28:01 EDT 2003


On Thu, 12 Jun 2003 08:20:39 GMT, Kendear <kendear at nospam.com> wrote:
>filecmp.cmp() uses a BUFSIZE of 8k to compare.
>For files that are 500MB, the hard disk is really
>busy, going back and forth, while my 512MB RAM is
>sitting there, sipping margarita.  Can we have a
>version of filecmp.cmp() (and filecmp's other
>methods) that accepts a BUFSIZE, such as 1MB or more?

There are some issues with this. First, the stat() call on a file
should give the "preferred" I/O block size in st_blksize. So
if Python follows this value, it _should_ already be good enough
for most uses. Second, in practice some operating system kernels
provide read-ahead of files, that is, they send extra read requests
to the hard drive so that future requests from the application don't
have to wait so long. So using a larger BUFSIZE might do no good.
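
To see what the system reports as its preferred block size, something
like the following should work. It is only a sketch: the helper name
preferred_blksize and the 8k fallback are my own choices, and
st_blksize is not available on every platform, hence the getattr().

```python
import os

def preferred_blksize(path, default=8 * 1024):
    """Return the st_blksize reported for path, or default if missing.

    Hypothetical helper for illustration; not part of filecmp.
    """
    st = os.stat(path)
    # st_blksize does not exist on some platforms, and a reported 0 is
    # treated here as "no preference", so fall back to the default.
    return getattr(st, "st_blksize", 0) or default

if __name__ == "__main__":
    print(preferred_blksize("."))
```

The number printed depends on the filesystem, so don't expect the same
value everywhere.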

Basically, setting the buffer size explicitly is probably a
nice-to-have in the short run, but this kind of tuning belongs on the
kernel side, IMHO.
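
That said, if you want to experiment with a bigger buffer, a minimal
sketch of a comparison with an explicit buffer size could look like
this. The function cmp_with_bufsize and its bufsize parameter are
hypothetical, not something filecmp provides, and I make no claim
that it will actually be faster on your machine.

```python
import os

def cmp_with_bufsize(path_a, path_b, bufsize=1024 * 1024):
    """Return True if the two files have identical contents.

    Hypothetical helper for illustration; not part of filecmp.
    """
    # Files of different sizes cannot be equal, so skip reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(bufsize)
            chunk_b = fb.read(bufsize)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                # Both files reached end-of-file with no mismatch.
                return True
```

Timing it with a few different bufsize values on the real files is
the only way to tell whether the larger buffer helps.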

Regards,
Marcus





