comparing multiple copies of terabytes of data?
Dan Stromberg
strombrg at dcs.nac.uci.edu
Mon Oct 25 14:08:29 EDT 2004
We will soon have 3 copies, for testing purposes, of what should be about
4.5 terabytes of data.
Rather than running cmp twice (copy 1 against copy 2, then copy 2 against
copy 3) to verify data integrity, I was thinking we could speed up the
comparison a bit by using a Python script that does 3 reads per disk block
instead of 4, with a sufficiently large blocksize, of course.
My question, then, is: does Python have a high-level API that would
facilitate this sort of thing, or should I just code something up based on
open and read?
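For concreteness, the open-and-read approach might look roughly like the
sketch below. The function name compare_three and the 1 MiB default
blocksize are placeholders made up for illustration, not part of any
existing library, and it assumes all three copies are regular files that
can be opened at the same time.

```python
def compare_three(path_a, path_b, path_c, blocksize=1024 * 1024):
    """Compare three files block by block, reading each block once per file.

    Returns the byte offset of the first block at which the files differ,
    or None if all three are identical. (Hypothetical helper for illustration.)
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb, open(path_c, "rb") as fc:
        while True:
            # One read per copy: 3 reads per block instead of the 4 that
            # two separate pairwise cmp runs would need.
            a = fa.read(blocksize)
            b = fb.read(blocksize)
            c = fc.read(blocksize)
            if not (a == b == c):
                # Also catches copies of different length, since a short
                # or empty final block will not equal the others.
                return offset
            if not a:
                # All three reads returned empty bytes: end of every file.
                return None
            offset += len(a)
```

This only reports where the first mismatching block starts; it does not
say which copy is the odd one out, and it stops at the first difference
rather than listing them all.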
Thanks!