comparing multiple copies of terabytes of data?

Istvan Albert ialbert at
Tue Oct 26 19:43:30 CEST 2004

Josiah Carlson wrote:

> measure, if shy by around 3 orders of magnitude in terms of time.

> That new one runs in 5 minutes 15 seconds total, because it exploits the

My point was never that it is impossible to write something
better, nor that you could not do it. I simply said that
there is no easy way around this problem.

Your solution, while short and nice, is not simple, and
it takes quite a bit of knowledge to understand
why and how it works.

On another topic: I sometimes have a hard time visualizing how much
a terabyte of data actually is, and this is a good example
of that. Even this optimized algorithm would take over three
days to perform a simple identity check ...
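
A rough sanity check on that figure, as a sketch only. It assumes the
5 minutes 15 seconds run scales linearly with data size and that the
full data set is about three orders of magnitude larger. Both assumptions
are my reading of the quoted text, not measured numbers, and the function
name is made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def estimate_days(run_seconds, scale_factor):
    """Scale a measured run time linearly and convert it to days.

    scale_factor is an assumption (how many times larger the real
    data set is than the one that was timed).
    """
    return run_seconds * scale_factor / SECONDS_PER_DAY

# 5 minutes 15 seconds, scaled up by an assumed factor of 1000
measured = 5 * 60 + 15
print("%.2f days" % estimate_days(measured, 1000))  # prints "3.65 days"
```

Linear scaling is the optimistic case. Anything that stops fitting in
memory or cache at the larger size would push the number higher.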


