[Baypiggies] Silent data corruption paper...
drewp at bigasterisk.com
Tue Mar 17 08:21:19 CET 2009
Shannon -jj Behrens wrote:
> I remember hearing that Google operated at such a large scale that
> these sorts of things tended to catch up with them. Their approach
> was to use more redundancy.
"""we were able to sort 1TB ... on 1,000 computers in 68 seconds.
Where do you put 1PB of sorted data? We were writing it to 48,000 hard
drives (we did not use the full capacity of these disks, though), and
every time we ran our sort, at least one of our disks managed to break
(this is not surprising at all given the duration of the test, the
number of disks involved, and the expected lifetime of hard disks). To
make sure we kept our sorted petabyte safe, we asked the Google File
System to write three copies of each file to three different disks."""
and more commentary at
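
For a rough feel for why "at least one disk broke on every run" is unsurprising, here is a back-of-the-envelope sketch. The 3% annual failure rate and the 6-hour run length are my own placeholder assumptions, not figures from the quote above; only the 48,000 disks and the three copies come from it. It also assumes disks fail independently at a constant rate, which real disks do not quite do.

```python
import math

HOURS_PER_YEAR = 24 * 365.25


def p_disk_fails(afr, hours):
    """Chance that one disk fails within `hours`.

    `afr` is treated as a constant failure rate per disk-year
    (an assumption; close to the annualized failure rate when it is small).
    """
    return 1.0 - math.exp(-afr * hours / HOURS_PER_YEAR)


def p_any_disk_fails(n_disks, afr, hours):
    """Chance that at least one of `n_disks` independent disks fails."""
    return 1.0 - (1.0 - p_disk_fails(afr, hours)) ** n_disks


def p_all_replicas_fail(replicas, afr, hours):
    """Chance that every disk holding a copy of one file fails in the window."""
    return p_disk_fails(afr, hours) ** replicas


if __name__ == "__main__":
    # Assumed values: 3% failure rate per year, 6-hour run.
    afr, hours = 0.03, 6
    print("any of 48,000 disks fails:", p_any_disk_fails(48000, afr, hours))
    print("all 3 copies of a file lost:", p_all_replicas_fail(3, afr, hours))
```

With those assumed numbers a single run is more likely than not to see a disk failure, while losing all three copies of any one file in the same window is vanishingly unlikely. Change `afr` and `hours` to taste; the shape of the result does not depend much on the exact values.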