[Tutor] Corrupt file(s) likelihood and prevention?

Cameron Simpson cs at cskk.id.au
Fri Jan 31 00:01:34 EST 2020


On 30Jan2020 21:56, boB Stepp <robertvstepp at gmail.com> wrote:
>I just finished reading a thread on the main list that got a little
>testy.  The part that is piquing my interest was some discussion of
>data files perhaps getting corrupted.  I have wondered about this off
>and on.  First what is the likelihood nowadays of a file becoming
>corrupted, and, second, what steps should one take to prevent/minimize
>this from happening?

I can't speak to specific likelihoods, but these days things should be 
pretty stable. Hard drives have a long mean time to failure, memory is 
often ECC (e.g. fixing single-bit errors and noticing multi-bit 
errors), and data sizes are large enough that if failures were common 
nothing would work. (Of course that cuts both ways.) Some filesystems 
checksum their blocks; ZFS in particular sticks in my mind for this, 
and it is also a flexible RAID in its way, meaning a bad block gets 
repaired from a good copy as soon as it is discovered.

>I notice that one of the applications I work
>with at my job apparently uses checksums to detect when one of its
>files gets modified from outside of the application.

That can often be worth doing; it depends on the app. If a file is 
modified outside the app, then much of what the app assumes about its 
internal state may need discarding or rechecking (conceptually: flush 
the cached knowledge).
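
As a rough illustration (just a sketch, not that application's actual 
mechanism; "data.db" is a made-up filename), in Python you can record 
a file's checksum with hashlib and compare it again later:

    import hashlib

    def file_checksum(path, algorithm="sha256"):
        # Read in chunks so large files need not fit in memory.
        h = hashlib.new(algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    # Record a checksum now; compare later to see whether the file
    # changed behind the application's back.
    saved = file_checksum("data.db")
    # ... the application runs for a while ...
    if file_checksum("data.db") != saved:
        print("data.db was modified outside the application")

If the checksum differs, the app knows to reread the file rather than 
trust whatever it had cached about it.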

Leaving aside system failures, there's also malice. In another life 
we used to checksum the OS files on a regular basis, looking for 
changed files.
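
A minimal sketch of that kind of sweep, again in Python (the 
"baseline.json" manifest and the "/usr/bin" directory are just example 
names; the baseline would have been written out by an earlier run):

    import hashlib
    import json
    import os

    def tree_checksums(top):
        # Map each file under `top` to its SHA-256 digest.
        sums = {}
        for dirpath, dirnames, filenames in os.walk(top):
            for name in filenames:
                path = os.path.join(dirpath, name)
                h = hashlib.sha256()
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(65536), b""):
                        h.update(chunk)
                sums[path] = h.hexdigest()
        return sums

    # Compare the current state against the saved baseline.
    with open("baseline.json") as f:
        baseline = json.load(f)
    for path, digest in tree_checksums("/usr/bin").items():
        if baseline.get(path) != digest:
            print("changed or new file:", path)

In practice a real integrity checker would also protect the baseline 
itself, so an intruder can't quietly update it along with the files.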

Cheers,
Cameron Simpson <cs at cskk.id.au>

