Secure delete with python
duncan.booth at invalid.invalid
Tue Sep 7 10:30:14 CEST 2004
Ville Vainio <ville at spammers.com> wrote in
news:du74qmb9mzs.fsf at amadeus.cc.tut.fi:
> Seriously? What OSen are known for doing this? I'd had thought that if
> the file size is unchanged, the data is always written over the old
I don't know for certain, but I think it is a pretty safe bet that NTFS
allocates new disc blocks instead of updating the existing ones.
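For concreteness, the naive approach under discussion can be sketched as
follows. This is a minimal sketch, not a secure-delete tool: the function
name `overwrite_in_place` and the chunk size are my own choices for
illustration. It only asks the OS to write zeros over the file's current
length; whether those zeros land on the same disc blocks as the old data is
up to the file system, which is exactly the point in doubt.

```python
import os


def overwrite_in_place(path, chunk_size=64 * 1024):
    """Overwrite a file's contents with zero bytes, keeping its size.

    Hypothetical helper for illustration only. It does NOT guarantee
    that the old data's disc blocks are the ones being overwritten.
    """
    size = os.path.getsize(path)
    zeros = b"\0" * chunk_size
    # "r+b" opens for update without truncating, so the size is unchanged.
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push Python's buffer to the OS, then ask the OS to flush to disc.
        f.flush()
        os.fsync(f.fileno())
    return size
```

Even with the flush and the `os.fsync` call, this only asks for the new
contents to reach the disc; it says nothing about where on the disc they
end up.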
NTFS is a transaction-based file system: it guarantees that any particular
disc operation either completes or doesn't, so you can never get
file-system corruption due to a power loss part way through updating a
file. Transactions are written to two transaction logs (in case one is
corrupted on failure), and every few seconds the outstanding transactions
are committed. Once a transaction is committed, the log holds enough
information to complete it even if power is lost; likewise, the log holds
enough information to roll back any transaction that has not been
committed.
There isn't very much published information on the NTFS internals (any
useful references gratefully received), but so far as I can see writing
updates to a fresh disc block would be the only realistic way to implement
this (otherwise you would need to write the data three times: once to each
transaction log, then again to the actual file). If the data is written
separately, the transaction log only needs to store the location of the
new data (so it can be wiped if the transaction is rolled back), and the
file's pointers can be updated to the new blocks when it is committed.
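The scheme I am guessing at can be modelled in a few lines. This is a toy
model of the idea only, under the assumption that writes go to fresh
blocks; it is not a description of how NTFS is actually implemented, and
every name in it (`ToyVolume`, `write`, `commit`, `rollback`, `read`) is
made up for illustration.

```python
class ToyVolume:
    """Toy model of 'write to fresh blocks, then repoint on commit'.

    Hypothetical illustration, not real NTFS behaviour.
    """

    def __init__(self):
        self.blocks = {}   # block number -> bytes physically on "disc"
        self.files = {}    # file name -> list of block numbers
        self.log = []      # pending transactions: (name, new block numbers)
        self._next = 0

    def _alloc(self, data):
        n = self._next
        self._next += 1
        self.blocks[n] = data
        return n

    def write(self, name, chunks):
        """Put the data in fresh blocks; log only where the new data is."""
        new = [self._alloc(c) for c in chunks]
        self.log.append((name, new))

    def commit(self):
        """Repoint each file at its new blocks.

        The old blocks are no longer referenced, but nothing erases them.
        """
        for name, new in self.log:
            self.files[name] = new
        self.log.clear()

    def rollback(self):
        """Wipe the blocks written by uncommitted transactions."""
        for _, new in self.log:
            for n in new:
                del self.blocks[n]
        self.log.clear()

    def read(self, name):
        return b"".join(self.blocks[n] for n in self.files[name])


vol = ToyVolume()
vol.write("secret.txt", [b"password"])
vol.commit()
vol.write("secret.txt", [b"\0" * 8])   # the "secure" overwrite
vol.commit()

# Blocks that still hold data but are not part of the file any more.
stale = [d for n, d in vol.blocks.items() if n not in vol.files["secret.txt"]]
```

In this model the file reads back as nulls after the second commit, yet
`stale` still contains the original bytes, which is the problem for anyone
trying to delete data securely by overwriting it.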
The other reason why I'm sure overwriting an existing file must allocate
new disc blocks is that NTFS supports compression on files. If you start
off with a compressed file containing essentially random data and overwrite
it with repeated data (e.g. nulls), it will occupy less disc space, so the
new data cannot simply be sitting in the old file's blocks, one for one.
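The size difference is easy to see with any general-purpose compressor.
The sketch below uses Python's `zlib` purely as a stand-in; it is not the
algorithm NTFS uses, and the 64 KiB buffer size is an arbitrary choice.

```python
import os
import zlib

# Same length of input, very different compressibility.
random_data = os.urandom(64 * 1024)
null_data = b"\0" * len(random_data)

random_compressed = len(zlib.compress(random_data))
null_compressed = len(zlib.compress(null_data))

print("random:", random_compressed, "bytes after compression")
print("nulls: ", null_compressed, "bytes after compression")
```

The random buffer barely shrinks, if at all, while the run of nulls
collapses to a tiny fraction of its original size.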