writing large files quickly

rbt rbt at athop1.ath.vt.edu
Fri Jan 27 16:18:46 EST 2006


Grant Edwards wrote:
> On 2006-01-27, rbt <rbt at athop1.ath.vt.edu> wrote:
> 
> 
>>Hmmm... when I copy the file to a different drive, it takes up 
>>409,600,000 bytes. Also, an md5 checksum on the generated file and on 
>>copies placed on other drives are the same. It looks like a regular, big 
>>file... I don't get it.
> 
> 
> Because the filesystem code keeps track of where you are in
> that 400MB stream, and returns 0x00 any time you're reading
> from a "hole".  The "cp" and "md5sum" programs just open the
> file and start read()ing.  The filesystem code returns 0x00
> bytes for all of the read positions that are in the "hole",
> just like Don said:

OK I finally get it. It's too good to be true :)
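
For the archives, here's a minimal sketch of the effect Grant 
describes, assuming a filesystem that supports holes (ext2/3, NTFS, 
...); the filename and size are illustrative:

    import os

    SIZE = 409600000  # the same 400MB figure as above

    # Seek far past EOF and write a single byte.  The skipped
    # range is never allocated on disk -- it is a "hole".
    with open('sparse.bin', 'wb') as f:
        f.seek(SIZE - 1)
        f.write(b'\x00')

    # Any reader (cp, md5sum, plain read()) sees NUL bytes in
    # the hole, which is why the copies and checksums agree.
    with open('sparse.bin', 'rb') as f:
        assert f.read(4096) == b'\x00' * 4096

    print(os.path.getsize('sparse.bin'))  # 409600000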

I'm going back to using _real_ files... not files that just look as if 
they're there when they aren't. BTW, the file 'size' and 'size on disk' 
were identical on Win 2003. That's a bit deceptive. According to the 
NTFS docs, they should be drastically different... 'size on disk' 
should be something like 64K.
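
On a POSIX system the allocation can be checked directly; a sketch 
(st_blocks counts 512-byte units and isn't available on Windows, where 
Explorer's 'size' vs 'size on disk' is the equivalent check):

    import os

    st = os.stat('sparse.bin')        # file from the sketch above
    logical = st.st_size              # what read() will deliver
    on_disk = st.st_blocks * 512      # bytes actually allocated
    print('size:', logical, 'size on disk:', on_disk)
    # A genuine hole makes on_disk tiny; if the two numbers
    # match, every byte really is allocated.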

> 
> 
>>>The blocks that were never written are virtual blocks,
>>>inasmuch as read() at that location will cause the filesystem
>>>to return a block of NULs.


