writing large files quickly

Grant Edwards grante at visi.com
Fri Jan 27 16:07:40 EST 2006


On 2006-01-27, rbt <rbt at athop1.ath.vt.edu> wrote:

>>>>>    fd.write('0')
>>>
>>>[cut]
>>>
>>>>f = file('large_file.bin','wb')
>>>>f.seek(409600000-1)
>>>>f.write('\x00')
>>>
>>>While a mindblowingly simple/elegant/fast solution (kudos!), the 
>>>OP's file ends up full of the character zero (ASCII 0x30), 
>>>while your solution ends up full of the NUL character (ASCII 0x00):
>> 
>> Oops.  I missed the fact that he was writing 0x30 and not 0x00.
>> 
>> Yes, the "hole" in the file will read as 0x00 bytes.  If the OP
>> actually requires that the file contain something other than
>> 0x00 bytes, then my solution won't work.
>
> Won't work!? It's absolutely fabulous! I just need something big, quick 
> and zeros work great.

Then Bob's your uncle, eh?

> How the heck does that make a 400 MB file that fast?

Most of the file isn't really there.  It's just a big "hole" in
a sparse file, plus a single allocation block that holds the
one 0x00 byte that was written:

$ ls -l large_file.bin 
-rw-r--r--  1 grante users 409600000 Jan 27 15:02 large_file.bin
$ du -h large_file.bin
12K     large_file.bin

The filesystem code in the OS is written so that it returns
0x00 bytes when you attempt to read data from the "hole" in
the file.  So, if you open the file and start reading, you'll
get roughly 400MB of 0x00 bytes before you get an EOF return.
But the file really only takes up a couple of "chunks" of disk
space (the 12K that du reports above), and chunks are usually
on the order of 4KB.
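
You can check the same thing from Python itself.  What follows is
only a minimal sketch: it assumes a filesystem that supports sparse
files and a platform where os.stat() reports st_blocks (Unix-like
systems, not Windows).  The helper names make_sparse_file, sizes and
read_start are made up for illustration, and it uses Python 3's
open() rather than the file() builtin quoted above:

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file of `size` bytes by seeking to the last byte and writing one NUL."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\x00")


def sizes(path):
    """Return (logical size, bytes allocated on disk), like ls -l versus du.

    The second value is None where st_blocks is unavailable (e.g. Windows).
    """
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # counted in 512-byte units
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


def read_start(path, n):
    """Read the first n bytes, which lie inside the hole."""
    with open(path, "rb") as f:
        return f.read(n)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "large_file.bin")
        make_sparse_file(path, 409600000)
        logical, allocated = sizes(path)
        print("logical size:", logical)
        print("allocated   :", allocated)
        print("hole is NULs:", read_start(path, 4096) == b"\x00" * 4096)
```

On a filesystem without sparse-file support the allocated figure
will come out close to the logical size instead of a few KB, and
creating the file will be correspondingly slower.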

-- 
Grant Edwards                   grante             Yow!  Is this an out-take
                                  at               from the "BRADY BUNCH"?
                               visi.com            


