File holes in Linux
Grant Edwards
invalid at invalid.invalid
Wed Sep 29 16:38:14 EDT 2010
On 2010-09-29, Ned Deily <nad at acm.org> wrote:
> In article <AANLkTinPUYzL5LaQBV-B3BUX6OzYd6+UMPXRptqH7Wcz at mail.gmail.com>,
> Tom Potts <karaken12 at gmail.com> wrote:
>> Hi, all. I'm not sure if this is a bug report, a feature request or what,
>> so I'm posting it here first to see what people make of it. I was copying
>> over a large number of files using shutil, and I noticed that the final
>> files were taking up a lot more space than the originals; a bit more
>> investigation showed that files with a positive nominal filesize which
>> originally took up 0 blocks were now taking up the full amount. It seems
>> that Python does not write back file holes as it should; here is a simple
>> program to illustrate:
>> data = '\0' * 1000000
>> file = open('filehole.test', 'wb')
>> file.write(data)
>> file.close()
>> A quick `ls -sl filehole.test' will show that the created file actually
>> takes up about 980k, rather than the 0 bytes expected.
>
> I would expect the file size to be 980k in that case. AFAIK, simply
> writing null bytes doesn't automatically create a sparse file on Unix-y
> systems.
Correct. As Ned says, you create holes by seeking past the end of the
file before writing data, not by writing 0x00 bytes. Here's a
demonstration:
Writing 0x00 values:
$ dd if=/dev/zero of=foo1 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0315967 s, 332 MB/s
$ ls -l foo1
-rw-r--r-- 1 grante users 10485760 Sep 29 15:32 foo1
$ du -s foo1
10256 foo1
Seeking past the end, then writing a single byte:
$ dd if=/dev/zero of=foo2 bs=1 count=1 seek=10485759
1+0 records in
1+0 records out
1 byte (1 B) copied, 8.3075e-05 s, 12.0 kB/s
$ ls -l foo2
-rw-r--r-- 1 grante users 10485760 Sep 29 15:35 foo2
$ du -s foo2
16 foo2
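The same contrast can be reproduced from Python. This is a minimal sketch
(Python 3 byte-string syntax; the filenames are made up to mirror the dd
demo above): seek() past the end of an empty file and the skipped region
becomes a hole, while write() of zero bytes allocates real blocks.

```python
import os

# Writing ten megabytes of 0x00 bytes allocates real disk blocks:
with open('foo1.test', 'wb') as f:
    f.write(b'\0' * 10485760)

# Seeking past the end and writing one byte leaves a hole instead:
with open('foo2.test', 'wb') as f:
    f.seek(10485759)
    f.write(b'\0')

for name in ('foo1.test', 'foo2.test'):
    st = os.stat(name)
    # st_size is the nominal length; st_blocks counts 512-byte
    # units actually allocated on disk.
    print(name, st.st_size, st.st_blocks * 512)
```

On a filesystem that supports sparse files, both report the same
st_size, but foo2.test's allocated size stays tiny, just as du showed
for the dd version.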
--
Grant Edwards               grant.b.edwards        Yow! Please come home with
                              at                   me ... I have Tylenol!!
                              gmail.com