[New-bugs-announce] [issue39910] os.ftruncate on Windows should be sparse

Mingye Wang report at bugs.python.org
Mon Mar 9 07:19:49 EDT 2020


New submission from Mingye Wang <arthur200126 at gmail.com>:

Consider this interaction:

cmd> echo > 1.txt
cmd> python -c "__import__('os').truncate('1.txt', 1024 ** 3)"
cmd> fsutil sparse queryFlag 1.txt

Not only takes a long time as is typical for a zero-write, but also reports non-sparse as an actual write would suggest. This is because internally, _chsize_s and friends enlarges files using a loop.[1]
  [1]: https://github.com/leelwh/clib/blob/master/c/chsize.c

On Unix systems, ftruncate for enlarging is described as "... as if the extra space is zero-filled", but this is not to be taken literally. In practice, sparse files are used whenever available (GNU dd expects that) and people do expect the operation to be very fast without a lot of real writes. A FreeBSD bug exists around how ftruncate is too slow on UFS.

The aria2 downloader gives a good example of how to truncate into a sparse file on Windows.[2] First a FSCTL_SET_SPARSE control is issued, and then a seek + SetEndOfFile would finish the job. Of course, a lseek to the end would be required to first determine the size of the file, so we know whether we are enlarging (sparse) or shrinking (don't sparse).
  [2]: https://github.com/aria2/aria2/blob/master/src/AbstractDiskWriter.cc#L507

----------
components: Library (Lib)
messages: 363717
nosy: Artoria2e5, steve.dower
priority: normal
severity: normal
status: open
title: os.ftruncate on Windows should be sparse
versions: Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39910>
_______________________________________


More information about the New-bugs-announce mailing list