[Python-3000] Weird truncate() behavior.

Alexandre Vassalotti alexandre at peadrop.com
Thu Dec 27 08:43:27 CET 2007


I have been working lately on a small semantic change to the truncate()
method of file objects -- i.e., making file.truncate(pos) imply a seek
to the given position. I thought I had it all working, but I found out
that one test was failing -- testTruncateOnWindows in test_file. I tried
to fix it myself, but failed miserably. So I am thinking that maybe
someone with a fresh look at the problem could help me.
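
To make the intended semantics concrete, here is a minimal pure-Python
sketch of what "truncate(pos) implies a seek to pos" means from the
caller's point of view. It is only an illustration, not the C change
itself: the helper name truncate_and_seek is hypothetical, and the
sketch emulates the behavior with an explicit seek() before truncate()
on a scratch file.

```python
import os
import tempfile


def truncate_and_seek(f, pos):
    """Hypothetical helper: truncate f to pos bytes and leave the
    file position at pos, emulating the proposed semantics."""
    f.seek(pos)        # move the file position to pos
    f.truncate(pos)    # cut the file at that offset
    return f.tell()    # position reported after the truncation


# Create a 10-byte scratch file (stand-in for "@test").
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1234567890")

with open(path, "rb+") as f:
    f.read(4)
    new_pos = truncate_and_seek(f, 4)

size = os.path.getsize(path)
os.remove(path)
print(new_pos, size)
```

With these semantics, both the reported position and the file size
should equal the argument passed to the helper.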

By applying the following (simplified) patch:


--- Modules/_fileio.c   (revision 59594)
+++ Modules/_fileio.c   (working copy)
@@ -635,7 +635,8 @@
                   return NULL;
         }
         else {
-               Py_INCREF(posobj);
+               /* Move to the position to be truncated. */
+               posobj = portable_lseek(fd, posobj, 0);
         }

 #if !defined(HAVE_LARGEFILE_SUPPORT)


I get the desired truncate() behavior. However, this causes truncate()
to fail in a weird but predictable manner, as shown by the following
example:

f = open("@test","wb")
f.write(b"1234567890")
f.close()
f = open("@test","rb+")
f.read(4)
f.truncate(4)
print(f.tell())  # should print 4, but prints -2 instead (?!)
f.seek(0, 1)     # this shouldn't change the file position
print(f.tell())  # prints 4 (?!)

It is worth noting that calling write() while tell() returns -2
raises the following exception:

>>> f.write(b"hello")
Traceback (most recent call last):
  ...
IOError: [Errno 22] Invalid argument

The thing that I find really weird is that the example translates to
the following library function calls, which are correct (except for
the last write() call):

f.truncate(4):
  lseek64(3, 4, 0, 0, 0xb7da9814)                  = 4
  ftruncate64(3, 4, 0, 0x80aa6c6, 0xb7da9814)      = 0
print(f.tell()):
  lseek64(3, 0, 0, 1, 0x82ea99c)                   = 4
  write(1, "-2\n", 3)                              = 3

[The original output of ltrace was much larger; I just show the bare minimum]
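
The same sequence of calls can be replayed directly on a file
descriptor, bypassing the file object entirely. The following is a
rough sketch that assumes os.lseek() and os.ftruncate() map onto the
lseek64()/ftruncate64() calls seen in the trace; the scratch file and
variable names are made up for the illustration.

```python
import os
import tempfile

# Scratch file with the same 10 bytes as in the example.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"1234567890")

    # f.truncate(4): seek to offset 4, then truncate at 4.
    pos_after_seek = os.lseek(fd, 4, os.SEEK_SET)
    os.ftruncate(fd, 4)

    # f.tell(): query the current offset without moving.
    pos_after_truncate = os.lseek(fd, 0, os.SEEK_CUR)

    size = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.remove(path)

print(pos_after_seek, pos_after_truncate, size)
```

At this level the descriptor's offset is 4 after the truncation, which
agrees with the return values in the trace; the -2 only appears in
what tell() reports through the file object.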

There are only three things that come to my mind that could explain
this weird behavior (in order of likelihood):
  1. My code is utter nonsense.
  2. I am relying on some undefined behavior.
  3. There is a bug in the C library.

Thanks,
-- Alexandre

