file position *tell()* works different

Tim Peters tim.one at comcast.net
Fri Sep 19 20:50:30 CEST 2003


[Tim, quoting the standard]
>> For a text stream, its file position indicator contains
>> unspecified information, usable by the fseek function for returning
>> the file position indicator for the stream to its position at the
>> time of the ftell call

[Richie Hindle]
> It still doesn't seem to work as specified:
>
> ------------------------------ peter.py ------------------------------
>
> # Open the file in text mode, read a line, and store the position.
> fp = file('test_data.txt', 'rt')
> line = fp.readline()
> storedPosition = fp.tell()
> print 'Line: %r, file pointer after read: %d' % (line, storedPosition)
>
> # Read some more and print it.
> print 'Read another line from this position: %r' % fp.readline()
>
> # Now seek back and read the same line again.
> fp.seek(storedPosition)
> print 'Another read from the same position: %r' % fp.readline()
>
> ----------------------------------------------------------------------
>
> This prints:
>
> Line: '0123456789\n', file pointer after read: 8
> Read another line from this position: '0123456789\n'
> Another read from the same position: '89\n'
>
> I'd expect doing readline/tell/readline/seek/readline to read the same
> line the second two times.  And however you implement tell and seek, a
> tell value of 8 after reading 11 bytes looks pretty weird.

I don't know what you're doing -- there's not enough info here.  I *assume*
this is Peter's original test data but with \r\n line ends replaced by \r.
If so, that's not a legitimate text file on Windows, so all bets are off if
you try to open it in text mode.  You really have to read the C standard to
appreciate just *how* feeble a text-mode file is across platforms -- you
have to adhere to some very restrictive rules.  In particular, the effect of
any I/O operation on a file opened in text mode is undefined if the file
contains an \r (or any other control character apart from space, tab, and
newline; or any "high bit" character; or the DEL character; even if all
other rules are met, the number of trailing spaces you read need not be
equal to the number of trailing spaces on the line you wrote; then read
the std for the obscure rules <wink>).  Microsoft defines what happens when
the pair \r\n is used to terminate lines in a text-mode file, but isn't
required to (and doesn't) define anything about what happens if an \r
appears by itself.

> I'd write the same code in C if I had the time, so at least we could
> be *sure* we can blame Microsoft.  8-)

Microsoft's fgets() does the same, provided you feed it a buffer large
enough to hold the largest line in the file.  But if you're not feeding the
program a legit text-mode file on the platform you're using, you've got no
cause for complaint.  Give the test file \r\n line ends and your program
should work as you hoped on Windows (because then it's a legit text-mode
file on Windows).
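
As a rough illustration of that last point, here is a sketch of the same
readline/tell/readline/seek/readline sequence run against a file that does
have \r\n line ends.  It assumes present-day Python 3, whose io layer does its
own newline translation instead of going through C stdio, so it shows the
hoped-for behaviour rather than reproducing what the Microsoft runtime does.
The file contents and the helper name reread_after_seek are made up for the
example.

```python
import os
import tempfile


def reread_after_seek(path):
    """Do readline/tell/readline/seek/readline on a text-mode file."""
    # Text mode with default newline handling: \r\n is read back as \n.
    with open(path, 'r', encoding='ascii') as fp:
        first = fp.readline()
        # Treat the tell() value as opaque: only hand it back to seek().
        stored = fp.tell()
        second = fp.readline()
        fp.seek(stored)
        again = fp.readline()
    return first, second, again


# Hypothetical test data: two lines, both terminated by \r\n.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as out:
    out.write(b'0123456789\r\nabcdefghij\r\n')

first, second, again = reread_after_seek(path)
os.remove(path)

print('Line: %r' % first)
print('Read another line from this position: %r' % second)
print('Another read from the same position: %r' % again)
```

The stored position is never printed or compared against a byte count; per
the standard's wording quoted at the top, all a text-mode tell value is good
for is being passed back to seek.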

BTW, the 't' in your 'rt' is a Microsoft extension to C.  't' isn't
recognized on most other platforms.  The default is (alas) text mode, so 't'
isn't needed even on Windows.
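
Since text mode is the default, binary mode has to be asked for explicitly
with 'b'.  A small sketch of the difference, again assuming Python 3 and
made-up file contents (binary mode is not something proposed in the messages
above): no line-end translation is done, so the \r\n pairs show up in the
data and tell() is a plain byte offset.

```python
import os
import tempfile

# Hypothetical test data, written as raw bytes with \r\n line ends.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as out:
    out.write(b'0123456789\r\nabcdefghij\r\n')

# 'rb' asks for binary mode; plain 'r' would give text mode.
with open(path, 'rb') as fp:
    line = fp.readline()         # includes the \r\n, untranslated
    stored_position = fp.tell()  # byte offset: 10 digits + \r + \n
    fp.seek(stored_position)
    next_line = fp.readline()

os.remove(path)
print('Line: %r, file pointer after read: %d' % (line, stored_position))
print('Next line: %r' % next_line)
```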





