file.seek misbehaving?

cheesecake vigenereNOviSPAM at leland.stanford.edu.invalid
Fri Jun 2 10:09:50 EDT 2000


I've run into a problem that is perhaps not a Python-specific
bug. I was processing some log files using Python, when I
noticed some mysterious bugs in certain files. I tracked down
the problem to the seek method misbehaving.

Some of the files I worked with were created in NT or saved in
NT using a text editor. The problem files were copied over FTP
from a Unix machine to the NT machine on which I work. None of
the files that were saved in NT (i.e., NT native files) had any
problems.

It seems that seek doesn't work correctly because it's not clear
how large the file really is. I copied all the contents of one
bad file into a new file, and the new file was over 5KB bigger
according to NT. The new file works just fine with the seek
operations. But when I read() both files into memory, the two
strings are identical, and their length matches the size of the
original file.
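The size difference described above can be reproduced in a small sketch. It assumes present-day Python 3 rather than the interpreter used for the log processing, and it assumes the extra bytes are carriage returns added to each line ending, which the post does not confirm. The helper name size_and_text and the sample contents are hypothetical.

```python
import os
import tempfile


def size_and_text(data):
    """Write raw bytes to a temp file; return (size on disk, text-mode contents)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        on_disk = os.path.getsize(path)
        # Text mode with the default newline handling maps "\r\n" to "\n".
        with open(path, "r", encoding="ascii") as f:
            text = f.read()
        return on_disk, text
    finally:
        os.remove(path)


# Hypothetical sample data: the same three lines with Unix and NT line endings.
unix_size, unix_text = size_and_text(b"one\ntwo\nthree\n")
nt_size, nt_text = size_and_text(b"one\r\ntwo\r\nthree\r\n")

print(unix_size, nt_size)    # sizes on disk differ by one byte per line
print(unix_text == nt_text)  # the text-mode strings compare equal
```

If that assumption holds, a copy that is larger on disk but identical once read() into memory is what one would expect to see.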

Could the problem be that the differences between the NT file
system and the unix file system of the other computer are not
properly handled because I'm running on NT, so the Python
interpreter figures it should assume everything is NT?

Let me summarize the problem with some code:
fd1 = open("somefile.txt", "r")  # somefile.txt is a unix file
fd2 = open("someother.txt", "r") # someother.txt is NT
lines1 = [None] * 3  # preallocate; lines1[0] = ... on an empty list raises IndexError
lines2 = [None] * 3
lines1[0] = fd1.readline()
lines2[0] = fd2.readline() # lines1[0] == lines2[0]
pos1 = fd1.tell()
pos2 = fd2.tell() # at this point pos1 == pos2, like it should be
lines1[1] = fd1.readline()
lines2[1] = fd2.readline() # lines1[1] == lines2[1]
fd1.seek(pos1)
fd2.seek(pos2) # pos1 == pos2 (still)
lines1[2] = fd1.readline()
lines2[2] = fd2.readline() # lines1[2] != lines2[2] !!!!?

What gives here? That's like having file.seek(file.tell())
moving you somewhere else.
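One thing that may be worth trying, assuming the cause is newline translation in text mode (the post does not establish this), is to open the files in binary mode, where tell() and seek() deal in plain byte offsets. The sketch below uses present-day Python 3; the helper name reread_second_line and the sample data are hypothetical.

```python
import os
import tempfile


def reread_second_line(data):
    """Read line 1, remember the offset, read line 2, seek back, read it again."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # "rb" disables newline translation, so offsets are raw byte counts.
        with open(path, "rb") as f:
            f.readline()
            pos = f.tell()
            first_read = f.readline()
            f.seek(pos)
            second_read = f.readline()
        return pos, first_read, second_read
    finally:
        os.remove(path)


# Hypothetical sample data with Unix and NT line endings.
unix_result = reread_second_line(b"one\ntwo\nthree\n")
nt_result = reread_second_line(b"one\r\ntwo\r\nthree\r\n")
print(unix_result)
print(nt_result)
```

In binary mode the lines keep their original terminators, so any "\r" has to be stripped by hand before comparing lines from the two kinds of file.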

Anyway, I'd like to find out if anyone else has had such
problems or knows what the solution is. I don't read this
newsgroup, so please send replies to my email address at
townsend at beehive.de.

TD
