Relative seeks on string IO

Pierre Quentel pierre.quentel at gmail.com
Wed Sep 7 03:00:24 EDT 2011


>
> Please post code without non-code indents, like so:
>
Sorry about that. After the line "Example :" I indented the next
block, out of habit ;-)
>
> What system are you using? Does it have a narrow or wide unicode build?
> (IE, what is the value of sys.maxunicode?)
>
I use Windows XP Pro, version 2002, SP3. sys.maxunicode is 65535.

I see the same behaviour with 3.1.1 and with 2.7.

I don't understand why variable-sized code units would cause problems.
On text file objects, read(nb) reads nb characters, regardless of the
number of bytes used to encode them, and tell() returns a position in
the text stream just after the last (Unicode) character read.
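
A minimal sketch of that behaviour, using an in-memory UTF-8 stream
rather than a real file. The names text, raw, f and cookie are only
illustrative. Note that the io documentation describes the value
returned by tell() on a text stream as an opaque number, so the sketch
only passes it back to seek() unchanged:

```python
import io

# Two ASCII, two Arabic and two Korean characters, then two ASCII ones.
text = "ab" + "تخ" + "머니" + "cd"

# Wrap the encoded bytes in a text layer, as open(..., encoding="utf-8") would.
raw = io.BytesIO(text.encode("utf-8"))
f = io.TextIOWrapper(raw, encoding="utf-8")

first = f.read(4)                       # four characters, not four bytes
nbytes = len(first.encode("utf-8"))     # the same text takes six bytes

cookie = f.tell()                       # position just after those characters
rest = f.read()                         # everything up to EOF

f.seek(cookie)                          # absolute seek back to that position
again = f.read()                        # same text as rest

print(len(first), nbytes, again == rest)
```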

As for StringIO, a wrapper around file objects can simulate correct
behaviour for relative seeks:

====================
txt = "abcdef"
txt += "تخصيص هذه الطبعة"
txt += "머니투데이"
txt += "endof file"

with open("test.txt", "w", encoding="utf-8") as out:
    out.write(txt)

fobj = open("test.txt", encoding="utf-8")
fobj.seek(3)
try:
    # a nonzero relative seek is refused on a text file object
    fobj.seek(2, 1)
except IOError:
    print('raises IOError')
fobj.close()

class _file:
    # wrapper that adds relative seeks to a text file object

    def __init__(self, file_obj):
        self.file_obj = file_obj

    def read(self, nb=None):
        if nb is None:
            return self.file_obj.read()
        return self.file_obj.read(nb)

    def seek(self, offset, whence=0):
        if whence == 0:
            self.file_obj.seek(offset)
        else:
            if whence == 2:
                # read() with no argument consumes everything up to EOF
                self.file_obj.read()
            # turn the relative seek into an absolute one
            self.file_obj.seek(self.file_obj.tell() + offset)

fobj = _file(open("test.txt",encoding="utf-8"))
fobj.seek(3)
fobj.seek(2,1)
fobj.seek(-5,2)
print(fobj.read(3))
==========================
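
The wrapper above adds the offset to the value returned by tell(). The
seeks in the example all land in the ASCII parts of the text; with
multi-byte characters in between, that value may not correspond to a
character count. A hedged alternative sketch that counts characters
itself follows. CharSeekFile is a hypothetical name, not part of the
code posted above. It assumes the wrapped file starts at position 0, and
it rewinds and re-reads on every seek, so it is slow on large files:

```python
import io


class CharSeekFile:
    """Hypothetical wrapper whose positions are counted in characters."""

    def __init__(self, file_obj):
        self.file_obj = file_obj
        self.pos = 0  # characters read since the start of the stream

    def read(self, nb=-1):
        data = self.file_obj.read(nb)
        self.pos += len(data)
        return data

    def seek(self, offset, whence=0):
        if whence == 0:
            target = offset
        elif whence == 1:
            target = self.pos + offset
        elif whence == 2:
            self.read()  # consume to EOF so self.pos is the total length
            target = self.pos + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        if target < 0:
            raise ValueError("negative seek position: %r" % (target,))
        # Rewind, then skip forward target characters.
        self.file_obj.seek(0)
        self.pos = 0
        self.read(target)
        return self.pos


txt = "abcdef" + "تخصيص" + "머니투데이" + "endof file"
stream = io.TextIOWrapper(io.BytesIO(txt.encode("utf-8")), encoding="utf-8")
fobj = CharSeekFile(stream)

fobj.seek(3)
fobj.seek(2, 1)           # now at character 5
middle = fobj.read(3)     # crosses from ASCII into the Arabic text
fobj.seek(-5, 2)          # five characters before the end
tail = fobj.read(3)
print(repr(middle), repr(tail))
```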

- Pierre


