[I18n-sig] PEP 263 and Japanese native encodings

Thu, 07 Mar 2002 12:25:30 -0500

[M.-A. Lemburg]
> I've updated the PEP to clarify this. Basically it should be
> possible to do:
>
> file = open('script.py')
> line1 = file.readline()
> line2 = file.readline()
>
> # check line1 and line2 for the RE from the PEP
>
> # push the two lines back onto the file stream or handle this
> # situation using a line buffer.
>
> Nothing complicated, really.

A complication is that so long as Python uses C stdio to read files, there's
no guarantee that "funny bytes" can be gotten from files opened in text
mode.  The inability to read chr(26) from a text-mode file on Windows is an
infamous example of that:

>>> f = open('oops', 'wb')
>>> f.write('x' * 100 + chr(26) + 'x' * 100)
>>> f.close()
>>> f = open('oops')
>>> len(f.read())  # chr(26) acts like EOF on Windows in text mode
100
>>>

OTOH, if you open in binary mode instead, you have to wrestle with the
platform's line-end conventions yourself (\r\n on Windows, \r on classic
Mac OS, \n on Unix).
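
A minimal sketch of the binary-mode route, written for a modern Python 3
rather than the interpreter of this thread. The names first_two_lines,
declared_encoding, and CODING_RE, the 1024-byte read cap, and the "ascii"
default are all assumptions for illustration; the pattern is only in the
spirit of the PEP's RE, not copied from it. It reads raw bytes, so chr(26)
survives, and splits lines by hand on all three conventions:

```python
import io
import re

# Assumption: a pattern in the spirit of the PEP's coding declaration,
# not necessarily the exact RE the PEP specifies.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def first_two_lines(stream):
    """Return up to two lines from a binary stream.

    Lines are split by hand on \\r\\n, \\r, or \\n, since binary mode
    does no line-end translation for us.
    """
    head = stream.read(1024)  # hypothetical cap on how far to look
    return re.split(rb"\r\n|\r|\n", head, maxsplit=2)[:2]


def declared_encoding(stream, default="ascii"):
    """Return the encoding named in a comment on line 1 or 2, else default."""
    for line in first_two_lines(stream):
        if line.lstrip().startswith(b"#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1).decode("ascii")
    return default


if __name__ == "__main__":
    src = b"#!/usr/bin/python\r\n# -*- coding: euc-jp -*-\r\nx = 1\x1ay\r\n"
    print(declared_encoding(io.BytesIO(src)))
```

Reading a fixed-size chunk sidesteps the "push the lines back" problem only
if the caller rewinds or re-opens the stream afterward; a real
implementation would need to decide how to hand the consumed bytes back.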

the-devil-is-in-the-details-ly y'rs  - tim