[I18n-sig] PEP 263 and Japanese native encodings

M.-A. Lemburg mal@lemburg.com
Thu, 07 Mar 2002 19:09:58 +0100


Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > I've updated the PEP to clarify this. Basically it should be
> > possible to do:
> >
> > file = open('script.py')
> > line1 = file.readline()
> > line2 = file.readline()
> >
> > # check line1 and line2 for the RE from the PEP
> >
> > # push the two lines back onto the file stream or handle this
> > # situation using a line buffer.
> >
> > Nothing complicated, really.
> 
> A complication is that so long as Python uses C stdio to read files, there's
> no guarantee that "funny bytes" can be gotten from files opened in text
> mode.  The inability to read chr(26) from a text-mode file on Windows is an
> infamous example of that:
> 
> >>> f = open('oops', 'wb')
> >>> f.write('x' * 100 + chr(26) + 'x' * 100)
> >>> f.close()
> >>> f = open('oops')
> >>> len(f.read())  # chr(26) acts like EOF on Windows in text mode
> 100
> >>>

Pass that string to a teletex machine and you'll get the same
result... Hmm, this should tell us something ;-)
 
> OTOH, if you open in binary mode instead, you have to wrestle with the
> platform's line-end conventions.

Martin's patch leaves these "minor" issues to the tokenizer 
and that's good :-) 

I only wanted to give a very simple example of the original idea
behind adding "ASCII compatible encoding" to the PEP: it keeps
the parsing of the coding declaration simple, because the first
two lines can be scanned as plain bytes before the encoding is
known.
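
As a rough illustration of that idea (a sketch only, written for
Python 3, and not Martin's patch or the tokenizer's actual code):
open the file in binary mode so that bytes such as chr(26) survive,
read the first two lines, and match them as plain bytes. The pattern
is the one given in the published text of PEP 263. The helper name
sniff_encoding, the "ascii" default, and the use of seek() for the
push-back are assumptions made for this example.

```python
import io
import re

# Pattern from the published text of PEP 263, applied to raw bytes.
# This only works because the encoding is assumed ASCII compatible.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def sniff_encoding(stream, default="ascii"):
    """Return the encoding declared in the first two lines of a
    seekable binary stream, or `default` if none is declared.

    Hypothetical helper for illustration. It ignores BOMs and does
    not check that the first line is blank or a comment before it
    looks at the second line.
    """
    start = stream.tell()
    # In binary mode readline() splits on b"\n" only, so a b"\r\n"
    # line end just leaves a trailing b"\r" that the pattern ignores.
    lines = [stream.readline(), stream.readline()]
    # "Push back" the two lines by rewinding to where we started.
    stream.seek(start)
    for line in lines:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default


if __name__ == "__main__":
    source = (b"#!/usr/bin/python\r\n"
              b"# -*- coding: shift_jis -*-\r\n"
              b"x = 1\r\n")
    print(sniff_encoding(io.BytesIO(source)))
```

With a real file this would be called as
sniff_encoding(open('script.py', 'rb')); the stream is left
positioned at the start, so the tokenizer can read it again.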

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/