[Python-Dev] PEP 263 considered faulty (for some Japanese)

Martin v. Loewis martin@v.loewis.de
18 Mar 2002 16:47:14 +0100


"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> I don't see any need for a deviation of the implementation from the
> spec.  

You probably haven't looked at the code of the Python parser, either.

> Just slurp in the whole file in the specified encoding.  Then cast
> the Unicode characters in ordinary literal strings down to bytesize

It's not that simple. Or, perhaps, it is - but still somebody needs to
write this. I won't find the time for a stage 2 implementation anytime
soon, but I still would like to see the feature in Python 2.3.
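The "slurp the whole file in the specified encoding" approach Stephen suggests could be sketched roughly as follows. This is a hypothetical illustration of the idea, not the actual CPython tokenizer code; the cookie regex is a simplified version of what PEP 263 specifies, and the `decode_source` name is made up here:

```python
import re

# Simplified PEP 263 coding-cookie pattern; the PEP allows the
# declaration on line 1 or line 2 of the source file.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")

def decode_source(raw: bytes) -> str:
    """Decode an entire source file per its PEP 263 declaration.

    Hypothetical sketch of the "decode the whole file first,
    tokenize afterwards" strategy discussed in this thread.
    """
    encoding = "ascii"  # default when no coding cookie is present
    for line in raw.splitlines()[:2]:
        m = CODING_RE.search(line)
        if m:
            encoding = m.group(1).decode("ascii")
            break
    return raw.decode(encoding)

src = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"
print(decode_source(src))
```

After this step the tokenizer would see Unicode text throughout; the remaining (and harder) part of stage 2 is mapping ordinary string literals back down to bytes in the declared encoding.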

Even without looking at the parser code, you can come up with two
alternative implementations. Trust me that you will find more
alternatives, and more problems, once you start writing the parser.
There are a number of aspects that need to be preserved: performance
is one of them, usage of the tokenizer for pgen is another.

Regards,
Martin