[Python-Dev] Support of UTF-16 and UTF-32 source encodings
Benjamin Peterson
benjamin at python.org
Sat Nov 14 17:57:58 EST 2015
I agree that supporting UTF-16 doesn't seem terribly useful. Also, thank
you for giving the tokenizer some love!
On Sat, Nov 14, 2015, at 11:19, Serhiy Storchaka wrote:
> For now UTF-16 and UTF-32 source encodings are not supported. There is a
> comment in Parser/tokenizer.c:
>
> /* Disable support for UTF-16 BOMs until a decision
> is made whether this needs to be supported. */
>
> Can we decide whether this support will be added in the foreseeable
> future (say, in the next 10 years) or not?
>
> Removing the commented-out and related code will make it easier to
> refactor the tokenizer, which in turn can help fix some existing bugs
> (e.g. issue14811, issue18961, issue20115, and maybe others). The current
> tokenizing code is too tangled.
>
> If support for UTF-16 and UTF-32 is planned, I'll take this into
> account during refactoring. But many places besides the tokenizer
> expect source files to be in an ASCII-compatible encoding.
>
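For illustration only: the snippet below is a Python sketch, not the C code
in Parser/tokenizer.c. It shows roughly what sniffing a BOM involves and why
UTF-16 and UTF-32 do not count as ASCII-compatible. The names sniff_bom and
is_ascii_compatible are hypothetical, made up for this example.

```python
import codecs

# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding name implied by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def is_ascii_compatible(encoding, sample="# coding: x"):
    """True if ASCII text encodes to the same bytes as it does in ASCII."""
    return sample.encode(encoding) == sample.encode("ascii")
```

With these definitions, is_ascii_compatible("utf-8") holds while
is_ascii_compatible("utf-16-le") does not, since every ASCII character gains
a NUL byte. Code that scans raw source bytes for a newline or a coding
cookie therefore cannot work unchanged on UTF-16 or UTF-32 input.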