[Python-Dev] Support of UTF-16 and UTF-32 source encodings

Benjamin Peterson benjamin at python.org
Sat Nov 14 17:57:58 EST 2015


I agree that supporting UTF-16 doesn't seem terribly useful. Also, thank
you for giving the tokenizer some love!

On Sat, Nov 14, 2015, at 11:19, Serhiy Storchaka wrote:
> For now UTF-16 and UTF-32 source encodings are not supported. There is a 
> comment in Parser/tokenizer.c:
> 
>      /* Disable support for UTF-16 BOMs until a decision
>         is made whether this needs to be supported.  */
> 
> Can we make a decision on whether this support will be added in the 
> foreseeable future (say, within the next 10 years) or not?
> 
> Removing the commented-out and related code will help to refactor the 
> tokenizer, and that can help to fix some existing bugs (e.g. issue14811, 
> issue18961, issue20115 and maybe others). The current tokenizing code is 
> too tangled.
> 
> If support for UTF-16 and UTF-32 is planned, I'll take this into 
> account during refactoring. But in many places besides the tokenizer, 
> an ASCII-compatible encoding of source files is expected.

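To illustrate the point about ASCII-compatible encodings, here is a rough Python sketch, not the tokenizer's actual code. It assumes the coding-declaration pattern from PEP 263 is applied to raw bytes; the helper names sniff_bom and find_cookie are made up for this example. The same declaration is found in a Latin-1 file but not in a UTF-16 one, because the BOM and the interleaved NUL bytes keep the pattern from matching.

```python
import codecs
import re

# Coding-declaration pattern in the style of PEP 263, applied to raw bytes.
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")

# UTF-32 BOMs are listed first because the UTF-32-LE BOM begins with
# the same two bytes as the UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None.

    Hypothetical helper for illustration only.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def find_cookie(first_line):
    """Return the declared encoding found in a raw first line, else None.

    Hypothetical helper for illustration only.
    """
    match = COOKIE_RE.match(first_line)
    return match.group(1).decode("ascii") if match else None


source = "# -*- coding: latin-1 -*-\nx = 1\n"

# ASCII-compatible encoding: the declaration is visible in the raw bytes.
latin1_bytes = source.encode("latin-1")
latin1_cookie = find_cookie(latin1_bytes.split(b"\n")[0])

# UTF-16: a BOM is present, but the declaration is no longer a plain
# ASCII byte sequence, so the byte-level search comes back empty.
utf16_bytes = codecs.BOM_UTF16_LE + source.encode("utf-16-le")
utf16_bom = sniff_bom(utf16_bytes)
utf16_cookie = find_cookie(utf16_bytes.split(b"\n")[0])
```

Any code that scans the first lines of a source file as bytes in this way would need a separate path for UTF-16 and UTF-32 input.
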
