On Thu, Aug 26, 2021 at 1:54 AM Petr Viktorin <encukou@gmail.com> wrote:

On 26. 08. 21 9:54, Marc-Andre Lemburg wrote:
> On 26.08.2021 06:07, Christopher Barker wrote:
>> I'm working on a PR now. It seems there is little support for keeping the
>> Python 2 content in the docs, so I'm re-writing it as though it was never there.
>> If someone wants to add a note about Python 2, of course that can be added later.
>>
>> Note that "moving the Python 2 content to a section at the end" is not all that
>> straightforward, as it is pretty mixed in with the text at this point.
>>
>> But now a question -- the current text reads:
>>
>> "Code in the core Python distribution should always use UTF-8"
>>
>> and then:
>>
>> "In the standard library, non-default encodings should be used only for
>> test purposes or when a comment or docstring needs to mention an author
>> name that contains non-ASCII characters ..."
>>
>> I *think* that's a remnant of the Py2 ASCII encoding days -- but I wanted to
>> make sure. A bit later on, it says:
>>
>> "The following policy is prescribed for the
>> standard library ... In addition, string literals and comments must also be in
>> ASCII."
>
> For Python 2 code we mandated ASCII for the stdlib, with some exceptions
> using the source code encoding for testing purposes or in case e.g.
> Martin von Löwis or Marc-André Lemburg wanted to put his name into the code
> without escaping part of it ;-)
>
> Note that Python 2 defaults to ASCII as source code encoding.
>
> With UTF-8 as standard source code encoding, this is no longer
> necessary.
>
> So the second quote can be changed to "In the standard library, non-default
> source code encodings should be used only for test purposes ...".
>
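A minimal sketch of the default-encoding point above, in case it helps. The helper name `declared_encoding` and the sample `author` variable are made up for illustration; `tokenize.detect_encoding()` and `compile()` on bytes are the real stdlib calls. It shows that a file without a declaration is read as UTF-8, and that a declaration is only needed when the bytes are something else:

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding Python would assume for these bytes.

    Hypothetical helper: a thin wrapper around tokenize.detect_encoding().
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# No declaration: Python 3 assumes UTF-8, so the name can be written as-is.
utf8_source = 'author = "Marc-André Lemburg"\n'.encode("utf-8")

# Explicit declaration: only needed when the file is NOT UTF-8,
# e.g. a test file that deliberately uses another encoding.
latin1_source = (
    "# -*- coding: latin-1 -*-\n"
    'author = "Marc-André Lemburg"\n'
).encode("latin-1")

for label, source in (("utf8", utf8_source), ("latin1", latin1_source)):
    namespace = {}
    # compile() accepts bytes and honours the declaration (or the UTF-8 default).
    exec(compile(source, f"<{label}_example>", "exec"), namespace)
    print(label, declared_encoding(source), namespace["author"])
```

Both variants end up with the same string; `tokenize` reports the declared encoding under its normalized name, `iso-8859-1`.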
>> Is that still correct for string literals and comments? And what about docstrings?
>>
>> It seems to me that if we really are utf-8, then there is no need for those
>> "textual" elements to be ASCII. e.g. they can still contain non-ascii characters,
>> and escaping those makes things less readable, not more.
>>
>> So I think that section should now read:
>>
>> """
>> Source File Encoding
>> --------------------
>>
>> Code in the core Python distribution should always use UTF-8, and should not
>> have an encoding declaration.
>>
>> In the standard library, non-UTF-8 encodings should be used only for
>> test purposes.
>
> I think the above should be limited to Python code. In C or other
> source files you may well still need a source code encoding.
>
>> The following policy is prescribed for the standard library (see PEP
>> 3131): All identifiers in the Python standard library MUST use
>> ASCII-only identifiers, and SHOULD use English words wherever feasible
>> (in many cases, abbreviations and technical terms are used which aren't
>> English). In comments and docstrings, authors whose names are not
>> based on the Latin alphabet (latin-1, ISO/IEC 8859-1 character set)
>> MUST provide a transliteration of their names in this character set.
>>
>> Open source projects with a global audience are encouraged to adopt a
>> similar policy.
>> """
>>
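For what it's worth, the "ASCII-only identifiers" part of the proposed text is easy to check mechanically. The following is only a rough sketch, not anything the stdlib or CI actually runs as far as I know; the function name `non_ascii_identifiers` and the sample source are invented for illustration. It walks the AST and reports identifiers that are not ASCII, while leaving string literals and comments alone:

```python
import ast


def non_ascii_identifiers(source):
    """Return the sorted non-ASCII identifiers found in Python source text.

    Hypothetical helper: looks at the AST fields that hold identifier
    strings (variable, function/class, argument and attribute names).
    String literals and comments are deliberately ignored.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        for field in ("id", "name", "arg", "attr"):
            value = getattr(node, field, None)
            if isinstance(value, str) and not value.isascii():
                found.add(value)
    return sorted(found)


sample = (
    'größe = 1\n'
    'author = "Martin von Löwis"  # non-ASCII in a literal or comment is fine\n'
)
print(non_ascii_identifiers(sample))
```

Only the identifier `größe` is reported for the sample; the author name in the string literal passes, which matches the proposed wording.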
>> But maybe we do want to keep comments, docstrings and literals as ASCII with
>> escapes?
>
> No need for the stdlib, since UTF-8 is widely accepted by now
> and why should people with non-ASCII names not be able to write
> their true name ?
>
> You may have noted that I rarely do... the reason is that in the
> past, the accent on the "e" caused me too many problems. Perhaps
> one of these days, I'll go back to adding it again :-)

I would drop the weirdly specific "(latin-1, ISO/IEC 8859-1 character
set)" note, and only keep "based on the Latin alphabet".
The Ł in Łukasz's name is not in latin-1, and I don't think it needs
different treatment than German or French names. (As opposed to a
Russian or Chinese name, where an average English speaker isn't able
to type an approximation of the name on their keyboard.)
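To make the latin-1 point concrete, here is a small sketch. The helper names `fits_latin1` and `uses_latin_letters` are made up, and the second one is only a heuristic based on Unicode character names, not an official definition of "based on the Latin alphabet". The Cyrillic sample is just an illustrative spelling:

```python
import unicodedata


def fits_latin1(name):
    """True if every character of the name exists in latin-1 (ISO/IEC 8859-1)."""
    try:
        name.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


def uses_latin_letters(name):
    """Heuristic: True if every letter's Unicode name starts with 'LATIN'."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in name
        if ch.isalpha()
    )


for name in ("Martin von Löwis", "Marc-André Lemburg", "Łukasz", "Peťa", "Пётр"):
    print(name, fits_latin1(name), uses_latin_letters(name))
```

"Łukasz" and "Peťa" fail the latin-1 test but are clearly Latin-alphabet names, which is why the character-set reference seems like the wrong criterion.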

- Peťa Viktorin

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/E6B6INCC5IH5477XF5BGXPC3GPIEER5R/