[Python-checkins] r75001 - python/branches/release31-maint/Doc/howto/unicode.rst

Tue Sep 22 00:24:07 CEST 2009

On Mon, 21 Sep 2009 at 22:12, Senthil Kumaran wrote:

> On Mon, Sep 21, 2009 at 11:10:25AM -0400, Jim Jewett wrote:
>> I think this was the wrong fix; the point was to show that some
>> encodings aren't "complete".  ASCII wasn't, but I think UTF-8 actually
>> is.
>>
>> Maybe change from
>>
>>>  Encodings don't have to handle every possible Unicode character, and most
>> to:
>>
>> Encodings don't have to handle every possible Unicode character, and
>> most encodings don't.  Most notably, the 'ASCII' encoding doesn't.
>> (Python's default encoding has been changed to 'UTF-8', which does
>> handle all valid Unicode characters.)
>
> +1 on this change in wording.  No need for ()s IMO.
>
> I was curious to know why the previous sentence was removed rather than
> mentioning that the default encoding has changed from ASCII to UTF-8.

UTF-8 isn't introduced until later in the section.  At that point one
could mention that Python uses that as its default encoding, but what
exactly does that mean?  In the context of a howto, that would seem to
need explanation.  One explanation is provided in the subsequent section:
the "Unicode Literals in Python Source Code" subsection explains that
the default source encoding is UTF-8 and what that means.
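For reference, the point Jim's proposed wording makes (ASCII is not
"complete", UTF-8 is) can be sketched roughly as below. The sample string
and the can_encode helper are made up for illustration; they are not taken
from the howto.

```python
# Illustrative sketch: ASCII cannot represent every Unicode character,
# while UTF-8 can represent every valid one.
text = "caf\u00e9 \u20ac"  # hypothetical sample with two non-ASCII characters


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


ascii_ok = can_encode(text, "ascii")    # ASCII only covers U+0000..U+007F
utf8_ok = can_encode(text, "utf-8")     # UTF-8 covers all valid code points
roundtrip = text.encode("utf-8").decode("utf-8")
```

Note that "valid" matters here: a lone surrogate such as "\udc80" is
rejected by the strict UTF-8 codec as well.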

Then there's sys.getdefaultencoding(), which returns 'utf-8' on my python3,
but isn't mentioned in the howto.
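A quick way to see this, assuming a Python 3 interpreter (the variable
names are only for illustration):

```python
import sys

# The default string encoding reported by the interpreter.
default = sys.getdefaultencoding()

# str.encode() with no argument uses UTF-8 in Python 3.
encoded = "\u00e9".encode()

# Decoding the result explicitly as UTF-8 gives the original string back.
decoded = encoded.decode("utf-8")
```

This is a separate notion from the source file encoding discussed in the
"Unicode Literals in Python Source Code" subsection, which is part of why
"the default encoding" would need explaining in the howto.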

So I think deleting the sentence (as was done) is good, because its
presence there was not particularly enlightening.  But the explanation
in the subsequent section may be incomplete.

--David

PS: I also notice that the section on filenames doesn't mention PEP 383
encoding...but perhaps that is beyond the scope of the howto?
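
For anyone unfamiliar with it, the PEP 383 behaviour can be sketched with
the "surrogateescape" error handler, assuming Python 3.1 or later. The
byte string below is a made-up filename, not something from the howto.

```python
# Hypothetical filename bytes in Latin-1; b"\xe9" is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# With surrogateescape, each undecodable byte becomes a lone surrogate
# in the range U+DC80..U+DCFF instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")
```

The round trip is lossless, which is what lets filenames that are not
valid in the filesystem encoding still be passed back to the OS unchanged.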