PEP 100 references & wording

I just noticed that PEP 100 (Python/Unicode integration) references
http://starship.python.net/~lemburg/unicode-proposal.txt
as the latest version. Sure enough, I visited that and found that it's newer than the PEP (1.8 v. 1.7).
Shouldn't the PEP be the most up-to-date public document? The comment right after that suggests this should be so:
[ed. note: new revisions should be made to this PEP document, while the historical record previous to version 1.7 should be retrieved from MAL's url, or Misc/unicode.txt]
Since this is now an informational PEP, I believe the wording should change to reflect functionality that has already been implemented. For instance, instead of
Python should provide a built-in constructor for Unicode strings which is available through __builtins__:
it should read
Python provides a built-in constructor for Unicode strings which is available through __builtins__:
Skip

Skip Montanaro wrote:
I just noticed that PEP 100 (Python/Unicode integration) references
http://starship.python.net/~lemburg/unicode-proposal.txt
as the latest version. Sure enough, I visited that and found that it's newer than the PEP (1.8 v. 1.7).
True. I'm not sure why the above file is 1.8 and the CVS PEP at 1.7. I guess I forgot to update the PEP.
FYI, here's a diff between the 1.7 and 1.8 versions:
--- unicode-proposal-1.7.txt Tue Oct 17 17:38:40 2000
+++ unicode-proposal.txt Tue Oct 17 17:38:40 2000
@@ -1,7 +1,7 @@
 =============================================================================
- Python Unicode Integration Proposal Version: 1.7
+ Python Unicode Integration Proposal Version: 1.8
 -----------------------------------------------------------------------------

 Introduction:
 -------------

@@ -612,11 +612,11 @@ Case Conversion:
 ----------------

 Case conversion is rather complicated with Unicode data, since there
 are many different conditions to respect. See

-    http://www.unicode.org/unicode/reports/tr13/
+    http://www.unicode.org/unicode/reports/tr21/

 for some guidelines on implementing case conversion.

 For Python, we should only implement the 1-1 conversions included in
 Unicode. Locale dependent and other special case conversions (see the
@@ -631,11 +631,15 @@ possible.
 Line Breaks:
 ------------

 Line breaking should be done for all Unicode characters having the B
 property as well as the combinations CRLF, CR, LF (interpreted in that
-order) and other special line separators defined by the standard.
+order) and other special line separators defined by the standard. See
+
+    http://www.unicode.org/unicode/reports/tr13/
+
+for some guidelines on implementing line breaks and newline handling.

 The Unicode type should provide a .splitlines() method which returns a
 list of lines according to the above specification. See Unicode
 Methods.

@@ -1010,11 +1014,11 @@ Unicode 3.0:

 Unicode-TechReports:
     http://www.unicode.org/unicode/reports/techreports.html

 Unicode-Mappings:
-    ftp://ftp.unicode.org/Public/MAPPINGS/
+    http://www.unicode.org/Public/MAPPINGS/

 Introduction to Unicode (a little outdated by still nice to read):
     http://www.nada.kth.se/i18n/ucs/unicode-iso10646-oview.html

 For comparison:
@@ -1047,10 +1051,11 @@ Encodings:

     http://www.uazone.com/multiling/unicode/wg2n1035.html

 History of this Proposal:
 -------------------------
+1.8: Fixed some URLs to the unicode.org site.
 1.7: Added note about the changed behaviour of "s#".
 1.6: Changed <defencstr> to <defenc> since this is the name used in the
      implementation. Added notes about the usage of <defenc> in the
      buffer protocol implementation.
 1.5: Added notes about setting the <default encoding>. Fixed some
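
As a rough illustration of the Line Breaks text touched by the diff above, here is a minimal sketch of how a .splitlines() method treats CRLF, CR, LF and a Unicode line separator. It assumes a current Python 3 interpreter, where the built-in str type is the Unicode string type; the proposal itself describes the separate Unicode type of its time, so this is not a reproduction of that implementation. The sample string is made up for the example.

```python
# Sketch only: Python 3 `str` stands in for the proposal's Unicode type.
# The sample text is invented for illustration.
sample = "alpha\r\nbeta\rgamma\ndelta\u2028epsilon"

# CRLF is consumed as a single break, not as CR followed by LF.
lines = sample.splitlines()
print(lines)

# keepends=True retains the terminators, which makes the CRLF pairing visible.
kept = sample.splitlines(keepends=True)
print(kept)
```

The second call is only there to show which characters were treated as terminators; the first is the behaviour the proposal text describes.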
Shouldn't the PEP be the most up-to-date public document? The comment right after that suggests this should be so:
[ed. note: new revisions should be made to this PEP document, while the historical record previous to version 1.7 should be retrieved from MAL's url, or Misc/unicode.txt]
Since this is now an informational PEP, I believe the wording should change to reflect functionality that has already been implemented. For instance, instead of
Python should provide a built-in constructor for Unicode strings which is available through __builtins__:
it should read
Python provides a built-in constructor for Unicode strings which is available through __builtins__:
True again; I just didn't find time to rewrite these bits. The PEP is basically a reformatted proposal. That's where the "should" wording originates from.

Marc, can you update PEP 100?
You might want to retire the starship URL and use the PEP URL as the official location.
--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Marc, can you update PEP 100?
You might want to retire the starship URL and use the PEP URL as the official location.
Will do, but it might take a week or two.
participants (3)
- Guido van Rossum
- M.-A. Lemburg
- Skip Montanaro