[Python-checkins] r54991 - peps/trunk/pep-0000.txt peps/trunk/pep-3120.txt peps/trunk/pep-3142.txt

martin.v.loewis python-checkins at python.org
Fri Apr 27 07:16:11 CEST 2007


Author: martin.v.loewis
Date: Fri Apr 27 07:16:10 2007
New Revision: 54991

Added:
   peps/trunk/pep-3120.txt
      - copied, changed from r54990, peps/trunk/pep-3142.txt
Removed:
   peps/trunk/pep-3142.txt
Modified:
   peps/trunk/pep-0000.txt
Log:
Rename 3142 to 3120, as that is the next available number.


Modified: peps/trunk/pep-0000.txt
==============================================================================
--- peps/trunk/pep-0000.txt	(original)
+++ peps/trunk/pep-0000.txt	Fri Apr 27 07:16:10 2007
@@ -117,8 +117,8 @@
  S  3117  Postfix Type Declarations                    Brandl
  S  3118  Revising the buffer protocol                 Oliphant, Banks
  S  3119  Introducing Abstract Base Classes            GvR, Talin
+ S  3120  Using UTF-8 as the default source encoding   von Löwis
  S  3141  A Type Hierarchy for Numbers                 Yasskin
- S  3142  Using UTF-8 as the default source encoding   von Löwis
 
  Finished PEPs (done, implemented in Subversion)
 
@@ -475,8 +475,8 @@
  S  3117  Postfix Type Declarations                    Brandl
  S  3118  Revising the buffer protocol                 Oliphant, Banks
  S  3119  Introducing Abstract Base Classes            GvR, Talin
+ S  3120  Using UTF-8 as the default source encoding   von Löwis
  S  3141  A Type Hierarchy for Numbers                 Yasskin
- S  3142  Using UTF-8 as the default source encoding   von Löwis
 
 
 Key

Copied: peps/trunk/pep-3120.txt (from r54990, peps/trunk/pep-3142.txt)
==============================================================================
--- peps/trunk/pep-3142.txt	(original)
+++ peps/trunk/pep-3120.txt	Fri Apr 27 07:16:10 2007
@@ -1,4 +1,4 @@
-PEP: 3142
+PEP: 3120
 Title: Using UTF-8 as the default source encoding
 Version: $Revision $
 Last-Modified: $Date $

Deleted: /peps/trunk/pep-3142.txt
==============================================================================
--- /peps/trunk/pep-3142.txt	Fri Apr 27 07:16:10 2007
+++ (empty file)
@@ -1,107 +0,0 @@
-PEP: 3142
-Title: Using UTF-8 as the default source encoding
-Version: $Revision $
-Last-Modified: $Date $
-Author: Martin v. Löwis <martin at v.loewis.de>
-Status: Draft
-Type: Standards Track
-Content-Type: text/x-rst
-Created: 15-Apr-2007
-Python-Version: 3.0
-Post-History:
-
-
-Specification
-=============
-
-This PEP proposes to change the default source encoding from ASCII to
-UTF-8. Support for alternative source encodings [#pep263]_ continues to
-exist; an explicit encoding declaration takes precedence over the
-default.
-
-
-A Bit of History
-================
-
-In Python 1, the source encoding was unspecified, except that the
-source encoding had to be a superset of the system's basic execution
-character set (i.e. an ASCII superset, on most systems).  The source
-encoding was only relevant for the lexis itself (bytes representing
-letters for keywords, identifiers, punctuation, line breaks, etc).
-The contents of a string literal was copied literally from the file
-on source.
-
-In Python 2.0, the source encoding changed to Latin-1 as a side effect
-of introducing Unicode. For Unicode string literals, the characters
-were still copied literally from the source file, but widened on a
-character-by-character basis. As Unicode gives a fixed interpretation
-to code points, this algorithm effectively fixed a source encoding, at
-least for files containing non-ASCII characters in Unicode literals.
-
-PEP 263 identified the problem that you can use only those Unicode
-characters in a Unicode literal which are also in Latin-1, and
-introduced a syntax for declaring the source encoding. If no source
-encoding was given, the default should be ASCII. For compatibility
-with Python 2.0 and 2.1, files were interpreted as Latin-1 for a
-transitional period. This transition ended with Python 2.5, which
-gives an error if non-ASCII characters are encountered and no source
-encoding is declared.
-
-Rationale
-=========
-
-With PEP 263, using arbitrary non-ASCII characters in a Python file is
-possible, but tedious. One has to explicitly add an encoding
-declaration. Even though some editors (like IDLE and Emacs) support
-the declarations of PEP 263, many editors still do not (and never
-will); users have to explicitly adjust the encoding which the editor
-assumes on a file-by-file basis.
-
-When the default encoding is changed to UTF-8, adding non-ASCII text
-to Python files becomes easier and more portable: On some systems,
-editors will automatically choose UTF-8 when saving text (e.g. on Unix
-systems where the locale uses UTF-8). On other systems, editors will
-guess the encoding when reading the file, and UTF-8 is easy to
-guess. Yet other editors support associating a default encoding with a
-file extension, allowing users to associate .py with UTF-8.
-
-For Python 2, an important reason for using non-UTF-8 encodings was
-that byte string literals would be in the source encoding at run-time,
-allowing then to output them to a file or render them to the user
-as-is. With Python 3, all strings will be Unicode strings, so the
-original encoding of the source will have no impact at run-time.
-
-Implementation
-==============
-
-The parser needs to be changed to accept bytes > 127 if no source
-encoding is specified; instead of giving an error, it needs to check
-that the bytes are well-formed UTF-8 (decoding is not necessary,
-as the parser converts all source code to UTF-8, anyway).
-
-IDLE needs to be changed to use UTF-8 as the default encoding.
-
-
-References
-==========
-
-.. [#pep263]
-   http://www.python.org/dev/peps/pep-0263/
-   
-
-
-Copyright
-=========
-
-This document has been placed in the public domain.
-
-
-
-..
-   Local Variables:
-   mode: indented-text
-   indent-tabs-mode: nil
-   sentence-end-double-space: t
-   fill-column: 70
-   coding: utf-8
-   End:


More information about the Python-checkins mailing list