[Python-Dev] forwarded message from Stephen J. Turnbull

Fri, 01 Mar 2002 23:28:10 +0100

"Martin v. Loewis" wrote:
>
> barry@zope.com (Barry A. Warsaw) writes:
>
> > I'm hoping that Stephen will soon be able to join the python-dev
> > discussions more directly, and I'm cc'ing him on this message.  I
> > admit to wearing the typical American sunglasses on this issue, MM2.1
> > notwithstanding.  I think Stephen's viewpoint and experience with
> > this issue is worth bringing up here.
>
> We have sorted out some of this on python-list already. Stephen
> mistook the proposal as talking about the encoding which is used in
> the Python source code. With the occasional exception of a Latin-1 "ö"
> in a comment line, I think the Python source is currently pure ASCII -
> I would have no problems with changing these few occurrences to UTF-8,
> as Stephen suggests.
>
> I'm not sure whether he still has his original position "do not allow
> multiple source encodings to enter the language", which, to me,
> translates into "source encodings are always UTF-8". If that is the
> route to take, PEP 263 should be REJECTED, in favour of only
> tightening the Python language definition to only allow UTF-8 source
> code.

-1

We had the UTF-8 discussion back when we started discussing
the Unicode integration. Let's not start it again.

Note that the PEP only *allows* using various locale-dependent
encodings for Python source. Don't take this as: hey, everybody
will start using their own home-grown ROT-13 variant now and
start having code obfuscation fests :-)

Reasonable programmers will stay reasonable programmers even
if you give them a nice tool like Emacs which allows them
to do just about anything you could possibly imagine to
source code. Adding the option of having different source
code encodings won't change that. I don't think there's much
to worry about, and I think most of this is just FUD.

> This, of course, will find disapproval from people who currently use
> different source encodings. "Not everybody has a UTF-8 editor" is a
> frequent complaint from that camp, and I think it is a valid one
> (although I personally do have a UTF-8-capable editor, and although
> IDLE could easily be taught to do UTF-8 only). To allow Notepad
> support, we'd still need to accept the UTF-8 signature at the
> beginning of a file.

Which the PEP allows.
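As a rough illustration (not code from the PEP), accepting the UTF-8
signature amounts to checking for the three-byte mark at the start of
the file before decoding. The helper name strip_utf8_signature is
hypothetical; codecs.BOM_UTF8 is the standard-library constant for
those bytes.

```python
import codecs


def strip_utf8_signature(raw):
    """Return (payload, had_signature) for the raw bytes of a source file.

    Hypothetical helper for illustration only: it detects the UTF-8
    signature (byte order mark) that editors such as Notepad write at
    the beginning of a file, and removes it if present.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):], True
    return raw, False
```

A file that starts with the signature would then be treated as UTF-8;
a file without it falls back to whatever other rule applies.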
> > Nevertheless, it is supported in XEmacs, and is working in 21.4.6.
> > Except ... only in the first line.  Yep, there's explicit code to
> > restrict recognition to the first line.  No second or third line, no
> > trailing local variables section.  I don't have a problem with
> > extending to the first few lines, but the trailing local variables
> > section I oppose.
>
> For Python, it probably would have to go to the second line, with the
> rationale given in the Emacs manual: the first line is often used for
> #!.

Right.
> > You guys absolutely definitely positively only ever need _first and
> > second line_ forever and ever promise cross your heart, right?
>
> Yes. The current proposal does not spell this out, but I think this
> restriction is reasonable.

I think the comment about the RE and where it is applied is very
clear on this.
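
For concreteness, here is a minimal sketch of the "first or second
line only" rule. The pattern is the one given in the published text of
PEP 263; the draft under discussion in this thread may have used a
slightly different expression, so treat it as an assumption. The
function name find_declared_encoding is hypothetical.

```python
import re

# Pattern from the published PEP 263 (assumed here; the draft discussed
# in this thread may have differed).
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_declared_encoding(source_text):
    """Return the encoding named on line 1 or 2 of the source, else None.

    Only the first two lines are examined, so a declaration can follow
    a #! line but is ignored anywhere further down the file.
    """
    for line in source_text.split("\n", 2)[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None
```

With this sketch, a "# -*- coding: latin-1 -*-" comment directly after
a #! line is recognised, while the same comment on the third line is
not.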

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/