[Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
mal at lemburg.com
Sun Jul 15 20:07:50 CEST 2001
Guido van Rossum wrote:
> > > Explain again why a directive is better than a specially marked
> > > comment, when your main goal seems to be to make it easy for
> > > non-parsing tools like editors to find it?
> > >...
> > Parsing tools do need it. The directive changes the file's semantics.
> > Both parsing and non-parsing tools need it.
> I understand that.
> > I could live with a comment but I think that that is actually harder to
> > implement so I don't understand the benefit...I'm still trying to
> > understand what tools we are protecting. compiler.py can be easily
> > fixed. The real parser/compiler can be easily fixed. The other tools
> > mostly take their cue from one of these two modules, right?
> I disagree with the first sentence -- I believe a comment is easier to
> implement. The directive statement is still problematic. Martin's
> hack falls short of doing the right thing in all cases: you can't have
> the first statement of your program be "directive = ..." or
> Another argument for a comment: I expect there could be situations
> where you want to declare an encoding that doesn't affect the Python
> parser, but that does affect the editor (e.g. when you use the
> encoding only in comments and/or 8-bit strings). A comment would
> back-port to older Python versions; a directive statement wouldn't. I
> don't know how important this is though.
Even though putting the information into a comment would
indeed be easier to implement, I think that, from a design point
of view, it is a hack rather than a clean solution.
Note that a programmer can always place the encoding information
in the format needed for the editor into an additional comment
in front of the doc-string if that's needed (the comment format
needed for the editor will be editor-specific!).
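
To illustrate how a non-parsing tool such as an editor might pick up an
encoding comment like this, here is a minimal sketch. The function name
find_encoding_comment, the two-line search window and the exact comment
pattern are assumptions made for illustration only; the pattern is
modelled on the comment in the example in this message, and a real
editor would define its own format.

```python
import re

# Hypothetical pattern modelled on the comment form used in this message's
# example; no particular editor is claimed to use exactly this syntax.
ENCODING_COMMENT = re.compile(
    r"^\s*#.*?-\*-\s*encoding\s*=\s*['\"]?([\w.\-]+)['\"]?\s*-\*-"
)


def find_encoding_comment(text, max_lines=2):
    """Return the encoding named in a leading comment, or None.

    Only the first max_lines lines are inspected (an assumed limit), so a
    tool can find the declaration without parsing the rest of the file.
    """
    for line in text.splitlines()[:max_lines]:
        match = ENCODING_COMMENT.match(line)
        if match:
            return match.group(1)
    return None


sample = "# -*- encoding='utf-8' -*-\n\"\"\" Binary doc-string using UTF-8 \"\"\"\n"
found = find_encoding_comment(sample)
print(found)
```

Because the lookup is a plain line scan, it is independent of whatever
the parser does with a directive statement later in the file.
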
I think that, apart from adding a new keyword to the language,
the argument about breaking doc-string tools is not a valid
one. Non-Unicode doc-strings will continue to work like they
# -*- encoding='utf-8' -*-
""" Binary doc-string using UTF-8 """
directive unicodeencoding = 'utf-8'
print u"Unicode encoded as UTF-8 rather than unicode-escape"
Or am I missing something?
CEO eGenix.com Software GmbH
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/