[I18n-sig] Strawman Proposal (2): Encoding attributes

M.-A. Lemburg mal@lemburg.com
Fri, 09 Feb 2001 22:39:42 +0100

"Martin v. Loewis" wrote:
> > So what this strawman suggests is, in summary:
> >
> > 1. add an encoding identifier to the top of a source code file
> > 2. use that encoding information to decode u"..." literals into
> >    Unicode
> > 3. leave all other literals and text alone
> I think the proposal was to do
> 3. raise an error if another literal uses bytes > 127
> instead. Since users need to actively change their source to use the
> encoding declaration, they'll combine this with putting u in front of
> every affected string. If they then still have strings with bytes
> >127, they need to use the \x notation, as the string should not
> contain text.

Hmm, are you sure this would make the encoding declaration a
popular tool?

If we just allowed ASCII supersets as source file encodings,
then we wouldn't have to make that restriction, since only the
Unicode literal handling in the parser would have to be adjusted
(and this is easy to do).
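
To make the difference between the two variants concrete, here is a
minimal sketch in present-day Python 3, not the actual parser code.
The helper name interpret_literal and its strict flag are hypothetical
and exist only for illustration; escape processing is ignored. With
strict off, only u"..." bodies are decoded and all other literals are
left alone; with strict on, a plain literal containing bytes > 127 is
rejected, as in the quoted rule 3.

```python
def interpret_literal(raw: bytes, is_unicode: bool, encoding: str,
                      strict: bool = False):
    """Interpret the raw bytes between the quotes of one string literal.

    Hypothetical helper: `raw` is the literal body exactly as stored in
    the source file, `encoding` is the declared source encoding.
    """
    if is_unicode:
        # u"..." literal: decode with the declared source encoding.
        return raw.decode(encoding)
    if strict and any(byte > 127 for byte in raw):
        # Stricter variant: non-ASCII bytes are an error in plain literals.
        raise SyntaxError("byte > 127 in plain string literal; use \\x escapes")
    # Plain "..." literal: leave the bytes alone.
    return raw


body = "Gr\u00fc\u00dfe".encode("latin-1")        # bytes as stored in the file
print(interpret_literal(body, True, "latin-1"))   # decoded Unicode text
print(interpret_literal(body, False, "latin-1"))  # untouched bytes
```

Since the tokenizer only has to find the quotes, which are plain ASCII
in any ASCII superset, nothing else in the parser needs to know the
encoding.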

This would make UTF-16 encodings impossible, but I think that
two-byte encodings are not the right approach to maintainable programs
anyway ;-)
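
As a rough check of what "ASCII superset" means here, the following
sketch (present-day Python 3; the function name is_ascii_superset is
made up for this example) tests whether the 128 ASCII bytes decode to
the same characters under a given codec. It only looks at single
bytes, so it is a necessary condition rather than a full proof.

```python
def is_ascii_superset(codec: str) -> bool:
    """Return True if each 7-bit ASCII byte decodes to itself under `codec`."""
    ascii_bytes = bytes(range(128))
    try:
        return ascii_bytes.decode(codec) == ascii_bytes.decode("ascii")
    except UnicodeDecodeError:
        return False


for name in ("latin-1", "utf-8", "utf-16"):
    print(name, is_ascii_superset(name))

# In UTF-16 even pure ASCII source text is interleaved with NUL bytes,
# so a byte-oriented tokenizer cannot read it directly.
print("x = 1".encode("utf-16-le"))
```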

Marc-Andre Lemburg
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/