[Python-3000] Draft PEP for Requiring Lowercase Literal Modifiers
Andrew Karem McCollum
mccollum at fas.harvard.edu
Mon Mar 19 16:39:57 CET 2007
This is my first PEP, and one of my first postings to this list, so I
apologize in advance for any glaring errors. I wrote this up because I
feel like it is a good companion to the recent octal and binary
discussions/PEP. If nothing else, this should at least provide a
jumping-off point for discussion, and someone more experienced could use
it as a basis for a more rigorous PEP if they so desired. If it is
supported, I am happy to work on an implementation, though I imagine
someone else could produce one much more quickly.
-Andrew McCollum
-------------------------------------------------------------------
PEP: XXX
Title: Requiring Lowercase Characters in Literal Modifiers
Version: $Revision$
Last-Modified: $Date$
Author: Andrew McCollum <mccollum at fas.harvard.edu>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Mar-2007
Python-Version: 3.0
Post-History:
Abstract
========
This PEP proposes to change the syntax for declaring literals with prefixes or
modifiers by requiring the prefix or modifier to be in lowercase. Affected
modifiers include:
* The 'b' and 'r' prefixes for string and bytes literals
* The 'b', 'o', and 'x' modifiers for integer literals in other bases
* The 'j' suffix for imaginary literals
* The 'e' exponent notation for floating point literals
Motivation
==========
The primary motivation for this change is to avoid confusion and improve the
appearance of these literals. The most obvious example is the proposed 'o'
modifier for declaring an octal number, such as '0o755', which represents the
decimal value 493. When the 'o' is written in uppercase, the number appears as
'0O755', which can be confusing in many fonts because it resembles the string
of digits '00755'.
In other cases, the lowercase version of the modifier serves to visually
separate the two components of the literal, or to make the modifier stand out
against the neighboring literal, such as in the following examples::
0x5A 0b0110 0xab
1.92e21 3.13j 3.14e5j
r'\d+\.' b"Hello world"
With uppercase modifiers, these literals appear as::
0X5A 0B0110 0Xab
1.92E21 3.13J 3.14E5J
R'\d+\.' B"Hello world"
These forms are more difficult to parse visually, especially upon initial
inspection.
There is also an argument from uniformity: There's Only One Way To Do It
(TOOWTDI). Unlike string literals, where ', ", and """ all behave differently,
making each useful in different situations, the choice of case in a literal
modifier is purely cosmetic and leaves the behavior of the literal unchanged.
Grammar Changes
===============
The new grammar for string prefixes [1]_ (with the bytes literal) will be:
stringprefix ::= "b" | "r" | "br"
Integer literals [2]_ will have the following grammar changes:
bininteger ::= "0b" bindigit+
octinteger ::= "0o" octdigit+
hexinteger ::= "0x" hexdigit+
Exponents in floating point literals [3]_ will now be defined by:
exponent ::= "e" ["+" | "-"] digit+
Imaginary numbers [4]_ will now be defined with the syntax:
imagnumber ::= (floatnumber | intpart) "j"
The grammar of these literals will be otherwise unchanged. For example, the
specification of 'hexdigit' will continue to allow both uppercase and lowercase
digits.
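The productions above can be approximated with regular expressions. The following is only a rough sketch for experimenting with the proposal, not a tokenizer change; the names ``STRINGPREFIX``, ``BININTEGER``, ``OCTINTEGER``, ``HEXINTEGER``, ``EXPONENT``, and ``is_proposed_integer_literal`` are hypothetical and do not exist in Python.

```python
import re

# Hypothetical patterns mirroring the proposed productions (illustrative names).
STRINGPREFIX = re.compile(r"b|r|br")
BININTEGER = re.compile(r"0b[01]+")
OCTINTEGER = re.compile(r"0o[0-7]+")
# 'hexdigit' is unchanged, so the digits may still be upper or lower case.
HEXINTEGER = re.compile(r"0x[0-9a-fA-F]+")
EXPONENT = re.compile(r"e[+-]?[0-9]+")


def is_proposed_integer_literal(text):
    """Return True if text matches one of the prefixed integer forms above."""
    patterns = (BININTEGER, OCTINTEGER, HEXINTEGER)
    return any(p.fullmatch(text) for p in patterns)
```

With these patterns, '0x5A' and '0o755' are accepted, while '0X5A' and '0O755' are rejected because only the modifier character is case-restricted.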
Since this PEP is targeted at Python 3000, the suffix for specifying long
integer literals ('l' or 'L') and the prefix for specifying unicode strings
('u' or 'U') are not addressed, because both will disappear when these types
are merged with int and str, respectively.
Semantic Changes
================
The behavior of the 'int' builtin when passed a radix of 0 will be changed to
follow the above grammar. This change is to maintain the specified behavior
[5]_ that a radix of 0 mirrors the literal syntax. The behavior of this
function will otherwise not be altered. In particular, the behavior of
accepting the prefix '0X' when a radix of 16 is specified will be kept for
backwards compatibility and easier parsing of data files.
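To illustrate the intended split between a radix of 0 and an explicit radix, here is a minimal sketch written as a wrapper around the existing 'int' builtin. The function name ``proposed_int`` is hypothetical, and the wrapper models only the uppercase-prefix rule described here, not any other change to 'int'.

```python
def proposed_int(text, base=10):
    """Sketch of the proposed rule: with base 0, reject uppercase prefixes."""
    stripped = text.strip()
    # Skip an optional sign so the prefix check sees "0X", "0O", or "0B".
    unsigned = stripped[1:] if stripped[:1] in ("+", "-") else stripped
    if base == 0 and unsigned[:2] in ("0X", "0O", "0B"):
        raise ValueError("uppercase prefix not allowed with base 0: %r" % text)
    # With an explicit base, defer entirely to the builtin, which keeps
    # accepting '0X' when the base is 16.
    return int(text, base)
```

Under this sketch, ``proposed_int("0x1f", 0)`` and ``proposed_int("0X1F", 16)`` both succeed, while ``proposed_int("0X1F", 0)`` raises ``ValueError``.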
Automatic Conversion
====================
It should be trivial for the '2to3' conversion tool to convert literals to the
new syntax in all cases. The only possible incompatibility will be from the
subtle changes to the 'int' builtin.
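As a rough indication of how small such a conversion could be, the sketch below rewrites literal modifiers using the standard ``tokenize`` module. It is not a '2to3' fixer and does not use the '2to3' framework; the function name ``lowercase_modifiers`` is hypothetical, and the sketch assumes the input is source text that ``tokenize`` can process.

```python
import io
import tokenize


def lowercase_modifiers(source):
    """Rewrite uppercase literal modifiers in source to lowercase (sketch)."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NUMBER:
            if text[:2] in ("0X", "0O", "0B"):
                # Lowercase only the base modifier; hex digits keep their case.
                text = "0" + text[1].lower() + text[2:]
            else:
                # In decimal, float, and imaginary literals the only letters
                # are the exponent marker and the imaginary suffix.
                text = text.lower()
        elif tok.type == tokenize.STRING:
            # The prefix is everything before the first quote character.
            quote_at = min(i for i in (text.find("'"), text.find('"')) if i != -1)
            text = text[:quote_at].lower() + text[quote_at:]
        result.append(tok._replace(string=text))
    return tokenize.untokenize(result)
```

Because lowercasing never changes the length of a token, the recorded token positions stay valid and the surrounding layout of the source should be preserved.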
Open Issues
===========
The main issue involves the treatment of hexadecimal values employing the
legacy '0X' prefix when passed to the 'int' builtin. Several people showed a
desire to maintain parity with the literal syntax and the 'eval' function when
0 was passed in as the radix. The argument against this behavior is that it
breaks backwards compatibility and makes parsing integers from arbitrary
sources more difficult. This PEP compromises by allowing the prefix '0X' only
when the radix is explicitly specified. The rationale is that when parsing
integers from data files, the radix is often known ahead of time and can be
supplied as a second argument to retain the previous behavior, while the
symmetry between the literal syntax and the radix-0 form is preserved.
BDFL Pronouncements
===================
The BDFL supports the disallowing of leading zeros in the syntax for integer
literals, and was leaning towards maintaining this behavior when a radix of 0
was passed to the 'int' builtin [6]_. This would break backwards compatibility
for automatically parsing octal literals.
Later, the BDFL expressed a preference that '0X' be an allowable prefix for
hexadecimal numbers when a radix of 0 was passed to the 'int' builtin [7]_. The
PEP currently only allows this prefix when the radix is explicitly specified.
Reference Implementation
========================
A reference implementation is not yet provided, but since the proposal adds no
new behavior and only removes previously allowed behavior, the changes should
be minimal.
References
==========
.. [1] http://www.python.org/doc/current/ref/strings.html
.. [2] http://www.python.org/doc/current/ref/integers.html
.. [3] http://www.python.org/doc/current/ref/floating.html
.. [4] http://www.python.org/doc/current/ref/imaginary.html
.. [5] http://docs.python.org/lib/built-in-funcs.html
.. [6] http://mail.python.org/pipermail/python-3000/2007-March/006325.html
.. [7] http://mail.python.org/pipermail/python-3000/2007-March/006423.html
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: