[Python-Dev] Octal literals
ncoghlan at gmail.com
Fri Feb 3 11:07:12 CET 2006
Bengt Richter wrote:
> On Fri, 3 Feb 2006 10:16:17 +1100, "Delaney, Timothy (Tim)" <tdelaney at avaya.com> wrote:
>> Andrew Koenig wrote:
>>>> I definitely agree with the 0c664 octal literal. Seems rather more
>>> I still prefer 8r664.
>> The more I look at this, the worse it gets. Something beginning with
>> zero (like 0xFF, 0c664) immediately stands out as "unusual". Something
>> beginning with any other digit doesn't. This just looks like noise to
>> I found the suffix version even worse, but they're blown out of the
>> water anyway by the fact that FFr16 is a valid identifier.
> Are you sure you aren't just used to the x in 0xff? I.e., if the leading
> 0 were just an alias for 16, we could use 8x664 instead of 8r664.
No, I'm with Tim - it's definitely the distinctive shape of the '0' that helps
the non-standard base stand out. '0c' creates a similar shape, also helping it
to stand out. More on distinctive shapes below, though.
That said, I'm still trying to figure out exactly what problem is being solved
here. Thinking out loud. . .
The full syntax for writing integers in any base is:
    int("LITERAL", RADIX)
    int("LITERAL", base=RADIX)
5 prefix chars, 3 or 8 in the middle (counting the space, and depending on
whether the keyword is used or not), one on the end, and one or two to specify
the radix. That's quite verbose, so it's unsurprising that many would like
something nicer in the toolkit when they need to write multiple numeric
literals in a base other than ten. This can typically happen when writing Unix
system admin scripts, bitbashing to control a piece of hardware or some other
The genuine use cases we have for integer literals are:
- decimal (normal numbers)
- hex (compact bitmasks)
- octal (unix file permissions)
- binary (explicit bitmasks for those that don't speak fluent hex)
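As a rough illustration of the four cases above, the same value can be spelled in each base by handing int() an explicit radix. The variable names and the sample value are made up for this sketch.

```python
# Hypothetical sample values, one per use case in the list above.
decimal_value = int("436", 10)       # normal number
hex_mask = int("1B4", 16)            # compact bitmask
octal_perms = int("664", 8)          # unix file permissions (rw-rw-r--)
binary_mask = int("110110100", 2)    # explicit bitmask

# All four spellings denote the same integer.
same = decimal_value == hex_mask == octal_perms == binary_mask
print(same)
```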
Currently, there is no syntax for binary literals, and the syntax for octal
literals is both magical (where else in integer mathematics does a leading
zero matter?) and somewhat error prone (int and eval will give different
answers for a numeric literal with a leading zero - int ignores the leading
zero, eval treats it as signifying that the value is in octal. The charming
result is that the following statement fails: assert int('0123') == 0123).
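A small sketch of that mismatch, written so that it runs on current Python 3, where the bare literal 0123 is rejected with a SyntaxError rather than read as octal. The call int(text, 8) stands in for what the leading-zero literal meant at the time; the variable names are only illustrative.

```python
text = "0123"

# int() with the default base 10 simply ignores the leading zero.
as_decimal = int(text)        # 123

# The leading-zero literal rule read the same digits as octal;
# int(text, 8) reproduces that value without the legacy syntax.
as_octal = int(text, 8)       # 83

# Python 3 refuses the bare literal outright instead of guessing.
try:
    eval(text)
    literal_accepted = True
except SyntaxError:
    literal_accepted = False

print(as_decimal, as_octal, literal_accepted)
```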
Looking at existing precedent in the language, a prefix is currently used when
the parsing of the subsequent literal may be affected (that is, the elements
that make up the literal may be interpreted differently depending on the
prefix). This is the case for hex and octal literals, and also for raw and
Suffixes are currently used when the literal as a whole is affected, but the
meaning of the individual elements remains the same. This is the case for both
long integer and imaginary number literals. A suffix also makes sense for
decimal float literals, as the individual elements would still be interpreted
as base 10 digits.
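The prefix/suffix distinction can be seen in a few literals. This is only a sketch for current Python 3, so the long integer suffix is left out because that syntax no longer exists there; the names are illustrative.

```python
# Prefix: the same characters are read differently once the prefix is seen.
plain_digits = 10        # '1' and '0' read as base-10 digits
hex_digits = 0x10        # same digits, now read as base 16

plain_string = "\n"      # backslash-n is one newline character
raw_string = r"\n"       # raw prefix: backslash and 'n' stay separate

# Suffix: the digits are still base 10; only the resulting type changes.
imaginary = 3j           # complex number with imaginary part 3.0

print(hex_digits, len(plain_string), len(raw_string), imaginary.imag)
```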
So, since we want to affect the parsing process, this means we want a prefix.
The convention of using '0x' to denote hex extends far beyond Python, and
doesn't seem to provoke much in the way of objection.
This suggests options like '0o' or '0c' for octal literals. Given that '0x'
matches the '%x' in string formatting, the least magical option would be '0o'
(to match the existing '%o' output format). While '0c' is cute and quite
suggestive, it creates a significant potential for confusion, as it most
emphatically does *not* align with the meaning of the '%c' format specifier.
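To make the format-specifier argument concrete, here is a short sketch of what '%x', '%o' and '%c' produce. The value 436 is just a sample (the rw-rw-r-- permission bits written in decimal).

```python
perms = 436  # hypothetical value: rw-rw-r-- as a plain decimal integer

as_hex = "%x" % perms      # hex digits, matching the '0x' prefix
as_octal = "%o" % perms    # octal digits, which '0o' would match

# '%c' is unrelated to number bases: it yields the character with that
# code point, which is why a '0c' prefix would not line up with it.
as_char = "%c" % 99

print(as_hex, as_octal, as_char)
```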
I'd be +0 on changing the octal literal prefix from '0' to '0o', and also +0
on adding an '0b' prefix and '%b' format specifier for binary numbers.
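For what it's worth, Python releases after this thread do accept both '0o' and '0b' prefixes, so the idea can be tried directly. The sketch below assumes a current Python 3 interpreter and uses format() for output rather than assuming a '%b' specifier in %-style string formatting.

```python
octal_perms = 0o664          # explicit octal prefix
binary_mask = 0b110110100    # explicit binary prefix

# Both literals spell the same integer.
assert octal_perms == binary_mask == 436

as_binary = format(binary_mask, "b")   # binary digits, no prefix
as_octal = format(octal_perms, "o")    # octal digits, no prefix

# Parsing the formatted digits back recovers the original value.
round_trip = int(as_binary, 2)
print(as_binary, as_octal, round_trip)
```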
Whether anyone will actually care enough to implement a patch to change the
syntax for any of these is an entirely different question ;)
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia