[Python-Dev] PEP 515: Underscores in Numeric Literals

Martin Panter vadmium+py at gmail.com
Thu Feb 11 20:29:26 EST 2016


On 12 February 2016 at 00:16, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Feb 11, 2016 at 06:03:34PM +0000, Brett Cannon wrote:
>> On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano <steve at pearwood.info> wrote:
>>
>> > On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
>> >
>> > > And honestly, are you really claiming that in your opinion, "123_456_"
>> > > is worse than all of their other examples, like "1_23__4"?
>> >
>> > Yes I am, because 123_456_ looks like you've forgotten to finish typing
>> > the last group of digits, while 1_23__4 merely looks like you have no
>> > taste.
>> >
>>
>> OK, but the keyword in your sentence is "taste".
>
> I disagree. The key *idea* in my sentence is that the trailing
> underscore looks like a programming error. In my opinion, avoiding that
> impression is important enough to make trailing underscores a syntax
> error.
>
> I've seen a few people vote +1 for things like 123_j and 1.23_e99, but I
> haven't seen anyone in favour of trailing underscores. Does anyone think
> there is a good case for allowing trailing underscores?
>
>
>> If we update PEP 8 for our
>> needs to say "Numerical literals should not have multiple underscores in a
>> row or have a trailing underscore" then this is taken care of. We get a
>> dead-simple rule for when underscores can be used, the implementation is
>> simple, and we get to have more tasteful usage in the stdlib w/o forcing
>> our tastes upon everyone or complicating the rules or implementation.
>
> I think this is a misrepresentation of the alternative. As I see it, we
> have two alternatives:
>
> - one or more underscores can appear AFTER the base specifier or any digit;
+1

> - one or more underscores can appear BETWEEN two digits.
-0

Having underscores between digits is the main usage, but I don’t see
much harm in the more liberal version, unless it that makes the
specification or implementation too complex. Allowing stuff like
0x_100, 4.7_e3, and 1_j seems of slightly more benefit IMO than
disallowing 1_000_.

> To describe the second alternative as "complicating the rules" is, I
> think, grossly unfair. And if Serhiy's proposal is correct, the
> implementation is also no more complicated:
>
> # underscores after digits
> octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
> hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
> bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
>
> # underscores between digits
> octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
> hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
> bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*
>
>
> The idea that the second alternative "forc[es] our tastes on everyone"
> while the first does not is bogus. The first alternative also prohibits
> things which are a matter of taste:
>
> # prohibited in both alternatives
> 0_xDEADBEEF
> 0._1234
> 1.2e_99
> -_1

This one is already a valid variable identifier name.

> 1j_
>
>
> I think that there is broad agreement that:
>
> - the basic idea is sound
> - leading underscores followed by digits are currently legal
>   identifiers and this will not change
+1 to both
> - underscores should not follow the sign - +
> - underscores should not follow the decimal point .
> - underscores should not follow the exponent e|E
No strong opinion on these from me
> - underscores will not be permitted inside the exponent (even if
>   it is harmless, it's silly to write 1.2e9_9)
-0, it seems like a needless inconsistency, unless it somehow hurts
the implementation
> - underscores should not follow the complex suffix j
No opinion

> and only minor disagreement about:
>
> - whether or not underscores will be allowed after the base
>   specifier 0x 0o 0b
+0

> - whether or not underscores will be allowed before the decimal
>   point, exponent and complex suffix.
No opinion about directly before decimal point; +0 before exponent or
imaginary (complex) suffix.

> Can we have a show of hands, in favour or against the above two? And
> then perhaps Guido can rule on this one way or the other and we can get
> back to arguing about more important matters? :-)
>
> In case it isn't obvious, I prefer to say No to allowing underscores
> after the base specifier, or before the decimal point, exponent and
> complex suffix.


More information about the Python-Dev mailing list