[Python-ideas] 1_000_000

Guido van Rossum guido at python.org
Sat May 7 05:45:18 CEST 2011


On Fri, May 6, 2011 at 7:00 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Bruce Leban wrote:
>
>> Consider these examples instead:
>>
>>   - 1_234_000
>>   - 9.876_543_210
>>   - 0xFEFF_0042
>>
>> I'm not advocating this change (nor arguing against it); I just think the
>> discussion should be focused on the actual idea. I do have a question:
>>
>> Is _ just ignored in numbers or are there more complex rules?
>>
>>   - 1_2345_6789  (can I use groups of other sizes instead?)
>>   - 1_2_3_4_5  (ditto)
>>   - 1_234_6789  (do all the groups need to be the same size?)
>
> +1 on all of these. I don't particularly like the look of _ as a number
> separator, but it's hard to think of any alternatives other than space, and
> some separator is better than long sequences of digits.
>
> I'm -0.5 on spaces even though it looks MUCH better, because it's too easy
> to leave the commas out in lists etc:
>
> L = [1, 2, 3, 4 5, 6, 7, 8, 9, 10]  # oops, wanted 4 & 5 not 45
>
> (Admittedly if the items were strings, the same failure mode applies.)

And it does sometimes bite. So let's not do more of that. (In
retrospect 'xxx' + 'yyy' would have been good enough.)

>>   - 1_   (must the _ only be in between 2 digits?)
>>   - 1__234   (what about multiple _s?)
>
> -1 on allowing either _1 or 1_ as numbers.
>
> -0 on allowing doubled underscores.
>
>
>>   - 9.876_543_210   (can it be used to the right of the decimal point?)
>>   - 0xFEFF_0042   (can it be used in hex, octal or binary numbers?)
>
> +1 on these two.

Steven channels me well so far.

Fine points about _ in floats: IMO the _ should be allowed to appear
between any two digits, or between the last digit and the 'e' in the
exponent, or between the 'e' and a following digit. But not adjacent
to the '.' or to the '+' or '-' in the exponent. So 3.141_593 yes,
3_.14 no.

Fine points about _ in bin/oct/hex literals: 0x_dead_beef yes, 0_xdeadbeef no.

(The overall rule seems to be that it must be internal to alphanumeric
strings, except that leading 0x, 0o or 0b must not be separated --
somehow I find 0_x_dead_beef would be a disservice to human readers.)
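
A rough sketch of that rule as a checker, in case it helps pin down the
corner cases. The name follows_proposed_rule and the _PLAIN pattern are
made up for illustration, the pattern is a simplified idea of what a
literal looks like once the underscores are removed, and rejecting
doubled underscores is an assumption (the thread has not settled that):

```python
import re

# Simplified shape of a literal after all underscores are removed.
# This is an illustration, not the tokenizer's real grammar.
_PLAIN = re.compile(
    r"0[xX][0-9a-fA-F]+|0[oO][0-7]+|0[bB][01]+"
    r"|(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?"
)

def follows_proposed_rule(literal):
    """Hypothetical checker for the underscore placement rule sketched above."""
    stripped = literal.replace("_", "")
    if not _PLAIN.fullmatch(stripped):
        return False
    # A leading 0x, 0o or 0b must stay in one piece: 0_x... is rejected.
    if stripped[:2].lower() in ("0x", "0o", "0b") and literal[:2] != stripped[:2]:
        return False
    # '.', '+' and '-' split the literal into alphanumeric runs; every
    # underscore must be strictly inside a run, never at its edge.
    for run in re.split(r"[.+-]", literal):
        if run.startswith("_") or run.endswith("_"):
            return False
        if "__" in run:  # assumption: doubled underscores are not allowed
            return False
    return True

for text in ["3.141_593", "3_.14", "0x_dead_beef", "0_xdeadbeef", "1_e5", "1e+_5"]:
    print(text, follows_proposed_rule(text))
```

Under these assumptions 3.141_593, 0x_dead_beef and 1_e5 pass, while
3_.14, 0_xdeadbeef and 1e+_5 do not.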

>>   - int('123_456')   (do other functions accept this syntax too?)
>
> That's a tricky one... I'd say No, but I'm not entirely sure. It's easy
> enough to say:
>
> int('123_456'.replace('_', ''))
>
> albeit a tad verbose. Also easy to say:
>
> int('123' '456')
>
> which is less verbose.

But that's not how it'll be used. The argument will be provided by the
user of the code.

> And it will change the behaviour of the int function.
> So I don't think we need to support separators inside strings.

I think it's fine: the same reason we sometimes want to write 1_234_567
in code applies to input or command line arguments too, and I see
little harm.

> We can always change our mind later and add it in, but it's much harder to
> take it out later.

It seems entirely harmless here. Also for float().
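
Until something like that exists, user-supplied input can be handled by
a small wrapper. A minimal sketch, where parse_number is a made-up name
and not an existing function; note that it is more permissive than the
literal rules discussed above, since it simply drops every underscore:

```python
def parse_number(text, kind=int):
    """Hypothetical helper: accept '_' separators in user-supplied numbers.

    Every underscore is dropped before conversion, so this also accepts
    forms such as '_1' that the proposed literal syntax would reject.
    """
    return kind(text.replace("_", ""))

print(parse_number("1_234_567"))             # converted with int()
print(parse_number("9.876_543_210", float))  # converted with float()
```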

It would also be nice to have an easy way to emit _ in suitable
places. Maybe this could be added to the .format() language for
numbers? It would be nice if you could tell it to emit an _ every N
positions.
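
To make "an _ every N positions" concrete, here is a sketch in plain
Python. group_digits is a made-up helper for illustration, not a
proposed API, and it only handles integers:

```python
def group_digits(value, size=3, sep="_"):
    """Hypothetical helper: insert sep every `size` digits, counting from the right."""
    sign = "-" if value < 0 else ""
    digits = str(abs(value))
    # Length of the leftmost group; a full group when the length divides evenly.
    head = len(digits) % size or size
    groups = [digits[:head]]
    groups += [digits[i:i + size] for i in range(head, len(digits), size)]
    return sign + sep.join(groups)

print(group_digits(1234000))            # 1_234_000
print(group_digits(123456789, size=4))  # 1_2345_6789
```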

-- 
--Guido van Rossum (python.org/~guido)


