[Python-ideas] User-defined literals

Thu Jun 4 01:03:35 CEST 2015

On Jun 3, 2015, at 14:48, Chris Angelico <rosuav at gmail.com> wrote:
> 
>> On Thu, Jun 4, 2015 at 2:55 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>> In Python, it's perfectly fine that -2 and 1+2j and (1, 2) are all compiled into expressions, so why isn't it fine that 1.2d is compiled into an expression? And, once you accept that, what's wrong with the expression being `literal_d('1.2')` instead of `Decimal('1.2')`?
> 
> That's exactly the thing: 1.2d should be atomic. It should not be an
> expression. The three examples you gave are syntactically expressions,
> but they act very much like literals thanks to constant folding:
> 
> >>> dis.dis(lambda: -2)
>  1           0 LOAD_CONST               2 (-2)
>              3 RETURN_VALUE
> >>> dis.dis(lambda: 1+2j)
>  1           0 LOAD_CONST               3 ((1+2j))
>              3 RETURN_VALUE
> >>> dis.dis(lambda: (1, 2))
>  1           0 LOAD_CONST               3 ((1, 2))
>              3 RETURN_VALUE
> 
> which means they behave the way people expect them to.

But that's not something that's guaranteed by Python. It's something that implementations are allowed to do, and that CPython happens to do. If user code actually relied on that optimization, that code would be nonportable.
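
To make that concrete, here is a minimal sketch of how you could check whether a given implementation folded the expression. The helper name `stored_constants` is made up for illustration, and it assumes the implementation exposes `__code__.co_consts` on functions. The point is that the result of calling the function is the same whether or not the folded value shows up in the constants.

```python
def stored_constants(func):
    """Return whatever constants the implementation stored for func."""
    return func.__code__.co_consts

f = lambda: 1 + 2j

# On CPython the folded value (1+2j) usually appears among the constants.
# Another implementation may store 1 and 2j separately and add them at
# run time; the language does not promise either behavior.
is_folded = (1 + 2j) in stored_constants(f)

# What user code can rely on is only the value of the expression.
result = f()
```

Code that branches on `is_folded` is depending on an implementation detail; code that only uses `result` is portable.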

But the reason Python allows that optimization in the first place is that user code actually doesn't care whether these expressions are evaluated "atomically" or at compile time, so it's ok to do so behind users' backs. It's not surprising because no one is going to monkeypatch int.__neg__ between definition time and call time (which CPython doesn't allow, but some implementations do), or call dis and read the bytecode if they don't even understand what a compile-time optimization is, and so on.

> There is no way
> for run-time changes to affect what any of those expressions yields.
> Whether you're talking about shadowing the name Decimal or the name
> literal_d, the trouble is that it's happening at run-time. Here's
> another confusing case:
> 
> import decimal
> from fractionliterals import literal_fr
> # oops, forgot to import literal_d
> 
> # If we miss off literal_fr, we get an immediate error, because
> # 1/2fr gets evaluated at def time.
> def do_stuff(x, y, portion=1/2fr):
>    try: result = decimal.Decimal(x*y*portion)
>    except OverflowError: return 0.0d
> 
> You won't know that your literal has failed until something actually
> triggers the error.

If that's a problem, then you're using the wrong language. You also won't know that you've typo'd OvreflowError or reslt, or called d.sqrt() instead of decimal.sqrt(d), or all kinds of other errors until something actually triggers the error. Which means either executing the code, or running a static linter. Which would be exactly the same for 1.2d.
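
A small sketch of that point in today's Python, with no new syntax involved. The function name `safe_divide` and the misspelled exception name are made up for illustration. The typo sits in the source the whole time, but nothing complains until the except clause is actually evaluated.

```python
def safe_divide(x, y):
    try:
        return x / y
    except ZeroDivisonError:  # deliberate typo, never looked up on the happy path
        return 0.0

# Defining and calling the function works as long as nothing is raised.
ok = safe_divide(6, 3)

# Only when the division fails does Python look up the misspelled name,
# and the lookup itself raises NameError.
late_error = None
try:
    safe_divide(1, 0)
except NameError as exc:
    late_error = exc
```

A linter would flag the undefined name up front, which is the same tool you'd reach for to catch a missing literal_d.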

> That is extremely unobvious, especially since the
> token "literal_d" doesn't occur anywhere in do_stuff().

This really isn't going to be confusing in real life. You get an error saying you forgot to define literal_d. You say, "Nuh uh, I did define it right at the top, same way I did literal_fr, in this imp... Oops, looks like I forgot to import it".
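
Here is a rough sketch of what that would look like, assuming the proposal compiles `0.0d` into a call to a global named `literal_d` (the desugaring is written out by hand, since the suffix syntax doesn't exist). The `literal_d` defined further down is a stand-in for whatever the forgotten import would have provided.

```python
import decimal

def do_stuff():
    # Hand-written stand-in for `return 0.0d` under the assumed desugaring.
    return literal_d('0.0')

# With literal_d missing, the failure is an ordinary NameError that
# names exactly the thing you forgot.
message = None
try:
    do_stuff()
except NameError as exc:
    message = str(exc)

# The fix: provide the name (in real code, via the missing import).
def literal_d(text):
    return decimal.Decimal(text)

value = do_stuff()
```

The error text points straight at `literal_d`, even though that token never appears in the suffix-literal spelling of the function body.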

> Literals look
> like atoms, and if they behave like expressions, sooner or later
> there'll be a ton of Stack Overflow questions saying "Why doesn't my
> code work? I just changed this up here, and now I get this weird
> error".

Can you come up with an actual example where changing this up here gives this weird error somewhere else? If not, I doubt even the intrepid noobs of StackOverflow will come up with one.

Neither of the examples so far qualifies--the first one is an error that the design can never produce, and the second one is not weird or confusing any more than any other error in any dynamic language.

And if you're going to suggest "what if I just redefine literal_d for no reason", ask yourself who would ever do that? Redefining decimal makes sense, because that's a reasonable name for a variable; redefining literal_d is as silly as redefining __name__. (But if you think those are different because double underscores are special, I suppose __literal_d__ doesn't bother me.)