[Python-ideas] User-defined literals

Thu Jun 4 21:14:38 CEST 2015

On Jun 4, 2015, at 05:08, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Wed, Jun 03, 2015 at 12:43:00PM -0700, Andrew Barnert wrote:
>> On Jun 2, 2015, at 19:52, Steven D'Aprano <steve at pearwood.info> wrote:
> [...]
>>> But, really, your proposal is in no way, shape or form syntax for 
>>> *literals*,
>> 
>> It's a syntax for things that are somewhat like `2`, more like `-2`, 
>> even more like `(2,)`, but still not exactly the same as even that.
> 
> Not really. It's a syntax for something that is not very close to *any* 
> of those examples. Unlike all of those examples, it is a syntax for 
> calling a function at runtime.
> 
> Let's take (-2, 1+3j) as an example. As you point out in another post, 
> Python may constant-fold it, but isn't required to. Python 3.3 compiles 
> it to a single constant:
> 
>  LOAD_CONST               6 ((-2, (1+3j)))
> 
> 
> but Python 1.5 compiles it to a series of byte-code operations:
> 
>  LOAD_CONST          0 (2)
>  UNARY_NEGATIVE
>  LOAD_CONST          1 (1)
>  LOAD_CONST          2 (3j)
>  BINARY_ADD
>  BUILD_TUPLE         2
> 
> 
> But that's just implementation detail. Whether Python 3.3 or 1.5, both 
> expressions have something in common: the *operation* is immutable (I 
> don't mean the object itself); there is nothing you can do, from pure 
> python code, to make the literal (-2, 1+3j) something other than a 
> two-tuple consisting of -2 and 1+3j. You can shadow int, complex and 
> tuple, and it won't make a lick of difference. For lack of a better 
> term, I'm going to call this a "static operation" (as opposed to dynamic 
> operations like calling len(x), which can be shadowed or monkey-patched).

But this isn't actually true. That BINARY_ADD opcode looks up the addition method at runtime and calls it. And that means that if you monkeypatch complex.__radd__, your method will get called.

As an implementation-specific detail, CPython 3.4 doesn't let you modify the complex type. The language allows an implementation to impose that restriction, but doesn't require it, and some other implementations do let you modify it.
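A minimal sketch of both points, assuming CPython. `Number` is a hypothetical stand-in for a type whose methods can be replaced, since CPython won't let you do this to `complex` itself; the other names are made up for illustration too.

```python
class Number:
    """Hypothetical stand-in for a type whose methods can be replaced."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Reached for `1 + Number(...)` after int.__add__ returns NotImplemented.
        return other + self.value


def add_one(x):
    # Compiles to a generic add opcode; the method is looked up when it runs.
    return 1 + x


before = add_one(Number(3j))  # uses the original __radd__

# Monkeypatch the reflected-add method; add_one itself is unchanged.
Number.__radd__ = lambda self, other: "patched"
after = add_one(Number(3j))  # same expression, different result


def can_patch_builtin_complex():
    """Return True if this implementation allows modifying `complex`."""
    original = complex.__radd__
    try:
        complex.__radd__ = original  # harmless re-assignment if it is allowed
    except (TypeError, AttributeError):
        return False
    return True
```

On CPython, `can_patch_builtin_complex()` returns False because the assignment raises TypeError; an implementation that permits the assignment would return True.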

So, if it's important to your code that 1+3j is a "static operation", then your code is non-portable at best. But once again, I suspect that the reason you haven't thought about this is that you've never written any code that actually cares what is or isn't a static operation. It's a typical "consenting adults" case.

> I don't wish to debate the definition of "literal", as that may be very 
> difficult. For example, is 2+3j actually a literal, or an expression 
> containing only literals? If a literal, how about 2*3**4/5 for that 
> matter? As soon as Python compilers start doing compile-time constant 
> folding, the boundary between literals and constant expressions becomes 
> fuzzy. But that boundary is actually not very interesting. What is 
> interesting is that every literal shares at least the property that I 
> refer to above, that you cannot redefine the result of that literal at 
> runtime by shadowing or monkey-patching.

What you're arguing here, and for the rest of the message, can be summarized in one sentence: the difference between user-defined literals and implementation-defined literals is that the former are user-defined. To which I have no real answer.
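To make the distinction concrete, here is a rough sketch of the behavior being objected to. It assumes, purely for illustration, that a suffix like `d` is resolved by calling an ordinary name looked up at runtime; `literal_d` and `price` are hypothetical names, not anything defined in this thread.

```python
from decimal import Decimal


def literal_d(text):
    # Hypothetical handler for a `d` suffix; the name is made up for this sketch.
    return Decimal(text)


def price():
    # Stand-in for what `2.3d` might expand to: a call through a global name.
    return literal_d("2.3")


first = price()  # a Decimal

# Rebinding the handler changes what the same source text evaluates to.
literal_d = float
second = price()  # now a float
```

Nothing comparable is possible with `2.3` or `(2,)`, which is exactly the "user-defined" part.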

> 
>> If 
>> you don't like using the word "literal" for that, you can come up with 
>> a different word. I called it a "literal" because "user-defined 
>> literals" is what people were asking for when they asked for `2.3d`,
> 
> If you asked for a turkey and cheese sandwich on rye bread, and I said 
> "Well, I haven't got any turkey, or rye, but I can give you a slice of 
> cheese on white bread and we'll just call it a turkey and cheese rye 
> sandwich", you probably wouldn't be impressed :-)

But if I asked for a turkey and cheese hoagie, and you said "I have turkey and cheese and a roll, but that doesn't count as a hoagie by my definition, so you can't have it", I'd say just put the turkey and cheese on the roll and call it whatever you want to call it.

If people are asking for user-defined literals like 2.3d, and your argument is not that we can't or shouldn't do it, but that the term "user-defined literal" is contradictory, then the answer is the same: just call it something different.

I don't know how else to put this. I already said, in two different ways, that if you want to call it something different that's fine. You replied by saying you don't want to argue about the definition of literals, followed by multiple paragraphs arguing about the definition of literals.

>> A literal is a notation for expressing some value that means what it 
>> says in a sufficiently simple way.
> 
> I don't think that works. "Sufficiently simple" is a problematic 
> concept. If "123_d" is sufficiently simple, surely "d(123)" is equally 
> simple? It's only one character more, and it's a much more familiar 
> and conventional syntax.

If you're talking about APL or J, the number of characters might be a relevant measure of simplicity. But in the vast majority of languages, including Python, it has very little relevance. Of course "simple" is inherently a vague concept, and it will be different in different languages and contexts. But it's still one of the most important concepts. That's why language design is an art, and why we have a Zen of Python and not an Assembly Manual of Python. Trying to reduce it to something the wc program can measure means reducing it to the point of meaninglessness.

Let's give a different example. I could claim that currying makes higher-order expressions simpler. You could rightly point out that it makes the simplest function calls less simple. If we disagree on those points, or on the relative importance of them, we might draw up a bunch of examples to look at the human readability and writability or computer parsability of different expressions, in the context of idiomatic code in the language we were designing. If the rest of the language were a lot like Haskell, we'd probably agree that curried functions were simpler; if it were a lot like Python, we'd probably agree on the reverse. But at no point would the fact that f(1,2) is one character shorter than f(1)(2) come into the discussion. The closest we'd reasonably get might be a discussion of the fact that the parens feel "big" and "get in the way" of reading the "more important" parts of the expression, or encourage the reader to naturally partition up the expression in a way that isn't appropriate to the intended meaning, or other such things. (See the "grit on Tim's monitor" appeal.) But those are still vague and subjective things. There's no objective measure to appeal to. Otherwise, for every language proposal, Guido would just run the objective simplicity measurement program and it would say yes or no.
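For anyone who hasn't seen the two styles side by side, a small sketch in Python (the function names are made up for illustration):

```python
def add(x, y):
    # Ordinary two-argument function, called as add(1, 2).
    return x + y


def curried_add(x):
    # Curried form: takes one argument and returns a function awaiting the next.
    def inner(y):
        return x + y
    return inner


plain = add(1, 2)
curried = curried_add(1)(2)

# The curried form pays off in higher-order code: partial application is free.
shifted = list(map(curried_add(10), [1, 2, 3]))
```

Which of these counts as "simpler" depends on what the surrounding code looks like, not on the character count of the call.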

>> In C++, a constructor expression like Fraction(2) may be evaluable at 
>> compile time, and may evaluate to something that's constant at both 
>> compile time and runtime, and yet it's still not a literal. Why? 
>> Because their rule for what counts as "sufficiently simple" includes 
>> constexpr postfix user-literal operators, but not constexpr function 
>> or constructor calls.
> 
> What is the logic for that rule?

In the case of C++, a committee actually sat down and hammered out a rigorous definition that codified the intuitive sense they were going for; if you want to read it, you can. But that isn't going to apply to anything but C++. And if you want to argue about it, the place to do so is the C++17 ISO committee. Just declaring that the C++ standard definition of literals doesn't define what you want to call literals doesn't really accomplish anything.