[Python-ideas] User-defined literals

Wed Jun 3 21:43:00 CEST 2015

On Jun 2, 2015, at 19:52, Steven D'Aprano <steve at pearwood.info> wrote:
> 
>> On Tue, Jun 02, 2015 at 12:03:25PM -0700, Andrew Barnert via Python-ideas wrote:
>> 
>> I explored the convertible literals a while ago, and I'm pretty sure 
>> that doesn't work in a duck-typed language. But the C++ design does 
>> work, as long as you're willing to have the conversion (including the 
>> lookup of the conversion function itself) done at runtime.
> 
> I'm torn. On the one hand, some sort of extensible syntax for literals 
> would be nice. I say "nice" rather than useful because there are 
> advantages and disadvantages and there's no way of really knowing 
> which outweighs the other.

That's exactly why I came up with something I could hack up without any changes to the interpreter. It means anyone can try it out and see whether the advantages outweigh the disadvantages for them. (Of course the hack has additional disadvantages in efficiency, hackiness, and possibly debuggability, so it may unfairly bias people who don't keep that in mind--but if so, it can only bias them in the conservative direction of rejecting the idea, which I think is OK.)
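
For concreteness, here is a minimal sketch of what "the conversion (including the lookup of the conversion function itself) done at runtime" could mean. This is not the actual hack, only an illustration; `register_suffix`, `literal`, and the `d`/`r` suffixes are hypothetical names made up for the example.

```python
from decimal import Decimal
from fractions import Fraction

# Hypothetical registry mapping a literal suffix to a conversion function.
_suffixes = {}

def register_suffix(suffix, func):
    """Associate a suffix such as 'd' with a callable taking the literal's text."""
    _suffixes[suffix] = func

def literal(text, suffix):
    """Convert `text` with whatever function is registered for `suffix` right now.

    The lookup happens on every call, i.e. at runtime, not at compile time.
    """
    return _suffixes[suffix](text)

register_suffix("d", Decimal)
register_suffix("r", Fraction)

# Roughly what a hypothetical `2.3d` or `1/3r` could expand to:
x = literal("2.3", "d")
y = literal("1/3", "r")
```

Because the lookup happens on each evaluation, re-registering a suffix changes what later evaluations produce, which is the price (and the flexibility) of doing this in a duck-typed language.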

> But, really, your proposal is in no way, shape or form syntax for 
> *literals*,

It's a syntax for things that are somewhat like `2`, more like `-2`, even more like `(2,)`, but still not exactly the same as even that. If you don't like using the word "literal" for that, you can come up with a different word. I called it a "literal" because "user-defined literals" is what people were asking for when they asked for `2.3d`, and it has clear parallels with a very similar feature with the same name in other languages. But I'm fine calling it something different, as long as people who are looking for it will know how to find it.

> it's a new syntax for a unary postfix operator

That's fair; C++ in fact defines its user-defined literal syntax in terms of special operator overloads (literal operators, which can be constexpr), and points out the similarities to postfix operator++ in a note.

> or function. 
> The whole point of something being a literal is that it is parsed and 
> converted at compile time.
> Now you might (and do) say that worrying 
> about this is "premature optimization", but call me a pedant if you 
> like, I don't think we should call something a literal if it's a 
> runtime function call.

I don't think this is the right distinction.

A literal is a notation for expressing some value that means what it says in a sufficiently simple way. That concept has significant overlap with "compile-time evaluable", and with "constant", but they're not the same concepts.

And this is especially true for a language that doesn't define any compile-time computation phase. In Python, `-2` may be compiled to UNARY_NEGATIVE on the compiled-in constant value 2, or just to the compiled-in constant value -2, depending on what the implementation wants to optimize. Do you want to call it a literal in some implementations but not others? No reasonable user code that isn't reflecting on the internals is going to care, or even know, what the implementation is doing.
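
To see the `-2` case directly, here is a small sketch using the standard `ast` module (assuming Python 3.8 or later, where numeric literals parse to `ast.Constant`). Grammatically, `-2` is a unary minus applied to the literal `2`; whether the compiler then folds it into a single constant is left to the implementation, so the last line makes no claim about what it prints.

```python
import ast

# Parse `-2` as an expression and look at the top-level node.
node = ast.parse("-2", mode="eval").body

node_kind = type(node).__name__      # the expression node type
op_kind = type(node.op).__name__     # the unary operator applied
operand = node.operand.value         # the actual literal underneath

print(node_kind, op_kind, operand)

# Implementation detail: the compiled code object may or may not hold -2
# as a ready-made constant, depending on the optimizer.
code = compile("-2", "<example>", "eval")
print(code.co_consts)
```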

Being "user-defined" means that the "sufficiently simple way" the notation gets its meaning has to involve user code. In a language with a compile-time computation phase like C++, that can mean "constexpr" user code, but Python doesn't define a "constexpr"-like phase.

At any rate, again, if you want to call it something different, that's fine, as long as people looking for "what does `1.2d` mean in this program" or "how do I do the Python equivalent of a C++ user-defined literal" will be able to understand it.

> Otherwise, we might as well say that 
> 
>    from fractions import Fraction
>    Fraction(2)
> 
> is a literal, in which case I can say your proposal is unnecessary as we 
> already have user-specified literals in Python.

In C++, a constructor expression like Fraction(2) may be evaluable at compile time, and may evaluate to something that's constant at both compile time and runtime, and yet it's still not a literal. Why? Because their rule for what counts as "sufficiently simple" includes constexpr postfix user-literal operators, but not constexpr function or constructor calls. I don't know of anyone who's confused by that. It's a useful (and intuitive) distinction, separate from the "constexpr" and "const" distinctions.

> I can think of some interesting uses for postfix operators, or literals, 
> or whatever we want to call them:
> 
> 45°
> 10!!
> 23.5d
> 3d6
> 35'24"
> 15ell
> 
> I've deliberately not explained what I mean by each of them. You can 
> probably guess some, or all, but I hope it demonstrates one problem with 
> this suggestion. Like operator overloading, it risks making code less 
> clear rather than more.

Sure. In fact, it's very closely analogous--both of them are ways to allow a user-defined type to act more like a builtin type, which can be abused to do completely different things instead. The C++ proposal specifically pointed out this comparison.

I think the risk is lower in Python than in C++, because idiomatic Python discourages magical or idiosyncratic programming much more strongly in general. As a result, operator overloading is already used more consistently and less confusingly than in C++, and the same is more likely to be true of this new feature. But of course the risk isn't zero.

Again, I'm hoping people will play around with it, come up with example code they can show to other people for impressions, etc., rather than trying to guess, or come up with some abstract argument. It's certainly possible that everything that looks like a good example when you think of it will look too magical to anyone who reads your code. Then the idea can be rejected, and if anyone thinks of a similar idea in the future, they can be pointed to the existing examples and asked, "Can your idea solve these problems?"