[Python-ideas] User-defined literals

Tue Jun 2 21:40:01 CEST 2015

On Wed, Jun 3, 2015 at 5:03 AM, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> Of course `0x12decimal` becomes `literal_imal('0x12dec')`, and `21jump` becomes `literal_ump('21j')`, which are not at all useful, and potentially confusing, but I don't think that would be a serious problem in practice.
>

There's probably no solution to the literal_imal problem, but the
easiest fix for literal_ump is to have 21j be parsed the same way -
it's a 21 modified by j, same as 21jump is a 21 modified by jump.
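
As a rough sketch of that rule (not part of the proposal itself), the
snippet below splits a token into its numeric part and its suffix, so
that 21j and 21jump go through the same path. The names split_literal
and literal_j are hypothetical, and the pattern only covers plain
decimal numbers, so it says nothing about the 0x12decimal case.

```python
import re

# Simplified assumption: digits with an optional fraction, then an
# identifier-like suffix. Hex, exponents etc. are deliberately ignored.
_NUMBER_SUFFIX = re.compile(r"(\d+(?:\.\d+)?)([A-Za-z_]\w*)")


def split_literal(token):
    """Return (number_text, suffix) for a suffixed numeric token."""
    match = _NUMBER_SUFFIX.fullmatch(token)
    if match is None:
        raise ValueError(f"not a suffixed number: {token!r}")
    return match.groups()


def literal_j(text):
    """Hypothetical handler giving 21j its usual imaginary meaning."""
    return complex(0, float(text))


print(split_literal("21j"))     # ('21', 'j')
print(split_literal("21jump"))  # ('21', 'jump')
print(literal_j("21") == 21j)   # True
```

With this split, 21jump would look up literal_jump rather than
literal_ump, and 21j would simply be the case where the suffix is j.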

> Unlike C++, the lookup of that literal function happens at runtime, so `1.2z3` is no longer a SyntaxError, but a NameError on `literal_z3`. Also, this means `literal_d` has to be in scope in every module you want decimal literals in, which often means a `from … import` (or something worse, like monkeypatching builtins). C++ doesn't have that problem because of argument-dependent lookup, but that doesn't work for any other language. I think this is the biggest flaw in the proposal.
>

I'd much rather see it be done at compile time. Something like this:

compile("x = 1d/10", "<>", "exec")

would immediately call literal_d("1") and embed its return value in
the resulting code as a literal. (Since the peephole optimizer
presumably doesn't currently understand Decimals, this would probably
keep the division, but if it got enhanced, this could end up
constant-folding to Decimal("0.1") before returning the code object.)
So it's only the compilation step that needs to know about all those
literal_* functions. Should there be a way to globally register them
for default usage, or is this too much action-at-a-distance?
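
To make the idea concrete, here is a minimal sketch that can be run
today. Everything in it is an assumption for illustration: the
LITERAL_HANDLERS registry, the compile_with_literals wrapper, and the
regex-based rewrite (which would also touch text inside string
literals). Since the existing compile() has no hook for this, the
sketch binds each converted value to a generated name instead of
embedding it in the code object.

```python
import re
from decimal import Decimal

# Hypothetical global registry: suffix -> conversion function.
LITERAL_HANDLERS = {"d": Decimal}

# Simplified pattern: a decimal number followed by an identifier suffix,
# not preceded by a word character or a dot.
_SUFFIXED = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)([A-Za-z_]\w*)")


def compile_with_literals(source, filename="<>", mode="exec",
                          handlers=LITERAL_HANDLERS):
    """Compile source, converting suffixed literals once, up front.

    Returns (code_object, constants), where constants maps generated
    names to the values produced by the handlers.
    """
    constants = {}

    def substitute(match):
        text, suffix = match.groups()
        handler = handlers.get(suffix)
        if handler is None:
            # Unregistered suffix (e.g. 21j, 1e3): leave it untouched.
            return match.group(0)
        name = f"_lit{len(constants)}"
        constants[name] = handler(text)  # handler receives the string "1"
        return name

    rewritten = _SUFFIXED.sub(substitute, source)
    return compile(rewritten, filename, mode), constants


code, constants = compile_with_literals("x = 1d/10")
namespace = dict(constants)
exec(code, namespace)
print(namespace["x"])  # 0.1
```

Only the compilation step consults the registry here; the code that
runs afterwards never looks up a literal_* name.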

> Also unlike C++, there's no overloading on different kinds of literals; the conversion function has no way of knowing whether the user actually typed a string or a number. This could easily be changed (e.g., by using different names, or just by passing the repr of the string instead of the string itself), but I don't think it's necessary.
>

I'd be inclined to simply always provide a string. The special case
would be that the quotes can sometimes be omitted, same as redundant
parens on genexprs can sometimes be omitted. Otherwise, 1.2d might
still produce wrong results, because the 1.2 would be converted to a
binary float before the conversion function ever saw it.
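
The difference is easy to see with the existing Decimal constructor,
which accepts either a float or a string. This is only a
demonstration of why the string matters, not a proposed API:

```python
from decimal import Decimal

# The float 1.2 is already a binary approximation before Decimal sees it.
from_float = Decimal(1.2)

# The string "1.2" is converted exactly.
from_string = Decimal("1.2")

print(from_float == from_string)  # False
print(from_string)                # 1.2
```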

> Similarly, this idea could be extended to handle all literal types, so you can do `{'spam': 1, 'eggs': 2}_o` to create an OrderedDict literal, but I think that's ugly enough to not be worth proposing. (A prefix looks better there... but a prefix doesn't work for numbers or strings. And I'm not sure it's unambiguously parseable even for list/set/dict.) Plus, there's the problem that comprehensions and actual literals are both parsed as displays, but you wouldn't want user-defined comprehensions.
>

I thought there was no such thing as a dict/list/set literal, only
display syntax? In any case, that can always be left for a future
extension to the proposal.
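
One way to look at the distinction, assuming CPython 3.8 or later, is
to check which AST node each form parses to. This is just an
inspection of the current parser and is not offered as a definition of
the terminology:

```python
import ast


def node_type(expression):
    """Return the AST node class name for a single expression."""
    return type(ast.parse(expression, mode="eval").body).__name__


print(node_type("1.2"))                        # Constant
print(node_type("'spam'"))                     # Constant
print(node_type("{'spam': 1, 'eggs': 2}"))     # Dict
print(node_type("[1, 2]"))                     # List
print(node_type("[n for n in range(3)]"))      # ListComp
```

Numbers and strings come out as constants, while the dict and list
forms get their own node types alongside the comprehension.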

ChrisA