[Python-ideas] User-defined literals

Andrew Barnert abarnert at yahoo.com
Wed Jun 3 02:47:03 CEST 2015


On Jun 2, 2015, at 12:40, Chris Angelico <rosuav at gmail.com> wrote:
> 
> On Wed, Jun 3, 2015 at 5:03 AM, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> Of course `0x12decimal` becomes `literal_imal('0x12dec')`, and `21jump` becomes `literal_ump('21j')`, which are not at all useful, and potentially confusing, but I don't think that would be a serious problem in practice.
> 
> There's probably no solution to the literal_imal problem, but the
> easiest fix for literal_ump is to have 21j be parsed the same way -
> it's a 21 modified by j, same as 21jump is a 21 modified by jump.

Thanks; I should have thought of that--especially since that's exactly how C++ solves similar problems. (Although reserving all suffixes that don't start with an underscore for the implementation's use doesn't hurt...)
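
A rough sketch of that rule, assuming a toy regex-based splitter rather than the real tokenizer. `split_literal` and the pattern are made up for illustration, and binary, octal, and underscore forms are ignored. Every trailing identifier, including a bare `j`, is treated as a suffix:

```python
import re

# Toy pattern (an assumption, not the real grammar): a hex integer, or a
# decimal integer/float with an optional exponent, then an optional suffix.
_LITERAL = re.compile(r"""
    (?P<number>
        0[xX][0-9a-fA-F]+
      | (?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?
    )
    (?P<suffix>[A-Za-z_]\w*)?
    $
""", re.VERBOSE)


def split_literal(text):
    """Split a numeric token into (number_text, suffix); suffix may be ''."""
    match = _LITERAL.match(text)
    if match is None:
        raise SyntaxError("invalid literal: %r" % (text,))
    return match.group("number"), match.group("suffix") or ""


print(split_literal("21j"))          # ('21', 'j')
print(split_literal("21jump"))       # ('21', 'jump')
print(split_literal("0x12decimal"))  # ('0x12dec', 'imal')
```

The hex case still splits as `0x12dec` plus `imal`, because `d`, `e`, and `c` are valid hex digits; only the `j` case is fixed by this rule.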

>> Unlike C++, the lookup of that literal function happens at runtime, so `1.2z3` is no longer a SyntaxError, but a NameError on `literal_z3`. Also, this means `literal_d` has to be in scope in every module you want decimal literals in, which often means a `from … import` (or something worse, like monkeypatching builtins). C++ doesn't have that problem because of argument-dependent lookup, but that doesn't work for any other language. I think this is the biggest flaw in the proposal.
> 
> I'd much rather see it be done at compile time. Something like this:
> 
> compile("x = 1d/10", "<>", "exec")
> 
> would immediately call literal_d("1") and embed its return value in
> the resulting code as a literal. (Since the peephole optimizer
> presumably doesn't currently understand Decimals, this would probably
> keep the division, but if it got enhanced, this could end up
> constant-folding to Decimal("0.1") before returning the code object.)
> So it's only the compilation step that needs to know about all those
> literal_* functions. Should there be a way to globally register them
> for default usage, or is this too much action-at-a-distance?

It would definitely be nicer to have it done at compile time if possible. I'm just not sure there's a good design that makes it possible.

In particular, with your suggestion (which I considered), it seems a bit opaque to me that `1.2d` is an error unless you _or some other module_ first imported `decimalliterals`; it's definitely more explicit if you (not some other module) have to write `from decimalliterals import literal_d`. (And when you really want to be implicit, you can inject it into other modules or into builtins, the same as any other rare case where you really want to be implicit.)

But many real projects are either complex enough to need centralized organization or simple enough to fit in one script, so maybe it wouldn't turn out too "magical" in practice.
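
To make the trade-off concrete, here is a minimal sketch of the two lookup strategies. All of the names (`resolve_explicit`, `resolve_registered`, `register_literal`, the `_REGISTRY` dict) are hypothetical and exist only for this illustration; neither design is implemented anywhere.

```python
from decimal import Decimal

# Hypothetical process-wide registry that a compile-time design might consult.
_REGISTRY = {}


def register_literal(suffix, func):
    """Make `suffix` available to every module compiled afterwards."""
    _REGISTRY[suffix] = func


def resolve_explicit(suffix, module_globals):
    """Runtime design: the handler must be in the using module's namespace."""
    name = "literal_" + suffix
    try:
        return module_globals[name]
    except KeyError:
        raise NameError("name %r is not defined" % (name,)) from None


def resolve_registered(suffix):
    """Compile-time design: the handler must have been registered by someone."""
    try:
        return _REGISTRY[suffix]
    except KeyError:
        raise SyntaxError("unknown literal suffix %r" % (suffix,)) from None


# Explicit: works only if this module did the import itself.
handler = resolve_explicit("d", {"literal_d": Decimal})
print(handler("1.2"))

# Registered: works only after some module, anywhere, ran this line.
register_literal("d", Decimal)
print(resolve_registered("d")("1.2"))
```

With the registry, whether `1.2d` compiles depends on import order across the whole program, which is the action-at-a-distance concern.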

>> Also unlike C++, there's no overloading on different kinds of literals; the conversion function has no way of knowing whether the user actually typed a string or a number. This could easily be changed (e.g., by using different names, or just by passing the repr of the string instead of the string itself), but I don't think it's necessary.
> 
> I'd be inclined to simply always provide a string. The special case
> would be that the quotes can sometimes be omitted, same as redundant
> parens on genexprs can sometimes be omitted.

Yes, that's what I thought too. The only real use case C++ has for this is allowing the same suffix to mean different things for different types, which I think would be more of a bug magnet than a feature if anyone actually did it...
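
A small illustration of why the handler should see the source text rather than an already-built float. `literal_d` here is just a stand-in for whatever the suffix function would be:

```python
from decimal import Decimal


def literal_d(text):
    """Hypothetical suffix handler: receives the literal's source text."""
    return Decimal(text)


exact = literal_d("1.2")  # what `1.2d` would produce under this scheme
lossy = Decimal(1.2)      # what you get if the float is constructed first

print(exact == lossy)     # False
print(exact * 10 == 12)   # True
```

The float `1.2` is already a binary approximation by the time `Decimal` sees it, so the extra digits are baked in; passing the string avoids that.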

> Otherwise, 1.2d might
> still produce wrong results.
> 
>> Similarly, this idea could be extended to handle all literal types, so you can do `{'spam': 1, 'eggs': 2}_o` to create an OrderedDict literal, but I think that's ugly enough to not be worth proposing. (A prefix looks better there... but a prefix doesn't work for numbers or strings. And I'm not sure it's unambiguously parseable even for list/set/dict.) Plus, there's the problem that comprehensions and actual literals are both parsed as displays, but you wouldn't want user-defined comprehensions.
> 
> I thought there was no such thing as a dict/list/set literal, only
> display syntax?

That's what I meant in the last sentence: technically, there's no such thing as a dict literal, just a dict display that isn't a comprehension. I don't think you want user-defined suffixes on comprehensions, and coming up with a principled and simply-implementable way to make them work on literal-type displays but not comprehension-type displays doesn't seem like an easy problem.
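
For reference, the distinction is at least visible after parsing: CPython's `ast` module gives literal-style displays and comprehensions different node types. This only illustrates the distinction (`display_kind` is a made-up helper); whether it translates into a clean rule for suffixes is a separate question.

```python
import ast


def display_kind(source):
    """Return the AST node type name for a single expression."""
    return type(ast.parse(source, mode="eval").body).__name__


print(display_kind("{'spam': 1, 'eggs': 2}"))    # Dict
print(display_kind("{k: v for k, v in pairs}"))  # DictComp
print(display_kind("[1, 2, 3]"))                 # List
print(display_kind("[x for x in xs]"))           # ListComp
```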

> In any case, that can always be left for a future
> extension to the proposal.
> 
> ChrisA

