[Python-ideas] User-defined literals

Andrew Barnert abarnert at yahoo.com
Fri Jun 5 10:47:43 CEST 2015


On Jun 5, 2015, at 00:06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> On 5 June 2015 at 09:03, Andrew Barnert <abarnert at yahoo.com> wrote:
>>> On Jun 4, 2015, at 06:48, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> 
>>>> On 4 June 2015 at 23:06, Paul Moore <p.f.moore at gmail.com> wrote:
>>>> As a straw man how about a new syntax (this won't work as written,
>>>> because it'll clash with the "<" operator, but the basic idea works):
>>>> 
>>>>   LITERAL_CALL = PRIMARY "<" <any source character except right angle bracket>* ">"
>>> 
>>> The main idea I've had for compile time metaprogramming that I figured
>>> I might be able to persuade Guido not to hate is:
>>> 
>>>  python_ast, names2cells, unbound_names = !(this_is_an_arbitrary_python_expression)
>>> 
>>> As suggested by the assignment target names, the default behaviour
>>> would be to compile the expression to a Python AST, and then at
>>> runtime provide some relevant information about the name bindings
>>> referenced from it. (I haven't even attempted to implement this,
>>> although I've suggested it to some of the SciPy folks as an idea they
>>> might want to explore to make R style lazy evaluation easier)
>>> 
>>> By using the prefix+delimiters notation, it would become possible to
>>> later have variants that were similarly transparent to the compiler,
>>> but *called* a suitably registered callable at compile time to do the
>>> conversion to runtime Python objects. For example:
>>> 
>>>  !sh(shell command)
>>>  !format(format string with implicit interpolation)
>>>  !sql(SQL query)
>>> 
>>> So for custom numeric types, you could register:
>>> 
>>>   d = !decimal(1.2)
>>>   r = !rational(22/7)
>> 
>> But what would that get you?
>> 
>> If it's meant to be a "compile-time decimal value"... What kind of value is that? What ends up in your co_consts? An instance of decimal.Decimal? How does that get marshaled?
>> 
>> Also, what's the point of it being compile-time? Unless there's some way to write arbitrary code that operates at compile time (like Lisp special forms, or C++ constexpr functions), what code is going to care about the difference between a compile-time decimal value and a run-time decimal value?
>> 
>> Also, where and how do you define sh, decimal, sql, etc.? I'm having a hard time seeing how you have any different options than my proposal does. You could have a function named bang_decimal that's looked up normally, or some way to register_bang_function('decimal', my_decimal_parser), or any of the other options mentioned in this thread, but what's the difference (other than there being a default "no-name" function that does an AST parse and name binding, which doesn't really seem related to any of the non-default examples)?
> 
> The larger idea (again, keeping in mind I haven't actually fully
> thought through how to implement this) is to give the parsers access
> to the surrounding namespace, which means that the compiler needs to
> be made aware of any *actual* name references, and the *way* names are
> referenced would be parser dependent (shell variables, format string
> interpolation, SQL interpolation, etc).
> 
> So, for example:
> 
>    print(!format(The {item} cost {amount} {units}))
> 
> Would roughly translate to:
> 
>    print("The {item} cost {amount} {units}".format(
>        item=item, amount=amount, units=units))
> 
> It seemed relevant in this context, as a compile time AST
> transformation would let folks define their own pseudo-literals. Since
> marshal wouldn't know how to handle them, the AST produced at compile
> time would still need to be for a runtime constructor call rather than
> for a value to be stored in co_consts. These cases:
> 
>    d = !decimal(1.2)
>    r = !rational(22/7)
> 
> Might simply translate directly to the following as the runtime code:
> 
>    d = decimal.Decimal("1.2")
>    r = fractions.Fraction(22, 7)
> 
> With the difference being that the validity of the passed in string
> would be checked at compile time rather than at runtime, so you could
> only use it for literal values, not to construct values from
> variables.
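
To make the `!format` translation quoted above concrete, here is a minimal sketch of the AST such a parser might hand back to the compiler. `expand_format` is a hypothetical name, nothing in this thread defines it, and it only accepts plain identifiers as field names (no `{0}`, `{a.b}`, or nested format specs), which is a simplification on my part.

```python
import ast
from string import Formatter


def expand_format(template):
    """Return an Expression AST for template.format(name=name, ...).

    Hypothetical helper: a stand-in for what a registered !format
    parser could return; not part of any actual proposal.
    """
    names = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _, field, _, _ in Formatter().parse(template):
        if field is None:
            continue  # trailing literal text with no replacement field
        if not field.isidentifier():
            raise SyntaxError(f"unsupported field {field!r} in format literal")
        if field not in names:
            names.append(field)
    call = ast.Call(
        func=ast.Attribute(
            value=ast.Constant(value=template), attr="format", ctx=ast.Load()
        ),
        args=[],
        # Each field becomes a real name reference the compiler can see.
        keywords=[
            ast.keyword(arg=n, value=ast.Name(id=n, ctx=ast.Load()))
            for n in names
        ],
    )
    return ast.fix_missing_locations(ast.Expression(body=call))


tree = expand_format("The {item} cost {amount} {units}")
namespace = {"item": "widget", "amount": 3, "units": "USD"}
result = eval(compile(tree, "<format>", "eval"), namespace)
```

The point of building `ast.Name` nodes rather than looking names up by string at runtime is that the compiler then knows about `item`, `amount`, and `units` as ordinary name references.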

Note that, as discussed earlier in this thread, it is far easier to accidentally shadow `decimal` than something like `literal_decimal` or `bang_parser_decimal`, so there's a cost to doing this half-way at compile time, not just a benefit.
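
A small illustration of that cost, on one reading of it: if the compile-time step emits `decimal.Decimal("1.2")` as the runtime code, that code still looks up `decimal` at runtime, so any unrelated binding of that name breaks it. The `run` helper and the namespaces below are made up for the demonstration.

```python
import decimal

# Runtime code from the quoted translation of `d = !decimal(1.2)`.
GENERATED = 'd = decimal.Decimal("1.2")'


def run(namespace):
    """Execute the generated code in the given module-like namespace."""
    exec(GENERATED, namespace)
    return namespace["d"]


# Works when `decimal` is the module, as the translation assumes.
ok = run({"decimal": decimal})

# Fails when the module happens to use `decimal` for something else.
try:
    run({"decimal": 1.2})
except AttributeError as exc:
    error = exc
else:
    error = None
```

A name like `literal_decimal` is much less likely to collide with something the user wrote on purpose.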

Also, a registry is definitely more "magical" than an explicit import: something some other module imported that isn't even visible in this module has changed the way this module is run, and even compiled. Of course that's true for import hooks as well, but I think in the case of import hooks there's really no avoiding the magic; in this case, there is. Obviously explicit vs. implicit isn't the only factor in usability/readability, so it's possible it would be better anyway, but I'm not sure it is.

At any rate, although you haven't shown how you expect these functions to be implemented, I think this proposal ends up being roughly equivalent to mine. Sure, the `bang_parser_decimal` function can compile the source to an AST and look up names in some way, but `literal_decimal` can do that too. And presumably whatever helper functions you were imagining to make that easier could still be written. So it's ultimately just bikeshedding the syntax, and whether you use a registry vs. normal lookup.
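
For concreteness, here is a rough sketch of the registry variant, under several assumptions. Since `!decimal(1.2)` isn't valid syntax today, a call spelled `bang_decimal("1.2")` stands in for it; `BANG_PARSERS`, `BangTransformer`, and `compile_with_bangs` are hypothetical names; and the parser returns AST for a runtime constructor call rather than a constant, as in the quoted text. Swapping the registry lookup for a normal name lookup would change one line, which is more or less the equivalence I mean.

```python
import ast
import decimal

# Hypothetical registry; neither proposal in this thread specifies one.
BANG_PARSERS = {}


def register_bang_function(name, parser):
    BANG_PARSERS[name] = parser


def my_decimal_parser(text):
    """Validate text now; return AST for a runtime Decimal(...) call."""
    try:
        decimal.Decimal(text)
    except decimal.InvalidOperation:
        raise SyntaxError(f"invalid decimal literal: {text!r}") from None
    # Use __import__ so the generated code does not depend on a
    # `decimal` name being bound in the module being compiled.
    source = f"__import__('decimal').Decimal({text!r})"
    return ast.parse(source, mode="eval").body


register_bang_function("decimal", my_decimal_parser)


class BangTransformer(ast.NodeTransformer):
    """Rewrite bang_NAME("...") calls; a stand-in for !NAME(...)."""

    def visit_Call(self, node):
        self.generic_visit(node)
        if (isinstance(node.func, ast.Name)
                and node.func.id.startswith("bang_")
                and len(node.args) == 1 and not node.keywords
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            parser = BANG_PARSERS.get(node.func.id[len("bang_"):])
            if parser is not None:
                new_node = parser(node.args[0].value)
                return ast.copy_location(new_node, node)
        return node


def compile_with_bangs(source, filename="<bang>"):
    tree = BangTransformer().visit(ast.parse(source, filename))
    return compile(ast.fix_missing_locations(tree), filename, "exec")


module_namespace = {}
exec(compile_with_bangs('d = bang_decimal("1.2")'), module_namespace)
```

A malformed literal such as `bang_decimal("one point two")` raises `SyntaxError` from `compile_with_bangs`, before any of the module's code runs.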

> As far as registration goes, yes, there'd need to be a way to hook the
> compiler to notify it of the existence of these compile time AST
> generation functions. Dave Malcolm's patch to allow parts of the
> compiler to be written in Python rather than C
> (https://bugs.python.org/issue10399) might be an interesting place to
> start on that front.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

