[Python-ideas] User-defined literals

Paul Moore p.f.moore at gmail.com
Thu Jun 4 15:06:12 CEST 2015


On 4 June 2015 at 13:08, Steven D'Aprano <steve at pearwood.info> wrote:
> I don't wish to argue about other languages, but I think for Python, the
> important characteristic of "literals" is that they are static, as
> above, not "simple". An expression with nested containers isn't
> necessarily simple:
>
>     {0: [1, 2, {3, 4, (5, 6)}]}  # add arbitrary levels of complexity
>
> nor is it necessarily constructed as a compile-time constant, but it is
> static in the above sense.

I think that the main reason people keep asking for things like
1.2d in place of D('1.2') is simply that using a string literal, for
some reason, "feels different". It's not a technical issue, nor is it
one of compile-time constants or static values - it's simply about not
wanting to *think* of the process as passing a string literal to a
function. They want "a syntax for a decimal" rather than "a means of
getting a decimal from a string" because that's how they think of
what they are doing.

People aren't asking for decimal literals because they don't know that
they can do D('1.2'). They want to avoid the quotes because they don't
"feel right", that's all. That's why the common question is "why
doesn't D(1.2) do what I expect?" rather than "how do I include a
decimal constant in my program?"
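
For illustration, a minimal sketch of that surprise, using only the
standard decimal module. The variable names are just for this example:

```python
from decimal import Decimal as D

# Built from a string: the decimal value is exactly 1.2.
from_string = D("1.2")

# Built from a float: 1.2 is first rounded to the nearest binary float,
# and Decimal then records that float's exact value, which is not 1.2.
from_float = D(1.2)

print(from_string)                # 1.2
print(from_float == from_string)  # False
```

The quotes are what keep the source text "1.2" from being turned into
a binary float before Decimal ever sees it.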

"Literal" syntax is about taking a chunk of the source code as a
string, and converting it into a runtime object. For built-in types
the syntax is known to the lexer and the compiler knows how to create
the runtime constants (that applies as much to Python as to C or any
other language). The fundamental question here is whether there is a
Pythonic way of extending that to user-defined forms. The conversion
would have to be handled at runtime, so the *syntax* would need to be
immutable, but the *semantics* could be defined at runtime, without
violating the spirit of the request.

Such a syntax could be used for lots of things - regular expressions
are a common type that gets dedicated syntax in other languages
(JavaScript, Perl).

As a straw man, how about a new syntax (this won't work as written,
because it'll clash with the "<" operator, but the basic idea works):

    LITERAL_CALL = PRIMARY "<" <any source character except right angle bracket>* ">"

which is a new option for PRIMARY alongside CALL. This translates
directly into PRIMARY(str) where str is a string composed of the
source characters within <...>.
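
The translation rule can be approximated today with a toy source
rewriter. This is only a sketch under stated assumptions: the regular
expression and the helper name expand_literal_calls are hypothetical,
PRIMARY is narrowed to a dotted name, and a real implementation would
live in the parser rather than in a text substitution.

```python
import re
from decimal import Decimal as D

# Hypothetical approximation of LITERAL_CALL: a dotted name followed
# directly by <...>, where the body may not contain ">".
LITERAL_CALL = re.compile(r"\b([A-Za-z_][\w.]*)<([^>]*)>")


def expand_literal_calls(source):
    """Rewrite NAME<text> as NAME('text'), quoting the text with repr()."""
    return LITERAL_CALL.sub(
        lambda m: f"{m.group(1)}({m.group(2)!r})", source
    )


# The straw-man form becomes an ordinary call with a string argument.
rewritten = expand_literal_calls("x = D<1.2>")

# Run the rewritten source to show that it behaves like D('1.2').
namespace = {"D": D}
exec(rewritten, namespace)

# The same rewrite also mangles ordinary comparisons, which is the
# clash with the "<" operator.
mangled = expand_literal_calls("a<b and c>d")
```

Here rewritten is the string x = D('1.2'), while mangled shows a
chained comparison being misread as a literal call.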

Decimal "literals" would then be

    from decimal import Decimal as D
    x = D<1.2>

Code objects could be

    compile<x+1>

Regular expressions could be

    from re import compile as RE
    regex = RE<a.*([bc]+)$>

As you can see the potential for line noise and unreadable code is
there, but regular expressions always have that problem :-) Also, this
proposal gives a "literal syntax" that works with existing features,
rather than being a specialised add-on. Maybe that's a benefit (or
maybe it's over-generalisation).

Paul

