[Python-ideas] Re: Custom string prefixes

28 Aug 2019

On Wed, 28 Aug 2019 at 05:04, Andrew Barnert via Python-ideas wrote:
...
What matters here is not whether things like the OP’s czt'abc' or my 1.23f or 1.23d are literals to the compiler, but whether they’re readable ways to enter constant values to the human reader.
If so, they’re useful. Period.
Now, it’s possible that even though they’re useful, the feature is still not worth adding because of Chris’s issue that it can be abused, or because there’s an unavoidable performance cost that makes it a bad idea to rely on them, or because they’re not useful in _enough_ code to be worth the effort, or whatever. Those are questions worth discussing. But arguing about whether they meet (one of the three definitions of) “literal” is not relevant.

Extended (I'm avoiding the term "custom" for now) literals like 0.2f,
3.14D, re/^hello.*/ or qw{a b c} have a fairly solid track record in
other languages, and I think in general have proved both useful and
straightforward in those languages. And even in Python, constructs
like f-strings and complex numbers are examples of such things.
However, I know of almost no examples of other languages that have
added *user-definable* literal types (with the notable exception of
C++, and I don't believe I've seen use of that feature in user code -
which is not to say that it's not used). That to me says that there
are complexities in extending the question to user-defined literals
that we need to be careful of.

In my view, the issue isn't abuse of the feature, or performance, or
limited value. It's the very basic problem that it's *really hard* to
define and implement such a feature in a way that everyone is happy
with - particularly in a language like Python which doesn't have a
user-exposed "compile source to binary" step (I tried very hard to
cover myself against nitpicking there - I'm sure I failed, but please,
don't get sidetracked, you know what I mean here :-)). Some specific
questions which would need to be dealt with:

1. What is valid in the "literal" part of the construct (this is the
p"C:\" question)?
2. How do definitions of literal syntax get brought into scope in time
for the parser to act on them (this is about "import xyz_literal"
making xyz"a string" valid but leaving abc"a string" as a syntax
error)?
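
As a rough illustration of question 2 (a sketch only, not part of any proposal): the whole source is parsed before a single statement runs, so an import at the top of a file cannot, as things stand, make a new prefix valid further down the same file. The module name xyz_literal is the hypothetical one from the question above and is never actually imported here.

```python
# xyz_literal is a hypothetical module name; compile() only parses the
# source, it does not execute the import statement.
source = 'import xyz_literal\nvalue = xyz"a string"\n'


def try_compile(src):
    """Return 'ok' if src compiles, else the name of the exception raised."""
    try:
        compile(src, "<example>", "exec")
    except SyntaxError as exc:
        return type(exc).__name__
    return "ok"


# The parser rejects the unknown prefix before the import could ever run.
print(try_compile(source))

# Built-in prefixes are known to the parser up front, so this compiles.
print(try_compile('value = f"a string"\n'))
```

Any design therefore has to say how the parser learns about a prefix before it reaches the line that uses it.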

These questions also fundamentally affect other tools like IDEs,
linters, code formatters, etc.

In addition, there is the question of how user-defined literals would
get turned into constants within the code. In common with list
expressions, tuples, etc, user-defined literals would need to be
handled as translating into runtime instructions for constructing the
value (i.e., a function call). But people typically don't expect
values that take the form of a literal like this to be "just" syntax
sugar for a function call. So there's an education issue here. Code
will get errors at runtime that the users might have expected to
happen at compile time, or in the linter.
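
A small sketch of that difference, under the assumption (mine, for illustration) that something like 1.2.3d would desugar to a call such as Decimal("1.2.3"). A malformed built-in literal is rejected when the code is compiled, while the equivalent function call compiles cleanly and only fails when it is executed.

```python
from decimal import Decimal, InvalidOperation

# A malformed built-in float literal never gets past the compiler.
try:
    compile("x = 1.2.3", "<example>", "exec")
    literal_error = None
except SyntaxError:
    literal_error = "compile time"

# Assumed desugaring of a user-defined literal: an ordinary function call.
# The compiler has no idea the argument is malformed, so this succeeds.
code = compile('x = Decimal("1.2.3")', "<example>", "exec")

# The error only shows up when the call actually runs.
try:
    exec(code, {"Decimal": Decimal})
    call_error = None
except InvalidOperation:
    call_error = "run time"

print(literal_error, call_error)
```

A linter that only parses the source sees nothing wrong with the second form either, unless it is taught about each prefix individually.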

It's not that these questions can't be answered. Obviously they can,
as you produced a proof of concept implementation. But the design
trade-offs that one person might make are deeply unsatisfactory to
someone else, and there's no "obviously right" answer (at least not
yet, as no-one Dutch has explained what's obvious ;-))

Also, it's worth noting that the benefits of *user-defined* literals
are *not* the same as the benefits of things like 0.2f, or 3.14d, or
even re/^hello.*/. Those things may well be useful. But the benefit
you gain from *user-defined* literals is that of letting the end user
make the design decisions, rather than the language designer. And
that's a subtly different thing.

So, to summarise, the real problem with user-defined literal proposals
is that the benefit they give hasn't yet proven sufficient to push
anyone to properly address all of the design-time details. We keep
having high-level "would this be useful" debates, but never really
focus on the key question, of what, in precise detail, is the "this"
that we're talking about - so people are continually making arguments
based on how they conceive such a feature might work. A really good
example here is the p"C:\" question. Is the proposal that the "string
part" of the literal is just a normal string? If so, then how do you
address the genuine issue that not every path is a valid string literal? What about
backslash-escapes (p"C:\temp")? Is the string a raw string or not? If
the proposal is that the path-literal code can define how the string
is parsed, then *how does that work*?
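
To make the p"C:\" question concrete, here is how today's string rules treat the text between the quotes (a sketch of current behaviour only; it assumes nothing about how a path prefix would work). Neither a normal nor a raw string can end in a single backslash, and a normal string silently turns \t into a tab.

```python
def compiles(src):
    """Return True if src is a syntactically valid expression."""
    try:
        compile(src, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Source text for each literal, held in single-quoted raw strings so the
# backslashes reach compile()/eval() unchanged.
plain_root = r'"C:\"'       # the backslash escapes the closing quote
raw_root = r'r"C:\"'        # a raw string cannot end in a lone backslash either
plain_temp = r'"C:\temp"'   # \t is read as a tab character
raw_temp = r'r"C:\temp"'    # the backslash is kept literally

print(compiles(plain_root), compiles(raw_root))
print(repr(eval(plain_temp)), repr(eval(raw_temp)))
```

So "just a normal string" and "just a raw string" both leave C:\ unwritable, which is why the parsing rules for the string part need to be spelled out.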

The OP even made this point explicitly:
...
I'm not discussing possible implementation of this feature just yet, we can get to
that point later when there is a general understanding that this is worth considering.

I don't think we *can* agree on much without the implementation
details (well, other than "yes, it's worth discussing, but only if
someone proposes a properly specified design" ;-))

Paul

Paul Moore