[Python-ideas] Custom string prefixes

Thu May 30 22:22:13 CEST 2013

> So you would have to use some editor with up-to-date support and do it
at run-time.  No web viewers.  No distro-installed editors.  This has
a particularly smell to it.

Grep is hardly an editor with up to date support! It'd just be like looking
up a function you don't know. Either it's in the docs, or if not, you start
grepping. Literally, it will be exactly like looking up a function! It's
not like grepping for "@str_prefix\ndef regex" is any harder than grepping
for "def regex"!

> Then put them in a class with the related methods.  However, my whole
moving-things-around recommendation is orthogonal to this proposal.
Handling a prefix during a function call would be equivalent (or even
less performant) to explicitly calling the handler.

The idea was to handle the prefix at either import or compile time (i'm not
actually sure of the distinction, although i'm sure there is one), so it
would be fully inlined by the time the code starts executed (over and
over). I don't think "big class full of related static methods only called
once" is a very good way of organizing code, and the reason people do (and
they do do it!) it for use cases like this is because they have to, not
because they like to.

> p.s. Do you think this feature would ever be used in the standard
library if it made it into the language?

Maybe? I could imagine the regex module using it right away for a very nice
syntax for:

regex"<my-big-regex>".match(...)

While simultaneously getting rid of the behaviourally unspecified
global-compiled-regex-cache (what does "a few regular expressions at a
time<http://docs.python.org/3.3/library/re.html?highlight=re.sub#re.compile>"
mean anyway?) in favor of per-regex-literal interning of compiled regexes,
which is what the global-compiled-regex-cache is trying to approximate
anyway. The sqlite3 (or sqlite4 or sqlite10) module could eventually use
this to pre-parse SQL literal queries, as well as the xml libraries for
pre-parsing XML blobstrings. If any of the parser-combinator libraries like
Parsimonious end up in the std lib, they could definitely use it too.

Anyway, I don't expect everyone to get up and agree with me all of a
sudden, or to have it used in the std lib tomorrow, so it's OK if you
disagree with my judgement or your experience tells you it's a bad idea. I
just thought it's worth a substantive debate because I think it's a neat
idea that's started appearing in other places, and the OP brought it up, so
there's some indication it's not just me being crazy. I've made all the
points I want to make, so I should probably stop talking now.

-Haoyi

On Thu, May 30, 2013 at 4:00 PM, Eric Snow <ericsnowcurrently at gmail.com>wrote:

> On Thu, May 30, 2013 at 4:42 AM, Haoyi Li <haoyi.sg at gmail.com> wrote:
> >> The language needs to fit in our brains and you could argue
> > that we've already pushed past that threshold
> >
> > I would argue that "prefix just calls the registered function at import
> > time" fits with my brain better then "20 special cases in the lexer which
> > result in magic things happening", but maybe that's just me.
>
> The existing prefixes are not going away, but they are easy to look up
> in the docs.  So that brain-space is not going way.  Additionally,
> under this proposal those prefixes could change meaning at run-time.
> Furthermore, you could encounter prefixes that are not in the docs and
> you would have to know how to figure out what they are supposed to do.
>  Right now there is zero ambiguity about what the prefixes do.
>
> So this proposal would simply be adding one more thing that has to fit
> inside your brain without adding any power to the language.  On top of
> that I'm not seeing why this would be very widely used meaning people
> would probably not recognize what's going on at first.  This is rough
> on the non-experts, and adds more complexity to the language where we
> already have a perfectly good way of doing the same thing (explicit
> function calls).
>
> >
> >> One concern I'd have
> > is in how you would look up what some arbitrary prefix is supposed to
> > do
> >
> > In theory PyCharm/Vim/Emacs would be able to just jump to the definition
> of
> > the prefix, instantly giving you the docstring along with the source
> code of
> > what it does, so I don't think this is a real problem. Tool support will
> > catch up, as always, and you could always grep "@str_prefix\ndef blah" or
> > something similar.
>
> So you would have to use some editor with up-to-date support and do it
> at run-time.  No web viewers.  No distro-installed editors.  This has
> a particularly smell to it.
>
> >
> >> Furthermore, it sounds like the intent is to apply the prefix at
> > run-time.
> >
> > Is there any other time in Python? There's only import-time and
> > execution-time to choose from.
>
> 1. compile-time (during import with no up-to-date .pyc file)
> 2. run-time
> 2a. import time (execution of the module)
> 2b. call-time (execution of a function body)
>
> .pyc files allow us to skip this part after the first time.  Currently
> string prefixes are handled at compile time.  This proposal moves it
> to run-time.  Furthermore, only the raw string literals would still
> actually be literals and able to be interned/compiled away/stored in a
> functions constants.  But now there would be extra byte codes used in
> the compiled code to process the raw literals.  All for the sake of
> hypothetical use cases that can already be handled by explicit
> function calls.
>
> >
> >> That means that the function that gets applied depends on
> > what's in some registry at a given moment.  So source with these
> > custom string prefixes will be even more ambiguous when read.
> >
> > I don't think the second line follows from the first. You could say the
> same
> > of PEP302 import hooks, which overload a fundamental operation (import
> xxx)
> > to do whatever the hell you want, depending on what's in your
> sys.meta_path
> > at any given moment. Heck, you could say the same of any function call
> > my_func(...), that it depends on what my_func has been defined as at a
> given
> > moment! Remember functions can be (and often are!) rebound multiple
> times,
> > but people seem to get by just fine.
>
> Consider that the import system is a giant black box for most people
> and even gives the experts fits.  Comparing this proposal to it really
> says something.  Furthermore, existing dynamic-language-induced
> challenges do not justify adding more.
>
> >
> >> In that case, why not just call the
> > function at the module scope, bind the result to a name there, and use
> > that name wherever you like.  It's effectively a constant, right?
> >
> > The main reason I'd give is that moving a whole bunch of things into
> module
> > scope spaghettifies your program. If I'm making a bunch of SQL queries
> in a
> > bunch of functions, I expect the SQL to be in the functions where I'm
> using
> > them (nobody else cares about them, right?) rather than at module scope.
> > Moving stuff to module scope for performance adds another layer of
> > unnecessary indirection when i look at my_function and wonder what
> > my_function_sql_query and my_function_xml_template is meant to do, for
> every
> > single function.
>
> Then put them in a class with the related methods.  However, my whole
> moving-things-around recommendation is orthogonal to this proposal.
> Handling a prefix during a function call would be equivalent (or even
> less performant) to explicitly calling the handler.
>
> Bottom line: I'm still not seeing the utility of this feature over
> plain function calls.
>
> -eric
>
> p.s. Do you think this feature would ever be used in the standard
> library if it made it into the language?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130530/539cd539/attachment.html>