[Python-ideas] Verbatim names (allowing keywords as names)

Steven D'Aprano steve at pearwood.info
Wed May 16 20:21:55 EDT 2018


On Thu, May 17, 2018 at 10:58:34AM +1200, Greg Ewing wrote:

> The trouble with explicitly overriding keywords is that it
> still requires old code to be changed whenever a new keyword
> is added, which as far as I can see almost completely defeats
> the purpose. If e.g. you need to change all uses of given
> to \given in order for your code to keep working in
> Python 3.x for some x, you might just as well change it
> to given_ or some other already-legal name.

Well, maybe. Certainly using name_ is a possible solution, and it is one 
which has worked for over a quarter century.

We can argue about whether \name or name_ looks nicer, but \name 
has one advantage: the key used in the namespace is actually "name". 
That's important: see below.


> The only remotely legitimate use I can think of is for
> calling APIs that come from a different language, but the
> same thing applies there -- names in the Python binding can
> always be modified somehow to make them legal.

Of course they can be modified. But having to do so is a pain.

With the status quo, when dealing with external data which may include 
names which are keywords, we have to:

- add an underscore when we read keywords from external data
- add an underscore when used as obj.kw literals
- add an underscore when used as getattr(obj, "kw") literals
- conditionally remove the trailing underscore when writing to external APIs

With verbatim names, that changes to:

- do nothing special when we read keywords from external data
- add a backslash when used as obj.kw literals
- do nothing special when used as getattr(obj, "kw") literals
- do nothing special when writing to external APIs


I think that overall this pushes it from a mere matter of visual 
preference (\kw versus kw_) to a significant win for verbatim names.
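
To make the "do nothing special" rows concrete, here is a minimal sketch 
that runs on current Python. The attribute name "class" and the value are 
just illustrative choices of mine. It shows that a keyword is already a 
legal key in an object's namespace; only the dotted literal is off limits, 
and that is the one spot where the backslash would be needed.

```python
from types import SimpleNamespace

obj = SimpleNamespace()

# A keyword is a perfectly good attribute name when given as a string.
setattr(obj, "class", "first")

# The key stored in the namespace is the keyword itself, no underscore.
keys = list(vars(obj))
value = getattr(obj, "class")
```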

Let's say you're reading from a CSV file, creating an object from each 
row, and processing it:

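# untested
import csv
import keyword
from types import SimpleNamespace

reader = csv.reader(infile)
header = next(reader)
# Append "_" to any column name that clashes with a keyword.
header = [name + "_" if keyword.iskeyword(name) else name for name in header]
for row in reader:
    obj = SimpleNamespace(**dict(zip(header, row)))
    process(obj)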


The consumer of these objects, process(), has to reverse the 
transformation:

def process(obj):
    for name, value in vars(obj).items():
        # Strip the underscore that was added to dodge a keyword.
        if name.endswith("_") and keyword.iskeyword(name[:-1]):
            name = name[:-1]
        write_to_external_API(name, value)


Verbatim names let us skip both of these boilerplate steps.
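
For anyone who wants to see the status quo round trip actually run, here 
is a self-contained sketch. The in-memory CSV data and the stub 
write_to_external_API (which just records what it is given) are my own 
stand-ins, not a real API.

```python
import csv
import io
import keyword
from types import SimpleNamespace

sent = []  # stand-in for whatever the external API would receive


def write_to_external_API(name, value):
    # Hypothetical stub: record the call instead of talking to anything.
    sent.append((name, value))


def process(obj):
    for name, value in vars(obj).items():
        # Undo the underscore that was added on the way in.
        if name.endswith("_") and keyword.iskeyword(name[:-1]):
            name = name[:-1]
        write_to_external_API(name, value)


infile = io.StringIO("class,size\nfirst,3\n")
reader = csv.reader(infile)
header = next(reader)
# Mangle keyword column names so they are usable as obj.attr literals.
header = [name + "_" if keyword.iskeyword(name) else name for name in header]
for row in reader:
    obj = SimpleNamespace(**dict(zip(header, row)))
    process(obj)
```

Note that the reverse step cannot tell a mangled "class" from a column 
that was genuinely called "class_" in the source data, which is part of 
why the removal has to be described as conditional.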

An interesting case is when you are using the keywords as hard-coded 
names for attribute access. In the status quo, we write:

    obj.name_
    getattr(obj, "name_")

In the first line, if you neglect the _ the compiler will complain and 
you get a syntax error. In the second line, if you neglect the _ you'll 
get no warning, only a runtime failure.
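
A small sketch of those two failure modes on current Python, using 
"class" as the keyword and an object of my own invention that stores it 
as class_. The dotted form is compiled from a string so that the 
SyntaxError can be caught and inspected.

```python
from types import SimpleNamespace

obj = SimpleNamespace(class_="first")

# Forgetting the underscore in dotted access: rejected at compile time.
try:
    compile("obj.class", "<demo>", "eval")
    dotted = "compiled"
except SyntaxError:
    dotted = "SyntaxError"

# Forgetting the underscore in getattr: compiles fine, fails when run.
try:
    getattr(obj, "class")
    looked_up = "found"
except AttributeError:
    looked_up = "AttributeError"
```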

With verbatim names, we can write:

    obj.\name
    getattr(obj, "name")  # Don't escape it here!

In this case, the failure modes are similar:

- if you forget the backslash in the first line, you get a 
  SyntaxError at compile time, so there's no change here.

- if you wrongly include the backslash in the second line,
  there are two cases:

  * if the next character matches a string escape, say \n
    or \t, you'll get no error but a runtime failure;

    (but linters could warn about that)

  * if it doesn't match, say \k, you'll now get a warning
    and eventually a failure as we deprecate silently
    ignoring unrecognised backslash escapes.
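
Both string cases can be checked on current Python with a sketch like the 
following. The attribute names are placeholders of mine, and the warning 
category depends on the Python version (DeprecationWarning or 
SyntaxWarning), so the sketch only collects whatever is raised. On a 
future version that turns the unknown escape into a hard error, the 
compile call would raise SyntaxError instead.

```python
import warnings
from types import SimpleNamespace

obj = SimpleNamespace(name=1, kw=2)

# Case 1: "\n" is a valid escape, so this is a newline followed by "ame".
wrong = "\name"
try:
    getattr(obj, wrong)
    case1 = "found"
except AttributeError:
    case1 = "AttributeError"

# Case 2: "\k" is not a valid escape. Compile it from a raw string so the
# warning is raised here rather than when this file itself is compiled.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = eval(compile(r'"\kw"', "<demo>", "eval"))
categories = [w.category for w in caught]
```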


-- 
Steve

