[Python-ideas] Verbatim names (allowing keywords as names)
Steven D'Aprano
steve at pearwood.info
Wed May 16 20:21:55 EDT 2018
On Thu, May 17, 2018 at 10:58:34AM +1200, Greg Ewing wrote:
> The trouble with explicitly overriding keywords is that it
> still requires old code to be changed whenever a new keyword
> is added, which as far as I can see almost completely defeats
> the purpose. If e.g. you need to change all uses of given
> to \given in order for your code to keep working in
> Python 3.x for some x, you might just as well change it
> to given_ or some other already-legal name.
Well, maybe. Certainly using name_ is a possible solution, and it is one
which has worked for over a quarter century.
We can argue about whether \name or name_ looks nicer, but \name
has one advantage: the key used in the namespace is actually "name".
That's important: see below.
> The only remotely legitimate use I can think of is for
> calling APIs that come from a different language, but the
> same thing applies there -- names in the Python binding can
> always be modified somehow to make them legal.
Of course they can be modified. But having to do so is a pain.
With the status quo, when dealing with external data which may include
names which are keywords, we have to:
- add an underscore when we read keywords from external data
- add an underscore when used as obj.kw literals
- add an underscore when used as getattr("kw") literals
- conditionally remove trailing underscore when writing to external APIs
Verbatim names would change that to:
- do nothing special when we read keywords from external data
- add a backslash when used as obj.kw literals
- do nothing special when used as getattr("kw") literals
- do nothing special when writing to external APIs
I think that overall this pushes it from a mere matter of visual
preference \kw versus kw_ to a significant win for verbatim names.
Let's say you're reading from a CSV file, creating an object from each
row, and processing it:
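    # untested
    import csv
    import keyword
    from types import SimpleNamespace

    reader = csv.reader(infile)
    header = next(reader)
    # Append "_" to any column name that clashes with a keyword.
    header = [name + "_" if name in keyword.kwlist else name for name in header]
    for row in reader:
        obj = SimpleNamespace(**dict(zip(header, row)))
        process(obj)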
The consumer of these objects, process(), has to reverse the
transformation:
    def process(obj):
        for name, value in vars(obj).items():
            # Strip the underscore that was added when the data was read.
            if name.endswith("_") and name[:-1] in keyword.kwlist:
                name = name[:-1]
            write_to_external_API(name, value)
Verbatim names let us skip both of these boilerplate steps.
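For comparison, here is a minimal runnable sketch of the same pipeline with no renaming at all. It relies only on the fact that current Python already accepts keywords as attribute names when they arrive as strings (through ** unpacking, getattr or vars). What it cannot do today is spell obj.class literally, which is the gap the proposed obj.\class would fill. The sample data and the body of write_to_external_API are made up for illustration.

```python
import csv
import io
from types import SimpleNamespace

# Hypothetical input: two of the column names are Python keywords.
infile = io.StringIO("class,import,size\nmammal,no,3\nbird,yes,1\n")

written = []  # stand-in for the external API, for illustration only


def write_to_external_API(name, value):
    written.append((name, value))


def process(obj):
    # No underscore to strip: the namespace keys are the real names.
    for name, value in vars(obj).items():
        write_to_external_API(name, value)


reader = csv.reader(infile)
header = next(reader)
for row in reader:
    obj = SimpleNamespace(**dict(zip(header, row)))
    process(obj)

# Hard-coded access still needs getattr today, since obj.class
# is a SyntaxError in current Python.
last_class = getattr(obj, "class")
```

Both the read side and the write side are free of keyword checks here; only the hard-coded attribute access is clumsier than it would be with a verbatim name.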
An interesting case is when you are using the keywords as hard-coded
names for attribute access. In the status quo, we write:
    obj.name_
    getattr(obj, "name_")
In the first line, if you neglect the _ the compiler will complain and
you get a syntax error. In the second line, if you neglect the _ you'll
get no warning, only a runtime failure.
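The two status quo failure modes can be checked in current Python. This is only a sketch: it uses the real keyword class in place of "name", and calls compile() so that the syntax error can be observed without stopping the script.

```python
from types import SimpleNamespace

# Status quo: the keyword is stored with a trailing underscore.
obj = SimpleNamespace(class_="mammal")

# Forgetting the underscore in dotted access is caught at compile time.
try:
    compile("obj.class", "<example>", "eval")
    dotted_result = "compiled"
except SyntaxError:
    dotted_result = "SyntaxError"

# Forgetting it inside the string compiles without complaint and only
# fails when that line actually runs.
code = compile('getattr(obj, "class")', "<example>", "eval")
try:
    eval(code, {"obj": obj})
    string_result = "ok"
except AttributeError:
    string_result = "AttributeError"
```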
With verbatim names, we can write:
    obj.\name
    getattr(obj, "name")  # Don't escape it here!
In this case, the failure modes are similar:
- if you forget the backslash in the first line, you get a
  SyntaxError at compile time, so there's no change here.
- if you wrongly include the backslash in the second line,
  there are two cases:
  * if the next character matches a string escape, say \n
    or \t, you'll get no compile-time error, only a runtime
    failure (but linters could warn about that);
  * if it doesn't match, say \k, you'll now get a warning
    and eventually a failure as we deprecate silently
    ignoring backslashes.
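Both string cases can be observed in current Python, independently of the proposal. A small sketch, again using "name" as a stand-in for a keyword; the exact warning category for an unrecognised escape depends on the Python version, so the code only collects whatever is raised.

```python
import warnings
from types import SimpleNamespace

obj = SimpleNamespace()
setattr(obj, "name", 1)  # "name" stands in for a keyword here

# "\n" is a valid escape, so "\name" is a newline followed by "ame".
wrong = "\name"
try:
    getattr(obj, wrong)
    escaped_lookup = "ok"
except AttributeError:
    escaped_lookup = "AttributeError"

# "\k" is not a valid escape. Compiling such a literal emits a warning
# (DeprecationWarning in older 3.x releases, SyntaxWarning in newer ones).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    compile(r'"\kw"', "<example>", "eval")
categories = {w.category for w in caught}
```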
--
Steve