[Python-3000] symbols?

Nick Coghlan ncoghlan at gmail.com
Fri Apr 14 13:31:56 CEST 2006


Guido van Rossum wrote:
> I'd also like to point out (again?) that the "const-ness" of ALLCAPS
> is a red herring; in practice, we treat almost all imported names as
> constants, and in fact class and function names are generally
> considered constant. (I've also seen plenty of code that used ALLCAPS
> to indicate "configuration parameter" rather than "constant" and which
> freely assigned to ALLCAPS variables in configuration code.)

I tend to use all caps that way, too (i.e. they refer to data structures that 
won't be modified after the module has been imported, but they're fair game 
while the module itself is being initialised).

> Then, on 4/14/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> [...]
> class C:
>   x = property("_get_x", "_set_x", "_del_x",
>                "This is the x property")
> [...]
> 
> (BTW I thought I implemented this but can't find any evidence.)

I thought you'd implemented it too, so I was surprised to find it didn't work 
in 2.5a1. Could it be languishing in a working copy somewhere?
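
For reference, the string-accepting form can be approximated today with an 
ordinary helper. This is only a sketch: named_property is a hypothetical name, 
not part of the builtin property(), and it assumes the point of passing names 
is to look the accessors up on the instance at call time, so that a subclass 
can override them.

```python
def named_property(get_name, set_name=None, del_name=None, doc=None):
    """Build a property whose accessors are looked up by name on each access.

    Hypothetical helper: the builtin property() does not accept strings.
    """
    def fget(self):
        return getattr(self, get_name)()

    def fset(self, value):
        getattr(self, set_name)(value)

    def fdel(self):
        getattr(self, del_name)()

    return property(fget,
                    fset if set_name is not None else None,
                    fdel if del_name is not None else None,
                    doc)


class C:
    x = named_property("_get_x", "_set_x", "_del_x",
                       "This is the x property")

    def __init__(self):
        self._x = 0

    def _get_x(self):
        return self._x

    def _set_x(self, value):
        self._x = value

    def _del_x(self):
        del self._x


class D(C):
    # Overriding the getter alone is enough: the property finds it by name.
    def _get_x(self):
        return self._x * 2
```

Because the lookup happens on every access, D needs no property definition of 
its own.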

>> However, if a leading dot in an expression simply indicated "this is a string
>> that is also a legal Python identifier", the above could be written:
> [...]
> class C:
>   x = property(._get_x, ._set_x, ._del_x,
>                "This is the x property")
> [...]
> 
> This is slightly more Pythonic, and unambiguous, but I'd rather
> reserve leading dot for a more powerful feature, perhaps .foo as an
> abbreviation for self.foo, or some other scope-ish operator. (I also
> note that in a proportional font the . is easily missed.)

The leap from "Starts with ." to "It's a string" is a pretty substantial one 
too, so scratch that idea.

>> By using a separate syntax, you get the following benefits:
>>    1. It makes the code easier to write, as it's not a bracketed syntax, and if
>> your keyboard makes '.' inconvenient, writing Python code would already be hellish
>>    2. It lets the reader know that these values are identifiers rather than
>> arbitrary strings
>>    3. It also lets the compiler know to enforce the rules for identifiers
>> rather than the rules for strings (which are far more lax)
> 
> I'm not sure (1) weighs much but I buy (2) and some of (3), so perhaps
> we ought to think of some other prefix character (or overcome my
> objection to '.').

Crazy idea time... since we won't be using it for repr() anymore, what about 
using backtick quoting? That wouldn't make much difference for (1), but would 
provide both (2) and (3).

class C:
   x = property(`_get_x`, `_set_x`, `_del_x`,
                "This is the x property")

Since the results actually *are* strings with the given contents, the 
typographic ambiguity isn't as big an issue as it is with the current 
meaning of backticks - `foo`, 'foo' and "foo" would all result in the exact 
same object, so if the current font happened to display `foo` and 'foo' the 
same, you'd still be able to understand the code correctly.

The difference between backtick quoting and normal quoting would then be that 
the compiler would just be far pickier about what was allowed between 
backticks, with only legal identifiers permitted (i.e. must start with a 
letter or underscore, may contain only letters, numbers and underscores, no 
string escape sequences). The result would still just be a normal string.

From a 2.x compatibility standpoint:

   - python3warn would pick up any usage of backticks and recommend replacing 
`EXPR` with repr(EXPR) (or backticks could just be fully deprecated in 2.x)
   - identifier strings written within either '' or "" would work in both versions

Programs written expecting the Python 3 semantics would fail on 2.x, either 
because backticks had been fully deprecated or because evaluating the 
expression between the backticks raised a NameError.

> Somewhat unrelated: there's one advantage to enums (which have been
> suggested as an alternative to symbols) which symbols don't share: if
> you have a typo in an enum name, you presumably get an early, hard
> NameError or AttributeError (and pychecker could easily check this);
> but if you misspell a symbol, you just pass a different symbol, which
> is likely to trigger a later error or no error at all (depending on
> how the symbol is used). Of course, enums really serve a different use
> case; they wouldn't help at all for the property-defining use case
> Nick described here.

Yes, this idea doesn't really overlap all that much with enumerations. The 
existence of the enumeration definition as an object in its own right means 
normal attribute access generally works fine for those.
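
A small illustration of the typo behaviour Guido describes, using the enum 
module that was added to the standard library long after this thread. Color, 
is_red and misspelled_enum_lookup are made-up names for the example.

```python
from enum import Enum


class Color(Enum):
    RED = "red"
    GREEN = "green"


def is_red(symbol):
    # With plain strings, a misspelling is simply a different value,
    # so the comparison quietly returns False.
    return symbol == "red"


def misspelled_enum_lookup():
    # With an enumeration, the misspelled name fails at the point of use.
    try:
        Color.REDD
    except AttributeError:
        return "AttributeError"
    return "no error"
```

Here is_red("redd") just returns False, whereas the misspelled enumeration 
member raises immediately.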

>>    x.attr      <=>  getattr(x, .attr)     <=>  getattr(x, 'attr')
>>    x.attr = y  <=>  setattr(x, .attr, y)  <=>  setattr(x, 'attr', y)
>>    del x.attr  <=>  delattr(x, .attr)     <=>  delattr(x, 'attr')
> 
> So this begs the question: would the following assertion pass or fail?
> 
>     assert .foo == "foo"
>
> What about this one?
> 
>     assert type(.foo) == str

I would expect both of those assertions to be accurate. This is really just 
about a bit of syntactic sugar to more clearly convey intent when programming 
(and to get the compiler to help out a bit with sanity checking).
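
Since the proposed literal would be a plain str, the equivalences quoted above 
can already be checked with ordinary string literals. A minimal sketch, with 
Record as a throwaway class invented for the example:

```python
class Record:
    pass


x = Record()

setattr(x, "attr", 42)                # same effect as: x.attr = 42
assert x.attr == 42
assert getattr(x, "attr") == x.attr   # same value as: x.attr

delattr(x, "attr")                    # same effect as: del x.attr
assert not hasattr(x, "attr")

# The two assertions from the question, with 'foo' standing in for the
# proposed identifier literal.
name = 'foo'
assert name == "foo"
assert type(name) == str
```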

Something that might be related is an additional string introspection method, 
isidentifier(), which would return true only if the string was a legal Python 
identifier. This would then always be true for a string literal using 
backticks, but it could be false for other string literals.
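
Python 3's str type does have an isidentifier() method along these lines. Two 
things to keep in mind: it returns True for reserved words such as "class", 
and it accepts non-ASCII identifiers, so it is broader than the ASCII-only 
rule sketched earlier in this message. The helper is_usable_name below is a 
made-up name showing one way to exclude keywords.

```python
import keyword


def is_usable_name(text):
    """True if text is a legal identifier and not a reserved word."""
    return text.isidentifier() and not keyword.iskeyword(text)


assert "_get_x".isidentifier()
assert not "get x".isidentifier()
assert not "1x".isidentifier()
assert "class".isidentifier()        # keywords still count as identifiers
assert not is_usable_name("class")
assert is_usable_name("_get_x")
```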

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

