Default scope of variables
Dave Angel
davea at davea.name
Thu Jul 4 22:03:52 EDT 2013
On 07/04/2013 09:24 PM, Steven D'Aprano wrote:
> On Thu, 04 Jul 2013 17:54:20 +0100, Rotwang wrote:
> [...]
>> Anyway, none of the calculations that have been given takes into account
>> the fact that names can be /less/ than one million characters long.
>
>
> Not in *my* code they don't!!!
>
> *wink*
>
>
>> The
>> actual number of non-empty strings of length at most 1000000 characters,
>> that consist only of ascii letters, digits or underscores, and that
>> don't start with a digit, is
>>
>> sum(53*63**i for i in range(1000000)) == 53*(63**1000000 - 1)//62
>
>
> I take my hat off to you sir, or possibly madam. That is truly an inspired
> piece of pedantry.
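
The closed form quoted above is easy to sanity-check for small lengths, where the
brute-force sum is still cheap. This is only a sketch: the function names are
made up for illustration, and the counts 53 (ASCII letters plus underscore) and
63 (those plus the ten digits) are taken from the formula itself.

```python
def count_names_closed_form(max_len):
    """Number of ASCII identifiers of length 1..max_len, via the closed form."""
    # 53 choices for the first character, 63 for each later one.
    return 53 * (63**max_len - 1) // 62


def count_names_by_sum(max_len):
    """Same count, adding up the identifiers of each exact length."""
    # There are 53 * 63**i identifiers of length i + 1.
    return sum(53 * 63**i for i in range(max_len))


# Compare the two for a few small lengths; 1000000 would be very slow by sum.
for n in (1, 2, 3, 10):
    assert count_names_closed_form(n) == count_names_by_sum(n)
```

For length at most 1 both give 53, and for length at most 2 both give
53 + 53*63 = 3392.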
>
>
>> It's perhaps worth mentioning that some non-ascii characters are allowed
>> in identifiers in Python 3, though I don't know which ones.
>
> PEP 3131 describes the rules:
>
> http://www.python.org/dev/peps/pep-3131/
>
> For example:
>
> py> import unicodedata as ud
> py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
> ...     print(c, ud.name(c), c.isidentifier(), ud.category(c))
> ...
> é LATIN SMALL LETTER E WITH ACUTE True Ll
> æ LATIN SMALL LETTER AE True Ll
> ¥ YEN SIGN False Sc
> µ MICRO SIGN True Ll
> ¿ INVERTED QUESTION MARK False Po
> μ GREEK SMALL LETTER MU True Ll
> Ж CYRILLIC CAPITAL LETTER ZHE True Lu
> ᚃ OGHAM LETTER FEARN True Lo
> ‰ PER MILLE SIGN False Po
> ⇄ RIGHTWARDS ARROW OVER LEFTWARDS ARROW False So
> ∞ INFINITY False Sm
>
>
>
The isidentifier() method will let you weed out the characters that
cannot start an identifier. But there are other groups of characters
that can appear after the starting "letter". So a more reasonable
sample might be something like:
> py> import unicodedata as ud
> py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
> ...     xc = "X" + c
> ...     print(c, ud.name(c), xc.isidentifier(), ud.category(c))
> ...
In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers
has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there is other stuff
there, like "nonspacing marks", that is surprising.
I'm pretty much speculating here, so please correct me if I'm way off.
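
One way to test that speculation is to compare, for a single character, whether
it is accepted on its own and whether it is accepted after an "X". The sketch
below does that for a few characters from the id_continue categories. The
sample characters and the helper name classify() are my own picks for
illustration, not something taken from the docs.

```python
import unicodedata as ud

# Illustrative samples (chosen here, not from the reference manual):
samples = [
    "\u0301",  # COMBINING ACUTE ACCENT, a nonspacing mark (Mn)
    "\u0663",  # ARABIC-INDIC DIGIT THREE, a decimal digit (Nd)
    "\u203f",  # UNDERTIE, connector punctuation (Pc)
    "\u2030",  # PER MILLE SIGN, other punctuation (Po), for contrast
]


def classify(c):
    """Return (can_start, can_continue) for a single character c."""
    can_start = c.isidentifier()
    can_continue = ("X" + c).isidentifier()
    return can_start, can_continue


for c in samples:
    can_start, can_continue = classify(c)
    print(ud.name(c), ud.category(c), can_start, can_continue)
```

The first three cannot start an identifier but are accepted after the "X",
while the per mille sign is rejected in both positions.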
--
DaveA