[Python-3000] some stats on identifiers (PEP 3131)
Steve Howell
showell30 at yahoo.com
Sun May 27 01:42:46 CEST 2007
Here is a survey of some Python 2 code to see how often
each name token (identifiers and keywords alike) gets used.
Here is the program I used to count the tokens, if you
want to try it out on your own in-house codebase:
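import sys
import tokenize

# Usage: python count.py FILE
fn = sys.argv[1]
g = tokenize.generate_tokens(open(fn).readline)
dct = {}
for tup in g:
    # tokenize.NAME (== 1) covers identifiers and keywords alike
    if tup[0] == tokenize.NAME:
        identifier = tup[1]
        dct[identifier] = dct.get(identifier, 0) + 1
# Print the counts in alphabetical order; pipe through sort -rn to rank them
identifiers = dct.keys()
identifiers.sort()
for identifier in identifiers:
    print '%4d' % dct[identifier], identifier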
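The script above is Python 2 and counts keywords such as "if" and "return" along with real identifiers. For anyone running Python 3, here is a minimal sketch of the same count that can optionally leave keywords out. The names count_names, skip_keywords and sample are made up for this illustration and are not part of the original program. Note that in Python 3 None is itself a keyword, so it is dropped when skip_keywords is set.

```python
import io
import keyword
import tokenize
from collections import Counter


def count_names(source, skip_keywords=False):
    """Count NAME tokens in a string of Python source (hypothetical helper)."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keywords are tokenized as NAME too; drop them only on request.
            if skip_keywords and keyword.iskeyword(tok.string):
                continue
            counts[tok.string] += 1
    return counts


# Small made-up sample, just to show the call.
sample = "def f(x):\n    if x:\n        return x\n    return None\n"
for name, n in count_names(sample, skip_keywords=True).most_common():
    print('%4d %s' % (n, name))
```

To run it on a file, read the file contents into a string and pass that to count_names; most_common(15) then gives a top-15 list without needing sort and head.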
The top 15 in gettext.py:
ssslily> python2.5 count.py /usr/local/lib/python2.5/gettext.py | sort -rn | head -15
98 self
73 if
69 return
39 def
35 msgid1
34 tmsg
33 n
33 None
32 domain
31 message
29 msgid2
28 _fallback
21 else
20 locale
20 in
The top 15 in an in-house program that deals with an
American-based format for sending financial
transactions (closest thing I could find to Dutch tax
law):
23 trackData
19 ErrorMessages
18 rest
16 cuts
12 encryptedPin
11 return
10 request
10 p2
10 p1
10 maskedMessage
10 j
10 in
10 i
9 len
9 ccNum