a python puzzle
Lulu of the Lotus-Eaters
mertz at gnosis.cx
Thu Sep 26 00:20:33 EDT 2002
|the fact that it was mostly Python code would have
|likely skewed the letter frequencies, since Python keywords, modules,
|and builtin names appear more frequently in Python code than general
|user-chosen identifiers; the letter frequency of the code would be
|biased against the letter frequency of the common "words" in Python,
|which is likely to be somewhat different from English as a whole.
I wonder about that. Python reserved words (and pseudo-reserved names)
are all rather ordinary English words. I have a hunch that their letter
distribution falls pretty close to that of English prose.
Of course, you'd have to decide how to weight things. If you merely did
a histogram on a list of keywords, you might get a somewhat different
pattern than if you checked actual scripts (with the comments and
variable names removed). For example, most scripts have just a few
'import's at the top, but a whole bunch of 'if's, 'for's, and 'in's
scattered throughout the body.
Maybe I'll try an experiment.
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <mertz at gnosis.cx> ]--------------------------