Fun with function argument counts
travisgriggs at gmail.com
Wed Feb 12 01:34:03 CET 2014
After the recent discussion about the classic error:
Among many things, the OP’s contention was that the ability to make this kind of error was a Bad Thing (tm), which led to me asking about code smells and parameterless functions/methods.
So I got curious. Semantics of implicit objects aside, how often is it possible to write code like that? In short, how frequent are methods/functions that take zero explicit args (explicit rather than implicit because, while fun, that’s not how we code them)? It was a fun experiment: I’ve been doing Python for a little over a year now, and I thought it would be enjoyable/educational to do a bit of metaprogramming. First the results, though.
Below is a histogram of function argument count:
argCount x occurrences (% of total)
-1 x 10426 ( 3.2%) # these are where I stuffed all of the routines I couldn’t extract the argspecs for
0 x 160763 (48.7%)
1 x 109028 (33.0%)
2 x 40473 (12.3%)
3 x 7059 ( 2.1%)
4 x 2383 ( 0.7%)
5 x 141 ( 0.0%)
6 x 46 ( 0.0%)
7 x 12 ( 0.0%)
10 x 1 ( 0.0%)
16 x 1 ( 0.0%)
19 x 2 ( 0.0%)
Nearly half of the functions/methods I scanned (48.7%) took zero args, which was more than I expected. I haven’t dug enough to see if there’s a flock of canaries in there or not. The code to produce that table is here: https://gist.github.com/anonymous/8947229. Yes, it’s hacky. I wasn’t trying to win any style/idiom awards with this one.
To get this table, I used PyPy 3-2.1 beta for OSX. I basically attempted to parse all of the modules found in its lib-python directory. A subset of the modules wouldn’t load; I’m not sure whether to write that off as the work-in-progress nature of PyPy or what. And I blacklisted some of them (such as idlelib.idle, ctypes.test.*, etc). But judging from the counts, I was able to get a pretty large corpus of them.
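For readers who do not want to open the gist, here is a minimal sketch of how a tally like the one above could be produced. It is not the code from the gist. The names mandatory_arg_count, histogram and format_histogram are made up for illustration, and the sketch makes no attempt to discount an implicit self, since how that was handled is not described here. It counts positional arguments without defaults and files anything that cannot be introspected under -1.

```python
import inspect
from collections import Counter


def mandatory_arg_count(func):
    """Return the number of positional args without defaults, or -1 if unknown."""
    try:
        spec = inspect.getfullargspec(func)
    except TypeError:
        # Raised for objects whose argument spec cannot be extracted
        # (for example many C-implemented routines, or non-callables).
        return -1
    # Defaults always belong to the trailing positional args, so the
    # difference is the "minimal" positional call signature.
    return len(spec.args) - len(spec.defaults or ())


def histogram(funcs):
    """Tally mandatory argument counts over an iterable of callables."""
    return Counter(mandatory_arg_count(f) for f in funcs)


def format_histogram(counts):
    """Render the tally as 'argCount x occurrences (% of total)' lines."""
    total = sum(counts.values())
    return [
        "%3d x %d (%4.1f%%)" % (n, counts[n], 100.0 * counts[n] / total)
        for n in sorted(counts)
    ]
```

Feeding it the routines collected from each module and printing the formatted lines would give a table of the same shape as the one above, although the exact numbers depend on how methods and their implicit first argument are treated.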
I learned a number of fun things as part of the exercise:
1) I did not want to try to load all modules at once into my environment. I suspect that would create problems. Using os.fork() to isolate the load/analysis of each module was a handy way to deal with that. The trick of using an “if pid:” branch to split off the flow of execution was cool. Maybe it’s the wrong tool for the job and I missed an obvious one, but I thought it was kinda clever.
2) Using CPython, a lot of the core library can’t be reflected on. I tried to use the inspect module, and found that things like inspect.getmembers(datetime.datetime, inspect.ismethod) resulted in a big surprise (especially if you leave off the predicate and see that there are a lot of things in there that claim they ARE methods). I finally stumbled into inspect.isroutine, but found that most of the results couldn’t be reified using inspect.signature anyway. The “it’s all python, all the way down, well mostly” ideology of PyPy ensured that a higher percentage of the base libraries could be analyzed.
3) It’s easy to take 3.3.x for granted. I found right away that signature was introduced in 3.3, so I had to use inspect.getfullargspec() instead.
4) optional arguments were an interesting dilemma. I chose to reduce the argument count of a signature by the number of default arguments, since they are essentially optional. So the stats there have a bias toward the “minimal” call signature.
5) my method of scanning loadable modules is probably very naive/brute force/stupid. I would love it if there was a better way to do that.
6) Who writes a function with 19 mandatory arguments anyway???? subprocess._execute_child() and distutils.cygwincompiler._execute_child()
7) I’m not entirely sure, given #6, that I’m not seeing inherited methods for all classes, so getting an overinflated count from them. Can anyone answer that?
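As an illustration of items 1 and 5, here is a rough sketch of the fork-per-module idea, paired with one possible alternative to brute-force scanning: pkgutil.iter_modules from the standard library, which lists modules on a path without importing them. The helper names iter_module_names and import_in_child are hypothetical. The child here only reports whether the import succeeded, via its exit status. A real scan would do the analysis in the child and pass the counts back somehow, which is not shown. os.fork() is POSIX only.

```python
import importlib
import os
import pkgutil


def iter_module_names(paths=None):
    """Yield top-level module names found on `paths` (sys.path if None).

    Nothing is imported here; pkgutil only asks the path finders for names.
    """
    for _finder, name, _is_pkg in pkgutil.iter_modules(paths):
        yield name


def import_in_child(name):
    """Import `name` in a forked child; return True if the import succeeded."""
    pid = os.fork()
    if pid:
        # Parent branch: pid is the child's process id. Wait for the child
        # and treat a clean exit with status 0 as success.
        _, status = os.waitpid(pid, 0)
        return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    # Child branch: os.fork() returned 0. Whatever the import does to this
    # process (side effects, crashes) never reaches the parent.
    try:
        importlib.import_module(name)
        code = 0
    except BaseException:
        code = 1
    # Exit immediately, skipping the cleanup handlers inherited from the parent.
    os._exit(code)
```

Looping over iter_module_names() and calling import_in_child() on each name keeps the parent interpreter clean, at the cost of one process per module. For nested packages, pkgutil.walk_packages is the recursive counterpart, though it imports each package in order to find its submodules.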