Graduate thesis on Python-related subject

Mon Apr 30 12:19:44 EDT 2001

"Benjamin.Altman" <benjamin.altman at noaa.gov> wrote in message
news:3AED6C1E.D20BEB42 at noaa.gov...
> They may be a 'random' measure, but in practice it seems to be true.
> Obviously an
>     if x:
>         blah
> versus
>     if x: blah
>
> is not what I was talking about.  If it comes down to it you could
> say also:
>         if \
>         acondition(foo):
>             blah(bar)
>
> too.

Exactly.  The number of physical lines is only _correlated_ with the
"true size of the program", and the number of marginal lines also has
a correlation factor with "real size", one that strongly depends on
coding style.  Focusing on "fewer lines of code" runs the serious
risk of encouraging line-stingy coding styles as "less bug-prone",
where it's quite likely that the clearest (and thus least bug-prone)
coding style makes judicious use of "more lines than
minimally necessary" to optimize clarity, readability, etc.
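
To make the point concrete, here is a minimal sketch that compares a raw physical-line count with a style-independent count of statements for three layouts of the same `if`. The helper names (`physical_lines`, `statement_count`) and the sample snippets are illustrative assumptions, not anything from the thread, and "statements in the parse tree" is just one possible stand-in for "true size".

```python
import ast

def physical_lines(source):
    """Count physical lines, roughly what Unix's wc -l would report."""
    return len(source.splitlines())

def statement_count(source):
    """Count statement nodes in the parsed program, ignoring layout."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))

# Three layouts of the same logic (hypothetical samples).
samples = {
    "two-line if": "if x:\n    blah\n",
    "one-line if": "if x: blah\n",
    "continued if": "if \\\nx:\n    blah\n",
}

for name, src in samples.items():
    print(name, physical_lines(src), statement_count(src))
```

The physical-line counts come out as 2, 1 and 3, while the statement count is 2 in every case: the layout changes, the program does not.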

The problem of measuring "the true size of a program" thus becomes
subtle in important ways -- ones that a graduate thesis (as per
subject of this thread) would be well-advised to explore.  There
is ample literature on the subject, but not all that many solid,
experimental, repeatable results.  I suspect that the most useful
measurements may correlate with, e.g., Cyclomatic Complexity (just
to take a VERY classical measure!) _at least_ as much as they
correlate with "counts of lines", for any kind of "line".  There
are, of course, many other candidates.
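
As a rough illustration of what such a measure looks like, the sketch below approximates Cyclomatic Complexity as one plus the number of decision points found in the parse tree. It is an assumption-laden simplification: the function name and the set of node types counted are my own choices, and a real tool would treat more constructs (and work per function rather than per file).

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def approx_cyclomatic_complexity(source):
    """Approximate McCabe's measure: 1 + number of branching points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per operator.
            complexity += len(node.values) - 1
    return complexity

print(approx_cyclomatic_complexity("if x:\n    blah\n"))
print(approx_cyclomatic_complexity("if x: blah\n"))
```

Both layouts of the `if` get the same score, which is the property a line count lacks.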

Just because "raw [physical] lines of code" is VERY simple to
measure (Unix's wc program, for example -- or a Python 1-liner:-)
AND has good correlation with "real size of the program" *FOR A
FIXED CODING STYLE*, doesn't automatically make it the measure
of choice...:-)
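
For reference, the sort of thing a Python 1-liner for raw physical lines might look like, wrapped in a function so it can be reused. The name `count_physical_lines` and the file name in the comment are placeholders, not anything from the thread.

```python
def count_physical_lines(lines):
    """Count physical lines in any iterable of lines, e.g. an open file."""
    return sum(1 for _ in lines)

# Usage on a real file (the path is a placeholder):
# with open("program.py") as f:
#     print(count_physical_lines(f))

print(count_physical_lines("if x:\n    blah\n".splitlines()))
```

It is about as simple as a measure gets, which is exactly the temptation.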

Alex