source code size metric: Python and modern C++

Alex K. Angelopoulos aka at mvps.org
Mon Dec 2 02:23:06 EST 2002


"Neal Norwitz" <neal at metaslash.com> wrote in message
news:pan.2002.12.02.03.44.36.314421.12688 at metaslash.com...
> On Sun, 01 Dec 2002 21:09:28 -0500, Pavel Vozenilek wrote:
>
> > (I do not wish to start any religious war here, please.)
> >
> > Guido van Rossum in his article
> > (http://www.python.org/doc/essays/comparisons.html) compares C++ and
> > Python and finds Python source often 5-10 times smaller than equivalent C++.
> >
> > I will be glad to get info/links on this single metric, without going
> > into "who is better" discussion, please. I know the answer is _more_
> > complicated than single number and am curious on practical experience.
>
> http://www.metaslash.com/brochure/recall.html
>
> This is a real world example of an open source project.
> Short answer for the lazy:
>
>                Main Code  Support Code   Example's Code
> C++               4988         3105          2573
> Python and IDL    1858            0           659
>
> In sum (2517 for Python vs. 10666 for C++), that's a factor of over 4.
> The person who wrote the code was very proficient in both C++ and Python,
> although the Python line count could be a bit lower if he had used a more
> recent version of Python.  I think he stuck with 1.5.2.
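
For anyone who wants to check the arithmetic in the quoted table, here is a
minimal sketch. The numbers are copied from the table; the dictionary layout and
variable names are only for illustration.

```python
# Line counts copied from the quoted table (main, support, example code).
counts = {
    "C++": {"main": 4988, "support": 3105, "examples": 2573},
    "Python and IDL": {"main": 1858, "support": 0, "examples": 659},
}

# Total lines per language across all three categories.
totals = {lang: sum(parts.values()) for lang, parts in counts.items()}

# Overall ratio, and the ratio for main code alone.
overall_ratio = totals["C++"] / totals["Python and IDL"]
main_ratio = counts["C++"]["main"] / counts["Python and IDL"]["main"]

print(totals)                   # {'C++': 10666, 'Python and IDL': 2517}
print(round(overall_ratio, 2))  # 4.24
print(round(main_ratio, 2))     # 2.68
```

The overall factor is a little over 4, as stated, but the main code alone
differs by a factor closer to 2.7; much of the gap comes from the support code
that the Python version did not need.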

There's one other relevant issue here, which is what might be called
_perceptual_ lines of code - and you might indeed see some numbers distorted if
this isn't taken into account.

Probably 75% of the discrete tasks I perform in scripts involve reading or
writing a file, capturing or filtering console output, or running regular
expression searches on text.  For the primary tasks like that, I have plain
vanilla boilerplate functions which are automatically included in every script
I write.  The ReadFile function I use, for example, is 10 lines (14 if you
count comments).

Perceptually, I tend to blur all of this together; if I need to write a script
that will take a file as an argument, parse it for a specific pattern, and write
that to standard output or another file, I can write it with one line of code in
about 15 seconds.  There's another 30+ lines of code which I don't even think
about that are either embedded explicitly or imported with a single line of
code.  Darned if I know how to count _that_.
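
To make the counting problem concrete, here is a rough Python sketch of that
kind of boilerplate. It is not my actual library: read_file, write_file, grep
and print_matches are hypothetical stand-ins, and the UTF-8 default is an
assumption.

```python
import re
import sys


def read_file(path, encoding="utf-8"):
    """Return the whole file as one string (hypothetical ReadFile stand-in)."""
    with open(path, encoding=encoding) as handle:
        return handle.read()


def write_file(path, text, encoding="utf-8"):
    """Replace the contents of path with text."""
    with open(path, "w", encoding=encoding) as handle:
        handle.write(text)


def grep(pattern, text):
    """Return the lines of text that contain a match for the regex pattern."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


def print_matches(path, pattern, out=sys.stdout):
    # The body below is the only line that needs real attention;
    # everything above it is boilerplate carried along in every script.
    out.write("\n".join(grep(pattern, read_file(path))) + "\n")
```

A call such as print_matches("build.log", r"error") (the file name is made up)
is the one "perceived" line, while a line counter would report a couple of
dozen.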

I suspect that "fair" metrics would be really hard to develop for this.  I do
much less programming in strictly compiled languages, but I bet some programmers
have the same experience when they write boilerplate-built C++ mini-apps: only
about 10 lines of code require actual attention, even though the source may run
to 100+ lines.




