counting lines of code

Michele Simionato michele.simionato at gmail.com
Fri Jan 22 00:00:30 EST 2010


On Jan 21, 9:24 pm, Phlip <phlip2... at gmail.com> wrote:
> On Jan 20, 11:20 pm, Michele Simionato <michele.simion... at gmail.com>
> wrote:
>
> > pylint does too many things, I want something fast that just counts
> > the lines and can be run on thousands of files at once.
> > cloc seems fine, I have just tried on 2,000 files and it gives me a
> > report in just a few seconds.
>
> In my experience with Python codebases that big...
>
> ...how many of those lines are duplicated, and might merge together
> into a better design?
>
> The LOC would go down, too.

Actually 2,000 files is a very small portion of our code base, the one
I am working on now. I have spent the last couple of months on a big
refactoring project (which is still only at the beginning) and I
wanted to compare the number of lines of code before and after the
refactoring. I guess the new code is less than half the size of the
old one. There was no cut and paste in the old code, but a lot of
subtle duplication, i.e. code that could be unified in common
libraries, but only after a lot of grunt work. The core parts were
written 10 years ago, with the wrong architecture from the beginning,
and then things kept growing and growing on top of that monster. Just
for fun I have run cloc on our trunk:

Language             files     blank   comment      code    scale   3rd gen. equiv
----------------------------------------------------------------------------------
C++                   1528     67150     48251    304365 x   1.51 =      459591.15
XML                    560      2769      2517    223223 x   1.90 =      424123.70
ASP                    731     40136      4630    216713 x   1.29 =      279559.77
Python                2027     38825     47261    179532 x   4.20 =      754034.40
C/C++ Header          2150     51352     72619    141356 x   1.00 =      141356.00
Javascript             153     26196      9819    115311 x   1.48 =      170660.28
C                      332     14147     12871     97918 x   0.77 =       75396.86
SQL                    426     16432      4214     93598 x   2.29 =      214339.42
CSS                    110      1493      1013     23087 x   1.00 =       23087.00
C#                      83      3301      1990     19827 x   1.36 =       26964.72
Visual Basic            35      4363      5927     14633 x   2.76 =       40387.08
make                   259      1617       650      8339 x   2.50 =       20847.50
Bourne Shell            52       598      1282      6557 x   3.81 =       24982.17
m4                      28       611       627      5612 x   1.00 =        5612.00
IDL                     23       560         0      3895 x   3.80 =       14801.00
HTML                    33       354        76      3834 x   1.90 =        7284.60
MSBuild scripts          3         2         7      3419 x   1.90 =        6496.10
Lisp                    33       562       648      2695 x   1.25 =        3368.75
Ruby                    13       272        97      1141 x   4.20 =        4792.20
DOS Batch               77       790       410      1034 x   0.63 =         651.42
Java                     4       148       181       972 x   1.36 =        1321.92
Perl                     6       104       131       922 x   4.00 =        3688.00
XSD                      6         0         0       506 x   1.90 =         961.40
awk                      5        65        17       366 x   3.81 =        1394.46
DTD                      4       117        50       351 x   1.90 =         666.90
ASP.Net                 36       153       561       280 x   1.29 =         361.20
Bourne Again Shell      12        63         8       245 x   3.81 =         933.45
XSLT                     1        15        14       196 x   1.90 =         372.40
NAnt scripts             3        27         0       119 x   1.90 =         226.10
Teamcenter def          10        16         0        93 x   1.00 =          93.00
----------------------------------------------------------------------------------
SUM:                  8743    272238    215871   1470139 x   1.84 =     2708354.95
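
For readers unfamiliar with the report: the last column is the code
count multiplied by the per-language scale factor. Below is a minimal
sketch of that arithmetic in Python, using three rows copied from the
table. The names `rows` and `third_gen_equiv` are made up for
illustration and are not part of cloc.

```python
# Rows copied from the cloc report above: (language, code lines, scale factor).
rows = [
    ("C++", 304365, 1.51),
    ("Python", 179532, 4.20),
    ("C", 97918, 0.77),
]


def third_gen_equiv(code, scale):
    """Return code * scale, rounded to two decimals as in the report."""
    return round(code * scale, 2)


for language, code, scale in rows:
    equiv = third_gen_equiv(code, scale)
    print(f"{language:<8} {code:>8} x {scale:.2f} = {equiv:.2f}")
```

The 1.84 on the SUM line appears to be the total of the last column
divided by the total code count (2708354.95 / 1470139), rounded to two
decimals, rather than a factor applied to the total.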

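The before/after comparison described above can also be approximated
without cloc. This is only a rough sketch under stated assumptions: it
looks at `.py` files only, treats a line as code when it is neither
blank nor a `#`-only comment line, and so counts docstrings as code,
which cloc handles differently. The function names and the two
directory arguments are hypothetical.

```python
import os


def count_code_lines(root, suffix=".py"):
    """Count lines under root that are neither blank nor '#'-only comments."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the walk going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    stripped = line.strip()
                    if stripped and not stripped.startswith("#"):
                        total += 1
    return total


def shrink_ratio(old_root, new_root):
    """Return new code lines divided by old code lines (nan if old is empty)."""
    old = count_code_lines(old_root)
    new = count_code_lines(new_root)
    return new / old if old else float("nan")
```

Called as `shrink_ratio("old_tree", "new_tree")` on two checkouts
(placeholder paths), a result below 0.5 would match the guess that the
new code is less than half the size of the old one.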

