COCOMO - appropriate for languages like Python?

Sun Jul 7 11:19:14 EDT 2002

hi skip,

> It's been many years since I've considered software cost estimation models
> like COCOMO.  Is COCOMO appropriate to apply to high-level languages like
> Python?  I believe COCOMO assumes each fully debugged source line of code
> (SLOC) in any language and for any purpose costs roughly the same amount to
> produce.  The cost savings for languages like Python would come because you
> have to write so many fewer lines of code to implement the desired
> functionality.

don't know about cocomo (though i read capers jones' book about software
estimating, but that was some time ago). but i am a fan of function point
analysis (fpa), and there they have the concept of a productivity factor
(don't know the exact term, sorry) which tells you how many function points
you are able to express with (if i remember correctly) 1K SLOC. i can
remember a table saying that this factor was twice as high for python as
for java. i have looked on the net but could not find the exact link.
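
a rough sketch of how such a productivity factor would be used, assuming
it really is "function points per 1K SLOC". the two values below are
placeholders i made up to reflect only the 2:1 ratio i remember; they are
not the numbers from that table, and the function name is hypothetical.

```python
# Function points expressible per 1K SLOC.
# HYPOTHETICAL values: only the python:java ratio of 2:1 comes from memory,
# the absolute numbers are invented for illustration.
FP_PER_KSLOC = {
    "java": 20.0,
    "python": 40.0,
}


def sloc_needed(function_points, language):
    """Estimate the SLOC needed to implement `function_points` in `language`."""
    return function_points / FP_PER_KSLOC[language] * 1000


# The same 100 function points need half as many lines in python
# when the factor is twice as high.
for lang in ("java", "python"):
    print(lang, sloc_needed(100, lang))
```

so the estimate starts from the functionality (function points), and the
language only decides how many lines that functionality turns into.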

> This constant cost per SLOC assumption seems at least marginally invalid to
> me.  For example, writing an extension module which wraps a preexisting C
> library is generally pretty straightforward (even without tools like SWIG or
> Pyrex), but is pretty verbose, so COCOMO would tend to overestimate the cost
> to produce such code.  It also seems the clarity of a language's constructs
> will make a difference.  For example, in C I can write
> 
>     if (gets(s) == NULL) {
>         ...
>     }
> 
> while in Python, even though I have to express that concept in two lines:
> 
>     s = sys.stdin.readline()
>     if s == "":
>         ...
> 
> you can argue the Python code is less likely to contain a bug.  Together,
> the cost to create those two (fully debugged) lines of Python code may well
> be less than the cost to create the one line of C code, not twice the cost
> of the one line of C code, as COCOMO would estimate it.
> 
> Are there other (more modern?) cost estimation models which don't assume
> cost-wise that a SLOC is a SLOC is a SLOC?

one of the usual misconceptions about metric suites, of which software
estimating is one, is that they treat everything uniformly. but every
metrics book (and paper) advises you to tailor your metrics program to
your (that is, your project's) needs. so for example if your project
consists of a lot of very dense declarative source files that will be
transformed into very verbose source files (e.g. python -> c), you
are asked to measure those resulting source files differently than
source files that are written by hand.

one very simple categorisation might be:

id | description               | load factor
---+---------------------------+------------
 1 | completely handmade files | 1
 2 | partially generated       | 0.5
 3 | fully generated           | 0.15
etc.
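
a minimal sketch of how those load factors could be applied before the
line counts go into an estimate. the category names, the function name and
the project numbers are all made up for illustration; only the three
factors come from the table.

```python
# Load factors from the table; the category keys are hypothetical labels.
LOAD_FACTORS = {
    "handmade": 1.0,
    "partially_generated": 0.5,
    "fully_generated": 0.15,
}


def weighted_sloc(files):
    """Sum the raw SLOC of each file, scaled by its category's load factor.

    `files` is an iterable of (category, raw_sloc) pairs.
    """
    return sum(LOAD_FACTORS[category] * sloc for category, sloc in files)


# Invented example project: 7000 raw lines in total.
project = [
    ("handmade", 2000),
    ("partially_generated", 1000),
    ("fully_generated", 4000),
]

print(weighted_sloc(project))
```

here the 4000 generated lines count for far less than the 2000 handwritten
ones, which is the whole point of tailoring the measure.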

so i don't believe that cocomo says a sloc is a sloc is a sloc.
at least fpa doesn't do that, and it predates cocomo II.

ciao robertj