[Tutor] refactoring book and function size

Magnus Lycka magnus@thinkware.se
Wed, 18 Sep 2002 19:13:58 +0200


>On Wednesday, September 18, 2002, at 10:29  AM, Magnus Lycka wrote:
>>On one hand, it makes each method trivial, but on the other
>>hand, it might lead to a situation where you are a bit lost
>>with all these methods. In Python there is also the time
>>involved in function call overhead to consider.

At 10:44 2002-09-18 -0400, Erik Price wrote:
>I have often wondered about this.  Are classes considered "bloated" if 
>they feature a ton of methods to do various things with the data contained 
>within the object?  If so, is that true even if it wouldn't make sense to 
>move some of these methods to a separate class (say, these methods are 
>inherent behaviors that would seem best to go with the original object)?
>
>I find it easier to have a single method that does what I need than to 
>have four methods that can be combined to do what I need (unless of course 
>I ever have need for one of those four methods individually).
>It's less information about the class's interface that I need to 
>remember.  But that might not be the best approach to be taking.

Short answer: It depends.

Long answer:

If you only need one method in the public interface of
the class, please stay with one public method. You can
still implement the code through several private methods.
If the need arises, you can make these methods public
later. You can always extend the public interface of a
class, but once you have offered an interface, it's very
difficult to revoke it. This is what Bertrand Meyer calls
the Open-Closed principle.
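
A minimal sketch of what this could look like in Python. The Report
class and all of its method names are made up for illustration; the
leading underscore is only the usual Python convention for "not part
of the public interface", nothing is enforced.

```python
class Report:
    """Hypothetical example: one public method, several private helpers."""

    def __init__(self, rows):
        self._rows = list(rows)

    def render(self):
        """The only method callers are expected to use."""
        lines = [self._header()]
        lines.extend(self._format_row(row) for row in self._rows)
        lines.append(self._footer())
        return "\n".join(lines)

    # Leading underscore: implementation detail by convention.
    def _header(self):
        return "name: amount"

    def _format_row(self, row):
        name, amount = row
        return "%s: %d" % (name, amount)

    def _footer(self):
        return "total: %d" % sum(amount for _, amount in self._rows)


report = Report([("apples", 3), ("pears", 4)])
text = report.render()
```

If someone later needs just the footer, _footer can be renamed and
published without breaking any existing caller; going the other way
would break them.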

Arthur Riel writes in "Object-Oriented Design Heuristics"
that the protocol of a class should have as few messages
(i.e. methods) as possible. But again, that concerns the public
interface; it is no reason to do more than one thing
in a single implementation routine. Of course you can distribute
the implementation of your one public operation over several
private or protected methods. And his counterexample is
a linked list class with 4000 methods... (That's more
than tons, right?)

Bertrand Meyer's "Object-Oriented Software Construction"
( http://www.eiffel.com/doc/manuals/technology/oosc/ )
is one of the main works on OOP.

Discussing "the road to object orientation" Meyer writes
about five criteria:
  * Decomposability
  * Composability
  * Understandability
  * Continuity
(* Protection)

"A software construction method satisfies Modular Decomposability if it
helps in the task of decomposing a software problem into a small number of
less complex subproblems, connected by a simple structure, and independent
enough to allow further work to proceed separately on each of them."

"A method satisfies Modular Composability if it favors the production of
software elements which may then be freely combined with each other to
produce new systems, possibly in an environment quite different from the
one in which they were initially developed."

"A method favors Modular Understandability if it helps produce software in
which a human reader can understand each module without having to know
the others, or, at worst, by having to examine only a few of the others."

"A method satisfies Modular Continuity if, in the software architectures that
it yields, a small change in a problem specification will trigger a change of
just one module, or a small number of modules."

Taken together, this means that if we have a number of tasks
that are bundled into one big method since we typically perform
them together, we are locked to just this behaviour. On the other
hand, if we have turned every task into an independent method, a
programmer using our class can come up with uses we
never thought of. This is closely related to the Unix Philosophy:
http://www.thinkware.se/cgi-bin/thinki.cgi/UnixPhilosophy
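
As a rough illustration of that point (WordStats and its methods are
invented for this example, not taken from any library): each task is
its own method, so a caller can combine the pieces in a way the class
author never planned for.

```python
class WordStats:
    """Hypothetical class whose tasks are exposed as independent methods."""

    def __init__(self, text):
        self.text = text

    def words(self):
        """Split the text into lower-case words."""
        return self.text.lower().split()

    def counts(self):
        """Map each word to the number of times it occurs."""
        result = {}
        for word in self.words():
            result[word] = result.get(word, 0) + 1
        return result

    def most_common(self):
        """Return the word that occurs most often."""
        counts = self.counts()
        return max(counts, key=counts.get)


stats = WordStats("the cat saw the dog")

# The use the class author had in mind:
top = stats.most_common()

# A use nobody planned for, built from the same pieces:
unique = sorted(set(stats.words()))
```

Had everything been bundled into one big most_common method, the
second use would have required changing the class.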

B.M. continues with five rules and five principles, of which the
Open-Closed principle is one. The book is roughly 1250 pages, so
he says a lot more (obviously). But for instance he is very
clear in stating that if a method both changes the state of the
system (a procedure) and returns information to the caller
(a function), it should be split, so that it is possible, for
instance, to read the state without changing anything in the
object.
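
A small Python sketch of that split, under my own assumptions (the
Counter class is hypothetical): one method only changes state and
returns nothing, the other only reports state and changes nothing.

```python
class Counter:
    """Hypothetical example of keeping procedures and functions apart."""

    def __init__(self):
        self._count = 0

    def increment(self):
        """Procedure: changes state, returns nothing."""
        self._count += 1

    def value(self):
        """Function: reports state, changes nothing."""
        return self._count


counter = Counter()
counter.increment()
counter.increment()
first_look = counter.value()
second_look = counter.value()  # asking twice does not change the answer
```

A combined "increment and return the new value" method would make it
impossible to look at the count without also changing it.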

The idea here is based on something else he states. I don't remember
the exact wording, but he roughly writes that just as in movies that
end when the hero and the pretty girl marry, some people consider
software development to end when the program is delivered. In fact,
claims Meyer, this is when the fun begins... In other words, classes
evolve over the lifetime of a program. A class should match some kind
of real world object--concrete or abstract--and with time it will
contain more and more features that really match what the real
world object is like.

In other words, a class will (at least eventually) have a lot of
generic methods that don't exactly match any distinct part of
any Use Case that was written during the design phase (if there was
one ;). Let me say that B.M. (and Riel) seem to have a very different
view of OOP than, for instance, Ivar Jacobson, and I tend to side with B.M.
See http://www.thinkware.se/cgi-bin/thinki.cgi/TheDarkSideOfUseCases and
http://www.eiffel.com/doc/manuals/technology/oosc/finding/page.html#sources

Returning to Refactoring, which is an XP practice, the
relevant design rule in XP is "Make the simplest thing
that could possibly work!"

The advantage of refactoring is that we don't need to
anticipate all needs in advance. Do we need to reuse part
of a method? Well, then it's time to refactor that part into
a separate method.
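
For instance, something along these lines (all three function names
are made up for the example). Assume summary() originally computed
the mean inline; when a second piece of code needed the same
calculation, it was pulled out into average():

```python
def average(values):
    """Extracted helper: once buried inside summary(), now reusable."""
    return sum(values) / len(values)


def summary(values):
    """The original function, now delegating the part needed elsewhere."""
    return "n=%d mean=%.2f" % (len(values), average(values))


def above_average(values):
    """The new need that triggered the refactoring."""
    mean = average(values)
    return [v for v in values if v > mean]


line = summary([1, 2, 3, 6])
big = above_average([1, 2, 3, 6])
```

The behaviour of summary() is unchanged; only its structure is.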

I'd say that the XP approach of making the simplest thing
that could possibly work is a reasonable approach for classes
in a monolithic application, developed by a small number of
people.

It doesn't work at all in a library which is thought of as
a separate product with an API used by others, for instance
an OODBMS or something like PIL or ReportLab. Particularly
if it's a proprietary project, but even with open source,
you can't expect the application developers to redesign
their underlying libraries. In these cases you need to work
out a full-featured interface, so that the people who use
the library can make full use of its features in ways we
never imagined. XP seems to be a non-optimal methodology
for such products.

But I think it's a bigger problem to build the right classes,
and to determine how they relate to each other, than to build the right
methods. Let's take that another day...


-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se