[Tutor] Creating class / OOP structure

Steven D'Aprano steve at pearwood.info
Wed Nov 6 03:19:33 CET 2013


On Tue, Nov 05, 2013 at 03:55:05PM -0800, Johan Martinez wrote:
> I need help in modifying my program. Right now it looks as follows:
[snip code]
> Can someone help me in improving my code?

Yes! The first thing to do is get rid of the unnecessary class. This is 
not Java where you have to write classes for everything. From the sample 
code that you show below, using OOP here accomplishes nothing except 
making the code more complicated and less efficient.

Please don't think I am *against* OOP. But it is a tool, nothing more, 
and you should always use the right tool for the job. And in this case, 
I think the right tool is to create two functions:

def parse(filename):
    with open(filename, 'r') as f:
        ...
    return parsed_file

def process(parsed_file):
    for k in parsed_file:  # Iteration automatically uses keys.
        ...
    return processed_file


And then simply call them like this:

processed = process(parse('sales.txt'))

Why is this better?

In the code you show, you don't use the object-oriented structure, so 
why waste time building an object-ordiented structure? Similarly, after 
you have read the file once, the class solution holds onto the contents 
forever. But you don't need it forever, once you have parsed the 
contents they are no longer needed. So why hold on to them longer than 
necessary?

Classes should only be used when you need to encapsulate state plus 
behaviour, in other words, when they represent a *thing*. Classes should 
be nouns: BankAccount, User, DecimalNumber all make good classes. Your 
class *needs* only behaviour -- the class doesn't need to store either 
self.fname nor self.parsed_file, since they are both only used once. In 
effect, they are temporary variables that are given names and made 
long-living:

# no need for this
filename = "sales.txt"
parsed_file = parse(filename)
processed = process(parsed_file)

# now go on to use processed but not filename or parsed_file
...

Since those two variables are only used once, there is no need to keep 
them as named variables. Get rid of them!

processed = process(parse('sales.txt'))


Now, obviously there are parts of the code you haven't shown us. Perhaps 
you do have good reason for storing parsed_file and filename for later 
use. If so, then my objection to using a class will be reduced, or even 
eliminated.

But even then, there is one last problem with your class: it has a 
dangerously misleading name, which hints that it is not a well-designed 
class.

"Filedict" is neither a file nor a dict -- it doesn't inherit from file, 
or behave like a file; it doesn't inherit from dict, or behave like a 
dict. It is, in fact, *not* a file/dict at all. What you really have is 
something with a nasty compound name:

FileParserAndDictProcessor


When your classes have nasty compound names like this, it is a very 
strong clue that the class doesn't actually represent a thing and 
probably shouldn't exist. Don't use classes just as a wrapper around a 
few functions and unrelated variables. (What does the file name have to 
do with the process() method? Why are they stored together?)

For a very entertaining -- although quite long -- look at this issue, 
have a read of this:

http://steve-yegge.blogspot.com.au/2006/03/execution-in-kingdom-of-nouns.html


You don't need to be a Java developer to get value from it.


> Should I store parsed_file as an object attribute as follows?
> 
> class Filedict():
>     def __init__(self, fname):
>         self.fname =fname
>         self.parsed_file = self.parse()

Should you? No. Can you? Yes. But follow the logic -- you could continue 
the process of pushing more code into the __init__ method:

class Filedict():  # Terrible name...
    def __init__(self, fname):
        self.fname =fname
        self.parsed_file = self.parse()
        self.processed = self.process()

f = Filedict("sales.txt")
processed = f.processed

But now the instance f has no reason to exist, so dump it:

processed = Filedict("sales.txt").processed

which basically means you are using __init__ just as a wrapper to hide 
away a couple of function calls.



-- 
Steven


More information about the Tutor mailing list