[Tutor] Trying to find a value in a function.

Magnus Lycka magnus@thinkware.se
Fri Mar 14 19:22:03 2003


Warning: subjective opinions and untested code below...

Andre wrote:
>Magnus Lycka wrote:
>
> > Could you please show us a simple example of such a text file,
> > and tell us what you want to happen with it. I think I need
> > something very concrete here...
>
>The text file only contains properties of the instance to be created and
>is exported from a database.
>
>Let's take an example
>
>In our program,
>the class
>user:
>   name
>   adress
>   job
>   homedir

But the code defining the class, "class User: ...", will be
in your program, right?

>the constructor (addaUser) will :

The word constructor is typically used to mean the method
__init__ which is executed when a class is instantiated.

It's probably a good thing not to do too much there. It
reduces the flexibility of the class.

>create an instance,
>create the home directory,
>update the company's tree,
>update the company's directory,
>update the payroll
>verify the email,
>   notify the sysadmin in case of failure or
>   notify the users of the creation of their homedir,
>notify the financial services
>etc...

I'm a little confused here, because it seems to me that you
are mixing different conceptual levels. Creating the home directory
and updating the payroll seem like functional requirements.
Creating an instance seems like a technicality in the program,
and something that will only have relevance during a short
run of a program, unless I misunderstand you. But never mind
that, I think we'll sort this out.

>The constructor not only creates instances of the product.

Constructor... Hm... This can be arranged in many ways. Let's
see below.

>Provided by the customer, a user file containing name, adress, job,
>department, home directory, email, which structure may be different for
>each customer.

Ok, no problem, as long as we know what the structure is.

>Bob, 15 parkway, CEO, Headquarters, bobdir, bob@thecorp.com
>Jim, Washington, CTO, Technical division, jimdir, jim@thecorp.com
>
>or
>
>Bob    15 ParkWay   CEO  Headquarters          bobdir   bob@thecorp.com
>Jim    Washington   CTO  Technical division    jimdir   jim@thecorp.com
>
>or
>
>Bob
>bobdir, bob@thecorp.com
>15 ParkWay
>CEO, Headquarters

The second version assumes that you know the positions where each column
starts, since there are spaces in the text. The third is a rather boring
mix with one or two params per row, but I get your point...

But let's look at this part of the problem separately:

From either of these files, you should be able to get a list
of tuples, or a list of dictionaries, that you can feed to your
User class to get instances.

Either

userList = [
      ('Bob', '15 parkway', 'CEO', 'Headquarters', 'bobdir',
       'bob@thecorp.com'),
      ('Jim', 'Washington', 'CTO', 'Technical division', 'jimdir',
       'jim@thecorp.com')
     ]

or

userList = [
      {'name': 'Bob', 'address': '15 parkway', 'position': 'CEO',
       'dept': 'Headquarters', 'homedir': 'bobdir',
       'email': 'bob@thecorp.com'},
      {'name': 'Jim', 'address': 'Washington', 'position': 'CTO',
       'dept': 'Technical division', 'homedir': 'jimdir',
       'email': 'jim@thecorp.com'}
     ]

Ok? This transformation from various kinds of text files (or read
via a database interface from an Oracle database or whatever) is
the first step, and we shouldn't mix this with the rest.

If you need help with that, we can return to it, but it has nothing
to do with instantiating classes.
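
A minimal sketch of that first step, assuming the first (comma
separated, one user per line) layout. The name parseCommaSeparated
is made up for illustration, and the naive split breaks as soon as a
field contains a comma:

```python
def parseCommaSeparated(text):
    """Turn comma separated lines into a list of tuples, one per user."""
    userList = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Naive split: assumes no field contains a comma itself.
        fields = tuple(field.strip() for field in line.split(','))
        userList.append(fields)
    return userList

sample = """Bob, 15 parkway, CEO, Headquarters, bobdir, bob@thecorp.com
Jim, Washington, CTO, Technical division, jimdir, jim@thecorp.com
"""
userList = parseCommaSeparated(sample)
```

Whatever the input looked like, the rest of the program only ever
sees userList.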

Now you could do something like:

for userData in userList:
     try:
         u = User(userData)
         u.createHome()
         updateTree(u)
         updateDirectory(u)
         updatePayroll(u)
         u.verifyEmail()
         notifyFinancialServices(u)
     except Exception as e:
         alertSysAdmin(userData, e)

As you see, I envision some functions as methods in the User
class, and others as functions in your program that take a
User instance as a parameter. This depends on how you want
to organize the code. All of these could be methods in User.

Assuming a list of tuples, your class could look something
like this:

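import os

# email(), SYS_ADM and the *_MSG format strings are assumed to be
# defined elsewhere in your program.

class User:
     def __init__(self, userData):
         # userData is a 6-tuple in exactly this order.
         (self.name, self.address, self.position, self.dept,
          self.homedir, self.email) = userData

     def createHome(self):
         try:
             os.mkdir(self.homedir)
             email(self.email, HOME_DIR_CREATED_MSG %
                               (self.name, self.homedir))
         except OSError:
             # Tell both the user and the sysadmin about the failure.
             email(self.email, HOME_DIR_CREATE_FAILED_MSG %
                               (self.name, self.homedir))
             email(SYS_ADM, HOME_DIR_CREATE_FAILED_MSG %
                               (self.name, self.homedir))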

With a list of dictionaries, you'd have an __init__ like this instead:

     def __init__(self, userData):
         for attr in ('name', 'address', 'position', 'dept',
                      'homedir', 'email'):
             setattr(self, attr, userData[attr])

If there are both compulsory and optional parameters that the
users can provide to the program, it might get a bit more
complicated, but not very. The easiest solution is to always
provide all attributes to the __init__ method, but to set
left-out values to None, and to handle that in the code. (You
have to handle the optional values the same way either way
once the instance is constructed.)
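
A minimal sketch of that idea, assuming the dictionary form of the
user data. ATTRIBUTES, fillMissing and label are names made up for
illustration:

```python
ATTRIBUTES = ('name', 'address', 'position', 'dept', 'homedir', 'email')

def fillMissing(partial):
    """Return a dict with every attribute present, None where left out."""
    return {attr: partial.get(attr) for attr in ATTRIBUTES}

class User:
    def __init__(self, userData):
        # Every attribute is always provided, possibly as None.
        for attr in ATTRIBUTES:
            setattr(self, attr, userData[attr])

    def label(self):
        # Code that uses an optional value has to cope with None.
        if self.position is None:
            return self.name
        return '%s (%s)' % (self.name, self.position)

u = User(fillMissing({'name': 'Bob', 'homedir': 'bobdir',
                      'email': 'bob@thecorp.com'}))
```

Here u.position is None, and label() falls back to the bare name.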

>The application will be used by non-programers.
>To update it, they'll import the data by choosing the ImportStructure of
>what they are importing, describing their FormatStructure in a form,
>         (in the last example the format will be :
>         multiple lines record
>         field separator in a line : ','
>         4 lines per record (there may be a record separator)

To be honest, I doubt that this will work very well... People will
use comma separated formats, and still have commas in fields etc.
Non-programmers rarely understand just how stupid computers are, and
assume that things that are obvious to a person reading a text are
obvious for a computer to parse.

But as I said, this is a separate problem from the instantiation
and execution of classes and methods. Create a list of tuples or
dictionaries from the user supplied data!

>and associate the object properties to their files via a form.
>         instance.name           = field1
>         instance.adress         = field4
>         instance.job            = field5
>          instance.department    = field6
>         instance.homedir        = field2
>         instance.email          = field3

Well, eventually, but that's handled in the conventional way by
__init__ above.

>My application will use the FormatStructure to identify each field and
>the user will associate the fields to the instance via the
>RelationshipStructure.

This sounds overly complicated to me. May I suggest that people
stick to either:

name: Jim
address: Washington
etc

or a conventional CSV format as in

name,address,job...
Bob,"15 Parkway",CEO...

For reading the latter, there are several readymade python modules (CSV
etc, see http://www.thinkware.se/cgi-bin/thinki.cgi/UsefulPythonModules)
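
As one possible illustration, assuming the csv module that ships
with current Python versions and a file whose first row holds the
field names. readCsvUsers and the sample data are made up:

```python
import csv
import io

def readCsvUsers(stream):
    """Read CSV with a header row into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(stream)]

# A quoted field may contain a comma without confusing the reader.
sample = io.StringIO(
    'name,address,job\n'
    'Bob,"15 Parkway, Suite 2",CEO\n'
    'Jim,Washington,CTO\n'
)
userList = readCsvUsers(sample)
```

With a real file you would pass an open file object instead of the
io.StringIO used here to keep the sketch self-contained.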

Regular expressions might also be useful to find out how to extract
data. You might well be able to get your list of tuples with something
like...

import re
userFormat = re.compile(...something...)
userList = userFormat.findall(open(inFileName, 'r').read())

Look at http://www.amk.ca/python/howto/regex/ and
http://www.python.org/doc/current/lib/module-re.html
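
To make the idea concrete, here is a sketch for the "key: value"
layout only. The pattern and the names fieldFormat, pairs and record
are assumptions for illustration; a real customer format would need
its own pattern:

```python
import re

# Hypothetical pattern for "key: value" lines. It has two groups, so
# findall() returns a list of (key, value) tuples.
fieldFormat = re.compile(r'^(\w+):[ \t]*(.*?)[ \t]*$', re.MULTILINE)

sample = """name: Jim
address: Washington
job: CTO
"""
pairs = fieldFormat.findall(sample)
record = dict(pairs)  # one user as a dictionary
```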

>Notice that the application will be used by different companies, that we
>won't know the structure of their files.

Ok. But make sure that the transformation from custom files to
a uniform format is a small, separate piece of code that has nothing
to do with the actual business logic. Only solve one problem
at a time...

>We will provide a clear description of the object in a form for them
>being able to update their application without external assistance.

Don't make your customers understand the internals of your
code. First of all, it's a technicality they shouldn't need
to be involved in, and secondly it locks you in.

Just define an interface. "For each user, you should provide these
attributes in this format..."

>What i want too, is to use this application to provide an evolutive
>documentation on fonctions and classes used in an application at the
>property level (what is the property, where it is coming from, how it is
>   involved in the application, in what other functions it is used, etc.)
>updated by programers and technical users.
>For example i'm trying to understand a python application, i'll be happy
>to have a structured description of the application and to store my
>understanding of every element of the application via a form.

In my opinion, well-written Python source code is probably one
of the best ways to describe program logic. By all means, add
general comments on the purpose of this and that to the code,
but for typical high level business logic, python code usually
speaks for itself, and it's easy to modify etc. I'm not sure
your "forms" will be useful though. As I have indicated, the
normal python collection types: list, tuple, dictionary, are
very useful for these kinds of things.

If it's difficult to follow the business logic in the main
routines, you should probably factor out pieces into separate
functions/classes/modules.

Obviously, python source code will be confusing for any
non-programmer, but it's my experience that someone who
can't write a program is unlikely to create a functioning
system by setting up some kind of non-trivial configuration.
It's much, much easier for you to fix the conversion from
any potential format to the generic format above, than to
predict all the kinds of possible formats that might occur,
and to construct a general solution that will handle all
these situations. I wouldn't even try.

With some experience, you should be able to write a converter
for any reasonable format in a matter of minutes. As the
number of customers grows, you will assemble a catalog
of converters, and most of the time you will be able to use
an existing one, or possibly make a tiny modification to an
existing one. Unless you plan to ship this as shrinkwrap
software, be a sport and help each customer. I think you need
a lot of customers to save money with a generic solution.
Besides, it will be much easier to write a good generic
solution when you have solved the problem for a number of
real customers. It's not until then that you can really
know where the real problems lie, and what kind of
requirements there are from the customers. Why just text
files? You could also write converters that fetch data from
databases, web pages or web services!
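
Such a catalog could be as simple as a dictionary of converter
functions that all return the same uniform list of dictionaries.
This is only a sketch; the format names, function names and FIELDS
order are assumptions:

```python
FIELDS = ('name', 'address', 'position', 'dept', 'homedir', 'email')

def fromCommaLines(text):
    """One user per line, fields separated by commas, in FIELDS order."""
    users = []
    for line in text.splitlines():
        if line.strip():
            values = [v.strip() for v in line.split(',')]
            users.append(dict(zip(FIELDS, values)))
    return users

def fromKeyValueBlocks(text):
    """'key: value' lines, with a blank line between users."""
    users = []
    for block in text.split('\n\n'):
        user = {}
        for line in block.splitlines():
            if ':' in line:
                key, value = line.split(':', 1)
                user[key.strip()] = value.strip()
        if user:
            users.append(user)
    return users

# The catalog: one small converter per customer format.
CONVERTERS = {
    'comma-lines': fromCommaLines,
    'key-value': fromKeyValueBlocks,
}

def loadUsers(formatName, text):
    """Convert customer data to the uniform list of dictionaries."""
    return CONVERTERS[formatName](text)
```

A new customer format then means one more small function and one
more entry in CONVERTERS; the business logic never changes.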

You might also want to take a look at these resources if you
are interested in simple descriptions of structured data:
http://yaml.org/
http://www.scottsweeney.com/projects/slip

>I don't think it's so complicated and there may be simpler ways to do
>that (perhaps provided by python functions) so i'd follow the tracks
>you'd indicate to me.
>I wanted to know how deep i'd be able to automate the analysis of a
>function.

You can do a lot there, but I don't think it's the right
path for you to follow. If you have a class where you want
to check that you have the required attributes, you can
do as I did above in the two __init__ methods. In
both those cases, you will get an exception when you try
to instantiate the object if you lack needed parameters
(you can add assertions to make sure properties are correct)
and you can catch the exception and act upon that.
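
A sketch of catching those exceptions, assuming the dictionary form
of the user data. buildUsers and the '@' check are made up for
illustration, and note that assert statements are skipped when
Python runs with -O:

```python
class User:
    def __init__(self, userData):
        for attr in ('name', 'address', 'position', 'dept',
                     'homedir', 'email'):
            setattr(self, attr, userData[attr])  # KeyError if missing
        # An assertion as a sanity check on a property.
        assert '@' in self.email, 'email looks wrong: %r' % self.email

def buildUsers(userList):
    """Return (users, failures); failures holds (userData, exception)."""
    users, failures = [], []
    for userData in userList:
        try:
            users.append(User(userData))
        except (KeyError, AssertionError) as e:
            failures.append((userData, e))
    return users, failures
```

The failures list is what you would hand to the sysadmin.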

Another option is to do something like:

class User:
     attributes = (('name', str),
                   ('address', str),
                   ('position', str),
                   ('dept', str),
                   ('homedir', PathName),
                   ('email', EmailAddress))
     def __init__(self, userData):
         for attr, typeOrClass in self.attributes:
             setattr(self, attr, typeOrClass(userData[attr]))

In this case you can, if you want, access User.attributes to find
out what parameters you need to provide and what requirements they
need to fulfil. The first item in each pair is the name, and the second
is a type (assuming python 2.2 or later) or class that takes the
value as its single argument on instantiation. That way you
can test that values exist and are ok before you instantiate
the class if you prefer that to catching exceptions on
instantiation.
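
A sketch of such a check before instantiation. PathName and
EmailAddress are not defined anywhere above, so the versions here
are hypothetical stand-ins, and checkUserData is a made-up helper:

```python
class EmailAddress(str):
    """Hypothetical stand-in: a str that must contain an '@'."""
    def __new__(cls, value):
        if '@' not in value:
            raise ValueError('not an email address: %r' % (value,))
        return str.__new__(cls, value)

class PathName(str):
    """Hypothetical stand-in: a non-empty str."""
    def __new__(cls, value):
        if not value:
            raise ValueError('empty path name')
        return str.__new__(cls, value)

class User:
    attributes = (('name', str),
                  ('address', str),
                  ('position', str),
                  ('dept', str),
                  ('homedir', PathName),
                  ('email', EmailAddress))

    def __init__(self, userData):
        for attr, typeOrClass in self.attributes:
            setattr(self, attr, typeOrClass(userData[attr]))

def checkUserData(cls, userData):
    """Return a list of problems; empty if userData looks usable."""
    problems = []
    for attr, typeOrClass in cls.attributes:
        if attr not in userData:
            problems.append('%s: missing' % attr)
            continue
        try:
            typeOrClass(userData[attr])
        except (TypeError, ValueError) as e:
            problems.append('%s: %s' % (attr, e))
    return problems
```

You would call checkUserData(User, userData) first and only create
the User when the returned list is empty.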

If you really want a more generic way of converting data (which
might pay off if you have many involved classes) it's much better
to design the relevant classes so that they provide an explicit
interface for the information you need, rather than to try to
use introspection magic. There is really no way you can know
for certain what to provide to an arbitrary class, whatever you
might extract through introspection. Look at the __init__ above.
Nothing about the attributes of the class is revealed from the
code in the __init__ method. It's all driven from the User.attributes
list, which is clear when you look at the code, but hopeless to
find out programmatically unless you are prepared for it...

>Notice that my last programing experience occured 20 years ago and that
>i learnt the name of python 4 months ago.
>I knew that OOP existed. :)
>That may explain that, to me, a hill may look like the Everest.

I hope it looks a little less steep now...

>Andre.

-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se