Extending/embedding versus separation

skoria shomon at softhome.net
Thu Mar 28 06:12:09 EST 2002


Hi

Thanks for your help. No, I didn't write the hash tables. I'm writing
a stats program based on webalizer. Webalizer and Analog were the
fastest and least memory-hungry of all the programs I know of, and
webalizer is GPL, so I was able to use it as the basis for my work,
which means I get to make free software at work! The hash tables are
already written there.

This program is intended to run on all the sites hosted by the company
I work for, and also on sites not hosted by it, hence the memory and
disk space concerns.

The way I see it, the Python side of things will be a Python script
that imports the C parts, and can therefore process the logfiles and
turn them into hash tables. These are then fed back to the Python
part, which processes them further and turns them into graphs and HTML.
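To make the intended pipeline concrete, here is a rough sketch. The module name `_webalizer` and its `parse_log()` function are invented for illustration; the real extension would wrap the adapted webalizer parser, and the stand-in below just mimics what it would hand back to Python:

```python
# Hypothetical sketch of the planned pipeline. In the real program,
# parse_log() would live in a C extension (here imagined as
# "_webalizer") built from the adapted webalizer code; this
# pure-Python stand-in only shows the interface.

def parse_log(path):
    # Stand-in for the C parser: count hits per URL, the way the
    # C hash tables would, keyed by the request path in common
    # log format (field 7 of a whitespace-split line).
    hits = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) > 6:
                url = fields[6]
                hits[url] = hits.get(url, 0) + 1
    return hits

def make_report(hits):
    # Python-side post-processing: sort pages by hit count,
    # ready to be turned into graphs and HTML.
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
```

The point is that the C side only has to deliver plain dicts; everything report-shaped stays in Python where it is cheap to change.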

The reason for this is that I need to be able to develop and expand
the reports quickly, so we can do complicated things like, say,
visitor paths or percentage-increase graphs.

As I see it now, the only difference extending Python with my adapted
webalizer makes is that I save writing a little output to a file and
parsing it back into Python.

Is this going to be worthwhile?

Ale

--- In python-list at y..., sjmachin at l... (John Machin) wrote:
<snip!>
> 
> My take on a project like yours would be to write *everything* (that's
> not available off the shelf) in Python first, even parts that you
> think you know for sure are going to have to be implemented in C
> later. Those parts can be coded up in Python modules that can be
> replaced by C extensions if really needed. The infrastructure stuff
> like command-line-arg handling (or GUI input), file handling, etc is
> so much easier to bolt together in Python than in C that I would
> prefer extending Python (if necessary) to embedding Python in C.
> 
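John's "write it in Python first, swap in C later" advice is usually arranged with an import fallback. A minimal sketch, with invented names (`_logparse`, `count_hits`): the pure-Python version defines the interface, and a C extension with the same name can replace it later without touching any caller.

```python
# Pattern sketch: prefer a compiled extension if it exists,
# otherwise fall back to a pure-Python implementation with the
# same interface. "_logparse" and "count_hits" are hypothetical.

try:
    from _logparse import count_hits   # hypothetical C extension
except ImportError:
    def count_hits(lines):
        # Pure-Python fallback: count occurrences of the request
        # path (field 7 of a whitespace-split log line).
        hits = {}
        for line in lines:
            fields = line.split()
            if len(fields) > 6:
                key = fields[6]
                hits[key] = hits.get(key, 0) + 1
        return hits
```

Callers just `import` and use `count_hits`; whether the C version was built is invisible to them.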
> Hash tables in C? Are you using a package like Cdt, or did you write
> your own? If you can process data into hash tables in C and then get
> the data into Python faster than you can process the data into
> dictionaries in Python, then please divulge your deep dark magic
> spells to Tim Peters the Python-dict-shaman.
> 
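Before betting on C hash tables plus a transfer step, it is worth measuring how fast plain Python dicts absorb log-sized key counts. A quick, throwaway micro-benchmark (sizes and key shapes are arbitrary stand-ins for real log data):

```python
# Rough sanity check: how long does it take a plain dict to
# absorb n lookups/increments over a pool of 1000 distinct keys?
import time

def time_dict_fill(n):
    keys = ['/page%d.html' % (i % 1000) for i in range(n)]
    t0 = time.time()
    d = {}
    for k in keys:
        d[k] = d.get(k, 0) + 1
    return time.time() - t0, len(d)

elapsed, distinct = time_dict_fill(100000)
```

If the dict route is already fast enough for your log volumes, the file-round-trip versus extension question becomes much less urgent.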
> You seem to be a bit concerned about memory usage. My advice is to
> write a prototype of your application in Python, bearing memory
> efficiency in mind, but not obsessively -- i.e. use the most
> appropriate data structures and don't distort your Python code into
> unreadability. Then you either have enough memory or you don't. Do
> some back-of-the-envelope calculations, like: a 256MB stick of memory
> costs how few hours of developer time? If you can't for whatever
> reason get more memory, then it's time to consider your next step. If
> your application has some large dictionaries that only have objects of
> type X as keys and type Y as values (where Y is a simple type like int
> or float) then it might be a good idea to take a copy of dictobject.c
> and make a specialised intdict (say) module that instead of managing
> (PyObject *) pointers, managed int values directly -- this would save
> you heaps (pardon the pun) of memory; see recent thread in this
> newsgroup about amount of memory taken up by Python objects.
> 
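Short of copying `dictobject.c`, the same idea can be approximated in pure Python: keep one dict that only maps keys to slot numbers, and store the int values in a compact `array` of machine ints instead of as boxed Python objects. This is a sketch of the idea, not the specialised C module John describes:

```python
# Memory-saving approximation of a str -> int counter: values
# live in an array of machine ints rather than as PyObject ints.
from array import array

class IntCounter:
    def __init__(self):
        self._index = {}           # key -> slot number
        self._counts = array('l')  # machine ints, one per slot

    def add(self, key, n=1):
        i = self._index.get(key)
        if i is None:
            i = len(self._counts)
            self._index[key] = i
            self._counts.append(0)
        self._counts[i] += n

    def get(self, key):
        i = self._index.get(key)
        return self._counts[i] if i is not None else 0
```

The per-value saving is modest next to a true C `intdict`, but it needs no compilation and keeps the same counting interface.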
> For another (already implemented) variation on this memory-saving
> theme, google("c.l.py", "Machin intern memory").
> 
> HTH,
> John
> -- 
> http://mail.python.org/mailman/listinfo/python-list




