[Mailman-Developers] Want to Code... need some feedback
Charles Iliya Krempeaux
tnt@linux.ca
Thu, 05 Jul 2001 14:16:53 -0700
Hello,
J C Lawrence <claw@kanga.nu> wrote:
>> To do this, I think that e-mail messages should be dumped
>> into a database. (Since I have MySQL and PostgreSQL at my
>> disposal, those are what I'll be able to support myself.)
>
> I have some early proofs of concept done on having MHonArc generate
> scripts which, when executed, insert their respective message contents
> into PostgreSQL with the appropriate threading links. The code is
> based on the PHP and template-based archiving I already do at
> Kanga.Nu, merely taking the PHP variable assignments already produced
> by the current system and instead having the back end
> use them as the values to insert into the DB.
>
> It works. Kinda. It's not pretty. The reliance on PHP as an
> intermediate layer should be removed (slightly messy as MHonArc
> insists on inserting HTML-style comments). Proper thread handling
> and generation needs to be improved (shouldn't rely on MHonArc but
> should be dynamically generated). Etc.
The design I have in mind is for Mailman (using Python) to dump the
e-mail messages directly into the database.
(Are there standard [or de facto standard] Python modules for accessing
databases? In particular, for accessing MySQL and PostgreSQL?)
Then standard bindings/libraries to the database can be provided for PHP
(and whatever other languages). That way, the database is the middle man:
Mailman and the PHP binding/library (and any other language
binding/library) depend only on the database. (Better still, Mailman
is completely independent of the PHP code [and vice versa]. Only the database
structure matters.)
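
A minimal sketch of the "database as middle man" idea, assuming Python's
DB-API 2.0 interface (PEP 249), which MySQL and PostgreSQL driver modules
generally follow. The standard-library sqlite3 module stands in for those
drivers here so the example runs anywhere; the table and column names and
the helper functions are hypothetical, not anything Mailman defines. Note
that sqlite3 uses "?" placeholders, while many MySQL and PostgreSQL
drivers use "%s".

```python
import sqlite3

# Hypothetical one-table layout; names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS message (
    id         INTEGER PRIMARY KEY,
    message_id TEXT UNIQUE,
    raw        TEXT NOT NULL
)
"""


def open_archive(path=":memory:"):
    """Open (or create) the archive database and make sure the table exists."""
    conn = sqlite3.connect(path)
    cur = conn.cursor()
    cur.execute(SCHEMA)
    conn.commit()
    return conn


def store_message(conn, message_id, raw_text):
    """Insert one raw e-mail message and return its row id."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO message (message_id, raw) VALUES (?, ?)",
        (message_id, raw_text),
    )
    conn.commit()
    return cur.lastrowid


def fetch_raw(conn, message_id):
    """Return the stored raw text for a Message-ID, or None if absent."""
    cur = conn.cursor()
    cur.execute("SELECT raw FROM message WHERE message_id = ?", (message_id,))
    row = cur.fetchone()
    return row[0] if row else None
```

Any other language binding would only need to agree on this table
layout, not on anything inside the Python code.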
(To get a little deeper into the design...) the important things I see
to extract from each message (and also store) are:
* The author of the message. (This will probably be based on the
  e-mail address. But, IMO, it would be better to design it so
  a person/author is thought of as a separate entity from an
  e-mail address. That way, a person/author could have more than
  one e-mail address, and still be recognized as the same person/author.
  There is one problem though... what happens if more than one person
  uses an e-mail address?)
* The message (or possibly messages) that the e-mail message
  is a response to.
* The mailing list (or mailing lists) that it was sent to.
* The date & time it was received by the mailing list.
* The date & time it was supposed to have been sent. (Although
  this can be inaccurate when someone does not set their clock
  correctly... as I understand it anyway.)
* The subject of the message.
* Other `things' that might want to be stored. (Maybe for statistical
  reasons. Maybe for other reasons.)
    - The delivery history of the e-mail message.
    - All the other headers found in the message.
    - If a message is sent from the web interface, maybe store stuff
      like the IP address of the sender, etc.
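
The list above can be sketched as a single extraction step, assuming
Python's standard email package. The function name extract_fields and
the dictionary keys are hypothetical. It also assumes that the parent
messages can be read from the In-Reply-To and References headers, that
the delivery history is the set of Received headers, and that the
receive time is stamped by the archiver itself rather than taken from
the message. Mapping an address to a person/author is left to the
database.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime


def extract_fields(raw_text, received_at=None):
    """Pull the fields worth storing out of one raw e-mail message."""
    msg = message_from_string(raw_text)

    # Author: display name and address from the first From entry.
    authors = getaddresses(msg.get_all("From", []))
    name, address = authors[0] if authors else ("", "")

    # Parents: Message-IDs named in In-Reply-To, then References,
    # de-duplicated while keeping order. Tokens that do not look
    # like <id> are skipped (a simplification).
    parents = []
    for header in ("In-Reply-To", "References"):
        for value in msg.get_all(header, []):
            for token in value.split():
                if (token.startswith("<") and token.endswith(">")
                        and token not in parents):
                    parents.append(token)

    # Claimed send time; may be wrong if the sender's clock is off.
    sent = None
    if msg.get("Date"):
        try:
            sent = parsedate_to_datetime(msg.get("Date"))
        except (TypeError, ValueError):
            sent = None  # unparseable Date header

    # To/Cc addresses; the caller matches these against known lists.
    recipients = [addr for _, addr in getaddresses(
        msg.get_all("To", []) + msg.get_all("Cc", []))]

    return {
        "author_name": name,
        "author_address": address.lower(),
        "parents": parents,
        "recipients": recipients,
        "received_at": received_at or datetime.now(timezone.utc),
        "sent": sent,
        "subject": msg.get("Subject", ""),
        "delivery_history": msg.get_all("Received", []),
        "headers": msg.items(),  # every header as (name, value) pairs
    }
```

Keeping the author name and address as separate values leaves room for
a person/author table that several addresses can point to.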
Does that sound reasonable? Have I missed anything? Your insight into
the workings of Mailman would be much appreciated.
See ya
Charles Iliya Krempeaux