please comment on technologies

Cameron Laird claird at lairds.com
Mon Apr 28 16:02:04 EDT 2003


In article <slrnbalplq.254.luc at trillian.dont-panic.info>,
luc wastiaux  <luc at nospam.com> wrote:
>Hello, I am writing the specifications for a school project (the subject 
>is free for us to choose), I was thinking about doing a news-server 
>archive database just like google groups, but on a smaller scale (it would 
>only archive a limited number of newsgroups, not all of USENET). I got this 
>idea because my school uses a news server and a lot of valuable 
>information is lost when old messages are erased from the server's spool.
>
>The project would be developed by a group of 4 to 6 CS students 
>(including me), over a three-month timespan (but not full time, we have 
>to go to classes and other stuff)
			.
			.
			.
>I'm (almost) sure we will make use of Python as the main language, and use
>either MySQL or PostgreSQL for the database (I'm familiar with MySQL but
>maybe it's too limited for what I want to do?). I just found out about
			.
			.
			.
I'll say a few things I haven't seen anyone else say.

If a commercial client told me that he depended on Usenet, and
it was working well, but "a lot of valuable information is lost
...", my first recommendation would be to tune his Usenet
service.  That's a way to re-use a lot of existing
infrastructure.  You can set up a special-purpose NetNews server,
with its own local datastore of just the groups you want, and
configure expiration times to "never".  That's the most
conservative approach that comes to my mind, and, in
return-on-investment calculations, I'm a conservative.

Also, while it's fine with me that you practice your MySQL or
PostgreSQL skills, I don't get the point.  You're just storing
messages, right?  It's a write-once-read-many-delete-never
model?  You can do worse than just dumping stuff into the file
system.  Again, look at the existing NetNews server
implementations.
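
To make the file-system idea concrete, here is a minimal sketch in
Python, not a recommendation of any particular layout.  The names
store_article and load_article, the one-directory-per-group layout,
and the ".eml" suffix are all assumptions of mine for illustration;
real NetNews servers use their own spool formats.

```python
import hashlib
from email import message_from_bytes
from pathlib import Path


def store_article(root: Path, group: str, raw: bytes) -> Path:
    """Write one raw article under root/group/, named by a hash of its Message-ID."""
    msg = message_from_bytes(raw)
    message_id = msg.get("Message-ID")
    if not message_id:
        raise ValueError("article has no Message-ID header")
    # Hash the Message-ID so the file name is safe on any file system.
    name = hashlib.sha1(str(message_id).strip().encode("utf-8")).hexdigest()
    # Two-character fan-out directory (hypothetical layout) keeps directories small.
    path = root / group / name[:2] / (name + ".eml")
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "xb" fails if the file already exists: write once, never overwrite.
        with open(path, "xb") as f:
            f.write(raw)
    except FileExistsError:
        pass  # already archived; keep the first copy
    return path


def load_article(path: Path):
    """Read an archived article back as an email.message.Message."""
    return message_from_bytes(path.read_bytes())
```

Storing the same article twice is harmless here, which is all the
"write-once-read-many-delete-never" model asks of the storage layer.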

These simplifications free you to focus on the query model,
which is likely to be a more interesting problem than you
realize.
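
As a deliberately naive starting point for the query side, here is a
sketch of an in-memory inverted index with AND-only word search.  The
function names build_index and search, the tokenizer, and the choice
of an inverted index at all are my assumptions for illustration;
ranking, threading, phrase queries, and date ranges are not handled,
and those are where the interesting decisions begin.

```python
import re
from collections import defaultdict

# Crude tokenizer (an assumption): runs of ASCII letters and digits.
WORD = re.compile(r"[a-z0-9]+")


def build_index(articles):
    """Map each lowercase word to the set of article ids containing it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for word in WORD.findall(text.lower()):
            index[word].add(article_id)
    return index


def search(index, query):
    """Return ids of articles containing every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    # Start from a copy so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Here articles is assumed to be a dict from Message-ID to the text
you want searchable, for example subject plus body.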

For historical interest, I'll note that
<URL: http://phaseit.net/claird/news.lists/newsgroup_archives.html >
indexes a few thousand (well, it used to; maybe a couple hundred
are still live) special-purpose Usenet archives.  Perhaps it'll
amuse you to see how others have worked in this area before
you.
-- 

Cameron Laird <Cameron at Lairds.com>
Business:  http://www.Phaseit.net
Personal:  http://phaseit.net/claird/home.html



