Performance Issues with Threaded Python Network Server

Joao Prado Maia JMaia at lexgen.com
Tue Jan 15 15:52:46 EST 2002


> > Please be aware that this is my first real-world python project, so
> > please be gentle if you see something stupid in my code ;)
> >
> Hey, this is c.l.python, not one of those other groups ;-)
> 

Good to know that :)


> > Anyway, I read 'Programming Python' a little bit on the Network
> > programming chapter and decided to use threads (aka
> > SocketServer.ThreadingTCPServer) on my NNTP server. Everything works
> > great but I have been experiencing some heavy CPU usage on the server.
> >
> > A little bit more information - the heavy CPU load is triggered when
> > a user tries to download all 1500 messages / articles of one message
> > board. The way NNTP works and the way Outlook Express (the newsreader
> > in this case) works is that it will download all the headers for the
> > articles at once, and then request the actual body of the articles one
> > by one.
> >
> What news clients do (from actual network traces against my news server)
> is a query on the group, which returns a tuple (resp, estimate, first,
> last, name), followed by an XOVER, which is documented in the nntplib
> documentation. I'm not sure what OE would do if it got no response to an
> XOVER, but this is certainly the fastest way to work.
> 

Well, I think you misunderstood the issue here. Outlook Express does an XOVER
1-1500 (or whatever the range is) to get the header information about all of
the articles, and then does an ARTICLE 'number' for every article in the group
to get the actual body of the article. This only happens if the user chooses
to have all messages downloaded to his hard disk.
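
To make the sequence concrete, here is a rough sketch of the commands the
newsreader ends up sending. The function name is made up for illustration,
and it assumes the article numbers are contiguous, which a real group does
not guarantee.

```python
def download_all_commands(first, last):
    """Yield the NNTP commands sent to fetch a whole group (illustrative only)."""
    # One XOVER covering the whole range returns the overview (header) lines.
    yield "XOVER %d-%d" % (first, last)
    # Then one ARTICLE request per article number to fetch each article.
    # Assumption: numbers are contiguous; real groups can have gaps.
    for number in range(first, last + 1):
        yield "ARTICLE %d" % number


commands = list(download_all_commands(1, 1500))
print(len(commands))  # 1501: one XOVER plus 1500 ARTICLE requests
```

So the server has to answer 1501 requests on that one connection, and 1500
of them are full articles.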


> > What this means is that the server will write to the 'wfile' file
> > descriptor to send the response (the headers and bodies of the
> > articles) to the newsreader.
> >
> In the case of an XOVER the news server responds with fairly abbreviated
> information about the available articles, the whole point being to
> remove the need to transfer all headers and bodies down the wire just so
> the client knows what's available. OE (and other news clients) will be
> fairly intelligent about not asking for articles it's already got
> details for.
> 

Indeed, I understand the NNTP protocol perfectly on this issue. However,
even 'fairly' abbreviated means 1500 lines of header information in just
one message board, which grows by the day.


> > The problem here is that whenever this happens, the CPU usage of the
> > NNTP server goes to about 35% and continues increasing slowly while
> > the newsreader is receiving all the message headers and bodies.
> >
> Too much data! I'm guessing you aren't using XOVER?
> 

It's not my fault. Outlook Express is the one asking for 1500 articles one
at a time because the user wants to download the information of all articles
to his hard disk.


> At the foot of this response is a program I've run against my NNTP
> server to download the XOVER data. It's really quick, so I presume your
> server should also be if it has the required information in a relational
> store. Don't know whether this will help or not, but you can try it just
> to see whether it gives you any useful information.
> 

Again, the problem is not really related to the XOVER command. See above :)



> > The NNTP server gets its information from a MySQL database (and no,
> > MySQL is not the bottleneck as far as I know, since 'top' shows the
> > NNTP server consuming 35% of CPU, not MySQL), formats the output by
> > using string replacement (aka "%s %s <%s@%s>" % (v,x,z,y)) and writes
> > to the 'wfile' file descriptor.
> >
> Your do_XOVER code doesn't look outrageously bad.
> 

I'm glad to hear that. Can you tell me something that could improve its
performance anyway?
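
One thing I could try, for instance, is formatting all the lines first and
writing them to 'wfile' in a single call instead of one write per article.
The sketch below is only an illustration and not my actual do_XOVER: the
function name and the row layout (number, subject, user, host) are made up,
a real overview line carries more fields, the status line that precedes the
data is left out, and I have not measured whether this lowers the CPU usage.

```python
import io


def send_overview(wfile, rows):
    """Format every overview line first, then write them in one call."""
    lines = []
    for number, subject, user, host in rows:
        # Same "%s"-style formatting as before, one line per article.
        # Hypothetical layout: a real overview line has more fields.
        lines.append("%s\t%s\t<%s@%s>" % (number, subject, user, host))
    lines.append(".")  # a lone dot ends a multi-line NNTP response
    # A single write instead of one write per article.
    wfile.write(("\r\n".join(lines) + "\r\n").encode("utf-8"))


# Stand-in for the handler's wfile, so the sketch runs without a socket.
fake_wfile = io.BytesIO()
send_overview(fake_wfile, [(1, "Hello", "joao", "example.com")])
```

In the real handler 'wfile' would be the one SocketServer gives to the
request handler; the BytesIO object is only there to keep the example
self-contained.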

Cheers,
Joao


--
Joao Prado Maia
Software QA
Bioinformatics Dept.
Lexicon Genetics, Inc.







