[Tutor] update html pages using python

Alan Gauld alan.gauld at btinternet.com
Mon Aug 31 00:58:57 CEST 2009


"Stefan Behnel" <stefan_ml at behnel.de> wrote

>> "pedro" <pedrooconnell at gmail.com> wrote
>>> Hi, I was wondering if anyone could point me in the right direction as
>>> far as the best way to use python to update html. 
>> 
>> There are a number of modules in the standard library that can help but
>> the best known module for this is BeautifulSoup
> 
> I would call that statement highly exaggerated.
> 
> http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/

There may be a language thing at work here, but by "best known module"
I do not mean that Beautiful Soup is the best of all known modules; rather,
it is the most widely known of the non-standard HTML packages. I stand
by that. It is also, arguably, one of the easiest to use, it is well behaved
with non-compliant HTML (i.e. most web pages), and it has reasonable
documentation and support.
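
For anyone wondering what updating sloppy markup looks like in practice,
here is a rough sketch. It uses only the standard library's html.parser,
so it runs without installing anything; it is not Beautiful Soup's API,
and the names HeadingUpdater and replace_h1 are made up for illustration.
It copies the page through as written and swaps the text of each <h1>,
without requiring that the other tags are ever closed.

```python
from html import escape
from html.parser import HTMLParser


class HeadingUpdater(HTMLParser):
    """Copy HTML through as written, replacing the content of <h1> elements.

    Hypothetical helper for illustration only.
    """

    def __init__(self, new_text):
        # convert_charrefs=False keeps &amp; and friends as separate events,
        # so they can be written back out instead of being decoded.
        super().__init__(convert_charrefs=False)
        self.new_text = new_text
        self.skipping = False   # True while inside an <h1>
        self.out = []

    def _emit(self, text):
        if not self.skipping:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it was written.
        self._emit(self.get_starttag_text())
        if tag == "h1" and not self.skipping:
            self._emit(escape(self.new_text, quote=False))
            self.skipping = True   # drop the old heading content

    def handle_startendtag(self, tag, attrs):
        self._emit(self.get_starttag_text())   # e.g. <br/>

    def handle_endtag(self, tag):
        if tag == "h1":
            self.skipping = False
        self._emit("</%s>" % tag)

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit("&%s;" % name)

    def handle_charref(self, name):
        self._emit("&#%s;" % name)

    def handle_comment(self, data):
        self._emit("<!--%s-->" % data)

    def handle_decl(self, decl):
        self._emit("<!%s>" % decl)


def replace_h1(html, new_text):
    """Return html with the content of each <h1> replaced by new_text."""
    parser = HeadingUpdater(new_text)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Unclosed <p> and <b> tags, no <html> or <head>: typical real-world markup.
messy = "<h1>Old heading</h1><p>one<p>two <b>bold<br></body>"
print(replace_h1(messy, "New heading"))
```

One limitation of this sketch: if the <h1> itself is never closed, everything
after it is dropped. Coping with that sort of case for you is a large part
of what a forgiving parser is for.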

Ian B's article considers parser performance, and Beautiful Soup has
never claimed to be the fastest!

I hope that clarifies any confusion.

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/




