[Baypiggies] replacement for urllib2 that can handle xhtml

Tony Cappellini tony at tcapp.com
Tue Dec 28 18:42:53 CET 2010


It appears my URL was malformed (operator error).
When I first tried opening the URL with urllib2.urlopen(), an
exception was thrown.

<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet href="http://imgs.xkcd.com/s/c40a9f8.css"
type="text/css" media="screen" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

When I saw the XML declaration at the top of the page, I assumed
urllib2 couldn't handle it.
But it works now: urllib2 only fetches the page and doesn't parse it,
so the XHTML was never the problem.
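
For the parsing step itself, here is a minimal sketch along the lines of
the ElementTree suggestion quoted below. It has not been run against the
real page. SAMPLE is a made-up stand-in for a fetched document and
extract_links is a hypothetical helper name; neither comes from this
thread. It assumes the page is well-formed XML and uses no named
entities such as &nbsp;, which ElementTree rejects when it has no DTD.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <html> element of an XHTML document.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up stand-in for the bytes a fetch would return (an assumption,
# not the real page).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample page</title></head>
<body><a href="/1/">first</a> <a href="/2/">second</a></body>
</html>
"""


def extract_links(xhtml_bytes):
    """Return (title, list of href values) from a well-formed XHTML document."""
    root = ET.fromstring(xhtml_bytes)
    # Map a short prefix to the XHTML namespace for use in the paths below.
    ns = {"x": XHTML_NS}
    title = root.findtext("x:head/x:title", namespaces=ns)
    hrefs = [a.get("href") for a in root.iterfind(".//x:a", ns)]
    return title, hrefs


if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

In practice the bytes would come from urllib2.urlopen(url).read()
(urllib.request.urlopen on Python 3) instead of SAMPLE. The namespace
prefix matters: XHTML elements live in the http://www.w3.org/1999/xhtml
namespace, so a bare ".//a" path finds nothing.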


On Tue, Dec 28, 2010 at 7:05 AM, Aahz <aahz at pythoncraft.com> wrote:
> On Mon, Dec 27, 2010, Tony Cappellini wrote:
>>
>> What's the best module/package for parsing xhtml?  HTMLParser is
>> built in, but is there another package which is more like urllib2 or
>> Beautiful Soup, but handles xhtml?
>
> lxml?  (Never used it myself, but xhtml is supposed to be xml-compliant,
> therefore lxml would be the obvious choice.  ElementTree also ought to
> work.)
> --
> Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/
>
> "Think of it as evolution in action."  --Tony Rand

