help with (x)html / xml encoding...
staschuk at telusplanet.net
Fri Mar 21 03:51:11 CET 2003
> I'm looking for a way to extract the encoding from a file retrieved by urllib.
> I'm planning to create a "restricted" parser which will only examine <?
> and <meta tags, to check for:
> <meta http-equiv="content-type" content="text/html; charset=xxxencodingxxx">
> <?xml version="1.0" encoding="'xxxencodingxxx'"?>
> Do you think that is enough? How would you do it?
You should also check the headers returned by urlopen(foo).info()
for a Content-Type header; if it carries a charset parameter, that
value is supposed to take precedence over either of the above.
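
For what it's worth, here is a rough sketch of that precedence order. It assumes Python 3, where the object returned by urlopen(foo).info() is an email.message.Message subclass with a get_content_charset() method. The function name sniff_encoding, the two regular expressions, and the 2048-byte look-ahead are illustrative choices of mine, not anything prescribed by a spec, and the patterns are only a loose approximation of a "restricted" parser.

```python
import re
from email.message import Message

# Hypothetical, deliberately loose patterns for the two in-document
# declarations quoted above; they are not a full HTML/XML parser.
XML_DECL_RE = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', re.I)
META_RE = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.I)


def sniff_encoding(headers, body, default=None):
    """Guess the encoding of `body` (bytes).

    `headers` is a Message-like object, e.g. what urlopen(url).info()
    returns in Python 3. The HTTP Content-Type charset is consulted first,
    then the XML declaration, then the <meta> tag.
    """
    charset = headers.get_content_charset()
    if charset:
        return charset
    head = body[:2048]  # assumption: declarations appear near the start
    for pattern in (XML_DECL_RE, META_RE):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return default
```

With a live response this would be called roughly as sniff_encoding(resp.info(), resp.read()), where resp is the object returned by urllib.request.urlopen.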
Steven Taschuk staschuk at telusplanet.net
"Telekinesis would be worth patenting." -- James Gleick