Using utidylib, empty string returned in some cases
gagsl-py2 at yahoo.com.ar
Wed Jan 23 02:14:14 CET 2008
On Tue, 22 Jan 2008 15:35:16 -0200, Boris <savinovboris at gmail.com> wrote:
> I'm using debian linux, Python 2.4.4, and utidylib
> (http://utidylib.berlios.de/). I wrote simple functions to get a web page,
> convert it from windows-1251 to utf8 and then I'd like to clean html
> with it.
Why the intermediate conversion? I don't know utidylib, but can't you feed
it the original page, in its original encoding? If the page itself
contains a "meta http-equiv" tag stating its content-type and charset, that
declaration will no longer match the actual bytes once you re-encode the
page.