ANN: encutils 0.4

Christof csad7 at
Wed Aug 17 22:09:26 CEST 2005

Some basic helper functions for dealing with the encodings of text files 
(such as HTML, XHTML, XML) retrieved via HTTP. Developed for cssutils, but 
it seemed worth an independent release.

Download from
Some unit tests are included.

	Creative Commons License

Note: All encodings returned are uppercase.

encodingByMediaType(media_type, log=None)

     Returns a default encoding for the given media type, e.g. 'UTF-8' 
for media_type='application/xml'. If no default encoding is available, 
returns None.
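
The lookup described above can be pictured as a small table. This is only a sketch and not the code shipped in encutils: the name default_encoding_for and the table itself are hypothetical, and only the application/xml entry comes from this announcement. The text/xml and text/html entries are assumptions based on the defaults in RFC 3023 and RFC 2616.

```python
# Hypothetical table, NOT the one used by encutils.
# Only the application/xml entry is taken from the announcement.
DEFAULT_ENCODINGS = {
    "application/xml": "UTF-8",
    "text/xml": "US-ASCII",     # assumption (RFC 3023 default)
    "text/html": "ISO-8859-1",  # assumption (RFC 2616 default for text/*)
}


def default_encoding_for(media_type):
    """Return an uppercase default encoding for media_type, or None."""
    if not media_type:
        return None
    # Media types are case-insensitive, so normalise before the lookup.
    return DEFAULT_ENCODINGS.get(media_type.strip().lower())
```

A media type without an entry in the table, such as 'image/png', simply yields None.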

getHTTPInfo(HTTPResponse, log=None)

     Returns (media_type, encoding) information from the response's 
Content-Type HTTP header (the case of header names is ignored). May be 
(None, None), e.g. if no Content-Type header is available.
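
To illustrate the kind of header parsing involved, here is a minimal sketch. It is not the encutils implementation: the function name parse_content_type is hypothetical, and it takes a plain dict of headers instead of an HTTPResponse so that the example stays self-contained.

```python
def parse_content_type(headers):
    """Return (media_type, encoding) from a dict of HTTP headers.

    Hypothetical helper: `headers` is a plain dict here, which is an
    assumption made for this sketch.
    """
    value = None
    for name, header_value in headers.items():
        # Header names are compared case-insensitively.
        if name.lower() == "content-type":
            value = header_value
            break
    if not value:
        return (None, None)

    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower() or None
    encoding = None
    for param in parts[1:]:
        key, sep, val = param.partition("=")
        if sep and key.strip().lower() == "charset":
            # Drop optional quotes and report the encoding in uppercase.
            encoding = val.strip().strip("\"'").upper() or None
            break
    return (media_type, encoding)
```

For example, a header value of 'text/html; charset=utf-8' would give ('text/html', 'UTF-8') in this sketch, and a missing header gives (None, None).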

getMetaInfo(text, log=None)

     Returns (media_type, encoding) information from the first X/HTML 
Content-Type <meta> element, if available.
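
A rough sketch of how such a <meta> lookup could be done with the standard library's html.parser module follows. The names _MetaFinder and meta_info are hypothetical, and returning (None, None) when no matching element exists is an assumption, not documented encutils behaviour.

```python
from html.parser import HTMLParser


class _MetaFinder(HTMLParser):
    """Remembers the content attribute of the first Content-Type <meta>."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag != "meta" or self.content is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            self.content = attrs.get("content") or ""


def meta_info(text):
    """Return (media_type, encoding) from the first Content-Type <meta>."""
    finder = _MetaFinder()
    finder.feed(text)
    finder.close()
    if finder.content is None:
        return (None, None)  # assumption: no element means no information
    parts = [p.strip() for p in finder.content.split(";")]
    media_type = parts[0].lower() or None
    encoding = None
    for param in parts[1:]:
        key, sep, val = param.partition("=")
        if sep and key.strip().lower() == "charset":
            encoding = val.strip().upper() or None
            break
    return (media_type, encoding)
```

Self-closing XHTML elements such as <meta ... /> are handled as well, because HTMLParser routes them through handle_starttag by default.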

getXMLEncoding(text, log=None)

     Parses the XML declaration of a document, if present (simplified 
parsing). Returns (encoding, explicit).
     No autodetection of the BOM is done yet. If no explicit encoding is 
found, returns ('UTF-8', False).
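
A simplified declaration parser in this spirit might look like the sketch below. The regular expression and the name xml_encoding are assumptions made for illustration, not the encutils source. Like the function described above, it ignores any BOM.

```python
import re

# Simplified pattern: version first, then an optional encoding pseudo-attribute.
_XML_DECL = re.compile(
    r"""^<\?xml\s+version\s*=\s*(["'])[^"']*\1"""
    r"""(?:\s+encoding\s*=\s*(["'])(?P<enc>[A-Za-z][A-Za-z0-9._-]*)\2)?"""
)


def xml_encoding(text):
    """Return (encoding, explicit) for the XML declaration in text."""
    match = _XML_DECL.match(text)
    if match and match.group("enc"):
        # An encoding was declared explicitly; report it in uppercase.
        return (match.group("enc").upper(), True)
    # No declaration or no encoding attribute: fall back to the XML default.
    return ("UTF-8", False)
```

The second element of the result lets a caller distinguish a declared 'UTF-8' from the fallback value.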

guessEncoding(HTTPResponse, text, log=None)

     Tries to find the encoding of the given text. Uses information in 
the headers of the supplied HTTPResponse, a possible XML declaration and 
X/HTML <meta> elements.
     Returns (encoding, mismatch). encoding is the explicit or implicit 
encoding, or None, and is always returned in uppercase. mismatch is True 
if any mismatches between media_type, XML declaration or text content are 
found. More detailed mismatch reports are written to the optional log.
     Mismatches are not necessarily errors! For details see the
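
The idea of combining several declarations and flagging disagreement can be sketched as follows. This is a deliberately reduced model, not the logic in encutils: the name combine_encodings is hypothetical, the precedence (HTTP header, then XML declaration, then <meta>) is an assumption, and implicit defaults derived from the media type are left out.

```python
def combine_encodings(http_encoding, xml_encoding, meta_encoding):
    """Return (encoding, mismatch) from three optional declarations.

    Assumed precedence for this sketch: HTTP header, then XML
    declaration, then <meta> element.
    """
    declared = [
        e.upper()
        for e in (http_encoding, xml_encoding, meta_encoding)
        if e
    ]
    # The first available declaration wins; None if nothing was declared.
    encoding = declared[0] if declared else None
    # A mismatch means at least two sources disagree after normalisation.
    mismatch = len(set(declared)) > 1
    return (encoding, mismatch)
```

With these assumptions, a header saying 'iso-8859-1' and an XML declaration saying 'utf-8' give ('ISO-8859-1', True), while identical declarations that differ only in case give no mismatch.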

The plan is to integrate XML autodetection (of the BOM) in the next release.

I would very much welcome any feedback about spec compliance, errors or 
other problems with the functions (or the tests!).
Please use or

Thanks a lot!

<P><A HREF="">encutils 0.4</A> - basic helper 
functions to deal with encodings of text files (17-Aug-05)
