ANN: encutils 0.4
Christof
csad7 at t-online.de
Wed Aug 17 22:09:26 CEST 2005
Some basic helper functions to deal with the encodings of text files
(like HTML, XHTML, XML) retrieved via HTTP. Developed for cssutils, but
it looked worth an independent release.
Download from http://cthedot.de/encutils/
Some unit tests are included.
License
Creative Commons License
http://creativecommons.org/licenses/by/2.0/
Functions:
Note: All encodings returned are uppercase.
encodingByMediaType(media_type, log=None)
Returns a default encoding for the given media type, e.g. 'UTF-8'
for media_type='application/xml'. If no default encoding is available,
returns None.
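
As a rough illustration of the idea (not the encutils source), a default
lookup can be a plain table. The function name default_encoding_for and
the table entries other than 'application/xml' are assumptions made for
this sketch; the text/xml and text/html defaults follow RFC 3023 and
RFC 2616 respectively.

```python
# Illustrative defaults only; this is NOT the table encutils uses.
_DEFAULTS = {
    "application/xml": "UTF-8",   # the example given in the announcement
    "text/xml": "US-ASCII",       # assumption, based on RFC 3023
    "text/html": "ISO-8859-1",    # assumption, based on the RFC 2616 text/* default
}


def default_encoding_for(media_type):
    """Return an uppercase default encoding for media_type, or None."""
    if media_type is None:
        return None
    # Media types are case-insensitive, so normalise before the lookup.
    return _DEFAULTS.get(media_type.strip().lower())
```

Unknown media types simply fall through to None, which matches the
behaviour described above.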
getHTTPInfo(HTTPResponse, log=None)
Returns (media_type, encoding) information from the response's
Content-Type HTTP header (the case of header names is ignored). May be
(None, None), e.g. if no Content-Type header is available.
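
A minimal sketch of this kind of header parsing, assuming the headers
are available as a plain name/value mapping rather than an HTTPResponse
object. The name http_info is hypothetical and the code is not taken
from encutils.

```python
def http_info(headers):
    """Return (media_type, encoding) from a mapping of HTTP headers.

    Header names are matched case-insensitively. Returns (None, None)
    when there is no Content-Type header.
    """
    content_type = None
    for name, value in headers.items():
        if name.lower() == "content-type":
            content_type = value
            break
    if not content_type:
        return (None, None)

    # "text/html; charset=utf-8" -> ["text/html", "charset=utf-8"]
    parts = [part.strip() for part in content_type.split(";")]
    media_type = parts[0].lower() or None

    encoding = None
    for param in parts[1:]:
        key, sep, value = param.partition("=")
        if sep and key.strip().lower() == "charset":
            # Drop optional quotes and normalise to uppercase.
            encoding = value.strip().strip("\"'").upper() or None
            break
    return (media_type, encoding)
```

For example, http_info({"content-TYPE": "text/html; charset=utf-8"})
gives ('text/html', 'UTF-8').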
getMetaInfo(text, log=None)
Returns (media_type, encoding) information from (first) X/HTML
Content-Type <meta> element if available.
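
One possible way to find such a <meta> element, sketched with the
standard library's html.parser. The names _MetaFinder and meta_info are
hypothetical, and encutils itself may well work differently (for
example with regular expressions).

```python
import re
from html.parser import HTMLParser


class _MetaFinder(HTMLParser):
    """Remember the content attribute of the first Content-Type <meta>."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching element counts; ignore later ones.
        if tag != "meta" or self.content is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            self.content = attributes.get("content") or ""


def meta_info(text):
    """Return (media_type, encoding) from the first Content-Type <meta>."""
    finder = _MetaFinder()
    finder.feed(text)
    if finder.content is None:
        return (None, None)
    media_type, _, params = finder.content.partition(";")
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", params, re.I)
    encoding = match.group(1).upper() if match else None
    return (media_type.strip().lower() or None, encoding)
```

HTMLParser lowercases tag and attribute names, so both HTML and XHTML
spellings of the element are handled the same way.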
getXMLEncoding(text, log=None)
Parses the XML declaration of a document, if present (simplified).
Returns (encoding, explicit).
No autodetection of BOM is done yet. If no explicit encoding is
found returns ('UTF-8', False).
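
A simplified sketch of such a parser, assuming the text is already a
str and that the declaration, if any, starts at the very first
character. The name xml_encoding and the regular expression are
illustrative and not taken from encutils; like the function described
above, the sketch does no BOM detection.

```python
import re

# Simplified XML declaration: version first, then an optional encoding.
_XML_DECL = re.compile(
    r"^<\?xml\s+version\s*=\s*[\"'][^\"']*[\"']"
    r"(?:\s+encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"'])?"
    r"[^>]*\?>"
)


def xml_encoding(text):
    """Return (encoding, explicit) for the XML declaration in text."""
    match = _XML_DECL.match(text)
    if match and match.group(1):
        return (match.group(1).upper(), True)
    # No declaration, or a declaration without an encoding attribute.
    return ("UTF-8", False)
```

The second tuple element tells the caller whether 'UTF-8' was actually
declared or is only the fallback.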
guessEncoding(HTTPResponse, text, log=None)
Tries to find the encoding of given text. Uses information in
headers of supplied HTTPResponse, possible XML declaration and X/HTML
<meta> elements.
Returns (encoding, mismatch). Encoding is the explicit or implicit
encoding, or None, and is always returned in uppercase. Mismatch is True
if any mismatches between media_type, XML declaration or text content
are found.
More detailed mismatch reports are written to the optional log.
Mismatches are not necessarily errors! For details see the
specifications.
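
To make the (encoding, mismatch) result more concrete, here is a
deliberately simplified sketch of combining the three sources. The name
combine_encodings, the precedence order (HTTP header, then XML
declaration, then <meta>) and the rule "mismatch means the explicit
values differ" are all assumptions for illustration; the actual rules
in encutils also take the media type into account.

```python
def combine_encodings(http_encoding, xml_encoding, meta_encoding):
    """Combine candidate encodings into (encoding, mismatch).

    Each argument is an explicitly declared encoding name or None.
    Assumed precedence: HTTP header, XML declaration, <meta> element.
    """
    candidates = [
        name.upper()
        for name in (http_encoding, xml_encoding, meta_encoding)
        if name
    ]
    encoding = candidates[0] if candidates else None
    # More than one distinct declared value counts as a mismatch here.
    mismatch = len(set(candidates)) > 1
    return (encoding, mismatch)
```

With this rule, combine_encodings('utf-8', None, 'iso-8859-1') returns
('UTF-8', True): the header wins, but the disagreement is reported.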
The plan is to integrate XML autodetection (of BOM) in the next release.
I would very much welcome any feedback about spec compliance, errors or
other problems with the functions (or the tests!).
Please use http://cthedot.de/blog/?cat=14 or http://cthedot.de/contact/.
Thanks a lot!
chris
<P><A HREF="http://cthedot.de/encutils/">encutils 0.4</A> - basic helper
functions to deal with encodings of text files (17-Aug-05)