[issue11316] RFC822 header parsing API inconsistencies between httplib.HTTPMessage and email.message.Message

Georges Martin report at bugs.python.org
Fri Feb 25 11:37:52 CET 2011


New submission from Georges Martin <jrjsmrtn at gmail.com>:

Both the httplib.HTTPMessage and email.message.Message classes[1] implement methods for RFC822 header parsing. Unfortunately, the implementations differ and they do not provide the same level of functionality.

One example that is bugging me:

* httplib.HTTPMessage is missing the get_filename method present in email.message.Message, which lets you easily retrieve the filename from a 'Content-disposition: attachment; filename="fghi.xyz"' header;

* httplib.HTTPMessage has getparam, getplist and parseplist methods but, AFAIK, they are not, and cannot be, used for anything other than parsing the 'content-type' header;

* email.message.Message has a generic get_param method to parse any RFC822 header with parameters, such as 'content-disposition' or 'content-type'.
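
To make the comparison concrete, here is a minimal sketch of the email.message.Message side of the list above. It only uses get_filename and get_param as documented for that class; the header values are made up for illustration.

```python
from email.message import Message

# Build a message with two parameterised headers (illustrative values only).
msg = Message()
msg['Content-Type'] = 'text/plain; charset="utf-8"'
msg['Content-Disposition'] = 'attachment; filename="fghi.xyz"'

# get_filename() reads the 'filename' parameter of Content-Disposition.
filename = msg.get_filename()

# get_param() is generic: it defaults to Content-Type but accepts any header.
charset = msg.get_param('charset')
disp_filename = msg.get_param('filename', header='content-disposition')
```

Neither call has a direct counterpart on httplib.HTTPMessage, which is the gap this report is about.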

The workaround I'm using is to decorate an httplib.HTTPMessage with the missing methods from email.message.Message:

    def monkeypatch_http_message(obj):
        """ Decorate an httplib.HTTPMessage instance's class
            with the RFC822 header parameter parsing methods
            from email.message.Message. (thanks to ncoghlan)
        """
        import httplib
        assert isinstance(obj, httplib.HTTPMessage)
        cls = obj.__class__

        from email.message import Message
        # The copied functions look up their helpers (_parseparam,
        # _unquotevalue, utils) in email.message's own globals, so
        # nothing else needs importing here.
        funcnames = (
            '_get_params_preserve',
            'get_params',
            'get_param',
            'get_filename'
        )
        for funcname in funcnames:
            # setattr also works if cls is a new-style class, whose
            # __dict__ is read-only
            setattr(cls, funcname, Message.__dict__[funcname])

So I can do:

    import mechanize
    from some.module import monkeypatch_http_message
    browser = mechanize.Browser()
    
    # called this way, browser.retrieve returns a temporary filename
    # and an httplib.HTTPMessage instance
    (tmp_filename, headers) = browser.retrieve(someurl)

    # monkeypatch the httplib.HTTPMessage instance
    monkeypatch_http_message(headers)

    # yeah... my original filename, finally
    filename = headers.get_filename()
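
A possible alternative that avoids patching the class, sketched here under assumptions: re-parse the raw header text with the email package and ask the resulting email.message.Message for the filename. The helper name filename_from_raw_headers and the sample header text are hypothetical. With httplib I believe str(headers) yields the raw header text that could be passed in, but I have not checked that against mechanize.

```python
import email


def filename_from_raw_headers(raw_headers):
    """Return the Content-Disposition filename found in raw RFC822
    header text, or None if there is none (hypothetical helper)."""
    msg = email.message_from_string(raw_headers)
    return msg.get_filename()


# Made-up header block standing in for a real HTTP response's headers.
sample = (
    'Content-Type: application/octet-stream\n'
    'Content-Disposition: attachment; filename="fghi.xyz"\n'
    '\n'
)
original_name = filename_from_raw_headers(sample)
```

This parses the headers a second time, but it leaves httplib.HTTPMessage untouched for the rest of the program.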

----------
components: Library (Lib)
messages: 129348
nosy: jrjsmrtn
priority: normal
severity: normal
status: open
title: RFC822 header parsing API inconsistencies between httplib.HTTPMessage and email.message.Message
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11316>
_______________________________________

