[issue11243] email/message.py str conversion

Steffen Daode Nurpmeso report at bugs.python.org
Tue Mar 15 14:43:41 CET 2011


Steffen Daode Nurpmeso <sdaoden at googlemail.com> added the comment:

On Tue, Mar 15, 2011 at 04:21:24AM +0000, R. David Murray wrote:
> Please test and let me know if it works; it should, since the code patch is very close to the one you suggested.

;-)
Hello David, hope you're having a good time at PyCon!
(I just Googled: the weather will be fine, so right afterwards all of you
will see the sun and the blue sky once again!
Hey -- there is a world out there!!) :)

As I said on EMAIL-SIG, you really have convinced me to simply
use the binary feedparser; but since you have found even more
places where an explicit str() is necessary, this package
is once again at least 50% better than before!

But I've re-added a
    email.header.make_header(email.header.decode_header(b))
call in my Ticket._bewitch_msg() and ran that patched S-Postman
;=) on a 3.8 MB mbox file (by the way, if you need f..d ..
emails, subscribe to OpenBSD Misc), and I end up like this:

Traceback (most recent call last):
  File "/Users/steffen/usr/bin/s-postman.py", line 1815, in _walk
    self._tickets.extend(Ticket.process_message(msg))
  File "/Users/steffen/usr/bin/s-postman.py", line 1671, in process_message
    return [Ticket(m, _targets=rsm.targets) for m in splitter]
  File "/Users/steffen/usr/bin/s-postman.py", line 1671, in <listcomp>
    return [Ticket(m, _targets=rsm.targets) for m in splitter]
  File "/Users/steffen/usr/bin/s-postman.py", line 1681, in __init__
    self._bewitch_msg()
  File "/Users/steffen/usr/bin/s-postman.py", line 1752, in _bewitch_msg
    self._msg[n] = email.header.make_header(email.header.decode_header(b))
  File "/Users/steffen/usr/opt/py3k/lib/python3.3/email/header.py", line 73, in decode_header
    if not ecre.search(header):
Exception: TypeError: expected string or buffer

Here: type(header)==<class 'email.header.Header'>
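
A minimal caller-side sketch of the workaround, assuming the header value may
arrive either as a str or as an email.header.Header: coerce it with str()
before handing it to decode_header(). The helper name rebuild_header is
hypothetical; it is not part of s-postman.py or of the email package.

```python
import email.header


def rebuild_header(value):
    """Rebuild a Header from `value`, which may be a str or a Header.

    Hypothetical helper, for illustration only.
    """
    # decode_header() applies a regex to its argument, so hand it text:
    # str() leaves a str untouched and flattens a Header to its text form.
    text = str(value)
    return email.header.make_header(email.header.decode_header(text))


# Works the same for a plain string and for a Header wrapping it.
encoded = "=?utf-8?q?caf=C3=A9?="
from_str = rebuild_header(encoded)
from_header = rebuild_header(email.header.Header(encoded))
print(str(from_str), str(from_header))
```

This keeps the coercion in the caller, so it does not depend on whether the
installed email package already accepts Header objects.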
And:

Traceback (most recent call last):
  File "/Users/steffen/usr/bin/s-postman.py", line 1815, in _walk
    self._tickets.extend(Ticket.process_message(msg))
  File "/Users/steffen/usr/bin/s-postman.py", line 1671, in process_message
    return [Ticket(m, _targets=rsm.targets) for m in splitter]
  File "/Users/steffen/usr/bin/s-postman.py", line 1671, in <listcomp>
    return [Ticket(m, _targets=rsm.targets) for m in splitter]
  File "/Users/steffen/usr/bin/s-postman.py", line 1681, in __init__
    self._bewitch_msg()
  File "/Users/steffen/usr/bin/s-postman.py", line 1752, in _bewitch_msg
    self._msg[n] = email.header.make_header(email.header.decode_header(b))
  File "/Users/steffen/usr/opt/py3k/lib/python3.3/email/header.py", line 154, in make_header
    h.append(s, charset)
  File "/Users/steffen/usr/opt/py3k/lib/python3.3/email/header.py", line 270, in append
    s = s.decode(input_charset, errors)
Exception: AttributeError: 'Header' object has no attribute 'decode'

Here type(s)==<class 'email.header.Header'>
And after adding
        # Steffen is out now
        if isinstance(s, email.header.Header):
            s = str(s)
I got stuck on this:

Traceback (raising call only):
  File "/Users/steffen/usr/opt/py3k/lib/python3.3/email/header.py", line 278, in append
    s.encode(output_charset, errors)
Exception: UnicodeEncodeError: 'ascii' codec can't encode character '\ufffd' in position 7: ordinal not in range(128)
{Aaaargh!  Special case UNICODE replacement character, mongrel!}

s was a Header here, too.
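
The last failure can be reproduced in isolation. A small sketch, assuming the
U+FFFD came from an undecodable byte that was decoded with errors="replace"
before the text was re-encoded as ASCII (an assumption about the cause, not
something the traceback proves); the sample bytes are made up.

```python
# Hypothetical raw header bytes containing one non-ASCII byte.
raw = b"Re: foo\xe9 bar"

# Decoding with errors="replace" substitutes U+FFFD for the bad byte.
text = raw.decode("ascii", errors="replace")

# Re-encoding strictly as ASCII then fails on that replacement character.
try:
    text.encode("ascii")
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first unencodable character

print(repr(text), failed_at)
```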
I attach a simple email_header.diff, which applies cleanly to a49bda.
I hope I could help a bit.

----------
Added file: http://bugs.python.org/file21210/email_header.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11243>
_______________________________________
-------------- next part --------------
diff --git a/Lib/email/header.py b/Lib/email/header.py
--- a/Lib/email/header.py
+++ b/Lib/email/header.py
@@ -70,7 +70,7 @@
     occurs (e.g. a base64 decoding exception).
     """
     # If no encoding, just return the header with no charset.
-    if not ecre.search(header):
+    if not ecre.search(str(header)):
         return [(header, None)]
     # First step is to parse all the encoded parts into triplets of the form
     # (encoded_string, encoding, charset).  For unencoded strings, the last
@@ -265,6 +265,9 @@
             charset = self._charset
         elif not isinstance(charset, Charset):
             charset = Charset(charset)
+        # Steffen is out now
+        if isinstance(s, email.header.Header):
+            s = str(s)
         if not isinstance(s, str):
             input_charset = charset.input_codec or 'us-ascii'
             s = s.decode(input_charset, errors)

