Python 3 how to convert a list of bytes objects to a list of strings?
Chris Green
cl at isbd.net
Sat Aug 29 11:50:09 EDT 2020
Chris Green <cl at isbd.net> wrote:
> Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> > On Fri, 28 Aug 2020 12:26:07 +0100, Chris Green <cl at isbd.net> declaimed the
> > following:
> >
> >
> >
> > >Maybe I shouldn't but Python 2 has been managing to do so for several
> > >years without any issues. I know I *could* put the exceptions in a
> > >bucket somewhere and deal with them separately but I'd really rather
> > >not.
> > >
> >
> > In Python2 "string" IS BYTE-STRING. It is never UNICODE, and ignores
> > any encoding.
> >
> > So, for Python3, the SAME processing requires NOT USING "string" (which
> > is now Unicode) and ensuring that all literals are b"stuff", and using the
> > methods of the bytes data type.
> >
> Now I'm beginning to realise that *this* may well be what I need to
> do, after going round in several convoluted circles! :-)
>
However, the problem appears to be that internally the Python 3 mailbox
class assumes it's being given 'ascii'. Here's the error (and I'm doing
no processing of the message at all):-
Traceback (most recent call last):
  File "/home/chris/.mutt/bin/filter.py", line 102, in <module>
    mailLib.deliverMboxMsg(dest, msg, log)
  File "/home/chris/.mutt/bin/mailLib.py", line 52, in deliverMboxMsg
    mbx.add(msg)
  File "/usr/lib/python3.8/mailbox.py", line 603, in add
    self._toc[self._next_key] = self._append_message(message)
  File "/usr/lib/python3.8/mailbox.py", line 758, in _append_message
    offsets = self._install_message(message)
  File "/usr/lib/python3.8/mailbox.py", line 830, in _install_message
    self._dump_message(message, self._file, self._mangle_from_)
  File "/usr/lib/python3.8/mailbox.py", line 215, in _dump_message
    gen.flatten(message)
  File "/usr/lib/python3.8/email/generator.py", line 116, in flatten
    self._write(msg)
  File "/usr/lib/python3.8/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/lib/python3.8/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/lib/python3.8/email/generator.py", line 432, in _handle_text
    super(BytesGenerator,self)._handle_text(msg)
  File "/usr/lib/python3.8/email/generator.py", line 249, in _handle_text
    self._write_lines(payload)
  File "/usr/lib/python3.8/email/generator.py", line 155, in _write_lines
    self.write(line)
  File "/usr/lib/python3.8/email/generator.py", line 406, in write
    self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
Any message containing anything other than ASCII is going to have bytes
> 127 unless it's encoded in some way to make it 7-bit, and that's not
going to happen in the general case.
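
For what it's worth, here is a minimal sketch of one way this same
UnicodeEncodeError can arise, and of how parsing the raw bytes avoids
it. It assumes (the traceback doesn't show this) that the message
reached mbx.add() as an email Message built from an already-decoded
str. The sample message in `raw` and the helper `flatten_to_bytes` are
made up for illustration; they are not taken from filter.py or
mailLib.py.

```python
import email
from email.generator import BytesGenerator
from io import BytesIO

# Hypothetical 8-bit message: UTF-8 body, no charset or transfer encoding declared.
raw = b"From: a@example.com\nSubject: test\n\n\xe2\x82\xac100 due\n"


def flatten_to_bytes(msg):
    """Flatten a Message with BytesGenerator, the class seen in the traceback."""
    buf = BytesIO()
    BytesGenerator(buf).flatten(msg)
    return buf.getvalue()


# Case 1: decode to str first, then parse. The payload now holds a real
# non-ASCII character, which the final encode('ascii', 'surrogateescape')
# in BytesGenerator.write() cannot represent.
msg_from_text = email.message_from_string(raw.decode("utf-8"))
try:
    flatten_to_bytes(msg_from_text)
    failed_on_text = False
except UnicodeEncodeError as exc:
    failed_on_text = True
    print("str-parsed message failed:", exc)

# Case 2: parse the bytes directly. Bytes > 127 are kept as surrogate
# escapes, which the same encode() call turns back into the original bytes.
msg_from_bytes = email.message_from_bytes(raw)
round_tripped = flatten_to_bytes(msg_from_bytes)
print("bytes-parsed message flattened to", len(round_tripped), "bytes")
```

If that assumption matches the real script, reading the incoming mail
in binary (for example from sys.stdin.buffer) and using
email.message_from_bytes() might be enough. mailbox's add() is also
documented to accept a bytes object directly.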
--
Chris Green