[Python3] Reading a binary file and writing the bytes verbatim in a utf-8 file

Chris Rebert clp2 at rebertia.com
Fri Apr 23 12:48:01 EDT 2010


On Fri, Apr 23, 2010 at 9:22 AM,  <fab at slick.airforce-one.org> wrote:
> I have to read the contents of a binary file (a PNG file exactly), and
> dump it into an RTF file.
>
> The RTF-file has been opened with codecs.open in utf-8 mode.
>
> As I expected, the utf-8 decoder

You mean encoder.

> chokes on some combinations of bits;

Well yeah, it's supposed to be getting *characters*, not bytes.

> how can I tell python to dump the bytes as they are, without
> interpreting them?

Go around the encoder and write bytes directly to the file:

# Disclaimer: Completely untested

import codecs

raw_rtf = open("path/to/rtf.rtf", 'wb') # binary mode, so it accepts bytes
png = open("path/to/png.png", 'rb') # binary mode, so read() returns bytes
writer_factory = codecs.getwriter('utf-8')

encoded_rtf = writer_factory(raw_rtf)
encoded_rtf.write("whatever text we want") # pass str; the writer encodes it
# ...write more text...

# flush buffers
encoded_rtf.reset()
raw_rtf.flush()

raw_rtf.write(png.read()) # write from bytes to bytes

png.close()
raw_rtf.close()
#END code
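
The same idea can also be sketched without codecs at all: open the
output in binary mode and call .encode() on the text yourself. This is
only a minimal sketch; the helper name write_text_then_bytes, the
temporary path, and the 8-byte PNG signature standing in for a real
image are all made up for illustration.

```python
import os
import tempfile


def write_text_then_bytes(path, text, payload):
    """Write `text` encoded as UTF-8, then `payload` verbatim, to `path`."""
    with open(path, "wb") as out:        # binary mode: no implicit encoding
        out.write(text.encode("utf-8"))  # encode the characters ourselves
        out.write(payload)               # raw bytes pass through untouched


# Stand-in data: the PNG file signature instead of a real image file.
fake_png = b"\x89PNG\r\n\x1a\n"
demo_path = os.path.join(tempfile.mkdtemp(), "demo.rtf")
write_text_then_bytes(demo_path, "h\u00e9llo ", fake_png)

# Read the file back as bytes to check what actually landed on disk.
with open(demo_path, "rb") as f:
    written = f.read()
```

Since only one file object is involved, there is no separate encoder
buffer to flush before the raw bytes go out.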

I have no idea how you'd go about reading the contents of such a file
in a sensible way.
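
One possibility, if you control both the writing and the reading side,
is to remember the byte offset at which the text ends and split there
when reading. This is a sketch under that assumption only; write_mixed
and read_mixed are hypothetical helpers, and an in-memory BytesIO
stands in for the real file.

```python
import io


def write_mixed(out, text, payload):
    """Write UTF-8 text, then raw bytes; return the offset where the bytes start."""
    out.write(text.encode("utf-8"))
    offset = out.tell()  # position just past the encoded text
    out.write(payload)
    return offset


def read_mixed(data, offset):
    """Split `data` at `offset`: decode the front as UTF-8, keep the rest as bytes."""
    return data[:offset].decode("utf-8"), data[offset:]


buf = io.BytesIO()  # in-memory stand-in for a file opened with 'wb'
payload = b"\x89PNG\r\n\x1a\n\xff\xfe"  # not valid UTF-8 on its own
offset = write_mixed(buf, "caf\u00e9: ", payload)
text, blob = read_mixed(buf.getvalue(), offset)
```

Without the offset (or some other marker), decoding the whole file as
utf-8 will generally fail as soon as it reaches the binary part.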

Cheers,
Chris
--
http://blog.rebertia.com


