codecs / subprocess interaction: utf help requested

John Machin sjmachin at
Mon Jun 11 00:10:40 CEST 2007

On Jun 11, 7:17 am, smitty1e <smitt... at> wrote:
> The first print statement does what you'd expect.
> The second print statement has rather a lot of rat in it.
> The goal here is to write a function that will return the man page for
> some command (mktemp used as a short example here) as text to client
> code, where the groff markup will be chopped to extract all of the
> command options.  Those options will eventually be used within an
> emacs mode, all things going swimmingly.
> I don't know what's going on with the piping in the second version.
> It looks like the output of p0 gets converted to unicode at some
> point,

Whatever gave you that idea?

> but I might be misunderstanding what's going on.  The 4.8
> codecs module documentation doesn't really offer much enlightenment,
> nor does Google.  About the only other place I can think to look would be
> the unit test cases shipped with python.

Get your head out of the red herring factory; unicode, "utf" (which
one?) and codecs have nothing to do with your problem. Think about
looking at your own code and at the bzip2 documentation.

> Sort of hoping one of the guru-level pythonistas can point to
> illumination, or write something to help out the next chap.  This
> might be one of those catalytic questions, the answer to which tackles
> five other questions you didn't really know you had.
> Thanks,
> Chris
> ---------------------------
> #!/usr/bin/python
> import subprocess
> p = subprocess.Popen(["bzip2", "-c", "-d", "/usr/share/man/man1/mktemp.1.bz2"],
>                      stdout=subprocess.PIPE)
> stdout, stderr = p.communicate()
> print stdout
> p0 = subprocess.Popen(["cat", "/usr/share/man/man1/mktemp.1.bz2"],
>                       stdout=subprocess.PIPE)
> p1 = subprocess.Popen(["bzip2"], stdin=p0.stdout,
>                       stdout=subprocess.PIPE)
> stdout, stderr = p1.communicate()
> print stdout
> ---------------------------

You left out the command-line options for bzip2. With no options, bzip2
compresses, so the "rat" that you saw was the result of compressing the
already-compressed man page. The bzip2 documentation is a bit obscure on
this point. The --help output from my copy of an antique (2001, v1.02)
bzip2 Windows port explains it plainly:
   If invoked as `bzip2', default action is to compress.
              as `bunzip2',  default action is to decompress.
              as `bzcat', default action is to decompress to stdout.

   If no file names are given, bzip2 compresses or decompresses
   from standard input to standard output.
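
For the next chap, here is a minimal sketch of the second version with the missing options put back. It is written for Python 3 rather than the Python 2 of the original post, it assumes `cat` and `bzip2` are on the PATH, and the function name `pipe_through_bzip2` and its `bzip2_args` parameter are made up for illustration.

```python
import subprocess


def pipe_through_bzip2(path, bzip2_args=("-d", "-c")):
    """Pipe `cat path` into bzip2 and return bzip2's stdout as bytes.

    Hypothetical helper. With the default arguments bzip2 decompresses
    standard input to standard output. With no arguments it compresses,
    which is what the second version in the post was doing.
    """
    p0 = subprocess.Popen(["cat", path], stdout=subprocess.PIPE)
    p1 = subprocess.Popen(
        ["bzip2"] + list(bzip2_args),
        stdin=p0.stdout,
        stdout=subprocess.PIPE,
    )
    # Close our copy of the pipe so cat sees a broken pipe if bzip2 exits early.
    p0.stdout.close()
    out, _ = p1.communicate()
    p0.wait()
    return out
```

Calling `pipe_through_bzip2(path, ())` should reproduce the original behaviour: the result is a bzip2 stream wrapped around another bzip2 stream, not text. The first version in the post, which hands the file name straight to bzip2, is simpler and does not need `cat` at all.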

