[Tutor] Subprocess how to use?

Albert-Jan Roskam fomcl at yahoo.com
Thu Nov 6 09:55:00 CET 2014

On Thu, Nov 6, 2014 12:57 AM CET Cameron Simpson wrote:

>On 05Nov2014 14:05, jarod_v6 at libero.it <jarod_v6 at libero.it> wrote:
>> I need to use external program from my scirpt.
>> code = "intersectBed -a %s -b /database/Refseq_Gene2.bed  -wa -wb|cut -f 4,8|uniq > tmp.tmp1"%("gene5.tmp.bed")
>>    code2 = "intersectBed -a %s -b /database/Refseq_Gene2.bed -wa -wb|cut -f 4,8|uniq > tmp.tmp2"%("gene3.tmp.bed")
>>    proc = []
>>    p = subprocess.Popen(code,shell=True)
>>    proc.append(p)
>>    time.sleep(1) # seconds
>>    p = subprocess.Popen(code2,shell=True)
>>    proc.append(p)
>>    time.sleep(1)
>>    print >sys.stderr,'-------------------------------------------------------------------------'
>>    print >sys.stderr,"Waiting for Star Fusion Annotate to finish running..."
>>    for p in proc:
>>        p.communicate()
>First remark: as written above, your shell pipelines do not read any input and all their output goes to your temp files. Therefore using p.communicate() is a waste of time. Just use p.wait(); there is no I/O to do from the point of view of your Python code.
>Second remark: it is generally a bad idea to use Python's % operator to embed strings in shell commands (or other "parsed by someone else" text, such as SQL) because unless you are always very careful about the inserted string (eg "gene5.tmp.bed") you may insert punctuation, leading to bad behaviour. In your example code above you are sort of ok, and we can address this as a separate issue after you have things working.
>>    What append I don't habve any error however the file are truncated.
>You need to find out why.
>The shell code _will_ truncate "tmp.tmp1", but then we expect the output of "uniq" to appear there.
>Start by stripping back the shell pipeline.
>If you remove the "|uniq", is there anything in "tmp.tmp1"?
>If there is not, strip off the "|cut -f 4,8" as well and check again. At that point we're asking: does intersectBed actually emit any output?
>You can test that on the command line directly, and then build up the shell pipeline to what you have above by hand. When working, _then_ put it into your Python program.
>>    So I try to use subprocess.call
>>     p = subprocess.call(shlex.split(code),stdout = subprocess.PIPE, stderr = subprocess.STDOUT, shell = False)
>>     but with p.comunicate I have this error:
>>     ('\n***** ERROR: Unrecognized parameter: -wb|cut *****\n***** ERROR: Unrecognized parameter: >
>This is a standard mistake. shlex is  _not_ a full shell syntax parser. It is a simple parser that understands basic shell string syntax (quotes etc) but NOT redirections. It is really a tool for using with your own minilanguages, as you might if you were writing an interactive program that accepts user commands.
>You can't use shlex for shell pipelines or shell redirections.
>What is happening above is that all the "words" are being gathered up and passed to the "intersectBed" command as arguments; no pipelines! And "intersectBed" complains bitterly as you show.
>You can see this in the example you post below:
>> shlex.split(code)
>> Out[102]:
>> ['intersectBed',
>> '-a',
>> 'gene5.tmp.bed',
>> '-b',
>> '/home/maurizio/database/Refseq_Gene2.bed',
>> '-wa',
>> '-wb|cut',
>> '-f',
>> '4,8|uniq',
>> '>',
>> 'tmp.tmp1']
>> So what can I do? which system do you suggest to my Problem?
>First, forget about shlex. It does not do what you want.
>Then there are three approaches:
>1: Fix up your shell pipeline. You original code should essentially work: there is just something wrong with your pipeline. Strip it back until you can see the error.
>2: Construct the pipeline using Popen. This is complicated for someone new to Popen and subprocess. Essentially you would make each pipeline as 3 distinct Popen calls, using stdout=PIPE and attached that to the stdin of the next Popen.

Look for "connecting segments" on 

More information about the Tutor mailing list