converting a sed / grep / awk / . . . bash pipe line into python

Peter Otten __peter__ at web.de
Wed Sep 3 03:15:17 EDT 2008


hofer wrote:

> Something I have to do very often is filtering / transforming line
> based file contents and storing the result in an array or a
> dictionary.
> 
> Very often the functionality already exists in the form of a shell
> script with sed / awk / grep, . . .
> and I would like to have the same implementation in my script.
> 
> What's a compact, efficient (no intermediate arrays generated /
> regexps compiled only once) way in Python
> to write this kind of 'pipe line'?
> 
> Example 1 (in bash; annotated with comments, thus not working if
> copied / pasted):
> 
> cat file \                   ### read from file
> | sed 's/\/\/.*//' \        ### remove '//' comments
> | sed 's/#.*//' \           ### remove '#' comments
> | grep -v '^\s*$'  \        ### get rid of empty lines
> | awk '{ print $1 + $2 " " $2 }' \ ### knowing that all remaining lines always contain
> \                                  ### at least two integers, calculate sum and 'keep' second number
> | grep '^42 ' \               ### keep lines for which sum is 42
> | awk '{ print $2 }'         ### print number
> 
> Thanks in advance for any suggestions on how to code this (keeping the
> comments).

for line in open("file"): # read from file
    try:
        a, b = map(int, line.split(None, 2)[:2]) # remove extra columns,
                                                 # convert to integer
    except ValueError:
        pass # remove comments, get rid of empty lines,
             # skip lines with less than two integers
    else:
        # line did start with two integers
        if a + b == 42: # keep lines for which the sum is 42
            print(b) # print number

The hard part was keeping the comments ;)

Without them it looks better:

import sys
for line in sys.stdin:
    try:
        a, b = map(int, line.split(None, 2)[:2])
    except ValueError:
        pass
    else:
        if a + b == 42:
            print(b)
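
If you would rather keep one Python step per shell stage, chained generators are one way to do it. What follows is only a sketch under a few assumptions: the names strip_comments, drop_blank, sum_and_second and pipeline are made up for illustration, and every line that survives the filters is assumed to start with two integers, as the question states. Nothing is held in an intermediate list, and the two regexps are compiled once.

```python
import re

# Compiled once, at import time.
SLASH_COMMENT = re.compile(r"//.*")
HASH_COMMENT = re.compile(r"#.*")

def strip_comments(lines):
    # sed 's/\/\/.*//' | sed 's/#.*//'
    for line in lines:
        yield HASH_COMMENT.sub("", SLASH_COMMENT.sub("", line))

def drop_blank(lines):
    # grep -v '^\s*$'
    return (line for line in lines if line.strip())

def sum_and_second(lines):
    # awk '{ print $1 + $2 " " $2 }'
    # Assumes at least two integer fields per line (see the question);
    # anything else raises IndexError or ValueError.
    for line in lines:
        fields = line.split()
        a, b = int(fields[0]), int(fields[1])
        yield a + b, b

def pipeline(lines, target=42):
    # grep '^42 ' | awk '{ print $2 }'
    pairs = sum_and_second(drop_blank(strip_comments(lines)))
    return (b for total, b in pairs if total == target)

sample = ["40 2 // answer\n", "# only a comment\n", "\n", "1 2\n", "41 1#x\n"]
print(list(pipeline(sample)))  # [2, 1]
```

Any iterable of lines works as input, so an open file or sys.stdin can be passed directly, and list(pipeline(f)) collects the numbers into a list. Unlike the try/except version, this one also accepts a comment that follows a number without a space, such as "41 1#x", but it fails loudly on a line with fewer than two integers instead of skipping it.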

Peter


