converting a sed / grep / awk / . . . bash pipe line into python
Peter Otten
__peter__ at web.de
Wed Sep 3 03:15:17 EDT 2008
hofer wrote:
> Something I have to do very often is filtering / transforming line
> based file contents and storing the result in an array or a
> dictionary.
>
> Very often the functionality already exists in the form of a shell script
> with sed / awk / grep , . . .
> and I would like to have the same implementation in my script.
>
> What's a compact, efficient (no intermediate arrays generated /
> regexps compiled only once) way in Python
> to write this kind of 'pipe line'?
>
> Example 1 (in bash), annotated with comments (and thus not working if
> copied / pasted):
> cat file \ ### read from file
> | sed 's/\/\/.*//' \ ### remove '//' comments
> | sed 's/#.*//' \ ### remove '#' comments
> | grep -v '^\s*$' \ ### get rid of empty lines
> | awk '{ print $1 + $2 " " $2 }' \ ### knowing that all remaining lines
> \ ### always contain at least two integers,
> \ ### calculate the sum and 'keep' the second number
> | grep '^42 ' ### keep lines for which sum is 42
> | awk '{ print $2 }' ### print number
> Thanks in advance for any suggestions on how to code this (keeping the
> comments).
for line in open("file"):  # read from file
    try:
        a, b = map(int, line.split(None, 2)[:2])  # remove extra columns,
                                                  # convert to integer
    except ValueError:
        pass  # remove comments, get rid of empty lines,
              # skip lines with fewer than two integers
    else:
        # line did start with two integers
        if a + b == 42:  # keep lines for which the sum is 42
            print(b)  # print number
The hard part was keeping the comments ;)
Without them it looks better:
import sys
for line in sys.stdin:
    try:
        a, b = map(int, line.split(None, 2)[:2])
    except ValueError:
        pass
    else:
        if a + b == 42:
            print(b)
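If you would rather keep the one-stage-per-command shape of the shell version, chained generators are one way to get it without intermediate lists and with the regexps compiled only once. This is only a sketch: the function names and the sample data are made up for illustration, and unlike awk (which treats non-numeric fields as 0) it skips lines that do not start with two integers.

```python
import re

# Compiled once, when the module is loaded.
SLASH_COMMENT = re.compile(r"//.*")
HASH_COMMENT = re.compile(r"#.*")


def strip_comments(lines):
    # Counterpart of the two sed stages: drop '//' and '#' comments.
    for line in lines:
        line = SLASH_COMMENT.sub("", line)
        yield HASH_COMMENT.sub("", line)


def non_blank(lines):
    # Counterpart of grep -v: skip empty or whitespace-only lines.
    return (line for line in lines if line.strip())


def first_two_ints(lines):
    # Counterpart of the first awk stage: parse the first two columns.
    for line in lines:
        fields = line.split()
        try:
            yield int(fields[0]), int(fields[1])
        except (IndexError, ValueError):
            continue  # assumption: silently skip lines without two integers


def second_where_sum_is(lines, target=42):
    # Chain the stages lazily; each line flows through all of them in turn.
    pairs = first_two_ints(non_blank(strip_comments(lines)))
    return (b for a, b in pairs if a + b == target)


# Hypothetical sample input; any iterable of lines (e.g. an open file) works.
sample = [
    "// a comment line\n",
    "40 2 trailing columns # sum is 42\n",
    "\n",
    "1 2\n",
    "20 22 // sum is 42\n",
]
result = list(second_where_sum_is(sample))
print(result)
```

Passing an open file object instead of `sample` reads the input line by line, and wrapping the call in `list(...)` or a dict comprehension gives the array or dictionary asked about in the original question.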
Peter