converting a sed / grep / awk / . . . bash pipe line into python

Paul McGuire ptmcg at austin.rr.com
Wed Sep 3 01:43:26 EDT 2008


On Sep 2, 12:36 pm, hofer <bla... at dungeon.de> wrote:
> Hi,
>
> Something I have to do very often is filtering / transforming line
> based file contents and storing the result in an array or a
> dictionary.
>
> Very often the functionality already exists in the form of a shell script
> with sed / awk / grep , . . .
> and I would like to have the same implementation in my script
>

With all that sed'ing, grep'ing and awk'ing, you might want to take a look
at pyparsing.  Here is a pyparsing take on your posted problem:

from pyparsing import (LineEnd, Word, nums, LineStart, OneOrMore,
                       restOfLine)

test = """

1 2 3
47 23  // this will never match
  # blank lines are not of any interest
91 26

23 19

41 1 97 26 // extra numbers don't matter
"""

# define pyparsing expressions to match a line of integers
EOL = LineEnd()
integer = Word(nums)

# by default, pyparsing will implicitly skip over whitespace and
# newlines, so EOL is skipped over by default - this would mix
# together integers on consecutive lines - we only want OneOrMore
# integers as long as they are on the same line, that is, integers
# with no intervening EOL's
line_of_integers = (LineStart() + integer + OneOrMore(~EOL + integer))

# use a parse action to identify the target lines
def select_significant_values(t):
    v1, v2 = map(int, t[:2])
    if v1+v2 == 42:
        print(v2)
line_of_integers.setParseAction(select_significant_values)

# skip over comments, wherever they are
line_of_integers.ignore( '//' + restOfLine )
line_of_integers.ignore( '#' + restOfLine )

# use the line_of_integers expression to search through the test text
# the parse action will print the matching values
line_of_integers.searchString(test)
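
If pulling in pyparsing is more than you need, roughly the same filter
can be sketched with just the re module.  This is only a minimal sketch
under some assumptions: the names select_significant_values_plain and
sample are made up for illustration, it assumes Python 3, it treats
everything after '//' or '#' as a comment, and it returns the values in
a list instead of printing them, since you mentioned wanting the result
in an array.

```python
import re

sample = """

1 2 3
47 23  // this will never match
  # blank lines are not of any interest
91 26

23 19

41 1 97 26 // extra numbers don't matter
"""

# two leading integers on a line; anything after them is ignored
LEADING_PAIR = re.compile(r"\s*([0-9]+)\s+([0-9]+)\b")


def select_significant_values_plain(text, target=42):
    """Collect the second integer of each line whose first two
    integers add up to target (hypothetical helper, not pyparsing)."""
    results = []
    for line in text.splitlines():
        # drop '//' and '#' comments, wherever they start on the line
        line = re.split(r"//|#", line, maxsplit=1)[0]
        match = LEADING_PAIR.match(line)
        if match is None:
            continue  # blank line, comment-only line, or fewer than 2 ints
        v1, v2 = int(match.group(1)), int(match.group(2))
        if v1 + v2 == target:
            results.append(v2)
    return results


values = select_significant_values_plain(sample)
print(values)
```

Working line by line avoids the newline-skipping issue altogether, at
the cost of hand-rolling the comment handling that ignore() does in the
pyparsing version.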


-- Paul



