[Tutor] how to get blank value

Paul McGuire ptmcg at austin.rr.com
Wed Jul 29 06:36:49 CEST 2009


Ok, I've seen various passes at this problem using regex, split('='), etc.,
but the solutions seem fairly fragile, and the OP doesn't seem happy with
any of them.  Here is how this problem looks if you were going to try
breaking it up with pyparsing:
- Each line starts with an integer, followed by the string "ALA"
- "ALA" is followed by a series of "X = 1.2"-type attributes, where the
value part might be missing.

And to implement (with a few bells and whistles thrown in for free):

data = """48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
85 ALA H = 8.60 N =  CA =  HA = 4.65 C =""".splitlines()


from pyparsing import *

# define some basic data expressions
integer = Word(nums)
real = Combine(Word(nums) + "." + Word(nums))

# use parse actions to automatically convert numeric 
# strings to actual numbers at parse time
integer.setParseAction(lambda tokens:int(tokens[0]))
real.setParseAction(lambda tokens:float(tokens[0]))

# define expressions for 'X = 1.2' assignments; note that the
# value might be missing, so use Optional - we'll fill in
# a default value of 0.0 if no value is given
keyValue = Word(alphas.upper()) + '=' + \
            Optional(real|integer, default=0.0)

# define overall expression for the data on a line
dataline = integer + "ALA" + OneOrMore(Group(keyValue))("kvdata")
    
# attach parse action to define named values in the returned tokens
def assignDataByKey(tokens):
    for k,_,v in tokens.kvdata:
        tokens[k] = v
dataline.setParseAction(assignDataByKey)

# for each line in the input data, parse it and print some of the
# data fields
for d in data:
    print(d)
    parsedData = dataline.parseString(d)
    print(parsedData.dump())
    print(parsedData.CA)
    print(parsedData.N)
    print("")


Prints out:

48 ALA H = 8.33 N = 120.77 CA = 55.18 HA = 4.12 C = 181.50
[48, 'ALA', ['H', '=', 8.3300000000000001], ['N', '=', 120.77], ['CA', '=',
55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
- C: 181.5
- CA: 55.18
- H: 8.33
- HA: 4.12
- N: 120.77
- kvdata: [['H', '=', 8.3300000000000001], ['N', '=', 120.77], ['CA', '=',
55.18], ['HA', '=', 4.1200000000000001], ['C', '=', 181.5]]
55.18
120.77

104 ALA H = 7.70 N = 121.21 CA = 54.32 HA = 4.21 C =
[104, 'ALA', ['H', '=', 7.7000000000000002], ['N', '=', 121.20999999999999],
['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
- C: 0.0
- CA: 54.32
- H: 7.7
- HA: 4.21
- N: 121.21
- kvdata: [['H', '=', 7.7000000000000002], ['N', '=', 121.20999999999999],
['CA', '=', 54.32], ['HA', '=', 4.21], ['C', '=', 0.0]]
54.32
121.21

85 ALA H = 8.60 N =  CA =  HA = 4.65 C =
[85, 'ALA', ['H', '=', 8.5999999999999996], ['N', '=', 0.0], ['CA', '=',
0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
- C: 0.0
- CA: 0.0
- H: 8.6
- HA: 4.65
- N: 0.0
- kvdata: [['H', '=', 8.5999999999999996], ['N', '=', 0.0], ['CA', '=',
0.0], ['HA', '=', 4.6500000000000004], ['C', '=', 0.0]]
0.0
0.0
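
If 0.0 can be a legitimate measurement, defaulting blanks to 0.0 hides which
values were actually missing. As a point of comparison (not part of the
pyparsing solution above), here is a minimal stdlib-only sketch in Python 3
that handles the same line layout and records blanks as None. The names PAIR
and parse_line are made up for this illustration, and the regex assumes
unsigned numbers and upper-case keys, as in the sample data.

```python
import re

# One "KEY = value" pair. The value group is optional, which plays the
# same role as Optional(real|integer) in the grammar above.
# Assumption: keys are upper-case letters and values are unsigned numbers.
PAIR = re.compile(r"([A-Z]+)\s*=\s*(\d+(?:\.\d+)?)?")


def parse_line(line):
    """Return (leading_integer, {key: float or None}) for one data line."""
    head, sep, rest = line.partition(" ALA ")
    if not sep:
        raise ValueError("expected '<integer> ALA ...': %r" % line)
    values = {}
    for key, raw in PAIR.findall(rest):
        # findall yields '' for a group that did not take part in the
        # match, so an empty string here means the value was blank.
        values[key] = float(raw) if raw else None
    return int(head), values


if __name__ == "__main__":
    number, values = parse_line("85 ALA H = 8.60 N =  CA =  HA = 4.65 C =")
    print(number)
    for key, value in values.items():
        print(key, "(blank)" if value is None else value)
```

For the third sample line, parse_line returns 85 together with a dict in
which N, CA and C are None while H and HA hold floats, so a blank can be
told apart from a real 0.0.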


Learn more about pyparsing at http://pyparsing.wikispaces.com.

-- Paul



