Multiline regex help

Yatima yatima_ at konishi.polis.net
Thu Mar 3 16:40:19 EST 2005


On Thu, 03 Mar 2005 16:25:39 -0500, Kent Johnson <kent37 at tds.net> wrote:
> Here is another attempt. I'm still not sure I understand what form you want the data in. I made a 
> dict -> dict -> list structure so if you look up e.g. scores['10/11/04']['60'] you get a list of all 
> the RelevantInfo2 values for Relevant1='10/11/04' and Relevant3='60'.
>
> The parser is a simple-minded state machine that will misbehave if the input does not have entries 
> in the order Relevant1, Relevant2, Relevant3 (with as many intervening lines as you like).
>
> All three values are available when Relevant3 is detected so you could do something else with them 
> if you want.
>
> HTH
> Kent
>
> import cStringIO
>
> raw_data = '''Gibberish
> 53
> MoreGarbage
[mass snippage]
> 60
> Lalala'''
> raw_data = cStringIO.StringIO(raw_data)
>
> scores = {}
> info1 = info2 = info3 = None
>
> for line in raw_data:
>      if line.startswith('RelevantInfo1'):
>          info1 = raw_data.next().strip()
>      elif line.startswith('RelevantInfo2'):
>          info2 = raw_data.next().strip()
>      elif line.startswith('RelevantInfo3'):
>          info3 = raw_data.next().strip()
>          scores.setdefault(info1, {}).setdefault(info3, []).append(info2)
>          info1 = info2 = info3 = None
>
> print scores
> print scores['10/11/04']['60']
> print scores['10/10/04']['23']
>
> ## prints:
> {'10/10/04': {'44': ['33'], '23': ['22', '22']}, '10/11/04': {'60': ['45']}}
> ['45']
> ['22', '22']

Thank you so much. Your solution and Steve's both give me what I'm looking
for. I appreciate both of your incredibly quick replies!
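
For anyone finding this in the archives on Python 3, where cStringIO and
the .next() method are gone, here is a rough sketch of the same state
machine. The sample input is made up for illustration (the real data was
snipped above), and the function name parse_scores is my own invention,
not something from Kent's post. Like the original, it assumes the entries
arrive in the order RelevantInfo1, RelevantInfo2, RelevantInfo3.

```python
# Hypothetical sample input; the real data was snipped from the quoted message.
raw_text = """Gibberish
RelevantInfo1
10/10/04
MoreGarbage
RelevantInfo2
22
Filler
RelevantInfo3
23
RelevantInfo1
10/10/04
RelevantInfo2
33
RelevantInfo3
44
RelevantInfo1
10/11/04
RelevantInfo2
45
RelevantInfo3
60
Lalala"""


def parse_scores(text):
    """Return a dict -> dict -> list: Info1 -> Info3 -> [Info2 values]."""
    lines = iter(text.splitlines())
    scores = {}
    info1 = info2 = None
    for line in lines:
        # Each marker line is followed by the line holding its value.
        if line.startswith('RelevantInfo1'):
            info1 = next(lines, '').strip()
        elif line.startswith('RelevantInfo2'):
            info2 = next(lines, '').strip()
        elif line.startswith('RelevantInfo3'):
            info3 = next(lines, '').strip()
            # All three values are known here, so store the record.
            scores.setdefault(info1, {}).setdefault(info3, []).append(info2)
            info1 = info2 = None
    return scores


scores = parse_scores(raw_text)
print(scores)
print(scores['10/11/04']['60'])
```

The shared iterator is what lets the loop body pull the value line with
next() without the for loop seeing it again.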

Take care.

-- 
You worry too much about your job.  Stop it.  You are not paid enough to worry.


