[Tutor] Help with iterators

Matthew Johnson mcooganj at gmail.com
Fri Mar 22 01:39:03 CET 2013

Dear list,

I have been trying to understand how to use iterators and, in
particular, groupby statements.  I am, however, quite lost.

I wish to subset the list below, selecting the observations whose
ID ('realtime_start') is no later than some date (I've used the
variable name maxDate) and, where more than one such record shares
the same 'date', returning only the one that has the largest ID.

The code below does the job; however, I have the impression that it
might be done in a more Pythonic way using iterators and groupby.

Could someone please help me understand how to go from this code to
the Pythonic idiom?

Thanks in advance,

Matt Johnson


## Code example

import pprint

obs = [{'date': '2012-09-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-10-15',
  'value': '231.951'},
 {'date': '2012-09-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-11-15',
  'value': '231.881'},
 {'date': '2012-10-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-11-15',
  'value': '231.751'},
 {'date': '2012-10-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2012-12-19',
  'value': '231.623'},
 {'date': '2013-02-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-03-21',
  'value': '231.157'},
 {'date': '2012-11-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2012-12-14',
  'value': '231.025'},
 {'date': '2012-11-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-01-19',
  'value': '231.071'},
 {'date': '2012-12-01',
  'realtime_end': '2013-02-18',
  'realtime_start': '2013-01-16',
  'value': '230.979'},
 {'date': '2012-12-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-02-19',
  'value': '231.137'},
 {'date': '2012-12-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-03-19',
  'value': '231.197'},
 {'date': '2013-01-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-02-21',
  'value': '231.198'},
 {'date': '2013-01-01',
  'realtime_end': '9999-12-31',
  'realtime_start': '2013-03-21',
  'value': '231.222'}]

maxDate = "2013-03-21"

dobs = dict([(d, []) for d in set([e['date'] for e in obs])])

# file each observation under its 'date'
for o in obs:
    dobs[o['date']].append(o)

dobs_subMax = dict([(k, [d for d in v if d['realtime_start'] <= maxDate])
                for k, v in dobs.items()])

rts = lambda x: x['realtime_start']

mmax = [sorted(e, key=rts)[-1] for e in dobs_subMax.values() if e]

mmax.sort(key = lambda x: x['date'])
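
For comparison, here is a rough sketch of how the same selection might look with itertools.groupby. It is only one possible approach: the helper name latest_per_date is made up for illustration, and the obs list is a shortened copy of the one above so that the snippet runs on its own. groupby only groups adjacent items, so the records are sorted by 'date' first. The string comparisons work because the dates are in ISO YYYY-MM-DD form.

```python
from itertools import groupby
from operator import itemgetter

# Shortened copy of the obs list above, so this sketch is self-contained.
obs = [
    {'date': '2012-09-01', 'realtime_start': '2012-10-15', 'value': '231.951'},
    {'date': '2012-09-01', 'realtime_start': '2012-11-15', 'value': '231.881'},
    {'date': '2013-02-01', 'realtime_start': '2013-03-21', 'value': '231.157'},
    {'date': '2012-10-01', 'realtime_start': '2012-11-15', 'value': '231.751'},
    {'date': '2012-10-01', 'realtime_start': '2012-12-19', 'value': '231.623'},
]
maxDate = "2013-03-21"


def latest_per_date(observations, max_date):
    """Hypothetical helper: for each 'date', return the record with the
    largest 'realtime_start' that is not later than max_date."""
    # Keep only the records whose realtime_start is <= max_date.
    eligible = [o for o in observations if o['realtime_start'] <= max_date]
    # groupby only merges neighbouring items, so sort by the grouping key.
    eligible.sort(key=itemgetter('date'))
    # One group per date; max() picks the latest realtime_start in each.
    return [max(group, key=itemgetter('realtime_start'))
            for _, group in groupby(eligible, key=itemgetter('date'))]


mmax = latest_per_date(obs, maxDate)
```

Because the input is sorted by 'date' before grouping, the result comes out in date order and needs no separate sort at the end.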

