Searching a large dictionary

Sean DiZazzo half.italian at gmail.com
Thu Sep 24 02:03:26 CEST 2009


On Sep 23, 4:05 pm, "Rhodri James" <rho... at wildebst.demon.co.uk>
wrote:
> On Wed, 23 Sep 2009 22:52:56 +0100, mike171562
> <support.desk.... at gmail.com> wrote:
> > Sorry for the confusion, Simon, this is almost exactly what I need,
> > but i need to be able to search for a string in a given value of an
> > item
>
> > Here is an example of the dict I am working with
>
> > {'252': [{'code': '51679', 'date': '2009-08-01 11:35:38', 'userfield':
> > '252', 'from': '9876662881', 'to': '19877760406', 'fpld': '"Foobar"
> > <9855562881>', 'result': 'ANSW', 'sec': 131}, {'code': '51679',
> > 'date': '2009-08-01 14:33:55', 'userfield': '252', 'from':
> > '9876662881', 'to': '19877770391', 'fpld': '"Foobar" <9876555881>',
> > 'result': 'ANSW', 'sec': 86}]}
>
> Ugh.  This isn't, you'll notice, either of the versions of the structure
> that you told us it was.  Let's prettyprint it for added clarity:
>
> { '252': [ { 'code': '51679',
>               'date': '2009-08-01 11:35:38',
>               'userfield': '252',
>               'from': '9876662881',
>               'to': '19877760406',
>               'fpld': '"Foobar" <9855562881>',
>               'result': 'ANSW',
>               'sec': 131
>             },
>             { 'code': '51679',
>               'date': '2009-08-01 14:33:55',
>               'userfield': '252',
>               'from': '9876662881',
>               'to': '19877770391',
>               'fpld': '"Foobar" <9876555881>',
>               'result': 'ANSW',
>               'sec': 86
>             }
>           ]
>
> }
>
> A dictionary containing a single entry, whose value is a list of
> dictionaries.
>
>
>
> > 252 being the key,
>
> This vaguely implies that 252 (or rather '252') is the *only*
> key.  I hope that's not true, for it would be a silly dictionary.
>
> > I need to be able to search for a string in a given
> > item , say 777 in the 'to' field so
>
> > print wtf(dict,'to','777')
>
> > would return
>
> > {'252': [{'code': '51679', 'date': '2009-08-01 11:35:38', 'userfield':
> > '252', 'from': '9876662881', 'to': '19877760406', 'fpld': '"Foobar"
> > <9855562881>', 'result': 'ANSW', 'billsec': 131}, {'code': '51679',
> > 'date': '2009-08-01 14:33:55', 'userfield': '252', 'from':
> > '9876662881', 'to': '19877770391', 'fpld': '"Foobar" <9876555881>',
> > 'result': 'ANSW', 'billsec': 86}]}
>
> Which would be the entire original structure, except for some reason
> relabelling 'sec' as 'billsec' telling us very little really.
>
>
>
> > I hope this makes sense, sorry for not being clear
>
> Not a lot, frankly.  There are still a lot of unanswered questions here.
> Would I be right in thinking that you want to check all keys of the outer
> dictionary, and you want the corresponding values in the returned
> dictionary to be lists of all inner dictionaries that match?  And that
> the business with sec/billsec is a typo?
>
> Assuming that, try this:
>
> # Untested, and it is midnight
> def wtf(data_dict, key, value):
>    # Don't call it 'dict', by the way, you'll mask the builtin object
>    result = {}
>    for outer_key, data_list in data_dict.items():
>      new_list = []
>      for inner_data in data_list:
>        if value in inner_data[key]:
>          new_list.append(inner_data)
>      if new_list:
>        result[outer_key] = new_list
>    return result
>
> Note that this won't work on the 'sec' field, since it's a number rather
> than a string.
>
> --
> Rhodri James *-* Wildebeest Herder to the Masses
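
As an aside on the quoted function's caveat about the 'sec' field: below is a minimal sketch of a variant that copes with numeric values and with records that lack the field. The name search_records and the cut-down sample data are mine, not from the thread, and coercing values with str() is an assumption about what a "match" on a number should mean.

```python
def search_records(data, field, needle):
    """Return {outer_key: [matching inner dicts]} using a substring match."""
    result = {}
    for outer_key, records in data.items():
        # str() lets the substring test work on numbers such as 'sec';
        # .get() treats a record without the field as an empty string
        # instead of raising KeyError.
        matches = [r for r in records if needle in str(r.get(field, ""))]
        if matches:
            result[outer_key] = matches
    return result


# Cut-down version of the sample data from the thread.
calls = {
    '252': [
        {'to': '19877760406', 'sec': 131},
        {'to': '19877770391', 'sec': 86},
    ],
}

print(search_records(calls, 'to', '7777'))
print(search_records(calls, 'sec', '86'))
```

Note that searching 'to' for '777' returns both records, since both numbers contain that substring; '7777' narrows it to the second one.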

I like to "objectify" nested dictionary data like this, that is, load
it into classes so each field becomes an attribute.

class TestPart(object):
    """One inner dictionary, with each key exposed as an attribute."""
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)


class Test(object):
    """One outer key together with its list of TestPart entries."""
    def __init__(self):
        self.name = None
        self.entries = []

    def set(self, name, listofdicts):
        self.name = name
        for entry in listofdicts:
            self.entries.append(TestPart(**entry))

    def search(self, key, value):
        # Substring match, so this only works on string fields (not 'sec').
        rets = []
        for t in self.entries:
            if value in getattr(t, key):
                rets.append(t)
        return rets

if __name__ == "__main__":
    d = {'252': [{'code': '51679', 'date': '2009-08-01 11:35:38',
                  'userfield': '252', 'from': '9876662881',
                  'to': '19877760406', 'fpld': '"Foobar" <9855562881>',
                  'result': 'ANSW', 'sec': 131},
                 {'code': '51679', 'date': '2009-08-01 14:33:55',
                  'userfield': '252', 'from': '9876662881',
                  'to': '19877770391', 'fpld': '"Foobar" <9876555881>',
                  'result': 'ANSW', 'sec': 86}]}
    items = []
    for k, v in d.items():
        t = Test()
        t.set(k, v)
        items.append(t)

    for i in items:
        got = i.search("code", "9")
        print(got)
        for r in got:
            print(r.code)

It's not quite right, but you get the basic idea.  I just find it
easier to wrap my head around structures built like this rather than
trying to remember a lot of inner/outer index variables, etc.
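
One wrinkle with turning keys into attributes on this particular data: 'from' is a Python keyword, so that field can't be read with plain dot syntax. A small sketch of the workaround, using a hypothetical Record class that does the same thing as TestPart above (the class name and the trimmed record are just for illustration):

```python
class Record(object):
    """Hypothetical stand-in for TestPart: one attribute per dictionary key."""
    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)


rec = Record(**{'from': '9876662881', 'to': '19877770391', 'sec': 86})

# rec.to works, but writing rec.from is a SyntaxError because 'from' is a
# keyword, so that field has to be read through getattr().
print(rec.to)
print(getattr(rec, 'from'))

# vars() hands back the underlying attribute dictionary when needed.
print(sorted(vars(rec)))
```

This is also why search() going through getattr(t, key) works for every field name, including 'from'.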

~Sean


