File to dict

Matt Nordhoff mnordhoff at mattnordhoff.com
Fri Dec 7 07:09:51 EST 2007


Chris wrote:
> For the first one you are parsing the entire file everytime you want
> to lookup just one domain.  If it is something reused several times
> during your code execute you could think of rather storing it so it's
> just a simple lookup away, for eg.
> 
> _domain_dict = dict()
> def generate_dict(input_file):
>     finput = open(input_file, 'rb')
>     global _domain_dict
>     for each_line in enumerate(finput):
>         line = each_line.strip().split(':')
>         if len(line)==2: _domain_dict[line[0]] = line[1]
> 
>     finput.close()
> 
> def domain_lookup(domain_name):
>     global _domain_dict
>     try:
>         return _domain_dict[domain_name]
>     except KeyError:

What about this?

_domain_dict = dict()
def generate_dict(input_file):
    global _domain_dict
    # If it's already been run, do nothing. You might want to change
    # this.
    if _domain_dict:
        return
    fh = open(input_file, 'rb')
    try:
        for line in fh:
            line = line.strip().split(':', 1)
            if len(line) == 2:
                _domain_dict[line[0]] = line[1]
    finally:
        fh.close()

def domain_lookup(domain_name):
    return _domain_dict.get(domain_name)

I changed generate_dict to do nothing if the dict has already been
populated. (You might want it to run again with a fresh dict, or raise
an error instead.)

I removed enumerate() because it's unnecessary (and wrong -- you were
trying to split a tuple of (index, line)).
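
To see the problem in isolation, here is a small sketch with made-up sample lines standing in for the file:

```python
lines = ["example.com:10.0.0.1\n", "example.org:10.0.0.2\n"]  # made-up sample data

for item in enumerate(lines):
    # Each item is a tuple such as (0, 'example.com:10.0.0.1\n'), not a string.
    try:
        item.strip()
    except AttributeError as exc:
        print("cannot strip a tuple:", exc)

# Iterating over the lines directly gives strings, which is what we want.
parsed = [line.strip().split(':', 1) for line in lines]
print(parsed)
```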

I also changed the split to only split once, like Duncan Booth suggested.
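
The difference only shows up when the value itself contains a colon. A quick sketch, using a made-up line:

```python
line = "example.com:2001:db8::1"  # made-up line whose value contains colons

all_parts = line.split(':')       # splits on every colon
once = line.split(':', 1)         # splits on the first colon only

print(len(all_parts))  # 5, so a len(...) == 2 check would skip this line
print(once)            # ['example.com', '2001:db8::1']
```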

The try-finally is to ensure that the file is closed if an exception is
thrown for some reason.
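
On Python 2.6 and later (or 2.5 with "from __future__ import with_statement"), a with block gives the same guarantee with less typing. A small sketch, using a throwaway temp file and a simulated error:

```python
import os
import tempfile

# Throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("example.com:10.0.0.1\n")

fh = open(tmp.name)
try:
    with fh:
        raise ValueError("simulated failure while parsing")
except ValueError:
    pass

print(fh.closed)  # True: the with block closed the file despite the exception
os.remove(tmp.name)
```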

domain_lookup doesn't need to declare _domain_dict as global because
it never assigns to it. .get() returns None if the key doesn't exist,
so the function now returns None for unknown domains. You might want to
return a different default instead, or raise an exception (perhaps by
using _domain_dict[domain_name] and not catching the KeyError).
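
Both alternatives might look something like this. The default parameter and the domain_lookup_strict name are hypothetical, and the dict contents are made up for illustration:

```python
_domain_dict = {"example.com": "10.0.0.1"}  # sample data, made up for illustration

def domain_lookup(domain_name, default=None):
    # `default` is a hypothetical extra parameter; dict.get returns it on a miss.
    return _domain_dict.get(domain_name, default)

def domain_lookup_strict(domain_name):
    # Hypothetical variant: plain indexing lets KeyError propagate to the caller.
    return _domain_dict[domain_name]
```

Which one fits depends on whether a missing domain is an expected case or a bug in the caller.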

Other than that, I just reformatted it and renamed variables, because I
do that. :-P