[Tutor] (no subject)

Mats Wichmann mats at wichmann.us
Wed Apr 25 14:19:28 EDT 2018


On 12/31/1969 05:00 PM,  wrote:
> Hello everybody,
> I'm coming from a Perl background and try to parse some Exim Logfiles into a
> data structure of dictionaries. The regex and geoip part works fine and I'd
> like to save the email address, the countries (from logins) and the count of
> logins.
> 
> The structure I'd like to have:
> 
> result = {
>         'foo at bar.de': {
>             'Countries': [DE,DK,UK]
>             'IP': ['192.168.1.1','172.10.10.10']
>             'Count': [12]
>             }
>         'bar at foo.de': {
>             'Countries': [DE,SE,US]
>             'IP': ['192.168.1.2','172.10.10.11']
>             'Count': [23]
>             }
>         }

I presume that's pseudo-code, since it's missing punctuation (commas
between elements) and the country codes are not quoted....
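
Written out as valid Python, that structure could look like the sketch
below. The keys and values are copied from the example above (including
the obfuscated addresses as they appear here); nothing else is assumed.

```python
# The structure from the question as a valid dict literal:
# commas between elements, and the country codes quoted as strings.
result = {
    'foo at bar.de': {
        'Countries': ['DE', 'DK', 'UK'],
        'IP': ['192.168.1.1', '172.10.10.10'],
        'Count': [12],
    },
    'bar at foo.de': {
        'Countries': ['DE', 'SE', 'US'],
        'IP': ['192.168.1.2', '172.10.10.11'],
        'Count': [23],
    },
}

# Nested lookups chain the keys; note that 'Countries' must be quoted.
has_dk = 'DK' in result['foo at bar.de']['Countries']
print(has_dk)
```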

> 
> I don't have a problem when I do these three separately like this with a
> one-dimensional dict (snippet):
> 
> result = defaultdict(list)
> 
> with open('/var/log/exim4/mainlog',encoding="latin-1") as logfile:
>     for line in logfile:
>         result = pattern.search(line)
>         if (result):
>             login_ip = result.group("login_ip")
>             login_auth =  result.group("login_auth")
>             response = reader.city(login_ip)
>             login_country = response.country.iso_code
>             if login_auth in result and login_country in result[login_auth]:
>                 continue
>             else:
>                 result[login_auth].append(login_country)
>         else:
>             continue
> 
> This checks if the login_country exists within the list of the specific
> login_auth key, adds them if they don't exist and gives me the results I want.
> This also works for the ip addresses and the number of logins without any problems.
>
> As I don't want to repeat these loops three times with three different data
> structures I'd like to do this in one step. There are two main problems I
> don't understand right now:
> 
> 1. How do I check whether a value exists in a list that is the value of a key,
> which is itself the value of another key? What I'd like to do:
> 
>  if login_auth in result and (login_country in result[login_auth][Countries])
>   continue

You don't actually need to check first. There's a Python aphorism that
goes "It's easier to ask forgiveness than permission" (EAFP).

You can do:

try:
    result[login_auth]['Countries'].append(login_country)
except KeyError:
    # No entry for login_auth yet, so add one here
    # (assumes result is a plain dict, e.g. result = {}).
    result[login_auth] = {'Countries': [login_country]}

That will happily add another instance of a country if it's already
there, but there's no problem with cleaning up the 'Countries' value
later. One trick is to convert the list to a set, which drops the
duplicates, and then (if you need a list) convert it back.
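
A minimal sketch of that set round-trip follows. The sample data is made
up for illustration. A set does not keep the original order, so sorted()
is used here to get a stable list back.

```python
# Hypothetical sample data containing duplicate country codes.
result = {
    'foo at bar.de': {'Countries': ['DE', 'DK', 'DE', 'UK', 'DK']},
}

for entry in result.values():
    # set() drops the duplicates; sorted() returns an ordered list again.
    entry['Countries'] = sorted(set(entry['Countries']))

print(result['foo at bar.de']['Countries'])  # ['DE', 'DK', 'UK']
```

If the order of first appearance matters, list(dict.fromkeys(countries))
removes duplicates while keeping that order (Python 3.7 and later).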

You're overloading the name result here, so this won't work literally:
you initialize it as a defaultdict outside the loop, then rebind it to
the regex match inside the loop. I assume you can figure out how to fix
that up.
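
For what it's worth, here is one way the single-pass loop might look with
the two names kept apart. It is only a sketch under stated assumptions:
the pattern, the sample log lines, and lookup_country() (a stand-in for
reader.city(ip).country.iso_code) are hypothetical, and 'Count' is kept
as a plain integer rather than a one-element list.

```python
import re

# Hypothetical pattern and log lines; the real Exim regex is not shown
# in the original post.
pattern = re.compile(r"auth=(?P<login_auth>\S+) ip=(?P<login_ip>\S+)")
log_lines = [
    "auth=foo@bar.de ip=192.168.1.1",
    "auth=foo@bar.de ip=172.10.10.10",
    "auth=foo@bar.de ip=192.168.1.1",
    "auth=bar@foo.de ip=192.168.1.2",
    "some unrelated line",
]

# Stand-in for the GeoIP lookup so the sketch runs without a database.
FAKE_GEOIP = {'192.168.1.1': 'DE', '172.10.10.10': 'DK', '192.168.1.2': 'SE'}

def lookup_country(ip):
    return FAKE_GEOIP.get(ip, '??')

result = {}
for line in log_lines:
    match = pattern.search(line)  # 'match', so 'result' is not rebound
    if not match:
        continue
    login_auth = match.group('login_auth')
    login_ip = match.group('login_ip')
    login_country = lookup_country(login_ip)
    try:
        entry = result[login_auth]
    except KeyError:
        # First time this login is seen: create its record.
        entry = result[login_auth] = {'Countries': [], 'IP': [], 'Count': 0}
    if login_country not in entry['Countries']:
        entry['Countries'].append(login_country)
    if login_ip not in entry['IP']:
        entry['IP'].append(login_ip)
    entry['Count'] += 1

print(result)
```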


