[Tutor] (no subject)
Mats Wichmann
mats at wichmann.us
Wed Apr 25 14:19:28 EDT 2018
On 12/31/1969 05:00 PM, wrote:
> Hello everybody,
> I'm coming from a Perl background and am trying to parse some Exim log files
> into a data structure of dictionaries. The regex and geoip part works fine and
> I'd like to save the email address, the countries (from logins) and the count
> of logins.
>
> The structure I'd like to have:
>
> result = {
> 'foo at bar.de': {
> 'Countries': [DE,DK,UK]
> 'IP': ['192.168.1.1','172.10.10.10']
> 'Count': [12]
> }
> 'bar at foo.de': {
> 'Countries': [DE,SE,US]
> 'IP': ['192.168.1.2','172.10.10.11']
> 'Count': [23]
> }
> }
I presume that's pseudo-code, since it's missing punctuation (commas
between elements) and the country codes are not quoted....
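
For reference, a syntactically valid version of that structure might look
like the sketch below. The addresses and values are just the placeholders
from the question, and 'Count' is kept as a one-element list only because
that is how it was written there; a plain integer would probably be simpler.

```python
# Placeholder data copied from the question, with quoting and commas added.
result = {
    'foo@bar.de': {
        'Countries': ['DE', 'DK', 'UK'],
        'IP': ['192.168.1.1', '172.10.10.10'],
        'Count': [12],
    },
    'bar@foo.de': {
        'Countries': ['DE', 'SE', 'US'],
        'IP': ['192.168.1.2', '172.10.10.11'],
        'Count': [23],
    },
}

# Nested lookups chain the keys: outer key first, then the inner key.
print(result['foo@bar.de']['Countries'])
```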
>
> I don't have a problem when I do these three separately like this with a
> one-dimensional dict (snippet):
>
> result = defaultdict(list)
>
> with open('/var/log/exim4/mainlog',encoding="latin-1") as logfile:
>     for line in logfile:
>         result = pattern.search(line)
>         if (result):
>             login_ip = result.group("login_ip")
>             login_auth = result.group("login_auth")
>             response = reader.city(login_ip)
>             login_country = response.country.iso_code
>             if login_auth in result and login_country in result[login_auth]:
>                 continue
>             else:
>                 result[login_auth].append(login_country)
>         else:
>             continue
>
> This checks if the login_country exists within the list of the specific
> login_auth key, adds them if they don't exist and gives me the results I want.
> This also works for the ip addresses and the number of logins without any problems.
>
> As I don't want to repeat these loops three times with three different data
> structures I'd like to do this in one step. There are two main problems I
> don't understand right now:
>
> 1. How do I check whether a value exists within a list that is the value of a
> key, which is itself the value of another key? What I'd like to do:
>
> if login_auth in result and login_country in result[login_auth]['Countries']:
>     continue
You don't actually need to check (there's a Python aphorism that goes
something like "It's easier to ask forgiveness than permission", often
abbreviated EAFP). You can do:
    try:
        result[login_auth]['Countries'].append(login_country)
    except KeyError:
        # means there was no entry for login_auth,
        # so add one here, for example:
        result[login_auth] = {'Countries': [login_country]}
That will happily add another instance of a country if it's already
there, but there's no problem with going and cleaning the 'Countries'
value later. One trick is to take that list, convert it to a set, and then
(if you want) convert it back to a list, which leaves only unique values.
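
A minimal sketch of that cleanup step, using made-up sample data in the
shape discussed above. Note that a set does not preserve insertion order,
so sorted() is used here to get a predictable list back; that ordering is
my choice, not something the question asked for.

```python
# Hypothetical data with duplicate country codes, as the try/except
# approach would produce.
result = {
    'foo@bar.de': {'Countries': ['DE', 'DK', 'DE', 'UK', 'DK']},
}

for entry in result.values():
    # set() drops the duplicates; sorted() turns the set back into a list.
    entry['Countries'] = sorted(set(entry['Countries']))

print(result['foo@bar.de']['Countries'])  # ['DE', 'DK', 'UK']
```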
Note that you're overloading the name result here, so this won't work
literally: you initialize it as a defaultdict outside the loop, then rebind
it to the regex match inside the loop. I assume you can figure out how to
fix that up.
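
One possible way to put the pieces together is sketched below, with the
regex match bound to its own name (match) so it no longer clobbers result.
Everything needed to make it run standalone is an assumption: the pattern,
the sample lines, and the fake_geoip dict (a stand-in for
reader.city(ip).country.iso_code) are invented for illustration, result is
a plain dict rather than a defaultdict, and 'Count' is a plain integer
rather than a one-element list.

```python
import re

# Hypothetical pattern and log lines; the real Exim regex is not shown above.
pattern = re.compile(r"login_ip=(?P<login_ip>\S+) login_auth=(?P<login_auth>\S+)")
sample_lines = [
    "login_ip=192.168.1.1 login_auth=foo@bar.de",
    "login_ip=172.10.10.10 login_auth=foo@bar.de",
    "login_ip=192.168.1.1 login_auth=foo@bar.de",
    "a line without any login data",
]

# Stand-in for the geoip lookup, so the sketch has no external dependencies.
fake_geoip = {"192.168.1.1": "DE", "172.10.10.10": "DK"}

result = {}
for line in sample_lines:
    match = pattern.search(line)  # separate name: 'result' stays the dict
    if not match:
        continue
    login_ip = match.group("login_ip")
    login_auth = match.group("login_auth")
    login_country = fake_geoip[login_ip]

    try:
        entry = result[login_auth]
    except KeyError:
        # First time this address is seen: create its nested dict.
        entry = result[login_auth] = {"Countries": [], "IP": [], "Count": 0}

    if login_country not in entry["Countries"]:
        entry["Countries"].append(login_country)
    if login_ip not in entry["IP"]:
        entry["IP"].append(login_ip)
    entry["Count"] += 1

print(result)
```

Here the membership tests keep the lists unique as they are built, so no
cleanup pass is needed afterwards; either approach should work.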