[Tutor] using re to build dictionary
spir
denis.spir at free.fr
Tue Feb 24 13:40:06 CET 2009
On Tue, 24 Feb 2009 12:48:51 +0100,
Norman Khine <norman at khine.net> wrote:
> Hello,
> Following my previous post on creating a dictionary from CSV, I have
> broken the problem down further and would like the list's feedback on
> whether it could be done better:
>
> >>> import re
> >>> s = ('Association of British Travel Agents (ABTA) No. 56542\n'
> ...      'Air Travel Organisation Licence (ATOL)\n'
> ...      'Appointed Agents of IATA (IATA)\n'
> ...      'Incentive Travel & Meet. Association (ITMA)')
> >>> licences = re.split("\n+", s)
> >>> licence_list = [re.split(r"\((\w+)\)", licence) for licence in licences]
> >>> association = []
> >>> for x in licence_list:
> ...     for y in x:
> ...         if y.isupper():
> ...             association.append(y)
> ...
> >>> association
> ['ABTA', 'ATOL', 'IATA', 'ITMA']
>
>
> In my string 's', I have 'No. 56542', how would I extract the '56542'
> and map it against the 'ABTA' so that I can have a dictionary for example:
>
> >>> my_dictionary = {'ABTA': '56542', 'ATOL': '', 'IATA': '', 'ITMA': ''}
> >>>
>
>
> Here is what I have so far:
>
> >>> my_dictionary = {}
>
> >>> for x in licence_list:
> ...     for y in x:
> ...         if y.isupper():
> ...             my_dictionary[y] = y
> ...
> >>> my_dictionary
> {'ABTA': 'ABTA', 'IATA': 'IATA', 'ITMA': 'ITMA', 'ATOL': 'ATOL'}
>
> This is wrong as the values should be the 'decimal' i.e. 56542 that is
> in the licence_list.
>
> This is where I get stuck: in my licence_list, not all items have a
> number; all but one are empty. For my use case, I still need to create
> the dictionary so that it is in the form:
>
> >>> my_dictionary = {'ABTA': '56542', 'ATOL': '', 'IATA': '', 'ITMA': ''}
>
> Any advice is much appreciated.
>
> Norman
I had a similar problem once. The nice solution was (I think; don't take this for granted, I have no time to verify) simply using multiple groups with re.findall again. Build a pattern like:
r'.+(code_pattern).+(number_pattern).+\n+'
Then the results will be a list of tuples like
[
(code1, n1),
(code2, n2),
...
]
where some numbers will be missing. From this it's straightforward to build a dict, maybe using a default value of None for missing numbers. Someone will probably confirm or refute this method.
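A minimal sketch of this idea, not verified against the real data. It assumes the number, when present, always follows the code as "No. <digits>", which is only what the sample string shows. The exact pattern below is my own guess at the code and number patterns. The number sits inside an optional non-capturing group, so re.findall returns an empty string for it when a licence has no number:

```python
import re

s = ('Association of British Travel Agents (ABTA) No. 56542\n'
     'Air Travel Organisation Licence (ATOL)\n'
     'Appointed Agents of IATA (IATA)\n'
     'Incentive Travel & Meet. Association (ITMA)')

# Group 1: the code in parentheses, e.g. (ABTA).
# Group 2: the digits after "No." (assumed format); the enclosing
# non-capturing group is optional, so a missing number is allowed.
pattern = re.compile(r'\((\w+)\)(?:\s*No\.\s*(\d+))?')

# findall yields one (code, number) tuple per match; a group that
# did not take part in the match comes back as ''.
pairs = pattern.findall(s)

my_dictionary = dict(pairs)
print(my_dictionary)
```

To store None instead of '' for the missing numbers, the dict could be built with {code: number or None for code, number in pairs}.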
denis
------
la vita e estrany