[Tutor] creating dictionary from a list

Sat Apr 13 03:22:37 CEST 2013

On 13/04/13 09:52, Saad Bin Javed wrote:
> Hi, I'm using a script to fetch my calendar events. I split the output at newline which produced a list 'lst'. I'm trying to clean it up and create a dictionary with date:event key value pairs. However this is throwing up a bunch of errors.

Would you like us to guess what the errors are? Please copy and paste the complete traceback, starting with the line

Traceback (most recent call last):

all the way to the end.

In the meantime, my comments are below, interleaved with your code:

> lst = ['', 'Thu Apr 04           Weigh In', '', 'Sat Apr 06 Collect NIC', \
> '                     Finish PTI Video', '', 'Wed Apr 10           Serum uric acid test', \
> '', 'Sat Apr 13   1:00pm  Get flag from dhariwal', '', 'Sun Apr 14      Louis CK Oh My God', '', '']

There's no need to use line-continuation backslash \ inside a list. Any form of bracket (round, square or curly) automatically continues the line.
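
For instance, here is a shortened version of your list laid out over several lines with no backslashes at all. This is only a sketch of the layout; the contents are cut down from your data.

```python
# The open square bracket keeps the statement going until the matching
# close bracket, so no backslashes are needed at the line ends.
lst = ['', 'Thu Apr 04           Weigh In',
       '', 'Sat Apr 06 Collect NIC',
       '                     Finish PTI Video', '']

print(len(lst))
```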

Also, you might find it easier to process the list if you strip out empty items. There are two simple ways to do it:

lst = [x for x in lst if x != '']
# or
lst = filter(None, lst)

Either way should work nicely on the data shown, and that might then simplify your processing below.
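
Here is a small sketch of both approaches on a cut-down copy of your data. The names non_empty and also_non_empty are mine, used only so the two results can be compared. The list() call around filter is there because on Python 3 filter returns an iterator rather than a list; on Python 2 it does no harm.

```python
lst = ['', 'Thu Apr 04           Weigh In', '', 'Sat Apr 06 Collect NIC', '']

# List comprehension: keep every item that is not the empty string.
non_empty = [x for x in lst if x != '']

# filter(None, ...) drops every falsy item, which includes ''.
also_non_empty = list(filter(None, lst))

print(non_empty == also_non_empty)
```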

> dict = {}
> same_day = ''
> for x in lst:
>      c = x.split('           ')

It's best not to rely on an exact number of spaces for the separation, if you can. If you call split() with no argument, the string will be split on runs of whitespace. If you must depend on the difference between one space and two spaces, it's probably best to split on "  " (two spaces only), then delete any extra empty strings, as above.
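
To see the difference, here is a sketch on one made-up line with exactly four spaces between the date and the text. The names line, words, pieces and cleaned are just for illustration.

```python
# Build the line explicitly so the number of spaces is not in doubt.
line = 'Thu Apr 04' + ' ' * 4 + 'Weigh In'

# No argument: split on any run of whitespace.
words = line.split()
print(words)      # ['Thu', 'Apr', '04', 'Weigh', 'In']

# Two-space separator: single spaces inside the date and the text survive,
# but four spaces count as two separators with an empty string in between.
pieces = line.split('  ')
print(pieces)     # ['Thu Apr 04', '', 'Weigh In']

# Delete the extra empty strings.
cleaned = [p for p in pieces if p != '']
print(cleaned)    # ['Thu Apr 04', 'Weigh In']
```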

Also, please try to use more meaningful variable names. In this case, I would suggest "words" instead of "c". ("c" for "character"?)

>      if c[0] is '':

Do you read the other emails on this tutor list? You should; you will learn a lot from other people's questions. We've just had a few emails explaining that you should never use "is" to test equality, since "is" tests for object identity. So, repeating myself from the last few emails:

* always use "if x is None" or "if x is not None" when comparing against None;

* otherwise, never use "is", always use == or != when comparing against (nearly) everything else.

("Always" and "never" of course are understood to be general rules which may be ignored by experts when necessary.)

So replace the line above with "if c[0] == '':".
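
If the difference between identity and equality is still fuzzy, this sketch may help. It uses lists rather than strings, because whether two equal strings happen to be the same object depends on the interpreter and is not something to rely on. All the names here are invented for the demonstration.

```python
first = ['Thu', 'Apr', '04']
second = ['Thu', 'Apr', '04']
alias = first

print(first == second)    # True: the contents are equal
print(first is second)    # False: two separate list objects
print(first is alias)     # True: two names for the same object

# None is the case where "is" is the right tool.
result = {}.get('missing')    # dict.get returns None for an absent key
print(result is None)         # True
```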

>          for q in c:
>              if q is not '':

Likewise, replace this with "if q != '':".

>                  dict.update({same_day: dict[same_day] + ', ' + q.strip()})
>                  break

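I'm not sure why you use a "break" here. It exits the innermost loop, "for q in c", as soon as the first non-empty item has been handled, so any later items in c are skipped (the outer loop over lst carries on). Is that intended? Maybe you meant to use "continue"?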
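
In case it helps you decide, here is a sketch of how the two statements behave inside nested loops. The data in rows is made up, and so are the names first_items and all_items.

```python
rows = [['', 'Finish PTI Video', 'ignored'], ['', 'Weigh In']]

first_items = []
for row in rows:
    for q in row:
        if q != '':
            first_items.append(q)
            break       # leaves only the inner "for q in row" loop

all_items = []
for row in rows:
    for q in row:
        if q == '':
            continue    # skip this item, carry on with the next q
        all_items.append(q)

print(first_items)    # ['Finish PTI Video', 'Weigh In']
print(all_items)      # ['Finish PTI Video', 'ignored', 'Weigh In']
```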

>      else:
>          if c[0].find('  '):

You can save an indentation level by using "elif" here.

if test1:
     ...
else:
     if test2:
         ...
     else:
         ...

becomes more nicely written as:

if test1:
     ...
elif test2:
     ...
else:
     ...
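
As a concrete sketch of the flattened form, here is a made-up helper, describe(), with the same three-way shape as your tests. I am guessing at what you want each branch to mean. I have also written the middle test with "in", because str.find returns -1 (which counts as true) when the substring is absent and 0 (which counts as false) when it sits at the very start, so "if s.find(...)" is worth double-checking in your version.

```python
def describe(words):
    """Classify a split line; the labels are invented for this sketch."""
    if words[0] == '':
        return 'continuation'
    elif '  ' in words[0]:
        return 'date and text together'
    else:
        return 'date only'

print(describe(['', 'Finish PTI Video']))
print(describe(['Sat Apr 13' + ' ' * 3 + '1:00pm']))
print(describe(['Thu Apr 04', 'Weigh In']))

# What find() reports when the substring is missing, and when it is at the start:
print('Thu Apr 04'.find('  '))    # -1
print('  indented'.find('  '))    # 0
```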

>              print c[0]
>              a = c[0].split('  ', 1)
>              same_day = a[0]
>              print a[0], a[1].lstrip()

You can simplify your processing here by assigning directly to a list of names. Instead of:

a = c[0].split('  ', 1)
process(a[0])
process(a[1])

you can give each item a name directly:

same_day, text = c[0].split('  ', 1)
process(same_day)
process(text)

which makes your code cleaner and easier to understand.
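
Here is a runnable sketch of that unpacking. One thing to keep in mind: the number of names on the left must match the number of pieces, otherwise Python raises ValueError. The second half shows one possible way of coping with a line that has no two-space gap; the names day, rest and no_gap are mine, and falling back to an empty string is just an assumption about what you would want.

```python
line = 'Sat Apr 13' + ' ' * 2 + '1:00pm'

# Exactly two pieces come back, so two names can be assigned at once.
same_day, text = line.split('  ', 1)
print(same_day)    # Sat Apr 13
print(text)        # 1:00pm

# No two-space gap: split returns a single piece and unpacking fails.
no_gap = 'Sat Apr 06 Collect NIC'
try:
    day, rest = no_gap.split('  ', 1)
except ValueError:
    day, rest = no_gap, ''
print(day)
```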

>              dict.update({a[0] : a[1].lstrip()})
>          else:
>              same_day = c[0]
>              dict.update({c[0] : c[1]})

If you're going to assign a value to a name, why don't you actually use the name?

same_day = c[0]
dict.update({same_day: c[1]})

Try making the changes I've suggested, and see if that makes the code cleaner and easier to understand. It might even fix some of the errors that you are getting.
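
If you get stuck, here is one possible shape for the whole thing. This is a sketch under assumptions, not the only way to do it. I am assuming that the date is always the first three words of a line, that a line starting with a space continues the previous day, and that such a line never comes before the first dated line. I have called the dictionary events rather than dict, so that the built-in dict stays usable.

```python
lst = ['', 'Thu Apr 04           Weigh In', '', 'Sat Apr 06 Collect NIC',
       '                     Finish PTI Video', '',
       'Wed Apr 10           Serum uric acid test', '',
       'Sat Apr 13   1:00pm  Get flag from dhariwal', '',
       'Sun Apr 14      Louis CK Oh My God', '', '']

events = {}
same_day = None
for line in lst:
    if line == '':
        continue                      # nothing to do for blank items
    if line.startswith(' '):
        # Continuation line: add the text to the most recent day.
        events[same_day] += ', ' + line.strip()
    else:
        words = line.split()          # split on runs of whitespace
        same_day = ' '.join(words[:3])        # e.g. 'Thu Apr 04'
        events[same_day] = ' '.join(words[3:])

for day in sorted(events):
    print(day + ': ' + events[day])
```

Note that any time, such as 1:00pm, stays at the front of the event text here; whether you want it split out is up to you.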

Good luck!

-- 
Steven