[Tutor] request for comments regarding my function
Isaac
hyperneato at gmail.com
Mon Oct 26 03:56:52 CET 2009
Thank you very much! You certainly gave me plenty to think and read
about. I had not known about functools or collections.
On Sun, Oct 25, 2009 at 7:09 AM, Rich Lovely <roadierich at googlemail.com> wrote:
> 2009/10/25 Isaac <hyperneato at gmail.com>:
>> Hello all,
>> I wrote a function to organize a list of sorted objects ( django blog
>> entries ); each object has a Python datetime attribute ( called
>> pub_date ). I am posting a message to the tutor mailing list to get
>> feedback regarding better ways to accomplish the same task. I have a
>> feeling I could use recursion here, instead of looping through the
>> entire _query list 3 times. Here is my code:
>>
>> def create_archive_list(_query):
>> """
>> given a sorted queryset, return a list in the format:
>>
>> archive_list = [{'2009': [{'December': ['entry1', 'entry2']},
>> {'January': ['entry3', 'entry4']}
>> ]
>> },
>>
>> {'2008': [{'January': ['entry5', 'entry6']},
>> ]
>> }
>> ]
>> """
>>
>> archive_list = []
>> tmp_year_list = []
>>
>> # make a list of Python dictionaries; the keys are the year, as in
>> '2009', and the value
>> # is an empty list.
>> for item in _query:
>> if item.pub_date.year not in tmp_year_list:
>> tmp_year_list.append(item.pub_date.year)
>> tmp_dict = {}
>> tmp_dict[item.pub_date.year] = []
>> archive_list.append(tmp_dict)
>> else:
>> pass
>>
>> # for every list in the archive_list dictionaries, append a
>> dictionary with the
>> # blog entry's month name as a key, and the value being an empty list
>> for entry in _query:
>> for item in archive_list:
>> _year = entry.pub_date.year
>> if _year in item:
>> _tmp_month = entry.pub_date.strftime("%B")
>> # make a dictionary with the month name as a key and
>> an empty list as a value
>> tmp_dict = {}
>> tmp_dict[_tmp_month] = []
>>
>> if tmp_dict not in item[_year]:
>> item[_year].append(tmp_dict)
>>
>> # append the blog entry object to the list if the pub_date month
>> and year match
>> # the dictionary keys in the archive_list dictionary/list tree.
>> for entry in _query:
>> # dict in list
>> for item in archive_list:
>> # year of entry
>> _year = entry.pub_date.year
>> if _year in item:
>> _year_list = item[_year]
>> for _dict_month in _year_list:
>> _tmp_month = entry.pub_date.strftime("%B")
>> if _tmp_month in _dict_month:
>> _dict_month[_tmp_month].append(entry)
>>
>> return archive_list
>> _______________________________________________
>> Tutor maillist - Tutor at python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>>
>
> Instant first comment, after just a glance at your code, Can't you use
> an int for the year and months, rather than strings? (This is a
> principle of the MVC (Model, View, Controller) structure: you keep
> your model (The actual data) as easy to handle as possible, and using
> ints as keys makes conversion between the query and the output and
> sorting easier. Conversion from {2009:{12: ...}} to "December 2009:
> ..." should be handled by the view: either the GUI, or immediatly
> before it's displayed. (This may be a point of contention, it could
> be argued that the view should only display it, and all manipulation
> should be handled by the controller level.)
>
> This isn't a case for recursion: recursion is best used when you have
> one thing containing a container, which contains a container, which
> contains another, which contains another, etc, to (usually) unknown
> depths. (Yes, there are numeric uses for recursion, but they are
> usually best replaced with an iterative approach).
>
> It would be more pythonic to format your output as a dict of dicts (of
> lists), rather than lists of single-item dicts of lists of single-item
> dicts:
> archive_list = \
> {2009: {12: ['entry1', 'entry2'],
> 1: ['entry3', 'entry4']}
> },
> '2008': {1: ['entry5', 'entry6']}
> }
>
> You loose the ordering, but that isn't a problem when your keys have
> implicit ordering.
>
> Look up collections.defaultdict. To generate this structure, you can
> use dict as the factory function with defaultdict:
>
> d = collections.defaultdict(dict)
>
> or, more usefully,
> d = collections.defaultdict(functools.partial(collections.defaultdict, list))
> This will give a defaultdict of defaultdicts of lists. (The
> functools.partial just returns a callable, which is what defaultdict
> needs to create its default value.
>
> Then, as you iterate through the report, you simply assign to the dict
> as follows:
>
> for entry in month_record:
> d[year][month].append(entry)
>
> It avoids all the messing around with tmp_year_list and so on.
>
> Another way, using only builtins, is to use dict.setdefault():
>
> d = {}
> d.setdefault(year, {}).setdefault(month, [])
>
> This is essentially what the defaultdict does for you.
>
> Doing it like this will give you the advantage of being easy to read:
> I'm struggling to follow what your code does at the moment: you've
> got far too many levels of nesting and temporary variables.
>
> I don't know the structure of your query result, but using pseudo-functions:
>
> import collections, functools
>
> def create_archive_list(query):
> d = collections.defaultdict(functools.partial(collections.defaultdict,
> list))
> for entry in query: #iterate over blog posts, rather than years,
> months etc - therefore using a single pass
> d[entry.year()][entry.month()].append(entry.data())
> return d
>
> And finally, a little utility function, which DOES use recursion:
>
> def mapToDict(d):
> """"recursively convert defaultdicts into regular dicts"""
> d = dict(d)
> d.update((k, map_to_dict(v)) for k,v in d.items() if isinstance(v,
> collections.defaultdict))
> return d
>
>
> This function is more for personal comfort than regular use (although
> it does clean up debugging output): it's poor programming if code
> that works with a regular dict fails if given a defaultdict. If you
> DO discover any, I highly recommend reporting it as a bug.
>
> Like I said, I've not seen your data structure, so I'm making some big
> guesses here, but this should give you enough to look into towards
> improving your code.
>
> --
> Rich "Roadie Rich" Lovely
>
> There are 10 types of people in the world: those who know binary,
> those who do not, and those who are off by one.
>
More information about the Tutor
mailing list