[Chicago] May's Collections Module talk

Aaron Elmquist elmq0022 at umn.edu
Thu Jun 2 10:27:35 EDT 2016


Phil,

Nice work on this.  Thank you for putting this together and sharing it with
everyone.

I didn't get to attend your presentation, but I do have a couple thoughts
to share.

First, I like the idea of speeding up the top ten method, but the
implementation isn't quite correct in all cases.  Consider passing a
sequence to your function in sorted descending order.  curMin equals the
maximum value after the first iteration of the series, so only one item
will be returned from your deque.  Also, renaming curMin to curMax would
be clearer.
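
As a rough sketch of one way to keep the single-pass idea without that
failure, assuming the goal is the n largest (count, key) pairs: keep a
min-heap of at most n items, so the smallest kept pair is always at the
front, and compare each new pair against it.  The names top_n and items
are mine for illustration, not taken from your code.

```python
import heapq

def top_n(items, n=10):
    """Return the n largest (count, key) pairs, largest first."""
    heap = []  # min-heap holding the best n pairs seen so far
    for pair in items:
        if len(heap) < n:
            heapq.heappush(heap, pair)
        elif pair > heap[0]:
            # heap[0] is the smallest pair currently kept; replace it
            heapq.heappushpop(heap, pair)
    return sorted(heap, reverse=True)

# Input already sorted in descending order still yields n items.
descending = [(c, "k%d" % c) for c in range(20, 0, -1)]
result = top_n(descending, 3)
```

The threshold here is the minimum of the kept items rather than of
everything seen, which is why descending input is not a problem.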

For a better implementation, we can look at the collections source code (
https://hg.python.org/cpython/file/2.7/Lib/collections.py line 484 ). The
Counter class has a most_common method implemented, and it does exactly
what we want.  When "n" is not passed, the method falls back to the
sorted() approach you shared.  Of course this runs in O(N log N) as it
does a complete sort.  However, when n is provided, the method switches to
heapq.nlargest, which is faster when we only want a specific number of
items ( O(N log k),  N := items in the counter, k := items requested ).
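
A small illustration of the two paths, using a made-up input string
rather than the permits data.  The by_hand line is my approximation of
what the library does when n is given, not a copy of its source.

```python
import heapq
from collections import Counter
from operator import itemgetter

counts = Counter("abracadabra")

# No argument: every item, sorted by count (a full sort).
everything = counts.most_common()

# With n: only the n most common items.
top_two = counts.most_common(2)

# Roughly equivalent selection done by hand with heapq.nlargest.
by_hand = heapq.nlargest(2, counts.items(), key=itemgetter(1))
```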

One other thought on the Counter class.  The constructor takes an
iterable, so if you want you could pass a generator over your csv reader
(yielding the hashable value you want to count, since the rows themselves
are lists) directly to the Counter during initialization.
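
Something along these lines, as a sketch only.  The in-memory data and
the column names permit_type and ward are invented for the example and
are not the real layout of the building permits file.

```python
import csv
import io
from collections import Counter

# Stand-in for an open file; the column names are made up.
data = io.StringIO("permit_type,ward\nelectrical,1\nsigns,2\nelectrical,3\n")
reader = csv.DictReader(data)

# The Counter constructor consumes the generator one row at a time.
by_type = Counter(row["permit_type"] for row in reader)
```

This avoids building an intermediate list of rows before counting.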

If any of this was discussed at the meeting, I apologize for the repeat but
hope others will benefit from the discussion.

Again, thanks for putting this together.  You made me think about these
structures more and made me look at the Python source code.

Best,

Aaron

On Tue, May 31, 2016 at 7:11 PM, Joshua Herman <zitterbewegung at gmail.com>
wrote:

> Yes, it was very interesting and what you were talking about was easy to
> understand. Great presentation. Also good practical examples.
>
> On Tue, May 31, 2016 at 6:23 PM Michael Tamillow <
> mikaeltamillow96 at gmail.com> wrote:
>
>> Yeah, I second what Bob said. You kicked as+
>>
>> Sent from my iPhone
>>
>> > On May 31, 2016, at 6:18 PM, Bob Haugen <bob.haugen at gmail.com> wrote:
>> >
>> > That's really educational for a self-taught sloppy python programmer.
>> > Thank you very much.
>> >
>> > On Tue, May 31, 2016 at 6:13 PM, Robare, Phillip (TEKSystems)
>> > <proba at allstate.com> wrote:
>> >> I have put the files from my May 12th meeting talk on the Collections
>> >> module up on github:
>> >> https://github.com/verisimilidude/TheCollectionsModule
>> >>
>> >> The one exception is the data file I used, the City of Chicago's
>> >> building permits data.  I have a file in the archive about how to
>> >> download that data.
>> >>
>> >> Phil Robare
>> >> TEK Systems / Allstate QR&A
>> >> 847-667-0431
>> >> D2W-722D
>> >>
>> >>
>> >> _______________________________________________
>> >> Chicago mailing list
>> >> Chicago at python.org
>> >> https://mail.python.org/mailman/listinfo/chicago