This functionality already exists in the ever-so-useful defaultdict. You pass a factory function (here, list) to the defaultdict constructor; when you look up a key that is not present, it calls the factory, stores the new object under that key, and returns it:<br><br>from collections import defaultdict
<br><span class="e" id="q_1172199c68d27eaf_1">mydict = defaultdict(list)<br>for record in mylist:<br></span><span class="e" id="q_1172199c68d27eaf_1">&nbsp;&nbsp;&nbsp;&nbsp;mydict[ record[0] ].append( record )<br><br>defaultdict has usually been fast enough for the datasets I've used it with.
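<br><br>For anyone who wants to run this end to end, here is a minimal sketch. The sample records in mylist are made up for illustration; only the first item (the line number) matters for the grouping.

```python
from collections import defaultdict

# Hypothetical sample data: each record is a list whose first item
# (index 0) is the line number used as the dictionary key.
mylist = [
    [10, "alpha"],
    [20, "beta"],
    [10, "gamma"],  # shares line number 10 with the first record
]

# list is the factory: a missing key gets a fresh empty list.
mydict = defaultdict(list)
for record in mylist:
    mydict[record[0]].append(record)

# Convert to a plain dict once grouping is done, so later lookups of
# missing keys raise KeyError instead of silently adding empty lists.
grouped = dict(mydict)
print(grouped)
```

One thing to watch for: merely reading mydict[key] for a missing key inserts an empty list under that key, which is why the sketch converts to a plain dict at the end.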
<br><br> --Michael</span><span class="e" id="q_1172199c68d27eaf_1"></span><br><br><br><br><div><span class="gmail_quote">On 12/28/07, <b class="gmail_sendername">doug shawhan</b> <<a href="mailto:doug.shawhan@gmail.com">
doug.shawhan@gmail.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">*sigh* Ignore, folks. I had forgotten about .has_key().
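<br><br>Roughly what the fix looks like, as a sketch: in Python 2, dict.keys() builds a fresh list of every key and "in" then scans that list linearly, while .has_key() is a single hash lookup. Testing "in" against the dictionary itself does the same hash lookup, and it is the only spelling left in Python 3, where .has_key() was removed. The function name group_by_line_number and the sample records are hypothetical.

```python
def group_by_line_number(records):
    """Group records (lists) by their first item, the line number."""
    grouped = {}
    for record in records:
        key = record[0]
        # "key in grouped" hashes the key directly; no key list is built.
        if key in grouped:
            grouped[key].append(record)
        else:
            grouped[key] = [record]
    return grouped


# Hypothetical sample records; two of them share line number 1.
sample = [[1, "a"], [2, "b"], [1, "c"]]
result = group_by_line_number(sample)
print(result)
```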
<div><span class="e" id="q_1172199c68d27eaf_1"><br><br><br><br><div class="gmail_quote">On Dec 28, 2007 11:22 AM, doug shawhan <<a href="mailto:doug.shawhan@gmail.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
doug.shawhan@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I'm building a dictionary from a list with ~1M records.<br><br>Each record in the list is itself a list.<br>Each record in the list has a line number (index 0), which I wish to use as a dictionary key.<br><br>The problem: It is possible for two different records in the list to share this line number. If they do, I want to append the record to the value in the dictionary.
<br><br>The obvious (lazy) method of searching for doubled lines requires building and parsing a key list for every record. There must be a better way!<br><br>dict = {}<br>for record in list:<br>&nbsp;&nbsp;&nbsp;&nbsp;if record[0] in dict.keys():
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dict[ record[0] ].append( record )<br>&nbsp;&nbsp;&nbsp;&nbsp;else:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dict[ record[0] ] = [record]<br><br>Once you get past ~80,000 records it starts slowing down pretty badly (I would too ...).<br><br>Here's hoping there is a really fast, pythonic way of doing this!
<br>
</blockquote></div><br>
</span></div><br>_______________________________________________<br>Tutor maillist - <a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:Tutor@python.org">Tutor@python.org</a><br><a onclick="return top.js.OpenExtLink(window,event,this)" href="http://mail.python.org/mailman/listinfo/tutor" target="_blank">
http://mail.python.org/mailman/listinfo/tutor</a><br><br></blockquote></div><br><br clear="all"><br>-- <br>Michael Langford<br>Phone: 404-386-0495<br>Consulting: <a href="http://www.RowdyLabs.com">http://www.RowdyLabs.com
</a>