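<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Something as simple as this (straw man) demonstrates what I mean: <div><br></div><div><blockquote type="cite"><div><font face="Consolas">from collections import defaultdict</font></div><div><br></div><div><font face="Consolas">class Record(defaultdict):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def __init__(self, headers, fields):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;super(Record, self).__init__(list)</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.headers = headers</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.fields = fields</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# map() is lazy on Python 3, so enter each pair with an explicit loop.</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for header, field in zip(self.headers, self.fields):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self.enter(header, field)</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def valuemap(self, first=False):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;index = 0 if first else -1</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return dict([(key,values[index]) for key,values in self.items()])</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def enter(self, header, *values):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if isinstance(header, int):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;header = self.headers[header]</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self[header].extend(values)</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def itemseq(self):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return zip(self.headers,self.fields)</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def __getitem__(self, spec):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Integers and slices index the original field sequence.</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if isinstance(spec, (int, slice)):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return self.fields[spec]</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return super(Record, self).__getitem__(spec)</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;def __getslice__(self, *args):</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Only called on Python 2; Python 3 routes slices through __getitem__.</font></div><div><font face="Consolas">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return self.fields.__getslice__(*args)</font></div><div><br></div></blockquote><div apple-content-edited="true"><div><br class="Apple-interchange-newline">This would let you access column values using header names, just 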
like before. Each column's values are now in a list, which contains multiple values whenever a column name is repeated in the header. </div><div>Values can also be accessed sequentially using integer indexes, and <font face="Consolas">valuemap()</font> returns a standard dictionary that conforms exactly to the previous behavior: there is a one-to-one mapping between column headers and values, with the last value associated with a given column name being the one used. </div><div><br></div><div>While I think the changes should be added without changing what exists, for backward-compatibility reasons, I've started to think the existing version should also be deprecated rather than maintained as a special case. Even when the format is perfect for the existing code, I don't see any big advantages to using it over this approach. </div><div><br></div><div>Keep in mind the example is just a quick straw man: performance is a big difference (and there are plenty of bugs), but that doesn't seem like the right thing to base the decision on, as performance can easily be improved later. </div><div><br></div></div></div><div apple-content-edited="true">
<div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">In summary, given headers: A, B, C, D, E, B, G</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record.headers == ["A", "B", "C", "D", "E", "B", "G"]</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record.fields == [0, 1, 2, 3, 4, 5, 6]</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; 
-webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record["A"] == [0]</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record["B"] == [1, 5]</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "># Note that sequential access values are not in lists, and the second "B" column's value 5 stays in its original position, index 5. 
</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record[0] == 0</div><div>record[1] == 1</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record[2] == 2</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record[3] == 3</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record[4] == 4</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 
0px; ">record[5] == 5<br class="Apple-interchange-newline"><br class="Apple-interchange-newline">record.items() == [("A", [0]), ("B", [1, 5]), …]</div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">record.valuemap() == {"A": 0, "B": 5, …} # This returns exactly what DictReader does today: a single value per named column, with the last value being the one used. </div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">Shane Green </div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; 
font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><a href="http://www.umbrellacode.com">www.umbrellacode.com</a></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">408-692-4666 | <a href="mailto:shane@umbrellacode.com">shane@umbrellacode.com</a></div>
</div>
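<div>For comparison, here is a minimal Python 3 sketch of the difference described above, using only the standard csv module. The sample text and the name <font face="Consolas">grouped</font> are made up for illustration; this is not a proposed implementation, just a way to see the last-value-wins behavior next to the list-per-name idea.</div>
<pre>
```python
import csv
import io
from collections import defaultdict

# Sample data with the header "B" repeated, as in the summary above.
text = "A,B,C,D,E,B,G\n0,1,2,3,4,5,6\n"

# csv.DictReader keeps one value per name; the later "B" column wins.
dict_row = next(csv.DictReader(io.StringIO(text)))

# csv.reader keeps every field, so repeated names can be grouped into lists.
rows = csv.reader(io.StringIO(text))
headers = next(rows)
fields = next(rows)
grouped = defaultdict(list)
for header, field in zip(headers, fields):
    grouped[header].append(field)

print(dict_row["B"])   # the value from the second "B" column only
print(grouped["B"])    # both "B" values, in column order
print(fields[5])       # sequential access still sees the original position
```
</pre>
<div>Note that the csv module hands back strings, so the values here are "1" and "5" rather than the integers used in the summary.</div>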
<div><br><div>Begin forwarded message:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; font-size:medium; color:rgba(0, 0, 0, 1.0);"><b>From: </b></span><span style="font-family:'Helvetica'; font-size:medium;">Shane Green <<a href="mailto:shane@umbrellacode.com">shane@umbrellacode.com</a>><br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; font-size:medium; color:rgba(0, 0, 0, 1.0);"><b>Subject: </b></span><span style="font-family:'Helvetica'; font-size:medium;"><b>Re: [Python-ideas] csv.DictReader could handle headers more intelligently.</b><br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; font-size:medium; color:rgba(0, 0, 0, 1.0);"><b>Date: </b></span><span style="font-family:'Helvetica'; font-size:medium;">January 26, 2013 6:39:11 AM PST<br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; font-size:medium; color:rgba(0, 0, 0, 1.0);"><b>To: </b></span><span style="font-family:'Helvetica'; font-size:medium;">"Stephen J. 
Turnbull" <<a href="mailto:stephen@xemacs.org">stephen@xemacs.org</a>><br></span></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px;"><span style="font-family:'Helvetica'; font-size:medium; color:rgba(0, 0, 0, 1.0);"><b>Cc: </b></span><span style="font-family:'Helvetica'; font-size:medium;"><a href="mailto:python-ideas@python.org">python-ideas@python.org</a><br></span></div><br><meta http-equiv="Content-Type" content="text/html charset=windows-1252"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Okay, I like your point about DictReader having a place with a subset of CSV tables, and agree that, given that definition, it should throw an exception when it's fed something that doesn't conform to this definition. I like that.<div><br></div><div>One thing, though: the new version would let you access column data by name as well. </div><div><br></div><div>Instead of</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>row["timestamp"] == 1359210019.299478</div><div><br></div><div>It would be</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>row["timestamp"] == [1359210019.299478]</div><div><br></div><div>And potentially </div><div><span class="Apple-tab-span" style="white-space:pre"> </span>row["timestamp"] == [1359210019.299478,1359210019.299478]</div><div><div apple-content-edited="true">
<div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br class="Apple-interchange-newline">It could also be accessed as: </div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><span class="Apple-tab-span" style="white-space: pre; "> </span>row.headers[0] == "timestamp"</div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><span class="Apple-tab-span" style="white-space:pre"> </span>row.headers[1] == "timestamp"</div><div style="font-family: Helvetica; font-size: medium; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space: pre; "> </span>row.values[0] == 1359210019.299478</div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span>row.values[1] == 1359210019.299478</div><div style="font-style: normal; 
"><br></div><div style="font-style: normal; ">Could still provide: </div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span>for name,value in records.iterfirstitems(): # get the first value for each column with a given name.</div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span> <span class="Apple-tab-span" style="white-space:pre"> </span>- or - </div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span>for name,value in records.iterlastitems(): # get the last value for each column with a given name.</div><div style="font-style: normal; "><br></div><div style="font-style: normal; ">And the exact functionality you have now: </div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span>records.itervaluemaps() # or something… just a map(dict(records.iterlastitems()))</div><div style="font-style: normal; "><span class="Apple-tab-span" style="white-space:pre"> </span></div><div style="font-style: normal; ">Overkill, but really simple things to add… </div><div style="font-style: normal; "><br></div><div>The only thing this really adds to the "convenience" of the current DictReader for well-behaved tables is the ability to access values sequentially <i>or </i>by name; other than that, the only difference would be iterating on a generator method's output instead of the instance itself. 
</div><div style="font-style: normal; "><br></div><div style="font-style: normal; "><br></div></div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><br></div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; ">Shane Green </div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><a href="http://www.umbrellacode.com/">www.umbrellacode.com</a></div><div style="font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; 
">408-692-4666 | <a href="mailto:shane@umbrellacode.com">shane@umbrellacode.com</a></div>
</div>
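<div>A rough sketch of how the iterators mentioned above might fit together. The <font face="Consolas">Row</font> class and the <font face="Consolas">read_rows</font> helper are hypothetical names, not part of the csv module, and attaching the iterators to each row rather than to the reader is just one possible reading of the idea.</div>
<pre>
```python
import csv
import io


class Row(object):
    """Hypothetical row type: keeps every field, grouped by header name."""

    def __init__(self, headers, values):
        self.headers = list(headers)
        self.values = list(values)
        self._by_name = {}
        for header, value in zip(self.headers, self.values):
            self._by_name.setdefault(header, []).append(value)

    def __getitem__(self, name):
        # Access by name always returns a list, even for a single column.
        return self._by_name[name]

    def iterfirstitems(self):
        # First value for each distinct column name.
        for name, values in self._by_name.items():
            yield name, values[0]

    def iterlastitems(self):
        # Last value for each distinct column name (what DictReader keeps).
        for name, values in self._by_name.items():
            yield name, values[-1]


def read_rows(lines):
    # Hypothetical helper: the first CSV line is the header, the rest are rows.
    reader = csv.reader(lines)
    headers = next(reader)
    for values in reader:
        yield Row(headers, values)


def itervaluemaps(rows):
    # One plain dict per row, keeping the last value for each name.
    for row in rows:
        yield dict(row.iterlastitems())


text = "timestamp,timestamp,name\n1.5,2.5,x\n"
row = next(read_rows(io.StringIO(text)))
print(row["timestamp"])
print(dict(row.iterfirstitems()))
print(list(itervaluemaps(read_rows(io.StringIO(text)))))
```
</pre>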
<br><div><div>On Jan 26, 2013, at 5:53 AM, "Stephen J. Turnbull" <<a href="mailto:stephen@xemacs.org">stephen@xemacs.org</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Shane Green writes:<br><br><blockquote type="cite">And while it's true that a dictionary is a dictionary and it works<br>the way it works, the real point that drives home is that it's an<br>inappropriate mechanism for dealing ordered rows of sequential<br>values.<br></blockquote><br>Right! So use csv.reader, or csv.DictReader with an explicit<br>fieldnames argument.<br><br>The point of csv.DictReader with default fieldnames is to take a<br>"well-behaved" table and turn it into a sequence of "poor-man's"<br>objects.<br><br><blockquote type="cite">The final point is a simple one: while that CSV file format was<br>stupid, it was perfectly legal. Something that deals with CSV<br>content should not be losing any of its content.<br></blockquote><br>That's a reasonable requirement.<br><br><blockquote type="cite">It also should [not] be barfing or throwing exceptions, by the way.<br></blockquote><br>That's not. As long as the module provides classes capable of<br>handling any CSV format (it does), it may also provide convenience<br>classes for special purposes with restricted formats. Those classes<br>may throw exceptions on input that doesn't satisfy the restrictions.<br><br><blockquote type="cite">And what about fixing it by replacing implementing a class that<br>does it correctly, [...]?<br></blockquote><br>Doesn't help users who want automatically detected access-by-name.<br>They must have unique field names. (I don't have a use case. I<br>assume the implementer of csv.DictReader did.<wink/>)<br><br></blockquote></div><br></div></div></blockquote></div><br></body></html>