[Python-ideas] csv.DictReader could handle headers more intelligently.

Tue Jan 29 14:28:19 CET 2013

On 29/01/13 23:35, Mark Hackett wrote:
> On Tuesday 29 Jan 2013, Steven D'Aprano wrote:
>>> If it dropped the columns and shouldn't have, then the results will be
>>> seen to be wrong anyway, so there's not a huge amount of need for this.
>>
>> You cannot assume that the caller knows that there are duplicated column
>> names
>>
>
> You cannot assume they wanted them as a list.

I don't need to assume that. They can take the list and post-process it into
any data type they want.

A list is a natural fit for associating multiple values with a single key,
because it doesn't lose data: it is variable-sized, so it can handle "no
values" or "1000 values" equally easily; it is ordered; and it is iterable.
If the caller wants something else, they can convert it.
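
To make that concrete, here is a rough sketch of the list-per-key idea built on
plain csv.reader. The helper name read_rows_with_lists and the sample data are
made up for illustration; this is not an existing csv API, only one way the
behaviour could look. For contrast, it also shows what csv.DictReader does
with the same input, where the last duplicated column wins.

```python
import csv
import io
from collections import defaultdict


def read_rows_with_lists(csvfile):
    """Yield one dict per row, mapping each header name to a list of values.

    Hypothetical helper for illustration; not part of the csv module.
    """
    reader = csv.reader(csvfile)
    try:
        headers = next(reader)
    except StopIteration:
        return  # empty input: nothing to yield
    for row in reader:
        mapping = defaultdict(list)
        # zip() stops at the shorter sequence, so ragged rows are truncated
        # rather than reported; a real implementation would need a policy.
        for name, value in zip(headers, row):
            mapping[name].append(value)
        yield dict(mapping)


data = "spam,eggs,spam\n1,2,3\n4,5,6\n"

# Duplicated 'spam' column: both values survive, in column order.
rows = list(read_rows_with_lists(io.StringIO(data)))
print(rows)

# csv.DictReader on the same data keeps only the last 'spam' value.
dictreader_rows = [dict(row) for row in csv.DictReader(io.StringIO(data))]
print(dictreader_rows)
```

A caller who wants something other than a list (a tuple, the first value, a
joined string) can post-process each mapping as they see fit.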

> You cannot assume that duplicate replacement is what they want.

I don't think I ever suggested that it was.

> If someone is using a csv file with header names they have never read, how are
> they going to use the data?

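import csv

# 'whatever' stands for any open CSV file or iterable of lines;
# 'process' stands for the caller's own function.
reader = csv.DictReader(whatever)
for mapping in reader:
    for key, value in mapping.items():
        process(key, value)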

Or perhaps you care about only one column and can ignore the other, unknown
columns:

for mapping in reader:
    value = mapping.get('spam', 'some default')
    process(value)

> They won't even know the name to access the value in the dictionary!

Dealing with arbitrary field names in data you read from a file is not hard.
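
As a small illustration of that, the header names can be discovered at run
time from the reader's fieldnames attribute, without knowing them in advance.
The sample data and the "columns" dict below are invented for the example.

```python
import csv
import io

# Made-up sample; the code below never hard-codes these header names.
sample = io.StringIO("name,colour,size\nparrot,blue,small\nnewt,green,tiny\n")

reader = csv.DictReader(sample)

# Accessing fieldnames reads the header row if it has not been read yet.
fieldnames = list(reader.fieldnames)
print(fieldnames)

# Gather every column's values under whatever name the file gave it.
columns = {}
for mapping in reader:
    for key, value in mapping.items():
        columns.setdefault(key, []).append(value)

for key, values in columns.items():
    print(key, values)
```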

-- 
Steven