csv.DictReader line skipping should be considered a bug?
Neil Cerutti
neilc at norwich.edu
Tue Dec 5 13:39:51 EST 2017
On 2017-12-05, Jason <jasonhihn at gmail.com> wrote:
> I ran into this:
> https://stackoverflow.com/questions/27707581/why-does-csv-dictreader-skip-empty-lines
>
> # unlike the basic reader, we prefer not to return blanks,
> # because we will typically wind up with a dict full of None
> # values
>
> while iterating over two files that correspond line by line. The DictReader skipped ahead many lines, breaking the line-by-line correspondence.
>
> And I want to argue that the difference of behavior should be considered a bug. It should be considered as such because:
> 1. I need to know what's in the file to know what class to use. The file content should not break at-least-1-record-per-line. There may be multiple lines per record in the case of embedded newlines, but there should never be a line that yields no record.
> 2. It's a premature optimization. If skipping blank lines is desirable, then have another class on top of DictReader, maybe call it EmptyLineSkippingDictReader.
> 3. The intent of DictReader is to return a dict, nothing more, therefore the change of behavior is inappropriate.
>
> Does anyone agree, or am I crazy?
I've used csv.DictReader for years and never come across this
oddity. Very interesting!
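
For anyone who wants to see it, here is a minimal sketch, assuming
Python 3 and a made-up input held in a string (the names data,
plain_rows and dict_rows are mine, not from the original question).
It reads the same text with csv.reader and csv.DictReader so the
record counts can be compared.

```python
import csv
import io

# Hypothetical sample: a header, a record, a blank line, another record.
data = "a,b\n1,2\n\n3,4\n"

# csv.reader yields an empty list for the blank line.
plain_rows = list(csv.reader(io.StringIO(data)))

# csv.DictReader consumes the header and silently skips the blank line.
dict_rows = list(csv.DictReader(io.StringIO(data)))

print(len(plain_rows))  # 4: header, record, blank, record
print(plain_rows[2])    # []
print(len(dict_rows))   # 2: the blank record is gone
```

So after the header, csv.reader sees three records where
csv.DictReader sees two, which is exactly what breaks a line-by-line
pairing with a second file.
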
I am with you. Silently discarding blank records hides
information--the current design is unusable if blank records are
of interest. Moreover, what's wrong with a dict full of None, if
that's what's in the record? How many Nones are too many?
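
If blank records matter, one possible workaround is to build the
dicts on top of csv.reader instead. This is only a rough sketch, not
anything the csv module provides: dict_rows_keeping_blanks is a
hypothetical helper name, it assumes the first row is the header, and
unlike DictReader it simply drops extra fields on over-long rows
rather than collecting them under a restkey.

```python
import csv
import io


def dict_rows_keeping_blanks(f, **kwargs):
    """Yield one dict per record, including blank ones (hypothetical helper)."""
    reader = csv.reader(f, **kwargs)
    try:
        fieldnames = next(reader)  # assumption: first row is the header
    except StopIteration:
        return  # empty input, nothing to yield
    for row in reader:
        # Pad short rows (including blank ones, which arrive as []) with None.
        padded = row + [None] * (len(fieldnames) - len(row))
        # zip() stops at the shorter sequence, so extra fields are dropped.
        yield dict(zip(fieldnames, padded))


rows = list(dict_rows_keeping_blanks(io.StringIO("a,b\n1,2\n\n3,4\n")))
print(rows[1])  # {'a': None, 'b': None}
```

With this, the blank line comes back as a dict full of None, so two
files that correspond line by line stay in step.
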
--
Neil Cerutti