[Csv] These csv test cases seem incorrect to me...

John Machin sjmachin at lexicon.net
Mon Mar 12 04:09:05 CET 2007


On 12/03/2007 1:01 PM, skip at pobox.com wrote:
> I decided it would be worthwhile to have a csv module written in Python (no
> C underpinnings) for a number of reasons:
> 
>     * It will probably be easier to add Unicode support to a Python version
> 
>     * More people will be able to read/grok/modify/fix bugs in a Python
>       implementation than in the current mixed Python/C implementation.
> 
>     * With alternative implementations of Python available (PyPy,
>       IronPython, Jython) it makes sense to have a Python version they can
>       use.
> 
> I'm far from having anything which will pass the current test suite, but in
> diagnosing some of my current failures I noticed a couple test cases which
> seem wrong.  In the TestDialectExcel class I see these two questionable
> tests:
> 
>     def test_quotes_and_more(self):
>         self.readerAssertEqual('"a"b', [['ab']])
> 
>     def test_quote_and_quote(self):
>         self.readerAssertEqual('"a" "b"', [['a "b"']])
> 
> It seems to me that if a field starts with a quote it *has* to be a quoted
> field.  Any quotes appearing within a quoted field have to be escaped and
> the field has to end with a quote.  Both of these test cases fail on one or the
> other assumption.  If they are indeed both correct and I'm just looking at
> things crosseyed I think they at least deserve comments explaining why they
> are correct.
> 
> Both test cases date from the first checkin.  I performed the checkin
> because, of the group developing the module, I believe I was the only one
> with checkin privileges at the time, not because I wrote the test cases.
> 
> Any ideas about why these test cases are in there?  I can't imagine Excel
> generating either one.
> 

Hi Skip,

'"a"b' can't be produced by applying minimalist CSV writing rules to 
'ab'. A non-minimalist writer could produce '"ab"', but certainly not 
'"a"b'.
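
A quick way to check the writer side, as a sketch only: it assumes the csv
module shipped with a recent Python 3 and uses io.StringIO as the output
buffer. write_row is a hypothetical helper, not something from the test suite.

```python
import csv
import io


def write_row(row, quoting):
    """Write one row with the default (Excel) dialect and return the raw text."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting).writerow(row)
    return buf.getvalue()


# Minimal quoting leaves a plain field untouched.
minimal = write_row(['ab'], csv.QUOTE_MINIMAL)

# A non-minimal writer wraps the whole field in quotes.
quote_all = write_row(['ab'], csv.QUOTE_ALL)

print(repr(minimal))
print(repr(quote_all))
```

Neither quoting mode yields '"a"b' for the field 'ab'.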

The second case is worse -- it's inconsistent; the reader is supposed to 
remove the quotes from "a" but not from "b"???

IMHO these test cases are *WRONG* and it's a worry that they "work" with 
the current csv module :-(
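
For anyone who wants to reproduce this, here is a small sketch, assuming the
csv module in a recent Python 3 and its "strict" dialect parameter. parse and
rejected are hypothetical helper names used only for illustration.

```python
import csv


def parse(line, strict=False):
    """Parse a single line with the default (Excel) dialect."""
    return list(csv.reader([line], strict=strict))


def rejected(line):
    """Return True if the reader refuses the line when strict=True."""
    try:
        parse(line, strict=True)
    except csv.Error:
        return True
    return False


# Lenient (default) parsing accepts both questionable inputs.
lenient_1 = parse('"a"b')
lenient_2 = parse('"a" "b"')

print(lenient_1, lenient_2)
print(rejected('"a"b'), rejected('"a" "b"'))
```

With the default settings both inputs are accepted, matching the two test
cases; with strict=True the reader raises csv.Error for each of them.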

Regards,

John


