[Csv] These csv test cases seem incorrect to me...

skip at pobox.com
Mon Mar 12 03:01:08 CET 2007


I decided it would be worthwhile to have a csv module written in Python (no
C underpinnings) for a number of reasons:

    * It will probably be easier to add Unicode support to a Python version.

    * More people will be able to read/grok/modify/fix bugs in a Python
      implementation than in the current mixed Python/C implementation.

    * With alternative implementations of Python available (PyPy,
      IronPython, Jython) it makes sense to have a Python version they can
      use.

I'm far from having anything which will pass the current test suite, but in
diagnosing some of my current failures I noticed a couple of test cases
which seem wrong.  In the TestDialectExcel class I see these two
questionable tests:

    def test_quotes_and_more(self):
        self.readerAssertEqual('"a"b', [['ab']])

    def test_quote_and_quote(self):
        self.readerAssertEqual('"a" "b"', [['a "b"']])

It seems to me that if a field starts with a quote it *has* to be a quoted
field.  Any quotes appearing within a quoted field have to be escaped and
the field has to end with a quote.  Both of these test cases fail on one or
the other assumption.  If they are indeed both correct and I'm just looking
at things cross-eyed, I think they at least deserve comments explaining why
they are correct.
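
For anyone who wants to poke at this, here is a minimal sketch that feeds the
two inputs to the csv module in the standard library.  The helper names
(parse_line, is_rejected) are made up for illustration, and the sketch
assumes the reader accepts the "strict" format parameter, which may not be
the case in every version.

```python
import csv

def parse_line(line, strict=False):
    """Parse one line of text with the default (excel) dialect."""
    # csv.reader accepts any iterable of lines, so a one-item list will do.
    return list(csv.reader([line], strict=strict))

def is_rejected(line):
    """Return True if strict parsing raises csv.Error for this line.

    Assumption: the 'strict' format parameter is available in this
    version of the csv module.
    """
    try:
        parse_line(line, strict=True)
    except csv.Error:
        return True
    return False

# The two questionable inputs, parsed leniently (the default).
lenient_more = parse_line('"a"b')
lenient_quote = parse_line('"a" "b"')

# The same inputs under strict parsing, plus a properly escaped field
# ('"a""b"') for comparison.
rejected = [is_rejected(s) for s in ('"a"b', '"a" "b"', '"a""b"')]
```

If the lenient results match what the two test cases expect while strict
parsing rejects both inputs, that would suggest the tests document
deliberately forgiving behavior rather than anything Excel would write.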

Both test cases date from the first checkin.  I performed that checkin not
because I wrote the test cases, but because I believe I was the only one in
the group developing the module with checkin privileges at the time.

Any ideas about why these test cases are in there?  I can't imagine Excel
generating either one.

Thx,

Skip


