[Patches] [ python-Patches-1767398 ] test_csv struni fixes + unicode support in _csv

SourceForge.net noreply at sourceforge.net
Mon Aug 6 21:33:25 CEST 2007


Patches item #1767398, was opened at 2007-08-03 20:11
Message generated for change (Comment added) made by gvanrossum
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1767398&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Python 3000
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Adam Hupp (hupp)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: test_csv struni fixes + unicode support in _csv

Initial Comment:
This patch fixes test_csv.py for the struni branch and modifies _csv.c to support unicode strings.

Changes:

 1. The test_csv.py failures caused by bytes/str conflicts have been resolved.  

 2. Uses of mkstemp have been replaced with TemporaryFile in a 'with' block. 

 3. The _csv.c module now uses unicode for string handling.   I've uncommented the unicode read tests in test_csv.py, and added tests for writing unicode content and a unicode delimiter.

All tests are now passing on my system (Linux).
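
As a rough illustration of changes 2 and 3, the sketch below round-trips unicode rows through a TemporaryFile inside a 'with' block, using a non-ASCII delimiter. It is written against the present-day Python 3 csv and tempfile modules, not the 2007 struni branch, and the helper name roundtrip is hypothetical; it is not taken from the patch or from test_csv.py.

```python
import csv
from tempfile import TemporaryFile


def roundtrip(rows, delimiter=","):
    """Write rows to a temporary file and read them back (hypothetical helper)."""
    # Text mode with newline="" lets the csv module control line endings;
    # the 'with' block closes and removes the file on exit.
    with TemporaryFile("w+", newline="", encoding="utf-8") as fileobj:
        writer = csv.writer(fileobj, delimiter=delimiter)
        writer.writerows(rows)
        fileobj.seek(0)  # rewind before reading back
        return list(csv.reader(fileobj, delimiter=delimiter))


if __name__ == "__main__":
    # Non-ASCII field content and a non-ASCII, single-character delimiter.
    rows = [["caf\u00e9", "\u4e2d\u6587"], ["x", "y"]]
    print(roundtrip(rows, delimiter="\u039b"))
```

Compared with mkstemp, the 'with' form needs no explicit close or unlink, so a failing assertion cannot leave a stray file behind.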

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2007-08-06 15:33

Message:

This looked good enough to submit.
I had to clean up the whitespace use in the C code. Next time, please set
your tabs to 8 spaces when editing C code. Also try to conform to the
surrounding code's use of spaces or tabs (unfortunately this file is
inconsistent and sometimes uses spaces, other times tabs -- that's worth a
separate cleanup).

Committed revision 56777.


----------------------------------------------------------------------

Comment By: Adam Hupp (hupp)
Date: 2007-08-05 12:39

Message:

Skip,
I think the error you're seeing is being caused by a conversion from
Py_UNICODE -> char -> unicode through get_nullchar_as_None.  That function
should look like this:

static PyObject *
get_nullchar_as_None(Py_UNICODE c)
{
        if (c == '\0') {
                Py_INCREF(Py_None);
                return Py_None;
        }
        else
                return PyUnicode_FromUnicode((Py_UNICODE*)&c, 1);
}

Unfortunately I'm on the road right now so I can't test it.

Is there something I need to do with my build to trigger those assertions?
I didn't see them.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2007-08-05 09:07

Message:

Adam,

I've spent some time looking at this patch.  Bear in mind this is my first
foray into Py3k.  Still, I'm confused about what's going on here.  I'm
hoping you can help me understand the changes.  In parse_save_field, you
replaced PyString_FromStringAndSize with PyUnicode_FromUnicode, however in
get_nullchar_as_None you replaced it with PyUnicode_DecodeASCII.

When I execute the csv tests there are a number of assertion errors
related to the default delimiter.  The traceback goes something like this:

FAIL: test_writer_kw_attrs (__main__.Test_Csv)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_csv.py", line 88, in test_writer_kw_attrs
    self._test_kw_attrs(csv.writer, StringIO())
  File "Lib/test/test_csv.py", line 75, in _test_kw_attrs
    self.assertEqual(obj.dialect.delimiter, ':')
AssertionError: s'\x00' != ':'

Any idea how to solve that?  It looks to me like some Unicode buffer might
be getting interpreted as a char *, but I'm not sure.

Skip
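
The suspicion above, that a Unicode buffer is being read through a char *, can be modelled in a few lines of Python. This is only a sketch under assumptions: it treats one Py_UNICODE value as a 2-byte or 4-byte code unit and keeps only its first byte, which is what reading the buffer as a char * would see. The function name is hypothetical, and the thread does not confirm that byte order is what separates the failing build from the passing one.

```python
import struct


def first_byte_of_code_unit(ch, byteorder, width=2):
    """Return the first byte of ch's code unit, as a 1-character string.

    Models reading a wide-character buffer through a char pointer.
    byteorder is "little" or "big"; width is the assumed code unit size.
    """
    fmt = ("<" if byteorder == "little" else ">") + {2: "H", 4: "I"}[width]
    raw = struct.pack(fmt, ord(ch))  # in-memory layout of one code unit
    return raw[:1].decode("latin-1")  # only the first byte is seen


if __name__ == "__main__":
    for order in ("little", "big"):
        print(order, repr(first_byte_of_code_unit(":", order)))
```

With a little-endian layout the first byte of ':' is still ':', while with a big-endian layout it is '\x00', the same value that appears in the failed assertion. Under this model the same source could pass on one machine and fail on another.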


----------------------------------------------------------------------
