[Python-3000] str/unicode tests: pyexpat.c and read(n)

Joe Gregorio joe at bitworking.org
Sat Jul 21 06:12:51 CEST 2007


Should xml.parsers.expat.XMLParser.ParseFile(file) operate on
both text and binary streams?

If it should operate on text streams, then an issue arises
because "read(n)" means different things for text and binary
streams. If the stream passed in is a text stream, read(n) reads
up to 'n' unicode characters, but pyexpat.c allocates a buffer
of 2048 bytes and calls read(2048), which can return more than
2048 bytes once those characters are encoded.
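
To make the mismatch concrete, here is a small sketch using in-memory
streams. It assumes the text is handed on as UTF-8 (my assumption, not
something pyexpat.c is confirmed to do), and the names BUF_SIZE,
text_stream and binary_stream are only for illustration.

```python
import io

BUF_SIZE = 2048  # the buffer size pyexpat.c is described as allocating

# U+1D11E (musical G clef) needs 4 bytes in UTF-8, so it is a worst case.
text = "\U0001D11E" * 3000

binary_stream = io.BytesIO(text.encode("utf-8"))
text_stream = io.StringIO(text)

# On a binary stream, read(n) returns at most n bytes.
chunk_bytes = binary_stream.read(BUF_SIZE)

# On a text stream, read(n) returns at most n characters.
chunk_text = text_stream.read(BUF_SIZE)

print(len(chunk_bytes))                 # 2048 bytes
print(len(chunk_text))                  # 2048 characters
print(len(chunk_text.encode("utf-8")))  # 8192 bytes, 4x the buffer
```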

The simplest solution in the case of a text stream is to be
safe and convert that call into read(2048/4), i.e. read(512),
to accommodate the worst case of four bytes per character.
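
A rough sketch of that idea in Python, not the actual pyexpat.c change:
read_chunk is a hypothetical helper, and it assumes the characters are
encoded as UTF-8 (at most 4 bytes each) before reaching the parser.

```python
import io

BUF_SIZE = 2048              # size of the byte buffer to fill
MAX_UTF8_BYTES_PER_CHAR = 4  # assumption: text is encoded as UTF-8


def read_chunk(stream, buf_size=BUF_SIZE):
    """Hypothetical helper: return at most buf_size bytes from stream."""
    if isinstance(stream, io.TextIOBase):
        # Text stream: n counts characters, so ask for a quarter of the
        # buffer size and encode the result to bytes.
        data = stream.read(buf_size // MAX_UTF8_BYTES_PER_CHAR)
        return data.encode("utf-8")
    # Binary stream: n already counts bytes.
    return stream.read(buf_size)


worst_case = io.StringIO("\U0001D11E" * 3000)  # 4-byte characters only
chunk = read_chunk(worst_case)
print(len(chunk))  # 2048, never more than the buffer
```

The cost is that plain ASCII text fills only a quarter of the buffer
per call, which is the price of being safe.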

Has this come up before and is there a better solution?

   Thanks,
   -joe



On 7/20/07, Guido van Rossum <guido at python.org> wrote:
> Thanks to all who helped fixing tests in the str/unicode branch! We're
> down to about 35 failing tests. I still need help -- especially since
> we're now getting into territory that I don't know all that well, for
> example the email package or XML support.
>
> The list of unit tests that need help is still on the wiki:
> http://wiki.python.org/moin/Py3kStrUniTests
>
> Instructions on how to help and how to avoid duplicate work are also
> there. Please help!
>
> Thanks to all those who already fixed one or more tests!
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)


-- 
Joe Gregorio        http://bitworking.org

