[Python-Dev] a suggestion ... Re: PEP 383 (again)

"Martin v. Löwis" martin at v.loewis.de
Thu Apr 30 18:21:03 CEST 2009

>> If I pass a string with an embedded U+0000 to gtk, gtk will truncate
>> the string, and stop rendering it at this character. This is worse than
>> what it does for invalid UTF-8 sequences. Chances are fairly high that
>> other C libraries will fail in the same way, in particular if they
>> expect char* (which is very common in C).
> Hmm.  I believe the intended failure mode here, for PyGTK at least, is
> actually this:
>    TypeError: GtkLabel.set_text() argument 1 must be string without null bytes, not unicode

It may also depend on the widget; I tried it with wxMessageDialog
(I only had the wx example available, and am using wxgtk).

> APIs in PyGTK which accept NULLs and silently truncate are probably
> broken.  Although perhaps I've just made your point even more strongly;
> one because the behavior is inconsistent, and two because it sometimes
> raises an exception if a NULL is present, and apparently the goal here
> is to prevent exceptions from being raised anywhere in the process.

Indeed so.
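
To make the two failure modes concrete, here is a minimal sketch in
Python 3. It is not what gtk or wx actually do internally; it only uses
ctypes to mimic a C library that reads a char* up to the first NUL, and
the sample strings are made up for illustration.

```python
import ctypes

# A string with an embedded U+0000, encoded the way it would be handed
# to a C library expecting a NUL-terminated char*.
text = "abc\u0000def"
encoded = text.encode("utf-8")          # b'abc\x00def'

# ctypes copies all the bytes into the buffer, but .value reads it the
# way C string functions do: up to the first NUL byte.
buf = ctypes.create_string_buffer(encoded)
c_view = buf.value                      # everything after the NUL is lost

# A half (lone) surrogate cannot be encoded as strict UTF-8 at all, so
# the problem shows up as an exception rather than silent truncation.
half_surrogate = "abc\udcffdef"
try:
    half_surrogate.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

With the embedded NUL the data is silently cut short; with the half
surrogate the caller at least gets told that something is wrong.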

> For this idiom to be of any use to GTK programs,
> gtk.FileChooser.get_filename() will probably need to be changed, since
> (in py2) it currently returns a str, not unicode.

Perhaps - the entire PEP is about Python 3 only. I don't know whether
PyGTK already works with 3.x.

> The PEP should say something about how GUI libraries should handle file
> choosers, so that they'll be consistent and compatible with the standard
> library.  Perhaps only that file choosers need to take this PEP into
> account, and the rest is obvious.  Or maybe the right thing for GTK to
> do would be to continue to use bytes on POSIX and convert to text on
> Windows, since open(), listdir() et al. will continue to accept bytes
> for filenames?

In Python 3, the file chooser should definitely return strings, and it
would be good if they were PEP 383 compliant.
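
As a rough illustration of what "PEP 383 compliant" would mean for a
string coming out of a file chooser: the undecodable bytes of a file name
are carried as half surrogates, and the original bytes can be recovered
from the string. The sketch below assumes the error handler is available
under the name "surrogateescape" and that the file system encoding is
UTF-8; the file name itself is a made-up example.

```python
# A file name as the OS might report it: Latin-1 encoded bytes, which
# are not valid UTF-8 (hypothetical example name).
raw_name = b"caf\xe9.txt"

# Decode the way PEP 383 proposes: the undecodable byte 0xE9 becomes
# the half surrogate U+DCE9 instead of raising an exception.
name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same error handler restores the original bytes, so
# the string can be passed back to open(), listdir() and friends.
round_tripped = name.encode("utf-8", "surrogateescape")
```

A file chooser returning `name` rather than `raw_name` would then
interoperate with the standard library without losing the ability to
open the file.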

>> So I prefer the half surrogate because its failure mode is better th
> Heh heh heh.

And it wasn't even intentional :-)

