UTF-8 is no fun...
Things are hard to get right when you have to deal with backward *and* forward compatibility, interoperability and user-friendliness all at the same time... but we'll keep trying ;-)
Let me say publicly that I think you have done a fine job, and obviously have put lots of thought and effort into it. If parts of the design turn out to be less than ideal (and are subsequently changed before 1.6 is real) then this will not detract from your excellent work.
Well done!
[And also to Fredrik, whose code was the basis for the Unicode object itself - that was a nice piece of code too!]
Mark
I've spent a fair bit of time converting strings and files the last few days, and I'd add that what we have now seems both rock solid and very easy to use. The remaining issues are entirely a matter of us end users trying to figure out what we should have asked for in the first place. Whether we achieve that finally before 1.6 is our problem; Marc-André and Fredrik have done a great job, and I think we are on track for providing something much more useful and extensible than (say) Java. As proof of this, someone has already contributed Japanese codecs based on the spec. - Andy Robinson
Andy Robinson <andy@reportlab.com> wrote:
I've spent a fair bit of time converting strings and files the last few days, and I'd add that what we have now seems both rock solid and very easy to use.
I'm not worried about the core string types or the conversion machinery; what disturbs me is mostly the use of automagic conversions to UTF-8, which breaks the fundamental assumption that a string is a sequence of len(string) characters.
"The items of a string are characters. There is no separate character type; a character is represented by a string of one item" (from the language reference)
I still think the "all strings are sequences of unicode characters" strawman I posted earlier would simplify things for everyone involved (programmers, users, and the interpreter itself).
more on this later. gotta ship some code first. </F>
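To make the len(string) concern concrete, here is a minimal sketch. It assumes modern Python 3 semantics (separate str and bytes types), not the 1.6 pre-release behaviour discussed in this thread, and the sample word and variable names are purely illustrative. It shows that once text has been converted to UTF-8, the length and the individual items no longer correspond to characters.

```python
# Illustrative sketch in modern Python 3; not the 1.6 pre-release API.
# The sample text and all variable names are hypothetical.
text = "na\u00efve"               # five characters; "\u00ef" is i with diaeresis
encoded = text.encode("utf-8")    # the same text as a UTF-8 byte sequence

char_count = len(text)            # counts characters
byte_count = len(encoded)         # counts bytes; the non-ASCII character takes two

third_char = text[2]              # one item of the string is one character
third_byte = encoded[2:3]         # one item of the bytes is only a lead byte

print(char_count, byte_count)
print(repr(third_char), repr(third_byte))
```

If a UTF-8 byte sequence is treated as "the string", indexing and slicing can cut a character in half, which is the broken assumption described above.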
participants (2)
- Andy Robinson
- Fredrik Lundh