Feb. 16, 2012
1:49 a.m.
Greg Ewing writes:
Maybe there should be a real data type [parallel to str and bytes that mixes str and bytes], or a flag on the unicode type.
-1. This is yesterday's problem. It still hurts today, and we need workarounds. But it's going to be less and less important as time goes on, because nobody can afford one-locale software anymore, and the cheapest way to be multilocale is to process in Unicode and insist on Unicode on input and output. The unknown encoding problem has no generally acceptable solution. That's why Unicode was invented: to "solve" the problem by ensuring it doesn't occur in the first place.
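To make "process in Unicode, and insist on Unicode on input and output" concrete, here is a minimal sketch in Python 3. The function names (read_text, write_text, smuggle_unknown) are hypothetical and chosen only for illustration, and UTF-8 as the boundary encoding is an assumption. The strict functions show the preferred discipline; the last one shows the kind of workaround that is still needed today, using the standard surrogateescape error handler to carry undecodable bytes through a str unchanged.

```python
def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode at the input boundary; reject bytes that are not valid."""
    return raw.decode(encoding, errors="strict")


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode at the output boundary; everything in between stays str."""
    return text.encode(encoding, errors="strict")


def smuggle_unknown(raw: bytes) -> str:
    """Workaround for input of unknown encoding (hypothetical helper).

    Undecodable bytes become lone surrogates, so the original bytes can
    be recovered by encoding with the same error handler.
    """
    return raw.decode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    utf8_input = "café".encode("utf-8")
    latin1_input = "café".encode("latin-1")  # not valid UTF-8

    # Well-formed input round-trips through str without loss.
    print(write_text(read_text(utf8_input)) == utf8_input)

    # Input in some other encoding is refused at the boundary.
    try:
        read_text(latin1_input)
    except UnicodeDecodeError as exc:
        print("rejected:", exc.reason)

    # The workaround preserves the bytes but does not tell us what they mean.
    smuggled = smuggle_unknown(latin1_input)
    print(smuggled.encode("utf-8", errors="surrogateescape") == latin1_input)
```

Note that the workaround only preserves the bytes; it does not recover their meaning, which is the part that has no general solution.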