Stephen J. Turnbull wrote:
Of course it must be supported. My point is that many strings (in my applications, all but those strings that result from slurping in a file or process output in one go -- example, not a statistically valid sample!) are not the beginning of "what once was a stream". It is error-prone (not to mention unaesthetic) to not make that distinction.
"Explicit is better than implicit."
I can't put these two paragraphs together. If you think that explicit is better than implicit, why do you not want to make different calls for the first chunk of a stream, and the subsequent chunks?
import cStringIO, codecs
s = cStringIO.StringIO()
s1 = codecs.getwriter("utf-8")(s)
s1.write(u"Hallo")
s.getvalue()
Yes! Exactly (except in reverse, we want to _read_ from the slurped stream-as-string, not write to one)! ... and there's no need for a utf-8-sig codec for strings, since you can support the usage in exactly this way.
However, if there is a utf-8-sig codec for streams, there is currently no way of *preventing* this codec from also being available for strings. The very same code is used for streams and for strings, and automatically so.
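To illustrate that point, here is a minimal sketch, assuming Python 3 and the utf-8-sig codec as it ships in the current standard library (the snippet quoted earlier uses Python 2's cStringIO; io.BytesIO stands in for it here). The variable names are made up for the example. Looking the codec up by name gives both a stream reader and a one-shot decode, with nothing that would let you have one without the other:

```python
import codecs
import io

# Bytes as they might be slurped from a file that starts with a UTF-8 signature.
data = codecs.BOM_UTF8 + "Hallo".encode("utf-8")

# Stream usage: wrap a byte stream in the codec's StreamReader.
reader = codecs.getreader("utf-8-sig")(io.BytesIO(data))
from_stream = reader.read()

# String usage: the same codec name works for a one-shot decode.
from_string = data.decode("utf-8-sig")

# For comparison, plain utf-8 leaves the signature in place as U+FEFF.
plain = data.decode("utf-8")

# Writing through the stream writer emits the signature once, at the start.
out = io.BytesIO()
writer = codecs.getwriter("utf-8-sig")(out)
writer.write("Hal")
writer.write("lo")
written = out.getvalue()
```

Both from_stream and from_string come back without the signature, while plain keeps it as a leading U+FEFF, so the choice between the two behaviours is made by naming the codec, not by whether the input is a stream or a string.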