[Python-Dev] Unicode byte order mark decoding
"Martin v. Löwis"
martin at v.loewis.de
Wed Apr 6 22:22:19 CEST 2005
Stephen J. Turnbull wrote:
> Because the signature/BOM is not a chunk, it's a header. Handling the
> signature/BOM is part of stream initialization, not translation, to my
> mind.
I'm sorry, but I'm losing track as to what precisely you are trying to
say. You seem to be using a mental model that is entirely different
from mine.
> The point is that explicitly using a stream shows that initialization
> (and finalization) matter. The default can be BOM or not, as a
> pragmatic matter. But then the stream data itself can be treated
> homogeneously, as implied by the notion of stream.
But what follows from that point? So it shows some kind of matter...
what does that mean for actual changes to the Python API?
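For concreteness, one possible reading of the quoted point is sketched below. It is an illustration only, not something either poster proposed: the class name BOMStrippingReader and its had_signature attribute are hypothetical, it handles UTF-8 only, and it uses the incremental decoder API of current Python 3 rather than the codecs API as it existed when this was written. The signature is consumed once, when the stream is set up, and everything after that is decoded homogeneously.

```python
import codecs
import io


class BOMStrippingReader:
    """Hypothetical wrapper: checks for a UTF-8 signature once, at setup."""

    def __init__(self, raw):
        self._raw = raw
        # Stream initialization: look at the first bytes exactly once.
        head = raw.read(len(codecs.BOM_UTF8))
        self.had_signature = head == codecs.BOM_UTF8
        # If there was no signature, those bytes are ordinary data.
        self._pending = b"" if self.had_signature else head
        self._decoder = codecs.getincrementaldecoder("utf-8")()

    def read(self, size=-1):
        # Translation: no signature logic here, only decoding.
        data = self._pending + self._raw.read(size)
        self._pending = b""
        return self._decoder.decode(data, final=(size < 0 or not data))


payload = codecs.BOM_UTF8 + "Löwis".encode("utf-8")
reader = BOMStrippingReader(io.BytesIO(payload))
print(reader.had_signature, reader.read())
```

In this sketch the codec itself never sees the signature, so it has nothing to buffer; whether that is the change being suggested is exactly the open question.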
> I think it probably also would solve Walter's conundrum about
> buffering the signature/BOM if responsibility for that were moved out
> of the codecs and into the objects where signatures make sense.
>
> I don't know whether that's really feasible in the short run---I
> suspect there may be a lot of stream-like modules that would need to
> be updated---but it would be saner in the long run.
What is "that" which might be really feasible? To "solve Walter's
conundrum"? That "signatures make sense"?
So I can't really respond to your message in a meaningful way;
I'll just let it rest...
Regards,
Martin