Unicode BOM marks
shoot at the.moon
Mon Mar 14 01:19:07 CET 2005
Martin v. Löwis wrote:
> Steve Horsley wrote:
>> It is my understanding that the BOM (U+FEFF) is actually the Unicode
>> character "ZERO WIDTH NO-BREAK SPACE".
> My understanding is that this used to be the case. According to
> the application should now specify specific processing, and simply
> dropping it or reporting an error are both acceptable behaviour.
> Applications that need the ZWNBSP behaviour (i.e. want to indicate that
> there should be no break at this point) should use U+2060 (WORD JOINER).
I'm out of date, then. Thanks for the link.
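
For anyone who wants to see the behaviours discussed above, here is a minimal sketch. It assumes a current Python 3, where the `utf-8-sig` codec is available (as far as I know it postdates this thread), and `strip_leading_bom` is just a hypothetical helper name of mine, not anything from the standard library.

```python
import codecs
import unicodedata

BOM = "\ufeff"          # ZERO WIDTH NO-BREAK SPACE, used as the byte order mark
WORD_JOINER = "\u2060"  # the suggested replacement for the "no break here" use


def strip_leading_bom(text):
    """Hypothetical helper: drop a single leading U+FEFF if present."""
    return text[1:] if text.startswith(BOM) else text


# UTF-8 bytes that start with the encoded BOM.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

kept = raw.decode("utf-8")         # plain utf-8 leaves U+FEFF in the text
dropped = raw.decode("utf-8-sig")  # utf-8-sig removes a leading BOM

print(repr(kept))
print(repr(dropped))
print(repr(strip_leading_bom(kept)))
print(unicodedata.name(BOM))
print(unicodedata.name(WORD_JOINER))
```

Simply dropping the mark is only one of the acceptable behaviours mentioned above; an application could equally well report an error when it finds U+FEFF where it does not expect one.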