remove BOM from string read from utf-8 file
Piet van Oostrum
piet at cs.uu.nl
Fri Feb 27 18:37:38 EST 2004
>>>>> "Achim Domma" <domma at procoders.net> (AD) wrote:
AD> "Piet van Oostrum" <piet at cs.uu.nl> wrote in message
AD> news:wzoerkinig.fsf at Ordesa.local...
>> Check text[0] and len(text) to verify.
AD> That's what I did. The file contains 24 Chinese characters and len(text) is
AD> 25. And 0xef is the hex code for the BOM if I'm not completely wrong.
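Sorry, I was wrong.
0xef is only the first byte of the UTF-8 encoded BOM (the bytes 0xef 0xbb 0xbf).
Once the file has been decoded to unicode, the BOM is the single character
u'\ufeff', which explains the extra character in len(text).
You have to check for text.startswith(u'\ufeff') and drop that first character.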
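
Something along these lines should do it. This is only a minimal sketch: the
helper name strip_bom and the two sample characters are made up for
illustration, and the file is simulated with an in-memory byte string rather
than a real file.

```python
import codecs

BOM = u'\ufeff'  # the byte order mark as a decoded character


def strip_bom(text):
    """Return text without a leading U+FEFF, if there is one.

    Hypothetical helper, not part of any library.
    """
    if text.startswith(BOM):
        return text[len(BOM):]
    return text


# Simulate the contents of a UTF-8 file that starts with a BOM.
sample = u'\u4e2d\u6587'                      # two Chinese characters (made-up data)
raw = codecs.BOM_UTF8 + sample.encode('utf-8')

text = raw.decode('utf-8')                    # the plain utf-8 codec keeps the BOM
cleaned = strip_bom(text)

print(len(text))     # one more than the number of real characters
print(len(cleaned))  # the number of real characters
```

On Python versions that ship the 'utf-8-sig' codec, decoding with that codec
instead of 'utf-8' drops a leading BOM for you, so the helper is not needed.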
--
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP]
Private email: P.van.Oostrum at hccnet.nl