Georg Brandl wrote:
>> >>> b = (codecs.BOM_UTF8 + "hello").decode("utf-8")
>> >>> len(a)
>> 5
>
> This behavior is questionable...

Indeed. Try

py> b = (codecs.BOM_UTF8 + "hello").decode("utf-8-sig")
py> len(b)
5

instead.

Regards,
Martin
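
For comparison, here is a minimal sketch of the two codecs side by side. It assumes Python 3, where the input has to be a bytes literal (the session quoted above uses a Python 2 str); the variable names are only illustrative.

```python
import codecs

# Assumption: Python 3, so the payload is bytes rather than a Python 2 str.
data = codecs.BOM_UTF8 + b"hello"

# The plain "utf-8" codec keeps the BOM as a leading U+FEFF character.
with_bom = data.decode("utf-8")

# The "utf-8-sig" codec strips a leading BOM while decoding.
without_bom = data.decode("utf-8-sig")

print(len(with_bom))     # 6
print(len(without_bom))  # 5

# "utf-8-sig" also accepts input that has no BOM at all.
print(b"hello".decode("utf-8-sig"))  # hello
```

Since "utf-8-sig" decodes input with or without a BOM, it is usually a safe choice when reading files that may have been written by tools which prepend one.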