I took a look at test_sax, and it seems sax.parser expects all (XML) input as unicode rather than bytes. Apparently ElementTree does the same. Is there any rationale for this decision?

Cheers,
Antoine.