<br><br><div><span class="gmail_quote">On 9/12/06, <b class="gmail_sendername">"Martin v. Löwis"</b> &lt;<a href="mailto:firstname.lastname@example.org">email@example.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br>> I can assure you<br>> that most of the documents that I work with are not in CP436 - they are<br>> a combination of ASCII, ISO8859-1, and UTF-8. I would also guess that<br>> this is true of many Windows XP (US-English) users. So, for me and users
<br>> like me, Python is going to silently misinterpret my data.<br><br>No. It will use a different API to determine the system encoding, and<br>it will guess correctly.</blockquote><div><br>If Python reports "cp1252" as I expect it to, then it has not "guessed correctly" for Brian's documents as described above. The mistake will be harmless for the ASCII files and often for the ISO8859-1 files, but would be dangerous for the UTF-8 ones.
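<br><br>A minimal sketch of the three cases, written in present-day Python 3 syntax. The sample string and the variable names are made up for illustration; only the codec names come from the discussion. It decodes ASCII, ISO8859-1, and UTF-8 bytes as cp1252 and checks which ones round-trip:

```python
# Hypothetical sample text containing two non-ASCII characters:
# o-umlaut (U+00F6) and e-acute (U+00E9).
text = "L\u00f6wis caf\u00e9"

# The same kinds of files Brian describes, as raw bytes.
ascii_bytes = "plain ASCII".encode("ascii")
latin1_bytes = text.encode("iso8859-1")
utf8_bytes = text.encode("utf-8")

# ASCII is a subset of cp1252, so decoding it as cp1252 is harmless.
ascii_ok = ascii_bytes.decode("cp1252") == "plain ASCII"

# cp1252 agrees with ISO8859-1 for bytes 0xA0-0xFF, so this text survives.
# (Bytes 0x80-0x9F are where the two encodings differ.)
latin1_ok = latin1_bytes.decode("cp1252") == text

# Each non-ASCII character is two bytes in UTF-8. cp1252 decodes every
# one of those bytes as a separate character, and raises no error here.
utf8_decoded = utf8_bytes.decode("cp1252")
utf8_ok = utf8_decoded == text

print(ascii_ok, latin1_ok, utf8_ok)
print(repr(utf8_decoded))
```

For this sample, the UTF-8 case decodes without any exception, and that is the danger: the program carries on with corrupted text rather than failing.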
<br><br> Paul Prescod<br><br></div></div>