[Python-Dev] beta1 coming real soon

"Martin v. Löwis" martin at v.loewis.de
Tue Jun 13 20:33:59 CEST 2006

Walter Dörwald wrote:
> And passing MB_ERR_INVALID_CHARS in a call to MultiByteToWideChar()
> doesn't help either, because AFAICT there's no information about the
> error location. What could work would be to try MultiByteToWideChar()
> with various string lengths to try to determine whether the error is due
> to an incomplete byte sequence or invalid data. But that sounds ugly and
> slow to me.

That's all true, yes.
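
For illustration only, the probing idea might look roughly like the
sketch below. It is portable C, not Windows code: try_decode() is a
hypothetical stand-in for a MultiByteToWideChar() call with
MB_ERR_INVALID_CHARS that reports only success or failure, and the toy
rules assume the lead byte ranges 0x81-0x9F and 0xE0-0xFC of code page
932. The names probe(), try_decode() and toy_is_lead() are made up for
this example.

```c
#include <stddef.h>

enum probe_result { PROBE_OK, PROBE_INCOMPLETE, PROBE_INVALID };

/* Assumption: lead-byte ranges of code page 932 (0x81-0x9F, 0xE0-0xFC). */
static int toy_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical stand-in for MultiByteToWideChar() with MB_ERR_INVALID_CHARS:
   reports only success (1) or failure (0), never where the error is.
   Simplified rules: a lead byte must be followed by a trail byte in
   0x40-0xFC other than 0x7F; every other byte is a one-byte character. */
static int try_decode(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        if (toy_is_lead(s[i])) {
            if (i + 1 >= n)
                return 0;               /* lead byte at end of input */
            if (s[i + 1] < 0x40 || s[i + 1] > 0xFC || s[i + 1] == 0x7F)
                return 0;               /* bad trail byte */
            i += 2;
        } else {
            i += 1;
        }
    }
    return 1;
}

/* Decode the whole buffer; on failure, retry with up to max_char_len - 1
   bytes trimmed from the end. If a shorter prefix decodes, guess that the
   tail is an incomplete sequence; otherwise call the data invalid. */
static enum probe_result probe(const unsigned char *s, size_t n,
                               size_t max_char_len)
{
    size_t cut;
    if (try_decode(s, n))
        return PROBE_OK;
    for (cut = 1; cut < max_char_len && cut <= n; cut++) {
        if (try_decode(s, n - cut))
            return PROBE_INCOMPLETE;
    }
    return PROBE_INVALID;
}
```

Every failed input costs up to max_char_len decode attempts over nearly
the whole buffer, and the result is still only a guess, which is the
"ugly and slow" part.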

>> but can't possibly work for ISO-2022.
> So does that mean that IsDBCSLeadByte() returns garbage in this case?

IsDBCSLeadByteEx is documented to only validate lead bytes for selected
code pages; MSDN versions differ in what these code pages are. The
current online version says

"This function validates leading byte values only in the following code
pages: 932, 936, 949, 950, and 1361."

whereas my January 2006 MSDN (DVD version) says

"IsDBCSLeadByteEx does not validate any lead byte in multi-byte
character set (MBCS) code pages, for example, code pages 52696, 54936,
51949 and 5022x."

Whether this limitation also applies to IsDBCSLeadByte, I cannot tell:
- maybe they forgot to document the limitation there as well
- maybe you can't use one of the unsupported code pages as CP_ACP,
  so the problem cannot occur
- maybe IsDBCSLeadByte does indeed work correctly in these cases, when
  IsDBCSLeadByteEx doesn't

The last of these is difficult to believe, though, as IsDBCSLeadByte is
likely implemented as

BOOL IsDBCSLeadByte(BYTE TestChar)
{
  return IsDBCSLeadByteEx(GetACP(), TestChar);
}
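
As a rough sketch of why a trustworthy lead-byte test matters here: with
one, a decoder can find how much of a buffer consists of complete
characters and hold back a dangling lead byte for the next call. The
code below is portable and hypothetical. The predicate argument stands
in for IsDBCSLeadByteEx() bound to one code page, sketch_is_lead_932()
assumes the code page 932 lead byte ranges 0x81-0x9F and 0xE0-0xFC, and
complete_prefix_len() is a made-up name. It only applies to pure
double-byte code pages, not to stateful encodings such as ISO-2022.

```c
#include <stddef.h>

/* Stand-in for a per-code-page lead-byte test. */
typedef int (*lead_byte_fn)(unsigned char);

/* Assumption: lead-byte ranges of code page 932. */
static int sketch_is_lead_932(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Return the number of leading bytes of s that form complete characters.
   The scan must start at the beginning of the buffer: a trail byte can
   have the same value as a lead byte, so looking only at the last byte
   would misjudge buffers that end in such a trail byte. */
static size_t complete_prefix_len(const unsigned char *s, size_t n,
                                  lead_byte_fn is_lead)
{
    size_t i = 0;
    while (i < n) {
        size_t step = is_lead(s[i]) ? 2 : 1;
        if (i + step > n)
            break;                      /* lead byte without its trail byte */
        i += step;
    }
    return i;
}
```

If the returned length is n - 1, the final byte is a lead byte that is
still waiting for its trail byte; whether the bytes before it are valid
is a separate question that this scan does not answer.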

