[IronPython] Problems with 8-bit strings

Dino Viehland dinov at exchange.microsoft.com
Sat Nov 24 21:54:30 CET 2007

I think the failure when constructing a string with characters > 0x80 is a bug.  But the returning of unicode vs. 8-bit strings is just a display issue: IronPython only has unicode strings, which is one of the big differences between CPython and IronPython.  If you haven't already opened a bug, I'll open one when I'm back after Thanksgiving.

From: users-bounces at lists.ironpython.com [users-bounces at lists.ironpython.com] On Behalf Of Patrick Dubroy [pdubroy at gmail.com]
Sent: Wednesday, November 21, 2007 12:18 PM
To: users at lists.ironpython.com
Subject: [IronPython] Problems with 8-bit strings


In the latest version of IronPython (2.0A6), I noticed some weird
behaviour with 8-bit strings:

    IronPython console: IronPython 2.0A6 (2.0.11102.00) on .NET 2.0.50727.1378
    Copyright (c) Microsoft Corporation. All rights reserved.
    >>> str("\x7e")
    >>> str("\x7f")
    >>> str("\x80")
    >>> str("\x81")
    Traceback (most recent call last):
      File , line 0, in ##23
      File mscorlib, line unknown, in GetString
      File mscorlib, line unknown, in GetChars
      File mscorlib, line unknown, in Fallback
      File mscorlib, line unknown, in Throw
    UnicodeDecodeError: Unable to translate bytes [81] at index 0 from specified code page to Unicode.

The first problem is that if the string contains character 127 (0x7F)
or 128 (0x80), str() returns a Unicode string rather than an 8-bit
string, whereas CPython returns a standard 8-bit string in both
cases. The second is that if the string contains any byte greater
than 128 (0x80), str() throws an exception, whereas CPython is happy
to have bytes up to 0xFF in an 8-bit string.
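
For anyone reproducing this, the distinction at stake can be sketched
roughly as follows. This is written in Python 3 terms (an assumption,
since the session above predates it) and is not IronPython's actual
code path; try_decode is a hypothetical helper. It only illustrates
why a raw byte such as 0x81 is fine as 8-bit data but can fail once
something tries to decode it with a restrictive encoding.

```python
# Illustrative sketch only: Python 3 bytes vs. text, not IronPython internals.

def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

for value in (0x7E, 0x7F, 0x80, 0x81, 0xFF):
    raw = bytes([value])  # a one-byte "8-bit string"
    # ASCII only covers 0x00-0x7F; latin-1 maps every byte 0x00-0xFF.
    print(hex(value),
          repr(try_decode(raw, "ascii")),
          repr(try_decode(raw, "latin-1")))
```

The raw bytes are always representable; whether they can become
Unicode text depends entirely on which encoding is applied.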

Is this a known issue? Should I open a bug?
