[IronPython] Problems with 8-bit strings
Dino Viehland
dinov at exchange.microsoft.com
Sat Nov 24 21:54:30 CET 2007
I think the failure when constructing strings with characters > 0x80 is a bug. But returning unicode vs. 8-bit strings is just a display issue: IronPython only has Unicode strings, which is one of the big differences between CPython and IronPython. If you haven't already opened a bug, I'll open one when I'm back after Thanksgiving.
________________________________________
From: users-bounces at lists.ironpython.com [users-bounces at lists.ironpython.com] On Behalf Of Patrick Dubroy [pdubroy at gmail.com]
Sent: Wednesday, November 21, 2007 12:18 PM
To: users at lists.ironpython.com
Subject: [IronPython] Problems with 8-bit strings
Hi,
I've noticed some weird behaviour with 8-bit strings in the latest
version of IronPython (2.0A6):
IronPython console: IronPython 2.0A6 (2.0.11102.00) on .NET 2.0.50727.1378
Copyright (c) Microsoft Corporation. All rights reserved.
>>> str("\x7e")
'~'
>>> str("\x7f")
u'\x7f'
>>> str("\x80")
u'\x80'
>>> str("\x81")
Traceback (most recent call last):
File , line 0, in ##23
File mscorlib, line unknown, in GetString
File mscorlib, line unknown, in GetChars
File mscorlib, line unknown, in Fallback
File mscorlib, line unknown, in Throw
UnicodeDecodeError: Unable to translate bytes [81] at index 0 from
specified code page to Unicode.
The first problem is that if the string contains character 127 (0x7F)
or 128 (0x80), str() returns a Unicode string rather than an 8-bit
string, whereas CPython returns a standard 8-bit string in both
cases. The second is that if the string contains any byte greater
than 128, str() throws an exception, whereas CPython is happy to have
bytes up to 0xFF in an 8-bit string.
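For anyone who wants to see the general byte-vs-Unicode distinction in isolation, here is a rough sketch in Python 3 terms (where bytes and str are separate types). It is not a reproduction of what IronPython does internally; the helper names roundtrip_latin1 and try_ascii are made up for illustration. It only shows that a byte string can hold every value up to 0xFF, and that whether those bytes can be turned into Unicode depends on the codec used.

```python
# Illustrative sketch only (Python 3); helper names are hypothetical
# and this does not model IronPython's actual conversion code.

def roundtrip_latin1(data: bytes) -> bytes:
    """Decode with latin-1 (byte N -> code point N), then encode back."""
    return data.decode("latin-1").encode("latin-1")

def try_ascii(data: bytes):
    """Return the decoded text, or None if any byte is >= 0x80."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return None

# A byte string can hold all 256 values, and latin-1 round-trips them all.
all_bytes = bytes(range(256))
assert roundtrip_latin1(all_bytes) == all_bytes

print(repr(try_ascii(b"\x7e")))  # '~'
print(repr(try_ascii(b"\x7f")))  # '\x7f'
print(repr(try_ascii(b"\x81")))  # None (a strict codec rejects this byte)
```

A codec that maps every byte value behaves like the latin-1 case; a codec with unmapped byte values raises a decode error for them, as the ascii case does here.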
Is this a known issue? Should I open a bug?
Pat