[Python-3000] string API growth [was: Re: PEP 3138- String representation in Python 3000]
Jim Jewett
jimjjewett at gmail.com
Wed May 14 19:45:10 CEST 2008
On 5/14/08, Georg Brandl <g.brandl at gmx.net> wrote:
> M.-A. Lemburg schrieb:
>>> On Fri, May 9, 2008 at 3:54 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>> On 2008-05-08 22:55, Terry Reedy wrote:
>>>>> Functions that map unicode->unicode or bytes->bytes could be called
>>>>> transcoders.
bytes->bytes might be, but for many mappings (and all unicode->unicode
mappings) they are general transformers.
If you care about the concrete representation, then you aren't really
dealing with unicode anymore; you're dealing with the ByteString.
>>>> Are you suggesting to have two separate methods which then
>>>> allow same-type-conversions ?
>>>> ... have to map naturally to the codec method encode and
>>>> decode
For str->str or bytes->bytes, how do you decide which direction is
"en"coding vs "de"coding?
> > How about these:
> > str.str_encode() -> str
> > str.str_decode() -> str
> > bytes.bytes_encode() -> bytes
> > bytes.bytes_decode() -> bytes
> What about transform/untransform?
Maybe I'm missing something, but it seems to me that there are only a
few logical combinations; if the below is wrong, maybe that is one
reason unicode seems more complex than it should be.
Encoding:  str -> ByteString
    (staticmethod)  ByteString.encode(my_string, encoding=?)
    ==
    my_string.encode(encoding=?)
Decoding:  ByteString -> str
    my_bytes.decode(encoding=?)
    ==
    (staticmethod)  str.decode(my_bytes, encoding=?)
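For comparison, here is a minimal sketch of how these two pairs look in Python 3 as eventually released, where the type constructors play roughly the role of the static methods above. The sample string and variable names are mine, chosen only for illustration:

```python
my_string = "na\u00efve caf\u00e9"   # 'naïve café'

# Encoding: str -> bytes, method form and constructor form.
as_method = my_string.encode("utf-8")
as_constructor = bytes(my_string, "utf-8")

# Decoding: bytes -> str, method form and constructor form.
back_method = as_method.decode("utf-8")
back_constructor = str(as_method, "utf-8")

print(as_method == as_constructor)
print(back_method == back_constructor == my_string)
```

In both directions the type changes, which is what distinguishes encoding and decoding from the same-type cases below.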
General Transforming:
    # Why insist on type-preservation?
    # Why even make these methods?
    my_string.transform(fn) == fn(my_string)
    my_bytes.transform(fn) == fn(my_bytes)
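A rough sketch of why a dedicated method may add little: a free function named transform (hypothetical, standing in for the proposed method) is just function application, and nothing about it requires the result to have the same type as the input. The example callables come from the standard library; the sample values are made up:

```python
import unicodedata
import zlib
from functools import partial


def transform(value, fn):
    """Hypothetical stand-in for value.transform(fn): plain application."""
    return fn(value)


# str -> str: compose 'e' + COMBINING ACUTE ACCENT into a single code point.
nfc = partial(unicodedata.normalize, "NFC")
composed = transform("e\u0301", nfc)

# bytes -> bytes: compression is a transform with an obvious inverse.
payload = b"spam" * 100
packed = transform(payload, zlib.compress)
unpacked = transform(packed, zlib.decompress)

# str -> int: nothing here enforces type preservation.
length = transform("spam", len)
```

If type preservation matters, it has to come from the choice of fn, not from the transform call itself.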
Transcoding:  ByteString -> ByteString
    # If you care how it is represented, it is no longer unicode;
    # it is a specific (ByteString) representation
    my_bytes.recode(old_encoding=?, new_encoding=?)
    # Can the old encoding often be inferred?
    # Or should it always be written because of EIBTI?
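As a minimal sketch, recode can be written as decode-then-encode. The function below is hypothetical (no such method exists on bytes), and the sample data is mine. The second half hints at why inferring the old encoding is risky: the same byte is valid text under more than one 8-bit encoding.

```python
def recode(data: bytes, old_encoding: str, new_encoding: str) -> bytes:
    """Hypothetical stand-in for the proposed my_bytes.recode(...)."""
    # Transcoding goes through str, the representation-free text.
    return data.decode(old_encoding).encode(new_encoding)


latin1_bytes = "caf\u00e9".encode("latin-1")            # b'caf\xe9'
utf8_bytes = recode(latin1_bytes, "latin-1", "utf-8")   # b'caf\xc3\xa9'

# One byte, two plausible readings: the bytes alone do not say which
# encoding produced them.
as_latin1 = b"\xe9".decode("latin-1")   # LATIN SMALL LETTER E WITH ACUTE
as_cp1251 = b"\xe9".decode("cp1251")    # CYRILLIC SMALL LETTER SHORT I
```

That suggests the old encoding should usually be spelled out, in the spirit of "explicit is better than implicit".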
-jJ