[Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()

M.-A. Lemburg mal@lemburg.com
Thu, 18 Jan 2001 11:51:46 +0100

Ka-Ping Yee wrote:
> On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> >     str() looks for __str__
> Oops.  I forgot that
>       str() looks for __str__, then tries __repr__
> So, presumably,
>       unicode() should look for __unicode__, then __str__, then __repr__

Not quite... str() does this:

1. strings are passed back as-is
2. the type slot tp_str is tried
3. the method __str__ is tried
4. Unicode returns are converted to strings
5. anything other than a string return value is rejected

unistr() does the same, but makes sure that the return
value is a Unicode object.
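
The str() and unistr() steps above can be sketched roughly as follows.
This is only a model written in present-day Python 3, not the C
implementation: bytes stands in for the 8-bit string type and str for
the Unicode type, steps 2 and 3 are collapsed into a single method
lookup, and the names sketch_str, sketch_unistr, DEFAULT_ENCODING and
the three example classes are assumptions made for the illustration.

```python
DEFAULT_ENCODING = "ascii"  # assumption: stands in for the default encoding


def sketch_str(obj):
    """Model of the str() steps; bytes plays the role of the 8-bit string."""
    # Step 1: strings are passed back as-is.
    if isinstance(obj, bytes):
        return obj
    # Steps 2 and 3: the tp_str slot and the __str__ method are modelled
    # together as one lookup on the type.
    result = type(obj).__str__(obj)
    # Step 4: Unicode return values are converted to strings.
    if isinstance(result, str):
        result = result.encode(DEFAULT_ENCODING)
    # Step 5: anything other than a string return value is rejected.
    if not isinstance(result, bytes):
        raise TypeError("__str__ returned non-string (type %s)"
                        % type(result).__name__)
    return result


def sketch_unistr(obj):
    """Same steps, but the return value is always a Unicode object."""
    # Assumption: Unicode input is passed back as-is, mirroring step 1.
    if isinstance(obj, str):
        return obj
    result = obj if isinstance(obj, bytes) else type(obj).__str__(obj)
    # String return values are converted to Unicode instead.
    if isinstance(result, bytes):
        result = result.decode(DEFAULT_ENCODING)
    if not isinstance(result, str):
        raise TypeError("__str__ returned non-string (type %s)"
                        % type(result).__name__)
    return result


# Hypothetical example classes, one per kind of __str__ return value.
class ReturnsBytes:
    def __str__(self):
        return b"abc"


class ReturnsText:
    def __str__(self):
        return "abc"


class ReturnsInt:
    def __str__(self):
        return 42
```

In this model, sketch_str(ReturnsText()) yields a string and
sketch_unistr(ReturnsBytes()) yields a Unicode object, while
ReturnsInt() is rejected with a TypeError by both functions.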

unicode() does the following:

1. for instances, __str__ is called
2. Unicode objects are returned as-is
3. string objects or character buffers are used as basis for decoding
4. decoding is applied to the character buffer and the results
   are returned
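
For comparison, the unicode() steps above might look like this in the
same kind of model. Again this is a sketch in present-day Python 3
under stated assumptions: bytes stands in for the 8-bit string type,
memoryview stands in for the character buffer interface, an "instance"
is approximated as an object whose class defines __str__ in Python
code, and the name sketch_unicode and the ASCII default are
hypothetical.

```python
import types


def sketch_unicode(obj, encoding="ascii", errors="strict"):
    """Model of the unicode() steps; str plays the role of Unicode."""
    # Step 1: for instances, __str__ is called.
    # Assumption: "instance" means the class defines __str__ in Python.
    if isinstance(getattr(type(obj), "__str__", None), types.FunctionType):
        obj = type(obj).__str__(obj)
    # Step 2: Unicode objects are returned as-is.
    if isinstance(obj, str):
        return obj
    # Step 3: string objects or character buffers are the basis for decoding.
    try:
        buffer = memoryview(obj)
    except TypeError:
        raise TypeError("need string or character buffer, got %s"
                        % type(obj).__name__) from None
    # Step 4: decoding is applied to the buffer and the result is returned.
    return buffer.tobytes().decode(encoding, errors)


# Hypothetical example class whose __str__ returns an 8-bit string.
class Greeting:
    def __str__(self):
        return b"hello"
```

Note that, unlike the str() model, nothing here rejects or converts on
the basis of what __str__ returned beyond the buffer check, so an
integer such as 42 fails in step 3 rather than being stringified.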

I think we should perhaps merge the two approaches into one,
applying all of the above in unicode() (and then forget
about unistr()). This might hide some type errors,
but since all other generic constructors behave more or less
in the same way, I think unicode() should too.

Thoughts?

Marc-Andre Lemburg
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/