[Patches] [Patch #101664] Add new unistr() builtin + PyObject_Unicode() C API

M.-A. Lemburg mal@lemburg.com
Thu, 18 Jan 2001 10:52:12 +0100


Ka-Ping Yee wrote:
> 
> On Wed, 17 Jan 2001 noreply@sourceforge.net wrote:
> > Comment:
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> 
> Sorry for barging in, but i have an issue/question:
> 
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

unistr() is meant to complement str() very closely. unicode()
works as a constructor for Unicode objects and can also take
care of decoding encoded data. str() and unistr() don't provide
this capability; they always assume the default encoding.

There's also a subtle difference: str() and unistr() try the
tp_str slot, which unicode() doesn't, while unicode() supports
any character buffer, which str() and unistr() don't.
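
To make the distinction concrete, here is a rough model of the two
behaviours. It is written for a current Python 3 interpreter and is not
the patch itself: the names unicode_sketch(), unistr_sketch() and
DEFAULT_ENCODING are made up for illustration, and "ascii" is only an
assumed value for the default encoding.

```python
# Illustrative model only; these helpers are hypothetical and do not
# reproduce the patch or the interpreter's real implementation.

DEFAULT_ENCODING = "ascii"  # assumed default encoding


def unicode_sketch(data, encoding=DEFAULT_ENCODING, errors="strict"):
    """Constructor style: decode a character buffer, never call __str__."""
    if isinstance(data, str):
        return data
    try:
        buf = memoryview(data)  # accepts bytes, bytearray, array, ...
    except TypeError:
        raise TypeError("unicode_sketch() needs a character buffer") from None
    return buf.tobytes().decode(encoding, errors)


def unistr_sketch(obj):
    """str() style: go through the object's string conversion (tp_str)."""
    if isinstance(obj, str):
        return obj
    if isinstance(obj, bytes):
        # Plain byte strings are decoded with the default encoding only.
        return obj.decode(DEFAULT_ENCODING)
    # Everything else, including other buffers, is converted via __str__.
    return str(obj)
```

With these definitions, unicode_sketch(b"caf\xc3\xa9", "utf-8") decodes
the bytes using the given encoding, while unistr_sketch(42) returns "42"
by way of the object's string conversion. A bytearray shows the buffer
difference: unicode_sketch(bytearray(b"abc")) gives "abc", whereas
unistr_sketch(bytearray(b"abc")) gives "bytearray(b'abc')".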

Perhaps you are right, though, that we should make all three
APIs behave in the same way with respect to coercing their
arguments. This could hide some errors, but in the long run
I agree that the existing setup probably causes more confusion
than it is worth.

Guido?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/