[Python-Dev] Re: other "magic strings" issues

Alex Martelli aleaxit at yahoo.com
Mon Nov 10 16:51:10 EST 2003


On Monday 10 November 2003 10:12 pm, Marangozov, Vladimir (Vladimir) wrote:
   ...
> Put it another way, it's good to have all string functions being
> attributes to a single well-known object, that object being the
> 'string' module, instead of spreading it all over...  So add the

Not sure anybody wants to "spread it all over", for whatever "it".
str.whatever should be usable where string.whatever is usable
now, so, what would the problem be...?

As for being able to call, when appropriate:

    something.amethod(somestring, whatever)

rather than _having_ to call

    somestring.amethod(whatever)

I _do_ sympathize with this.  str.methodname, being an unbound
method, may NOT be usable quite as freely ("quite as polymorphically",
in OO-speak:-) as string.method was recently.  E.g.:

>>> import string
>>> string.upper(u'ciao')
u'CIAO'
>>> string.upper('ciao')
'CIAO'
>>> str.upper('ciao')
'CIAO'
>>> str.upper(u'ciao')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: descriptor 'upper' requires a 'str' object but received a 'unicode'

in other words, string.upper is currently callable on ANY object which
internally defines an .upper() method, whether that object is a string or
not; str.upper instead does typechecking on its first argument -- you can
only call it on a bona fide instance of str or a subclass, not polymorphically
in the usual Python sense of signature-based polymorphism.
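
The same distinction can be shown without unicode at all.  What follows is
only a minimal sketch: the class Shouty and the helper upper_any are
hypothetical names made up for illustration, with upper_any standing in for
the delegating behaviour of string.upper described above.

```python
class Shouty(object):
    """Hypothetical non-string object that happens to define .upper()."""
    def __init__(self, text):
        self.text = text

    def upper(self):
        return self.text.upper()


def upper_any(obj):
    # Duck-typed stand-in for string.upper: simply delegates to obj.upper().
    return obj.upper()


# The duck-typed function accepts anything that has an .upper() method.
plain_result = upper_any('ciao')
shouty_result = upper_any(Shouty('ciao'))

# The unbound method typechecks its first argument and refuses a non-str.
try:
    str.upper(Shouty('ciao'))
    rejected = False
except TypeError:
    rejected = True
```

Here upper_any succeeds on both objects, while str.upper raises TypeError
for the Shouty instance even though it has a perfectly good .upper().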

So, if I have a sequence with some strings and some unicode objects,
I cannot easily get a corresponding sequence with each item uppercased
_except_ with string.upper...:

>>> map(string.upper, ('ciao', u'ciao'))
['CIAO', u'CIAO']

>>> map(str.upper, ('ciao', u'ciao'))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: descriptor 'upper' requires a 'str' object but received a 'unicode'

>>> map(unicode.upper, ('ciao', u'ciao'))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: descriptor 'upper' requires a 'unicode' object but received a 'str'


To be honest I don't currently have any real use case that's quite like this
(i.e., based on a mix of string and unicode objects), but I DO have cases
in completely different domains where I end up coding the likes of (sigh...):

def fooper(obj): return obj.foop()
foopresults = map(fooper, lotsofobjects)

or equivalently:

foopresults = map(lambda obj: obj.foop(), lotsofobjects)

or also (probably best for this specific use case):

foopresults = [ obj.foop() for obj in lotsofobjects ]


map may not be the best example, because it's old-ish and mostly replaceable
by list comprehensions (optionally with zip), itertools, etc.  But I _do_
need an "easily expressed callable" for _many_ perfectly current and indeed
future (2.4) idioms.  E.g., "order the items of lotsobjs in increasing order 
of their .foop() results" in 2.4 would be

lotsobjs.sort(key=lambda obj: obj.foop())

...and we're back to wishing for a way to pass a nonlambda-callable.  E.g.
a string-related example would be "order the strings in list lotsastrings 
(which may be all plain strings, or all unicode strings, on different calls 
of this overall function) in case-insensitive-alphabetical order".  In 2.4 
_with_ the string module that's a snap:

lotsastrings.sort(key=string.upper)

_without_ string.upper's quiet and strong polymorphism, we'd be back to
lambda, or a tiny def for the equivalent of string.upper, or nailing down
the exact type involved, leading perhaps to nasty code such as

lotsastrings.sort(key=type(lotsastrings[0]).upper)

(not ADVOCATING this by any means -- on the contrary, pointing it out as a 
danger of having such callables ONLY available as unbound methods and
thus requiring the exact type...).
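
For comparison, the "tiny def" alternative mentioned above might look like
the sketch below.  The function name upper and the sample data are
assumptions made up for illustration; the point is only that the key
callable never needs to know the exact type of the items.

```python
def upper(s):
    # Tiny def equivalent to string.upper: works on any object
    # that provides an .upper() method, whatever its exact type.
    return s.upper()


# Sample data, made up for this example.
lotsastrings = ['pear', 'Apple', 'banana', 'Cherry']

# Case-insensitive alphabetical order, with no lambda and no type lookup.
lotsastrings.sort(key=upper)
```

This does the job, but it means one more small def per method name, which
is exactly the boilerplate one would like to avoid.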


But it does not seem to me that keeping module string as it is now is
necessarily the ideal solution to this small quandary.  It works only for
those functions which module string already had in 1.5.2 -- try one that
exists only as a method, e.g. string.title, and you're hosed again.
_Extending_ module string doesn't seem like a pleasant option either --
and if we did we'd _still_ leave exactly the same problem open for
non-string objects on which we'd like to get a polymorphic callable that's
normally a method (key= parameter in sort, all the 'func' and 'pred'
parameters to itertools functions, ...).

Rather, why not think of a slightly more general solution...?  We could
have an object -- say "callmethod", although I'm sure better names can
easily be found by this creative crowd;-) -- with functionality roughly
equivalent to the following Python code...:

class MethodCaller(object):
    def __getattr__(self, name):
        def callmethod(otherself, *args, **kwds):
            return getattr(otherself, name)(*args, **kwds)
        return callmethod
callmethod = MethodCaller()

Now, the ability to obtain callables for each of the above examples
becomes available -- with signature-based polymorphism just like Python
normally offers.  Performance with this implementation would surely
be bad (but then, string.upper(s) is over twice as slow as s.upper()
and I don't hear complaints on that...:-) but maybe a more clever
implementation might partly compensate... _if_, that is, there IS
any interest at all in the idea, of course!
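
To make the idea a bit more concrete, here is a usage sketch of the
MethodCaller class above, repeated so the snippet stands alone.  The sample
data and the result names are assumptions made up for illustration, and
sorted() is the 2.4 builtin.

```python
class MethodCaller(object):
    def __getattr__(self, name):
        # Any attribute access yields a function that calls the method
        # of that name on whatever object it receives as first argument.
        def callmethod(otherself, *args, **kwds):
            return getattr(otherself, name)(*args, **kwds)
        return callmethod


callmethod = MethodCaller()

# Case-insensitive sort without lambda and without naming a type.
words = ['pear', 'Apple', 'banana', 'Cherry']
by_upper = sorted(words, key=callmethod.upper)

# Works for methods the string module never had, such as title.
titled = list(map(callmethod.title, ['ciao mondo', 'hello world']))

# Extra positional arguments are passed through to the method.
padded = callmethod.ljust('ab', 4, '.')
```

callmethod.upper plays the role string.upper played earlier, and the same
object serves for .foop() or any other method name.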


Alex



