Implicit lists

Bengt Richter bokr at oz.net
Thu Jan 30 20:51:13 EST 2003


On Thu, 30 Jan 2003 21:28:33 GMT, Alex Martelli <aleax at aleax.it> wrote:

>holger krekel wrote:
>
>> Alex Martelli wrote:
>>> holger krekel wrote:
>>>    ...
>>> > I think it's safer to skip "iteration" with an explicit
>>> > 
>>> >     isinstance(arg, (str, unicode))
>>> > 
>>> > check.  It's also simpler to read and understand.
>>> 
>>> And it breaks *EVERY* use of UserString -- a module in the
>>> standard Python library, even!!! -- not to mention user-coded
>>> string-like types and classes.  *shudder*.
>> 
>> sorry, but fewer scream-emoticons would suffice to make
>> your point.
>
>Not for an Italian forced to express himself _without_ moving
>his hands about.
>
>> However, it feels like an implicit kind of type-check.
>
>Your feelings are a bit off here -- because types work
>differently, e.g. UserString has NO type connection with
>str -- but not by much: what it is is a PROTOCOL-check --
>Python has no explicit notion of protocol but makes
>substantial use of the notion under the cover (and if we
>had protocol-adaptation in the language/library, life
>would be sweeter).
>
>Python does define (loosely) a sequence protocol, but,
>alas for this case, strings and string-like thingies
>also meet the sequence protocol and we DON'T want to
>treat them as sequences in this context (and most other
>application cases, in my limited experience and humble
>opinion).  So we need to synthesize a "nonstringlike sequence
>protocol" -- and for THAT, we need a "stringlike" protocol.
>
>Identifying "stringlike" with "can be catenated to a
>string" may seem like a daring leap, but think about
>it for a second: why WOULD a non-stringlike sequence
>ever WANT to be catenable to a string?  Your example:
>
>> Maybe it's a "filename" class that allows iteration and
>> allows __add__ with strings?
>
>...seems somewhat far-fetched to me.  For any two
>catenable sequences X and Y, given:
>
>def catall(*n):
>    for seq in n:
>        for item in seq:
>            yield item
>
>surely catall(X, Y) should == catall(X + Y).  How
>would this reasonable protocol constraint on sequence
>catenation be met when X instantiates your hypothetical
>class and Y is a string?
>
It may be conventionally reasonable when X and Y are the same
type of sequence, but IMO it's too constraining to dictate how
a sequence-implementing class should implement '+' when the other
sequence is another type. And ISTM that is what Holger's example
is about. E.g., it may be that the class will want to coerce the
other argument to its own type before the addition. In the example
below, this happens to be a list of strings, which may confuse somewhat,
but it could be a list of special objects that can only be built
from certain elements, so I am limiting constructor args by
brute-force type checks here (not even allowing unicode etc.):

====< stringlist.py >===============================
from __future__ import generators
class Stringlist:
    def __init__(self, x):
        # check for string, list of strings, or Stringlist instance
        if isinstance(x, str): x = [x]
        if (isinstance(x, list) and
            reduce(lambda n,item: n+isinstance(item,str), x, 0)==len(x)
        ):
            self.sl = x[:]
        elif isinstance(x, self.__class__):
            self.sl = x.sl[:]
        else:
            raise ValueError, 'Stringlist ctor requires stringlists or strings'
    def __len__(self): return len(self.sl)
    def __getitem__(self, i): return self.sl[i]
    def __setitem__(self, i, v): self.sl[i] = Stringlist(v)[0]
    def __add__(self, other):
        if isinstance(other, self.__class__):
            return self.__class__(self.sl+other.sl)
        return self + self.__class__(other)
    def __repr__(self): return '<Stringlist %s>' % `self.sl`

if __name__ == '__main__':
    
    # [Alex Martelli]
    #...seems somewhat far-fetched to me.  For any two
    #catenable sequences X and Y, given:

    #( from __future__ import generators at top)
    def catall(*n):
        for seq in n:
            for item in seq:
                yield item

    #surely catall(X, Y) should == catall(X + Y).  How
    #would this reasonable protocol constraint on sequence
    #catenation be met when X instantiates your hypothetical
    #class and Y is a string?
    
    X = Stringlist('hypothetical instance data'.split())
    Y = 'a string'
    
    print 'X: %s, Y: %s' % (`X`, `Y`)
    print 'X+Y: %s' % `X+Y`
    print 'catall(X, Y): %s' % [x for x in catall(X, Y)]
    print 'catall(X + Y): %s' % [x for x in catall(X + Y)] 
    assert [x for x in catall(X, Y)] == [x for x in catall(X + Y)]
====================================================
When run, the results do break the protocol as expected, but I don't
think a Stringlist class that coerces strings to its own type before
doing '+' is unreasonable:

[17:49] C:\pywk\clp>stringlist.py
X: <Stringlist ['hypothetical', 'instance', 'data']>, Y: 'a string'
X+Y: <Stringlist ['hypothetical', 'instance', 'data', 'a string']>
catall(X, Y): ['hypothetical', 'instance', 'data', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g']
catall(X + Y): ['hypothetical', 'instance', 'data', 'a string']
Traceback (most recent call last):
  File "C:\pywk\clp\stringlist.py", line 47, in ?
    assert [x for x in catall(X, Y)] == [x for x in catall(X + Y)]
AssertionError

(I didn't assume you meant "catall(X, Y) should == catall(X + Y)" literally,
since two generator objects never compare equal, so I changed the assert to
compare the lists of items they yield.)
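
For anyone who wants to try the comparison without the full script, here is a
minimal sketch, assuming a Python 3 interpreter rather than the syntax used in
the listing above. CoercingList and catenation_agrees are hypothetical names
made up for illustration; CoercingList only mimics the string-coercing '+' of
Stringlist and skips its type checks.

```python
def catall(*seqs):
    """Yield every item of every sequence, in order."""
    for seq in seqs:
        for item in seq:
            yield item


class CoercingList:
    """Hypothetical stand-in for Stringlist: a str operand becomes a one-item list."""

    def __init__(self, items):
        if isinstance(items, str):
            items = [items]
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        # Raising IndexError past the end is what lets a for-loop iterate this class.
        return self.items[i]

    def __add__(self, other):
        if not isinstance(other, CoercingList):
            other = CoercingList(other)  # coerce before catenating
        return CoercingList(self.items + other.items)


def catenation_agrees(x, y):
    """True if iterating x then y gives the same items as iterating x + y."""
    return list(catall(x, y)) == list(catall(x + y))


# Generator objects compare by identity, so this is False regardless of content.
generators_equal = catall([1], [2]) == catall([1] + [2])

# Two plain lists satisfy the constraint.
same_type_ok = catenation_agrees([1, 2], [3])

# A coercing class plus a str does not: the str is iterated character by
# character on one side and kept whole on the other.
mixed_ok = catenation_agrees(CoercingList(["hypothetical", "data"]), "a string")
```

The constraint holds when both operands are ordinary lists and fails only for
the mixed case, which is the case in dispute.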

There might be bugs or better ways to implement the above, but I hope it
illustrates the main point.
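
To connect this back to the "can be catenated to a string" test quoted above,
here is a rough sketch of how such a test might classify a coercing class. It
assumes the test is implemented as "try obj + '' and see whether it raises",
which is only my reading of the idea. is_stringlike and CoercingList are
hypothetical names, and the sketch assumes Python 3, where UserString lives in
the collections module.

```python
from collections import UserString


def is_stringlike(obj):
    """Hypothetical helper: obj counts as stringlike if obj + '' does not raise."""
    try:
        obj + ""
    except TypeError:
        return False
    return True


class CoercingList:
    """Hypothetical list-like class whose '+' coerces a str operand to its own type."""

    def __init__(self, items):
        if isinstance(items, str):
            items = [items]
        self.items = list(items)

    def __getitem__(self, i):
        return self.items[i]

    def __add__(self, other):
        if not isinstance(other, CoercingList):
            other = CoercingList(other)
        return CoercingList(self.items + other.items)


wrapped = UserString("abc")

# The explicit type check misses UserString; the catenation test accepts it.
type_check_sees_wrapped = isinstance(wrapped, str)
protocol_sees_wrapped = is_stringlike(wrapped)

# A plain list is rejected, because list + str raises TypeError.
protocol_sees_list = is_stringlike(["a", "b"])

# The coercing class accepts a str operand, so the test also calls it
# stringlike, although it is meant to behave as a list of items.
protocol_sees_coercing = is_stringlike(CoercingList(["a", "b"]))
```

So the catenation test handles UserString where the isinstance check does not,
but it would treat a class like Stringlist as a string and never iterate over
it.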

Welcome back BTW ;-)

Regards,
Bengt Richter



