[ python-Bugs-1001011 ] str.join([ str-subtype-instance ]) misbehaves
SourceForge.net
noreply at sourceforge.net
Sat Aug 7 17:48:58 CEST 2004
Bugs item #1001011, was opened at 2004-07-31 00:08
Message generated for change (Comment added) made by niemeyer
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1001011&group_id=5470
Category: Type/class unification
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Wouters (twouters)
Assigned to: Nobody/Anonymous (nobody)
Summary: str.join([ str-subtype-instance ]) misbehaves
Initial Comment:
Joining a list of string subtype instances usually
results in a single string instance:
>>> class mystr(str): pass
>>> type("".join([mystr("a"), mystr("b")]))
<type 'str'>
But if the list only contains one object that is a
string subtype instance, that instance is returned
unchanged:
>>> type("".join([mystr("a")]))
<class '__main__.mystr'>
This can have odd effects, for instance when the result
of "".join(lst) is used as the return value of a __str__
hook. "".join should perhaps return the type of the
joining string, but it definitely should not vary its type
based on the *number* of items it's joining.
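A minimal sketch for re-checking the report on whatever interpreter is at hand. The helper name join_result_type is made up for illustration, and the one-item result is only printed, not assumed, because it depends on the interpreter version.

```python
class mystr(str):
    """Trivial str subtype, as in the report."""
    pass


def join_result_type(count):
    """Return the type of "".join() over `count` mystr items (hypothetical helper)."""
    items = [mystr("a") for _ in range(count)]
    return type("".join(items))


# Print the result type for several list lengths so any
# length-dependent difference is visible at a glance.
for count in (0, 1, 2, 3):
    print(count, join_result_type(count).__name__)
```

If the line for one item names a different type than the others, the interpreter shows the behaviour described above.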
----------------------------------------------------------------------
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2004-08-07 15:48
Message:
Logged In: YES
user_id=7887
If this was considered a bug:
>>> type(ms("a")+ms("b"))
<type 'str'>
>>> type(ms("a")[:])
<type 'str'>
Are these bugs as well?
I believe this is how the implementation was intended to be, even if not
optimal for subclasses.
I suggest closing this bug as invalid, and writing a PEP about the possible new
subclass support change (for all classes), if there's enough interest.
----------------------------------------------------------------------
Comment By: Terry J. Reedy (tjreedy)
Date: 2004-08-05 16:10
Message:
Logged In: YES
user_id=593130
Duh, my turn to forget. For any beginners reading this ...
>>> class ms(str): pass
...
>>> a=ms('a')
>>> type(''.join((a,)))
<class '__main__.ms'>
Expanding mwh's second point:
>>> e=ms()
>>> type(e)
<class '__main__.ms'>
>>> import copy
>>> e2=copy.copy(e)
>>> type(e2)
<class '__main__.ms'>
>>> e3=e[:]
>>> type(e3)
<type 'str'>
>>> id(e),id(e2),id(e3)
(9494608, 9009936, 8577440)
so [:] is not exactly an abbreviated synonym for copy(). Is
this a bug? (I haven't rechecked the respective docs yet.)
One reason I hesitate to call the OP's original observation a
bug is that the whole subject of operations on subtype
instances seems not completely baked. Knowing the result
types in all cases may require experiments as well as doc
reading.
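The copy-versus-slice comparison above can be condensed into a short script. This is only a sketch of the experiment, using a non-empty value instead of ms(); the variable names shallow and sliced are illustrative.

```python
import copy


class ms(str):
    pass


e = ms("spam")
shallow = copy.copy(e)   # goes through the generic copy protocol
sliced = e[:]            # goes through str's own slicing code

# The values are equal either way; only the resulting types may differ.
print(type(shallow).__name__)
print(type(sliced).__name__)
print(shallow == e, sliced == e)
```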
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh)
Date: 2004-08-05 12:04
Message:
Logged In: YES
user_id=6656
A clue for Terry: think about what "(a)" isn't :-)
I initially agreed that this was a bug because, e.g.
str_subclass()[:] returns a str. Isn't this the same sort
of thing?
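For readers who want the clue spelled out, a small sketch, assuming the same ms subclass used elsewhere in this thread: parentheses alone only group an expression, and it is the trailing comma that makes a tuple.

```python
class ms(str):
    pass


a = ms("a")
grouped = (a)      # parentheses only group; this is still the ms instance
singleton = (a,)   # the trailing comma makes a one-element tuple

print(grouped is a)
print(type(singleton).__name__, len(singleton))

# Iterating over the string itself yields its characters, so
# ''.join((a)) joins the characters of a, not a one-item sequence.
print([type(ch).__name__ for ch in a])
```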
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2004-08-04 20:28
Message:
Logged In: YES
user_id=38388
I agree with Terry. The result type is defined by the
semantics of the list elements and the length of the list:
len(list) > 1:
    sep.join(list) := list[0] + sep + ... + sep + list[n]
len(list) == 1:
    sep.join(list) := list[0]
len(list) == 0:
    sep.join(list) := sep[:0]
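The three cases above can be written out as a pure-Python model. This is a sketch of the stated semantics only, not the actual C implementation; model_join is a made-up name.

```python
def model_join(sep, items):
    """Pure-Python model of the three cases above (illustrative only)."""
    items = list(items)
    if len(items) == 0:
        return sep[:0]       # empty slice of the separator
    if len(items) == 1:
        return items[0]      # the single element, returned unchanged
    result = items[0]
    for item in items[1:]:
        result = result + sep + item   # plain concatenation
    return result


class ms(str):
    pass


print(type(model_join("", [])).__name__)
print(type(model_join("", [ms("a")])).__name__)
print(type(model_join("", [ms("a"), ms("b")])).__name__)
```

Under this model the one-element case hands back the subclass instance itself, which is exactly the length-dependent result type the original report objects to.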
----------------------------------------------------------------------
Comment By: Terry J. Reedy (tjreedy)
Date: 2004-08-04 19:39
Message:
Logged In: YES
user_id=593130
This behavior does not, to me, clearly violate the current doc:
"Return a string which is the concatenation of the strings in
the sequence seq"
where string is byte string or Unicode string. If one takes
'string' narrowly, then your subclass instances should be
rejected as input. If one takes 'string' more broadly as
isinstance(s,basestring) then your subclass should be equally
acceptable as input or output. If neither consistent
interpretation of 'string' is meant, then there is a doc bug, or
at least an underspecification.
Workaround 0: if len(seq) == 1: ...
Workaround 1: map(str, seq) to force str out.
*However*, in playing around (in 2.2), I discovered:
>>> type(''.join((a)))
<type 'str'>
>>> type(''.join([a]))
<class '__main__.ms'>
>>> type(''.join({a:None}))
<class '__main__.ms'>
Having the type of the join of a singleton depend on the type
(mutability?) of the singleton wrapper is definitely disquieting.
Workaround 2: tuple(seq)
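Workaround 1 can be wrapped in a small helper. A minimal sketch, assuming the items do not override __str__; join_as_str is a hypothetical name, not an existing function.

```python
def join_as_str(sep, seq):
    """Join seq with sep, coercing every item to str first (hypothetical helper)."""
    return sep.join(map(str, seq))


class ms(str):
    pass


a = ms("a")
# The same three singleton wrappers tried in the session above.
for wrapper in ((a,), [a], {a: None}):
    print(type(wrapper).__name__, type(join_as_str("", wrapper)).__name__)
```

Coercing each item up front means the result type no longer depends on how many items there are or on what kind of container holds them.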
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh)
Date: 2004-08-02 14:25
Message:
Logged In: YES
user_id=6656
What are you asking? I agree it's a bug. I'm sure you're
competent to write a patch :-)
----------------------------------------------------------------------