[ python-Bugs-1001011 ] str.join([ str-subtype-instance ]) misbehaves

SourceForge.net noreply at sourceforge.net
Thu Aug 5 14:04:02 CEST 2004


Bugs item #1001011, was opened at 2004-07-31 01:08
Message generated for change (Comment added) made by mwh
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1001011&group_id=5470

Category: Type/class unification
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Wouters (twouters)
Assigned to: Nobody/Anonymous (nobody)
Summary: str.join([ str-subtype-instance ]) misbehaves

Initial Comment:
Joining a list of string subtype instances usually
results in a single string instance:

  >>> class mystr(str): pass
  >>> type("".join([mystr("a"), mystr("b")]))
  <type 'str'>

But if the list only contains one object that is a
string subtype instance, that instance is returned
unchanged:

  >>> type("".join([mystr("a")]))
  <class '__main__.mystr'>

This can have odd effects, for instance when the result
of "".join(lst) is used as the return value of a __str__
hook. "".join should perhaps return the type of the
joining string, but definitely not vary its type based on
the *number* of items it's joining.



----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2004-08-05 13:04

Message:
Logged In: YES 
user_id=6656

A clue for Terry: think about what "(a)" isn't :-)

I initially agreed that this was a bug because, e.g.
str_subclass()[:] returns a str.  Isn't this the same sort
of thing?
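
A small sketch of the two observations in this comment, for readers who want to check them. MyStr is a hypothetical stand-in for the subclass used in the report, and the slicing result is what CPython is assumed to do; other implementations may differ.

```python
class MyStr(str):
    """Hypothetical str subclass standing in for the 'ms' class in the report."""


a = MyStr("ab")

# Parentheses alone do not build a tuple; only the trailing comma does.
same_object = (a) is a
one_tuple = (a,)

# So ''.join((a)) iterates over the characters of a,
# not over a one-item sequence containing a.
chars = list((a))

# Slicing a str subclass instance gives back a plain str (assumed CPython behaviour).
sliced_type = type(a[:])

print(same_object, type(one_tuple).__name__, chars, sliced_type.__name__)
```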



----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2004-08-04 21:28

Message:
Logged In: YES 
user_id=38388

I agree with Terry. The result type is defined by the
semantics of the list elements and the length of the list:

len(list) > 1:
sep.join(list) := list[0] + sep + ... + sep + list[n]

len(list) == 1:
sep.join(list) := list[0]

len(list) == 0:
sep.join(list) := sep[:0]
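
The three cases above can be sketched as a reference function. This is only an illustration of the definition as written here, not the interpreter's actual join implementation; join_by_definition and MyStr are hypothetical names.

```python
class MyStr(str):
    """Hypothetical str subclass, standing in for the reporter's mystr."""


def join_by_definition(sep, items):
    """Follow the three-case definition literally."""
    items = list(items)
    if not items:
        # len(list) == 0: an empty slice of the separator
        return sep[:0]
    # len(list) == 1: the single element is returned unchanged
    result = items[0]
    # len(list) > 1: list[0] + sep + ... + sep + list[n]
    for item in items[1:]:
        result = result + sep + item
    return result


single = join_by_definition("", [MyStr("a")])
pair = join_by_definition("", [MyStr("a"), MyStr("b")])
print(type(single).__name__, type(pair).__name__)
```

Under this definition the one-element case keeps the subclass type, while any concatenation goes through str addition and so yields a plain str, which is the asymmetry the report describes.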


----------------------------------------------------------------------

Comment By: Terry J. Reedy (tjreedy)
Date: 2004-08-04 20:39

Message:
Logged In: YES 
user_id=593130

This behavior does not, to me, clearly violate the current doc:
"Return a string which is the concatenation of the strings in 
the sequence seq"
where string is bytestring or Unicodestring.  If one takes
'string' narrowly, then your subclass instances should be 
rejected as input.  If one takes 'string' more broadly as 
isinstance(s,basestring) then your subclass should be equally 
acceptable as input or output.  If neither consistent 
interpretation of 'string' is meant, then there is a doc bug, or 
at least an underspecification.

Workaround 0: if len(seq) == 1: ...
Workaround 1: map(str, seq) to force str out.

*However*, in playing around (in 2.2), I discovered:

>>> type(''.join((a)))
<type 'str'>
>>> type(''.join([a]))
<class '__main__.ms'>
>>> type(''.join({a:None}))
<class '__main__.ms'>

Having the type of the join of a singleton depend on the type 
(mutability?) of the singleton wrapper is definitely disquieting.

Workaround 2: tuple(seq)
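
A minimal sketch of Workaround 1 wrapped in a helper, assuming the goal is a plain str result no matter how many items are joined. The names join_as_plain_str and MyStr are hypothetical and do not come from the report.

```python
class MyStr(str):
    """Hypothetical str subclass, standing in for the 'ms' class above."""


def join_as_plain_str(sep, seq):
    """Join seq with sep after passing every item through str().

    Caveat: str() also converts non-string items (numbers, for example),
    which a bare join would reject with a TypeError.
    """
    return sep.join(map(str, seq))


one = join_as_plain_str("", [MyStr("a")])
two = join_as_plain_str(",", [MyStr("a"), "b"])
print(type(one).__name__, two)
```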


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2004-08-02 15:25

Message:
Logged In: YES 
user_id=6656

What are you asking?  I agree it's a bug.  I'm sure you're 
competent to write a patch :-)

----------------------------------------------------------------------
