[Python-Dev] Re: PEP 292 - Simpler String Substitutions

M.-A. Lemburg mal at egenix.com
Tue Aug 24 10:39:32 CEST 2004


ncoghlan at iinet.net.au wrote:
> Quoting Raymond Hettinger <python at rcn.com>:
> 
> 
>>>This code-snippet is littered everywhere in my applications:
>>>
>>>    string.join([str(x) for x in iterable])
>>>
>>>It's tedious and makes code hard to read.  Do we need a PEP to fix
>>>this?
>>
>>A PEP would be overkill.
>>
>>Still, it would be helpful to do PEP-like things such as providing a
>>reference implementation, soliciting comments, keeping an issue list, etc.
>>
>>A minor issue is that the implementation automatically shifts to Unicode
>>upon encountering a Unicode string.  So you would need to test for this
>>before coercing to a string.
> 
> Perhaps have string join coerce to string, and Unicode join coerce to the
> separator's encoding. If we do that, the existing string->Unicode promotion code
> should handle the switch between the two join types.

The general approach is always to coerce to Unicode if strings
and Unicode meet; very much like coercion to floats is done when
integers and floats meet.

Your suggestion would break this logic and make coercion depend
on an argument.
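
To illustrate the numeric analogy (a minimal sketch; the helper name
result_type is made up for this example): mixing ints and floats gives
a float no matter which operand comes first. In Python 2, mixing str
and unicode follows the same pattern and always yields unicode.

```python
def result_type(a, b):
    """Return the type produced by adding a and b (hypothetical helper)."""
    return type(a + b)

# The result type does not depend on operand order: both calls give float.
int_first = result_type(1, 2.0)
float_first = result_type(2.0, 1)
print(int_first is float_first is float)
```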

>>Also, join works in multiple passes.  The proposal should be specific
>>about where stringizing occurs.  IIRC, you need the object length on the
>>first pass, but the error handling and switchover to Unicode occur on
>>the second.
> 
> 
> Having been digging in the guts of string join last week, I'm pretty sure the
> handover to the Unicode join happens on the first 'how much space do we need'
> pass (essentially, all of the work done so far is thrown away, and the Unicode
> join starts from scratch. If you know you have Unicode, you're better off using
> a Unicode separator to avoid this unnecessary work </tangent>).

It's just a simple length querying loop; there's no storage allocation
or anything expensive happening there, so the "throw-away" operation
is not expensive. Aside: ''.join() currently only works for true
sequences - not iterators.

OTOH, the %-format operation is expensive, which is why PyString_Format
jumps through some extra hoops to make sure the work already done is not
dropped (indeed, it may not even be possible to reevaluate the
arguments; think iterators here).

We could probably add similar logic to ''.join() to have
it also support iterators.
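
As a rough illustration of what that could look like (a Python sketch
under my own assumptions, not the actual C implementation; the name
two_pass_join is hypothetical): the input is materialized once, so that
the length pass and the copy pass both see the same items even when the
caller hands in an iterator.

```python
def two_pass_join(sep, iterable):
    """Join strings in two passes, accepting any iterable (sketch only)."""
    # An iterator can only be consumed once, so store the items first.
    items = list(iterable)
    if not items:
        return ""

    # Pass 1: type check and length query; nothing is copied yet.
    total = len(sep) * (len(items) - 1)
    for item in items:
        if not isinstance(item, str):
            raise TypeError("expected str, got %s" % type(item).__name__)
        total += len(item)

    # Pass 2: fill a buffer of the size computed above. The list of
    # characters stands in for the preallocated C string buffer.
    buf = [""] * total
    pos = 0
    for index, item in enumerate(items):
        if index:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(item)] = item
        pos += len(item)
    return "".join(buf)


print(two_pass_join(", ", iter(["a", "b", "c"])))
```

The cost is one extra list holding references to the items; the items
themselves are not copied until the second pass.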

-- 
Marc-Andre Lemburg
eGenix.com
