[Python-ideas] The non-obvious nature of str.join (was Re: sum(...) limitation)

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Mon Aug 11 22:17:50 CEST 2014


On 11.08.2014 21:55, Terry Reedy wrote:
> On 8/11/2014 9:56 AM, Alexander Belopolsky wrote:
>>
>> On Mon, Aug 11, 2014 at 2:56 AM, Wolfgang Maier
>> <wolfgang.maier at biologie.uni-freiburg.de>
>>
>> wrote:
>>
>>     I am using Python for teaching programming to absolute beginners at
>>     university and, in my experience, joiner.join is never a big hurdle.
>>
>>
>> In my experience, it is the asymmetry between x.join(y) and x.split(y)
>> which causes most of the confusion.   In x.join(y), x is the separator
>
> Given that the two parameters of join are a concrete string and the
> abstraction 'iterable of strings', join can only be a method of the joiner.
>
> I would first teach ' '.join(it_of_strings) as the 'normal' case of
> joining 'words', along with print with the default sep = ' '.
>
>
>> and y is the data being joined, but in x.split(y), it is the other way
>> around.
>
> *If* sep is present, then sep.split(string) would be possible. But when
> sep is *not* present, split cannot be a method of something that is
> not there.  So I think I would teach s.split() first and then add
> .split(sep) and .splitlines().
>
> I would also teach join and split together since they are, at their
> cores (excluding special cases), inverses.
>
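To make the "inverses, excluding special cases" point concrete, here is a small sketch. The sample strings are invented for illustration; only str.join and str.split themselves come from the discussion above.

```python
words = ["spam", "eggs", "ham"]

# The 'normal' case: join with a single space, then split it again.
line = " ".join(words)
assert line == "spam eggs ham"
assert line.split(" ") == words   # explicit separator round-trips
assert line.split() == words      # so does the no-argument form here

# Special case: split() without sep treats any run of whitespace as one
# separator and drops leading/trailing whitespace, so joining the result
# does not reproduce the original string.
messy = "spam   eggs\tham\n"
assert messy.split() == words
assert " ".join(messy.split()) != messy

# With an explicit sep, every single occurrence counts, which yields
# empty strings between adjacent separators.
assert messy.split(" ") == ["spam", "", "", "eggs\tham\n"]
```

With an explicit separator the round trip sep.join(s.split(sep)) always gives back s; it is the no-argument s.split() that loses information.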

I like to show students early on that with Python they can very quickly 
do things that would be very hard to achieve manually. Most of my 
students are biologists, so initially they do not think much about the 
theoretical aspects of programming, but want to know whether they can 
get something done with it fast.

For join/split, I typically use problems like:

- you'd like to use program xy to work on your data, but it expects 
input elements to be separated by semicolons when all you have is a 
tab-delimited format

or

- you have input consisting of numbers with ',' as the decimal separator 
(the default in German-speaking countries), but downstream software 
expects '.'

i.e., problems that can all be solved with the general pattern:

s = new_sep.join(s.split(old_sep))

Seeing that, in Python, you can solve these problems (in principle) with 
one line of quite understandable code is a very convincing argument for 
starting to learn the language.
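
As a rough sketch of how the two example problems look with that pattern (the helper name swap_separator and the sample data are made up for illustration, not taken from any real course material):

```python
def swap_separator(text, old_sep, new_sep):
    """Replace every old_sep in text with new_sep using split and join."""
    return new_sep.join(text.split(old_sep))


# Problem 1: tab-delimited input, but the program expects semicolons.
record = "gene1\t0.5\t12"
converted = swap_separator(record, "\t", ";")
print(converted)  # gene1;0.5;12

# Problem 2: ',' as the decimal separator, but downstream expects '.'.
number = "3,14"
print(swap_separator(number, ",", "."))  # 3.14
```

Note that this changes every occurrence of the old separator, so the second case only works as intended when ',' is not also used to separate the fields.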

Wolfgang


