[Tutor] python 3.3 split method confusion

Peter Otten __peter__ at web.de
Wed Jan 8 09:56:31 CET 2014


Danny Yoo wrote:

> One of the common cases for split() is to break a line into a list of
> words, for example.
> 
> #####################################
> >>> 'hello this is a test'.split()
> ['hello', 'this', 'is', 'a', 'test']
> #####################################
> 
> The Standard Library can not do everything that we can conceive of as
> being useful, because that set is fairly large.
> 
> If the Standard Library doesn't do it, we'll probably need to do it
> ourselves, or find someone who has done it already.
> 
> 
> ##########################################
> >>> def mysplit(s, delim):
> ...     start = 0
> ...     while True:
> ...         index = s.find(delim, start)
> ...         if index != -1:
> ...             yield s[start:index]
> ...             yield delim
> ...             start = index + len(delim)
> ...         else:
> ...             yield s[start:]
> ...             return
> ...
> >>> list(mysplit("this,is,a,test", ","))
> ['this', ',', 'is', ',', 'a', ',', 'test']
> ##########################################

The standard library does provide a way to split a string and preserve the 
delimiters:

>>> import re
>>> re.split("(,)", "this,is,a,test")
['this', ',', 'is', ',', 'a', ',', 'test']
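
The parentheses are what do the work here: re.split() keeps the text matched
by a capturing group in the result list, and drops the delimiter when there is
no group. A minimal sketch of the difference (the variable names are only for
illustration):

```python
import re

text = "this,is,a,test"

# Without a capturing group the delimiters are dropped, like str.split(",").
without_group = re.split(",", text)

# With a capturing group each matched delimiter becomes its own list item.
with_group = re.split("(,)", text)

print(without_group)  # ['this', 'is', 'a', 'test']
print(with_group)     # ['this', ',', 'is', ',', 'a', ',', 'test']
```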

It is very flexible...

>>> re.split("([-+*/])", "alpha*beta/gamma-delta")
['alpha', '*', 'beta', '/', 'gamma', '-', 'delta']

but you need to learn a mini-language called "regular expressions", and it 
takes some time to get used to it and to avoid the pitfalls (try swapping 
the "-" and "+" in my second example).
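
For anyone who would rather read than experiment, here is a sketch of what
that swap runs into. Inside square brackets a "-" between two characters
denotes a range, so "[+-*/]" is read as "everything from + to *", and since
"+" sorts after "*" the pattern is rejected. The two workarounds shown are
common conventions rather than the only options, and the variable names are
only for illustration:

```python
import re

text = "alpha*beta/gamma-delta"

# "[+-*/]" treats "+-*" as a character range from "+" to "*". That range
# runs backwards in character order, so the pattern fails to compile.
try:
    re.split("([+-*/])", text)
except re.error as exc:
    print("bad pattern:", exc)

# Two ways to make "-" a literal hyphen inside the character class:
# escape it, or place it first or last.
escaped = re.split(r"([+\-*/])", text)
hyphen_last = re.split("([+*/-])", text)

print(escaped)
print(hyphen_last)
```

Placing the hyphen first, as in the original "[-+*/]", avoids the problem
without any escaping, which is why that example worked.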


