Need better string methods

Christian Tismer tismer at
Sat Mar 6 22:07:57 CET 2004

William Park wrote:


>># Current best Python:
>>clean = [' '.join(t.split()).strip('.') for t in line.split('|')]
> Both Bash shell and Python can split based on regular expression.
> However, shell is not a bad alternative here:
>     tr -s ' \t' ' ' | sed -e 's/ ?| ?/|/g' -e 's/^ //' -e 's/ $//' |
>     while IFS='|' read -a clean; do
> 	...
>     done

But isn't that regular expression much harder to understand
for part-time programmers than the few Python methods?
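
For comparison, here is a rough sketch of what the regular-expression route might look like in Python itself. The function name clean_with_regex and the sample line are made up for illustration, and this is only one possible spelling of the idea, not a translation of the shell pipeline:

```python
import re

def clean_with_regex(line):
    """Hypothetical regex-based variant of the one-line cleanup."""
    # Split at the bars, swallowing any whitespace around each bar
    fields = re.split(r'\s*\|\s*', line.strip())
    # Collapse internal whitespace runs, then drop leading/trailing dots
    return [re.sub(r'\s+', ' ', field).strip('.') for field in fields]

sample = "  foo   bar. |  baz.|qux  "   # made-up input
print(clean_with_regex(sample))
```

The reader has to decode two patterns here before seeing what the code does, which is the point of the question above.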

(Quoting David's post)
clean = [' '.join(t.split()).strip('.') for t in line.split('|')]

This is too much to expect of a non-programmer, even one who
understands the methods.  The usability problems are 1) the three
variations in syntax (methods, a list comprehension, and what *looks
like* a join function prefixed by some odd punctuation), 2) the
order in which each step is entered at the keyboard (I can show
this in step-by-step detail if anyone doesn't understand what I mean),
and 3) proper placement of parens can be confusing.

Right. This is quite a few concepts in one line, and it
might be short and efficient, but it is obfuscated for the
non-programmer. Isn't this more readable?

pieces = line.split('|')   # break at the bars
nodots = [piece.strip(".") for piece in pieces]   # remove leading or trailing dots
clean = [" ".join(words.split()) for words in nodots]   # normalise spaces

Well, there is still some complexity with the join/split mess.
But still more readable than the regex?
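
One way to hide the join/split idiom is to give it a name. The sketch below is only an illustration: normalise_spaces and clean_fields are hypothetical helper names, and the sample line is invented. Note that it normalises spaces before stripping dots, as the one-liner does, because strip(".") leaves a dot alone when a space follows it (" b. ".strip(".") returns the string unchanged).

```python
def normalise_spaces(text):
    """Collapse runs of whitespace to single spaces and trim both ends."""
    return " ".join(text.split())

def clean_fields(line):
    """Hypothetical step-by-step equivalent of the one-line version."""
    pieces = line.split("|")                               # break at the bars
    spaced = [normalise_spaces(piece) for piece in pieces]  # normalise spaces
    return [field.strip(".") for field in spaced]           # remove leading or trailing dots

sample = "  alpha   beta. | gamma.|  delta  "   # made-up input
print(clean_fields(sample))
```

With the helper in place, the part-time programmer only ever sees split, strip, and a named function.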

Christian Tismer             :^)   <mailto:tismer at>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship*
14109 Berlin                 :     PGP key ->
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?
