[Tutor] str.split and quotes
Marilyn Davis
marilyn at deliberate.com
Fri Apr 8 04:33:48 CEST 2005
On Fri, 8 Apr 2005, Tony Meyer wrote:
> > Is there a reason to prefer one over the other? Is one
> > faster? I compiled my regular expression to make it quicker.
>
> With Python 2.4 I get these results (all imports are factored out, all give
> the same result except for CSV which strips the "s) with timeit.py:
>
> Own split: 26.8668364275
> Tokenize: 78.8295112926
> Rejoin: 11.237671827
> Re: 13.9386123097
> Re compiled: 8.19355839918
> CSV: 23.3710904598
Wow, Tony. That is so generous of you to run this experiment. It must
be a great re engine.
I wondered whether tokenize was resource-hungry; it certainly takes
the long way around. But I'm glad to know about it. Thank you again,
Danny.
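For anyone reading along, here is roughly what I picture the re and
csv approaches doing on a made-up line (my own sketch, not Tony's
timed code):

    import csv
    import re

    line = 'spam "eggs and toast" ham 42'

    # Compiled re: each token is either a double-quoted chunk or a
    # run of non-whitespace characters.
    pattern = re.compile(r'"[^"]*"|\S+')
    print(pattern.findall(line))
    # ['spam', '"eggs and toast"', 'ham', '42']   (quotes kept)

    # csv.reader splits on the delimiter and strips the quote marks,
    # which is why its results differ from the others.
    for row in csv.reader([line], delimiter=' '):
        print(row)
    # ['spam', 'eggs and toast', 'ham', '42']     (quotes stripped)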
>
> Of course, speed isn't everything (or you wouldn't be using Python).
> Readability is probably the most important factor - I'd say that a re
> (particularly a verbose re) would be the most readable, followed by using
> the CSV module, followed by writing your own split function. Since re is
> also the fastest of the methods suggested so far, it seems like a good
> choice.
Yes! To everything you say.
And there's the added feature that I don't have to go back and change
that code now! And so little typing in the first place.
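If I ever do go back to it, the verbose form Tony mentions looks like
the readable way to write the same kind of pattern. A sketch (the
pattern is my own guess, not anyone's actual code):

    import re

    # re.VERBOSE lets whitespace and comments document each piece.
    pattern = re.compile(r'''
        "[^"]*"     # a double-quoted chunk, quotes included
        |           # or
        \S+         # a run of non-whitespace characters
        ''', re.VERBOSE)

    print(pattern.findall('spam "eggs and toast" ham 42'))
    # ['spam', '"eggs and toast"', 'ham', '42']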
>
> > What a rich language! So many choices.
>
> Somewhat ironically, one of the tenets of Python is "there should be one--
> and preferably only one --obvious way to do it." (type "import this" at an
> interactive prompt).
In this case, there is: regular expressions. :^)
"Obvious" doesn't necessarily mean we can all see it immediately, or
even that any one of us can see it without study, thought, and
inspiration, but it does mean we can all see it once it's pointed out.
Or maybe I'm wrong.
Thank you, you guys.
Marilyn
> =Tony.Meyer
>