how to avoid leading white spaces
Neil Cerutti
neilc at norwich.edu
Fri Jun 3 16:49:08 EDT 2011
On 2011-06-03, rurpy at yahoo.com <rurpy at yahoo.com> wrote:
>>> or that I have to treat commas as well as spaces as
>>> delimiters.
>>
>> source.replace(",", " ").split(" ")
>
> Uhgg. create a whole new string just so you can split it on one
> rather than two characters? Sorry, but I find
>
> re.split ('[ ,]', source)
It's quibbling to complain about creating one more string in an
operation that already creates N strings.
Here's another alternative:
list(itertools.chain.from_iterable(elem.split(" ")
for elem in source.split(",")))
It's weird looking, but delimiting text with two different
delimiters is weird.
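For comparison, here is a minimal sketch that runs the three approaches
from this thread side by side. The sample string and the function names
are made up for illustration; only the three expressions themselves come
from the posts above.

```python
import itertools
import re

# Hypothetical sample input, not from the original thread.
source = "a,b c, d"

def split_replace(s):
    # Turn commas into spaces, then split on single spaces.
    return s.replace(",", " ").split(" ")

def split_regex(s):
    # Split on either a space or a comma.
    return re.split("[ ,]", s)

def split_chain(s):
    # Split on commas first, then split each piece on spaces.
    return list(itertools.chain.from_iterable(
        elem.split(" ") for elem in s.split(",")))

for func in (split_replace, split_regex, split_chain):
    print(func.__name__, func(source))
```

All three give ['a', 'b', 'c', '', 'd'] for this input. The empty string
comes from the ", " pair, since each version treats every single
delimiter character as a separate split point.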
> states much more clearly exactly what is being done with no
> obfuscation. Obviously this is a simple enough case that the
> difference is minor but when the pattern gets only a little
> more complex, the clarity difference becomes greater.
re.split is a nice improvement over str.split. I use it often.
It's a happy place in the re module. Using a single capture group
it can perhaps also be used for applications of str.partition.
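A rough sketch of what I mean, assuming a made-up "key=value" style
input. With a capture group the separator is kept in the result, and
maxsplit=1 stops after the first match, which is roughly what
str.partition does.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "key=value=more"

# str.partition always returns a 3-tuple: (head, separator, tail).
by_partition = text.partition("=")

# re.split with a capture group keeps the separator in the result list.
by_regex = re.split("(=)", text, maxsplit=1)

print(by_partition)  # ('key', '=', 'value=more')
print(by_regex)      # ['key', '=', 'value=more']

# Unlike partition, the separator can be a pattern.
loose = re.split(r"(\s*[=:]\s*)", "key : value", maxsplit=1)
print(loose)         # ['key', ' : ', 'value']

# One difference: when nothing matches, partition still returns three
# items, while re.split returns a one-element list.
print("novalue".partition("="))                 # ('novalue', '', '')
print(re.split("(=)", "novalue", maxsplit=1))   # ['novalue']
```

So it is not a drop-in replacement; code that unpacks three values needs
to handle the no-match case separately.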
> I would not recommend you use a regex instead of a string
> method solely because you might need a regex later. But when
> you have to spend 10 minutes writing a half-dozen lines of
> python versus 1 minute writing a regex, your evaluation of the
> possibility of requirements changing should factor into your
> decision.
Most of the simplest and clearest applications of the re module
are simply not necessary at all. If I'm inspecting a string with
what amounts to more than a couple of lines of basic Python then
I break out the re module.
Of course, oftentimes that means I've got a context-sensitive
parsing job on my hands, and I have to put it away again. ;)
> Yes, as I said, the regex attitude here seems in large part to
> be a reaction to their frequent use in Perl. It seems anti- to
> me in that I often see cautions about their use but seldom see
> anyone pointing out that they are often a better solution than
> a mass of twisty little string methods and associated plumbing.
That doesn't seem to apply to the problem that prompted your
complaint, at least.
>> There are a few problems with regexes:
>>
>> - they are another language to learn, a very cryptic and terse
>> language;
>
> Chinese is cryptic too but there are a few billion people who
> don't seem to be bothered by that.
Chinese *would* be a problem if you proposed it as the solution
to a problem that could be solved by using a person's native
tongue instead.
>> - hence code using many regexes tends to be obfuscated and
>> brittle;
>
> No. With regexes the code is likely to be less brittle than a
> dozen or more lines of mixed string functions, indexes, and
> conditionals.
That is the opposite of my experience, but YMMV.
>> - they're over-kill for many simple tasks;
>> - and underpowered for complex jobs, and even some simple ones;
>
> Right, like all tools (including Python itself) they are suited
> best for a specific range of problems. That range is quite
> wide.
>
>> - debugging regexes is a nightmare;
>
> Very complex ones, perhaps. "Nightmare" seems an
> overstatement.
I will abandon a re-based solution long before the nightmare.
>> - they're relatively slow;
>
> So is Python. In both cases, if it is a bottleneck then
> choosing another tool is appropriate.
It's not a problem at all until it is.
>> - and thanks in part to Perl's over-reliance on them, there's
>> a tendency among many coders (especially those coming from
>> Perl) to abuse and/or misuse regexes; people react to that
>> misuse by treating any use of regexes with suspicion.
>
> So you claim. I have seen more postings in here where
> REs were not used when they would have simplified the code,
> than I have seen regexes used when a string method or two
> would have done the same thing.
Can you find an example or invent one? I simply don't remember
such problems coming up, but I admit it's possible.
--
Neil Cerutti