[Tutor] Help on RE

Albert-Jan Roskam fomcl at yahoo.com
Sun Jan 23 10:33:32 CET 2011



http://imgs.xkcd.com/comics/regular_expressions.png

;-)
 
Cheers!!
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have the 
Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




________________________________
From: Steven D'Aprano <steve at pearwood.info>
To: tutor at python.org
Sent: Sun, January 23, 2011 4:10:35 AM
Subject: Re: [Tutor] Help on RE

tee chwee liong wrote:
> thanks for making me understand more on re. re is a confusing topic as i'm 
>starting on python. 
>

I quote the great Jamie Zawinski, a world-class programmer and hacker:

    Some people, when confronted with a problem, think "I know, I'll
    use regular expressions." Now they have two problems.


Zawinski doesn't mean that you should never use regexes. But they should be used 
only when necessary, for problems that are difficult enough to require a 
dedicated domain-specific language for solving search problems.

Because that's what regexes are: they're a programming language for text 
searching. They're not a full-featured programming language like Python 
(technically, they are not Turing Complete) but nevertheless they are a 
programming language. A programming language with a complicated, obscure, 
hideously ugly syntax (and people complain about Forth!). Even the creator of 
Perl, Larry Wall, has complained about regex syntax and gives 19 serious faults 
with regular expressions:

http://dev.perl.org/perl6/doc/design/apo/A05.html

Most people turn to regexes much too quickly, using them to solve problems that 
are either too small to need regexes, or too large. Using regexes for solving 
your problem is like using a chainsaw for peeling an orange.

Your data is very simple, and doesn't need regexes. It looks like this:


Platform: PC
Tempt : 25
TAP0 :0
TAP1 :1
+++++++++++++++++++++++++++++++++++++++++++++
Port Chnl Lane EyVt EyHt
+++++++++++++++++++++++++++++++++++++++++++++
0  1  1  75  55
0  1  2  10 35
0  1  3  25 35
0  1  4  35 25
0  1  5  10 -1
+++++++++++++++++++++++++++++++++++++++++++++
Time: 20s


The part you care about is the table of numbers. Each line looks like this:

0  1  5  10 -1

The easiest way to parse this line is this:

numbers = [int(word) for word in line.split()]
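
To see what that one-liner does, here is a small sketch using the sample row 
from above. The variable name "words" is only there for illustration.

```python
line = "0  1  5  10 -1"

# split() with no arguments splits on any run of whitespace,
# so the uneven spacing between columns does not matter.
words = line.split()

# int() accepts a leading minus sign, so "-1" converts cleanly.
numbers = [int(word) for word in words]

print(words)
print(numbers)
```

The first print shows the five strings, the second the same values as integers.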

All you need then is a way of telling whether you have a line in the table or a 
header. That's easy -- int() raises ValueError for any word that isn't a number, 
so just catch the exception and skip that line.

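template = "Port=%d, Channel=%d, Lane=%d, EyVt=%d, EyHt=%d"
for line in lines:
    try:
        numbers = [int(word) for word in line.split()]
    except ValueError:
        # Not a table row (header, separator, "Time: 20s", ...): skip it.
        continue
    if len(numbers) != 5:
        # Blank lines give an empty list; the template needs five values.
        continue
    print(template % tuple(numbers))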
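
If you want to try this without a data file, the following is a minimal, 
self-contained sketch. It assumes the whole report is held in one string; the 
names SAMPLE and parse_table and the width parameter are made up for this 
example and are not part of the original post.

```python
SAMPLE = """\
Platform: PC
Tempt : 25
TAP0 :0
TAP1 :1
+++++++++++++++++++++++++++++++++++++++++++++
Port Chnl Lane EyVt EyHt
+++++++++++++++++++++++++++++++++++++++++++++
0  1  1  75  55
0  1  2  10 35
0  1  3  25 35
0  1  4  35 25
0  1  5  10 -1
+++++++++++++++++++++++++++++++++++++++++++++
Time: 20s
"""


def parse_table(text, width=5):
    """Return the table rows in text as tuples of ints (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        try:
            numbers = [int(word) for word in line.split()]
        except ValueError:
            # At least one word is not an integer, so this is not a table row.
            continue
        if len(numbers) == width:
            # Only keep lines with the expected number of columns;
            # this also drops blank lines, which parse to an empty list.
            rows.append(tuple(numbers))
    return rows


rows = parse_table(SAMPLE)
for row in rows:
    print("Port=%d, Channel=%d, Lane=%d, EyVt=%d, EyHt=%d" % row)
```

Returning the rows instead of printing them straight away keeps the parsing 
separate from the output, so the same function can feed other code later.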


Too easy. Adding regexes just makes it slow, fragile, and difficult.
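
For comparison, here is one way a regex version of the same row test might 
look, next to the split() version. This is only a sketch: the pattern and the 
function names regex_parse and split_parse are my own assumptions, not 
something from the original question.

```python
import re

# Five whitespace-separated integers, each with an optional sign.
ROW = re.compile(r"^\s*" + r"\s+".join([r"([+-]?\d+)"] * 5) + r"\s*$")


def regex_parse(line):
    """Return five ints if the line matches the pattern, else None."""
    match = ROW.match(line)
    if match is None:
        return None
    return [int(group) for group in match.groups()]


def split_parse(line):
    """Same job with split() and int(); None if the line is not a row."""
    try:
        numbers = [int(word) for word in line.split()]
    except ValueError:
        return None
    return numbers if len(numbers) == 5 else None


for line in ["0  1  5  10 -1", "Port Chnl Lane EyVt EyHt", "Time: 20s"]:
    print(repr(line), regex_parse(line), split_parse(line))
```

Both functions agree on these three lines, but the pattern is much harder to 
read and to change than the split() version.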


My advice is, any time you think you might need regexes, you probably don't.


-- Steven
