scanf style parsing
Tim Hammerquist
tim at vegeta.ath.cx
Thu Sep 27 06:48:24 EDT 2001
It seems to me that Bruce Dawson <comments at cygnus-software.com> said:
> For Perl hackers it is easy to figure out
> regexp, but for us old C/C++ types, it's *tough*
It's not usually easy to learn regexps, no matter what your background.
I come from C/C++ roots (Turbo C++ 3.0) and TRS-80 BASIC before that,
and I certainly had no idea what regex's were really for until I looked
at Perl.
I struggled with regex's for months. I even had to take some time away
from Perl and regex's to calm down and not be so intimidated. Many
Pythonistas I've heard in this ng had a lot of difficulty with regular
expressions, and with *good reason*.
Of course, by the time I finally grasped regular expressions, I would
be looking at my mail and catch myself mentally writing a regex to
parse my gas bill! This is part of why Perlers are a bit overzealous
with regex's. Python's syntax tames this pretty quickly, though, and that's a
good thing.
Regex's are useful and powerful. But they're also very easy to abuse.
I've actually seen the following Perl code:
    if ($filename =~ /\.txt$/) { ... }
Which would be roughly equivalent to:
    m = re.search(r'\.txt$', filename)
    if m:
        ...
or, much more preferably:
    if filename[-4:] == '.txt':
        ...
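For what it's worth, here is a minimal sketch of the same check written with the string method endswith(), which saves counting the characters in the suffix by hand. The helper name is_text_file is hypothetical and only there for illustration.

```python
def is_text_file(filename):
    """Return True if filename ends in '.txt' (hypothetical helper)."""
    # Same test as filename[-4:] == '.txt', without the hard-coded length.
    return filename.endswith('.txt')


if __name__ == '__main__':
    for name in ('notes.txt', 'notes.txt.bak', 'txt'):
        print(name, is_text_file(name))
```

Either spelling avoids the regex entirely; endswith() just keeps the suffix and its length from drifting apart if the extension changes later.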
I think another reason for Perlers overusing regex's is Perl's shortage
of convenient string indexing operators.
The equivalent of the last Python code in Perl is:
    if (substr($filename, -4) eq '.txt') { ... }
But don't think regex's are disposable just because Python's string type
is more convenient. Consider the following:
    # perl
    if ($filename =~ /\.([ps]?html?|cgi|php[\d]?|pl)$/) { ... }

    # python
    re_web_files = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$')
    m = re_web_files.search(filename)
    if m:
        ...
This is a very complicated (but relatively efficient) way to match files
with all the following extensions:
.htm .html .shtm .shtml .phtm .phtml
.cgi
.php .php2 .php3 .php4
.pl
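A quick way to convince yourself the pattern covers that list is to run it over each extension. This is only a sketch: the pattern is the one above, but the helper is_web_file and the sample base name 'index' are made up for the test.

```python
import re

# Pattern copied from the example above.
re_web_files = re.compile(r'\.([ps]?html?|cgi|php[\d]?|pl)$')

# The extensions listed above.
extensions = [
    '.htm', '.html', '.shtm', '.shtml', '.phtm', '.phtml',
    '.cgi',
    '.php', '.php2', '.php3', '.php4',
    '.pl',
]


def is_web_file(filename):
    """Return True if filename ends in one of the web extensions (hypothetical helper)."""
    return re_web_files.search(filename) is not None


# Collect the listed extensions the pattern actually accepts.
matched = [ext for ext in extensions if is_web_file('index' + ext)]

if __name__ == '__main__':
    for ext in extensions:
        print('index' + ext, is_web_file('index' + ext))
```

Note that php[\d]? accepts any single digit after "php", so the pattern is slightly broader than the list: a name like index.php7 matches as well.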
Even with Python's less convenient class implementation of regex's (as
opposed to Perl's operator implementation), that's not a bad example, and
half of the power of regular expressions hasn't even been displayed here.
If you don't need a regex, don't feel obligated. (You very rarely *need*
a regex, but workarounds can get pretty ugly.)
Use them sparingly and they can save your butt. They did mine. <wink>
--
In 1968 it took the computing power of 2 C-64's to fly a rocket to the moon.
Now, in 1998 it takes the Power of a Pentium 200 to run Microsoft Windows 98.
Something must have gone wrong.