scanf style parsing

Richard Jones richard at bizarsoftware.com.au
Wed Sep 26 08:09:29 CEST 2001


On Wednesday 26 September 2001 15:42, Bruce Dawson wrote:
> I love programming in Python, but there are some things I have not found
> the easy way to do.

That's what we're here for :)


> I understand that Python is supposed to be good at
> text parsing, but I am having trouble with this simple task. Given this
> text (the output from VisualC++) I want to find out how many errors and
> warnings there were:
>
> smtpmail.exe - 0 error(s), 0 warning(s)
>
> In C/C++ that would be something like:
> sscanf(buffer, "smtpmail.exe - %d error(s), %d warning(s)", &errors,
> &warnings);
>
> It's not that I think the sscanf syntax is particularly elegant, but it
> sure is compact! I saw the discussion about adding scanf to Python
>
> http://mail.python.org/pipermail/python-dev/2001-April/014027.html
>
> but I need to know what people do right now when faced with this task.

Right now, people use regular expressions, which are more flexible than 
sscanf, but don't do the type conversions (every group comes out as a string) 
and are a little more verbose, code-wise.

[richard at ike ~]% python
Python 2.1.1 (#1, Jul 20 2001, 22:37:24) 
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.58mdk)] on linux-i386
Type "copyright", "credits" or "license" for more information.
>>> import re
>>> scan = re.compile(r'smtpmail\.exe - (\d+) error\(s\), (\d+) warning\(s\)')
>>> result = scan.match("smtpmail.exe - 0 error(s), 0 warning(s)")
>>> errors, warnings = map(int, result.groups())
>>> errors
0
>>> warnings
0
>>> 

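... or something similar. The RE can be made more flexible to allow for, e.g., 
the absence of the ", %d warning(s)" part. Note that the parentheses around 
the optional part form a group of their own, so groups() returns three items: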

>>> scan = re.compile(r'foo\.exe - (\d+) error\(s\)(, (\d+) warning\(s\))?')
>>> result = scan.match("foo.exe - 0 error(s)")
>>> result.groups()
('0', None, None)
>>> result = scan.match("foo.exe - 0 error(s), 0 warning(s)")
>>> result.groups()
('0', ', 0 warning(s)', '0')
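
If the extra group and the None entries get in the way, the same idea can be 
wrapped in a small function. This is only a sketch, assuming a current Python; 
the names SUMMARY and parse_summary, the \S+ match for the program name, and 
the "None means missing" convention are my own choices for illustration, not 
anything standard. It uses named groups and a non-capturing (?:...) group for 
the optional part, and does the int() conversion in one place:

```python
import re

# Hypothetical pattern: any non-space program name, then the error count,
# then an optional warning count. (?:...) groups without capturing.
SUMMARY = re.compile(
    r'(?P<target>\S+) - (?P<errors>\d+) error\(s\)'
    r'(?:, (?P<warnings>\d+) warning\(s\))?'
)

def parse_summary(line):
    """Return (errors, warnings) as ints; warnings is None if absent.

    Returns None when the line does not match at all.
    """
    m = SUMMARY.match(line)
    if m is None:
        return None
    errors = int(m.group('errors'))
    warnings = m.group('warnings')
    return errors, (int(warnings) if warnings is not None else None)

print(parse_summary("foo.exe - 0 error(s)"))                 # (0, None)
print(parse_summary("foo.exe - 3 error(s), 12 warning(s)"))  # (3, 12)
print(parse_summary("no build output here"))                 # None
```

Checking for None before calling group() matters: match() returns None on a 
failed match, and calling a method on that raises AttributeError.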

... and so on...
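
If what you miss most is the compactness of the sscanf format string, a rough 
approximation can be built on top of re. The sketch below is a hypothetical 
helper (simple_scanf is not a standard library function) that understands only 
%d: it escapes the literal text, replaces each %d with a digit group, and 
converts the results to int. Unlike the real sscanf it does not skip 
whitespace, handle %%, or support any other conversions.

```python
import re

def simple_scanf(fmt, text):
    """Match text against fmt, where each %d stands for a decimal integer.

    Only %d is supported. Returns a tuple of ints, or None if no match.
    """
    # Escape the literal pieces so '.', '(' and ')' match themselves.
    pieces = [re.escape(piece) for piece in fmt.split('%d')]
    # Rejoin the pieces with a capturing group in place of each %d.
    pattern = r'(-?\d+)'.join(pieces)
    m = re.match(pattern, text)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())

fmt = "smtpmail.exe - %d error(s), %d warning(s)"
result = simple_scanf(fmt, "smtpmail.exe - 2 error(s), 15 warning(s)")
print(result)   # (2, 15)
```

Since the result is a plain tuple, it unpacks the same way as the map(int, ...) 
line above: errors, warnings = result (after checking it is not None).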



     Richard



