How to catch exceptions elegantly in this situation?

Bengt Richter bokr at oz.net
Wed Oct 13 04:01:54 EDT 2004


On 12 Oct 2004 23:24:54 -0700, syed_saqib_ali at yahoo.com (Saqib Ali) wrote:

>Jeremy Bowers <jerf at jerf.org> wrote in message news:<pan.2004.10.08.02.50.53.714563 at jerf.org>...
>> On Thu, 07 Oct 2004 20:13:06 -0700, Saqib Ali wrote:
>> 
>> 
>> Can you give me more detail? What is the real problem?
>> 
>
>
>Sure.
>The real issue is that I am doing some screen-scraping from on-line
>white pages (residential telephone directory).
>
>I have defined a bunch of regular expressions and myFunc() populates a
>dictionary corresponding to each result from the white pages.
>
>So essentially myDict corresponds to a single record found. See below
>
>myDict["fullName"] = fullNameRegExp.match(htmlText)[0]
>myDict["telNum"] = telNumRegExp.match(htmlText)[0]
>myDict["streetAddr"] = streetAddrRegExp.match(htmlText)[0]
>myDict["city"] = cityRegExp.match(htmlText)[0]
>myDict["state"] = stateRegExp.match(htmlText)[0]
>myDict["zip"] = zipRegExp.match(htmlText)[0]
>
>
>Sometimes one or more of these regexps fails to match, in which case
>an exception will be raised. I want to catch the exception, print out
>a message... but then keep on going to the next assignment
>statement.
>
>
>How can I do that without wrapping each assignment in its own
>try/except block??
>
What exception are you getting? TypeError on unsubscriptable object?
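
For reference, a failed match returns None rather than raising, so it is the [0]
subscript on None that blows up. A minimal sketch, where the pattern and the text
are made up for illustration:

```python
import re

# Hypothetical pattern and text, for illustration only.
zip_rx = re.compile(r'\d{5}')
m = zip_rx.match('no digits here')   # match() returns None on failure

try:
    m[0]                             # subscripting None ...
except TypeError as e:
    err = e                          # ... raises TypeError
```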

How about (untested) something like:

# Look up each compiled regex by name (fullNameRegExp, telNumRegExp, ...)
for key in 'fullName telNum streetAddr city state zip'.split():
    m = getattr(locals()[key+'RegExp'], 'match')(htmlText)
    if m is None: print('Error: %s not found' % key)
    else: myDict[key] = m.group()

Obviously you could do something cleaner than the silly locals() and getattr stuff by using a
prepared dict, e.g. rxDict = {'fullName': fullNameRegExp.match, ...}, something like (untested)

for key, matcher in rxDict.items():
    m = matcher(htmlText)
    if m is None: print('Error: %s not found' % key)
    else: myDict[key] = m.group()
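
To make the prepared-dict version concrete, here is a small self-contained sketch.
The patterns and the sample htmlText are invented for illustration (the real ones
would come from your scraper). It also uses search rather than match, on the
assumption that the fields are not at the very start of the page, since match only
succeeds at the beginning of the string. The missing list is an extra, hypothetical
bit of bookkeeping.

```python
import re

# Hypothetical patterns and sample text; substitute the real ones.
rxDict = {
    'fullName': re.compile(r'[A-Z][a-z]+ [A-Z][a-z]+').search,
    'telNum':   re.compile(r'\(\d{3}\) \d{3}-\d{4}').search,
    'zip':      re.compile(r'\b\d{5}\b').search,
}
htmlText = '<b>Jane Doe</b> (555) 123-4567 Springfield'

myDict = {}
missing = []                         # keys whose regexp found nothing
for key, matcher in rxDict.items():
    m = matcher(htmlText)
    if m is None:
        missing.append(key)
        print('Error: %s not found' % key)
    else:
        myDict[key] = m.group()
```

With this sample text the zip pattern finds nothing, so only that key is reported
and the other two fields are still filled in.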

Or, if you just put '?' as value where nothing is found, you can get the dict in one line (untested ;-):

myDict = dict([(k, m is None and '?' or m.group()) for k, m in [(k2, mat(htmlText)) for k2, mat in rxDict.items()]])
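
On newer Pythons the same idea can be spelled with a conditional expression
(added in 2.5) and a dict comprehension (2.7 and 3.x) instead of the and/or trick.
A sketch, again with made-up patterns and text; the helper name first_or is
hypothetical:

```python
import re

def first_or(matcher, text, default='?'):
    """Return the matched text, or default if matcher finds nothing."""
    m = matcher(text)
    return default if m is None else m.group()

# Hypothetical patterns and sample text, for illustration only.
rxDict = {
    'telNum': re.compile(r'\(\d{3}\) \d{3}-\d{4}').search,
    'state':  re.compile(r'\b[A-Z]{2}\b').search,
}
htmlText = 'Call (555) 123-4567 today'

# Every key gets a value: the matched text or the '?' placeholder.
myDict = {k: first_or(matcher, htmlText) for k, matcher in rxDict.items()}
```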

Regards,
Bengt Richter


