if or exception

Thomas Lindgaard thomas at it-snedkeren.BLACK_HOLE.dk
Thu Jul 29 06:02:02 EDT 2004


On Thu, 29 Jul 2004 08:44:26 +0000, Duncan Booth wrote:

>> number = 0
>> if len(list) > 0: number = anotherNumber / len(list)

[snip]
 
> Your first suggestion may be the right answer in some situations.
> 
> Your second suggestion is never the right answer. This, on the other hand,
> could be a suitable answer:
> 
> try:
> 	number = anotherNumber / len(aList)
> except ZeroDivisionError:
> 	number = 0

Hmm... somehow that ZeroDivisionError got lost in translation... it
_was_ supposed to be there :)

> Don't use a bare except, it will just mask other errors: if you feel
> that catching an exception is the way to go, then catch only the
> exceptions that you expect. Also, but minor, don't use 'list' as a
> variable name.

I only used 'list' as a variable name to illustrate the type.
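
To make the point about bare excepts concrete, here is a minimal sketch.
The function names are made up for the example, and passing None stands in
for any unrelated bug in the calling code:

```python
def average_bare(total, items):
    # Bare except: every error is swallowed, not just division by zero.
    try:
        return total / len(items)
    except:
        return 0


def average_specific(total, items):
    # Only the expected failure (an empty sequence) is handled;
    # anything else propagates to the caller.
    try:
        return total / len(items)
    except ZeroDivisionError:
        return 0


# len(None) raises TypeError. The bare except hides it and returns 0,
# which looks exactly like the "empty list" case.
masked = average_bare(10, None)

# The specific version lets the same mistake surface as a TypeError.
try:
    average_specific(10, None)
    surfaced = False
except TypeError:
    surfaced = True
```

With the bare except, a bug elsewhere is indistinguishable from an empty
list; with the specific except it shows up as a traceback.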
 
> However, it seems to me that I would be unlikely to use either of these.
> I can't think of a situation where I would want a value that is either
> the result of a division, or 0 if the division failed. It is much more
> likely that you want to execute some different code if the list is empty
> than that you want a different value.
> 
> If you explained what problem you are solving leads to this code then
> you might get a more useful suggestion about style.

The situation is this: I am trying to make my web crawler "spread out",
i.e. I want it to fetch an equal number of pages from each known host (and
not 8000 pages from one host and 1 page from another host as it does now).
Setting number = 0 on ZeroDivisionError was just the first result that
came to mind... after writing the pseudo code below I can see that it
should have been something along the lines of number = 42 instead :)

+--- pseudo code ---
| # hostDict is a dictionary mapping currently known host names to info
| # about the host (i.e. time of last visit and number of pages fetched
| # from the host)
|
| class Crawler:
|   ... 
| 
|   def startPages(self):
|     try:
|       avgPagesPerHost = numPagesFetched / len(hostDict)
|     except ZeroDivisionError: 
|       # this will only happen the first time around when no hosts are
|       # known
|       avgPagesPerHost = 42
|
|     while len(queue):
|       link = queue.pop()
|       host = parseLinkAndExtractHost(link)
|
|       # a lot of checks
|
|       if hostDict[host]['numPagesFetchedFromHost'] < avgPagesPerHost:
|         fetchPage(link)
|       else:
|         delayPage(link)
+----
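
A runnable sketch of the same idea, under some assumptions: it is written
for Python 3 (urllib.parse), it returns two lists instead of calling
fetchPage/delayPage, and it treats a host that is not yet in hostDict as
having 0 pages fetched. The names average_pages_per_host, split_queue and
DEFAULT_AVG are hypothetical, and the "lot of checks" is left out:

```python
from urllib.parse import urlparse

DEFAULT_AVG = 42  # arbitrary starting budget, as in the pseudo code


def average_pages_per_host(num_pages_fetched, host_dict):
    # Falls back to DEFAULT_AVG only when no hosts are known yet.
    try:
        return num_pages_fetched / len(host_dict)
    except ZeroDivisionError:
        return DEFAULT_AVG


def split_queue(queue, host_dict, num_pages_fetched):
    """Return (to_fetch, to_delay): links under and over the average."""
    avg = average_pages_per_host(num_pages_fetched, host_dict)
    to_fetch, to_delay = [], []
    for link in queue:
        host = urlparse(link).netloc
        # Assumption: an unknown host counts as 0 pages fetched so far.
        info = host_dict.get(host, {})
        fetched = info.get('numPagesFetchedFromHost', 0)
        if fetched < avg:
            to_fetch.append(link)
        else:
            to_delay.append(link)
    return to_fetch, to_delay


# Example with made-up hosts: one heavily crawled, one barely touched.
host_dict = {
    'a.example': {'numPagesFetchedFromHost': 8000},
    'b.example': {'numPagesFetchedFromHost': 1},
}
queue = ['http://a.example/x', 'http://b.example/y', 'http://c.example/z']
to_fetch, to_delay = split_queue(queue, host_dict, 8001)
```

Here the link on a.example is delayed, while the links on b.example and
the so far unknown c.example are fetched.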

If the average number of pages found on a host turns out to be 10, then
something has to be done to make sure that the crawler continues fetching
pages from hosts with more than 10 pages... but that is another problem
entirely :)

-- 
Regards
/Thomas



