Good programming style

Bruno Desthuilliers bruno.42.desthuilliers at websiteburo.invalid
Fri Sep 12 12:44:36 CEST 2008

Astley Le Jasper wrote:
> I'm still learning python and would like to know what's a good way of
> organizing code.
> I am writing some scripts to scrape a number of different websites that
> hold similar information and then collating it all together. Obviously
> each site needs to be handled differently, but once the information is
> collected then more generic functions can be used.
> Is it best to have it all in one script or split it into per site
> scripts that can then be called by a manager script?
> If everything is
> in one script would you have per site functions to extract the data or
> generic functions that vary slightly depending on the site,

As far as I'm concerned, I'd choose the first solution. Decoupling 
what's varying (here, site-specific stuff) from "invariants" is so far 
the best way I know to keep complexity manageable.

> for
> example
> import firstSiteScript
> import secondSiteScript
> firstsitedata = firstSiteScript.getData('search_str')
> secondsitedata = secondSiteScript.getData('search_str')
> etc etc

Even better :

- put generic functions in a 'generic' module
- put the site-specific stuff for each site in its own module, in a
dedicated 'site_scripts' directory
- in your 'main' script, scan the site_scripts directory to loop over
site-specific modules, import them and run them (look for the __import__
built-in function)

This is a kind of quick-and-dirty lightweight plugin system that avoids
hard-coding imports and calls in the main script, so you just have to
add or remove site-specific scripts in the site_scripts directory.
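
A minimal sketch of such a loader, assuming each site module exposes a
getData(search_str) function as in the quoted example. The names
load_site_modules and collect are made up for illustration, and the
directory to scan is passed in rather than hard-coded:

```python
import os
import sys


def load_site_modules(directory):
    """Import every top-level .py file found in `directory`.

    Returns a dict mapping module name -> module object.
    (Hypothetical helper, not part of any library.)
    """
    modules = {}
    # Make the plugin directory importable for the duration of the scan.
    sys.path.insert(0, directory)
    try:
        for filename in sorted(os.listdir(directory)):
            name, ext = os.path.splitext(filename)
            # Skip non-Python files and private files such as __init__.py.
            if ext != ".py" or name.startswith("_"):
                continue
            # __import__ takes the module name as a string, so nothing
            # has to be hard-coded in the main script.
            modules[name] = __import__(name)
    finally:
        sys.path.remove(directory)
    return modules


def collect(directory, search_str):
    """Call getData(search_str) in each site module and gather the results.

    Assumes every site module defines getData, as in the quoted example.
    """
    results = {}
    for name, module in load_site_modules(directory).items():
        results[name] = module.getData(search_str)
    return results
```

The main script would then call something like
collect('site_scripts', 'search_str') and hand the resulting dict to the
generic functions. Module names have to be unique and must not shadow
anything already imported, since __import__ returns the cached module in
that case.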

Also, imported modules are not recompiled on each import - only when
they change - while the 'main' script gets recompiled on each invocation.


> OR
> def getdata(search_str, website):
>   if website == 'firstsite':
>     ....
>   elif website =='secondsite':

This one is IMHO the very worst thing to do.

My 2 cents...
