[Chicago] When to load?

Mon Feb 1 12:19:54 EST 2016

Leon,

You have asked a deeper and more pertinent question than I have seen on this list for a while.  Thank you for upping the discussion level.

When a module is imported the code in each module is loaded and executed only once, regardless of how often you use the import statement.  Subsequent import statements simply bind the module name to the module object already created by the previous import.  You can find a dictionary containing all currently loaded modules in the variable sys.modules.  This dictionary maps module names to module objects.  The contents of this dictionary are used to determine whether import loads a fresh copy of a module. (Python Essential Reference, 4th edition by David Beazley, pg 144)
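A small sketch of the caching behavior described above. It writes a throwaway module to a temporary directory (the name demo_parser_tables is made up for this illustration), imports it twice, and checks that the second import returns the already-loaded module object from sys.modules rather than re-running the file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "demo_parser_tables"  # hypothetical name, for illustration only

with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny module whose top-level code builds a dictionary.
    Path(tmp, MODULE_NAME + ".py").write_text("TERMS = {'spam': 1}\n")
    sys.path.insert(0, tmp)
    try:
        first = importlib.import_module(MODULE_NAME)   # executes the module body
        first.TERMS["eggs"] = 2                        # change the loaded module's data
        second = importlib.import_module(MODULE_NAME)  # answered from sys.modules
    finally:
        sys.path.remove(tmp)

# Both names refer to the one module object recorded in sys.modules.
same_object = first is second
cached = sys.modules[MODULE_NAME] is first

# If the file had been executed again, TERMS would have been reset.
terms_after_second_import = dict(second.TERMS)
print(same_object, cached, terms_after_second_import)
```

Because the dictionary still contains the entry added between the two imports, the module body clearly was not executed a second time.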

So in answer to your questions:
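1) If the dictionary-construction code sits at the top level of your parser module (or in a package's __init__.py), it will be run only once per interpreter process, the first time the module is imported.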
2) Hiding the operation of your program to make the code 'look cleaner' is not a best practice.  A good program has functions that depend only on their parameters and their (static) environment.  This allows the reader to determine how the code will run based on its input.  This doesn't mean that your Python code has to look like Fortran with a few dozen parameters.  Use Python's data structures to create a single object (probably a class instance, or a list or dictionary) that holds the parser's dictionaries and parameters, and then you will have only one added parameter to pass to your analysis routines.
3) See the answer to 2.
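One way the suggestion in 2) might look, as a rough sketch only. The names Parser, load_terms_from_database and main are invented for this example, and the loader is a stand-in for the real database read. The dictionaries are built once in the main program and travel with the parser object, so each call only takes the text.

```python
load_calls = 0  # counts how often the (pretend) expensive load runs


def load_terms_from_database():
    """Stand-in for the expensive database read; the data is made up."""
    global load_calls
    load_calls += 1
    return {"nouns": {"python", "module"}, "verbs": {"import", "parse"}}


class Parser:
    """Bundles the term dictionaries with the routines that use them."""

    def __init__(self, terms):
        self.terms = terms  # maps a category name to a set of known words

    def parse(self, text):
        """Return the known words found in text, grouped by category."""
        found = {}
        for word in text.lower().split():
            for category, words in self.terms.items():
                if word in words:
                    found.setdefault(category, []).append(word)
        return found


def main(texts):
    # Build the dictionaries once, then reuse the same parser for every text.
    parser = Parser(load_terms_from_database())
    return [parser.parse(text) for text in texts]


results = main(["Import the module", "Parse with Python"])
print(results, load_calls)
```

The main loop stays uncluttered (parser.parse(text)), yet nothing is hidden: a reader can see where the dictionaries come from and which object carries them.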

Phil Robare

-----Original Message-----
From: Chicago [mailto:chicago-bounces+proba=allstate.com at python.org] On Behalf Of Leon Shernoff
Sent: Monday, February 01, 2016 10:14 AM
To: chicago at python.org
Subject: [Chicago] When to load?

Hello,

I have a modularity design question. I am writing a program that, as it goes along, calls a text-parsing routine. In fact, the main program is a scraping program (or pseudo-scraping -- it will also run on a collection of text files) that runs this parsing routine in a loop over many pages/files.

The parsing routine calls various other subroutines, so I'd like to put the whole set of them in a separate file that gets imported by the main program. The parsing program uses several dictionaries of terms, and as it processes more and more texts it adds more terms to those dictionaries and they get stored in a database that is read at launch to construct the dictionaries. So the dictionaries are a bit expensive to generate and I'd like to have to construct them only once.

So, I'm unclear on the persistence here (experienced developer, pretty new to Python):

1) If I put the database-read dictionary-construction code in the parser's file, will those get run (and the dictionaries reconstructed) each time the main program uses the parser?

2) If so, do I need to construct the dictionaries in the main program and pass them to the parser each time I invoke it? That would make for several parameters, all of which would be the same each time except for the text to be parsed. This may be one of those things that's more annoying to humans than it is to machines; but if the whole point of sequestering the parse routines in a separate file is to make my main program look cleaner and easier to understand, it is kind of backwards to do that and then issue ugly, cluttered calls to those routines. :-)

3) Is there a better way? (or is #1 just not a problem and they only get constructed once) (Please, please...)  :-)

--
Best regards,
     Leon

"Creative work defines itself; therefore, confront the work."
      -- John Cage

Leon Shernoff
