Is there a maximum size to a Python program?

John Machin sjmachin at
Mon Apr 27 06:44:01 CEST 2009

On Apr 27, 1:01 pm, "Carbon Man" <dar... at> wrote:
> I have a program that is generated from a generic process. It's job is to
> check to see whether records (replicated from another system) exist in a
> local table, and if it doesn't, to add them. I have 1 of these programs for
> every table in the database. Everything works well until I do the postcode
> table. The generated code is 5MB for a system with no current data. Normally
> the file would not be this big as only the changes are copied over. Python
> just quits, I have tried stepping through the code in the debugger but it
> doesn't even start.
> I am thinking that dynamically generating the programs to run might not be
> such a good idea.

I tend to agree with that line of thinking. What you need is:
(a) ONE program. That's not one per table, that's one and only one,
(b) Per-table config info like column names and types -- you have this

The idea is that instead of reading a data file and generating SQL per
row, you generate SQL once, with parameter markers for the data
values. Then you read the data file and execute the prepared SQL for
each row

More detail:

At the start, build an INSERT statement that's tailored for the table
that you are updating:

insert_sql = "INSERT INTO table_name VALUES (?,?,?,?)"
or (if you are paranoid)
"INSERT INTO table_name(col1,col2,col3,col4) VALUES (?,?,?,?)"

Notes: (a) assumes four columns as an example (b) Lower-case stuff has
to be replaced with the correct table name and column name(s) (c) "?"
is the parameter marker for the DB module you are using

For each row read from the data file, transform the input into a tuple
of suitable data types, and then do the insert:
import yourdbmodule
# much later:
   cursor.execute(insert_sql, data_tuple)
except yourdbmodule.IntegrityError: #
   if it's the primary key not unique exception:
      pass # ignoring this is part of your requirement
      raise # probably never happen, but should be checked

> It would be a shame to drop it because the system needs to
> be generic

OTOH it would be a shame to continue with a process where the amount
of code lying around is O(N_update_rows) instead of O(1) and the
number of programs is O(N_tables) instead of O(1).

> and it runs from an XML file so the resulting code could be
> pretty complex, and I am new to Python.

The above advice is in general language-agnostic; I'm just using
Python code because you asked in this newsgroup. The fact that other
languages may give you more rope with which to hang yourself is also
immaterial :-)


More information about the Python-list mailing list