Newbie: Crazy, but Quick

Fri Feb 22 22:23:04 EST 2002

On 22 Feb 2002 23:26:42 GMT
Quinn Dunkan wrote:

> On Fri, 22 Feb 2002 09:23:24 -0800, Cliff Wells
> <logiplexsoftware at earthlink.net> wrote:
> >For that matter, you could also store the Python objects directly in the
> >SQL database.  My own opinion is that SQL is so easy that it's a no-brainer
> >to use it whenever possible.  A problem I've seen many times is people
> >utilize some shortcut because they feel the program will never need
> >anything more, only later to discover that the program has utility far
> >beyond what they expected, but the shortcut they've taken cripples it, and
> >they end up rewriting (or worse, they don't).
> 
> I don't find SQL to be a no-brainer.  You have to download the thing (MySQL at
> least is massive... the sunos binary tarball I have is >5MB).  Then you have to

5MB?  And how big was /your/ Python download?  The RPMs for my system were:
Python 2.2: 8MB
Python docs: 2MB

Hm.  5MB is looking pretty good.

I won't mention wxPython, but maybe I shouldn't use that either, since it's
almost 3MB.

> install it (create tables, blah blah, learn MySQL's complicated permission
> system, etc.).  Then you have to turn it on (which means either getting root to
> edit the system rcs or setting up some hackery to restart it if the server
> reboots).  Not to mention that you have to learn enough SQL to do all this.
> Oh yes, and if there are no binaries for your platform, I hope you happen to
> have a decent C++ compiler lying around.

I was referring specifically to the Windows port, to which none of these
things (other than setting user permissions and creating tables) applies.
You download a single .zip (yes, it's fairly large, ~12MB, but still very
doable, even over 56K - I'm doing it at this moment, just for fun ;), unzip
it, run the program named "install.exe", read the file named README (still
with me?), and run a single command (the secret of which is revealed in
the README), "mysqladmin.exe".  From that point on, MySQL runs
automatically upon boot.  Anyone who can't manage that shouldn't bother
trying to write an application (unless Rue's 9 commands are done by then).

> And even if it is easy to introduce a giant complicated support program, should
> you?  If linking in a massive shared library (or *any* shared library) is as
> easy as tacking on a '-lpiggy', is it a no-brainer to link it in whenever
> possible?  I don't think so.  How easy it is for the developer to create

Absolutely agree.  But when writing a database application, it makes sense
to use a database.

> something is only part of the equation (that's what the python "creed" of
> legibility and simplicity is about, yes?).

Sure.  And using MySQL will simplify your code, because it handles the data
for you.  Writing a report becomes as simple as doing a query, rather than
writing a couple of pages of Python to sort and group your data (oh, wait,
we didn't spec reports, did we?  Well, we want them now).  In fact, if you
have Excel, you can use it to create the reports with no coding at all, and
you get some nice formatting to boot (your Python code will by now be
hundreds of lines, and quite complex).
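
To make the "report as a query" point concrete, here is a minimal sketch.
It is not from the original thread: it assumes a hypothetical "sales" table
with "region" and "amount" columns, and it uses Python's bundled sqlite3
module as a stand-in for MySQL so that it runs without a server.  The same
SELECT should work largely unchanged against MySQL through a DB-API driver.

```python
import sqlite3


def sales_report(conn):
    """Return (region, order_count, total) rows, grouped and sorted by the database."""
    # The grouping and sorting happen in SQL, not in Python code.
    query = """
        SELECT region, COUNT(*) AS orders, SUM(amount) AS total
        FROM sales
        GROUP BY region
        ORDER BY total DESC
    """
    return conn.execute(query).fetchall()


def demo():
    # Hypothetical sample data; an in-memory SQLite database stands in for MySQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10), ("west", 25), ("east", 5), ("west", 5), ("north", 7)],
    )
    conn.commit()
    return sales_report(conn)


if __name__ == "__main__":
    for region, orders, total in demo():
        print(region, orders, total)
```

Adding a new report here would mostly mean writing another SELECT, which is
the trade-off being argued for above.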

> 
> Of course, you're right that you can easily marshal python objects to SQL, I
> assume by pickling to a string.  But then you lose the "it's not python
> specific" advantage of SQL.  And you still have to scrunch your data into the
> relational rows/columns model.  What if it's hierarchical and lumpy?

Obviously, modeling hierarchical data in a relational database isn't as
easy, but is that really on-topic?  The OP's data sounded pretty relational
to me.  If I /did/ need to model hierarchical data, I would try to find a
database that does it already, versus reinventing the wheel.  Besides, I
wouldn't think a programmer who can deal with that sort of thing should
have any problem installing a database, would she?

> 
> >What if she decides to maintain historical data?  Then to keep the files
> >reasonably sized, she'll have to start juggling files.
> 
> I think the *dbm type stuff scales ok up to mediumish files at least (there are
> a few > 50MB gdbms lying around, INN uses it for article index or something).
> I have no idea if it's happy all the way up to a few gigs.  If I wanted to use
> it for that much data I'd test it out.

And I'd still argue that using MySQL is as easy as using dbm, and I won't
need to wonder if it will scale.  Not to mention it'll be /slightly/ faster
;)

> >Software has a tendency to grow beyond the developer's expectations, so I
> >consider it unwise to take shortcuts when a better long-term solution isn't
> >much more difficult to implement.
> 
> That's one point of view, and a reasonable one.  Another point of view is that
> you should only design for what you know you need, instead of increasing
> generality and complexity to deal with some hypothetical future situation.
> If you think your program has acquired a new requirement, you decide if
> it's really a requirement or if it should be done by another program, and if
> it is, you modify your program, which hopefully is easy to modify since you've
> been trying to keep things simple.

Yes and no.  I would agree with this if you are the only one who will ever
use the software, or you are writing a simple, single-purpose utility.  If
other people will be using the software, then I can guarantee you that you
don't "know what you need", because people will invariably want more
features, ways to interface your program with their existing software, etc.
If you store your data in a SQL database, you're halfway home.  How about
making your database available on a network?  Seems like a reasonable
request, but your dbm/flatfile based code is going to get more complex (and
fragile), while the SQL-based solution will remain more or less the same.

> In the shelve case, it would be pretty easy to write an SQL-using backend
> that implements the shelve interface.  And maybe you'll like that interface
> better than the SQL-ish cur.execute(this_and_that) anyway.

I probably wouldn't actually do this unless there were a very good reason
to.  I have, in the past, actually stored compiled Python byte-code in a SQL
database (I had a good reason ;), but in general, I haven't had much
problem with the standard SQL datatypes and query language.
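
For reference, the SQL-backed shelve idea from the quoted text could look
roughly like the sketch below.  This is an illustration only, not code from
either poster: the class name SQLShelf and the "shelf" table are
hypothetical, sqlite3 again stands in for MySQL, and values are pickled, so
the stored data is Python-specific (the drawback raised earlier) and should
only be unpickled from a trusted database.

```python
import pickle
import sqlite3
from collections.abc import MutableMapping


class SQLShelf(MutableMapping):
    """Dict-like store that keeps pickled values in a single SQL table."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS shelf (key TEXT PRIMARY KEY, value BLOB)"
        )

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM shelf WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return pickle.loads(row[0])

    def __setitem__(self, key, value):
        # INSERT OR REPLACE is SQLite syntax; MySQL spells this differently.
        self._conn.execute(
            "INSERT OR REPLACE INTO shelf (key, value) VALUES (?, ?)",
            (key, pickle.dumps(value)),
        )
        self._conn.commit()

    def __delitem__(self, key):
        cur = self._conn.execute("DELETE FROM shelf WHERE key = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)
        self._conn.commit()

    def __iter__(self):
        # Fetch all keys up front so callers can modify the shelf while iterating.
        rows = self._conn.execute("SELECT key FROM shelf ORDER BY key").fetchall()
        return iter([key for (key,) in rows])

    def __len__(self):
        return self._conn.execute("SELECT COUNT(*) FROM shelf").fetchone()[0]
```

Usage would resemble a shelf or dictionary, e.g. shelf = SQLShelf() followed
by shelf["a"] = {"x": [1, 2]}.  Because the values are opaque pickles, other
programs cannot query them with plain SQL, which is one reason to prefer the
standard SQL datatypes where they fit.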

> >Not only that, but storing the data in a SQL database separates the data
> >storage from the program, so in the future she could use that data from
> >some other program, not just the Python one.  Then if she decides she wants
> >to access the data from, say, a web server, it's not a big problem.
> 
> You can write CGI scripts in python :)

Easier than using MySQL?  That's arguable.  Besides, web access was a
single example.  What if she wants to analyze the data in Excel?  Mark
Hammond's win32 stuff is nice and makes such things reasonably easy, but
still far more difficult than accessing the data via ODBC.  And now,
you've got three access methods to support: Python, CGI, and Excel.  This
is the point behind n-tier design: separate data and interface from program
logic.  Using an external database is an easy and flexible way to separate
the data.  This is especially important when you consider that most of the
time, the data is more valuable than the program that generated it - what
happens when you leave that program behind and some poor sod is left trying
to extract all your data so it can be used in a more scalable program?

Sorry if I sound like a jerk - I've had a long week hacking on someone's
code for a custom database that had to grow beyond its means.  Now if he'd
only used SQL...

Regards,

-- 
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308  (800) 735-0555 x308