[Tutor] OT SQL (but through Python...)

Kent Johnson kent37 at tds.net
Fri Feb 25 12:15:05 CET 2005


Liam Clarke wrote:
> Hi, 
> 
> Hope I don't annoy anyone by asking this here, if I do, let me know. 
> 
> When you're doing a SQL select statement, what would be better? Say
> you're searching by name, should I do -
> j = cx.execute
> j('select * from foo where first == %s and last == %s') % (a,b)
> q = cx.fetchall()
> if not q: 
>     j('select * from foo where first like %s%% and last like %s%%') % (a,b)
> 
> or just use 
> 
> j('select * from foo where first like %s%% and last like %s%%') % (a,b)
> 
> straight off the bat?

First, a code correction. You should never substitute user values into your SQL. Use the database 
driver's capabilities instead. That will ensure that special characters are escaped correctly.

The syntax to do this varies for each database - the 'paramstyle' attribute of the database driver 
will tell you which style to use; the docs should also mention it.

IIRC you are using SQLite, which says,
  >>> import sqlite
  >>> sqlite.paramstyle
'pyformat'

The docs have this example:
  >>> cu.execute("insert into test(u1, u2) values (%s, %s)", (u"\x99sterreich", u"Ungarn"))

so I think your code should be
   j('select * from foo where first == %s and last == %s', (a,b))

(note the final '%' is now a comma and (a, b) is a parameter passed to j.)

Why should you care? Imagine a user searches for "Tim" "John'son". Then your SQL becomes
   select * from foo where first == Tim and last == John'son
which will probably give you a syntax errer.

Worse, this opens you up to malicious attacks. What if I search for "Tim" "Johnson; delete table 
foo"? Now your SQL is
   select * from foo where first == Tim and last == Johnson; delete table foo

...oops

Finally, some databases will optimize repeated queries. If you substitute the string yourself you 
don't get this benefit because the query string is different each time.

Two more minor notes about the SQL - I use =, not == - actually I'm surprised == works. And if you 
are going to have string literals in your SQL put them in 'quotes'.

> 
> The first method gives me direct match, and searches for alternatives
> if none are found,
> the second may not give me a direct match if there is one. 
> 
> i.e 
> 
> first             last
> 
> Tim           Johns
> Timothy    Johnston
> 
> the 1st method will find the 1st row for a='Tim' b = 'Johns's, but the
> 2nd method will find both.
> 
> I'm asking more from a user perspective really, would the quicker
> action of the first outweigh the inconstant UI resulting from a
> search?

This isn't really a database question but a UI question. Who are the users? Can you ask them what 
they would prefer? If it is just for you, what do you prefer?

Maybe you want an "Exact match" checkbox in your GUI?

I wouldn't decide on the basis of speed, for any decent database you won't notice the difference.

Kent



More information about the Tutor mailing list