data structure suggestion (native python datatypes or sqlite; compound select)

MRAB python at mrabarnett.plus.com
Thu Sep 16 20:18:04 EDT 2010


On 17/09/2010 00:56, Vlastimil Brom wrote:
> 2010/9/17 MRAB<python at mrabarnett.plus.com>:
>> On 16/09/2010 23:11, Vlastimil Brom wrote:
>>>
>>> ...
>>> I put together some code, which works as expected, but I suspect
>>> somehow, that there must be better ways of doing it.
>>>
>>> Two things I am not quite clear about are using the placeholders for
>>> the data identifiers and "chaining" the SELECT parameters.
>>>
>>> I Couldn't find a way to use "?" placeholder for table or column
>>> names, hence I ended up using string interpolation for them and
>>> placeholders for the data values, like.
>>> curs.execute('SELECT * FROM "%s" WHERE "%s"==?' % (text_name,
>>> index_col), (text_index,))
>>> is there a better way or is it not supposed to supply these
>>> identifiers programatically?
>>>
>> You would normally expect the structure of the database to be fixed and
>> only the contents to vary.
>>
>>> For getting the matching text indices given the tags, tag_values
>>> combination I ended up with a clumsy query:
>>>
>>> combined_query_list = ['SELECT "%s" FROM "%s" WHERE "%s"==?' %
>>> (index_col, text_name, tag) for tag in tags]
>>> sql_query = " INTERSECT ".join(combined_query_list)
>>> curs.execute(sql_query, tag_values)
>>>
>>> which produces e.g.:
>>> SELECT "ind" FROM "n" WHERE "KC"==? INTERSECT SELECT "ind" FROM "n"
>>> WHERE "VN"==?
>>>
>>> or alternatively:
>>>
>>> select_begin = 'SELECT "%s" FROM "%s" WHERE ' % (index_col, text_name)
>>> where_subquery = " AND ".join('"%s"==?' % (tag,) for tag in tags)
>>> sql_query = select_begin + where_subquery
>>>
>>> with the resulting query string like:
>>> SELECT "ind" FROM "n" WHERE "KC"==? AND "VN"==? ('12', '1')
>>>
>>> (BTW, are these queries equivalent, as the outputs suggest, or are
>>> there some distinctions to be aware of?)
>>>
>>> Anyway, I can't really believe, this would be the expected way ...
>>>
>> If you're selecting rows of a table then using 'AND' would seem the
>> obvious way.
>>
>>
> Thanks for the answer,
> Well, that may be a part of the problem, the database structure is
> going to be fixed once I'll have the text sources complete, but I was
> trying to keep it more general, also allowing the identifiers to be
> passed programmatically (based on the tagged text in question).
>
> yes, I am just selecting rows - based on the combination of the column
> values (which, I guess, might be an usual database approach(?).
> However, I was unsure, whether it is usual to construct the query
> string this way - partly using string interpolation or sequence
> joining.
> Or should there normally be no need for construct like the above and I
> am doing something wrong in a more general sense?
>
In general you'd have a fixed database structure and fixed queries.
You'd design it so that you wouldn't have different columns for
different tagged texts, forcing you to change the database structure
when texts were added or removed, but no-one will come around to your
house to stop you creating queries dynamically. :-)



More information about the Python-list mailing list