XML to SQL or XML into Tables SomeHow

Kent Polk kent at tiamat.goathill.org
Thu May 25 14:26:26 EDT 2000


On Thu, 25 May 2000 08:22:09 -0400, Frank V. Castellucci wrote:
>
>The DTD entity describes a table with attribute lists that name the
>columns.
>
><db name="test">
>	<table name="foo" sex="yesplease">
>		<table_name>person</table_name>
>	</table>
></db>
>
>This was chosen because (prior to finding, and while still looking for,
>XML schema-capable compilers) the attribute lists were deemed to have
>flexible constraint properties.
>
>The database driver component is capable of responding to reasoning
>requests (such as which columns are nullable, which are primary,
>etc.)

I'm in the process of doing this also. After reading the W3C material,
I settled on the following format:

<db name="test">
  <table name="genope">
    <genope sid="24" mid="D5S392" dye="R" gloc="1*-02" a1="84" a2="104"/>
    <genope sid="30" mid="D5S392" dye="B" gloc="1*-01" a1="84" a2="110">
      <pgree>DISCR009</pgree>
    </genope>
    <genope sid="31" mid="D5S392" dye="G" gloc="1*-42" a1="94" a2="98"/>
  </table>
</db>

Using the xml.sax parsers and the simple example they provide, I added
about 10 lines of code and was correctly parsing XML files of this
general format, storing them in a dictionary of database names, each
containing a dictionary of tables, each containing a list of 'table
record' dictionaries. It's pretty easy to insert into simple database
tables from there. Tag attributes hold items that cannot be null
(though they can be an empty string), whereas subtags hold items that
can be null. Admittedly I use what is probably a big hack, the
tag nesting level, to determine name associations, but the
advantage is that the parser only has to know about the tags 'db'
and 'table' to handle any database table data that I'm likely to
run across.
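
A minimal sketch of the nesting-level approach described above, written
for present-day Python 3 rather than taken from the original code. The
class name TableHandler and its attribute names are hypothetical, and it
assumes the database name is carried in a "name" attribute on <db>.
Only 'db' and 'table' are recognized by name; everything else is placed
by depth.

```python
import xml.sax

SAMPLE = """\
<db name="test">
  <table name="genope">
    <genope sid="24" mid="D5S392" dye="R" gloc="1*-02" a1="84" a2="104"/>
    <genope sid="30" mid="D5S392" dye="B" gloc="1*-01" a1="84" a2="110">
      <pgree>DISCR009</pgree>
    </genope>
    <genope sid="31" mid="D5S392" dye="G" gloc="1*-42" a1="94" a2="98"/>
  </table>
</db>
"""


class TableHandler(xml.sax.ContentHandler):
    """Build {db name: {table name: [record dict, ...]}} from tag depth."""

    def __init__(self):
        super().__init__()
        self.databases = {}
        self.depth = 0
        self.tables = None   # tables of the <db> being read
        self.records = None  # record list of the <table> being read
        self.record = None   # record dict being filled
        self.text = []       # character data of the current subtag

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth == 1 and name == "db":
            self.tables = self.databases.setdefault(attrs.get("name"), {})
        elif self.depth == 2 and name == "table":
            self.records = self.tables.setdefault(attrs.get("name"), [])
        elif self.depth == 3:
            # Record tag: its attributes are the non-null columns.
            self.record = dict(attrs.items())
            self.records.append(self.record)
        elif self.depth == 4:
            # Subtag of a record: a nullable column, present only if set.
            self.text = []

    def characters(self, content):
        if self.depth == 4:
            self.text.append(content)

    def endElement(self, name):
        if self.depth == 4:
            self.record[name] = "".join(self.text).strip()
        self.depth -= 1


handler = TableHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
records = handler.databases["test"]["genope"]
```

A nullable column that was not supplied is simply absent from the record
dictionary, so records[0].get("pgree") gives None while
records[1]["pgree"] holds the subtag text.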

The <table> tag is really superfluous, and I oscillate between
keeping it and taking it out. Having it there affords a bit more
readability, separates initializing the list of record dictionaries
from appending items to it, and makes sure the records are all
encapsulated (if that matters?).
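
For the insertion step mentioned earlier, here is a rough sketch of
loading one table's list of record dictionaries into a database. The
post does not name a database, so the use of sqlite3 is an assumption,
as are the REQUIRED and NULLABLE column lists, which stand in for
whatever schema information is available. Column names are interpolated
into the SQL text, so they are assumed to come from a trusted source.

```python
import sqlite3

# Record dictionaries of the kind the parser produces for one table.
records = [
    {"sid": "24", "mid": "D5S392", "dye": "R",
     "gloc": "1*-02", "a1": "84", "a2": "104"},
    {"sid": "30", "mid": "D5S392", "dye": "B",
     "gloc": "1*-01", "a1": "84", "a2": "110", "pgree": "DISCR009"},
]

# Hypothetical schema: attribute columns are NOT NULL, subtag columns
# are nullable.
REQUIRED = ["sid", "mid", "dye", "gloc", "a1", "a2"]
NULLABLE = ["pgree"]
columns = REQUIRED + NULLABLE

conn = sqlite3.connect(":memory:")
column_defs = ", ".join(
    [f"{c} TEXT NOT NULL" for c in REQUIRED] + [f"{c} TEXT" for c in NULLABLE]
)
conn.execute(f"CREATE TABLE genope ({column_defs})")

# One placeholder per column; a missing subtag becomes SQL NULL.
placeholders = ", ".join("?" for _ in columns)
sql = f"INSERT INTO genope ({', '.join(columns)}) VALUES ({placeholders})"
conn.executemany(sql, [tuple(r.get(c) for c in columns) for r in records])
conn.commit()

rows = conn.execute("SELECT sid, pgree FROM genope ORDER BY sid").fetchall()
```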

Does this seem to be a reasonable path to take, given your experience?



