[Tutor] Get it, Parse it, Dump it....

Schmidt, Allen J. aschmidt@nv.cc.va.us
Thu, 14 Jun 2001 07:52:28 -0400


Hello All...

I've been lurking for a few weeks and now need the services of this list.

I have a URL which connects to someone else's database system and returns
what I have been told is "compact XML." Having never heard of it before, I
was curious. Submitting the URL in a browser, I get back a ton of data that
is lumped together on the screen. Viewing the page source shows something else.

The first 2 lines are tag-like and the data starts on the third line. That
line contains this:
<COLUMN>NAME    ADDRESS    PHONE</COLUMN>

The field names above are stand-ins to keep this simple; the real line has
about 80 fields.
Each field is separated by a TAB character.

Then all the following lines look like this: (several hundred lines)
<DATA>Fred    1123 Main Street    555-1212</DATA>

This goes on until the end where there is another ending-tag-like tag on the
very last line.
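
A minimal sketch of how text in this layout might be pulled apart, assuming
the payload is exactly as described above: two wrapper lines, one <COLUMN>
line, many <DATA> lines, and one closing line. The function name
parse_compact_xml and the wrapper tags in the sample are made up for
illustration; the real first two lines and last line are not shown in this
post. In Python 3 the text would come from urllib.request.urlopen(url), the
counterpart of urllib.urlopen.

```python
def parse_compact_xml(text):
    """Return (column_names, rows) from the tab-separated payload."""
    # Ignore blank lines so a trailing newline does not shift the slice below.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Drop the two leading tag-like lines and the closing tag-like line.
    body = lines[2:-1]
    columns, rows = [], []
    for line in body:
        if line.startswith("<COLUMN>") and line.endswith("</COLUMN>"):
            inner = line[len("<COLUMN>"):-len("</COLUMN>")]
            columns = inner.split("\t")
        elif line.startswith("<DATA>") and line.endswith("</DATA>"):
            inner = line[len("<DATA>"):-len("</DATA>")]
            rows.append(inner.split("\t"))
    return columns, rows


# Hypothetical sample: the two header lines and the footer are placeholders.
sample = (
    "<HEADER-LINE-1>\n"
    "<HEADER-LINE-2>\n"
    "<COLUMN>NAME\tADDRESS\tPHONE</COLUMN>\n"
    "<DATA>Fred\t1123 Main Street\t555-1212</DATA>\n"
    "<FOOTER-LINE>\n"
)
columns, rows = parse_compact_xml(sample)
```

Splitting on the TAB character keeps fields such as "1123 Main Street" in one
piece, which splitting on whitespace would not.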

Here is what I need to be able to do AUTOMATICALLY:

1) Submit the URL (I can get the data back by using urllib.urlopen(url) )

2) Parse the data to remove the first 2 lines and the last line.

3) Parse each line to remove the start and end tags.

4) Make a connection to an ODBC data source on the same PC.

5) Delete all the data in a table.

6) 'Pour' all the new data into the appropriate fields in the table.

7) Send me an email letting me know the status of the job - number of
records, etc.
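
Steps 5 through 7 could be sketched roughly as follows. This is only an
illustration under stated assumptions: sqlite3 stands in for the ODBC
connection so the example runs anywhere (a DB-API ODBC module such as pyodbc
would supply the real connection), and the table name "contacts", the
addresses, and the function names reload_table and build_status_message are
all hypothetical.

```python
import sqlite3
from email.message import EmailMessage


def reload_table(conn, table, columns, rows):
    """Empty the table, insert every row, and return the row count."""
    # Table and column names are interpolated directly, so they must come
    # from a trusted source; the row values go through "?" placeholders.
    col_list = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    cur = conn.cursor()
    try:
        cur.execute(f"DELETE FROM {table}")
        cur.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({marks})", rows
        )
        conn.commit()
    except Exception:
        # Undo the delete if the load fails, so the old data survives.
        conn.rollback()
        raise
    return len(rows)


def build_status_message(count, sender, recipient):
    """Build (but do not send) the status email for the job."""
    msg = EmailMessage()
    msg["Subject"] = f"Nightly load: {count} records"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Loaded {count} records into the table.")
    return msg


# Stand-in database with one stale row that the reload should replace.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (NAME TEXT, ADDRESS TEXT, PHONE TEXT)")
conn.execute("INSERT INTO contacts VALUES ('stale', 'old', '000')")
conn.commit()

count = reload_table(
    conn,
    "contacts",
    ["NAME", "ADDRESS", "PHONE"],
    [["Fred", "1123 Main Street", "555-1212"]],
)
msg = build_status_message(count, "loader@example.com", "me@example.com")
```

Actually sending the message would be a matter of handing msg to
smtplib.SMTP(host).send_message(msg), where host is whatever mail server is
available; that part is left out because the server is not known here.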


That's it!  I have been able to do some of these things at the winPython
prompt, but I'm not sure how to proceed with the details of making this into
a stand-alone .py file that handles all of them.
My thought would be to cron this so it runs at night to update my data.

So...thoughts, comments, directions...

I don't want someone to DO it for me, but I think I need some basic help in
putting the pieces together. If it's too much to ask of this list, I would
like to know that too.

Thanks!

Allen