[Tutor] Get it, Parse it, Dump it....

Schmidt, Allen J. aschmidt@nv.cc.va.us
Thu, 14 Jun 2001 11:17:03 -0400


Great! Thanks for responding.

From the PythonWin interpreter, one line at a time, I do this:

>>> import urllib
>>> x=urllib.urlopen("http://site.that.has.the.data/search?etc=stuff")   
         ## This takes 2 minutes to come back

>>> print x.readlines()

This shows everything on one LONG line, coded as I describe below, with
the TAB character between fields showing as '\011'.

I have been able to use string.split to break the line into a list on the
TAB character.
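
For reference, '\011' is simply the octal escape for the TAB character, the
same thing as '\t'; it shows up that way because printing a list displays
the repr of each string. A minimal sketch of the split, written for current
Python 3 (where the str.split method takes the place of string.split). The
sample line is made up:

```python
# '\011' is the octal escape for TAB, identical to '\t'.
assert "\011" == "\t"

# Made-up sample line; the real one has about 80 fields.
sample = "Fred\t1123 Main Street\t555-1212"

# Splitting on TAB only keeps the spaces inside a field intact.
fields = sample.split("\t")
print(fields)
```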

Not sure how to go about building this into something that can run on its
own instead of the command prompt.
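
A minimal sketch of what a stand-alone file might look like, assuming
current Python 3 (where urllib.request.urlopen replaces urllib.urlopen).
The file name getdata.py, the function names, the 300-second timeout and
the latin-1 encoding are all assumptions made for illustration:

```python
import sys
import urllib.request


def fetch_lines(url, timeout=300):
    """Download `url` and return the body as a list of text lines."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw_lines = response.read().splitlines()
    # latin-1 is an assumption; the server's real charset is unknown.
    return [raw.decode("latin-1") for raw in raw_lines]


def main(argv=None, fetch=fetch_lines):
    """Fetch the URL given on the command line and report the line count."""
    if argv is None:
        argv = sys.argv[1:]
    if len(argv) != 1:
        sys.stderr.write("usage: getdata.py URL\n")
        return 2
    lines = fetch(argv[0])
    print("fetched %d lines" % len(lines))
    return 0


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
```

Saved under a name such as getdata.py, this would be started from a command
prompt with the URL as its only argument. Passing the return value of main()
to sys.exit() would also hand the status code back to whatever scheduled it.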

Also need to know how to connect to and talk to my ODBC defined database
(Access2000) and then take the fields, one line at a time, and insert them
into the DB table.
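
A minimal sketch of emptying a table and inserting rows through a DB-API 2.0
connection, assuming a driver that uses '?' placeholders. A third-party ODBC
driver such as pyodbc is one possibility for the Access data source (an
assumption; it is not mentioned in this thread). The demo uses Python's
built-in sqlite3 purely as a stand-in so that it runs anywhere, and the
table and column names are made up:

```python
import sqlite3  # stand-in for the demo only, not the real ODBC source


def reload_table(conn, table, columns, rows):
    """Empty `table`, then insert `rows`; return the number inserted.

    `conn` is any DB-API 2.0 connection whose driver uses '?' placeholders.
    `table` and `columns` are pasted into the SQL text, so they must come
    from trusted code, never from the downloaded data.
    """
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = "INSERT INTO %s (%s) VALUES (%s)" % (
        table, ", ".join(columns), placeholders)
    cur = conn.cursor()
    try:
        cur.execute("DELETE FROM %s" % table)
        cur.executemany(insert_sql, rows)
        conn.commit()
    except Exception:
        conn.rollback()  # keep the old data if anything fails
        raise
    finally:
        cur.close()
    return len(rows)


# Demo with an in-memory SQLite database and made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (NAME TEXT, ADDRESS TEXT, PHONE TEXT)")
conn.execute("INSERT INTO people VALUES ('old', 'old', 'old')")
count = reload_table(conn, "people", ["NAME", "ADDRESS", "PHONE"],
                     [["Fred", "1123 Main Street", "555-1212"]])
print(count, conn.execute("SELECT * FROM people").fetchall())
```

Doing the delete and the inserts in one transaction means a failed load
leaves the previous night's data in place instead of an empty table.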

I am using Zope so if there is a way to build this into a script or external
method and call it from Zope that would be great too. Using Zope, I have put
off learning Python long enough. Now I really need it!

Thanks
Allen

-----Original Message-----
From: Patrick K. O'Brien [mailto:pobrien@orbtech.com]
Sent: Thursday, June 14, 2001 10:01 AM
To: Python Tutor
Subject: RE: [Tutor] Get it, Parse it, Dump it....


I can't speak for anyone else, but it would help me help you if you could
expand on what you have that *is* working and what *specifically* you still
need on the other things. Otherwise it's a pretty big list. I know some of
the answers, but I don't know if you want advice on where to start or
whether you want to see sample code or whether you have some code and it
isn't doing exactly what you want. So now that we know the scope of your
project (which sounds pretty cool, btw) maybe you could break out some
specific requests so that we can help you with smaller chunks until you have
everything that you need.

---
Patrick K. O'Brien
Orbtech
"I am, therefore I think."

-----Original Message-----
From: tutor-admin@python.org [mailto:tutor-admin@python.org] On Behalf Of
Schmidt, Allen J.
Sent: Thursday, June 14, 2001 6:52 AM
To: tutor@python.org
Subject: [Tutor] Get it, Parse it, Dump it....

Hello All...

I've been lurking for a few weeks and now need the services of this list.

I have a URL which connects to someone else's database system and returns
what I have been told is "compact XML." Having never heard of it before I
was curious. Submitting the URL in a browser I get back a ton of data that
is lumped together on the screen. A view source shows something else.

The first 2 lines are tag-like and the data starts on the third line. That
line contains this:
<COLUMN>NAME    ADDRESS    PHONE</COLUMN>

The field names above are simplified stand-ins; the real line has about
80 fields.
Each field is separated by a TAB character.

All the following lines (several hundred of them) look like this:
<DATA>Fred    1123 Main Street    555-1212</DATA>

This goes on until the end, where there is a closing tag on the
very last line.
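
The layout described above could be parsed along these lines. This is a
sketch under assumptions: the function names are made up, and so are the
wrapper lines in the sample (the real first two lines and last line are not
shown here). It picks out lines by their COLUMN and DATA tags instead of
counting lines from the top and bottom, so anything else is simply skipped:

```python
def strip_tag(line, tag):
    """Return the text between <tag> and </tag>, or None if absent."""
    line = line.strip("\r\n")  # leave TABs alone, they separate fields
    start, end = "<%s>" % tag, "</%s>" % tag
    if line.startswith(start) and line.endswith(end):
        return line[len(start):len(line) - len(end)]
    return None


def parse_compact(lines):
    """Split the reply into (column_names, list_of_rows)."""
    columns, rows = [], []
    for line in lines:
        header = strip_tag(line, "COLUMN")
        if header is not None:
            columns = header.split("\t")
            continue
        data = strip_tag(line, "DATA")
        if data is not None:
            rows.append(data.split("\t"))
        # Any other line (leading lines, closing tag) is ignored.
    return columns, rows


# Made-up sample; only the COLUMN and DATA lines follow the real format.
sample = [
    '<?xml version="1.0"?>',
    "<RESULTS>",
    "<COLUMN>NAME\tADDRESS\tPHONE</COLUMN>",
    "<DATA>Fred\t1123 Main Street\t555-1212</DATA>",
    "</RESULTS>",
]
columns, rows = parse_compact(sample)
print(columns, rows)
```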

Here is what I need to be able to do AUTOMATICALLY:

1) Submit the URL. (I can get the data back by using urllib.urlopen(url).)

2) Parse the data to remove the first 2 lines and the last line.

3) Parse each line to remove the start and end tags.

4) Make a connection to an ODBC data source on the same PC.

5) Delete all the data in a table.

6) 'Pour' all the new data into the appropriate fields in the table.

7) Send me an email letting me know the status of the job: number of
records, etc.
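
The status email in the last step might be sketched as follows, using the
standard smtplib and email modules of current Python 3. The addresses, the
wording of the message, and the SMTP host "localhost" are all assumptions;
building the message is kept separate from sending it so the text can be
checked without a mail server:

```python
import smtplib
from email.message import EmailMessage


def build_status_mail(sender, recipient, record_count, error=None):
    """Build a short success or failure report for the nightly job."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    if error is None:
        msg["Subject"] = "Data load OK: %d records" % record_count
        msg.set_content("Loaded %d records into the table." % record_count)
    else:
        msg["Subject"] = "Data load FAILED"
        msg.set_content("The load failed with: %s" % error)
    return msg


def send_mail(msg, host="localhost"):
    """Hand the message to an SMTP server; the host is an assumption."""
    with smtplib.SMTP(host) as server:
        server.send_message(msg)


# Made-up addresses and count; nothing is sent here.
status = build_status_mail("loader@example.com", "me@example.com", 312)
print(status["Subject"])
```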


That's it!  I have been able to do some of these things at the PythonWin
prompt, but I'm not sure how to proceed with the details of making this
into a stand-alone .py file that handles all of them.
My thought is to schedule this, cron-style, to run at night and update my
data.

So...thoughts, comments, directions...

Don't want someone to DO it for me but I think I need some basic help in
putting the pieces together. If it's too much to ask of this list, I would
like to know that too.

Thanks!

Allen

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

