Linux+Python: need help, have comments...

Ron Stephens rdsteph at earthlink.net
Mon Jan 28 04:07:40 CET 2002


I switched to Linux two weeks ago. Works fine, I like it. One problem,
can anyone help?

I set up a crontab file using the crontab my_cron_file command. I
checked it with crontab -l and it is fine. I even set up a cron.allow
config file. my_cron_file is as follows:

30 4 * * * python mygale.py get all
30 5 * * * python urls.py
30 6 * * * python urldb.py
30 7 * * * python mygale-uploader.py

Now, the intention is to automatically, at 4:30 AM each day, run the
mygale web spider to collect any new Python articles from the web. At
5:30 and then at 6:30, run the associated urls.py and urldb.py scripts,
and then at 7:30 each day run the automatic FTP uploader script,
mygale-uploader.py. 
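
To spell out how those four entries are meant to be read, here is a small
sketch that splits a crontab line into its five time fields and the
command. The name parse_cron_line is made up for illustration, and it
only handles simple lines like the ones above, not the full crontab
syntax.

```python
def parse_cron_line(line):
    """Split a simple crontab entry into its time fields and command."""
    # The first five whitespace-separated fields are
    # minute, hour, day of month, month, day of week;
    # everything after them is the command to run.
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five time fields and a command: %r" % line)
    minute, hour, day_of_month, month, day_of_week, command = parts
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "command": command,
    }


if __name__ == "__main__":
    entry = parse_cron_line("30 4 * * * python mygale.py get all")
    print(entry["hour"], entry["minute"], entry["command"])
```

Note that the command part is handed to cron as typed, so "python" and
"mygale.py" are both names that cron has to be able to find by itself.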

When I run these scripts exactly as typed above, from the Bash shell,
everything works as planned, and that's how I have been updating the
online HTML page that shows the Pythonic articles on my web site. From
cron, though, it does not work, so something must be wrong with
my_cron_file or something else in the way I
have cron set up on my Linux box. If any Linux gurus can offer any
suggestions, I am all ears ;-))) It would be nice to have an
automatically updated HTML file each day on the web site showing any new
Pythonic articles. As it is, when I travel, as I am doing for the next
two weeks starting Monday, I cannot update the file manually.
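
One possible thing to check, and this is only a guess at the cause, not
a confirmed diagnosis: cron usually starts jobs with a much sparser
environment than a login shell (a short PATH, and the home directory as
the working directory), so a relative name like mygale.py may not be
found there. A minimal wrapper sketch under that assumption follows.
SCRIPT_DIR, run_job and the log file name are placeholders made up for
the example.

```python
import os
import subprocess
import sys

# Hypothetical location of the scripts; adjust to wherever they really live.
SCRIPT_DIR = os.path.expanduser("~/mygale")


def run_job(script, args=(), script_dir=SCRIPT_DIR, log_name="cron_job.log"):
    """Run one script from its own directory and append its output to a log."""
    # Use the absolute path of the interpreter running this wrapper,
    # so nothing depends on cron's PATH.
    command = [sys.executable, os.path.join(script_dir, script)] + list(args)
    result = subprocess.run(
        command,
        cwd=script_dir,            # relative file names in the script resolve here
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages with the normal output
        universal_newlines=True,
    )
    with open(os.path.join(script_dir, log_name), "a") as log:
        log.write("$ %s\n" % " ".join(command))
        log.write(result.stdout)
        log.write("exit status: %d\n" % result.returncode)
    return result.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Called as: wrapper.py mygale.py get all
    sys.exit(run_job(sys.argv[1], sys.argv[2:]))
```

The crontab entries would then call the wrapper by its full path, and
the log file would show whatever error the script hit when run by cron.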

The spider does produce useful and interesting lists of articles. There
seem to be a few new articles each day, on average, that it finds. One
problem has been Informit; Hans Nowak added this site to the search list
and Informit seems to add "old" articles to its site a lot, which
confuses mygale because they appear to be "new" articles and so they
show up and dominate the new listing. Also Informit requires
registration, which I dislike. So, I have removed Informit from the
search list for now. 

When I get back from my business trip, I plan to start running the
mygale web spider with the -n switch each day, so that it will only
produce a list each day of the "new" articles it finds on its web
searches. I will leave the old list up, which is an HTML page with
literally thousands of articles about Python sorted by date; but then
each day I will add a new, short list of new articles found by mygale.
This should help those with modems because they won't have to download
the huge HTML file containing all the old articles just in order to see
the new articles.
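
The "new only" idea can be sketched roughly as below. This is not taken
from mygale's source and says nothing about how its -n switch really
works; it just assumes each article is a (title, URL) pair and that the
URLs seen so far are kept in a plain text file, one per line. All the
function names are hypothetical.

```python
import os


def load_seen(path):
    """Return the set of URLs recorded on earlier runs (empty if no file yet)."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def filter_new(articles, seen):
    """Keep only the articles whose URL has not been seen before, in order."""
    fresh = []
    for title, url in articles:
        if url not in seen:
            fresh.append((title, url))
            seen.add(url)  # also guards against duplicates within one run
    return fresh


def save_seen(path, seen):
    """Write the updated set of URLs back out, one per line."""
    with open(path, "w") as f:
        for url in sorted(seen):
            f.write(url + "\n")
```

With this approach, a site that re-lists old articles under URLs not
seen before would still show up as new, so it would not cure the
Informit problem by itself.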

If I can figure out my crontab woes with the help of someone, I can set
it up to automatically do the job each day and we will have, I believe,
a real nice tool courtesy of Hans Nowak. 

I also have read and reviewed Steve Holden's new book, Python Web
Programming, which is a most excellent book and highly recommended. The
whole review is on the web site and also at Amazon now. 

Again, I really like Linux; I just wish I had more time to play with it.
Python and Idle run just fine and Mandrake 8.1 is about as easy to set
up and use as it could be. 

I do find it irritating that cut and paste doesn't
work from my text editor to my email program, but I guess it's not
perfect yet. The Bash shell sure gives you a lot of functionality and
power, and KDE is almost as slick as Windows, except for the
aforementioned clipboard woes ;-)))

Ron Stephens
http://www.awaretek.com/plf.html (Python City)


