[Tutor] Parsing Word Docs
Stephen Nelson-Smith
sanelson at gmail.com
Fri Mar 9 13:52:39 CET 2007
On 3/8/07, Tim Golden <mail at timgolden.me.uk> wrote:
> Simplest thing's probably antiword (http://www.winfield.demon.nl/)
> and then whatever text-scanning approach you want.
I've gone for:
#!/usr/bin/env python
import glob, os

url = "/home/cherp/prddoc"      # root directory to search
searchstring = "dxpolbl.p"      # text to look for

# Collect every .doc file below the root directory.
worddocs = []
for (dirpath, dirnames, filenames) in os.walk(url):
    for f in filenames:
        if f.endswith(".doc"):
            worddocs.append(os.path.join(dirpath, f))

# Scan each file's raw contents for the search string.
for d in worddocs:
    for i in glob.glob(d):
        if searchstring in open(i, "r").read():
            print "Found it in: ", i.split('/')[-1]
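For comparison, here is a rough sketch of the antiword route Tim suggested, where each file is converted to text before scanning instead of searching the raw bytes. It is written for Python 3.7 or later rather than the Python 2 of the script above. It assumes the antiword binary is installed and on the PATH, and that decoding its output as UTF-8 with replacement characters is good enough for a substring search. The names doc_to_text and find_docs are made up for illustration.

```python
import os
import subprocess


def doc_to_text(path):
    """Return the plain text antiword extracts from one .doc file."""
    # Assumption: "antiword <file>" writes the document text to stdout.
    result = subprocess.run(["antiword", path], capture_output=True, check=True)
    # Assumption: undecodable bytes can be replaced without affecting the search.
    return result.stdout.decode("utf-8", errors="replace")


def find_docs(root, needle, extract=doc_to_text):
    """Walk root and return paths of .doc files whose extracted text contains needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".doc"):
                path = os.path.join(dirpath, name)
                if needle in extract(path):
                    hits.append(path)
    return hits


if __name__ == "__main__":
    # Same directory and search string as the script above.
    for path in find_docs("/home/cherp/prddoc", "dxpolbl.p"):
        print("Found it in:", os.path.basename(path))
```

The extract argument is only there so the walk can be tried out with a stand-in function on a machine that does not have antiword.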
Now... I want to convert this to a cgi-script... how do I grab
$QUERY_STRING in python?
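For what it's worth, a CGI script receives the query string from the web server in the QUERY_STRING environment variable, so it can be read from os.environ. Below is a minimal sketch, assuming Python 3 and a made-up parameter name "q" (so a URL ending in ?q=dxpolbl.p). The function name get_search_term and the fallback default are illustrative only.

```python
import os
from urllib.parse import parse_qs


def get_search_term(environ=os.environ, default="dxpolbl.p"):
    """Return the value of the hypothetical "q" parameter, or default if absent."""
    # The server puts everything after "?" in the URL into QUERY_STRING.
    query = environ.get("QUERY_STRING", "")
    # parse_qs maps each parameter name to a list of decoded values.
    params = parse_qs(query)
    values = params.get("q")
    return values[0] if values else default


if __name__ == "__main__":
    # A CGI response starts with headers, then a blank line, then the body.
    print("Content-Type: text/plain")
    print()
    print("Searching for:", get_search_term())
```

Passing the environment in as an argument makes it possible to try the function from an ordinary shell without a web server.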
S.