[OT] Plone/Zope... Enhancing browser for better search?
mynewjunkaccount at hotmail.com
Tue Sep 28 05:15:09 CEST 2004
I am working on a project that is becoming quite unwieldy. I would
appreciate comments on:
1) The general approach to the "GOAL" I am trying to accomplish (see
below).
2) Any single item in the laundry list presented below, i.e. you need
not address every item. I welcome all comments.
3) Existing implementations/Technologies. In particular, I would
like feedback about how Plone/Zope might be a more appropriate
framework for this project.
GOAL: To write a toolbox of utilities in Python that extends
functionality of your browser (well, kinda...) in order to enhance
your search missions.
Motivation: I spend a heck of a lot of time searching for stuff. It
struck me that a whole lot of work goes into generating three rather
large data sets... namely:
1) Browser History
2) Browser Temp Internet Files
3) Browser Cookies
4) Favorites ...(although not large, worth mentioning in this context)
Once generated, one might be able to automate an analytical process
that writes new content pages to a location on your hard drive. Such a
page could be accessed on the fly by placing a shortcut to it in one
of your browser's "Favorites" folders (using wmi.py or winshell.py, I
forget which). You can then launch the newly constructed page, which
generates new entries in data sets 1), 2), 3), and 4) above, and the
process starts over, etc.
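The write-a-page-then-bookmark-it step can actually be sketched with
the standard library alone (the function name and page layout below are
my own invention, not from any package): instead of wmi.py/winshell.py,
a plain-text .url file dropped into the Favorites folder is enough for
Windows browsers to treat it as a bookmark.

```python
import os

def publish_results_page(html_body, page_path, favorites_dir):
    """Write a generated results page to disk and drop an Internet
    Shortcut (.url file) pointing at it into the Favorites folder."""
    with open(page_path, "w") as f:
        f.write("<html><body>%s</body></html>" % html_body)
    # A .url file is a plain-text INI-style file that Windows browsers
    # treat as a bookmark -- no COM automation needed for this part.
    shortcut = os.path.join(favorites_dir, "Latest search results.url")
    with open(shortcut, "w") as f:
        f.write("[InternetShortcut]\nURL=file:///%s\n"
                % os.path.abspath(page_path).replace(os.sep, "/"))
    return shortcut
```

On a real system favorites_dir would be something like
os.path.join(os.environ["USERPROFILE"], "Favorites").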
Here are some of my dabblings so far...
APIs: PyAmazon, PyGoogle, PyBlogger, ...(PyTeoma, PyCiteseer,
PyWikipedia, etc.?) Used to automate search queries, taking inputs
(e.g. keywords, URLs, etc.) from the data sets 1), 2), 3), and 4)...I
haven't considered "Terms of Service" issues yet.
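Before any of those APIs can be queried, keyword inputs have to be
pulled out of the data sets somehow. A crude harvester (entirely my own
sketch, not part of PyGoogle or any of the packages above) might rank
words from history-entry titles by frequency:

```python
def harvest_keywords(history_titles):
    """Derive candidate query terms from browser-history titles.
    Crude illustration: split titles into words, drop short ones,
    and rank the rest by how often they occur."""
    seen = {}
    for title in history_titles:
        for word in title.lower().split():
            word = word.strip(".,;:!?()\"'")
            if len(word) > 3:
                seen[word] = seen.get(word, 0) + 1
    # Most frequent terms first -- these would seed the API queries.
    return sorted(seen, key=seen.get, reverse=True)
```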
Storage/Databases: Python Shelves -
Recursive searches will generate a lot of data. In this case I was
considering constructing a number of "Bookcases" from Python Shelves.
Each "Bookcase" would be dedicated to serving data of a particular
type OR for a particular purpose. Each Python Shelve would be
specialized for a particular task required by the "Bookcase". What I
(think) I am after here is a persistent "compound dictionary":
http://tinyurl.com/4o268 ...other suggestions are welcome. It would
be nice to be able to "Shelve" an entire search session and publish a
distilled version online. Again, I don't know much regarding "Terms
of Service" issues at this point.
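One way to get that persistent "compound dictionary" out of the shelve
module is to keep top-level sections inside a single shelf (the
"Bookcase" class name and its interface here are my own sketch; the
stdlib alternative is shelve.open(path, writeback=True), at a memory
cost):

```python
import shelve

class Bookcase:
    """A 'compound dictionary' persisted with the shelve module:
    one shelf file, with top-level sections keyed by purpose."""
    def __init__(self, path):
        self.db = shelve.open(path)

    def put(self, section, key, value):
        # Shelve only notices assignment to top-level keys, so read
        # the whole section dict, mutate it, and write it back.
        d = self.db.get(section, {})
        d[key] = value
        self.db[section] = d

    def get(self, section, key, default=None):
        return self.db.get(section, {}).get(key, default)

    def close(self):
        self.db.close()
```

A whole search session could then be "shelved" as one section and
reloaded later for the distilled online version.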
Text Manipulations: Parsers/Query Tools -
...were constructed from htmllib and BeautifulSoup. I am also
tinkering with the idea of using xmlbuilder.py to reformat text (XML)
before it is entered in the "Bookcases"; I thought this might be a
format that lends itself to efficient parsing, comparing, and
constructing new data structures. The new data structures may be
preferred for certain types of analysis (e.g. arrays).
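The parsing pass over cached pages can be illustrated with the standard
library (htmllib has since been replaced by html.parser in modern
Pythons; the LinkExtractor class below is my own sketch, not from
BeautifulSoup):

```python
from html.parser import HTMLParser  # htmllib's modern stdlib successor

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags -- the kind of pass a
    'Bookcase' parser would run over cached history/temp files."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

p = LinkExtractor()
p.feed('<p>See <a href="http://example.com/a">A</a> and '
       '<a href="http://example.com/b">B</a>.</p>')
# p.links now holds the hrefs found in the fragment
```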
Analytical Procedures: Statistical Engines -
Once data sets 1) - 4) have been coerced into the appropriate
form, they will be submitted for analysis... probably to R using
RPython. (I won't go into details... this post is too long as is...
for the purpose of illustration, assume that these engines perform
analysis on data sets 1)-4), which produces "useful" results.)
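As a toy stand-in for the R-based engine (my own illustration, nothing
to do with RPython's actual interface), even a per-host visit count over
data set 1) already yields something "useful" -- the most-revisited
sites can seed the next round of queries:

```python
from urllib.parse import urlparse

def url_visit_stats(history):
    """Toy stand-in for a statistical engine: count visits per host
    so the most-revisited sites can seed new search queries."""
    counts = {}
    for url in history:
        host = urlparse(url).netloc
        counts[host] = counts.get(host, 0) + 1
    # Most visited host first
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```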
In summary, it would do something like this while you were browsing:
DataSets 1,2,3 & 4 --> Fed to Parsers --> Persisted in "Bookcases"
--> Requested by Statistical Engines --> New Data Sets from
Analysis --> Fed to APIs --> Search results from APIs are
presented in a local web page and bookmarked in "Favorites" --> User
accesses newly generated local web page via special bookmark --> User
generates new data as a byproduct of searching and browsing.....
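The whole loop above can be stubbed out in a few lines just to show the
data flow (every stage here is a placeholder of my own invention --
parsing, persistence, analysis, and the API calls are all faked):

```python
def run_search_cycle(data_sets):
    """One trip around the loop: parse -> persist -> analyse ->
    query -> build a local results page. All stages are stubs."""
    # "Parsers": normalise every entry from data sets 1)-4)
    parsed = [entry.lower() for ds in data_sets for entry in ds]
    # "Bookcases": persistence stubbed as a plain dict
    bookcase = {"session": parsed}
    # "Statistical engine": pick a handful of distinct terms
    keywords = sorted(set(bookcase["session"]))[:5]
    # "APIs": real search calls replaced by canned strings
    results = ["result for %s" % k for k in keywords]
    # Local results page, ready to be written to disk and bookmarked
    return "<ul>%s</ul>" % "".join("<li>%s</li>" % r for r in results)
```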
Thanks in advance for your comments,