Python good for data mining?
Cameron Walsh
cameron.walsh at gmail.com
Sat Nov 3 22:49:22 EDT 2007
Jens wrote:
> I'm starting a project in data mining, and I'm considering Python and
> Java as possible platforms.
>
> I'm concerned by performance. Most benchmarks report that Java is
> about 10-15 times faster than Python, and my own experiments confirm
> this. I could imagine this to become a problem for very large
> datasets.
If most of the processing is done with SQL calls, this shouldn't be an
issue, since the database engine does the heavy lifting rather than the
Python interpreter. I've known a couple of people at Sydney University
who were using Python for data mining. I think they were using sqlite3
and MySQL.
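To illustrate the point about SQL doing the work, here is a minimal
sketch using the standard sqlite3 module. The "purchases" table, its
columns, and the helper names are made up for the example; the idea is
only that the counting and sorting happen inside the database, and
Python just receives the finished result.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up purchases table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT)")
    rows = [
        ("ann", "milk"), ("ann", "bread"), ("bob", "milk"),
        ("cat", "milk"), ("cat", "bread"), ("dan", "eggs"),
    ]
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    conn.commit()
    return conn


def top_items(conn, limit=3):
    """Return (item, count) pairs; grouping and sorting are done by SQLite."""
    cur = conn.execute(
        "SELECT item, COUNT(*) AS n FROM purchases "
        "GROUP BY item ORDER BY n DESC, item ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(top_items(build_demo_db()))
```

The same pattern should carry over to MySQL through any DB-API driver,
although the placeholder style differs between drivers.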
>
> How good is the integration with MySQL in Python?
Never tried it, but a quick Google search turns up a number of
approaches you could try, including the MySQLdb module (distributed as
"MySQL for Python").
>
> What about user interfaces? How easy is it to use Tkinter for
> developing a user interface without an IDE? And with an IDE? (which
> IDE?)
wxPython was recommended to me when I was learning how to create a GUI.
It has more features than Tkinter and a more native look and feel across
platforms. With wxPython it was fairly easy to create a multi-pane,
tabbed interface for a couple of programs, without using an IDE. The
demos/tutorials were fantastic.
>
> What if I were to use my Python libraries with a web site written in
> PHP, Perl or Java - how do I integrate with Python?
Possibly the simplest way would be Python CGI scripts. The cgi module
makes form data fairly easy to read, and cgitb shows tracebacks in the
browser while you debug. Cookies are also fairly simple. For a more
complicated but more customisable approach, you could look into the
BaseHTTPServer module or a socket listener of some sort, running
alongside the web server either publicly or privately. Publicly, you'd
link from the rest of your PHP/whatever pages to the Python server.
Privately, the PHP/Perl/Java backend would request data from the local
Python server before feeding the results back through the main server
(Apache?) to the client.
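A rough sketch of the "private" arrangement, under some assumptions: it
uses http.server, which is the Python 3 name for BaseHTTPServer, and the
score() function, the JSON reply format, and the URL layout are
placeholders I invented for illustration. The fetch() helper stands in
for whatever HTTP client the PHP/Perl/Java backend would use.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def score(text):
    """Hypothetical stand-in for a call into your own data-mining library."""
    return {"text": text, "length": len(text)}


class MiningHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Treat the request path (minus the leading slash) as the input.
        body = json.dumps(score(self.path.lstrip("/"))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def start_server():
    # Bind to localhost only (the "private" case); port 0 picks a free port.
    server = HTTPServer(("127.0.0.1", 0), MiningHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, text):
    """What the web backend would do: a plain HTTP GET to the local server."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/" + text)
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_server()
    print(fetch(server.server_address[1], "hello"))
    server.shutdown()
```

For the "public" case you would bind to an externally reachable address
instead of 127.0.0.1, at which point input validation matters a lot
more than it does here.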