Faster md5/1-way encryption?

Alexander Skwar lists.ASkwar at DigitalProjects.com
Wed Apr 24 17:57:52 EDT 2002


Hi!

I'm developing an application in Python 2.2 which reads a file and
then calculates the md5 for each line.  The app is going to use
wxPython as the GUI frontend and should eventually run on Linux and
Windows.

The files which should be processed are rather large (100,000+ lines)
and on my machine this takes ages to run.  I'm currently doing it like
this:


import os
import xreadlines
import threading
import md5

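class md5Thread(threading.Thread):
    # German identifiers: Dateiname = file name, Zeile = line,
    # Zeilennummer = line number, Zeilenanzahl = line count, Inhalt = content.
    def __init__(self, Dateiname):
        threading.Thread.__init__(self)
        self.done = 0              # set to 1 once run() has finished
        self.Zeilennummer = None   # despite the name: current byte position
        self.Zeilenanzahl = None   # despite the name: file size in bytes
        self.Dateiname = Dateiname
        self.Inhalt = None         # all digests, one per line, set at the end
        self.max = 1000

    def run(self):
        dateiname = self.Dateiname
        self.Zeilenanzahl = os.path.getsize(dateiname)
        linesep = os.linesep
        datei = file(dateiname, 'r', 1)

        self.Zeilennummer = 0
        self_Inhalt = ''           # local accumulator for the digests
        i = 0
        for zeile in xreadlines.xreadlines(datei):
            # a new md5 object is created for every line
            md5_zeile = md5.new(zeile).hexdigest()
            self_Inhalt += md5_zeile + linesep
            self.Zeilennummer = datei.tell()
            self.ZeilennummerZeile = i   # index of the line just processed
            i += 1

        self.Inhalt = self_Inhalt
        self.done = 1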

As you can see, I've put the md5 generation stuff into a separate
thread.  This thread is started when a button is pressed, like this:

        mt = md5Thread(filename)
        mt.start()

Further, I use a wxTimer to poll the current byte position of the thread
and display it in a wxGauge and a wxTextCtrl.
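
The polling arrangement can be sketched without any GUI code.  This is
only a minimal illustration under assumptions: it targets Python 3 with
hashlib rather than Python 2.2, LineHashThread and wait_with_progress
are hypothetical names, and a plain loop stands in for the wxTimer
callback.  The worker tracks progress by summing len(line) instead of
calling tell() while iterating.

```python
import hashlib
import threading
import time


class LineHashThread(threading.Thread):
    """Hash each line of an iterable of bytes objects in the background.

    Hypothetical helper for illustration; not the original md5Thread.
    """

    def __init__(self, lines):
        threading.Thread.__init__(self)
        self.lines = lines
        self.bytes_done = 0   # progress value a GUI timer could poll
        self.digests = []     # one hex digest per processed line
        self.done = False

    def run(self):
        for line in self.lines:
            self.digests.append(hashlib.md5(line).hexdigest())
            self.bytes_done += len(line)
        self.done = True


def wait_with_progress(worker, total_bytes, interval=0.1):
    """Stand-in for the timer callback: poll until the worker finishes."""
    worker.start()
    while worker.is_alive():
        print("%d of %d bytes" % (worker.bytes_done, total_bytes))
        time.sleep(interval)
    worker.join()
    return worker.digests
```

With a real file, passing open(path, 'rb') as lines and
os.path.getsize(path) as total_bytes would roughly match the setup
described above.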

Since this is my first Python app, there are surely some things that
can be optimized.

I suppose what's taking so long is
a) the md5 generation itself, or
b) that I instantiate a new md5 object for every single line and thus
will have 100,000+ objects.
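
To check whether either suspicion holds, the hashing can be separated
from the bookkeeping.  The following is a minimal sketch, not the
original code: it assumes Python 3 with the hashlib module instead of
the md5 module used above, and md5_lines is a hypothetical helper name.
It collects the digests in a list and joins them once at the end rather
than growing a string with += on every line, which is one possible
source of slowness besides the hashing itself.

```python
import hashlib
import os


def md5_lines(lines):
    """Return the hex MD5 digest of every line, each followed by os.linesep.

    `lines` is any iterable of bytes objects, for example a file opened
    in binary mode. Hypothetical helper for illustration only.
    """
    digests = []
    for line in lines:
        # One short-lived hash object per line, as in the original loop.
        digests.append(hashlib.md5(line).hexdigest())
    # Build the result string once instead of concatenating per line.
    return "".join(d + os.linesep for d in digests)
```

Timing this against the original loop on the same file would show how
much of the run time is the md5 calls and how much is everything
around them.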

Could somebody please tell me how I could speed the whole thing up?  I
don't have to calculate the md5 sum of every line, but I do need to use
some sort of 1-way encryption.  So, if there are faster alternatives
to md5, I'd be happy with those as well.
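
Whether another one-way function is actually faster is easiest to
settle by measuring on the target machine.  A minimal sketch, assuming
Python 3 and hashlib: time_hashes is a hypothetical helper, and the
algorithm names are just examples of what hashlib provides.  The sketch
makes no claim about which one wins.

```python
import hashlib
import time


def time_hashes(lines, names=("md5", "sha1", "sha256", "blake2b")):
    """Return {algorithm name: seconds needed to hash every line once}.

    `lines` should be a list of bytes objects so that each algorithm
    sees the same input. Hypothetical helper for illustration only.
    """
    results = {}
    for name in names:
        start = time.perf_counter()
        for line in lines:
            hashlib.new(name, line).hexdigest()
        results[name] = time.perf_counter() - start
    return results
```

Calling time_hashes(open(path, 'rb').readlines()) on a representative
file gives a per-algorithm timing to compare.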

Thanks,

Alexander Skwar
-- 
How to quote:	http://learn.to/quote (german) http://quote.6x.to (english)
Homepage:	http://www.iso-top.de      |    Jabber: askwar at a-message.de
   iso-top.de - The inexpensive way to get Linux distributions
                       Uptime: 16 hours 47 minutes




