reversing a dictionary (newbie)

Christopher Brewster C.Brewster at dcs.shef.ac.uk
Thu Mar 15 05:02:40 EST 2001


I have written a program to extract pairs of words from a corpus (the BNC).
It is my first Python program.
Once it has processed the files, I "reverse" the dictionary so that the keys
are frequencies and the values are lists of the word pairs with that frequency.

Processing the files is quite quick (about 20 minutes for the 700-odd files,
mean size about 100k, in directory 'A'), but reversing the dictionary took
hours and hours. I have run the program successfully on a PC (500 MHz,
256 MB RAM, running Windows NT), and I have been waiting since yesterday for
results from a Sun (1 GB RAM). Here is the code - what have I done wrong?

def revdict(diction):
	revdict = {}
	for wordpair in diction.keys():
		#print wordpair
		list = []
		freq = diction[wordpair]
		if revdict.has_key(freq):
			list.extend(revdict[freq])
			list.append(wordpair)
			revdict[freq] = list
		else:
			list.append(wordpair)
			revdict[freq] = list
	return revdict

The total number of 'wordpair's (i.e. keys in the original dictionary) was
567,000. I need to be able to handle ten times that many.
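
For comparison, here is a minimal sketch of the same reversal that appends to
each list in place. In revdict above, every word pair whose frequency is
already a key builds a fresh list and copies all earlier entries into it with
extend, so a frequency shared by n word pairs costs on the order of n*n element
copies in total; if many of the word pairs share one frequency, that copying
would likely dominate the run time. The sketch assumes a current Python 3
interpreter, and the names reverse_dict, counts and by_freq, along with the toy
data, are illustrative rather than part of the program.

```python
def reverse_dict(counts):
    """Map each frequency to the list of word pairs that occur that often."""
    by_freq = {}
    for wordpair, freq in counts.items():
        # setdefault returns the list already stored for freq, or stores and
        # returns a new empty one; append then changes that list in place,
        # so no list is ever copied.
        by_freq.setdefault(freq, []).append(wordpair)
    return by_freq


# Toy data: these word pairs and counts are made up for illustration.
counts = {("of", "the"): 3, ("in", "a"): 1, ("red", "fox"): 1}
by_freq = reverse_dict(counts)
print(by_freq[3])          # [('of', 'the')]
print(sorted(by_freq[1]))  # [('in', 'a'), ('red', 'fox')]
```

Each word pair is handled once with a single dictionary lookup and a single
append, so the work should grow roughly in proportion to the number of keys
rather than with the square of the largest group.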

Thank you,

Christopher Brewster

Department of Computer Science, University of Sheffield
Tel: +44(0)114-22.21872  Fax: +44 (0)114-22.21810
Regent Court, 211 Portobello Street
Sheffield   S1 4DP   UNITED KINGDOM





