[Tutor] alphabetizing a file by lines

Brian van den Broek bvande at po-box.mcgill.ca
Sat Jul 17 23:28:22 CEST 2004


orbitz said unto the world upon 17/07/2004 16:33:
> lines = open(file).readlines()
> lines.sort()
> print lines
> 
> 
> Dragonfirebane at aol.com wrote:
> 
>> Hello all,
>>  
>> I'm trying to write a program that alphabetizes a file by the first 
>> letter on each line. I'm having some problems because as soon as the 
>> program finds a letter, it looks for the next letter, ignoring 
>> subsequent appearances of that letter. I can think of a couple ways to 
>> fix this but none of them seem to work. The first of these would be to 
>> add a special character to lines that have already been alphabetized, 
>> but file.write() writes to the end of a file and i would need to write 
>> the character at the current position in the file. This might be 
>> circumvented by creating a file for each line that is alphabetized, 
>> but that seems a bit extreme . . . The code is below. Any suggestions 
>> would be appreciated. Future concerns will be alphabetizing past the 
>> first letter.
>>  
>> ##############
>> def linum():
>>     global i
>>     linu = open("%s%s" % (fn, ext), "r")
>>     i = 0
>>     for line in linu.readlines():
>>         i += 1
>>     linu.close()
>> def alph():
>>     alp = open("alphebatized%s%s" % (fn, ext), "w")
>>     pal = open("prealp%s%s" % (fn, ext), "w")        ## eventually for 
>> writing "\xfe" after a line that has been alphabetized
>>     read = open("%s%s" % (fn, ext), "r")                ## same reason 
>> for this line until read.close()
>>     for line in read.read():
>>         pal.write(line)
>>     pal.close()
>>     read.close()
>>     print i
>>     for do in range(i):
>>         falp(alp)
>>     alp.close()
>> def falp(alp):
>>     global a
>>     read = open("prealp%s%s" % (fn, ext), "r")
>>     for line in read.readlines():
>>         try:
>>             alpn = re.compile("%s(?!\xfe)" % alpha[a], re.IGNORECASE)
>>             falpn = alpn.match(line)
>>             if falpn:
>>                 print line
>>                 alp.write(line)
>>                 a += 1
>>                 break
>>         except IndexError:
>>             pass
>> import re
>> import string
>> i = 0
>> a = 0
>> alpha = ' '.join(string.ascii_letters[:26]).split()
>> fn = raw_input("Please enter name of file you wish to prioritize: ")
>> ext = raw_input("Please enter extension of file you wish to 
>> prioritize: ")
>> linum()
>> alph()
>> ################
>>  
>> Here is random.txt, the file being alphabetized:
>>  
>> K
>> cx
>> c
>> cd
>> e
>> X
>> y
>> v
>> l
>> f
>> w
>> Q
>> z
>> h
>> r
>> i
>> T
>> s
>> p
>> d
>> m
>> n
>> a
>> o
>> j
>> u
>> b
>> G
>>  
>> Thanks in advance,
>> Orri
>>  
>> Email: dragonfirebane at aol.com
>> AIM: singingxduck
>> Programming Python for the fun of it.

Hi Orri and all,

The advice at the top is good. The only thing I would add is that it 
will sort your lines by their entire contents, and that the sort will be 
case-sensitive.
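
To make that concrete, here is a small sketch of what a plain sort does. 
It is written for a current Python 3 interpreter (where print is a 
function) rather than the Python of this thread, and the sample lines are 
made up for illustration:

```python
# Hypothetical sample lines, each ending in a newline as readlines() would return them.
lines = ["cx\n", "K\n", "c\n", "cd\n", "a\n"]

# A plain sort compares whole strings character by character, and every
# uppercase ASCII letter orders before every lowercase one.
plain = sorted(lines)
print(plain)
```

The uppercase "K" line ends up ahead of the "a" line, and the three "c" 
lines are ordered by what follows the first letter.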

It seemed from your problem description that you wanted only the first 
letter taken into account, regardless of case, so that the lines:

abc
ABC
azz
acd

would not be rearranged.

Both the case-insensitivity and the consideration of just the first letter 
can easily be obtained by writing a custom comparison function and passing 
it to the sort method.

The case insensitivity part can be accomplished like so:

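def alphabetical_sort(first, second):
    # cmp() returns a negative, zero, or positive number, which is what
    # sort() expects from a comparison function.
    return cmp(first.lower(), second.lower())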

(You could easily extend this to look at just the first n characters of 
the two strings being compared.)
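
One possible sketch of that extension follows. It assumes a current 
Python 3 interpreter, where cmp() and comparison-function arguments to 
sort() no longer exist and a key function is used instead; the name 
leading_key and its parameter n are mine, purely for illustration:

```python
def leading_key(line, n=1):
    """Sort key: the first n characters of the line, lower-cased."""
    return line[:n].lower()

# Hypothetical sample: the four "a" lines from above plus a few others.
lines = ["abc\n", "ABC\n", "azz\n", "acd\n", "K\n", "cx\n", "c\n"]

# list.sort() is stable, so lines whose keys compare equal keep their
# original relative order.
lines.sort(key=leading_key)
print(lines)
```

Because the sort is stable, the four lines starting with "a" stay in the 
order they were read, which is the "would not be rearranged" behaviour 
described earlier. To compare on more than one character, something like 
key=lambda line: leading_key(line, 2) should do.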

You'd use it like so:

lines = open(file).readlines()
lines.sort(alphabetical_sort)
print lines
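
For anyone reading this with a Python 3 interpreter, a rough equivalent of 
the three lines above might look like the sketch below. It creates a 
throwaway file (using a few lines borrowed from random.txt) so that it can 
be run as is; the temporary file and the variable names are assumptions of 
mine, not part of the original program:

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("K\ncx\nc\ncd\ne\nX\n")
    path = tmp.name

# Read every line, keeping the trailing newlines.
with open(path) as handle:
    lines = handle.readlines()

# Case-insensitive sort on the whole line, like alphabetical_sort above.
lines.sort(key=str.lower)

result = "".join(lines)
print(result, end="")

os.remove(path)  # clean up the throwaway file
```

Joining the lines before printing shows them one per line rather than as 
a list.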

Best,

Brian vdB

