[Tutor] alphabetizing a file by lines
Brian van den Broek
bvande at po-box.mcgill.ca
Sat Jul 17 23:28:22 CEST 2004
orbitz said unto the world upon 17/07/2004 16:33:
> lines = open(file).readlines()
> lines.sort()
> print lines
>
>
> Dragonfirebane at aol.com wrote:
>
>> Hello all,
>>
>> I'm trying to write a program that alphabetizes a file by the first
>> letter on each line. I'm having some problems because as soon as the
>> program finds a letter, it looks for the next letter, ignoring
>> subsequent appearances of that letter. I can think of a couple ways to
>> fix this but none of them seem to work. The first of these would be to
>> add a special character to lines that have already been alphabetized,
>> but file.write() writes to the end of a file and I would need to write
>> the character at the current position in the file. This might be
>> circumvented by creating a file for each line that is alphabetized,
>> but that seems a bit extreme . . . The code is below. Any suggestions
>> would be appreciated. Future concerns will be alphabetizing past the
>> first letter.
>>
>> ##############
>> def linum():
>>     global i
>>     linu = open("%s%s" % (fn, ext), "r")
>>     i = 0
>>     for line in linu.readlines():
>>         i += 1
>>     linu.close()
>> def alph():
>>     alp = open("alphebatized%s%s" % (fn, ext), "w")
>>     pal = open("prealp%s%s" % (fn, ext), "w")  ## eventually for
>>     ## writing "\xfe" after a line that has been alphabetized
>>     read = open("%s%s" % (fn, ext), "r")  ## same reason
>>     ## for this line until read.close()
>>     for line in read.read():
>>         pal.write(line)
>>     pal.close()
>>     read.close()
>>     print i
>>     for do in range(i):
>>         falp(alp)
>>     alp.close()
>> def falp(alp):
>>     global a
>>     read = open("prealp%s%s" % (fn, ext), "r")
>>     for line in read.readlines():
>>         try:
>>             alpn = re.compile("%s(?!\xfe)" % alpha[a], re.IGNORECASE)
>>             falpn = alpn.match(line)
>>             if falpn:
>>                 print line
>>                 alp.write(line)
>>                 a += 1
>>                 break
>>         except IndexError:
>>             pass
>> import re
>> import string
>> i = 0
>> a = 0
>> alpha = ' '.join(string.ascii_letters[:26]).split()
>> fn = raw_input("Please enter name of file you wish to prioritize: ")
>> ext = raw_input("Please enter extension of file you wish to prioritize: ")
>> linum()
>> alph()
>> ################
>>
>> Here is random.txt, the file being alphabetized:
>>
>> K
>> cx
>> c
>> cd
>> e
>> X
>> y
>> v
>> l
>> f
>> w
>> Q
>> z
>> h
>> r
>> i
>> T
>> s
>> p
>> d
>> m
>> n
>> a
>> o
>> j
>> u
>> b
>> G
>>
>> Thanks in advance,
>> Orri
>>
>> Email: dragonfirebane at aol.com
>> AIM: singingxduck
>> Programming Python for the fun of it.
Hi Orri and all,
The advice at the top is good. The only thing I would add is that it
will sort your lines by their entire contents and will perform a
case-sensitive sort, so every uppercase letter sorts ahead of every
lowercase one.
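To make the case-sensitivity point concrete, here is a small sketch. The sample list is made up for illustration (it is not the contents of random.txt); the point is only that a plain sort compares character codes.

```python
# Hypothetical sample lines, chosen only to show the effect.
sample = ["cx\n", "K\n", "a\n", "c\n"]

# A plain sort compares character codes, so the uppercase "K"
# ends up ahead of all of the lowercase lines.
plain = list(sample)
plain.sort()
```

After this runs, "K" is the first item in plain even though it would come after "a", "c" and "cx" in a dictionary.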
It seemed from your problem description that you wanted only the first
letter taken into account regardless of case so that the lines:
abc
ABC
azz
acd
would not be rearranged.
Both the case-insensitivity and the consideration of just the first letter
can easily be obtained by writing a custom comparison function and passing
it to the sort method.
The case insensitivity part can be accomplished like so:
def alphabetical_sort(first, second):
    return cmp(first.lower(), second.lower())
(You could easily extend this to look at just the leading n characters of
the two strings being compared.)
You'd use it like so:
lines = open(file).readlines()
lines.sort(alphabetical_sort)
print lines
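As a rough sketch of the "leading n characters" extension, here is one way it might look. It assumes a Python whose sort accepts a key argument (2.4 and later; in Python 3, cmp and comparison-function arguments are gone, so key is the only route). The name leading_key is made up for this example.

```python
def leading_key(n):
    """Build a key function that looks only at the first n characters, ignoring case."""
    def key(line):
        return line[:n].lower()
    return key

# The four lines from the example earlier in this message.
lines = ["abc\n", "ABC\n", "azz\n", "acd\n"]

# First letter only: every key is "a", and list.sort is stable,
# so the lines keep their original order.
first_letter_only = list(lines)
first_letter_only.sort(key=leading_key(1))

# First two letters: "ab", "ab", "az", "ac", so only azz and acd swap.
first_two = list(lines)
first_two.sort(key=leading_key(2))
```

With n set to 1 the list comes back unchanged, which matches the "would not be rearranged" behaviour described above; larger values of n cover the later goal of alphabetizing past the first letter.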
Best,
Brian vdB