My first ever Python program, comments welcome
Peter Otten
__peter__ at web.de
Sun Jul 22 03:56:50 EDT 2012
Lipska the Kat wrote:
> Greetings Pythoners
>
> A short while back I posted a message that described a task I had set
> myself. I wanted to implement the following bash shell script in Python
>
> Here's the script
>
> sort -nr $1 | head -${2:-10}
>
> this script takes a filename and an optional number of lines to display
> and sorts the lines in numerical order, printing them to standard out.
> if no optional number of lines are input the script prints 10 lines
>
> Here's the file.
>
> 50 Parrots
> 12 Storage Jars
> 6 Lemon Currys
> 2 Pythons
> 14 Spam Fritters
> 23 Flying Circuses
> 1 Meaning Of Life
> 123 Holy Grails
> 76 Secret Policemans Balls
> 8 Something Completely Differents
> 12 Lives of Brian
> 49 Spatulas
>
>
> ... and here's my very first attempt at a Python program
> I'd be interested to know what you think, you can't hurt my feelings
> just be brutal (but fair). There is very little error checking as you
> can see and I'm sure you can crash the program easily.
> 'Better' implementations most welcome
> #! /usr/bin/env python3.2
>
> import fileinput
> from sys import argv
> from operator import itemgetter
>
> l=[]
> t = tuple
> filename=argv[1]
> lineCount=10
>
> with fileinput.input(files=(filename)) as f:

Note that (filename) is not a tuple, just a string surrounded by superfluous
parens.

>>> filename = "foo.bar"
>>> (filename)
'foo.bar'
>>> (filename,)
('foo.bar',)
>>> filename,
('foo.bar',)

You are lucky that FileInput() tests if its files argument is just a single
string.

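
To illustrate why the trailing comma matters in general, here is a small
sketch of what happens when code that expects a sequence of filenames is
handed a bare string instead. The helper count_items() is made up for this
example and is not part of fileinput:

```python
def count_items(files):
    """Hypothetical helper: counts the entries of what it assumes is a sequence of filenames."""
    return len(list(files))

filename = "foo.bar"
as_string = count_items((filename))   # parentheses alone: still a str, iterated per character
as_tuple = count_items((filename,))   # trailing comma: a one-element tuple
print(as_string, as_tuple)            # 7 1
```

A function without a special case for strings would treat each character as
a separate "file".
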
>     for line in f:
>         t=(line.split('\t'))
>         t[0]=int(t[0])
>         l.append(t)
> l=sorted(l, key=itemgetter(0))
>
> try:
>     inCount = int(argv[2])
>     lineCount = inCount
> except IndexError:
>     #just catch the error and continue
>     None
>
> for c in range(lineCount):
>     t=l[c]
>     print(t[0], t[1], sep='\t', end='')
>
I prefer a more structured approach even for such a tiny program:
- process all commandline args
- read data
- sort
- clip extra lines
- write data
I'd break it into these functions:

def get_commandline_args():
    """Recommended library: argparse.
    Its FileType can deal with stdin/stdout.
    """

def get_quantity(line):
    return int(line.split("\t", 1)[0])

def sorted_by_quantity(lines):
    """Leaves the lines intact, so you don't
    have to reassemble them later on."""
    return sorted(lines, key=get_quantity)

def head(lines, count):
    """Have a look at itertools.islice() for a more
    general approach"""
    return lines[:count]

if __name__ == "__main__":
    # protecting the script body allows you to import
    # the script as a library into other programs
    # and reuse its functions and classes.
    # Also: play nice with pydoc. Try
    # $ python -m pydoc -w ./yourscript.py
    args = get_commandline_args()
    with args.infile as f:
        lines = sorted_by_quantity(f)
    with args.outfile as f:
        f.writelines(head(lines, args.line_count))

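
The body of get_commandline_args() is left open above. One possible way to
fill it in with argparse is sketched below. The attribute names infile,
outfile and line_count come from the script body; the positional layout, the
-o/--outfile option and the argv parameter (added so the function can be
called from tests) are assumptions of this sketch, not part of the original
outline:

```python
import argparse
import sys

def get_commandline_args(argv=None):
    """Parse the command line; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Sort lines by their leading number and print the first few.")
    # FileType opens the named file; "-" stands for stdin (or stdout in "w" mode).
    parser.add_argument(
        "infile", nargs="?", type=argparse.FileType("r"), default=sys.stdin,
        help="input file (default: stdin)")
    parser.add_argument(
        "line_count", nargs="?", type=int, default=10,
        help="number of lines to print (default: 10)")
    # Assumed option: the original shell script always writes to stdout.
    parser.add_argument(
        "-o", "--outfile", type=argparse.FileType("w"), default=sys.stdout,
        help="output file (default: stdout)")
    return parser.parse_args(argv)
```

With this in place, "yourscript.py data.txt 5" would show five lines and
"yourscript.py data.txt" would fall back to ten, like ${2:-10} in the shell
version.
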
Note that if you want to handle large files gracefully you need to recombine
sorted_by_quantity() and head() (have a look at heapq.nsmallest() which was
already mentioned in the other thread).
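
A minimal sketch of that recombination, assuming tab-separated lines as in
the sample file. The name head_sorted_by_quantity() is invented here;
get_quantity() is repeated so the snippet runs on its own:

```python
import heapq

def get_quantity(line):
    return int(line.split("\t", 1)[0])

def head_sorted_by_quantity(lines, count):
    """Hypothetical combined helper: the `count` lines with the
    smallest quantities, in ascending order."""
    # nsmallest() consumes any iterable (e.g. an open file) and only
    # keeps `count` candidates around instead of the whole input.
    return heapq.nsmallest(count, lines, key=get_quantity)

sample = ["50\tParrots\n", "12\tStorage Jars\n", "6\tLemon Currys\n",
          "2\tPythons\n", "123\tHoly Grails\n"]
result = head_sorted_by_quantity(sample, 3)
print("".join(result), end="")
```

For the sample lines this prints the Pythons, Lemon Currys and Storage Jars
lines, in that order. heapq.nlargest() takes the same arguments and would
give the descending order of "sort -nr".
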