Perl and Python, a practical side-by-side example.
John Machin
sjmachin at lexicon.net
Fri Mar 2 19:33:57 EST 2007
On Mar 3, 9:44 am, "Shawn Milo" <S... at Milochik.com> wrote:
> I'm new to Python and fairly experienced in Perl, although that
> experience is limited to the things I use daily.
>
> I wrote the same script in both Perl and Python, and the output is
> identical. The run speed is similar (very fast) and the line count is
> similar.
>
> Now that they're both working, I was looking at the code and wondering
> what Perl-specific and Python-specific improvements to the code would
> look like, as judged by others more knowledgeable in the individual
> languages.
>
> I am not looking for the smallest number of lines, or anything else
> that would make the code more difficult to read in six months. Just
> any instances where I'm doing something inefficiently or in a "bad"
> way.
>
> I'm attaching both the Perl and Python versions, and I'm open to
> comments on either. The script reads a file from standard input and
> finds the best record for each unique ID (piid). The best is defined
> as follows: The newest expiration date (field 5) for the record with
> the state (field 1) which matches the desired state (field 6). If
> there is no record matching the desired state, then just take the
> newest expiration date.
>
[big snip]
Here is my rewrite in what I regard as idiomatic reasonably-modern
Python (OMMV of course). A few of the comments are applicable
irrespective of the language used.
HTH,
John
8<------
#! /usr/bin/env python
### Layout: Use 4-space indent. Don't use tabs. Don't exceed 79 chars
per line.
import sys
def process_file(opened_file=sys.stdin):
### Local variable access is faster than global
### input = sys.stdin ### shadowing built_in function "input"
### Use names to access elements in rows
PIID = 0
ACTUAL_STATE = 1
DESIRED_STATE = 6
EXPIRY_DATE = 5
recs = {}
for row in opened_file:
row = row.rstrip('\n').split('\t')
### Do the split('\t') *once* per row
piid = row[PIID]
### if recs.has_key(piid) is False:
### has_key() is ancient
### "if boolean_expression is False" is megabletchworthy;
### use "if not boolean_expression"
if piid not in recs:
recs[piid] = []
recs[piid].append(row)
### for piid in recs.keys():
for piid in recs:
best = None ### use out-of-band sentinel
for current in recs[piid]:
if best is None:
best = current ### had cockroach crap (";") at EOL
else:
#If the current record is the correct state
### Clear code (like the next line) doesn't need
comments
### like the above line
if current[ACTUAL_STATE] == current[DESIRED_STATE]:
if best[ACTUAL_STATE] == best[DESIRED_STATE]:
if current[EXPIRY_DATE] > best[EXPIRY_DATE]:
best = current
else:
best = current
else:
if (best[ACTUAL_STATE] != best[ACTUAL_STATE]
and current[EXPIRY_DATE] > best[EXPIRY_DATE]):
best = current
print "\t".join(best)
if __name__ == "__main__":
### Standard idiom to avoid executing script content
### when/if file is imported
process_file()
8<----
More information about the Python-list
mailing list