Large File Parsing

Tim Roberts timr at
Tue Jun 17 06:49:39 CEST 2003

Robert S Shaffer <r.shaffer9 at> wrote:
>I have up to a 3 million record file to parse, remove duplicates and
>sort by size then numeric value. Is this the best way to do this in

In my opinion, no; the best way would be to use a simple chain of commands:

  cut -f 1 -d , inputfile | sort -n | uniq > outputfile

There is no need to reinvent the wheel when perfectly good solutions exist.
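
To see what each stage contributes, the pipeline can be tried on a few made-up
records. This is only a sketch: the sample data and the temporary directory are
invented for illustration, and it assumes the value of interest is the first
comma-separated field of each line, which the original question does not
confirm.

```shell
# Hypothetical sample input: five records, key in the first comma-separated field.
tmpdir=$(mktemp -d)
printf '%s\n' '100,a' '7,b' '100,c' '23,d' '7,e' > "$tmpdir/inputfile"

# cut keeps field 1, sort -n orders the values numerically,
# and uniq drops the now-adjacent duplicate lines.
cut -f 1 -d , "$tmpdir/inputfile" | sort -n | uniq > "$tmpdir/outputfile"

# Prints 7, 23 and 100, one per line.
cat "$tmpdir/outputfile"
```

If the values are non-negative integers without leading zeros, numeric order
already places shorter numbers before longer ones, so a separate sort by size
should not be needed.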

Even if you are using Windows, you can download either Cygwin or
UnxUtils, both of which provide all of these tools.

- Tim Roberts, timr at
  Providenza & Boekelheide, Inc.
