On Sun, May 24, 2020 at 6:56 PM Steven D'Aprano <steve@pearwood.info> wrote:
> I use bash a lot, and writing something like this is common:
> cat data | sort | cut -d';' -f6 | grep ^foo | sort -r | uniq -c

And today's "Useless Use Of cat Award" goes to... :-)

    sort data | ...

(What is it specifically about cat that is so attractive? I almost
certainly would have done exactly what you did, even knowing that sort
will take a file argument.)

This is probably going far afield, since it is a bash thing, not a Python thing.  But I can actually answer this quite specifically.

When I write a pipeline like that, I usually do not do it in one pass.  I write a couple of the stages, look at what I have, and then add some more stages until I get it right.  Many of the commands in the pipeline can take a file argument (not just sort, also cut, also grep, also uniq... everything I used in the example).

But I find fairly often that I need to add a step BEFORE what I initially thought was the first processing step.  And then I have to remove the filename as an argument of that no-longer-first step.  Rinse and repeat.  With `cat` I know it does nothing, and I won't have to change it later (well, OK, sometimes I want -n or -s).  So it is a completely generic "data" object ... sort of like how I would write "fluent programming" starting with a Pandas DataFrame, for example, and calling chains of methods.
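
To make that concrete, here is a rough sketch of the two editing styles.  The sample data and the variable names are made up for illustration; the only point is which part of the command line has to change when a new first stage appears.

```shell
# Hypothetical sample data, written to a temporary file for illustration only.
data=$(mktemp)
printf 'foo;3\nbar;1\nfoo;3\nbaz;2\n' > "$data"

# Style 1: the filename is an argument of the first stage.
first_try=$(sort "$data" | uniq -c)
# Adding an earlier step means moving the filename off sort and onto grep.
second_try=$(grep -v '^bar' "$data" | sort | uniq -c)

# Style 2: cat is a fixed head of the pipeline.
cat_first_try=$(cat "$data" | sort | uniq -c)
# The new step slots in after cat; no other stage is edited.
cat_second_try=$(cat "$data" | grep -v '^bar' | sort | uniq -c)

rm -f "$data"
printf '%s\n' "$cat_second_try"
```

Both styles give the same output at each step; the difference is only in how much of the line I have to retype while iterating.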

--
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.