JavaScript has an early proposal for this use-case: https://github.com/tc39/proposal-partial-application#pipeline-and-partial-application
where "cat data | sort | cut -d; -f6 | grep ^foo | sort -r | uniq -c" could be represented as:
  data
    |> sort
    |> cut(?, delimiter=";", fields=6)
    |> grep(?, "^foo")
    |> sort(?, reverse=True)
    |> uniq(?, count=True)
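
For comparison, something close to this can be approximated in today's Python by overloading `|`. The following is only a rough sketch under stated assumptions: `Pipe`, `stage`, the placeholder `X` (standing in for `?`), and the helpers `cut`, `grep` and `uniq_count` are hypothetical names made up for illustration, not part of any proposal or library, and the sample data is invented.

```python
import re
from itertools import groupby


class _Placeholder:
    """Marks the argument slot that receives the piped value (the '?' above)."""


X = _Placeholder()  # hypothetical stand-in for '?'


def stage(func, *args, **kwargs):
    """Return a one-argument callable with the piped value put in the X slot."""
    def call(value):
        filled = [value if arg is X else arg for arg in args]
        return func(*filled, **kwargs)
    return call


class Pipe:
    """Wraps a value so that `|` feeds it through one-argument callables."""

    def __init__(self, value):
        self.value = value

    def __or__(self, func):
        return Pipe(func(self.value))


# Hypothetical list-based helpers loosely mirroring the shell tools.
def cut(lines, delimiter, field):
    return [line.split(delimiter)[field - 1] for line in lines]


def grep(lines, pattern):
    return [line for line in lines if re.search(pattern, line)]


def uniq_count(lines):
    # Like `uniq -c`: count runs of adjacent equal lines.
    return [(len(list(group)), key) for key, group in groupby(lines)]


data = [
    "a;b;c;d;e;foo1",
    "a;b;c;d;e;bar",
    "a;b;c;d;e;foo2",
    "a;b;c;d;e;foo1",
]

result = (
    Pipe(data)
    | sorted
    | stage(cut, X, delimiter=";", field=6)
    | stage(grep, X, "^foo")
    | stage(sorted, X, reverse=True)
    | uniq_count
).value
```

The explicit `stage(...)` wrapper and the trailing `.value` are the noise that a real `|>` operator with `?` would remove.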

A very similar operator already exists in Hack, where it has served to clean up a lot of code at Facebook: https://docs.hhvm.com/hack/expressions-and-operators/pipe

(Fun fact: I was one of the people who asked for __matmul__ back in the day, but now I prefer the pipeline operator, as it's not restricted to unary functions and is, IMHO, more readable:
https://mail.python.org/archives/list/python-ideas@python.org/message/I4DXFR4P5KAMHDL4MRAM43QCUMW4MIJY )

On Sun, May 24, 2020 at 7:33 PM David Mertz <mertz@gnosis.cx> wrote:
On Sun, May 24, 2020 at 6:56 PM Steven D'Aprano <steve@pearwood.info> wrote:
> > I use bash a lot, and writing something like this is common:
> > cat data | sort | cut -d; -f6 | grep ^foo | sort -r | uniq -c
>
> And today's "Useless Use Of cat Award" goes to... :-)
>
>     sort data | ...
>
> (What is it specifically about cat that is so attractive? I almost
> certainly would have done exactly what you did, even knowing that sort
> will take a file argument.)

This is probably going afield since it is a bash thing, not a Python thing.  But I can actually answer this quite specifically.

When I write a pipeline like that, I usually do not do it in one pass.  I write a couple of the stages, look at what I have, and then add some more stages until I get it right.  Many of the commands in the pipeline can take a file argument (not just sort, also cut, also grep, also uniq... everything I used in the example).

But I find fairly often that I need to add a step BEFORE what I initially thought was the first processing step.  And then I have to remove the filename as an argument of that no-longer-first step.  Rinse and repeat.  With `cat` I know it does nothing, and I won't have to change it later (well, OK, sometimes I want -n or -s).  So it is a completely generic "data" object ... sort of like how I would write "fluent programming" starting with a Pandas DataFrame, for example, and calling chains of methods.
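
A rough Python analogue of that habit, purely as an illustrative sketch: the names `cat`, `run` and `stages` below are made up for this example, and the sample input is invented. The pipeline starts from an identity stage, so a new step can be slotted in right after it without editing any other stage.

```python
from functools import reduce


def cat(lines):
    """Identity stage: passes the data through unchanged, like `cat`."""
    return lines


def run(data, stages):
    """Feed `data` through each stage in order, left to right."""
    return reduce(lambda value, stage: stage(value), stages, data)


raw = ["b", "a", " c"]

# First attempt: just sort.
stages = [cat, sorted]
first_try = run(raw, stages)

# Later it turns out the lines need stripping BEFORE sorting. The new step
# goes in right after `cat`; neither `cat` nor `sorted` has to change.
stages.insert(1, lambda lines: [line.strip() for line in lines])
second_try = run(raw, stages)
```

Here `first_try` still has the stray leading space affecting the order, while `second_try` is sorted on the cleaned-up values.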

--
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7DRNTPAWCU5SA666E6ZLEXZQVUYCS7VN/