Supporting ~ as a binary operator is an interesting idea, especially given the relatively limited usage of unary ~. However, the big hole in this proposal for formulas is that there is a de facto standard "minilanguage" for writing such formulas in Python, namely what Patsy supports: https://patsy.readthedocs.io/en/latest/formulas.html Patsy is used by statsmodels and other tools to support the same formulas as we see in R (or S). 

We immediately see the problem with the interaction operator : (colon), which conflicts with how it is used to support annotations in Python. Given that this formula minilanguage is comprehensive, this seems to be a fatal objection.
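
As a rough illustration of that conflict, here is a minimal sketch using the standard library's ast module. The term "Literacy:Wealth" is only an illustrative interaction term built from the names in the quoted example, and parses_as_expression is a hypothetical helper, not part of any library.

```python
import ast


def parses_as_expression(source):
    """Return True if `source` is a valid Python expression (hypothetical helper)."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False


# As a statement, "Literacy:Wealth" is already taken: it is a bare variable
# annotation (name "Literacy" annotated with "Wealth").
stmt = ast.parse("Literacy:Wealth").body[0]
is_annotation = isinstance(stmt, ast.AnnAssign)

# As an expression, the colon is not an operator at all, unlike "+".
colon_ok = parses_as_expression("Literacy:Wealth")
plus_ok = parses_as_expression("Literacy + Wealth")

print(is_annotation, colon_ok, plus_ok)
```

So even with a binary ~, an interaction term written with a colon would need a different spelling to be pure Python.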

Note that deferred evaluation is something of a red herring: it is straightforward to defer execution in Python's object model, as we see in SymPy, pandas DataFrame where clauses, and SQLAlchemy, among other examples.
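
To make that concrete, here is a minimal sketch of deferred evaluation through operator overloading. Expr, Term and BinOp are hypothetical names invented for this illustration; this is not how SymPy, pandas or SQLAlchemy are actually implemented, only the general pattern they rely on.

```python
import operator


class Expr:
    """Base class: operators build a tree instead of computing a value."""

    def __add__(self, other):
        return BinOp("+", operator.add, self, other)

    def __mul__(self, other):
        return BinOp("*", operator.mul, self, other)


class Term(Expr):
    """A named placeholder, resolved only when evaluate() is called."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

    def evaluate(self, data):
        return data[self.name]


class BinOp(Expr):
    """A deferred binary operation over two sub-expressions."""

    def __init__(self, symbol, func, left, right):
        self.symbol = symbol
        self.func = func
        self.left = left
        self.right = right

    def __repr__(self):
        return f"({self.left!r} {self.symbol} {self.right!r})"

    def evaluate(self, data):
        return self.func(self.left.evaluate(data), self.right.evaluate(data))


# Nothing is computed here; we only build the expression tree.
expr = Term("Literacy") + Term("Wealth") * Term("Region")
print(expr)

# Evaluation happens later, once data is supplied.
print(expr.evaluate({"Literacy": 1, "Wealth": 2, "Region": 3}))
```

The same approach would extend to any operator that has a dunder method, which is why the deferral itself is not the hard part of the proposal.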


On Sun, Feb 23, 2020 at 5:37 PM <jdveiga@gmail.com> wrote:
Aaron Hall wrote:
> Currently, Python only has ~ (tilde) in the context of a unary operation (like
> -, with __neg__(self), and +, __pos__(self)). 
> ~ currently calls __invert__(self) in the unary context.
> I think it would be awesome to have in the language, as it would allow modelling along the
> lines of R that we currently only get with text, e.g.:
> smf.ols(formula='Lottery ~ Literacy + Wealth + Region', data=df)
> With a binary context for ~, we could write the above string as pure Python, with
> implications for symbolic evaluation (with SymPy) and statistical modelling (such as with
> sklearn or statsmodels) - and other use-cases/DSLs.
> In LaTeX we call this \sim (Wikipedia indicates this is for "similar to").
> I'm not too particular, but __sim__(self, other) would have the benefits of
> being both short and consistent with LaTeX.
> This is not a fully baked idea, perhaps there's a good reason we haven't added a binary
> ~.  It seems like I've seen discussion in the past. But I couldn't find such
> discussion. And as I'm currently taking some statistics courses, I'm getting
> R-feature-envy again...
> What do you think? 
> Aaron Hall

I do not fully understand your proposal. I know nothing about R, and my statistical knowledge faded long ago.

However, I think that we cannot expect Python to accommodate every existing domain. Let me explain: Python has no special features, syntax, or operators to deal with SQL, HTML, ini files, OpenGL, etc. These domains, and others, are supported via libraries, outside of the language core.

~ exists in a bitwise context and, as far as I know, it comes from C (indeed, I have never used it in Python). It is a unary operator because that is how it works as a bitwise operator.

I cannot see any improvement in turning ~ into a binary operator. I imagine that a binary ~ would have a completely different meaning from a unary ~. I can foresee many problems here.

In my opinion, you should prove that binary ~ has a relevant benefit for the whole language, not just for R-style tasks. It should be useful in several different domains and behave consistently, or at least as consistently as possible, in those domains.

Can you, for instance, envision other uses of binary ~ beyond R?
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/UMXBIM6TSOTR76KOBOJXGVJUDMX7IHBT/
Code of Conduct: http://python.org/psf/codeofconduct/