Experimental syntax proposal
# Experimental Syntax Proposal

I would like to propose that Python adopt a modified process before introducing significant changes to its syntax.

## Preamble

Given the following file:

```python
# coding: experimental-syntax
from experimental-syntax import fraction_literal
from experimental-syntax import decimal_literal

assert 1 /3F == Fraction(1, 3)
assert 0.33D == Decimal('0.33')
print("simple_test.py ran successfully.")
```

This is what happens when I run it with the standard Python interpreter:

```
$ python simple_test.py
simple_test.py ran successfully.
```

In what follows, I use this as a concrete example for one of the three possible options mentioned.

## The problem

Python evolves in many ways, including the addition of new modules to the standard library and the introduction of new syntax. Before a module is considered for addition to the standard library, it is often suggested that a version be made available on PyPI so that users can experiment with it, which can lead to significant improvements. However, when it comes to proposed syntax changes, this is currently not possible, at least not in any standard way that would immediately be recognized as such by the wider community.

## Proposed solutions

For those who agree that this is something that could be improved upon, I see at least three possible solutions.

1. Adoption of a simple convention to identify the use of non-standard syntax. This could take the form of a comment, with a specific format, introduced near the top of the file. This comment could be used as a search term to identify modules or projects that make use of some custom syntax.

2. Addition of a special encoding module to the Python standard library that can be used to implement non-standard syntax through source transformation. The short program written at the top provides an example of what this might look like. Note the comment at the top declaring a special codec, which could also serve as a search term.

3. Addition of a standard import hook to the standard library which could be used, like the special encoding example above, to implement syntactic changes.

By doing searches in standard locations (GitHub, GitLab, etc.), one could identify how many modules make use of a given experimental syntax, giving an idea of the desirability of incorporating it into Python. Imagine how different the discussion about the walrus operator would have been if people had had the opportunity to try it out on their own for a few months before a decision was made about adding it to Python.

## New Syntax: two approaches

There are currently at least two ways in which one can write a Python program that uses some non-standard syntax:

* By using an import hook. This is the most powerful approach, as it allows one to make changes either at the source level (prior to parsing), at the AST level, or both.

* By using a custom encoding. This only allows transformations at the source level.

Both approaches currently require a two-step process. To use an import hook, one first has to load a function that sets up the import hook before importing the module that contains the experimental syntax. This makes it impossible to simply write

```
python module_with_new_syntax.py
```

and have it executed. Import hooks, as the name implies, only apply to modules being imported - not to the first module executed by Python.

With a custom encoding, however, it is possible to first register a custom codec via a site.py or usercustomize.py file. Once this is done, the above way of running a module with new syntax is possible, provided that the appropriate encoding declaration is included in the script. This is what I did with the example shown above.

### Current limitation of these two approaches

Using an import hook to enable new syntax does not work for code entered in the standard Python REPL, which limits the possibility of experimentation.
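To make the two-step import-hook process concrete, here is a minimal sketch (my own toy illustration, not the actual implementation in the `ideas` package); the "experimental syntax" here is simply treating `<>` as `!=`:

```python
import importlib.machinery
import sys


class TransformingLoader(importlib.machinery.SourceFileLoader):
    """Loader that rewrites a module's source before compiling it."""

    def source_to_code(self, data, path, *, _optimize=-1):
        source = data.decode("utf-8")
        # Naive source-level transformation: treat `<>` as `!=`.
        # (A real hook would use tokenize to avoid touching string literals.)
        source = source.replace("<>", "!=")
        return compile(source, path, "exec", dont_inherit=True)


class TransformingFinder(importlib.machinery.PathFinder):
    """Finder that attaches the transforming loader to ordinary .py files."""

    @classmethod
    def find_spec(cls, fullname, path=None, target=None):
        spec = super().find_spec(fullname, path, target)
        if spec is not None and spec.origin and spec.origin.endswith(".py"):
            spec.loader = TransformingLoader(fullname, spec.origin)
        return spec


def install_hook():
    """Step one of the two-step process: must run before the import."""
    sys.meta_path.insert(0, TransformingFinder)
```

With the hook installed, `import module_with_new_syntax` works; but, as noted above, the script passed directly to `python` on the command line is never routed through the hook.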
While I have read that specifying the environment variable `PYTHONIOENCODING` enables the Python REPL to use a custom encoding, I have not been able to make it work on Windows and confirm that it is indeed possible. However, I have been able to use a simple customized REPL that can make use of the above. Perhaps the standard REPL could be modified in a similar way.

## New suggested process

Assuming one of the options mentioned above is adopted, before changes to the syntax are introduced in a Python version, they would be made available as either an encoding variant or an import hook, giving interested Pythonistas enough time to experiment with the new syntax, writing actual code and possibly trying out various alternatives.

## Proof of concept

As a proof of concept, shown at the beginning of this post, I have created two tiny modules that introduce new syntax that has been discussed many times on this list:

1. a new syntax for decimal literals
2. a new syntax for rationals

Both modules have been uploaded separately to PyPI; I wanted to simulate what could happen if a proposal such as this one were adopted. To make use of these, you need the `experimental-syntax` codec found in my `ideas` package.

To run the example as shown above, you first need to register the codec, which can be done using either the `site.py` or `usercustomize.py` approach. I chose the latter, by setting the environment variable `PYTHONPATH` to a path where the following `usercustomize.py` file is found:

```python
from ideas import experimental_syntax_encoding
print(f" --> {__file__} was executed")
```

Doing this on Windows, I found that it did not seem to work when using a virtual environment (I added the print statement to confirm that it was loaded). Here's a sample session using the code currently available on PyPI.
```
C:\Users\andre\github\ideas>python simple_test.py
 --> C:\Users\andre\github\ideas\usercustomize.py was executed
simple_test.py ran successfully.

C:\Users\andre\github\ideas>python
 --> C:\Users\andre\github\ideas\usercustomize.py was executed
Python 3.8.4 (tags/v3.8.4:dfa645a, Jul 13 2020, 16:30:28) [MSC v.1926 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from ideas import experimental_syntax_encoding
>>> from ideas import console
>>> console.start()
Configuration values for the console:
    transform_source: <function transform_source at 0x00A8AE38>

Ideas Console version 0.0.19. [Python version: 3.8.4]

~>> import simple_test
simple_test.py ran successfully.
~>> from experimental-syntax import decimal_literal
~>> 3.46D
Decimal('3.46')
~>> from experimental-syntax import fraction_literal
~>> 2/3F
Fraction(2, 3)
~>>
```

Installation:

```
python -m pip install ideas  # will also install token-utils
python -m pip install decimal-literal
python -m pip install fraction-literal
```

More information about "ideas" can be found at https://aroberge.github.io/ideas/docs/html/
It has not (yet) been updated to include anything about the `experimental_syntax_encoding` module, which I just worked on today.

## Final thoughts

This proposal is really about the idea of adopting a standard process of some sort that enables users to experiment with any proposed new syntax, rather than the specific "silly" examples I have chosen to illustrate it.

André Roberge

P.S. If anyone knows how to make a usercustomize file run in a virtual environment on Windows, please let me know.
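P.P.S. To give an idea of the kind of source transformation behind such literal syntax, here is a rough, purely illustrative sketch (the actual `decimal-literal` and `fraction-literal` packages may well work differently):

```python
import re

# Hypothetical, simplified rewriter for the two toy literal syntaxes.
_DECIMAL = re.compile(r"\b(\d+\.\d+)D\b")          # 0.33D -> Decimal('0.33')
_FRACTION = re.compile(r"\b(\d+)\s*/\s*(\d+)F\b")  # 1 /3F -> Fraction(1, 3)


def transform_source(source):
    """Rewrite experimental literals into standard constructor calls."""
    source = _DECIMAL.sub(r"Decimal('\1')", source)
    source = _FRACTION.sub(r"Fraction(\1, \2)", source)
    return source
```

The rewritten code assumes `Decimal` and `Fraction` are present in the namespace; a full implementation would also inject those imports and, being purely textual, would need token-level care to avoid rewriting the insides of string literals.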
See PEP 638: https://www.python.org/dev/peps/pep-0638/

If I have understood correctly, it proposes what you want.
On Sat, Oct 24, 2020 at 4:39 PM Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
See PEP 638: https://www.python.org/dev/peps/pep-0638/
If I have understood correctly, it proposes what you want.
No, it does not. It proposes actual changes to the Python interpreter. Under my proposal, something like what is proposed there would first be implemented as a third-party package.

André Roberge
On Sat, 24 Oct 2020 at 21:49, André Roberge <andre.roberge@gmail.com> wrote:
No, it does not. It proposes actual changes to the Python interpreter.
Under my proposal, something like what is proposed there would first be implemented as a third party package.
Not sure, but yes, the PEP proposes a change to the interpreter, but in such a way that you can define your own macros, i.e. custom keyword(s), in a third-party package that can be `import!`-ed.
Hello André, On Sat, 24 Oct 2020 13:32:30 -0300 André Roberge <andre.roberge@gmail.com> wrote:
# Experimental Syntax Proposal
I would like to propose that Python adopts a modified process before introducing significant changes of its syntax.
[]
## New suggested process
Assuming one of the options mentioned above is adopted, before changes to the syntax are introduced in a Python version, they would be made available as either an encoding variant or an import hook, giving interested Pythonistas enough time to experiment with the new syntax, writing actual code and possibly trying out various alternatives.
Thanks for posting this proposal. It should be pretty clear that this is the best process to follow. It should also be pretty clear that (almost) nobody follows it. If anything, the lack of responses to the substance of the matter proposed, i.e. the *process*, not the technical means to achieve it, is very telling.

So again, I fully agree that trying to implement any Python changes in Python itself (instead of rushing to hack a particular implementation in C) should be the best practice. Now let's talk about why it's not that way. The main reasons would be the lack of technical best practices to achieve that, and the perceived complexity of the existing ways to do it. Even from your proposal, the lack of a "best" solution is clear:
* By using an import hook. This is the most powerful approach, as it allows one to make changes either at the source level (prior to parsing), at the AST level, or both.
* By using a custom encoding. This only allows transformations at the source level.
So, the "custom encoding" way is essentially a hack. As it operates on the surface "stream of characters" representation of a program, only trivial or imprecise transformations can be implemented. Alternatively, a program can be parsed, modified, and dumped again as a stream of characters, just to be immediately parsed by the Python interpreter again. In either case, it's a hack, and should be rejected as a viable approach to the problem domain.

Now you say that the second choice is "import hooks". But import hooks in Python are their own kingdom of "wonderland". Import hooks aren't intended to "allow experimenting with syntax"; they are intended to do anything that's possible with module loading (and many things that aren't possible, I would add). So, they have a rather complex, non-intuitive API, thru which anyone would need to wade, neck-deep, before implementing something useful.

In other words, both choices you list aren't really viable for experimenting with syntax/semantics. That may be a good explanation of why nobody rushes to do so.

I'd formulate the choices for experimenting with syntax/semantics differently. Let's remember the compilation pipeline: the source is tokenized, then parsed into an AST, then the AST is compiled into bytecode.

1. The central part of the pipeline is the AST. Many interesting features can be implemented at the AST level. Others can be at least prototyped using existing syntactic elements (by assigning new semantics to them). Of course, arbitrary syntax changes can't be implemented this way. The good news is that for simple experiments and demonstrations you don't need import hooks at all (just run "python3 -m my_python_dialect source.py").

2. For when AST-level transformation is not enough, it should be possible to "fork" or "subclass" the entire compilation pipeline (in Python, of course).

The second choice is fully and ultimately flexible, but also not attainable with standard Python3 alone. It's even sadder given that it was attainable with Python2.
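For concreteness, the first choice can be as small as this sketch of a "dialect runner" that assigns new semantics to existing syntax - here, making division of two integer literals produce an exact Fraction (all names here are my own illustration):

```python
import ast
from fractions import Fraction


class FractionDivision(ast.NodeTransformer):
    """Assign new semantics to existing syntax: a division whose two
    operands are integer literals becomes an exact Fraction."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested divisions first
        if (isinstance(node.op, ast.Div)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            call = ast.Call(
                func=ast.Name(id="Fraction", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def run_dialect(source, namespace):
    """Parse, transform, compile and run `source` in `namespace`."""
    tree = FractionDivision().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace.setdefault("Fraction", Fraction)
    exec(compile(tree, "<dialect>", "exec"), namespace)
```

A small `__main__` wrapper reading `sys.argv[1]` turns this into the "python3 -m my_python_dialect source.py" runner mentioned above, with no import hooks involved.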
So, let's see: there's the "tokenize" module (https://docs.python.org/3/library/tokenize.html) implemented in Python in the stdlib, which you can fork and modify as you like. But that's a dead end, because the to-AST stage is implemented in C, and accepts raw characters anyway. Then the AST-to-bytecode compiler is also implemented in C. You can't easily "subclass" either of them to implement your changes, as you may imagine.

The story was different with Python2, which had the "compiler" package: https://docs.python.org/2/library/compiler.html . Some time ago, I ported this compiler to CPython3.5: https://github.com/pfalcon/python-compiler/ . Specifically, I ported the AST-to-bytecode compiler part, as the most interesting one. The AST parser needs to be sourced from yet another 3rd-party module (but choices definitely exist).

Note that import hooks are still orthogonal to this approach. Where they kick in is when you want to use your changes not just for "experimenting", but kind of "for real". Again, import hooks at the base level aren't an ideal choice for that - they "muddy the waters" too much. What's needed is a higher-level API specifically for the use case of letting the Python source of modules run thru a custom tokenizer/parser/bytecode compiler.

The best known (non-adhoc) approach to that is PEP 511: https://www.python.org/dev/peps/pep-0511/. But that PEP is a vivid example of Python core developers self-policing, and policing the community: "This PEP was seen as blessing new Python-like programming languages which are close but incompatible with the regular Python language. It was decided to not promote syntaxes incompatible with Python." So, well, the best action is for somebody to implement it anyway, and maintain it as a 3rd-party module - unless there's a clear vision for an even simpler yet fully general API, in which case that should be implemented and promoted.

--------

This turned out to be a long intro.
As I said, I fully agree with you that changes to Python should be prototyped in Python. And there's no better way to "agree" than to actually dogfood this approach oneself. As I lately argued here on the list that implementing block-level scoping for Python is "not rocket science at all" and "a notch above trivial", I decided to do just that - and subject myself to coding it up, using the very python-compiler project I mentioned above. The result is a branch on that repo: https://github.com/pfalcon/python-compiler/tree/for-block-scope

(I'll post a separate message with details.)

--
Best regards,
Paul mailto:pmiscml@gmail.com
Hello Paul (and everyone else), On Sun, Nov 29, 2020 at 12:00 PM Paul Sokolovsky <pmiscml@gmail.com> wrote:
...
# Experimental Syntax Proposal
I would like to propose that Python adopts a modified process before introducing significant changes of its syntax.
[]
## New suggested process
Assuming one of the options mentioned above is adopted, before changes to the syntax are introduced in a Python version, they would be made available as either an encoding variant or an import hook, giving interested Pythonistas enough time to experiment with the new syntax, writing actual code and possibly trying out various alternatives.
Thanks for posting this proposal. It should be pretty clear that this is the best process to follow.
Thank you.
It should also be pretty clear that (almost) nobody follows it. If anything, the lack of responses to the substance of the matter proposed, i.e. the *process*, not the technical means to achieve it, is very telling.
I had taken the lack of response as indicative that my idea was simply "wrong".
So again, I fully agree that trying to implement any Python changes in Python itself (instead of rushing to hack a particular implementation in C) should be the best practice.
Now let's talk about why it's not that way. The main reasons would be the lack of technical best practices to achieve that, and the perceived complexity of the existing ways to do it.
Even from your proposal, the lack of a "best" solution is clear:
* By using an import hook. This is the most powerful approach, as it allows one to make changes either at the source level (prior to parsing), at the AST level, or both.
* By using a custom encoding. This only allows transformations at the source level.
So, the "custom encoding" way is essentially a hack. As it operates on the surface "stream of characters" representation of a program, only trivial or imprecise transformations can be implemented.
Aside: some fairly complex transformations can be implemented using codecs; see https://github.com/dropbox/pyxl for example.
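For instance, registering a minimal source-transforming codec takes only a few lines. This is a toy sketch (not the actual `experimental-syntax` codec); the "syntax" transformed is simply `<>` as a spelling of `!=`:

```python
import codecs


def transform(source):
    # Toy "experimental syntax": treat `<>` as `!=`.
    # (Naive: a real codec would avoid touching string literals.)
    return source.replace("<>", "!=")


def _search(name):
    """Codec search function: recognize only our toy codec name."""
    if name != "toy_syntax":
        return None
    utf8 = codecs.lookup("utf-8")

    def decode(data, errors="strict"):
        text, consumed = utf8.decode(data, errors)
        return transform(text), consumed

    # For the interpreter to read source files with this codec, an
    # incremental decoder and stream reader would also be needed.
    return codecs.CodecInfo(name=name, encode=utf8.encode, decode=decode)


codecs.register(_search)
```

Once registered (e.g. from `usercustomize.py`), the stateless decoder is already usable: `codecs.decode(b"a <> b", "toy_syntax")` yields the transformed text.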
Alternatively, a program can be parsed, modified, and dumped again as a stream of characters, just to be immediately parsed by the Python interpreter again. In either case, it's a hack, and should be rejected as a viable approach to the problem domain.
Now you say that the second choice is "import hooks". But import hooks in Python are their own kingdom of "wonderland". Import hooks aren't intended to "allow experimenting with syntax"; they are intended to do anything that's possible with module loading (and many things that aren't possible, I would add). So, they have a rather complex, non-intuitive API, thru which anyone would need to wade, neck-deep, before implementing something useful.
In my original email, I linked to a project I created ("Ideas": https://aroberge.github.io/ideas/docs/html/) which is meant to simplify this as much as possible.
In other words, both choices you list aren't really viable for experimenting with syntax/semantics.
I respectfully disagree, based on my own experimentation, documented in "Ideas" mentioned previously. For example, I implemented module-level constants using this approach. They *almost* work ... except for some corner cases, due to the fact that I am using module objects created using Python's type(), as I have not figured out (yet?) how to create a module object with a custom dict.
That may be a good explanation of why nobody rushes to do so.
I'd formulate the choices for experimenting with syntax/semantics differently. Let's remember the compilation pipeline: the source is tokenized, then parsed into an AST, then the AST is compiled into bytecode.
I tried to represent this visually on https://aroberge.github.io/ideas/docs/html/possible.html
1. The central part of the pipeline is the AST. Many interesting features can be implemented at the AST level. Others can be at least prototyped using existing syntactic elements (by assigning new semantics to them). Of course, arbitrary syntax changes can't be implemented this way.
Exactly. Which is why I suggested using either import hooks or encodings. [Snip: explanation leading to discussion of Python2 "compiler" package which is referred to again at the end]
What's needed is a higher-level API specifically for the use case of letting the Python source of modules run thru a custom tokenizer/parser/bytecode compiler. The best known (non-adhoc) approach to that is PEP 511: https://www.python.org/dev/peps/pep-0511/.
I believe that the intention of PEP 511 was to optimize the generated code, not to experiment with different syntax. I *think* that my Ideas project is essentially a superset of what is described in PEP 511, but with a completely different goal.

Best,
André
--------
This turned out to be a long intro. As I said, I fully agree with you, that changes to Python should be prototyped in Python. And there's no better way to "agree" than actually dogfood this approach to oneself.
Yes. One additional comment, a point of emphasis about my original proposal.

Take PEP 505 (None-aware operators), written 5 years ago. It is still deferred. Imagine that someone wants to experiment with a custom implementation. Using the approach I mentioned, it should be possible to do so.

Suppose that the same person wishes to also experiment with block-level scoping. By using the approach I mentioned in my original message, it would be possible to "stack" transformations and experiment with code that supports None-aware operators and/or block-level scoping. Different syntax variants could be implemented separately and, provided they do not create conflicts, tested together, so that people could get a feel for whether or not some proposed change should be included in Python.
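The "stacking" amounts to simple function composition of independent source transformations. A sketch, with toy transforms standing in for real proposals (all names hypothetical):

```python
def stack(*transforms):
    """Compose independent source -> source transformation functions."""
    def combined(source):
        for transform in transforms:
            source = transform(source)
        return source
    return combined


# Two toy, independent "experimental syntax" transforms:
def spaceship(source):
    return source.replace("<>", "!=")   # pretend `<>` is new syntax for `!=`


def power_arrow(source):
    return source.replace("^^", "**")   # pretend `^^` is new syntax for `**`


pipeline = stack(spaceship, power_arrow)
```

As long as the transforms do not conflict (e.g. both claiming the same token), any subset of them can be enabled together this way.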
As I lately argued here on the list that implementing block-level scoping for Python is "not rocket science at all" and "a notch above trivial", I decided to do just that - and subject myself to coding it up, using the very python-compiler project I mentioned above. The result is a branch on that repo: https://github.com/pfalcon/python-compiler/tree/for-block-scope
(I'll post a separate message with details.)
I look forward to reading it. Best regards, André
-- Best regards, Paul mailto:pmiscml@gmail.com
Hello André, On Sun, 29 Nov 2020 13:07:48 -0400 André Roberge <andre.roberge@gmail.com> wrote:
Thanks for posting this proposal. It should be pretty clear that this is the best process to follow.
Thank you.
It should also be pretty clear that (almost) nobody follows it. If anything, the lack of responses to the substance of the matter proposed, i.e. the *process*, not the technical means to achieve it, is very telling.
I had taken the lack of response as indicative that my idea was simply "wrong".
I pray for that not to be the case. People who *do* stuff shouldn't be discouraged so easily. Let doubts fall on people who get a random idea and, 5 minutes later, it's on the list, without any attempt at research on their side. []
Aside: some fairly complex transformations can be implemented using codecs; see https://github.com/dropbox/pyxl for example.
Alternatively, a program can be parsed, modified and dumped again as a stream of characters, just to immediately be parsed by the Python interpreter again. In either case, it's a hack, and should be rejected as a viable approach to a problem domain.
I'd say that your link proves my point: that's a roundabout and complex way, but of course, whole companies (Dropbox, in the link above) can afford to wade thru it, for lack of a better alternative.
Now you say that the second choice is "import hooks". But import hooks in Python is its own kingdom of "wonderland". Import hooks aren't intended to "allow to experiment with syntax", they are intended to do anything what's possible with module loading (and many things that aren't possible, I would add). So, they have rather complex, non-intuitive API, thru which anyone would need to wade, neck-deep, before implementing something useful.
In my original email, I linked to a project I created ("Ideas": https://aroberge.github.io/ideas/docs/html/) which meant to simplify this as much as possible.
In other words, both choices you list aren't really viable to experiment with syntax/semantics.
I respectfully disagree, based on my own experimentation, documented in "Ideas" mentioned previously. For example, I implemented module level constants using this approach.
I know. I also experimented with it. But that didn't leave me with a feeling of achievement. If anything, it left me with a feeling of powerlessness, which I tried to convey in my reply. When I think of all the people who need to wade thru it again and again, how the whole thing is purposely made hard/confusing for people to use (PEP 511 rejection, again), and how many gave up along that way - that doesn't make me feel good.

I'm well aware that many people trod that way nonetheless. But no, it's not many, it's "a few", and that's the problem. For example, your "Ideas" project sits in my browser tab next to https://jon.how/likepython/ , which likely means I came across links to them together. That's from 2010, and I don't think it's the first of its kind or anything. And let me know whether it helped you with your task, or whether you had to retrace the path from scratch.

Anyway, I don't consider "I did that" to be a good criterion. "Every 2nd Python user can do that easily" would be. (Ok, let's be realistic - every 10th.)
They *almost* work ... except for some corner cases due to the fact that I am using module objects created using Python's type() as I have not figured out (yet?) how to create a module object with a custom dict.
So, I postponed replying, because even from reading "part 1" of your blog I got the idea that we were working on very similar things. So, I was anxious to get my proposal on the "strict mode" out, which includes https://github.com/pfalcon/python-strict-mode , a strict mode prototype for CPython - to have a bit of a "clean room" (after-the-fact) comparison.

So, I just finished reading https://aroberge.blogspot.com/2020/03/true-constants-in-python-part-2-and.ht... , and yeah, we were following almost the same path. I also implemented that "custom module object" you write about. Except it's not custom at all. It's a generic "read-only proxy" object I write about in https://mail.python.org/archives/list/python-ideas@python.org/message/7TRMSB... In my Pycopy, it lives in sys.roproxy. In the CPython prototype it's coded approximately at https://github.com/pfalcon/python-strict-mode/blob/master/strict.py#L52 .

I wrote that code as part of a personal "advent of code" hacking session over the winter holiday season of 2019/2020. So, if you have cookies for your "contest" winners, count me in the line ;-).

[]
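As a sketch of the general idea (my illustration here, not the actual sys.roproxy or strict.py code): since Python 3.5, a module's `__class__` may be reassigned to a ModuleType subclass, which is enough to prototype read-only module attributes without building a custom module type from scratch:

```python
import types


class ReadOnlyModule(types.ModuleType):
    """Module whose attributes cannot be rebound once frozen."""

    _frozen = False  # class-level default; flipped per instance by freeze()

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError(
                f"cannot rebind attribute {name!r} of read-only module")
        super().__setattr__(name, value)


def freeze(module):
    """Make an existing module object read-only in place."""
    # Reassign the module's class to our ModuleType subclass, then
    # perform the last permitted assignment.
    module.__class__ = ReadOnlyModule
    module._frozen = True
```

A module can apply this to itself at the end of its body via `freeze(sys.modules[__name__])`; reads keep working while rebinding raises.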
What's needed is higher-level API specifically for the usecase of letting Python source of modules to run thru custom tokenizer/parser/bytecode compiler. The best known (non-adhoc) approach to that is PEP511: https://www.python.org/dev/peps/pep-0511/.
I believe that the intention of PEP-511 was to optimize the code generated, and not to experiment with different syntax. I *think* that my ideas project is essentially a superset of what is described in pep-511, but with a completely different goal.
Glad to hear you (seem to) promote your "Ideas" as *the* solution to the issue. We need people making bets like that. But how does it compare to other folks' stuff? You say you implemented decimal literals? Well, this guy implemented Roman literals: https://github.com/isidentical-archive/pepgrave (from an April 1st PEP). Whose approach is better? And he writes "Re-write of my old project PEPAllow with a complete new approach", and "new" doesn't always mean "better", so his old approach should be in the competition too. And I guess that's the only way to do it "right": take stuff from different people (including not-yet-written stuff) and "throw it into the arena" to find the best.
Best,
André
--------
This turned out to be a long intro. As I said, I fully agree with you, that changes to Python should be prototyped in Python. And there's no better way to "agree" than actually dogfood this approach to oneself.
Yes. One additional comment, point of emphasis about my original proposal.
Take PEP 505 (None-aware operators) written 5 years ago. It is still deferred.
Imagine that someone wants to experiment with a custom implementation. Using the approach I mentioned, it should be possible to do so.
Suppose that the same person wishes to also experiment with block-level scoping. By using the approach I mentioned in my original message, it would be possible to "stack" transformations and experiment with code that supports None-aware operators and/or block-level scoping. Different syntax variants could be implemented separately and, provided they do not create conflicts, tested together, so that people could get a feel for whether or not some proposed change should be included in Python.
Absolutely. As I said, that's the right approach. Block-level vars can be implemented purely by AST rewriting, for example. Now, people interested in that approach should get together, each take their own project as shield and sword, find out whose tools are the best, have everyone else abandon their own "projects of love and sweat", and then, with common tools, cooperate on achieving that aim, instead of duplicating each other's efforts and multiplying incompatible solutions. If that sounds almost impossible, then certainly it is. So, let's get to work! []

--
Best regards,
Paul mailto:pmiscml@gmail.com
participants (4)
- André Roberge
- Marco Sulla
- Paul Sokolovsky
- redradist@gmail.com