Why shouldn't Python be better at implementing Domain Specific Languages?

But I don't see any need (or even benefit) in adding new language features to Python, so it can do better at DSLs.
It would be nice if there was a DSL for describing neural networks (Keras). The current syntax looks like this:

    model.add(Dense(units=64, activation='relu', input_dim=100))
    model.add(Dense(units=10, activation='softmax'))

On Fri, Aug 31, 2018 at 3:19 AM, Michael Selik <mike@selik.org> wrote:
Presumably because those are even harder to read and write for humans.

I believe that the key issue with using Python as a DSL has to do with its insistence on punctuation -- the above example uses nested parentheses, commas, equal signs, and quotation marks. Those are in general needed to avoid ambiguities, but DSLs are often highly stylized, and a language that doesn't need them has a certain advantage. For example, if a shell-like language was adopted, the above could probably be written with spaces instead of commas, parentheses and equal signs, and dropping the quotes (though perhaps it would be more readable if the equal signs were kept).

I'm not sure how we would go about this though. IIRC there was a proposal once to allow top-level function calls to be written without parentheses, but it was too hard to make it unambiguous (e.g. would "foo +1" mean "foo(+1)" or "foo + 1"?)

--
--Guido van Rossum (python.org/~guido)
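The "foo +1" ambiguity raised in the message above can be checked against Python's current grammar. The following is a minimal sketch using only the standard `ast` module; it shows how today's parser reads the expression and says nothing about how the rejected proposal would have resolved it.

```python
import ast

# Parse the text exactly as Python's grammar reads it today.
tree = ast.parse("foo +1", mode="eval")
node = tree.body

# The "+" is taken as a binary operator, so this is foo + 1, not foo(+1).
print(type(node).__name__)     # BinOp
print(type(node.op).__name__)  # Add
print(node.left.id)            # foo
```

A grammar that allowed call syntax without parentheses would have to choose between this reading and a call with the argument `+1`, which is the difficulty described above.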

On Thu, Aug 30, 2018 at 9:41 PM Guido van Rossum <guido@python.org> wrote:
Guido is absolutely right (as usual) that JSON or XML would be *vastly* harder to read than that very clean Python code. That said, if you wanted a "DSL", the perfect choice would be YAML. The description above could look like this:
This format could easily be either a string within a Python program or an external file with the definition. YAML is already well supported in Python; all you'd need to do is write a little wrapper to translate the description above into the actual Keras API calls, which would be pretty easy.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
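A rough sketch of the "little wrapper" idea follows. The YAML text from the message is not preserved in this archive, so the schema used here (a `layers` list whose entries carry a `type` key) is an assumption, as are the names `build_model` and `LAYER_TYPES`. The `Dense` and `Sequential` classes are hypothetical stand-ins so that the example runs without Keras installed; the description is written as the plain dict that a call such as PyYAML's `yaml.safe_load` would return for a file of this shape.

```python
# Hypothetical stand-ins for the Keras classes, for illustration only.
class Dense:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class Sequential:
    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)


# Maps the "type" field of each entry to a layer class (assumed schema).
LAYER_TYPES = {"Dense": Dense}


def build_model(description):
    """Translate a nested dict/list description into model.add() calls."""
    model = Sequential()
    for entry in description["layers"]:
        params = dict(entry)                      # copy so the input is not mutated
        layer_cls = LAYER_TYPES[params.pop("type")]
        model.add(layer_cls(**params))
    return model


# The data structure a YAML loader would produce for such a description.
description = {
    "layers": [
        {"type": "Dense", "units": 64, "activation": "relu", "input_dim": 100},
        {"type": "Dense", "units": 10, "activation": "softmax"},
    ]
}

model = build_model(description)
print(len(model.layers))  # 2
```

Because the description is only sequences, mappings and scalars, the same function would work unchanged on data loaded from JSON.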

On Fri, Aug 31, 2018 at 4:04 AM, David Mertz <mertz@gnosis.cx> wrote:
Hm. YAML is indeed a great, readable alternative to JSON or XML. But the term DSL implies (to me) more than just nested key-value pairs. (Though who knows, maybe that's all Keras needs, and then it's a poor argument for having a DSL.)

Then again maybe I'm confusing DSL (which appears to be a Rubyism) with "little language": http://wiki.c2.com/?LittleLanguage

--
--Guido van Rossum (python.org/~guido)

On Fri, Aug 31, 2018, 12:08 AM Guido van Rossum <guido@python.org> wrote:
Keras is deliberately very declarative in defining models. So indeed sequences and mappings and scalars are everything it needs. Maybe I'll actually implement the idea I sketched out in a small independent library.

[Guido]
Then again maybe I'm confusing DSL (which appears to be a Rubyism) with "little language": http://wiki.c2.com/?LittleLanguage

I'm pretty sure I heard "DSL" before Ruby existed. Definitely before Ruby entered my consciousness as another neat language. I don't really care whether a DSL needs to be Turing complete, but I think many such are used primarily declaratively. Mostly though I was pointing out that Keras doesn't need flow and branching for its model definitions.

On Thu, Aug 30, 2018, 9:23 PM David Mertz <mertz@gnosis.cx> wrote:
Defining Keras models reminds me of the ugly ASP code I'd sometimes write to create HTML in an object-oriented fashion. Writing plain HTML was usually more pleasant and readable. I suggested XML with that memory in mind. It's cumbersome for many tasks, but Keras models in particular might be a good fit. Or YAML.

The word "domain" appears in this sense on the first page of Aho and Ullman, and ANTLR (which I know you've used) describes itself as a tool for building domain-specific languages. Both pre-date Ruby, I'm fairly sure.

James Lu, quoting Jonathan Fine, used the term "internal DSL" and although that's new to me, people seem to be interpreting it in the sense that Gradle is a Groovy DSL (note caps), a build tool a lot of software developers will be familiar with. In that case, what you write really is Groovy, but the execution environment has been pre-conditioned with objects and libraries that (almost) make a new language. When you understand what's going on (not sure I always do), it becomes possible to mix Gradle statements and Groovy freely. The most immediate benefit is that all the apparatus of expressions and types/methods is already present.

So "internal" is the key word. The point about punctuation is spot-on, I think: Groovy is relatively free of (makes optional) some punctuation, including the parentheses that make calls easily identifiable in Python. So quite possibly starting from Python is limiting if what you want is an *internal* DSL with a grammar you choose: the object system is fantastic-plastic, but the grammar is not.

DSLs embedded in Python are common, of course (f-strings, regexes, SQL), and DSLs can generate Python from fragments with almost no constraints on their own grammar. IPython strikes me as possibly a Python internal DSL, or Django, but what they've done does not take us far from pure Python.

Jeff Allen

On 31/08/2018 05:07, Guido van Rossum wrote:

On Fri, Aug 31, 2018 at 03:40:22AM +0200, Guido van Rossum wrote:
On Fri, Aug 31, 2018 at 3:19 AM, Michael Selik <mike@selik.org> wrote:
On Thu, Aug 30, 2018 at 5:31 PM James Lu <jamtlu@gmail.com> wrote:
[James]
Is there something wrong with that style? I'm not sure what syntax you would consider an improvement.
[Michael]
Why not JSON or XML for cross-language compatibility?
James didn't mention cross-language compatibility; he presumably wants a better way to write machine learning code. Being able to exchange data from one application to another is great. Having to write your code as XML is not.
[Guido]
Presumably because those are even harder to read and write for humans.
Indeed. One criticism of XML is that it is the hammer which leads people to treat every problem as a nail. "Just use XML". Now you have two problems *wink*
Right -- especially for imperative-style code. I'm reminded of an example from Leo Brodie's classic "Learning Forth", the top level application in an embedded washing machine controller:

    WASH SPIN RINSE SPIN

That sort of punctuation-free imperative code elegantly matches the way we might write it down as a list of commands.

Please no, Ruby has that, and the meaning of expressions depends on whether you put whitespace around operators or not. Given:

    def a(x=4)
      x+2
    end

    b = 1

the result of "a+b" depends on the spaces around the plus sign:

    irb(main):005:0> a + b
    => 7
    irb(main):006:0> a +b
    => 3

--
Steve

James Lu started this thread by quoting me. Thank you, James, for the compliment. And I feel somewhat obliged to contribute here, as at one remove I started the thread.

In the message James quoted, I also said

<quote>
But most strengths, in another situation, can be a weakness. Language design is often a compromise between conciseness and readability, ease of use and performance, good for beginners and good for experts, and many other factors. Such as innovation and stability.

Guido's decisions, as BDFL, have shaped Python and its community into what it is now. It is one set of compromises. Other languages, such as Ruby, have made different compromises. Now that the BDFL is on vacation, the challenge is to maintain the essence of Python while continuing to innovate.
</quote>

It's important, of course, for the developers of a DSL to understand the domain. I'm starting to learn http://elm-lang.org/. It describes itself as
A delightful language for reliable webapps. Generate JavaScript with great performance and no runtime exceptions.
The Elm developers have learnt a great deal from Python, and I think that we in turn can learn from them. Particularly about catching coding errors early, with good feedback. But that's a different thread.

So I'd say that focusing on improving the API of an existing library is a good way to develop our understanding of DSLs more generally. James provided a Keras example

    model.add(Dense(units=64, activation='relu', input_dim=100))
    model.add(Dense(units=10, activation='softmax'))

What might be better here, if allowed, is

    model.extend([
        Dense(units=64, activation='relu', input_dim=100),
        Dense(units=10, activation='softmax'),
    ])

Another approach would be to provide a fluent interface.
https://martinfowler.com/bliki/FluentInterface.html
https://en.wikipedia.org/wiki/Fluent_interface

Done this way, we might get something like jQuery

    (
        model
        .dense(units=64, activation='relu', input_dim=100)
        .dense(units=10, activation='softmax')
    )

JSON and XML and YAML have already been mentioned. Here's another, XML-ish approach. A combined list-dictionary is quite common. It's used widely in XML (and SGML before it). So how to create such? A few years ago I experimented with an API such as

    A(a=1, b=2)[
        X(1, 2, 3),
        Y[ ....],
    ]

As I recall, someone told me that https://kivy.org does something similar.

Kivy and Elm are systems I'd like to learn. Ease of use is important in language and library design. We can learn from the success of others, as well as from our own successes and failures (smile).

--
Jonathan
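Both API shapes in the message above can be built with ordinary Python methods, with no grammar changes. The sketch below is illustrative only: `Model`, `Node` and their methods are hypothetical names, not part of Keras or Kivy. One difference from the message is that children are attached to an instance, as in `Y()[...]`, rather than by subscripting the class as in `Y[...]`, which would need extra class-level machinery.

```python
class Model:
    """Hypothetical stand-in for a Keras-like model (fluent style)."""

    def __init__(self):
        self.layers = []

    def dense(self, **kwargs):
        self.layers.append(("dense", kwargs))
        return self  # returning self is what makes chaining possible


model = (
    Model()
    .dense(units=64, activation='relu', input_dim=100)
    .dense(units=10, activation='softmax')
)


class Node:
    """Combined list-dictionary: call syntax sets attributes, [] sets children."""

    def __init__(self, *args, **attrs):
        self.args = args
        self.attrs = attrs
        self.children = ()

    def __getitem__(self, children):
        # node[x, y] receives a tuple; node[x] receives the bare object.
        if not isinstance(children, tuple):
            children = (children,)
        self.children = children
        return self


class A(Node): pass
class X(Node): pass
class Y(Node): pass


tree = A(a=1, b=2)[
    X(1, 2, 3),
    Y()[X(4)],
]

print(len(model.layers))    # 2
print(tree.attrs)           # {'a': 1, 'b': 2}
print(len(tree.children))   # 2
```

The subscript trick relies on Python passing a tuple to `__getitem__` when several comma-separated items appear inside the brackets.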

as for elm, you have to look twice not to see the python of it

Abdur-Rahmaan Janhangeer
https://github.com/Abdur-rahmaanJ
Mauritius

I prefer Python's syntax. In the Keras example, Python pays off compared to XML or YAML or whatever as soon as you need to define something programmatically. For example, if your model is generated based on some other input.

Anyway, most of your time is not spent typing punctuation. Most of your time is spent debugging. I wish that LaTeX, one of the DSLs I use, were implemented as a Python package. There are a thousand terrible design decisions in LaTeX that could all be fixed if it had been done as one good Python package. Maybe LuaTeX will end up fulfilling the dream.

On Thursday, August 30, 2018 at 9:41:46 PM UTC-4, Guido van Rossum wrote:

i believe a DSL is simple enough for an enthusiastic py programmer to write, if you really wanted one.

just write down the tasks you need to accomplish, the data needed, the constructs needed (if needed), and the feel/look of it in your editor.

plan first, come up with a good mock, then implement it. implementation is easy, ideas are hard. good ideas offload the efforts on the implementation side; they can also save you future troubles.

let me take an example: a DSL to calculate the cost of houses

aim: calculate cost of houses
input:
    num of houses
    price of house
output: price of houses
technical tasks: show to screen

it might go on like that

--- file ---
house num 1,000
house price 250,000
calculate sum

--- output ---
$ 250 000 000

in the above example, assumptions were made and functions crammed, but you have a dsl. real-life dsls are not far from the specs of this one but differ in the tools used.

Abdur-Rahmaan Janhangeer
https://github.com/Abdur-rahmaanJ
Mauritius

On Fri, Aug 31, 2018 at 11:39:16AM -0400, James Lu wrote:
We should all take a look at Ruby Blocks and think about how Python could benefit from something similar.
You are not the first person to suggest Ruby-like anonymous blocks or multi-statement lambdas.

https://duckduckgo.com/?q=python-ideas+ruby+blocks

It's not enough to just think about the benefits. We also need to think about the costs, the disadvantages, the possible syntax, and how well it fits into the existing language.

If anyone has some new and interesting ideas, that would be fantastic. But if we're just going to rehash the same rejected ideas again, please don't. Python is 20+ years old and the idea of multi-statement anonymous functions has been discussed since before Ruby even existed.

--
Steve

On Fri, Aug 31, 2018 at 11:14:35AM +0400, Abdur-Rahmaan Janhangeer wrote:
I don't think the problem is people coming up with DSLs for their problems, but implementing those DSLs. Writing an informal specification is easy; writing the implementation is trickier.

For the above, you need an interpreter for the DSL, otherwise it is just a static data file that generates no output at all.

--
Steve

the implementation is very easy: just a program and some ifs. if you need something more complex, just do some reading on tokens etc., but the first choice is enough.

if in py competitions people are solving regex-like questions and word extractions without knowing they are solving compiler theory problems, implementing a DSL is very easy.

Abdur-Rahmaan Janhangeer
https://github.com/Abdur-rahmaanJ
Mauritius
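For concreteness, here is one way the "program and some ifs" for the house-cost file shown earlier in the thread might look. It is a minimal sketch under assumptions the messages do not spell out: `run` is a hypothetical name, `calculate sum` is taken to mean the number of houses times the price per house (which is what the sample output implies), and anything else is treated as a syntax error.

```python
def run(source):
    """Interpret the tiny house-cost language; return the output lines."""
    values = {}
    output = []
    for raw in source.splitlines():
        words = raw.split()
        if not words:
            continue  # skip blank lines
        if words[0] == "house" and len(words) == 3:
            # e.g. "house num 1,000" stores values["num"] = 1000
            values[words[1]] = int(words[2].replace(",", ""))
        elif words == ["calculate", "sum"]:
            total = values["num"] * values["price"]
            # Format with spaces as thousands separators, as in the sample output.
            output.append("$ " + f"{total:,}".replace(",", " "))
        else:
            raise SyntaxError(f"cannot parse line: {raw!r}")
    return output


program = """\
house num 1,000
house price 250,000
calculate sum
"""

print("\n".join(run(program)))  # $ 250 000 000
```

Even at this size the interpreter has to settle questions the informal specification leaves open, such as what happens when `calculate sum` appears before both values are set (here, a `KeyError`).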

wrote a quick article here : https://www.pythonmembers.club/2018/09/03/how-to-create-your-own-dsldomain-s... -- Abdur-Rahmaan Janhangeer https://github.com/abdur-rahmaanj Mauritius

On Thu, Aug 30, 2018 at 8:31 PM James Lu <jamtlu@gmail.com> wrote:
How about something like this?

    with model:
        with Dense() as dense:
            dense.units = 64
            dense.activation = 'relu'
            dense.input_dim = 100
        with Dense() as dense:
            dense.units = 10
            dense.activation = 'softmax'

The `with` creates a context to which subsequent layers are added when created within the context.

But it does suffer from the fact that the scope of the `with` object (or of the underlying stack that would implement this in the model/layers) is not local to the function, so if within the `with` context you call a function that creates a layer, the layer will be added to the caller's context, which would be surprising.

I was working on a similar approach for a Python GUI, and `with` seemed like a very nice candidate, but ran into this issue, which didn't seem to have a clean fix.

I also thought about using a decorator that pre-processes the AST for a GUI description, which for your example would look something like:

    with model:
        with Dense():
            units = 64
            activation = 'relu'
            input_dim = 100
        with Dense():
            units = 10
            activation = 'softmax'

But here the issues are (a) the similarity with local variables may be confusing, and (b) you'd need to either make all `with` statements special, or annotate the `with` statements that are going to be processed by the compiler (e.g. by prefixing the object with a dash).

It seemed messy enough that I'm still pondering this.

Matt

participants (12)
- Abdur-Rahmaan Janhangeer
- David Mertz
- Guido van Rossum
- James Lu
- Jeff Allen
- Jonathan Fine
- Matthew Einhorn
- Michael Selik
- Neil Girdhar
- Paul Moore
- Stefan Behnel
- Steven D'Aprano