Python as meta-language

One of my long-standing interests is in mini-languages, particularly declarative languages. I'm always looking out for examples where a declarative language is used to represent some idea or concept that is not easily written in an imperative language. Examples are the behaviors of particle systems, kinematic constraints, formalized grammars, logical inference systems, query languages, and so on. In other words, you are using a language to describe a complex set of relationships, but you aren't giving specific commands to execute in a specific order.

Often such mini-languages are implemented by writing a custom parser. However, this is often not necessary if the underlying language (such as Python) is flexible enough. Python's ability to declare complex structures as literals, combined with its ability to overload operators, means that one can often embed the mini-language within the Python syntax itself, and use the Python compiler as the parser for your mini-language. It also gives you a convenient means to "escape" back into the procedural world when needed.

Examples of the kind of thing I am talking about include the SConstruct file format from SCons and the SQLBuilder syntax from SQLObject. And although it's not directly related to Python, JSON has a lot of the same ideas - that is, using scripting-language source code as an efficient representation of complex data structures. And these are just a few of the many examples out there.

What I'd be interested in doing, on this python-ideas list, is brainstorming some ideas for how we can improve Python's ability to 'host' other kinds of mini-languages within the Python syntax. We can start perhaps by examining some of the use cases I listed in the first paragraph - particle systems, etc. - and see how one would want to represent those kinds of semantic structures within Python.
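To make the "Python compiler as parser" idea concrete, here is a minimal sketch in the spirit of SQLObject's SQLBuilder (this is not SQLObject's actual API; the `Column` and `Expr` names are invented for illustration). Comparison operators build a query description instead of evaluating to a boolean, so plain Python syntax becomes the mini-language:

```python
# A minimal sketch (not SQLObject's real API) of embedding a query
# mini-language in Python: operator overloading turns ordinary
# expressions into SQL fragments, and the Python parser does the work.

class Expr:
    def __init__(self, sql):
        self.sql = sql

    # & and | combine fragments, standing in for SQL's AND / OR.
    def __and__(self, other):
        return Expr(f"({self.sql} AND {other.sql})")

    def __or__(self, other):
        return Expr(f"({self.sql} OR {other.sql})")

class Column:
    def __init__(self, name):
        self.name = name

    # Comparisons return Expr objects rather than booleans.
    def __gt__(self, other):
        return Expr(f"{self.name} > {other!r}")

    def __eq__(self, other):
        return Expr(f"{self.name} = {other!r}")

age, name = Column("age"), Column("name")
query = (age > 21) & (name == "Guido")
print(query.sql)   # (age > 21 AND name = 'Guido')
```

The "escape back into the procedural world" is free here: any fragment can be computed, stored, or combined with ordinary Python control flow before being rendered.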
Of course, there are some languages (Lisp and Dylan come to mind) which are even more flexible in this regard - the languages can be 'morphed' out of all recognition to the original syntax. (For example, in Dylan, a 'macro' was not a simple textual substitution as in C, but in fact added new production rules to the parser.) I'm in no way advocating such a course. (Well, at least not at this moment :) ) I certainly recognize that there is a danger in making a language too 'plastic', in that it can easily be obfuscated with too much cleverness and lose its identity. So I'm more interested in ideas that are subtle yet powerful.

I'm sure that there are lots of approaches to this general concept. I'm going to throw out a couple of ideas, but I am going to post them as separate replies to this email, and not right away - the reason is, I don't want this thread to be taken over by the discussion / criticism of those specific ideas, when I'm more interested in brainstorming the general concept. -- Talin

Talin wrote:
This sounds like a wide range of things - maybe even a bit too broad. Can you narrow it down a bit? The first thing that comes to mind for me is Visual's modeling system, which has an internal C loop for handling the defined objects, motions, colors, lighting, and screen updates, but also allows modifying and querying the models from a Python loop while everything is moving around. In this case, extending the background-processing vs. foreground-processing split would be good. When dealing with real time (or 1/nth time for slower computers), the preferred ordering would give priority to the faster-changing objects: those values should be updated or checked more frequently. So it's not a procedural problem where doing things in a predefined order is the best way.
Another (possibly related?) idea I think might be useful is to incorporate the concepts of responsibility, authorization, and delegation into the language more directly. In a (well-run) business, responsibility and authority are delegated in a way that gives the employee the ability to succeed, while also protecting the interests of the business owner(s) by not giving too much power to any one employee. These are good business practices, and I think they could be useful concepts if implemented in a computer language in a more formal way. Cheers, Ron Adam

Ron Adam wrote:
I wanted to expand on this a bit since I do think it's related to several discussions that have taken place on the python-3000 list. In general, I get a feeling that there is an effort to reach for that next thing. So far that next thing has centered around efforts to extend generators, the with statement, generic functions, and improved threading/multiprocessing. And now your desire for mini-language support could be included in that.

So how are these all related, and is there a common theme that can organize these ideas in a way that is not brain-exploding, as one respected individual hereabouts would put it? I think so. I mentioned above the business concepts of responsibility, authorization, and delegation. But without describing how those concepts would relate to computer languages in general, and to Python in particular, it probably doesn't mean a whole lot. So I'll try to give an outline of how this might work.

Most businesses are pretty boring in that there are no (or few) surprises. Everything is planned out. There is usually a manual of procedures to follow that is used to train new workers. Those procedures are usually based on sound and well-tried business practices. The biggest problem most businesses have is hiring dependable workers to actually do the work. In most cases workers are expected to do more than one job. Also, there are times or situations where workers join together to do one larger job. These are also relationships that would be good to emulate in software. So let's look at some of these relationships and how they might be organized in a software program. A brief outline to help picture things:

- Data base - the inventory
- Skill base - the procedures (a collection of routines)
- Knowledge base - how to combine skills and data to achieve desired results
Worker objects:

- Responsibility - the objectives the worker must achieve
- Authorization - limits on what data, skills, and knowledge can be used
- Delegation - the assignment of responsibility and authority to a worker

A skill base might just be another name for generic functions. It's really just a collection of routines to do standardized things to data. These could be tests to check quality or values, or routines to combine, split, or alter a unit of data. A knowledge base is a bit more: it's the instructions for how to combine skills to achieve specific objectives.

Workers would be something new. They would be sort of like an object, but instead of having limited skills (methods) in them, they might have responsibilities and authorizations. Each responsibility would represent a different activity which would be defined by the knowledge base. And authorizations would specify what knowledge and skills can be used. This would be needed to ensure a worker object doesn't get used in an abusive way. For example, if you have a worker object assigned to processing input, you may want to limit this particular worker to that one activity so there is little chance it can be hacked and used for other things. In other, less critical places you may have a group (pool) of worker objects that can fill multiple needs.

These worker objects could be run on different threads. Groups of workers might be run on different CPUs. Possibly, this would be something that is determined by the interpreter (manager) and not something the programmer has to think about. These are just seeds of ideas right now, but I think there is a lot of potential in going in this direction. I would place this in between artificial intelligence and object-oriented programming. This would be a higher level of abstraction, with the goal of using both software and the underlying hardware more efficiently.
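As a purely speculative sketch of this worker idea (the `Worker`, `perform`, and `NotAuthorized` names are invented here, not an existing API), authorization can be modeled as a whitelist of skill names that delegation grants and every call checks:

```python
# Speculative sketch of Ron's worker concept: a worker may only use
# the skills it has been explicitly delegated authority over.
# All names here are invented for illustration.

class NotAuthorized(Exception):
    pass

class Worker:
    def __init__(self, name, skills, authorized):
        self.name = name
        self.skills = skills               # skill base: name -> callable
        self.authorized = set(authorized)  # delegated authority

    def perform(self, skill_name, *args):
        # Responsibility without authorization fails loudly.
        if skill_name not in self.authorized:
            raise NotAuthorized(f"{self.name} may not {skill_name}")
        return self.skills[skill_name](*args)

skills = {"clean": str.strip, "shout": str.upper}

# This worker handles input processing and is limited to that one skill,
# mirroring the 'little chance it can be hacked' argument above.
intake = Worker("intake", skills, authorized=["clean"])

print(intake.perform("clean", "  hello  "))   # 'hello'
try:
    intake.perform("shout", "hello")          # not delegated -> refused
except NotAuthorized as err:
    print(err)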
How this translates to different programming models depends on how you distribute and organize the work between worker objects. A series of workers, each with a single responsibility, would be a procedural structure (an assembly line). A group of workers with many shared responsibilities would be a parallel (or as close as the hardware allows) structure. These would be used to simulate models with many objects and specifications. OK, it's late and I'm starting to lose my focus. Hopefully you can see how this might be related to either emulating mini-languages, or how it might be a mini-language in itself. And also, hopefully there aren't too many typos or mistakes due to me being tired. Cheers, Ron

Ron Adam wrote:
<snip> Two places I think you might want to look for ideas. The first is the operating system EROS, which is a 'capability-based' operating system. Every object in the system has an explicit set of 'capabilities' (i.e. authorizations) that are enforced at the kernel level. For the second, google for the words "agoric systems". The notion here is that a software system is like an economic system. Agorics takes this further by having various software modules compete against each other in a marketplace. For example, if you want to sort a list, then various agents within the system (representing different algorithms with different time/space trade-offs) 'bid' on the job; the winner (which may vary depending on the current system constraints) gets 'paid' from the budget of the caller.

Also, since you mentioned concurrency: I recently saw a presentation by a fellow from Intel (I don't remember his name, but apparently he's also on the C++ standards committee), who was basically saying that from now on, for the rest of our lives, concurrency will be the only way to improve performance. "Moore's Law" (in the sense of how many transistors can be stuffed on a chip) will continue for the foreseeable future, but we've pretty much reached the limit in terms of how much performance we can squeeze out of a single scalar processor: clock speeds won't get that much higher (certainly not orders of magnitude), and any further pipelining is just going to increase memory latency. On the other hand, there's no practical barrier to having a single chip with, say, 128 hyperthreaded cores on it within the next 10 years or so. (I'm not sure I entirely buy this - there may be some way around it by rethinking the way memory is accessed.) Anyway, his basic message was this: there will come a time when, if your app is only single-threaded, you'll effectively be using 1/128th of the machine's power. And this will be true for the rest of your life.
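The agoric bidding idea fits in a few lines of Python. This is a toy illustration only: the two "agents" share one implementation (`sorted`) and their cost models are made up, but it shows the auction shape - each agent bids an estimated cost for the job, and the cheapest bid wins:

```python
# Toy 'agoric' auction: candidate sort agents bid an estimated cost
# for sorting n items; the cheapest bidder wins the job.
# The cost formulas are invented for illustration.

import math

def bid_insertion(n):
    return n * n                            # O(n^2): cheap for tiny inputs

def bid_merge(n):
    return 32 * n * math.log2(max(n, 2))    # O(n log n) plus fixed overhead

def run_auction(data):
    n = len(data)
    agents = [("insertion", bid_insertion), ("merge", bid_merge)]
    # The winner varies with input size, i.e. 'current system constraints'.
    winner = min(agents, key=lambda agent: agent[1](n))
    return winner[0], sorted(data)          # both agents delegate to sorted()

name, result = run_auction([3, 1, 2])
print(name, result)    # insertion wins for n=3
```

With a large list the merge agent underbids instead, so the caller never has to choose an algorithm explicitly - the marketplace does.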
He also mentioned that until we get 'transactional memory' (i.e. the ability to do low-level operations that have commit/rollback semantics), multi-threaded programming is always going to suck. ...but that's a whole 'nother topic. -- Talin
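Commit/rollback semantics can be illustrated in miniature without any real concurrency machinery. The sketch below (all names invented, no locking or conflict detection, so it is not actual transactional memory) buffers writes in a private log and applies them to shared state only if the block completes without an exception:

```python
# Minimal illustration of commit/rollback semantics: writes go to a
# private log and reach the shared store only on successful exit.
# No concurrency control is attempted - this only shows the semantics.

class Transaction:
    def __init__(self, store):
        self.store = store
        self.log = {}                 # uncommitted writes

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.store.update(self.log)   # commit atomically-in-spirit
        # On exception the log is simply discarded: automatic rollback.
        return False

    def __setitem__(self, key, value):
        self.log[key] = value

account = {"balance": 100}

with Transaction(account) as t:
    t["balance"] = 150                # commits

try:
    with Transaction(account) as t:
        t["balance"] = 0
        raise RuntimeError("oops")    # rolls back the write above
except RuntimeError:
    pass

print(account["balance"])   # 150
```

Real transactional memory additionally has to detect conflicting concurrent transactions and retry, which is exactly the hard part this sketch leaves out.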

Talin <talin@acm.org> wrote:
If you haven't already seen it, you should check out Logix: http://www.livelogix.net/logix/ - Josiah

Josiah Carlson wrote:
I think I looked at it before, but I'll have another look. From what I can tell, there are some interesting ideas here, but there are also some things that are kind of kludged together. I'm not sure that I would want Python to be quite as malleable as Logix. The reason for this is that it's hard to reason about a language when the language is so changeable. Of course, this argument shouldn't be taken too strongly - I don't mean to condemn programmable syntax in general, I just don't want to take it too far. I think what would make sense is to identify some of the most common use cases in Logix, and see if those specific use cases fit within the Python model. The example in the docs shows how to define an 'isa' operator, and I certainly think that Python readability would be improved by having something similar. (I've argued this before, so I don't expect to get much traction on this.) But I wouldn't go so far as to allow you to define brand new operators of arbitrary precedence.

Similarly, I think that it wouldn't hurt to expand the suite of operators that are built into the compiler. Like most languages, Python's compiler supports only those operators which are defined for built-in types. Thus, we have the math operators +, -, * and /, because we have built-in numeric types which support those basic operations. However, mathematics defines many different kinds of operators, many of which operate on entities other than just scalars. Examples include cross-product and dot-product. Most languages don't define operators for cross-product and dot-product, because they don't define matrices as a built-in type. This gets inconvenient when you are doing things like, say, 3D graphics programming, where the types that you are operating on are vectors and matrices, and writing the code using operators allows for more concise and readable code than having to do everything using function calls.
(There's an inherent readability advantage, IMHO, to being able to take something right out of a math textbook and type it directly as a line of code.) Of course, anything that's done with an operator can also be done with a function call, but from a readability standpoint, there are times when operators make a lot of sense. After all, do you really want to write 'add(multiply(a, 1), 2)'? Given the ability to overload operators and define their meaning, why should we limit the set of operators recognized by the compiler to only those that make sense on built-in types? I would say that while it certainly does make sense to give operators a standard *meaning*, there's no reason why operators have to have a standard *implementation*. In other words - unlike Logix - I want to be able to look at a given operator and always know what it means, just as I can look at the symbol '+' and know that it is somehow related to the concept of 'addition'. The same would be true for any new operators.

In the case of mathematics and other languages that make heavy use of symbols, there's the additional problem of rendering those symbols into some combination of ASCII characters. To see what I mean, have a look at this chart of math symbols as a starting point: http://en.wikipedia.org/wiki/Table_of_mathematical_symbols. Some of these symbols would be fairly difficult to represent in ASCII, others would be pretty easy. For example, the characters '=>' could be used to represent the logical 'implies' symbol. A less trivial example would be the "::=" operator that is used in many formal grammars. I'd call this the "becomes" operator - thus, the expression "a ::= b" might be spoken as "a becomes b" (or is it the other way around?). A corresponding __becomes__ function would allow the operator to be implemented for certain types, although I haven't really worked out what the calling protocol would be.
Such an operator could be used in more than just parser generators, however; ideally, it ought to be usable as an overloadable assignment operator, or in any situation where you want to express the concept 'a is defined as b'. -- Talin
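The vector/matrix inconvenience described above is easy to demonstrate. Lacking dedicated dot- and cross-product operators, a library has to repurpose the operators Python already has - here `*` is pressed into service as dot product and `%` (arbitrarily) as cross product, which is exactly the kind of compromise new compiler-recognized operators would avoid:

```python
# Illustrating the operator shortage: Python has no dot-product or
# cross-product operator, so this sketch repurposes * and %.
# The Vec3 class is invented for illustration.

class Vec3:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __mul__(self, o):       # dot product
        return self.x * o.x + self.y * o.y + self.z * o.z

    def __mod__(self, o):       # cross product (an abuse of %)
        return Vec3(self.y * o.z - self.z * o.y,
                    self.z * o.x - self.x * o.z,
                    self.x * o.y - self.y * o.x)

    def __repr__(self):
        return f"Vec3({self.x}, {self.y}, {self.z})"

i, j = Vec3(1, 0, 0), Vec3(0, 1, 0)
print(i * j)    # 0: orthogonal unit vectors
print(i % j)    # Vec3(0, 0, 1): the z axis
```

The math is readable once you know the convention, but the convention itself is arbitrary - another library might pick `^` for cross product, which is the "standard meaning vs. standard implementation" problem in a nutshell.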

Talin <talin@acm.org> wrote:
I should have said a bit more. Basically, my idea in suggesting that you take a look at Logix was a veiled attempt at saying "I think the base Python language syntax is fine". While Logix does go to the extreme of making Python syntax far more malleable than is generally desired (in my opinion), there are cases where conservative use does make life easier. Say, for example, if one preferred C-style conditional syntax to Python's conditional syntax.
The ultimate question is, "what is too far?" [snip math operation discussion] While it would be convenient to define all standard mathematical operations, I don't believe that Python is the language for it. Python can be used as a language to do mathematics, but it's not a computer algebra system, and I believe the vast majority of mathematical operations should be limited to computer algebra systems - say, for example, Sage. Offering "production rules" à la ::=, while being an interesting idea, can be represented with current syntax, if one uses certain "tricks" (AB for concatenating A and B, | for offering alternatives, * for repetition, etc., and the use of a custom namespace for name resolution):

A = B | C | D.F | E

Anyways. Come up with a set of operators. I don't know if consensus is where we should go with this, but the existence of Logix, in my opinion, puts alternate/additional operators within the realm of extension modules, and operators need to show that they have been (or would be) used significantly within the Python user community (like the requirements for standard library module additions). So far, aside from matrix multiplication (which can be defined as * for all non-single-vector multiplications unambiguously), I've not heard any *compelling* cases for operators. Also, I'm sort of a curmudgeonly jerk (I like my Python as it is), so while you perhaps shouldn't take me *too* seriously when I say "Use Logix", maybe you should take my advice to find a set of *useful* operators, and specify *why* they would be useful (hopefully in a general sense). - Josiah
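The grammar "tricks" Josiah describes can be sketched with a handful of classes (the `Rule`/`Term`/`Alt`/`Seq` names are invented for illustration, and `+` stands in for the juxtaposition he writes as AB). Overloading `|` for alternatives and `+` for sequencing lets the Python parser build the rule tree:

```python
# Sketch of grammar rules embedded in Python syntax: | gives
# alternatives and + gives sequencing. The class names are
# invented for illustration; no parsing is implemented, only
# construction and display of the rule tree.

class Rule:
    def __or__(self, other):        # A | B : alternatives
        return Alt(self, other)

    def __add__(self, other):       # A + B : sequence (BNF's "A B")
        return Seq(self, other)

class Term(Rule):
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text

class Alt(Rule):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return f"({self.left} | {self.right})"

class Seq(Rule):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return f"{self.left} {self.right}"

B, C, E = Term("B"), Term("C"), Term("E")
A = B | C | (B + E) | E     # roughly 'A ::= B | C | B E | E' in BNF
print(A)
```

This is essentially how parser-combinator libraries such as pyparsing expose grammars, and it needs no new syntax at all - which is the heart of Josiah's argument that ::= buys little over what overloading already provides.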

Talin wrote:
This sounds like a wide range of things. Maybe even a bit too broad. Can you narrow it down a bit. The first thing that comes to mind for me is Visual's modeling system with both an internal C loop for handling the defined objects, motions, colors, lighting and updating the screen, but it also allows modifying and querying the models while everything is moving around in a python loop. In this case extending the background processing vs foreground processing would be good. When dealing with real time, or 1/nth time for slower computers, the preferred ordering would be given the faster changing objects. Those values should be updated or checked more frequently. So it's not a procedural problem where doing things in a predefined order is the best way.
Another (possibly related?) idea I think might be useful is to incorporate the concept of responsibility, authorization, delegation into the language more directly. In a (well run) business responsibility and authority are delegated in a way that gives the employee the ability to succeed, while also protecting the interests of the business owners(s) by not giving too much power to any one employee. These are good business practices and I think they could be useful concepts if implemented in a computer language in a more formal way. Cheers, Ron Adam

Ron Adam wrote:
I wanted to expand on this a bit since I do think it's related to several discussions that have taken place on the python-3000 list. In general, I get a feeling that there is an effort to reach for that next thing. So far that next thing has centered around efforts to extend generators, the with statement, generic functions, and improved threading/multiprocessing. And now your desire for mini-language support could be included in that. So how are these all related and is there a common theme that can organize these ideas in a way that is not brain exploding, as one respected individual here abouts would put it. I think so. I mentioned above the business concepts of responsibility, authorization and delegation. But without describing how those concept would relate to a computer languages in general and also python, it probably doesn't mean a whole lot. So I'll try to give an outline of how this might work. Most business's are pretty boring in that there are no (or few) surprises. Everything is planned out. There is usually a manual of procedures to follow that is used to train new workers. Those procedures are usually based on sound and well tried business practices. The biggest problem most business's have is hiring dependable workers to actually do the work. In most cases workers are expected to do more than one job. Also there are times or situations where workers join together to do one larger job. These are also relationships that would be good to emulate in software. So lets look at some of these relationships and how they might be organized in a software program. A brief outline to help picture things: Data base - The inventory Skill base - The procedures, (a collection of routines) Knowledge base - How to combine skills and data to achieve desired results. 
Worker objects: Responsibility - the objectives the worker must achieve Authorization - limits of what data, skills, and knowledge, can be used Delegation - The assignment of responsibility and authority to a worker. A Skill base might just be another name for generic functions. It's really just a collection of routines to do standardized things to data. These could be tests to check quality or values, or routines to combine, split or alter a unit of data. A knowledge base is a bit more, it's the instructions of how to combine skills to achieve specific objectives. Workers would be something new. They would be sort of be like an object, but instead of having limited skills (methods) in them, they might have responsibilities and authorizations. Each responsibility would represent a different activity which would be defined by the knowledge base. And authorizations would specify what knowledge and skills can be used. This would be needed to insure a worker object doesn't get used in an abusive way. For example if you have a worker object assigned to processing input you may want to limit this particular worker to this one activity so there is little chance it can be hacked and used for other things. In other less critical places you may have a group (pool) of worker objects that can fill multiple needs. These worker objects could be run on different threads. Groups of workers might be run on different CPU's. Possibly, this would be something that is determined by the interpreter (manager) and not something the programmer has to think about. These are just seeds of ideas right now, but I think there is a lot of potential in going in this direction. I would place this in between artificial intelligence and objective programming. This would be a higher level of abstraction with the goal of using both software and the underlying hardware more efficiently. 
How this translates to different programming models depends on how you distribute and organize the work between worker objects. A series of worker with each a single responsibility would be a procedural structure. (an assembly line) A group of workers with many shared responsibilities would be a parallel (or as close as the hardware allows) structure. These would be used to simulate models with many object and specifications. Ok it's late and I'm starting to loose my focus. Hopefully you can see how this might be related to either emulating mini-languages, or how it might be a mini-language in it self. And also hopefully there isn't too many typos or mistakes due to me being tired. Cheers, Ron

Ron Adam wrote:
<snip> Two places I think you might want to look for ideas. The first is the operating system EROS, which is a 'capability-based' operating system. Every object in the system has an explicit set of 'capabilities' (i.e. authorizations) that are enforced at the kernel level. For the second, google for the words "agoric systems". The notion here is that a software system is like an economic system. Agorics takes this further by having various software modules compete against each other in a marketplace, so for example if you want to sort a list, then various agents within the system (representing different algorithms with different time/space trade offs) 'bid' on the job; The winner (which may vary depending on the current system constraints) gets 'paid' from the budget of the caller. Also, since you mentioned concurrency: I recently saw a presentation by a fellow from Intel (I don't remember his name, but apparently he's also on the C++ standards committee), who was basically saying that from now, until the rest of our lives, concurrency will be the only way to improve performance - that "Moore's Law" (in the sense of how many transistors can be stuffed on a chip) will continue for the foreseeable future, but we're pretty much reached the limit in terms of how much performance we can squeeze out of a single scalar processor; Clock speeds won't get that much higher (certainly not orders of magnitude), and any further pipelining is just going to increase memory latency. One the other hand, there's no practical barrier to having a single chip with, say, 128 hyperthread cores on it, within the next 10 years or so. (I'm not sure I entirely buy this - there may be some way around this by rethinking the way memory is accessed.) Anyway, his basic message was this: There will come a time when, if your app is only single-threaded, that you'll effectively be using 1/128th of the machine's power. And this will be true for the rest of your life. 
He also mentioned that until we get 'transactional memory' (i.e. the ability to do low-level operations that have commit / rollback semantics) that multi threaded programming is always going to suck. ...but that's a whole 'nother topic. -- Talin

Talin <talin@acm.org> wrote:
If you haven't already seen it, you should check out Logix: http://www.livelogix.net/logix/ - Josiah

Josiah Carlson wrote:
I think I looked at it before, but I'll have another look. From what I can tell, there's some interesting ideas here, but there are also some things that are kind of kludged together. I'm not sure that I would want Python to be quite so malleable as Logix. The reason for this is that it's hard to reason about a language when the language is so changeable. Of course, this argument shouldn't be taken too strongly - I don't mean to condemn programmable syntax in general, I just don't want to take it too far. I think what would make sense is to identify some of the most common use cases in Logix, and see if those specific use cases fit within the Python model. The example in the docs shows how to define an 'isa' operator, and I certainly think that Python readability would be improved by having something similar. (I've argued this before, so I don't expect to get much traction on this.) But I wouldn't go so far as to allow you to define brand new operators of arbitrary precedence. Similarly, I think that it wouldn't hurt to expand the suite of operators that are built into the compiler. Like most languages, Python's compiler supports only those operators which are defined for built-in types. Thus, we have the math operators +, -, * and /, because we have built-in numeric types which support those basic operations. However, mathematics defines many different kinds of operators, many of which operate on other kinds of entities than just scalars. Examples include cross-product and dot-product. Most languages don't define operators for cross-product and dot-product, because they don't define matrices as a built-in type. This gets inconvenient when you are doing things like, say, 3D graphics programming, where they types that you are operating on are vectors and matrices, and writing the code using operators allows for more concise and readable code than having to do everything using function calls. 
(There's an inherent readability advantage IMHO to being able to take something right out of a math textbook and type it directly as a line of code.) Of course, anything that's done with an operator can also be done with a function call, but from a readability standpoint, there are times when operators make a lot of sense. After all, do you really want to write 'add( multiply( a, 1 ), 2 )'? Given the ability to overload operators and define their meaning, why should we limit the set of operators recognized by the compiler to only those that make sense on built-in types? I would say that while it certainly does make sense to give operators a standard *meaning*, there's no reason why operators have to have a standard *implementation*. In other words - unlike Logix, I want to be able to look at a given operator and always know what it means, just as I can look at the symbol '+' and know that it is somehow related to the concept of 'addition'. The same would be true for any new operators. In the case of mathematics and other languages that make heavy use of symbols, there's the additional problem of rendering those symbols into some combination of ASCII characters. To see what I mean, have a look at this chart of math symbols as a starting point: http://en.wikipedia.org/wiki/Table_of_mathematical_symbols. Some of these symbols would be fairly difficult to represent in ASCII, others would be pretty easy. For example, the characters '=>' could be used to represent the logical 'implies' symbol. A less trivial example would be the "::=" operator that is used in many formal grammars. I'd call this the "becomes" operator - thus, the expression "a ::= b" might be spoken as "a becomes b" (or is it the other way around?). A corresponding __becomes__ function would allow the operator to be implemented for certain types, although I haven't really worked out what the calling protocol would be. 
Such an operator could be used in more than just parser generators however; Ideally, it ought to be usable as an overloadable assignment operator, or any situation where you want to express the concept 'a is defined as b'. -- Talin

Talin <talin@acm.org> wrote:
I should have said a bit more. Basically, my idea of suggesting that you take a look at Logix was a veiled attempt at saying "I think the base Python language syntax is fine". While Logix does go to the extreme of making Python syntax far more maleable than is generally desired (in my opinion), there are cases where conservative use does make life easier. Say, for example, if one preferred C-style conditional syntax to Python's conditional syntax.
The ultimate question is, "what is too far?" [snip math operation discussion] While it would be convenient to define all standard mathematic operations, I don't believe that Python is the language for it. Python can be used as a language to do mathematics, it's not a computer algebra system, and I believe the vast majority of mathematical operations should be limited to computer algebra systems. Say, for example, Sage. Offering "production rules" a'la ::=, wile being an interesting idea, can be represented with current syntax, if one uses certain "tricks" (AB for concatenating A and B, | for offering alternatives, * for repetition, etc., and the use of a custom namespace for name resolution): A = B | C | D.F | E Anyways. Come up with a set of operators. I don't know if consensus is where we should go with this, but the existence of Logix, in my opinion, puts the realm of alternate/additional operators within the realm of extension modules, and operators need to show that they have (or would be used) significantly within the Python user community (like the requirements for standard library module additions). So far, aside from matrix multiplication (which can be defined as * for all non-single vector multiplications unambiguously), I've not heard any *compelling* cases for operators. Also, I'm sort of a curmudgeony jerk (I like my Python as it is), so while you perhaps shouldn't take me *too* seriously when I say "Use Logix", maybe you should take my advice to find a set of *useful* operators, and specify *why* they would be useful (hopefully in a general sense). - Josiah
participants (3)
- Josiah Carlson
- Ron Adam
- Talin