Number of languages known [was Re: Python is readable] - somewhat OT

Thu Mar 29 13:48:40 EDT 2012

On Thu, Mar 29, 2012 at 10:03 AM, Chris Angelico <rosuav at gmail.com> wrote:
> On Fri, Mar 30, 2012 at 12:44 AM, Nathan Rice
> <nathan.alexander.rice at gmail.com> wrote:
>> We would be better off if all the time that was spent on learning
>> syntax, memorizing library organization and becoming proficient with
>> new tools was spent learning the mathematics, logic and engineering
>> sciences.  Those solve problems, languages are just representations.
>
> Different languages are good at different things. REXX is an efficient
> text parser and command executor. Pike allows live updates of running
> code. Python promotes rapid development and simplicity. PHP makes it
> easy to add small amounts of scripting to otherwise-static HTML pages.
> C gives you all the power of assembly language with all the
> readability of... assembly language. SQL describes a database request.

Here's a thought experiment.  Imagine that you have a project tree on
your file system which includes files written in many different
programming languages.  Imagine that the files can be assumed to be
contiguous for our purposes, so you could view all the files in the
project as one long chunk of data.  The directory and file names could
be interpreted as statements in this data, analogous to "in the
context of somedirectory" or "in the context of somefile with
sometype".  Any project configuration files could be viewed as
declarative statements about contexts, such as "in xyz context, ignore
those" or "in abc context, any that is actually a this".  Imagine the
compiler or interpreter is actually part of your program (which is
reasonable since it doesn't do anything by itself).  Imagine the build
management tool is also part of your program in pretty much the same
manner.  Imagine that your program actually generates another program
that will generate the program the machine runs.  I hope you can
follow me here, and further I hope you can see that this is a
completely valid description of what is actually going on (from a
different perspective).

In the context of the above thought experiment, it should be clear
that we currently have something that is a structural analog of a
single programming metalanguage (or rather, one per computer
architecture), with many domain specific languages constructed above
that to simplify tasks in various contexts.  The model I previously
proposed is not fantasy, it exists, just not in a form usable by human
beings.  Are machine instructions the richest possible metalanguage?
I really doubt it.

Let's try another thought experiment... Imagine that instead of having
machine instructions as the common metalanguage, we pushed the point
of abstraction closer to something programmers can reasonably work
with: abstract syntax trees.  Imagine all programming languages share
a common abstract syntax tree format, with nodes generated using a
small set of human intelligible semantic primes.  Then, a domain
specific language is basically a context with a set of logical
implications.  By associating a branch of the tree to one (or the
union of several) context, you provide a transformation path to
machine instructions via logical implication.  If implications of a
union context for the nodes in the branch are not compatible, this
manifests elegantly in the form of a logical contradiction.
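
A minimal Python sketch of the idea above, not a description of any
existing system.  The names Context, Node, union and check are
hypothetical, and "implications" are reduced to one string per node
kind.  It only shows how attaching a union of contexts to a branch can
surface an incompatibility as an explicit contradiction.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """A hypothetical language context: what each node kind implies."""
    name: str
    implications: dict[str, str]  # node kind -> implied evaluation rule


@dataclass
class Node:
    """A node in the shared tree format (assumed, heavily simplified)."""
    kind: str
    children: tuple["Node", ...] = ()


class Contradiction(Exception):
    """Raised when a union of contexts implies conflicting rules."""


def union(*contexts: Context) -> dict[str, set[str]]:
    """Merge contexts, keeping every rule implied for each node kind."""
    merged: dict[str, set[str]] = {}
    for ctx in contexts:
        for kind, rule in ctx.implications.items():
            merged.setdefault(kind, set()).add(rule)
    return merged


def check(branch: Node, merged: dict[str, set[str]]) -> None:
    """Walk the branch; fail if any node kind has more than one rule."""
    rules = merged.get(branch.kind, set())
    if len(rules) > 1:
        raise Contradiction(f"{branch.kind}: {sorted(rules)}")
    for child in branch.children:
        check(child, merged)


strict = Context("strict", {"call": "evaluate arguments first"})
lazy = Context("lazy", {"call": "evaluate arguments on demand"})
tree = Node("module", (Node("call"),))

check(tree, union(strict))  # one context: no conflict
try:
    check(tree, union(strict, lazy))  # two rules for "call"
except Contradiction as exc:
    print("contradiction:", exc)
```

A real system would need far richer implications than a single string,
but the failure mode is the same: the conflict is reported at the node
where the two contexts disagree.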

What does pushing the abstraction point that far up provide?  For one,
you can now reason across language boundaries.  A compiler can tell me
if my Prolog code and my Python code will behave properly together.
Another benefit is that you make explicit the fact that your parser,
interpreter, build tools, etc. are actually part of your program, from
the perspective that your program is actually another program that
generates programs in machine instructions.  Unifying your build
chain makes deductive inference spanning steps and tools possible,
and eliminates some needless repetition.  This also greatly simplifies
code reuse, since you only need to generate a syntax tree of the
proper format and associate the correct context to it.  It also
simplifies learning languages, since people only need to understand
the semantic primes in order to read anything.
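
As a rough illustration of the code reuse point, here is a Python
sketch in which two different surface syntaxes produce the same tree,
so one evaluator serves both.  The tuple tree format and the function
names parse_sexpr, parse_rpn and evaluate are assumptions made up for
this example; only "add" and "mul" on integers are handled.

```python
def parse_sexpr(text):
    """Parse a prefix form like '(add 1 (mul 2 3))' into nested tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        tok = tokens[pos]
        if tok == "(":
            op = tokens[pos + 1]
            args = []
            pos += 2
            while tokens[pos] != ")":
                arg, pos = read(pos)
                args.append(arg)
            return (op, *args), pos + 1
        return int(tok), pos + 1

    tree, _ = read(0)
    return tree


def parse_rpn(text):
    """Parse a postfix form like '1 2 3 mul add' into the same tuples."""
    stack = []
    for tok in text.split():
        if tok in ("add", "mul"):
            right = stack.pop()
            left = stack.pop()
            stack.append((tok, left, right))
        else:
            stack.append(int(tok))
    return stack.pop()


def evaluate(tree):
    """One evaluator for the shared tree, whatever syntax produced it."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "add" else a * b


from_sexpr = parse_sexpr("(add 1 (mul 2 3))")
from_rpn = parse_rpn("1 2 3 mul add")
print(from_sexpr == from_rpn)  # True
print(evaluate(from_rpn))      # 7
```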

Of course, this describes Lisp to some degree, so I still need to
provide some answers.  What is wrong with Lisp?  I would say that the
base syntax being horrible is probably the biggest issue.  Beyond
that, transformations on lists of data are natural in Lisp, but graph
transformations are not, making some things awkward.  Additionally,
because Lisp tries to nudge you towards programming in a functional
style, it can be un-intuitive to learn.  Programming is knowledge
representation, and state is a natural concept that many people desire
to model, so making it a second class citizen is a mistake.  If I were
to re-imagine Lisp for this purpose, I would embrace state and an
explicit notion of temporal order.  Rather than pretending it didn't
exist, I would focus on logical and mathematical machinery necessary
to allow powerful deductive reasoning about state.  It is no
coincidence that when a language needs to support formal verification
(such as on microcontrollers and DSPs for mission critical devices) a
synchronous language is the go-to.  On the other side of the spectrum,
Haskell is the darling of functional programmers, but it is one of the
worst languages in existence as far as being able to reason about the
behavior of your program goes.  Ignoring state for a few mathematical
conveniences is the damning mark on the brow of the functional
paradigm.  Functional programming may be better on the whole than
imperative programming, but anyone who doesn't acknowledge that it is
an evolutionary dead-end is delusional.

> You can't merge all of them without making a language that's
> suboptimal at most of those tasks - probably, one that's woeful at all
> of them. I mention SQL because, even if you were to unify all
> programming languages, you'd still need other non-application
> languages to get the job done.
>
> Keep the diversity and let each language focus on what it's best at.

I don't know of any semi-modern programming language that doesn't
generate an abstract syntax tree.  Since any Turing complete language
can emulate any other Turing complete language, there is no reason why
a concise metalanguage for describing nodes of abstract syntax trees
couldn't form the semantic vocabulary for every language in existence
at the AST level.  The syntax could be wildly different, but even then
there is a VERY simple feature of CFGs that helps: they are closed
under union.  The only issue you could run into is if a node with a
given name is represented by two different compositions of semantic
primes at the AST level.  Even this is not a show stopper though,
because you could proceed using a union node.  When converting the
parse tree to an AST, it is likely only one of the two possible nodes
in the union will fulfill all the requirements given its neighboring
nodes and location in the tree.  If there is more than one
incompatible match, then of course you just alert the programmer to
the contradiction and they can modify the tree context.
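
The closure of CFGs under union can be sketched in a few lines of
Python.  The construction is the standard one: rename the nonterminals
of each grammar so they cannot collide, then add a fresh start symbol
with one production per original start symbol.  The dict
representation, the two toy grammars, and the helper names rename,
union and accepts are assumptions for illustration; the recognizer is
a naive backtracking one that assumes no left recursion.

```python
def rename(grammar, start, tag):
    """Prefix every nonterminal with a tag; terminals are left alone."""
    def fix(symbol):
        return f"{tag}.{symbol}" if symbol in grammar else symbol

    renamed = {
        fix(head): [[fix(s) for s in body] for body in bodies]
        for head, bodies in grammar.items()
    }
    return renamed, fix(start)


def union(g1, start1, g2, start2, new_start="S"):
    """Build a grammar for L(g1) | L(g2) with a fresh start symbol."""
    r1, s1 = rename(g1, start1, "A")
    r2, s2 = rename(g2, start2, "B")
    return {new_start: [[s1], [s2]], **r1, **r2}, new_start


def accepts(grammar, symbols, tokens):
    """Naive backtracking recognizer (assumes no left recursion)."""
    if not symbols:
        return not tokens
    head, rest = symbols[0], symbols[1:]
    if head in grammar:
        return any(accepts(grammar, body + rest, tokens)
                   for body in grammar[head])
    return bool(tokens) and tokens[0] == head and accepts(
        grammar, rest, tokens[1:])


# Two toy syntaxes for sums; both happen to call their nonterminal "E".
infix = {"E": [["n", "+", "E"], ["n"]]}
prefix = {"E": [["(", "+", "E", "E", ")"], ["n"]]}

both, start = union(infix, "E", prefix, "E")
print(accepts(both, [start], ["n", "+", "n"]))            # True
print(accepts(both, [start], ["(", "+", "n", "n", ")"]))  # True
print(accepts(both, [start], ["n", "n"]))                 # False
```

Note that the single token "n" is derivable through either branch of
the new start symbol, so the union grammar is ambiguous for it.  That
kind of overlap is where the union node described above would come in.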

I'm all for diversity of language at the level of minor notation and
vocabulary, but to draw an analogy to the real world, English and
Mandarin are redundant, and the fact that they both exist creates a
communication barrier for BILLIONS of people.  That doesn't mean that
biologists shouldn't be able to define words to describe biological
things; if you want to talk about biology you just need to learn the
vocabulary.  That also doesn't mean that mathematicians shouldn't
be able to use notation to structure complex statements; if you want
to do math you need to man up and learn the notation (of course, I
have issues with some mathematical notation, but there is no reason
you should cry about things like set builder).