Thoughts on programming language notations

Some posts ago, we looked at what is required to create a new programming language. In this post we go a little deeper, looking for ways to express meaning as naturally and flexibly as we do in a natural language.




First, let's begin by considering the sentence: "George loves Maria".

We have three components:
  1. "George"
    •    is the subject doing the action
  2. "loves"
    •    is the action itself
  3. "Maria"
    •    is the subject on which the action is being done

From a programming-language perspective, subjects are equivalent to objects and actions are equivalent to methods.

object1 := George
object2 := Maria
method  := loves

In an object-oriented programming language, the sentence can be expressed as:

  George.loves(Maria)

In general, we have:

  object1.method(object2)
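
As a minimal sketch of this mapping in an existing language (Python here), we can model the sentence with a small class. The `Person` class and its `name` attribute are assumptions made only for illustration; they are not part of the notation discussed in this post.

```python
class Person:
    """Hypothetical class standing in for a subject of the sentence."""

    def __init__(self, name):
        self.name = name

    def loves(self, other):
        # The action, performed by `self` on `other`.
        return f"{self.name} loves {other.name}"


george = Person("George")
maria = Person("Maria")

# object1.method(object2)
sentence = george.loves(maria)

# The same call in its general form, looking the method up by name.
generic = getattr(george, "loves")(maria)
```

Both `sentence` and `generic` hold the string "George loves Maria". Note that the word order is fixed: the object always comes first, then the method, then the argument.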

In a natural language, like Romanian, we can express the same meaning in six different ways:
  • O iubește, George, pe Maria.
  • O iubește, pe Maria, George.
  • George o iubește pe Maria.
  • Pe Maria o iubește George.
  • George, pe Maria o iubește.
  • Pe Maria, George o iubește.
(The first two sentences are usually used when asking a question.)

Building this natural-language feature into a programming language is an interesting task. We can start by looking for patterns in the six sentences above.
  • "o" always precedes the action "iubește"
  • "pe" always precedes the subject "Maria"
  • "," appears as a separator between the subjects "Maria" and "George"

In a hypothetical programming language, we can define these rules as follows:
  • "o ..." as "<...>"
  • "pe ..." as "[...]"
  • "{...}" for the self-object(s)
  • ":" as separator between object-lists (optional)
Using these simple rules, we can express the same meaning in six different ways:
  • <method> {object1} : [object2]
  • <method> [object2] : {object1}
  • {object1} <method> [object2]
  • [object2] <method> {object1}
  • {object1} : [object2] <method>
  • [object2] : {object1} <method>
The first two expressions use prefix notation, the middle two use infix notation, and the last two use postfix notation.
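
To check that the six orderings really carry the same information, here is a rough Python sketch of a reader for this notation. It only assumes the three bracket rules defined above; the function name `parse` and the returned tuple shape are hypothetical choices, and the sketch deliberately handles a single argument-list per expression.

```python
import re

# One alternative per bracket kind: <method>, {self-objects}, [arguments].
# The ":" separator carries no information of its own, so it is skipped.
TOKEN = re.compile(
    r"<(?P<method>[^<>]+)>|\{(?P<selves>[^{}]*)\}|\[(?P<args>[^\[\]]*)\]"
)


def split_list(text):
    """Split a comma-separated list, dropping empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse(expr):
    """Return (method, selves, args) regardless of the order of the parts."""
    method, selves, args = None, [], []
    for match in TOKEN.finditer(expr):
        if match.group("method") is not None:
            method = match.group("method").strip()
        elif match.group("selves") is not None:
            selves = split_list(match.group("selves"))
        else:
            args = split_list(match.group("args"))
    if method is None or not selves:
        raise ValueError("an expression needs a method and a self-object")
    return method, selves, args


forms = [
    "<method> {object1} : [object2]",
    "<method> [object2] : {object1}",
    "{object1} <method> [object2]",
    "[object2] <method> {object1}",
    "{object1} : [object2] <method>",
    "[object2] : {object1} <method>",
]
results = [parse(form) for form in forms]
```

Every entry in `results` is the same tuple, `("method", ["object1"], ["object2"])`, because the brackets identify each part by their shape and not by their position.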

In representing the expression:

  A.method(B, C)

we introduce a new operator ("&"), which zips two argument-lists into a single object:
  • <method> {A} : [B] & [C]
  • <method> [B] & [C] : {A}
  • {A} <method> [B] & [C]
  • [B] & [C] <method> {A}
  • {A} : [B] & [C] <method>
  • [B] & [C] : {A} <method>
We can generalize this concept by allowing multiple expressions inside the "[...]" argument-list, as well as inside the "{...}" self-list.

For example, the following four expressions:

  A.method(X)
  A.method(Y)
  B.method(X)
  B.method(Y)

...are equivalent to the following expression:

  <method> {A,B} : [X,Y]

...where the complete list of notations is:
  • <method> {A,B} : [X,Y]
  • <method> [X,Y] : {A,B}
  • {A,B} <method> [X,Y]
  • [X,Y] <method> {A,B}
  • {A,B} : [X,Y] <method>
  • [X,Y] : {A,B} <method>
Combining the Cartesian product with the "&" zip-operator, the meaning of:

  <method> {A,B} : [W,X] & [Y,Z]

...is equivalent to the following four method calls in a classical programming language:

  A.method(W, Y)
  A.method(X, Z)
  B.method(W, Y)
  B.method(X, Z)
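
The expansion above can be sketched in Python with `zip` and `itertools.product`. The function name `expand` and the choice to return the calls as strings are assumptions made for illustration; a real implementation would dispatch the calls instead of printing them.

```python
from itertools import product


def expand(method, selves, arg_lists):
    """Expand one expression into classical method calls (as strings).

    selves    -- the names inside "{...}"
    arg_lists -- one list per "[...]" group; the groups are joined by "&"
    """
    if arg_lists:
        # "&" zips the groups position by position:
        # [W,X] & [Y,Z] -> (W, Y), (X, Z).
        # zip() stops at the shortest group (an assumption of this sketch).
        zipped = list(zip(*arg_lists))
    else:
        # No arguments at all: a single empty argument tuple.
        zipped = [()]
    # Cartesian product of the self-objects with the zipped argument tuples.
    return [
        f"{obj}.{method}({', '.join(args)})"
        for obj, args in product(selves, zipped)
    ]


calls = expand("method", ["A", "B"], [["W", "X"], ["Y", "Z"]])
# calls == ["A.method(W, Y)", "A.method(X, Z)",
#           "B.method(W, Y)", "B.method(X, Z)"]
```

The same function covers the earlier cases: `expand("method", ["A"], [["B"], ["C"]])` gives the single call `A.method(B, C)`, and a single group `[["X", "Y"]]` with two self-objects gives the four one-argument calls.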

The basic grammar rules for our language are the following: each expression must have a list of self-objects (a list of expressions), a method-call and an optional list of method-arguments (also a list of expressions).
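
One possible way to capture these rules is a small Python data class that validates the required parts and writes out every notation. This is only a sketch: the `Expression` class, its field names, and the simplification of treating each expression as a plain name are all assumptions, not part of the language itself.

```python
from dataclasses import dataclass, field


@dataclass
class Expression:
    selves: list                              # required list of self-objects
    method: str                               # required method-call
    args: list = field(default_factory=list)  # optional "[...]" groups

    def __post_init__(self):
        if not self.selves:
            raise ValueError("at least one self-object is required")
        if not self.method:
            raise ValueError("a method is required")

    def notations(self):
        """Return every way of writing this expression."""
        s = "{" + ",".join(self.selves) + "}"
        m = f"<{self.method}>"
        if not self.args:
            # Without arguments, ":" and "[...]" are omitted.
            return [f"{s} {m}", f"{m} {s}"]
        # Multiple argument groups are joined with the "&" zip-operator.
        a = " & ".join("[" + ",".join(group) + "]" for group in self.args)
        return [
            f"{m} {s} : {a}",  # prefix
            f"{m} {a} : {s}",  # prefix
            f"{s} {m} {a}",    # infix
            f"{a} {m} {s}",    # infix
            f"{s} : {a} {m}",  # postfix
            f"{a} : {s} {m}",  # postfix
        ]


forms = Expression(["A", "B"], "method", [["X", "Y"]]).notations()
# forms[0] == "<method> {A,B} : [X,Y]"
```

Representing the argument-list as a list of groups keeps the "&" operator explicit, at the cost of a slightly more verbose constructor call.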

# Extra

Let's explore what a real expression might look like in this hypothetical programming language. Below we have the division of two expressions, written in standard mathematical notation:

  a / b

Using the rules that we defined above, we can express the same meaning in six different ways:
  • </> {a} : [b]
  • </> [b] : {a}
  • {a} </> [b]
  • [b] </> {a}
  • {a} : [b] </>
  • [b] : {a} </>
When invoking a method without arguments, we can omit the ":" and the "[...]" argument-list.

For example:

  person.walk()

can be represented in our language in two distinct ways:

  {person} <walk>
  <walk> {person}

We can also represent more than one self-object doing the same action at the same time. For example, below we have two self-objects (a and b), both divided by the same object (c):

  a / c
  b / c

...which we can represent (using the first kind of prefix notation) as:

  </> {a,b} : [c]

...where the complete list of notations is:
  • </> {a,b} : [c]
  • </> [c] : {a,b}
  • {a,b} </> [c]
  • [c] </> {a,b}
  • {a,b} : [c] </>
  • [c] : {a,b} </>
Additionally, the following two expressions:

  a / b
  a / c

...can be represented as:

  </> {a} : [b,c]
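
To see these division expressions produce actual values, here is a small Python sketch that evaluates them on numbers. The `OPERATORS` table and the `evaluate` function are hypothetical names, and the sketch assumes that the results come back as a list, one per self-object and argument pair; the post does not specify what such an expression returns.

```python
import operator
from itertools import product

# Hypothetical mapping from method names to Python functions.
OPERATORS = {"/": operator.truediv}


def evaluate(method, selves, args):
    """Apply `method` to every (self-object, argument) pair."""
    op = OPERATORS[method]
    return [op(obj, arg) for obj, arg in product(selves, args)]


# </> {a,b} : [c]   with a = 6, b = 9, c = 3
shared_divisor = evaluate("/", [6, 9], [3])     # [2.0, 3.0]

# </> {a} : [b,c]   with a = 12, b = 3, c = 4
shared_dividend = evaluate("/", [12], [3, 4])   # [4.0, 3.0]
```

The first call computes a / c and b / c, the second computes a / b and a / c, each from a single expression.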

# Conclusion

The expressiveness of natural language is a powerful feature. We can take inspiration from it to design more expressive programming languages, ones that give us more freedom to express ideas in natural and intuitive ways, with good potential to also improve our productivity, creativity and ability to read and reason about code.
