Getting towards an infix notation for hy

| categories: hylang | tags:

Engineers need infix notation. It's a bold statement I know, but I am an engineer, teach engineers, and write a fair bit of mathematical programs. Your typical engineer is not a programmer, and just wants to write an equation they way we would write it on paper. It is hard to undo 20+ years of education on that point! So, here we consider how to adapt hy to use infix notation.

In a recent post gilch suggested using strings with the builtin python eval function. There are some potential downsides to that approach including the overhead of byte-compiling each time it is eval'd, but the payoff is operator precedence, and doing it like you would do it in Python.

1 using strings

UPDATE: Thanks to some help from Jiege Chen I updated this section to solve the namespace issues previously discussed. That resulted in quite a bit of improvement. Thanks Jiege!

Here is one implementation.

(def py-eval (get __builtins__ "eval"))

And how to use it.

(import [infix [*]])

(print (py-eval "2+3*5"))

(import [numpy :as np])
(print (py-eval "2 * np.exp(np.pi / 2)"))
17
9.62095476193

We can eliminate the need for quotes (") with the stringify code we previously developed.

(import [serialize [*]])
(import [infix [*]])

(print (py-eval (stringify `(2+3*5))))
(print (py-eval (stringify `(2 + 3 * 5))))

(import [numpy :as np])
(print (py-eval (stringify `(2 * np.exp(np.pi / 2)))))
17
17
9.62095476193

Let's just take that one more step with a new reader macro to tighten the syntax up. A critical feature of this reader macro is that it expands to code evaluated in the namespace where it is used. Nothing gets evaluated in the macro. That occurs in another namespace, where most things in a script are not available.

(defreader p [code]
  `(do
    (import [serialize [stringify]])
    (import [infix [py-eval]])
    (py-eval (stringify ~code))))

(defmacro py [code]
  `(do
    (import [serialize [stringify]])
    (import [infix [py-eval]])
    (py-eval (stringify ~code))))

Now we can use it like this. We have to require the infix module to get the reader macro. It seems unfortunate to me we still have to quote the code. Later I show an example where that isn't necessary, so there must be some subtle difference I have not found yet.

;; we have to require to get the reader macro
(require infix)

(import [numpy :as np])
(print #p`(2 + 3 * 5))
(print #p`((2 + 3) * 5))
(print #p`(1 + 1 * np.exp(7)))

(setv x 5)
(print #p`(x + 2))

(print #p`(1 + 1 * np.exp(1e-15)))
;; note the real python syntax with commas.
;; also not the extra parens around 1e-5
(print #p`(1 + np.linspace((1e-5), 1, 5)))

; The 1e-5 gets mangled to 1e-5 in this example
; (print #p`(1 + np.linspace(1e-5, 1, 5)))

;; Here is the macro form. It is about as easy to write.
(print (py `(1 + np.linspace((1e-5), 1, 5))))
17
25
1097.63315843
7
2.0
[ 1.00001    1.2500075  1.500005   1.7500025  2.       ]
[ 1.00001    1.2500075  1.500005   1.7500025  2.       ]

Lots of things seem to work! Let's look into some other solutions that do not rely on the builtin eval.

2 Infix to prefix using code manipulation

This solution is inspired by https://sourceforge.net/p/readable/wiki/Solution/ , but probably isn't a full implementation. We will first develop a function to convert infix notation to prefix notation. This function is recursive to deal with nested expressions. So far it doesn't seem possible to recurse with macros (at least, I cannot figure out how to do it). We tangle this function to infix.hy so we can use it later.

It will have some limitations though:

  1. No operator precedence. We will use parentheses for precedence.
  2. Lisp syntax means 3+4 is not the same as 3 + 4. The first is interpreted as a name. So we will need spaces to separate everything.
(try
 (print (3+4))
 (except [e Exception]
   (print e)))

(print (+ 3 4))
name '3+4' is not defined
7

So, here is our infix function. Roughly, the function takes a CODE argument. If the CODE is iterable, it is a list of symbols, and we handle a few cases:

  • If it is a string, we return it.
  • if it has a length of one and is an expression we recurse on it, otherwise return the symbol.
  • if it has a length of two, we assume a unary operator and recurse on each element.
  • If there are three elements, we take the middle one as the operator, and switch it with the first element.
  • Otherwise we switch the first and second elements, and recurse on the rest of the list.
  • If it is not iterable we just return the element.

Two optional arguments provide some debug support to print what is happening.

(import [serialize [*]])

(defn nfx [code &optional [indent 0] [debug False]]
  "Transform the CODE expression to prefix notation.
We assume that CODE is in infix notation."
  (when debug (print (* " " indent) "code: " code " type: " (type code)))
  (cond
   [(coll? code)
    (cond

     ;; treat lists in [] special
     [(and (instance?  hy.models.list.HyList code)
           (not (instance?  hy.models.expression.HyExpression code)))
      (when debug (print "list: " code " type: " (type code)))
      code]

     [(= 1 (len code))
      ;; element is an Expression
      (when debug (print (* " " indent) "1: " code))
      (if (isinstance (car code) hy.models.expression.HyExpression)
        (nfx (car code) (+ indent 1) debug)
        ;; single element
        (car code))]

     ;; {- 1} ->  (- 1)
     [(= 2 (len code))
      (when debug (print (* " " indent) "2: " code))
      `(~(nfx (get code 0) (+ indent 1) debug)
         ~(nfx (get code 1) (+ indent 1) debug))]

     ;; {1 + 2} -> (+ 1 2)
     [(= 3 (len code))
      (when debug (print (* " " indent) "3: " code))
      `(~(get code 1)
         ~(nfx (get code 0) (+ indent 1) debug)
         ~(nfx (get code 2) (+ indent 1) debug))]

     ;; longer expression, swap first two and take the rest.
     [true
      (when debug (print "expr: " code))
      `(~(nfx (get code 1) (+ indent 1) debug)
         ~(nfx (get code 0) (+ indent 1) debug)
         (~@(nfx (cut code 2) (+ indent 1) debug)))])]

   ;; non-iterable just gets returned
   [true
    (when debug (print (* " " indent) "true: " code))
    code]))

Now, for some tests. First, an example with debug we can see what happens.

(import [infix [*]])
(print (nfx `(1 + (3 * 4)) :debug True))
 code:  (1L u'+' (3L u'*' 4L))  type:  <class 'hy.models.expression.HyExpression'>
 3:  (1L u'+' (3L u'*' 4L))
  code:  1  type:  <class 'hy.models.integer.HyInteger'>
  true:  1
  code:  (3L u'*' 4L)  type:  <class 'hy.models.expression.HyExpression'>
  3:  (3L u'*' 4L)
   code:  3  type:  <class 'hy.models.integer.HyInteger'>
   true:  3
   code:  4  type:  <class 'hy.models.integer.HyInteger'>
   true:  4
(u'+' 1L (u'*' 3L 4L))

You can see we return a list of symbols, and the result is not evaluated. Now for some more thorough tests. I use a little helper function here to show the input and output.

(import [infix [*]])
(import [serialize [stringify]])

(defn show [code]
  (print (.format "{0} -> {1}\n"
                  (stringify code)
                  (stringify (nfx code)))))

(show 1)
(show `(1))
(show `(- 1))
(show `((1)))
(show `(- (2 + 1)))

(show `(2 ** 4))
(show `(3 < 5))

(show `(1 + 3 * 5 + 6 - 9))
(show `((1 + (1 + 2)) * 5 + 6 - 9))
(show `(1 + 1 * (5 - 4)))
(show `(1 + 1 * (np.exp (17 - 10))))

; Note this one does not work right.
(show `(1 + (np.linspace 1e-5  1 5)))

(show `(x + long-name)) ; note name mangling occurs.

(show `(1 + 1 + 1 + 1 + 1))
1 -> 1

(1) -> 1

(- 1) -> (- 1)

((1)) -> 1

(- (2 + 1)) -> (- (+ 2 1))

(2 ** 4) -> (** 2 4)

(3 < 5) -> (< 3 5)

(1 + 3 * 5 + 6 - 9) -> (+ 1 (* 3 (+ 5 (- 6 9))))

((1 + (1 + 2)) * 5 + 6 - 9) -> (* (+ 1 (+ 1 2)) (+ 5 (- 6 9)))

(1 + 1 * (5 - 4)) -> (+ 1 (* 1 (- 5 4)))

(1 + 1 * (np.exp (17 - 10))) -> (+ 1 (* 1 (np.exp (- 17 10))))

(1 + (np.linspace 1e-05 1 5)) -> (+ 1 (1e-05 np.linspace (1 5)))

(x + long_name) -> (+ x long_name)

(1 + 1 + 1 + 1 + 1) -> (+ 1 (+ 1 (+ 1 (+ 1 1))))

Those all look reasonable I think. The last case could be simplified, but it would take some logic to make sure all the operators are the same, and that handles if any of the operands are expressions. We save that for later.

Now, we illustrate that the output code can be evaluated. Since we expand to code, we don't seem to have the namespace issues since the code is executed in our script.

(import [infix [*]])

(print (eval (nfx `(1 + 1 * (5 - 4)))))

(import [numpy :as np])
(print (eval (nfx `(1 + 1 * (np.exp (17 - 10))))))
2
1097.63315843

That syntax is not particularly nice, so next we build up a macro, and a new reader syntax. First, the macro.

(defmacro $ [&rest code]
  "Eval CODE in infix notation."
  `(do
    (import infix)
    (eval (infix.nfx ~code))))

Now we can use the simpler syntax here. It seems we still have quote the math to prevent it from being evaluated (which causes an error).

(import infix)
(require infix)

(print ($ `(1 + 1 * (5 - 4))))

(import [numpy :as np])
(print ($ `(1 + 1 * (np.exp (17 - 10)))))
2
1097.63315843

For the penultimate act, we introduce a new syntax for this. In the sweet expression syntax we would use {} for this, but this isn't currently possible for hylang, and is also used for dictionaries. We define a reader macro for this.

(defreader $ [code]
  (import infix)
  (infix.nfx code))

(defreader P [code]
  `(do (import infix)
       (eval (infix.nfx ~code))))
(import [infix [*]])
(require infix)

(import [numpy :as np])

(print #$(- 1))

(print #$(- (2 + 1)))

(print #$(2 ** 4))
(print #$(3 < 5))

(print #$(1 + 3 * 5 + 6 - 9))
(print #$((1 + (1 + 2)) * 5 + 6 - 9))
(print #$(1 + 1 * (5 - 4)))
(print #$(1 + 1 + 1 + 1 + 1))

;; we still have to be lispy with function calls (func args)
(print #$(1 + 1 * (np.exp (17 - 10))))

(setv a 3 t 6)
(print #$(a + t))

(setv long-a 5 long-b 6)
(print #$(long-a + long-b))

;; this fails because the linspace should not get unfixed. This is a bug in
;; our implementation

;; (print #P`(1 + (np.linspace 1e-5  1 5)))
-1
-3
16
True
7
8
2
5
1097.63315843
9
11

Mostly wonderful! We get variables passed through, and the name-mangling doesn't seem to matter. Note we don't have to quote this code. I think it is because in this reader macro we do not return code, but actually evaluate it I think. And somehow it works.

There is an issue with (print #$(1 + (np.linspace 1e-5 1 5))). The linspace call gets unfixed, which is wrong. There are some ways we could deal with that. One might be to only unfix known operators. Another might be some escape syntax that indicates not to unfix certain lists. For another day (TM).

(import [infix [*]])
(print (nfx `(1 + (np.linspace 1e-5  1 5)) :debug True))
 code:  (1L u'+' (u'np.linspace' 1e-05 1L 5L))  type:  <class 'hy.models.expression.HyExpression'>
 3:  (1L u'+' (u'np.linspace' 1e-05 1L 5L))
  code:  1  type:  <class 'hy.models.integer.HyInteger'>
  true:  1
  code:  (u'np.linspace' 1e-05 1L 5L)  type:  <class 'hy.models.expression.HyExpression'>
expr:  (u'np.linspace' 1e-05 1L 5L)
   code:  1e-05  type:  <class 'hy.models.float.HyFloat'>
   true:  1e-05
   code:  np.linspace  type:  <class 'hy.models.symbol.HySymbol'>
   true:  np.linspace
   code:  (1L 5L)  type:  <class 'hy.models.expression.HyExpression'>
   2:  (1L 5L)
    code:  1  type:  <class 'hy.models.integer.HyInteger'>
    true:  1
    code:  5  type:  <class 'hy.models.integer.HyInteger'>
    true:  5
(u'+' 1L (1e-05 u'np.linspace' (1L 5L)))

See, the linspace call is out of order.

3 The final test

For the final act, we use infix notation in a real problem we posed before.

3.1 with the string reader

We almost get way with exactly what we would have done in Python. The only thing was we had to put a space between -x to avoid a mangling issue that turned it into _x. I feel like that might be a fixable issue.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx (py `(1.0 / x**2 * (- x * z - (x**2 - nu**2) * y))))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix-s.png")

3.2 with #$ reader

This version is also somewhat close to the Python syntax, but it needs a lot more parentheses to get the right precedence, and spaces between almost everything for the lisp syntax, i.e. x**2 is a name, and (x ** 2) is the infix notation for exponentiation.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx #$((1.0 / (x ** 2)) * ((- x) * z) - (((x ** 2) - (nu ** 2)) * y)))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix.png")

That worked pretty well. This feels like an improvement for writing engineering programs in lisp!

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter