Operator precedence in infix notation by automatic parenthesizing

Posted April 10, 2016 at 01:32 PM | categories: hylang | tags:

I am continuing to investigate how to get operator precedence right with infix notation. You can fully parenthesize your expressions to do this, but that is tedious and hard to read. Apparently in Fortran I (yep, one) the compiler expanded each operator in an expression into a sequence of parentheses to get the precedence right (https://en.wikipedia.org/wiki/Operator-precedence_parser)!

Roughly, these were the rules.

replace + and - with ))+(( and ))-((, respectively;
replace * and / with )*( and )/(, respectively;
add (( at the beginning of each expression and after each left parenthesis in the original expression; and
add )) at the end of the expression and before each right parenthesis in the original expression.
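
As a rough illustration of the four rules above, here is a minimal Python sketch (Python rather than Hy, so it runs without the infix module). The name fortran_parenthesize is made up for this example. It only covers the operators named in the rules, so ^ is left out, and it makes no attempt to handle a unary + or -.

```python
def fortran_parenthesize(expr):
    """Apply the four replacement rules listed above to a string (sketch)."""
    rules = {
        "+": "))+((",
        "-": "))-((",
        "*": ")*(",
        "/": ")/(",
        "(": "(((",  # the original "(" followed by "(("
        ")": ")))",  # "))" followed by the original ")"
    }
    # Drop whitespace, then rewrite character by character.
    body = "".join(rules.get(ch, ch) for ch in expr if not ch.isspace())
    return "((" + body + "))"


print(fortran_parenthesize("a + b * c"))        # ((a))+((b)*(c))
print(fortran_parenthesize("(1 + 2) * 5"))      # (((((1))+((2))))*(5))
print(eval(fortran_parenthesize("1 + 2 * 5")))  # 11
```

The result contains only parentheses and the original operators, so any evaluator that respects parentheses gets the precedence right without knowing anything about the operators themselves.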

So this

a * b + c ^ d / e

becomes

((((a))*((b)))+(((c)^(d))/((e))))

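Not too pretty, but correct! Note that this example uses deeper nesting than the rules listed above, because it also handles ^, which binds tighter than * and /. The Wikipedia page provides an example C program that implements this, and we adapt it here for Hy. The idea is to take an expression as a string, parenthesize it, and then eval the result.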

(defn parenthesize [input]
  "Fully parenthesize the input string."
  (let [s ""]
    (+= s "((((")
    (for [(, i char) (enumerate input)]
      (cond
       [(= char "(")
        (+= s "((((")]
       [(= char ")")
        (+= s "))))")]
       ;; rewrite ^ to **
       [(= char "^")
        (+= s ")**(")]
       [(= char "*")
        (+= s "))*((")]
       [(= char "/")
        (+= s "))/((")]
       [(= char "+")
        (if (or (= 0 i) (in (get input (- i 1)) ["(" "^" "*" "/" "+" "-"]))
          (+= s "+ ")
          (+= s ")))+((("))]
       [(= char "-")
        (if (or (= 0 i) (in (get input (- i 1)) ["(" "^" "*" "/" "+" "-"]))
          (+= s "- ")
          (+= s ")))-((("))]
       [true
        (+= s char)]))
    (+= s "))))")
    s))

Let's try it out.

(import [infix [*]])

(print (parenthesize "a * b + c ^ d / e"))

((((a ))*(( b )))+((( c )**( d ))/(( e))))

For comparison:

((((a))*((b)))+(((c)^(d))/((e))))

Spaces aside, and with ^ rewritten to **, it looks like we got that right. The spaces should not be a problem for Lisp. This is another strategy to get infix notation with operator precedence! Let's see some examples.

(import [infix [*]])
(require infix)

(print (eval (nfx (read-str (parenthesize "1 + 2 * 5")))))
(print (eval (nfx (read-str (parenthesize "1 * 2 + 5")))))
(print (eval (nfx (read-str (parenthesize "1 * 2 + 2^2")))))

11
7
6
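
These results can be cross-checked outside Hy. Below is a Python port of the Hy function above, offered as a sketch rather than as part of the infix module; the name parenthesize_py is an assumption chosen to avoid confusion with the Hy version. Because ^ is rewritten to **, the output happens to be a valid Python expression, so Python's eval can stand in for the nfx step here.

```python
def parenthesize_py(expr):
    """Python port of the Hy parenthesize function (sketch)."""
    # A + or - directly after one of these characters is treated as unary.
    unary_context = ("(", "^", "*", "/", "+", "-")
    out = ["(((("]
    for i, ch in enumerate(expr):
        if ch == "(":
            out.append("((((")
        elif ch == ")":
            out.append("))))")
        elif ch == "^":
            out.append(")**(")  # rewrite ^ to **
        elif ch in ("*", "/"):
            out.append("))" + ch + "((")
        elif ch in ("+", "-"):
            if i == 0 or expr[i - 1] in unary_context:
                out.append(ch + " ")
            else:
                out.append(")))" + ch + "(((")
        else:
            out.append(ch)
    out.append("))))")
    return "".join(out)


print(parenthesize_py("a * b + c ^ d / e"))
# ((((a ))*(( b )))+((( c )**( d ))/(( e))))

for text in ["1 + 2 * 5", "1 * 2 + 5", "1 * 2 + 2^2"]:
    print(eval(parenthesize_py(text)))  # 11, 7, 6
```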

We can get that string representation from quoted code easily enough.

(import [infix [*]])
(require infix)

(print (eval (nfx (read-str (parenthesize (stringify `(1 + 2)))))))

This notation is also verbose enough to be worth wrapping, either in a function or in a macro.

(defn NFX [code &optional [globals (globals)]]
  "Evaluate the infix CODE.
CODE is stringified, parenthesized, read back and infixed."
  (import infix)
  (import serialize)
  (eval (infix.nfx
         (read-str
          (infix.parenthesize
           (serialize.stringify code)))) globals))

(defmacro NFX [code]
  "Evaluate the infix CODE.
CODE is stringified, parenthesized, read back and infixed."
  `(do
    (import infix)
    (import serialize)
    (eval (infix.nfx
           (read-str
            (infix.parenthesize
             (serialize.stringify ~code)))))))

Here is a simple example.

;(import [infix [*]])
(require infix)

(print (NFX `(1 + 2 * 5)))
(print (NFX `((1 + 2) * 5)))

(import [numpy :as np])
(print (NFX `(1 + (np.exp 2))))

; not working because of infix
;(print (NFX `(1 + (np.linspace 0 1 5))))

;; But this is ok since no infix mangling happens.
(let [a (np.linspace 0 1 5)]
  (print (NFX `(1 + a))))

11
15
8.38905609893
[ 1.    1.25  1.5   1.75  2.  ]

That is still slightly heavy, and we can lighten it with a new reader macro.

(defreader m [code]
 `(do
    (import infix)
    (import serialize)
    (eval (infix.nfx
           (read-str
            (infix.parenthesize
             (serialize.stringify ~code)))))))

Since that reader macro returns code, we have to quote the expression we pass to it. This is debatably more concise than the NFX macro.

(require infix)

(print #m`(1 + 2 + 5))
(print #m`(1 + 2 * 5))
(print #m`((1 + 2) * 5))

(import [numpy :as np])
(print #m`((1 + (np.exp 2))))

;; these are all the same
(print (+ 1 (np.exp 2) (* 2 5)))
(print #m(`(1 + (np.exp 2) + 2 * 5)))
(print (NFX `(1 + (np.exp 2) + 2 * 5)))

8
11
15
8.38905609893
18.3890560989
18.3890560989
18.3890560989

1 Another test of a real problem

Here is another test of infix notation, this time with operator precedence. Note the use of ^ for exponentiation. The parenthesize function assumes single-character operators, and it would take some work to support **. Note that we still need the space between - and x to avoid a mangling issue with _x in Hy.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx #m`((1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix-m.png")

I wonder if there is actually some ambiguity in the expression, or in how it is parenthesized. We get the right answer with:

(1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)

but not with:

1.0 / x^2 * ((- x) * z - (x^2 - nu^2) * y)

Let's see if we can see why. Consider 1 / x * a. By the usual convention, / and * have equal precedence and group from left to right, so this should be evaluated as (1 / x) * a. The output below shows that our pipeline does not do that.

(import [infix [*]])

(print
 (nfx
 (read-str
 (parenthesize
  (stringify `(1 / x * a))))))
;   `(1.0 / x^2 * ((- x) * z - (x^2 - nu^2) * y)))))))

(u'/' 1L (u'*' u'x' u'a'))

That reads: 1 / (x * a)

If we add a layer of parentheses, we get the right answer.

(import [infix [*]])

(print
 (nfx
 (read-str
 (parenthesize
  (stringify `((1 / x) * a))))))
;   `((1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)))))))

(u'*' (u'/' 1L u'x') u'a')

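This reads (1 / x) * a. Our algorithm doesn't do exactly what we expect here. The parenthesizer leaves neighboring / and * at the same nesting level (the string 1 / x * a becomes ((((1 ))/(( x ))*(( a))))), so the grouping is decided downstream, and the earlier output shows the operators being grouped from the right. I guess this is a general issue for neighboring operators of equal precedence.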
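
To make the associativity point concrete, here is a small Python sketch of the two ways a flat run of equal-precedence operators can be grouped. The real nfx is not shown in this post, so fold_right and fold_left are hypothetical stand-ins; fold_right only assumes the right-to-left grouping that the printed output suggests.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold_right(tokens):
    """Group [operand, op, operand, ...] from the right: a op (rest)."""
    if len(tokens) == 1:
        return tokens[0]
    return OPS[tokens[1]](tokens[0], fold_right(tokens[2:]))


def fold_left(tokens):
    """Group the same list from the left: (a op b) op c ..."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result


tokens = [1.0, "/", 4.0, "*", 2.0]  # 1 / x * a with x = 4, a = 2
print(fold_right(tokens))  # 0.125, i.e. 1 / (4 * 2)
print(fold_left(tokens))   # 0.5, i.e. (1 / 4) * 2
```

With only + and *, the two groupings agree, which is probably why the earlier examples did not expose the problem; it shows up as soon as - or / sits next to an operator of the same precedence.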

Related to this, the Wikipedia page points out this example:

- a ^ 2

What does this mean? It is either (-a)^2 or -(a^2). The second is correct based on normal precedence, but the algorithm gives the unary operator - a higher precedence.

(import [infix [parenthesize]])

(print (parenthesize "- a ^ 2"))
(print (parenthesize "- (a ^ 2)"))

((((-  a )**( 2))))
((((-  ((((a )**( 2))))))))
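
The difference between these two strings can be checked numerically. The sketch below assumes a = 3 and uses Python's eval only because both strings happen to be valid Python expressions.

```python
a = 3
implicit = "((((-  a )**( 2))))"          # from "- a ^ 2"
explicit = "((((-  ((((a )**( 2))))))))"  # from "- (a ^ 2)"

print(eval(implicit, {"a": a}))  # 9: the minus is applied first, (-a)**2
print(eval(explicit, {"a": a}))  # -9: the power is applied first, -(a**2)
print(-a ** 2)                   # -9: Python's own precedence
```

Python itself follows the usual convention, so only the explicitly parenthesized form matches it.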

To get the right thing, you need to use parentheses. Sometimes I do that in real code anyway, just to be sure that what I want to happen actually does. Maybe some of this can be fixed in our parser function, but that is probably for another day.

Org-mode version = 8.2.10

The Kitchin Research Group

Chemical Engineering at Carnegie Mellon University