Code completion in HyDE

| categories: hylang | tags:

Code completion is often useful in an editor. Today, we add some code completion to Emacs for hy. It isn't that hard: we get a list of known keywords from the hy language, a list of functions and macros, and a list of variables from the current buffer. If you are following this line of development, the code can be found here: https://github.com/jkitchin/jmax/blob/master/mile-hy.el

If not, there might be some interesting tips here on getting completion in Emacs ;)

We will use auto-complete (http://auto-complete.org/doc/manual.html#extend) for now. First, we can add hy-mode to the list of ac-modes:

;; * auto-complete
(add-to-list 'ac-modes 'hy-mode)

Next, we need to define some sources and functions for completion. Over at https://github.com/jkitchin/hyve/blob/master/hylp.hy#L65 I defined a function that returns a list of all hy core functions and macros that Emacs can directly read.

(defn hy-all-keywords-emacs-completion []
  "Return a string for Emacs completion suitable for read in Emacs.
We unmangle the names and replace _ with -."
  (str
   (+ "("
      (.join " " (list-comp (.format "\"{}\"" (.replace x "_" "-"))
                            [x (hy-all-keywords)]))
      ")")))

Here, we define a source that gets that information from the hy repl using the lispy--eval-hy function. This has the downside of calling the repl, but it seems fast, and I haven't noticed any lags so far. The upside is it only gets called once and has everything hy knows about, i.e. I don't have to update this for new core functions/macros.

(defvar ac-source-hy-keywords
  `((candidates . ,(read (lispy--eval-hy "(hy-all-keywords-emacs-completion)"))))
  "Keywords known from hy. The command is defined in hyve.hylp.")

It would also be nice to have the defns/macros in the current file available for completion. This hackery searches the current buffer for these with a pretty simple regex and accumulates the results.

(defun hy-defns-macros ()
  "Get a list of defns in the current file."
  (let ((defns '()))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "\\(?:defn\\|defmacro\\)[[:space:]]+\\(.*?\\) " nil t)
        (push (match-string 1) defns)))
    defns))
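
For readers more at home in Python, the same idea can be sketched outside Emacs. This is only an illustration of the regex scan, not the Emacs code: the names DEFN_RE and defns_and_macros are made up here, and the pattern is slightly stricter than the one above (it requires the opening parenthesis and stops at the first whitespace).

```python
import re

# Hypothetical pattern: "(defn NAME" or "(defmacro NAME", capturing NAME.
DEFN_RE = re.compile(r"\((?:defn|defmacro)\s+(\S+)")


def defns_and_macros(source):
    """Return the names defined with defn/defmacro, in order of appearance."""
    return DEFN_RE.findall(source)


sample = """
(defn Some-long-function []
  (print 6))

(defmacro aif [test-form then-form &optional else-form]
  `(let [it ~test-form]
     (if it ~then-form ~else-form)))
"""

print(defns_and_macros(sample))
```

Like the Emacs version, this is a plain text search, so it would also pick up a defn that appears inside a comment or a string.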

Finally, we would also like the variable names from setv and let. Hy is lispy, so we use a hybrid regex search, followed by read to get every other name in the case of setv, and the vector expression in the let case.

(defun hy-variables ()
  "Collect the variable names in the current buffer.
These are every other name after setv, and every other name in
the binding vector of let."
  (let (expr
        set-vars
        let-vars)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "(setv" nil t)
        (save-excursion
          (goto-char (match-beginning 0))
          (setq expr (read (current-buffer)))
          ;; Accumulate, so earlier setv forms are not overwritten.
          (setq set-vars
                (append set-vars
                        (loop for x in (cdr expr) by #'cddr
                              collect x))))))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "(let" nil t)
        (save-excursion
          (goto-char (match-beginning 0))
          (setq expr (read (current-buffer)))
          ;; The bindings are read as a vector, so we convert to a list,
          ;; and accumulate across all let forms.
          (setq let-vars
                (append let-vars
                        (loop for x in (append (nth 1 expr) nil)
                              by #'cddr collect x))))))
    (append set-vars let-vars)))

Next, we define two new sources for completion that use those two functions:

(defvar ac-source-hy-defns
  '((candidates . hy-defns-macros))
  "Functions/macros defined in the file.")

(defvar ac-source-hy-variables
  '((candidates . hy-variables))
  "Hy variables defined in the file.")

And finally add this to the hy-setup hook function:

(setq ac-sources '(ac-source-hy-keywords
                   ac-source-hy-defns
                   ac-source-hy-variables))

(ac-set-trigger-key "TAB")
(auto-complete-mode 1)

And we should be good to go with completion. Let's try it out.

Check out the video here: https://www.youtube.com/watch?v=L6j5IWkpoz0

(let [some-long-name 5
      boring-and-tedious "tree"]
  (print boring-and-tedious))

(setv another-var nil inline-name (+ 4 5)
      hylarious-var 5)

(+ hylarious-var 8 )

(defn Some-long-function []
  (print 6))

(Some-long-function)
tree
6

Sweet.

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

What are you hy?

| categories: hylang | tags:

Hy lets us do things that either aren't possible, or definitely aren't easy in Python. You may have drunk the Python Koolaid and don't think those things are necessary, but we have Hy-C, and we took a sip of that just now, so let's see what we can do.

We can have functions that are punctuation!

(defn ! [arg] (not arg))

(print (! True))
(print (! False))
False
True

How about that function that just returns something truthy? Shouldn't those end in a question-mark? They can and sometimes do. Not a problem when you are hy.

(defn string? [s]
 (isinstance s str))

(print (string? 4))
(print (string? "4"))        ;; haha. strings in hy like "4" are unicode, not a str.
(print (string? (str "4")))
False
False
True
False

Isn't that better than is_a_string?

Underscores. Pfffft…. Dashes in names are awesome. Unless you hate your pinky and are shifty.

(defn 100-yard- [x]
  "Funniest function name ever. Don't do this at home."
  (.format "You ran that in {} seconds! New World Record!" 9.42))

(print (100-yard- 2))
You ran that in 9.42 seconds! New World Record!

Why not build code with code? Here is a fun way to add up only the even numbers in a list. wHy? wHy?? Because we can, and it leads to other interesting opportunities!

(import hy)
(let [a [1 2 3 4 5 6]
      code '()]
  (+= code `(+))  ;; add an operator
  (for [n a]
    (when (even? n)
      (+= code `(~(hy.models.integer.HyInteger n)))))
  (print code)

  (print (eval code)))
(u'+' 2L 4L 6L)
12

Ok, that isn't so beautiful, but it shows we can generate code and then execute it. We could also do that like we do in Python, where you build up the list of even numbers and then sum them. It's the beginning of macros.

But I can't live without objects! How else can you encapsulate data? Let's see how and give you some closure to get on with programming. (yea, the puns get worse ;).

This next example illustrates a closure which we can use to encapsulate data. We use let to create a context with the variable i defined. i doesn't exist outside the context, but the lambda function created inside it retains access to the variable i.

(def counter
  (let [i [0]]
    (lambda [] (assoc i 0 (+ 1 (get i 0))) (get i 0))))

(print (counter))
(print (counter))

;; i is not a global var!
(try
 (print i)
 (except [e NameError] (print "i is not defined here!")))
1
2
i is not defined here!

Yes, the use of a list to store the counter is wonky; it is because of how scoping works in Python. A nested function in Python 2 can read a variable from the enclosing scope but cannot rebind it, so we store the value in a list and mutate the list instead. Thanks Paul Tagliamonte (the resident Hypster) for the tip. Creating a class instance to store the counter would also work. Hylarious.
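
For comparison, here is a small sketch of the same closure in plain Python. It is not part of the Hy code above; the function names are made up for illustration. The first version uses the list trick, and the second uses the nonlocal statement, which exists only in Python 3.

```python
def make_counter_with_list():
    """Closure that keeps its count in a one-element list (works in Python 2 and 3)."""
    i = [0]

    def counter():
        i[0] += 1  # mutate the list; the name i itself is never rebound
        return i[0]

    return counter


def make_counter_with_nonlocal():
    """Closure that rebinds the enclosing variable directly (Python 3 only)."""
    i = 0

    def counter():
        nonlocal i
        i += 1
        return i

    return counter


c1 = make_counter_with_list()
c2 = make_counter_with_nonlocal()
print(c1(), c1())
print(c2(), c2())
```

In both versions i is invisible outside the factory function, just as it is outside the let in the Hy example.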

Let's check out a macro. First, here is a code example. A common pattern is to save a value in a let statement temporarily, so we can reuse it in other expressions.

(let [x (> 2 0)]
  (if x
    (print (.format "{}" x))
    (print (.format "{}" x))))

;; a one line version for comparison
(let [x (< 2 0)] (if x (print (.format "{}" x)) (print (.format "{}" x))))
True
False

That example has a lot of parentheses, and it might be nice if there were fewer of them. There is a macro form to deal with this (it is actually defined in the hylang contrib directory, but it is short so we look at it here). This is called an anaphoric macro, because it captures a variable called "it" for reuse later in the macro. With the aif macro we can eliminate the use of the let statement in production code, eliminating a set of parentheses, and also the temporary variable.

(defmacro aif [test-form then-form &optional else-form]
  `(let [it ~test-form]
     (if it ~then-form ~else-form)))

;; In this code, it is bound to the first form value.
(print (aif (> 2 0) (.format "{}" it) (.format "{}" it)))
(print (aif (< 2 0) (.format "{}" it) (.format "{}" it)))

;; How does it work? By expanding to code.
(print (macroexpand '(aif (< 2 0) (.format "{}" it) (.format "{}" it))))
True
False
((u'fn' [] (u'setv' u'it' (u'<' 2L 0L)) (u'if' u'it' (u'.format' u'{}' u'it') (u'.format' u'{}' u'it'))))

Here is how you would do this in a regular program if you wanted to use the contrib library in hy.

(require hy.contrib.anaphoric)

(print (ap-if (> 2 0) (.format "{}" it) (.format "{}" it)))
True

Macros are useful for changing syntax and simplifying code. That works because the code in the macro is like data that can be manipulated and selectively evaluated. Here is an example of manipulating code like that. We start with an expression to add two numbers, and then modify it to be a multiplication.

(setv code '(+ 5 6))
(print (eval code))

;; change + to *
(assoc code 0 '*)
(print code)
(print (eval code))
11
(u'*' 5L 6L)
30

That is an indication that we can do some very interesting things with Lisp! Let's be fair and show this can also be done in Python. We just have to parse out the AST, and then we can manipulate it and get back to code. It isn't pretty, but doable.

import ast

# parse the statement
p = ast.parse("print 5 + 6")

exec compile(p, "<string>", "exec")
print ast.dump(p)

# Change + to *
p.body[0].values[0].op = ast.Mult()

print
exec compile(p, "<string>", "exec")
print ast.dump(p)
11
Module(body=[Print(dest=None, values=[BinOp(left=Num(n=5), op=Add(), right=Num(n=6))], nl=True)])

30
Module(body=[Print(dest=None, values=[BinOp(left=Num(n=5), op=Mult(), right=Num(n=6))], nl=True)])

That is not as clear as what we did in hy! Why? Because we had to transform the Python to AST, and manipulate it. In Lisp, the code is already in the abstract tree form, and we manipulate it more directly. It is easier to reason about.
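
The Python example above uses Python 2 syntax (the print statement and exec statement). As a hedged sketch of the same manipulation in Python 3, we can parse an expression rather than a print statement and swap the operator node. The function name evaluate_with_mult is made up for this illustration.

```python
import ast


def evaluate_with_mult(expression):
    """Evaluate EXPRESSION, then replace every + with * and evaluate again."""
    tree = ast.parse(expression, mode="eval")
    before = eval(compile(tree, "<string>", "eval"))

    # Walk the tree and swap each addition operator for a multiplication.
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            node.op = ast.Mult()

    after = eval(compile(tree, "<string>", "eval"))
    return before, after


print(evaluate_with_mult("5 + 6"))
```

The shape of the work is the same as before: parse to an AST, find the node, replace it, and compile again.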

I bet you didn't think we could use a hy program for more than one thing. Sure we may want to run it, but maybe we would like a different representation of the program than the code too. Here we define two macros that both take a program as input. One simply evaluates the program, so we can use it. The other takes the program, and outputs a LaTeX representation of it. It only converts a division expression correctly (and only if all the arguments are numbers and not other expressions), but it illustrates that we can use a program as data, and do different things with it!

(defmacro run [body] `(eval ~body))

(defmacro latex [body]
  `(cond
   [(= (car ~body) '/)
    (.format "\(\\frac{{{0}}} {{{1}}}\)"
            (get ~body 1)
            (.join " \\cdot " (list-comp (str x) [x (cut ~body 2)])))]
   [true (raise (Exception "Unsupported program"))]))

(setv code '(/ 1 2 4.0))

(print (run code))
(print (latex code))
0.125
\(\frac{1} {2 \cdot 4.0}\)
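
To make the "program as data" idea concrete for Python readers, here is a rough sketch of the same two views of a division expression, with the program held as a plain list. This is an illustration only, not a translation of the macros; run_division and latex_division are hypothetical names.

```python
import operator
from functools import reduce


def run_division(code):
    """Evaluate a prefix division such as ["/", 1, 2, 4.0] from left to right."""
    if code[0] != "/":
        raise ValueError("Unsupported program")
    return reduce(operator.truediv, code[1:])


def latex_division(code):
    """Render the same prefix division as a LaTeX fraction."""
    if code[0] != "/":
        raise ValueError("Unsupported program")
    numerator = code[1]
    denominator = r" \cdot ".join(str(x) for x in code[2:])
    return r"\(\frac{{{0}}} {{{1}}}\)".format(numerator, denominator)


code = ["/", 1, 2, 4.0]
print(run_division(code))
print(latex_division(code))
```

As in the Hy version, only numeric arguments are handled; a nested expression in the list would need a recursive renderer.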

It is possible to do something kind of like this in Python. In this post I put a lisp function onto the base classes of objects so you could transform Python objects to lisp representations.

Well, that is probably enough Hy-C for the day. I am still playing around to figure out what kinds of things we can do with Hy that aren't easy or feasible in Python. These are a few of my favorite examples! If you have other cool things you do, put them in a comment hyre!

Operator precedence in infix notation by automatic parenthesizing

| categories: hylang | tags:

I am continuing some investigation in getting operator precedence right with infix notation. You can fully parenthesize your expressions for this, but it is tedious and hard to read. Apparently in Fortran I (yep, one) the compiler would expand each operator in an expression with a sequence of parentheses to get the precedence right (https://en.wikipedia.org/wiki/Operator-precedence_parser)!

Roughly, these were the rules.

  • replace + and - with ))+(( and ))-((, respectively;
  • replace * and / with )*( and )/(, respectively;
  • add (( at the beginning of each expression and after each left parenthesis in the original expression; and
  • add )) at the end of the expression and before each right parenthesis in the original expression.

So this

a * b + c ^ d / e

becomes

((((a))*((b)))+(((c)^(d))/((e))))

Not too pretty, but correct! The Wikipedia page provides an example C program to implement this, and we adapt it here for hy. The idea is to take an expression as a string, parenthesize it, and then eval it.

(defn parenthesize [input]
  "Fully parenthesize the input string."
  (let [s ""]
    (+= s "((((")
    (for [(, i char) (enumerate input)]
      (cond
       [(= char "(")
        (+= s "((((")]
       [(= char ")")
        (+= s "))))")]
       ;; rewrite ^ to **
       [(= char "^")
        (+= s ")**(")]
       [(= char "*")
        (+= s "))*((")]
       [(= char "/")
        (+= s "))/((")]
       [(= char "+")
        (if (or (= 0 i) (in (get input (- i 1)) ["(" "^" "*" "/" "+" "-"]))
          (+= s "+ ")
          (+= s ")))+((("))]
       [(= char "-")
        (if (or (= 0 i) (in (get input (- i 1)) ["(" "^" "*" "/" "+" "-"]))
          (+= s "- ")
          (+= s ")))-((("))]
       [true
        (+= s char)]))
    (+= s "))))")
    s))
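
For readers who want to experiment without a hy installation, here is a Python port of the function above. It is a sketch that follows the Hy code branch by branch; the only liberty taken is merging the + and - branches, which are identical apart from the character. Because Python ignores spaces inside parentheses and uses ** for exponentiation, the result can be passed directly to Python's eval.

```python
def parenthesize(expression):
    """Fully parenthesize an infix expression given as a string."""
    # A + or - right after one of these (or at the start) is treated as unary.
    unary_context = ["(", "^", "*", "/", "+", "-"]
    out = ["(((("]
    for i, char in enumerate(expression):
        if char == "(":
            out.append("((((")
        elif char == ")":
            out.append("))))")
        elif char == "^":
            out.append(")**(")  # rewrite ^ to **
        elif char == "*":
            out.append("))*((")
        elif char == "/":
            out.append("))/((")
        elif char in ("+", "-"):
            if i == 0 or expression[i - 1] in unary_context:
                out.append(char + " ")
            else:
                out.append(")))" + char + "(((")
        else:
            out.append(char)
    out.append("))))")
    return "".join(out)


print(parenthesize("a * b + c ^ d / e"))
print(eval(parenthesize("1 + 2 * 5")))
print(eval(parenthesize("1 * 2 + 2^2")))
```

Like the Hy version, this assumes single-character operators, so an input that already contains ** is not handled.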

Let's try it out.

(import [infix [*]])

(print (parenthesize "a * b + c ^ d / e"))
((((a ))*(( b )))+((( c )**( d ))/(( e))))

For comparison:

((((a))*((b)))+(((c)^(d))/((e))))

Spaces aside, it looks like we got that right. The spaces should not be a problem for lisp. This is another strategy to get infix notation with operator precedence! Let's see some examples.

(import [infix [*]])
(require infix)

(print (eval (nfx (read-str (parenthesize "1 + 2 * 5")))))
(print (eval (nfx (read-str (parenthesize "1 * 2 + 5")))))
(print (eval (nfx (read-str (parenthesize "1 * 2 + 2^2")))))
11
7
6

We can get that string representation easy enough.

(import [infix [*]])
(require infix)

(print (eval (nfx (read-str (parenthesize (stringify `(1 + 2)))))))
3

This too is worthy of simplifying the notation with a function.

(defn NFX [code &optional [globals (globals)]]
  "Evaluate the infix CODE.
CODE is stringified, parenthesized, read back and infixed."
  (import infix)
  (import serialize)
  (eval (infix.nfx
         (read-str
          (infix.parenthesize
           (serialize.stringify code)))) globals))
(defmacro NFX [code]
  "Evaluate the infix CODE.
CODE is stringified, parenthesized, read back and infixed."
  `(do
    (import infix)
    (import serialize)
    (eval (infix.nfx
           (read-str
            (infix.parenthesize
             (serialize.stringify ~code)))))))

Here is a simple example.

;(import [infix [*]])
(require infix)

(print (NFX `(1 + 2 * 5)))
(print (NFX `((1 + 2) * 5)))

(import [numpy :as np])
(print (NFX `(1 + (np.exp 2))))

; not working because of infix
;(print (NFX `(1 + (np.linspace 0 1 5))))

;; But this is ok since no infix mangling happens.
(let [a (np.linspace 0 1 5)]
  (print (NFX `(1 + a))))
11
15
8.38905609893
[ 1.    1.25  1.5   1.75  2.  ]

That is slightly heavy still, and we can fix it with a new reader macro.

(defreader m [code]
 `(do
    (import infix)
    (import serialize)
    (eval (infix.nfx
           (read-str
            (infix.parenthesize
             (serialize.stringify ~code)))))))

Since we return code in that reader macro, we have to quote the code. This is debatably more concise than the NFX macro.

(require infix)

(print #m`(1 + 2 + 5))
(print #m`(1 + 2 * 5))
(print #m`((1 + 2) * 5))

(import [numpy :as np])
(print #m`((1 + (np.exp 2))))

;; these are all the same
(print (+ 1 (np.exp 2) (* 2 5)))
(print #m(`(1 + (np.exp 2) + 2 * 5)))
(print (NFX `(1 + (np.exp 2) + 2 * 5)))
8
11
15
8.38905609893
18.3890560989
18.3890560989
18.3890560989

1 Another test of a real problem

Here is another test of using an infix notation, this time with operator precedence. Note the use of ^ for exponentiation. The parenthesize function assumes single character operators, and would take some work to use **. Note we still need the space between - and x to avoid a mangling issue with _x in hy.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx #m`((1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix-m.png")

I wonder if there is actually some ambiguity in the expression or how it is parenthesized. We get the right answer with:

(1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)

but not with:

1.0 / x^2 * ((- x) * z - (x^2 - nu^2) * y)

Let's see if we can see why. Consider 1 / x * a. This should probably be evaluated as (1 / x) * a. This shows the algorithm does not do that.

(import [infix [*]])

(print
 (nfx
 (read-str
 (parenthesize
  (stringify `(1 / x * a))))))
;   `(1.0 / x^2 * ((- x) * z - (x^2 - nu^2) * y)))))))
(u'/' 1L (u'*' u'x' u'a'))

That reads: 1 / (x * a)

If we had a layer of parentheses we get the right answer.

(import [infix [*]])

(print
 (nfx
 (read-str
 (parenthesize
  (stringify `((1 / x) * a))))))
;   `((1.0 / x^2) * ((- x) * z - (x^2 - nu^2) * y)))))))
(u'*' (u'/' 1L u'x') u'a')

This reads (1 / x) * a. Our algorithm doesn't do exactly what we expect here. I guess this could be a general issue of neighboring operators with equal precedence.
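
One possible source of the right-nesting is the nfx step rather than the parenthesizing: as written, nfx handles an expression with more than three elements by swapping the first two and recursing on the rest. The sketch below is a Python illustration of that idea on a flat list of alternating operands and operators, next to a left-to-right fold that would give the conventional grouping. Both function names are made up here, and this is not a fix to the Hy code.

```python
def right_nested(tokens):
    """Mimic the swap-and-recurse strategy: a op rest -> [op, a, nested(rest)]."""
    if len(tokens) == 1:
        return tokens[0]
    return [tokens[1], tokens[0], right_nested(tokens[2:])]


def left_nested(tokens):
    """Fold from the left, so neighboring operators group left to right."""
    result = tokens[0]
    for k in range(1, len(tokens), 2):
        result = [tokens[k], result, tokens[k + 1]]
    return result


tokens = [1, "/", "x", "*", "a"]
print(right_nested(tokens))
print(left_nested(tokens))
```

The first result corresponds to 1 / (x * a) and the second to (1 / x) * a. Both helpers assume a flat list with an odd number of elements; nested sub-expressions would need a recursive call on each operand.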

Related to this, the Wikipedia page points out this example:

- a ^ 2

What does this mean? It is either (-a)^2 or -(a^2). The second is correct based on normal precedence, but the algorithm gives the unary operator - a higher precedence.

(import [infix [parenthesize]])

(print (parenthesize "- a ^ 2"))
(print (parenthesize "- (a ^ 2)"))
((((-  a )**( 2))))
((((-  ((((a )**( 2))))))))

To get the right thing, you need to use parentheses. Sometimes I do that in real code anyway to make sure what I want to happen does. Maybe some of this can be fixed in our parser function. Probably for another day.
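
Since the parenthesized strings are also valid Python, we can check the unary minus case directly. This is a small sketch; the evaluate helper is hypothetical and just wraps Python's eval with explicit variable bindings.

```python
def evaluate(expression, **names):
    """Evaluate a Python expression string with the given variable bindings."""
    return eval(expression, {}, dict(names))


# Python's own precedence: exponentiation binds tighter than unary minus.
print(evaluate("-a ** 2", a=3))

# The parenthesized output for "- a ^ 2" groups the minus with a first.
print(evaluate("((((-  a )**( 2))))", a=3))
```

The first expression gives -(a^2) and the second gives (-a)^2, which is the difference described above.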

Getting towards an infix notation for hy

| categories: hylang | tags:

Engineers need infix notation. It's a bold statement I know, but I am an engineer, teach engineers, and write a fair number of mathematical programs. Your typical engineer is not a programmer, and just wants to write an equation the way we would write it on paper. It is hard to undo 20+ years of education on that point! So, here we consider how to adapt hy to use infix notation.

In a recent post gilch suggested using strings with the builtin python eval function. There are some potential downsides to that approach including the overhead of byte-compiling each time it is eval'd, but the payoff is operator precedence, and doing it like you would do it in Python.

1 using strings

UPDATE: Thanks to some help from Jiege Chen I updated this section to solve the namespace issues previously discussed. That resulted in quite a bit of improvement. Thanks Jiege!

Here is one implementation.

(def py-eval (get __builtins__ "eval"))

And how to use it.

(import [infix [*]])

(print (py-eval "2+3*5"))

(import [numpy :as np])
(print (py-eval "2 * np.exp(np.pi / 2)"))
17
9.62095476193
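
The namespace question is easier to see in plain Python, where eval takes the namespace as an explicit argument. The sketch below is an illustration, not the code in the infix module; it uses the standard math module in place of numpy so that it runs anywhere, and the py_eval signature shown here (with a namespace parameter) is an assumption.

```python
import math


def py_eval(expression, namespace):
    """Evaluate a Python expression string using only the names in NAMESPACE."""
    return eval(expression, dict(namespace))


print(py_eval("2+3*5", {}))
print(py_eval("2 * math.exp(math.pi / 2)", {"math": math}))

# A name that is not in the namespace is simply not found.
try:
    py_eval("x + 2", {})
except NameError as error:
    print(error)
```

This is the same kind of failure you get when the string is evaluated in a namespace other than the one where the variables were defined.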

We can eliminate the need for quotes (") with the stringify code we previously developed.

(import [serialize [*]])
(import [infix [*]])

(print (py-eval (stringify `(2+3*5))))
(print (py-eval (stringify `(2 + 3 * 5))))

(import [numpy :as np])
(print (py-eval (stringify `(2 * np.exp(np.pi / 2)))))
17
17
9.62095476193

Let's just take that one more step with a new reader macro to tighten the syntax up. A critical feature of this reader macro is that it expands to code evaluated in the namespace where it is used. Nothing gets evaluated in the macro. That occurs in another namespace, where most things in a script are not available.

(defreader p [code]
  `(do
    (import [serialize [stringify]])
    (import [infix [py-eval]])
    (py-eval (stringify ~code))))

(defmacro py [code]
  `(do
    (import [serialize [stringify]])
    (import [infix [py-eval]])
    (py-eval (stringify ~code))))

Now we can use it like this. We have to require the infix module to get the reader macro. It seems unfortunate to me we still have to quote the code. Later I show an example where that isn't necessary, so there must be some subtle difference I have not found yet.

;; we have to require to get the reader macro
(require infix)

(import [numpy :as np])
(print #p`(2 + 3 * 5))
(print #p`((2 + 3) * 5))
(print #p`(1 + 1 * np.exp(7)))

(setv x 5)
(print #p`(x + 2))

(print #p`(1 + 1 * np.exp(1e-15)))
;; note the real python syntax with commas.
;; also note the extra parens around 1e-5
(print #p`(1 + np.linspace((1e-5), 1, 5)))

; The 1e-5 gets mangled to 1e-5 in this example
; (print #p`(1 + np.linspace(1e-5, 1, 5)))

;; Here is the macro form. It is about as easy to write.
(print (py `(1 + np.linspace((1e-5), 1, 5))))
17
25
1097.63315843
7
2.0
[ 1.00001    1.2500075  1.500005   1.7500025  2.       ]
[ 1.00001    1.2500075  1.500005   1.7500025  2.       ]

Lots of things seem to work! Let's look into some other solutions that do not rely on the builtin eval.

2 Infix to prefix using code manipulation

This solution is inspired by https://sourceforge.net/p/readable/wiki/Solution/ , but probably isn't a full implementation. We will first develop a function to convert infix notation to prefix notation. This function is recursive to deal with nested expressions. So far it doesn't seem possible to recurse with macros (at least, I cannot figure out how to do it). We tangle this function to infix.hy so we can use it later.

It will have some limitations though:

  1. No operator precedence. We will use parentheses for precedence.
  2. Lisp syntax means 3+4 is not the same as 3 + 4. The first is interpreted as a name. So we will need spaces to separate everything.
(try
 (print (3+4))
 (except [e Exception]
   (print e)))

(print (+ 3 4))
name '3+4' is not defined
7

So, here is our infix function. Roughly, the function takes a CODE argument. If the CODE is iterable, it is a list of symbols, and we handle a few cases:

  • If it is a string, we return it.
  • If it has a length of one and is an expression, we recurse on it; otherwise we return the symbol.
  • If it has a length of two, we assume a unary operator and recurse on each element.
  • If there are three elements, we take the middle one as the operator, and switch it with the first element.
  • Otherwise we switch the first and second elements, and recurse on the rest of the list.
  • If it is not iterable, we just return the element.

Two optional arguments provide some debug support to print what is happening.

(import [serialize [*]])

(defn nfx [code &optional [indent 0] [debug False]]
  "Transform the CODE expression to prefix notation.
We assume that CODE is in infix notation."
  (when debug (print (* " " indent) "code: " code " type: " (type code)))
  (cond
   [(coll? code)
    (cond

     ;; treat lists in [] special
     [(and (instance?  hy.models.list.HyList code)
           (not (instance?  hy.models.expression.HyExpression code)))
      (when debug (print "list: " code " type: " (type code)))
      code]

     [(= 1 (len code))
      ;; element is an Expression
      (when debug (print (* " " indent) "1: " code))
      (if (isinstance (car code) hy.models.expression.HyExpression)
        (nfx (car code) (+ indent 1) debug)
        ;; single element
        (car code))]

     ;; {- 1} ->  (- 1)
     [(= 2 (len code))
      (when debug (print (* " " indent) "2: " code))
      `(~(nfx (get code 0) (+ indent 1) debug)
         ~(nfx (get code 1) (+ indent 1) debug))]

     ;; {1 + 2} -> (+ 1 2)
     [(= 3 (len code))
      (when debug (print (* " " indent) "3: " code))
      `(~(get code 1)
         ~(nfx (get code 0) (+ indent 1) debug)
         ~(nfx (get code 2) (+ indent 1) debug))]

     ;; longer expression, swap first two and take the rest.
     [true
      (when debug (print "expr: " code))
      `(~(nfx (get code 1) (+ indent 1) debug)
         ~(nfx (get code 0) (+ indent 1) debug)
         (~@(nfx (cut code 2) (+ indent 1) debug)))])]

   ;; non-iterable just gets returned
   [true
    (when debug (print (* " " indent) "true: " code))
    code]))

Now, for some tests. First, an example with debug turned on so we can see what happens.

(import [infix [*]])
(print (nfx `(1 + (3 * 4)) :debug True))
 code:  (1L u'+' (3L u'*' 4L))  type:  <class 'hy.models.expression.HyExpression'>
 3:  (1L u'+' (3L u'*' 4L))
  code:  1  type:  <class 'hy.models.integer.HyInteger'>
  true:  1
  code:  (3L u'*' 4L)  type:  <class 'hy.models.expression.HyExpression'>
  3:  (3L u'*' 4L)
   code:  3  type:  <class 'hy.models.integer.HyInteger'>
   true:  3
   code:  4  type:  <class 'hy.models.integer.HyInteger'>
   true:  4
(u'+' 1L (u'*' 3L 4L))

You can see we return a list of symbols, and the result is not evaluated. Now for some more thorough tests. I use a little helper function here to show the input and output.

(import [infix [*]])
(import [serialize [stringify]])

(defn show [code]
  (print (.format "{0} -> {1}\n"
                  (stringify code)
                  (stringify (nfx code)))))

(show 1)
(show `(1))
(show `(- 1))
(show `((1)))
(show `(- (2 + 1)))

(show `(2 ** 4))
(show `(3 < 5))

(show `(1 + 3 * 5 + 6 - 9))
(show `((1 + (1 + 2)) * 5 + 6 - 9))
(show `(1 + 1 * (5 - 4)))
(show `(1 + 1 * (np.exp (17 - 10))))

; Note this one does not work right.
(show `(1 + (np.linspace 1e-5  1 5)))

(show `(x + long-name)) ; note name mangling occurs.

(show `(1 + 1 + 1 + 1 + 1))
1 -> 1

(1) -> 1

(- 1) -> (- 1)

((1)) -> 1

(- (2 + 1)) -> (- (+ 2 1))

(2 ** 4) -> (** 2 4)

(3 < 5) -> (< 3 5)

(1 + 3 * 5 + 6 - 9) -> (+ 1 (* 3 (+ 5 (- 6 9))))

((1 + (1 + 2)) * 5 + 6 - 9) -> (* (+ 1 (+ 1 2)) (+ 5 (- 6 9)))

(1 + 1 * (5 - 4)) -> (+ 1 (* 1 (- 5 4)))

(1 + 1 * (np.exp (17 - 10))) -> (+ 1 (* 1 (np.exp (- 17 10))))

(1 + (np.linspace 1e-05 1 5)) -> (+ 1 (1e-05 np.linspace (1 5)))

(x + long_name) -> (+ x long_name)

(1 + 1 + 1 + 1 + 1) -> (+ 1 (+ 1 (+ 1 (+ 1 1))))

Apart from the np.linspace case, those all look reasonable I think. The last case could be simplified, but it would take some logic to make sure all the operators are the same, and to handle the case where any of the operands are expressions. We save that for later.
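
The transformation is easy to experiment with in Python too. Here is a sketch of a port that works on nested Python lists instead of Hy expressions, with strings standing in for symbols. It leaves out the debug printing and the special case for [] lists, so it is an approximation of the function above rather than an exact translation.

```python
def nfx(code):
    """Convert a nested infix list to prefix form, following the rules above."""
    if not isinstance(code, list):
        return code  # atoms are returned unchanged
    if len(code) == 1:
        # unwrap a single nested expression, otherwise return the element
        return nfx(code[0]) if isinstance(code[0], list) else code[0]
    if len(code) == 2:
        # assumed to be a unary operator and its operand
        return [nfx(code[0]), nfx(code[1])]
    if len(code) == 3:
        # [a, op, b] -> [op, a, b]
        return [code[1], nfx(code[0]), nfx(code[2])]
    # longer expression: swap the first two and recurse on the rest
    return [nfx(code[1]), nfx(code[0]), nfx(code[2:])]


print(nfx([1, "+", 3, "*", 5, "+", 6, "-", 9]))
print(nfx([1, "+", ["np.linspace", 1e-05, 1, 5]]))
```

The second call reproduces the np.linspace problem: a function call with four elements is treated as an infix expression and its first two elements are swapped.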

Now, we illustrate that the output code can be evaluated. Since we expand to code, we don't seem to have the namespace issues since the code is executed in our script.

(import [infix [*]])

(print (eval (nfx `(1 + 1 * (5 - 4)))))

(import [numpy :as np])
(print (eval (nfx `(1 + 1 * (np.exp (17 - 10))))))
2
1097.63315843

That syntax is not particularly nice, so next we build up a macro, and a new reader syntax. First, the macro.

(defmacro $ [&rest code]
  "Eval CODE in infix notation."
  `(do
    (import infix)
    (eval (infix.nfx ~code))))

Now we can use the simpler syntax here. It seems we still have to quote the math to prevent it from being evaluated (which causes an error).

(import infix)
(require infix)

(print ($ `(1 + 1 * (5 - 4))))

(import [numpy :as np])
(print ($ `(1 + 1 * (np.exp (17 - 10)))))
2
1097.63315843

For the penultimate act, we introduce a new syntax for this. In the sweet expression syntax we would use {} for this, but this isn't currently possible for hylang, and is also used for dictionaries. We define a reader macro for this.

(defreader $ [code]
  (import infix)
  (infix.nfx code))

(defreader P [code]
  `(do (import infix)
       (eval (infix.nfx ~code))))
(import [infix [*]])
(require infix)

(import [numpy :as np])

(print #$(- 1))

(print #$(- (2 + 1)))

(print #$(2 ** 4))
(print #$(3 < 5))

(print #$(1 + 3 * 5 + 6 - 9))
(print #$((1 + (1 + 2)) * 5 + 6 - 9))
(print #$(1 + 1 * (5 - 4)))
(print #$(1 + 1 + 1 + 1 + 1))

;; we still have to be lispy with function calls (func args)
(print #$(1 + 1 * (np.exp (17 - 10))))

(setv a 3 t 6)
(print #$(a + t))

(setv long-a 5 long-b 6)
(print #$(long-a + long-b))

;; this fails because the linspace should not get unfixed. This is a bug in
;; our implementation

;; (print #P`(1 + (np.linspace 1e-5  1 5)))
-1
-3
16
True
7
8
2
5
1097.63315843
9
11

Mostly wonderful! We get variables passed through, and the name-mangling doesn't seem to matter. Note we don't have to quote this code. I think it is because in this reader macro we do not return code, but actually evaluate it. And somehow it works.

There is an issue with (print #$(1 + (np.linspace 1e-5 1 5))). The linspace call gets unfixed, which is wrong. There are some ways we could deal with that. One might be to only unfix known operators. Another might be some escape syntax that indicates not to unfix certain lists. For another day (TM).

(import [infix [*]])
(print (nfx `(1 + (np.linspace 1e-5  1 5)) :debug True))
 code:  (1L u'+' (u'np.linspace' 1e-05 1L 5L))  type:  <class 'hy.models.expression.HyExpression'>
 3:  (1L u'+' (u'np.linspace' 1e-05 1L 5L))
  code:  1  type:  <class 'hy.models.integer.HyInteger'>
  true:  1
  code:  (u'np.linspace' 1e-05 1L 5L)  type:  <class 'hy.models.expression.HyExpression'>
expr:  (u'np.linspace' 1e-05 1L 5L)
   code:  1e-05  type:  <class 'hy.models.float.HyFloat'>
   true:  1e-05
   code:  np.linspace  type:  <class 'hy.models.symbol.HySymbol'>
   true:  np.linspace
   code:  (1L 5L)  type:  <class 'hy.models.expression.HyExpression'>
   2:  (1L 5L)
    code:  1  type:  <class 'hy.models.integer.HyInteger'>
    true:  1
    code:  5  type:  <class 'hy.models.integer.HyInteger'>
    true:  5
(u'+' 1L (1e-05 u'np.linspace' (1L 5L)))

See, the linspace call is out of order.
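
As a rough sketch of the first idea (only unfix known operators), here is how it might look on nested Python lists, with strings standing in for symbols. Everything here is hypothetical: the OPERATORS set, the function names, and the rule that a list whose second element is not a known operator is a function call. It has not been tried in the Hy implementation.

```python
# Assumed set of infix operators; anything else is treated as a function name.
OPERATORS = {"+", "-", "*", "/", "**", "<", ">"}


def is_operator(item):
    return isinstance(item, str) and item in OPERATORS


def nfx_known_ops(code):
    """Convert infix to prefix, but only reorder around known operators."""
    if not isinstance(code, list) or not code:
        return code
    if len(code) == 1:
        return nfx_known_ops(code[0])
    if len(code) >= 3 and is_operator(code[1]):
        if len(code) == 3:
            return [code[1], nfx_known_ops(code[0]), nfx_known_ops(code[2])]
        # longer expression: same right-nesting as the original function
        return [code[1], nfx_known_ops(code[0]), nfx_known_ops(code[2:])]
    # unary operator or function call: keep the head, convert the arguments
    return [code[0]] + [nfx_known_ops(arg) for arg in code[1:]]


print(nfx_known_ops([1, "+", ["np.linspace", 1e-05, 1, 5]]))
print(nfx_known_ops([1, "+", 1, "*", ["np.exp", [17, "-", 10]]]))
```

With this rule the linspace call stays in order, at the cost of having to maintain the list of operators. A call whose first argument happens to be an operator symbol would still be misread.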

3 The final test

For the final act, we use infix notation in a real problem we posed before.

3.1 with the string reader

We almost get away with exactly what we would have done in Python. The only thing was we had to put a space between - and x to avoid a mangling issue that turned -x into _x. I feel like that might be a fixable issue.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx (py `(1.0 / x**2 * (- x * z - (x**2 - nu**2) * y))))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix-s.png")
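To show why a string-based reader gets Python's precedence for free, here is a small sketch in plain Python. It is not the py macro used above (I am not reproducing its implementation here). It is just an illustration that uses Python's own ast module to parse the infix string and print the equivalent prefix form. The names OPS, to_prefix and infix_to_prefix are made up for this example, and it assumes Python 3.8 or later, where numbers parse to ast.Constant.

```python
import ast

# Map the Python operator node types we care about to lisp operator names.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "**"}


def to_prefix(node):
    """Recursively render a parsed Python expression as a prefix (lisp-style) string."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp):
        return "({} {} {})".format(
            OPS[type(node.op)], to_prefix(node.left), to_prefix(node.right)
        )
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return "(- {})".format(to_prefix(node.operand))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError("unsupported syntax: {}".format(type(node).__name__))


def infix_to_prefix(expr):
    """Parse EXPR with Python's grammar, so precedence is resolved by the parser."""
    return to_prefix(ast.parse(expr, mode="eval"))


print(infix_to_prefix("1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"))
# (* (/ 1.0 (** x 2)) (- (* (- x) z) (* (- (** x 2) (** nu 2)) y)))
```

The output has the same shape as the hand-written prefix version in the comment above, except that the negation is written as (- x) instead of (* -1 x).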

3.2 with #$ reader

This version is also somewhat close to the Python syntax, but it needs a lot more parentheses to get the right precedence, and spaces between almost everything for the lisp syntax, i.e. x**2 is a name, and (x ** 2) is the infix notation for exponentiation.

(import [numpy :as np])
(import [scipy.integrate [odeint]])
(import [scipy.special [jn]])
(import [matplotlib.pyplot :as plt])

(import [infix [*]])
(require infix)

(defn fbessel [Y x]
  "System of 1st order ODEs for the Bessel equation."
  (setv nu 0.0
        y (get Y 0)
        z (get Y 1))

  ;; define the derivatives
  (setv dydx z
        ;; the Python way is: "1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)"
        dzdx #$((1.0 / (x ** 2)) * ((- x) * z) - (((x ** 2) - (nu ** 2)) * y)))
  ;; Here is what it was with prefix notation
  ;; dzdx (* (/ 1.0 (** x 2)) (- (* (* -1 x) z) (* (- (** x 2) (** nu 2)) y))))
  ;; return derivatives
  [dydx dzdx])

(setv x0 1e-15
      y0 1.0
      z0 0.0
      Y0 [y0 z0])

(setv xspan (np.linspace 1e-15 10)
      sol (odeint fbessel Y0 xspan))

(plt.plot xspan (. sol [[Ellipsis 0]]) :label "Numerical solution")
(plt.plot xspan (jn 0 xspan) "r--" :label "Analytical solution")
(plt.legend :loc "best")

(plt.savefig "bessel-infix.png")

That worked pretty well. This feels like an improvement for writing engineering programs in lisp!

Copyright (C) 2016 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.10


Writing hy code from hy code

| categories: hylang | tags:

Here is one of the main reasons I am interested in a lisp for programming. I want to write programs that write programs. In Python, I have ended up doing things like this where we build up a script with string formatting and manipulation, write it to a file, and run it later or somewhere else. We need this because we run a lot of our calculations through a queue system which runs asynchronously from the work we do in an editor.

import os
for x in [1, 2, 3]:
    fname = 'p{0}.py'.format(x)

    program = '''#!/usr/bin/env python
def f(x):
    return x**{0}

import sys
print f(float(sys.argv[1]))'''.format(x)

    with open(fname, 'w') as f:
        f.write(program)

    os.chmod(fname, 0o755)

You can then call these at the command line like this:

./p2.py 3
./p3.py 3
9.0
27.0

That is not too bad because the script is simple, but the approach has real drawbacks. It is tedious to keep the indentation right. It is not always easy to keep track of the arguments, even with numbered indexes or names in the format string. There is limited logic you can use in the arguments (e.g. no if/elif/else). And you lose all the value of having an editor in Python mode: no syntax highlighting, eldoc, code completion, or automatic indentation. I don't like it, but it gets the job done.

Lisps allow you to treat code like data, in an editor in lisp-mode, so they should be ideal for this kind of thing. Here we look at getting that done with hy. For the simplest forms, we convert each element of the code to a string and join the results, which can then be written to a file. We probably got lucky here that the objects in the expression all print in a simple form that allows us to reconstruct the code. You can also see some aspects of Python peeking through the hy implementation: in data/quoted mode, the atoms in the list are not all simple symbols. By the time the program runs, they have been transformed into objects of various types that need to be handled separately.

(setv program `(+ 4 5))
(print (+ "(" (.join " " (list-comp (str x) [x program])) ")"))
(print (list-comp (type x) [x program]))
(+ 4 5)
[<class 'hy.models.symbol.HySymbol'>, <class 'hy.models.integer.HyInteger'>, <class 'hy.models.integer.HyInteger'>]

Real programs are not this simple, and we need to handle nested expressions and other types of objects. Consider this program. It has many different types in it, and they don't all get represented by the right syntax when printed (i.e. with (repr object)).

(let [program `(list {"a" 1 "b" 3} "b" 3 3.0 [1 1 2] :keyword (lambda [x] (* x 3)))]
  (print (list-comp (type x) [x program]))
  (for [x program] (print (.format "{0!r}" x))))
[<class 'hy.models.symbol.HySymbol'>, <class 'hy.models.dict.HyDict'>, <class 'hy.models.string.HyString'>, <class 'hy.models.integer.HyInteger'>, <class 'hy.models.float.HyFloat'>, <class 'hy.models.list.HyList'>, <class 'hy.models.keyword.HyKeyword'>, <class 'hy.models.expression.HyExpression'>]
u'list'
{u'a' 1L u'b' 3L}
u'b'
3L
3.0
[1L 1L 2L]
u'\ufdd0:keyword'
(u'lambda' [u'x'] (u'*' u'x' 3L))

Next we make a recursive function to handle some of these. It is recursive so that it can handle nested expressions. Here are the things in hy.models that might need special treatment. We make sure to wrap expressions in (), lists in [], dictionaries in {}, and strings in "". Keywords have a unicode character put in front of them, so we cut that off. Everything else seems to be ok to just convert to a string. This function gets tangled to serialize.hy so it can be used in subsequent code examples.

(import hy)

(defn stringify [form &optional debug]
  "Convert a FORM to a string."
  (when debug (print (.format "{0}: {1}" form (type form))))
  (cond
   [(isinstance form hy.models.expression.HyExpression)
    (+ "(" (.join " " (list-comp (stringify x debug) [x form])) ")")]
   [(isinstance form hy.models.dict.HyDict)
    (+ "{" (.join " " (list-comp (stringify x debug) [x form])) "}")]
   [(isinstance form hy.models.list.HyList)
    (+ "[" (.join " " (list-comp (stringify x debug) [x form])) "]")]
   [(isinstance form hy.models.symbol.HySymbol)
    (.format "{}" form)]
   [(isinstance form hy.models.keyword.HyKeyword)
    ;; these have some unicode prefix I want to remove
    (.format "{}" (cut form 1))]
   [(or (isinstance form hy.models.string.HyString)
        (isinstance form unicode))
    (.format "\"{}\"" form)]
   [true
    (.format "{}" form)]))
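For readers more at home in Python, here is a rough sketch of the same recursive idea using plain Python objects instead of hy.models. Everything in it is an assumption made for illustration: Symbol is a hypothetical stand-in for a lisp symbol, tuples play the role of expressions, and to_sexp is a made-up name. Unlike the stringify above, it also escapes double quotes and backslashes inside strings, which I have not handled in the hy version.

```python
class Symbol(str):
    """Hypothetical stand-in for a lisp symbol: a name that prints without quotes."""


def to_sexp(form):
    """Recursively render a nested Python structure as lisp-style source text."""
    # Symbol must be tested before str, because Symbol is a subclass of str.
    if isinstance(form, Symbol):
        return str(form)
    if isinstance(form, str):
        # Escape backslashes and double quotes so the text can be read back.
        return '"' + form.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(form, tuple):  # expression -> ( ... )
        return "(" + " ".join(to_sexp(x) for x in form) + ")"
    if isinstance(form, list):  # list -> [ ... ]
        return "[" + " ".join(to_sexp(x) for x in form) + "]"
    if isinstance(form, dict):  # dict -> { key value ... }
        pairs = (to_sexp(k) + " " + to_sexp(v) for k, v in form.items())
        return "{" + " ".join(pairs) + "}"
    return str(form)  # numbers and anything else


S = Symbol
program = (S("defn"), S("f"), [S("x")], (S("get"), {"a": 1, "b": 3.0}, "b"))
print(to_sexp(program))
# (defn f [x] (get {"a" 1 "b" 3.0} "b"))
```

The order of the type checks matters here for the same reason it does in the hy version: the more specific type has to be tested before the type it inherits from.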

Now, some examples. These cover most of what I can imagine coming up.

(import [serialize [stringify]])  ;; tangled from the block above

;; some examples that cover most of what I am doing.
(print (stringify `(+ 5 6.0)))
(print (stringify `(defn f [x] (* 2 x))))
(print (stringify `(get {"a" 1 "b" 3} "b")))
(print (stringify `(print (+ 4 5 (* 6 7)))))
(print (stringify `(import [numpy :as np])))
(print (stringify `(import [scipy.optimize [fsolve]])))
(print (stringify `(set [2 2 3])))
(print (stringify `(complex 4 5)))
(print (stringify `(cons 4 5)))
(+ 5 6.0)
(defn f [x] (* 2 x))
(get {"a" 1 "b" 3} "b")
(print (+ 4 5 (* 6 7)))
(import [numpy :as np])
(import [scipy.optimize [fsolve]])
(set [2 2 3])
(complex 4 5)
(cons 4 5)

Those all look promising. Maybe it looks like nothing happened. Something did happen! We took code that was quoted (and hence like a list of data), and converted it into a string representation of the code. Now that we have a string form, we can do things like write it to a file.

Next, we add a function that can write that to an executable script.

(defn scriptify [form fname]
  (with [f (open fname "w")]
        (.write f "#!/usr/bin/env hy\n")
        (.write f (stringify form)))
  (import os)
  (os.chmod fname 0o755))

Here is an example:

(import [serialize [stringify scriptify]])

;; make functions
(for [x (range 1 4)]
  (scriptify
   `(do
     (import sys)
     (defn f [x]
       (** x ~x))
     (print (f (float (get sys.argv 1)))))
   ;; fname to write to
   (.format "h{}.hy" x)))

Here is the proof those programs got created.

ls h[0-9].hy
echo
cat h1.hy
h1.hy
h2.hy
h3.hy

#!/usr/bin/env hy
(do (import sys) (defn f [x] (** x 1)) (print (f (float (get sys.argv 1)))))

The code is all on one line, which doesn't matter for hy. If it didn't occur to you yet, we could take those strings and send them over the internet so they could get executed remotely. They are one read-str and eval away from being lisp code again. Yes, there are security concerns with that. It is also an amazing way to get something done.

(import [serialize [*]])
(print (eval (read-str (stringify `(+ 4 5)))))
9
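On the security concern: one way to limit the risk is to not hand the string to a general eval at all, and instead read it into nested lists and only evaluate forms you have explicitly allowed. Here is a minimal sketch of that idea in plain Python. It is not hy's read-str or eval, it only understands numbers and a few arithmetic operators, and the names ALLOWED, tokenize, parse and evaluate are all made up for this example.

```python
import operator
from functools import reduce

# Only these operators may be called; anything else is rejected.
ALLOWED = {"+": operator.add, "-": operator.sub, "*": operator.mul, "**": operator.pow}


def tokenize(text):
    """Split source text into parens and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists (consumes tokens from the front)."""
    token = tokens.pop(0)
    if token == "(":
        form = []
        while tokens[0] != ")":
            form.append(parse(tokens))
        tokens.pop(0)  # drop the closing paren
        return form
    try:
        return int(token)
    except ValueError:
        try:
            return float(token)
        except ValueError:
            return token  # a name, e.g. an operator


def evaluate(form):
    """Evaluate a parsed form, allowing only whitelisted operators on numbers."""
    if isinstance(form, (int, float)):
        return form
    if (isinstance(form, list) and len(form) > 1
            and isinstance(form[0], str) and form[0] in ALLOWED):
        args = [evaluate(x) for x in form[1:]]
        return reduce(ALLOWED[form[0]], args)  # left fold over the arguments
    raise ValueError("form not allowed: {!r}".format(form))


print(evaluate(parse(tokenize("(+ 4 5)"))))          # 9
print(evaluate(parse(tokenize("(+ 4 5 (* 6 7))"))))  # 51
```

Something like (import os) is refused with a ValueError instead of being run. The price is that you only get the small language you chose to allow, which defeats some of the purpose of sending arbitrary programs around.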

We can run those programs at the command line:

hy h2.hy 10
hy h3.hy 10
100.0
1000.0

Now for a more realistic test. I make some scripts related to the kinds of molecular simulation we do. These scripts just set up a model of bulk Cu or Pt, and print the generated object. In a real application we would compute something from this object.

(import [serialize [stringify scriptify]])

(for [element ["Cu" "Pt"]]
  (scriptify `(do (import [ase.lattice [bulk]])
                  ;; we have to str the element to avoid a unicode error
                  ;; ase does not do unicode.
                  (setv atoms (bulk (str ~element) :a 4.5 :cubic True))
                  (print atoms))
             (.format "{}.hy" element)))

Here is what one of those scripts looks like:

cat Pt.hy
#!/usr/bin/env hy
(do (import [ase.lattice [bulk]]) (setv atoms (bulk (str "Pt") :a 4.5 :cubic True)) (print atoms))

Note the comments are not in the generated script. Evidently the hy reader discards them, so they are not even elements of the quoted form. We can run this at the command line too. If this script did an actual calculation, we would have a mechanism to generate simulation scripts that run calculations and output the results we want!

hy Pt.hy
Atoms(symbols='Pt4', positions=..., cell=[4.5, 4.5, 4.5], pbc=[True, True, True])

So, we can write programs that write programs!

1 Serialize as compiled Python

It could be convenient to run the generated programs from Python instead of hy. Here we consider how to do that. I adapted this code from hy.importer.write_hy_as_pyc.

(import [hy.importer :as hi])
(import [hy._compat [PY3 PY33 MAGIC wr_long long_type]])
(import marshal)
(import os)

(defn hy2pyc [code fname]
  "Write CODE as Python compiled byte-code in FNAME."

  (setv program (stringify code))

  (setv _ast (hi.import_buffer_to_ast
              program
              "main"))

  (setv code (hi.ast_compile _ast "<string>" "exec"))

  ;; create file and close it so we get the size
  (with [f (open fname "wb")] nil)
  (with [f (open fname "wb")]
        (try
         (setv st (os.fstat (f.fileno)))
         (except [e AttributeError]
           (setv st (os.stat fname))))
        (setv timestamp (long_type (. st st_mtime))))
  (with [fc (open fname "wb")]
        (if PY3
          (.write fc b"\0\0\0\0") ; I am not sure this is right in hy with b""
          (.write fc "\0\0\0\0"))
        (wr_long fc timestamp)
        (when PY33
          (wr_long fc st.st_size))
        (.dump marshal code fc)
        (.flush fc)
        (.seek fc 0 0)
        (.write fc MAGIC)))
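The header written above follows the older .pyc layout that hy.importer targeted at the time. As far as I know, CPython 3.7 and later use a 16-byte header instead: the magic number, a 4-byte flags field, the source mtime, and the source size. Here is a sketch of the same idea in plain Python under that assumption. It compiles Python source text rather than hy code, and the function name source_to_pyc and the output filename are made up for this example.

```python
import importlib.util
import marshal
import os
import struct
import tempfile
import time


def source_to_pyc(source, fname):
    """Compile Python SOURCE text and write it to FNAME as a timestamp-based .pyc."""
    code = compile(source, "<string>", "exec")
    header = (
        importlib.util.MAGIC_NUMBER                         # 4 bytes: interpreter magic
        + struct.pack("<I", 0)                              # 4 bytes: flags, 0 = timestamp-based
        + struct.pack("<I", int(time.time()) & 0xFFFFFFFF)  # 4 bytes: stand-in for source mtime
        + struct.pack("<I", len(source.encode("utf-8")) & 0xFFFFFFFF)  # 4 bytes: source size
    )
    with open(fname, "wb") as fc:
        fc.write(header)
        marshal.dump(code, fc)


# There is no source file on disk, so the mtime and size fields are only placeholders.
pyc_path = os.path.join(tempfile.gettempdir(), "main_sketch.pyc")
source_to_pyc("def f(x):\n    return x ** 3\n\nresult = f(4.0)\n", pyc_path)
```

Writing the whole header up front avoids the seek back to the start of the file to fill in the magic number. The resulting file should be runnable with python followed by its path, but it is tied to the interpreter version that wrote it, because the magic number changes between Python releases.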

Now for an example.

(import [serialize [*]])

(hy2pyc `(do
          (import sys)
          (defn f [x]
            (** x 3))
          (print (.format "Hy! {0}^3 is {1}."
                          (get sys.argv 1)
                          (f (float (get sys.argv 1))))))
          "main.pyc")

Now we can execute it like this.

python main.pyc 4
Hy! 4^3 is 64.0.

Well, that worked fine too!

2 Summary

In some ways this is similar to the string manipulation approach (they both generate programs after all), but there are these differences:

  1. We do not have the indentation issues of generating Python.
  2. The code is edited in hy-mode with full language support.
  3. Instead of formatting, and string replacements, you have to think of what is quoted and what is evaluated. I find that easier to think about than with strings.

There are perhaps some ways we could simplify this. In this post I added code to the built-in Python types so they could be represented as lisp code. We could add something like this to each of the hy.models objects so they can natively be represented as hy code. Technically, I think the repr functions on these should be used for that. On the other hand, this serialize code works fine, and lets me do what I want. It is pretty cool this is all possible!

Copyright (C) 2016 by John Kitchin. See the License for information about copying.
