Finding similar bibtex entries

| categories: bibtex, similarity | tags: | View Comments

A common task while writing scientific papers is citing previous research. I use org-ref extensively for that, and it makes it pretty easy to find similar references, e.g. that have common authors, or common keywords. It also lets me find similar articles in Web of Science or Scopus. Suppose that I have cited a particular paper, e.g. e boes-2016-neural-networ, and I want to find similar references to it that are already in my bibtex file, and similar by my definition. With org-ref I can easily search by keyword or author to find similar entries, but these are limited by what I search for, and they are not sorted. Today, I will explore the first step in a recommender system that calculates similarity, and provides a sorted list of candidates with the most relevant ones first.

The idea is to calculate some measure of similarity between the title of that reference, and the titles of other references in my bibtex file, and then sort them by similarity. This is the reference I want to find similar entries for:

Boes, J. R., Groenenboom, M. C., Keith, J. A., & Kitchin, J. R., Neural network and Reaxff comparison for Au properties, Int. J. Quantum Chem., 116(13), 979–987 (2016).

The first thing we do is read in our bibtex file, and print a representative entry.

import bibtexparser
from bibtexparser.bparser import BibTexParser

with open('/Users/jkitchin/Dropbox/bibliography/references.bib') as bibtex_file:
    parser = BibTexParser()
    bib_database = bibtexparser.load(bibtex_file, parser=parser)
    entries = bib_database.entries


{'author': 'Jaan Aarik and Aleks Aidla and V{\\"a}ino Sammelselg and Teet\nUustare', 'title': 'Effect of Growth Conditions on Formation of \\ce{TiO_2}-{II}\nThin Films in Atomic Layer Deposition Process', 'journal': 'Journal of Crystal Growth', 'volume': '181', 'number': '3', 'pages': '259 - 264', 'year': '1997', 'doi': '10.1016/S0022-0248(97)00279-0', 'link': '', 'issn': '0022-0248', 'ENTRYTYPE': 'article', 'ID': 'aarik-1997-effec-growt'}

Each entry is a dictionary containing the fields and their values. For this exploration, I will only consider similarities between titles. The next step is we find which entry corresponds to the reference we want to find similarities to.

ids = [e['ID'] for e in entries]
i = ids.index('boes-2016-neural-networ')

{'author': 'Jacob R. Boes and Mitchell C. Groenenboom and John A. Keith\nand John R. Kitchin', 'title': 'Neural Network and {Reaxff} Comparison for {Au} Properties', 'journal': 'Int. J. Quantum Chem.', 'volume': '116', 'number': '13', 'pages': '979-987', 'year': '2016', 'doi': '10.1002/qua.25115', 'link': '', 'issn': '1097-461X', 'keyword': 'Kohn-Sham density functional theory, neural networks, reactive\nforce fields, potential energy surfaces, machine learning', 'ENTRYTYPE': 'article', 'ID': 'boes-2016-neural-networ'}

It is best if we make the entry we want to find similarities to the first one, so here we swap the first and ith entries.

entries[0], entries[i] = entries[i], entries[0]

Now, we prepare the list of strings to get similarities for.

titles = [e.get('title', '') for e in entries]

We will use term frequency–inverse document frequency to get a vector that represents each title, and then use cosine similarity as a measure of similarity. Here is the place to note that I chose these, and could choose other ones too. Also, it is worth noting that in this measure of similarity I did not choose which keywords to measure similarity on.

The functionality for this is provided by sklearn. It has implemented functions for the algorithms above, and in just a few lines of code you get an array of tf-idf features to analyze. The array we get from our vectorizer contains normalized vectors, so we can get the cosine similarity just from a dot product of the vectors. The first row corresponds to the similarity of the first string to all the others. I want them sorted in descending order. The argsort function returns ascending order, so we use a trick to sort the negative of the similarity score which achieves that. There are certainly more advanced treatments of the text we could use by customizing the vectorizer, e.g. word stemming, but for now we neglect that.

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(titles)

cosine_similarities = (X * X.T).A[0]

related_docs_indices = (-cosine_similarities).argsort()

print('The top 10 recommendations for {} are:\n'.format(S[0]))
for i, j in enumerate(related_docs_indices[1:11]):
    print('{i}. {ID}: {title}, {author}\n'.format(i=i + 1, **entries[j]))

The top 10 recommendations for Neural Network and {Reaxff} Comparison for {Au} Properties are:

  1. behler-2010-neural: Neural network potential-energy surfaces for atomistic

simulations, J{\"o}rg Behler

  1. boes-2017-neural-networ: Neural Network Predictions of Oxygen Interactions on a Dynamic

{Pd} Surface, Jacob R. Boes and John R. Kitchin

  1. eshet-2010-ab: Ab Initio Quality Neural-Network Potential for Sodium, Hagai Eshet and Rustam Z. Khaliullin and Thomas D. K{\"u}hne

and J{\"o}rg Behler and Michele Parrinello

  1. behler-2014-repres-poten: Representing Potential Energy Surfaces By High-Dimensional

Neural Network Potentials, J Behler

  1. behler-2007-gener-neural: Generalized Neural-Network Representation of High-Dimensional

Potential-Energy Surfaces, J{\"o}rg Behler and Michele Parrinello

  1. artrith-2012-high: High-Dimensional Neural Network Potentials for Metal Surfaces:

A Prototype Study for Copper, Nongnuch Artrith and J{\"o}rg Behler

  1. behler-2015-const: Constructing High-Dimensional Neural Network Potentials: A

Tutorial Review, J{\"o}rg Behler

  1. artrith-2011-high: High-Dimensional Neural-Network Potentials for Multicomponent

Systems: Applications To Zinc Oxide, Nongnuch Artrith and Tobias Morawietz and J{\"o}rg Behler

  1. sosso-2012-neural-gete: Neural Network Interatomic Potential for the Phase Change

Material \ce{GeTe}, Gabriele C. Sosso and Giacomo Miceli and Sebastiano Caravati and J{\"o}rg Behler and Marco Bernasconi

  1. lorenz-2006-descr: Descriptions of Surface Chemical Reactions Using a Neural

Network Representation of the Potential-Energy Surface, S{\"o}nke Lorenz and Matthias Scheffler and Axel Gross

It is evident that this is showing other references containing the words "neural network"! I guess that is a little disappointing, since these would just as easily been narrowed down in org-ref. On the other hand, they are sorted and grouped, which would not happen in org-ref. This is a comparison of pretty short strings (just the titles), so maybe this would be much more interesting if abstracts were also included. Including authors would give a different set as well (I tried it, and got a bunch of my own references!).

I don't think it would be very difficult to get this into an Emacs selection tool, e.g. helm/ivy. Check this out:

import pycse.lisp


'(1592 1650 299 1751 103)'

That is a result that can be read directly by lisp, so we could simply write the code above as a shell script that takes an argument, and returns a list of indices to sort the candidates on. The alternative is to implement this in elisp, perhaps via a dynamic module if there is already a good C library for this. My sense is the Python libraries are more advanced in functionality.

This could have a number of other applications. Given some reference content, you could imagine finding emails that are similar to it, finding RSS entries that are similar to it, finding org headlines that are related, similar files, or similarity with any other set of strings that can be gathered, e.g. from Crossref or some other search, etc. I predict there will be more on these topics in the future!

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.7

Read and Post Comments

New publication in Journal of Physics Condensed Matter

| categories: news, publication | tags: | View Comments

The Atomic Simulation Environment is a powerful python library for setting up, running and analyzing molecular simulations. I have been using it and contributing to it since around 2002 when I used the ASE-2 version in Python 1.5! The new ase-3 version is much simpler to use, and much more powerful. This paper describes some of its design principles and capabilities. If you use ASE, please cite this paper!

  author =       {Ask Hjorth Larsen and Jens J{\o}rgen Mortensen and Jakob
                  Blomqvist and Ivano E Castelli and Rune Christensen and
                  Marcin Dułak and Jesper Friis and Michael N Groves and
                  Bj{\o}rk Hammer and Cory Hargus and Eric D Hermes and Paul C
                  Jennings and Peter Bjerre Jensen and James Kermode and John
                  R Kitchin and Esben Leonhard Kolsbjerg and Joseph Kubal and
                  Kristen Kaasbjerg and Steen Lysgaard and J{\'o}n Bergmann
                  Maronsson and Tristan Maxson and Thomas Olsen and Lars
                  Pastewka and Andrew Peterson and Carsten Rostgaard and Jakob
                  Schi{\o}tz and Ole Sch{\"u}tt and Mikkel Strange and Kristian
                  S Thygesen and Tejs Vegge and Lasse Vilhelmsen and Michael
                  Walter and Zhenhua Zeng and Karsten W Jacobsen},
  title =        {The Atomic Simulation Environment-A Python Library for Working
                  With Atoms},
  journal =      {Journal of Physics: Condensed Matter},
  volume =       29,
  number =       27,
  pages =        273002,
  year =         2017,
  url =          {},
  abstract =     {The atomic simulation environment (ASE) is a software package
                  written in the Python programming language with the aim of
                  setting up, steering, and analyzing atomistic simulations. In
                  ASE, tasks are fully scripted in Python. The powerful syntax
                  of Python combined with the NumPy array library make it
                  possible to perform very complex simulation tasks. For
                  example, a sequence of calculations may be performed with the
                  use of a simple 'for-loop' construction. Calculations of
                  energy, forces, stresses and other quantities are performed
                  through interfaces to many external electronic structure codes
                  or force fields using a uniform interface. On top of this
                  calculator interface, ASE provides modules for performing many
                  standard simulation tasks such as structure optimization,
                  molecular dynamics, handling of constraints and performing
                  nudged elastic band calculations.},

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.7

Read and Post Comments

New publication in Crystal Growth & Design

| categories: news, publication | tags: | View Comments

Usually, metal oxides grow in a single, most stable crystal structure at a particular set of conditions. For example, TiO2 grows in the rutile structure for a large range of pressure and temperature conditions, but under some conditions it can also grow in the anatase structure. In this paper we show that epitaxial stabilization can be used to influence which crystal structures are observed for the growth of tin oxide. Tin oxide is normally only observed in the rutile structure. We grew tin oxide as an epitaxial film on a poly-crystalline substrate of CoNb2O6 which has an α-PbO2 crystal structure. We found that both rutile and α-PbO2 structures could be found in the film, and that the structure correlated with the orientation of the underlying grains. In other words, the orientation of a substrate can influence the structure of an epitaxial film, enabling one to grow films in crystal structures that may be metastable, and unobtainable in bulk samples.

  author =       {Wittkamper, Julia and Xu, Zhongnan and Kombaiah, Boopathy and
                  Ram, Farangis and De Graef, Marc and Kitchin, John R. and
                  Rohrer, Gregory S. and Salvador, Paul A.},
  title =        {Competitive Growth of Scrutinyite ($\alpha$-PbO2) and Rutile
                  Polymorphs of \ce{SnO2} on All Orientations of Columbite
                  \ce{CoNb2O6} Substrates},
  journal =      {Crystal Growth \& Design},
  volume =       17,
  number =       7,
  pages =        {3929-3939},
  year =         2017,
  doi =          {10.1021/acs.cgd.7b00569},
  url =          {},
  eprint =       { },

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.7

Read and Post Comments

Overloading mathematical operators in elisp

| categories: elisp, emacs | tags: | View Comments

Table of Contents

In Python I am used to some simple idioms like this:

print([1, 2, 3] * 2)
print("ab" * 3)

[1, 2, 3, 1, 2, 3] ababab

There is even such fanciness as defining operators for objects, as long as they have the appropriate dunder methods defined:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "Point ({}, {})".format(self.x, self.y)

    def __mul__(self, a):
        return Point(self.x * a, self.y * a)

    def __rmul__(self, a):
        return Point(self.x * a, self.y * a)
p = Point(1, 1)
print(p * 2)
print(3 * p)

Point (2, 2) Point (3, 3)

Out of the box, these things are not possible in elisp. Operators like * in elisp only take numbers or markers. We have a few options to change this. The worst option is to simply redefine these functions. That is bad because it is not reversible. We could define new functions that have the behavior we want, but then we lose the semantic meaning of "*" that we were aiming for. A better option is to advise these functions. This is reversible, because you can later unadvise them. Today we look at some strategies to do this.

We will use "around" advise because it will let us bypass the original intent of the function when we want to, or use it when we do. First, we create a function that will be the advice and add it to the * function. This first draft won't actually change the behavior of *; if all the args are numbers or markers it will simply use the original function as before.

(require 'dash)

(defun *--*-around (orig-fun &rest args)
  "if every arg is a number do *, else do something else."
   ((-every? (lambda (x) (or (numberp x) (markerp x))) args)
    (apply orig-fun args))))

(advice-add '* :around #'*--*-around)

Let's just confirm

(* 1 2 3)

Now, we can start modifying our function to handle some other cases. Let's do the list and string first. The * function is variadic, but in these cases it makes sense to limit to two arguments. We need two cases for each type since we can write (* 2 list) or (* list 2). We also should create a fall-through case that raises an error to alert us we can't multiply things.

(defun *--*-around (orig-fun &rest args)
  "if every arg is a number do *, else do something else."
   ;; The original behavior
   ((-every? (lambda (x) (or (numberp x) (markerp x))) args)
    (apply orig-fun args))

   ;; create repeated copies of list
   ((and (listp (first args))
         (integerp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (second args) append (copy-list (first args))))

   ((and (integerp (first args))
         (listp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (first args) append (copy-list (second args))))

   ;; Make repeated string
   ((and (stringp (first args))
         (integerp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (second args) concat (first args)))

   ((and (integerp (first args))
         (stringp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (first args) concat (second args)))

    (error "You cannot * %s" args))))

Here is the new advice in action.

 (* '(a b) 2)
 (* 2 '(c d))
 (* 2 "ab")
 (* "cd" 2))
(a b a b) (c d c d) abab cdcd

That captures the spirit of overloading * for lists and strings. What about that object example? We have to make some assumptions here. Python looks for an uses a dunder mul method. We will assume a double dash method (–mul–) in a similar spirit. We have to modify the advice one final time. We just add a condition to check if one of the arguments is an eieio-object, and then call the –mul– function on the arguments.

(defun *--*-around (orig-fun &rest args)
  "if every arg is a number do *, else do something else."
   ;; The original behavior
   ((-every? (lambda (x) (or (numberp x) (markerp x))) args)
    (apply orig-fun args))

   ;; create repeated copies of list
   ((and (listp (first args))
         (integerp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (second args) append (copy-list (first args))))

   ((and (integerp (first args))
         (listp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (first args) append (copy-list (second args))))

   ;; Make repeated string
   ((and (stringp (first args))
         (integerp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (second args) concat (first args)))

   ((and (integerp (first args))
         (stringp (second args))
         (= 2 (length args)))
    (loop for i from 0 below (first args) concat (second args)))

   ;; Handle object
   ((or (and (eieio-object-p (first args))
             (numberp (second args)))
        (and (numberp (first args))
             (eieio-object-p (second args))))
    (apply '--mul-- args))

    (error "You cannot * %s" args))))

Now, we can define a class and the –mul– function and show that our overloaded * function works. Note we can define two signatures of –mul– so it is not necessary to define an –rmul– in this case as it was with Python (although we still create two functions in the end).

(require 'eieio)

(defclass Point ()
  ((x :initarg :x)
   (y :initarg :y)))

(cl-defmethod --mul-- ((p Point) a)
  (Point :x (* (oref p :x) a) :y (* (oref p :y) a)))

(cl-defmethod --mul-- (a (p Point))
  (Point :x (* (oref p :x) a) :y (* (oref p :y) a)))

(cl-defmethod --str-- ((p Point))
  (format "Point (%s, %s)" (oref p :x) (oref p :y)))

(let ((P (Point :x 1 :y 1)))
   (--str-- (* P 2))
   (--str-- (* 3 P))))
Point (2, 2) Point (3, 3)

That is pretty awesome. Before going on, here is how you remove the advice:

(advice-remove '* '*--*-around)

This example has been pretty instructive. You have to handle overloading for all the intrinsic types. We did lists and strings here; you might also consider vectors. For objects, it looks like we can at least try using a generic method like –mul–. One detail I neglected to consider here is that * is natively variadic. For these special cases, we did not implement variadic versions. This isn't a feature of Python which uses infix notation, so every call is with two arguments. In some cases it might make sense to support variadic args, but that seems like a generally challenging thing to do. While (* "a" 2 3) might be expected to create a string of "aaaaaa", (* "a" 2 '(3)) doesn't make sense at all.

It would be straightforward to extend this to other operators like '+ to concatenate strings, lists and vectors, or '- to remove chars or elements, including extensions to objects using double-dash functions like –add–, –subtract–, etc. Another nice idea might be to advise print to use –str– on objects.

On the surface this looks useful so far. Python defines a lot of dunder methods that cover all kinds of scenarios including logical comparisons, bit shifting, mod, incrementing operators, casting, comparisons, right/left operations, indexing and assignment, length and others. That would be a lot of advices. This approach is moderately tedious to expand though; you have to keep adding conditional cases.

An alternative to the big conditional statement used in the advice might be the use of a generic function. With this approach we define a generic function that just does multiplication by default. Then we define specific cases with specific signatures that are used for lists, strings, objects, etc. That is basically all our conditional above was doing, matching signatures and executing a chunk of code accordingly.

Here is our default case that does the original behavior. We still use advice to apply the function.

(cl-defgeneric generic-multiply (orig-fun &rest args)
  "Generic multiply for when no specific case exists."
  (apply orig-fun args))

(defun *--*-around-generic (orig-fun &rest args)
  (apply 'generic-multiply orig-fun args))

(advice-add '* :around #'*--*-around-generic)

That should just work as usual for regular multiplication.

(* 1 2 3 4)

Sure enough it does. Now, we can define a specific method for a string. We need a specialized method for each signature, e.g. pre and post multiplication.

(cl-defmethod generic-multiply ((orig-fun subr) (s string) (n integer))
  (loop for i from 0 below n concat s))

(cl-defmethod generic-multiply ((orig-fun subr) (n integer) (s string))
  (loop for i from 0 below n concat s))

 (* "Ac" 2)
 (* 2 "Ad"))
AcAc AdAd

That works fine, and we did not have to modify our original advice function at all! Next the list:

(cl-defmethod generic-multiply ((orig-fun subr) (L list) (n integer))
  (loop for i from 0 below n append (copy-list L)))

(cl-defmethod generic-multiply ((orig-fun subr) (n integer) (L list))
  (loop for i from 0 below n append (copy-list L)))

(list (* '(1 2) 2)
      (* 2 '(3 4)))
1 2 1 2
3 4 3 4

That also works fine. Last, our class example. This should work on all objects I think (unless there is some way to make classes that do not inherit the default superclass).

(cl-defmethod generic-multiply ((orig-fun subr) (n integer) (obj eieio-default-superclass))
  (--mul-- n obj))

(cl-defmethod generic-multiply ((orig-fun subr) (obj eieio-default-superclass) (n integer))
  (--mul-- n obj))

(let ((P (Point :x 1 :y 1)))
   (--str-- (* P 2))
   (--str-- (* 3 P))))
Point (2, 2) Point (3, 3)

This is a much better approach to extending the multiplication operator! If I continue this path in the future I would probably take this one. This could be useful to make elisp more like some more popular contemporary languages like Python, as well as to add linear algebra like notation or mathematical operations on objects in elisp. It kind of feels like these operations ought to be generic functions to start with to make this kind of overloading easier from the beginning. Functions like "*" are currently defined in the C source code though, maybe for performance reasons. It is not obvious what the consequences of making them generic might be.

1 Addendum

Christopher Wellons pointed out an important limitation of advice: they don't work on byte-compiled functions. Let's see what he means. Here is a simple function that will just multiply a Point object by an integer:

(defun to-be-bytten (p1 n)
  (* p1 n))

Here it is in action, and here it works fine.

(to-be-bytten (Point :x 1 :y 1) 2)
[eieio-class-tag--Point 2 2]

Now, let's byte-compile that function and try it again:

(byte-compile 'to-be-bytten)

(condition-case err
    (to-be-bytten (Point :x 1 :y 1) 2)
  ((error r)
   (message "Doh! Christopher was right. It did not work...\n%s" err)))
Doh! Christopher was right. It did not work...
(wrong-type-argument number-or-marker-p [eieio-class-tag--Point 1 1])

So the advice is pretty limited since most of the functions in Emacs core are likely to be byte-compiled, and it might mean you have to redefine * completely, or define some new function that looks like it. Too bad, the advice was pretty easy!

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.7

Read and Post Comments

Linear algebra in Emacs using MKL and dynamic modules

| categories: dynamic-module, emacs | tags: | View Comments

In a previous post I integrated some linear algebra into Emacs using the GNU Scientific library and a dynamic module. In this post, I use a similar approach that uses the Intel MKL library in conjunction with some helper elisp functions to mimic the array broadcasting features in Numpy. I thought this might be easier and lead to at least a complementary set of functionalities.

Note: I had to follow the directions here to disable some security feature on my Mac so that it would use the MKL libraries. Thanks Apple.

It is convenient to use vectors for the representation of arrays in Emacs because there are nice functions in the emacs-module.h for accessing vector properties. Also vectors sound closer to an array than a list. So what about array broadcasting, e.g. the way numpy lets you multiply a 2d array with a 1d array? If you multiply two arrays with size (m1, n1) * (m2, n2), it is required that the number of columns in the first array (n1) be equal to the number of rows in the second one (m2), and the resulting size of the array product will be (m1, n2). What should happen though when we have 1d array? This is neither a row or column vector itself, but we can treat as either one if we choose too. For example the vector [1 2 3] can be thought of as an array with the shape (1, 3), e.g. a single row with three columns, or (3, 1), i.e. three rows in a single column. We will build this capability into the module for convenience.

I still find it moderately tedious to write c functions that take emacs arguments, transform them to c arguments, do some c computations, and convert the results back to emacs values. So, we only implement one c function for this that multiplies two 2d arrays together using the cblas_dgemm routine in the MKL library. Then, we will create a complementary elisp library that will provide some additional functionality to get the shapes of vector arrays, dimensions, and allow us to multiply 1d and 2d vectors together the same way Numpy does array broadcasting.

The dynamic module code is listed in The module code. The elisp code is listed in Elisp helper functions. In the following sections we just demonstrate how to use the results.

1 Convenience functions to get array properties

I found it convenient to do array shape and dimension analysis in elisp before sending arrays to the dynamic module. The shape of an array is just the number of elements in each dimension. Here we look at a 2× 3 array.

(vector-shape [[1 2 3]
               [3 4 5]])
[2 3]

You see it returns a vector showing two rows and three columns. There are two convenience commands to get the number of rows (vector-nrows) and columns (vector-ncols). Here is one of them.

(vector-ncols [[1 2 3]
               [3 4 5]])

2 Matrix multiplication

The main problem we want to calculate is the matrix multiplication \(A\cdotB\) where \(A\) and \(B\) are either 1d vectors or 2d arrays. Here we examine several representative cases of matrix multiplication.

2.1 1d * 1d

This is a simple dot-product that is actually calculated in elisp.

\([1 1 1] \cdot [2 2 2] = 6\)

(matrix-multiply [1 1 1] [2 2 2])

Note we get a float. That is because we initialize the sum with 0.0 to be consistent with all the other cases which are done with floats. dgemm is a double routine in MKL, which means it should return floats. Internally in the module, we cast all numbers as doubles for the multiplication.

2.2 2d * 1d

This is a matrix multiplication that is typically like \(A b\) where \(b\) is a column vector. We return a 1d array as a result, rather than a 2d array of nrows and 1 column.

\[ \left[\begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right] \left [ \begin{array}{c} 1 \\ 1 \end{array}\right] = \left[\begin{array}{c}3\\7\end{array}\right]\]

(let ((A [[1 2]
          [3 4]])
      (b [1 1]))
  (matrix-multiply  A b))
[3.0 7.0]

2.3 1d * 2d

This case is \(b A\) where \(b\) is a row vector. For example:

\[\left[\begin{array}{cc}1 & 1\end{array}\right] \left[\begin{array}{cc} 1 & 2\\ 3 & 4\end{array}\right] = \left[\begin{array}{cc} 4 & 6 \end{array}\right ]\]

(matrix-multiply [1 1]
                 [[1 2]
                  [3 4]])
[4.0 6.0]

As with the previous case, we return a 1d vector result rather than a 2d array with 1 row and ncolumns.

2.4 2d * 2d

Finally we have the case of \(A B\). The number of columns in A must be the same as the number of rows in B, and the result has a size that is the number of rows in A and the number of columns in B. Here is one example:

\[\left[\begin{array}{cc} 0 & 1\\ 0 & 0\end{array}\right] \left[\begin{array}{cc} 0 & 0\\ 1 & 0\end{array}\right] = \left[\begin{array}{cc} 1 & 0\\ 0 & 0\end{array}\right] \]

(matrix-multiply [[0 1]
                  [0 0]]
                 [[0 0]
                  [1 0]])
[[1.0 0.0] [0.0 0.0]]

This example is adapted from here. The correct answer is at the bottom of that page, and shown here.

\[\left[\begin{array}{cccc} 1 & 2 & -2 & 0 \\ -3 & 4 & 7 & 2 \\ 6 & 0 & 3 & 1\end{array}\right] \left[\begin{array}{cc} -1 & 3 \\ 0 & 9 \\ 1 & -11 \\ 4 & -5 \end{array}\right] = \left[\begin{array}{cc} -3 & 43 \\ 18 & -60 \\ 4 & -5\end{array}\right] \]

For readability we use temporary variables here, and pretty-print the result.

(let ((A [[1 2 -2 0]
          [-3 4 7 2]
          [6 0 3 1]])
      (B [[-1 3]
          [0  9]
          [1 -11]
          [4 -5]]))
  (pp (matrix-multiply A B)))
[[-3.0 43.0]
 [18.0 -60.0]
 [1.0 -20.0]]

So, all these example work as we expect. The elisp function for matrix-multiply does a lot of work for you to make these cases work, including error checking for dimensional consistency.

3 Summary thoughts

It was not any easier to write this dynamic module than the previous one I used with the Gnu Scientific Library. The approach and code are remarkably similar. In one way the GSL was easier to use; it worked out of the box, whereas I had to fiddle with a security option in my OS to get it to run MKL! My Anaconda Python distribution must get around that somehow since it ships with an MKL compiled Numpy and scipy.

The idea of using elisp for analysis of the inputs and making sure they are correct is a good one and helps prevent segfaults. Of course it is a good idea to write defensive c-code to avoid that too. Overall, this is another good example of expanding the capabilities of Emacs with a dynamic module.

4 The module code

The c-code is loosely adapted from We do not implement the full dgemm behavior which is able to calculate \(C = \alpha A * B + \beta*C\). We set α=1, and β=0 in this example. We should do some dimension checking here, but it is easier to do it in emacs in a helper function. As long as you use the helper function there should not be an issue, but it is possible to segfault Emacs if you use the module function incorrectly.

#include "emacs-module.h"
#include "emacs-module-helpers.h"
#include <mkl.h>

int plugin_is_GPL_compatible;

emacs_value Fmkl_dgemm (emacs_env *env, ptrdiff_t nargs, emacs_value args[], void *data)
  double *A, *B, *C;
  int m, n, k, i, j;
  double alpha = 1.0;
  double beta = 0.0;
  // These will be 2d vectors
  emacs_value M0 = args[0]; // array 1 - A (m x k)
  emacs_value M1 = args[1]; // array 2 - B (k x n)

  // I need to get the number of rows and columns of each one.
  m = env->vec_size(env, M0);
  k  = 0;
  // We assume that we have a 2d array.
  emacs_value el1 = env->vec_get (env, M0, 0);
  k = env->vec_size(env, el1);
  // Now we know A has dimensions (m, k)
  emacs_value el2 = env->vec_get (env, M1, 0);
  n = env->vec_size(env, el2);
  // Now we know M1 had dimensions (k, n)
  // Now we have to build up arrays.
  // We are looking at a * b = c
  A = (double *)mkl_malloc( m*k*sizeof( double ), 64 );
  B = (double *)mkl_malloc( k*n*sizeof( double ), 64 );
  C = (double *)mkl_malloc( m*n*sizeof( double ), 64 );
  if (A == NULL || B == NULL || C == NULL) {
    printf( "\n ERROR: Can't allocate memory for matrices. Aborting... \n\n");
    return 1;

  //populate A
  emacs_value row, ij;
  for (int i = 0; i < m; i++)
      row = env->vec_get(env, M0, i);
      for (int j = 0; j < k; j++)
          // get M0[i, j]
          ij = env->vec_get(env, row, j);
          A[k * i + j] = extract_double(env, ij);

  // populate B
  for (int i = 0; i < k; i++)
      row = env->vec_get(env, M1, i);
      for (int j = 0; j < n; j++)
          // get M0[i, j]
          ij = env->vec_get(env, row, j);
          B[n * i + j] = extract_double(env, ij);

  // initialize C.  The solution will have dimensions of (rows1, cols2).
  for (int i = 0; i < m; i++)
      for (int j = 0; j < n; j++)
          C[n * i + j] = 0.0;

  // the multiplication is done here.
  cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 
                m, n, k, alpha, A, k, B, n, beta, C, n);

  // now we build up the vector to return
  emacs_value vector = env->intern(env, "vector");
  emacs_value *array = malloc(sizeof(emacs_value) * m);
  emacs_value *row1;
  emacs_value vec;
  for (int i = 0; i < m; i++)
      row1 = malloc(sizeof(emacs_value) * n);
      for (int j = 0; j < n; j++)
          row1[j] = env->make_float(env, C[j + i * n]);
      vec = env->funcall(env, vector, n, row1);
      array[i] = vec;

  emacs_value result = env->funcall(env, vector, m, array);
  return result;

int emacs_module_init(struct emacs_runtime *ert)
  emacs_env *env = ert->get_environment(ert);
  DEFUN("mkl-dgemm", Fmkl_dgemm, 2, 2,
        "(mkl-dgemm A B)\n"\
        "Multiply the matrices A and B. A and B must both be 2d vectors.\n" \
        "Returns the product as a vector.",
  provide(env, "mkl");
  return 0;

To build this we have to run org-babel-tangle to generate the mkl.c file, and then run this shell block to compile it.

sh /opt/intel/mkl/bin/ intel64
gcc -Wall -m64 -I${MKLROOT}/include -fPIC -c mkl.c
gcc -shared -L${MKLROOT}/lib -Wl,-rpath,${MKLROOT}/lib -lmkl_rt -lpthread -lm -ldl -L. -lemacs-module-helpers -o mkl.o

5 Elisp helper functions

We will often want to know the shape of our arrays. The shape is how many elements there are in each dimension. Here we define a recursive function that gets the shape of arbitrarily nested vectors and returns a vector of the shape. We define some helper functions to get the number of dimensions, elements, rows and columns.

The main function is a helper elisp function that multiplies two arrays. The function analyzes the shapes and transforms 1d vectors into the right 2d shape to multiply them together, and then returns the shape that makes sense. The c-code is not very robust to mistakes in the array dimensions. It tends to make emacs segfault if you get it wrong. So we try to avoid that if possible.

We have four cases to consider for multiplication:

2d * 2d
(assert (= m1 n2)) return (n1, m2)
1d * 2d
1d is a row vector (1, n1) (assert (= n1 m2)) return vector with n2 elements
2d * 1d
1d is a column vector (m2, 1) (assert (= n1 m2)) return vector with m2 elements
1d * 1d
(assert (= (length m1) (length m2)) return a scalar

Here is the

(add-to-list 'load-path (expand-file-name "."))
(require 'mkl)
(require 'cl)
(require 'seq)

(defun vector-shape (vec)
  "Return a vector of the shape of VEC."
  (let ((shape (vector (length vec))))
    (if (vectorp (aref vec 0))
        (vconcat shape (vector-shape (aref vec 0)))

(defun vector-ndims (vec)
  "Returns the number of dimensions in VEC."
  (length (vector-shape vec)))

(defun vector-numel (vec)
  "Returns the number of elements in VEC."
  (if (> (length vec) 0)
      (seq-reduce '* (vector-shape vec) 1)

(defun vector-nrows (vec)
 "Return the number of rows in VEC."
 (aref (vector-shape vec) 0))

(defun vector-ncols (vec)
 "Return the number of columns in VEC."
 (aref (vector-shape vec) 1))

(defun matrix-multiply (A B)
  "Return A * B in the matrix multiply sense."
   ;; 1d * 1d i.e. a dot-product
   ((and (= 1 (vector-ndims A))
         (= 1 (vector-ndims B))
         (= (length A) (length B)))
    ;; this is easy to compute so we don't use dgemm.
    (seq-reduce '+ (mapcar* (lambda (a b) (* a b)) A B) 0.0))

   ;; 2d * 1d (m1, n1) * (n2, 1)
   ((and (= 2 (vector-ndims A))
         (= 1 (vector-ndims B))
         ;; ncols-A = len-B
         (= (vector-ncols A) (length B)))
    ;; transform B into a 2d column vector
    (let* ((B2d (apply 'vector (mapcar 'vector B)))
           (result  (mkl-dgemm A B2d)))
      ;; Now call (dgemm A B2d) -> (m2, 1) column vector
      ;; and convert it back to a 1d result
      (cl-map 'vector (lambda (v) (aref v 0)) result)))

   ;; 1d * 2d (1, n1) * (m2, n2) len-A = nrows-B
   ((and (= 1 (vector-ndims A))
         (= 2 (vector-ndims B))
         (= (length A) (vector-nrows B)))
    ;; transform B into a 2d row vector
    (let* ((A2d (vector A))
           (result  (mkl-dgemm A2d B)))
      ;; should be a 2d row vector
      (aref result 0)))

   ;; 2d * 2d (m1, n1) * (m2, n2) rows-A = ncols-B
   ((and (= 2 (vector-ndims A))
         (= 2 (vector-ndims B))
         (= (vector-ncols A)
            (vector-nrows B)))
    ;; call (dgemm A B) and return result
    (mkl-dgemm A B))
    ;; Error checking, getting here means none of the cases above were caught.
    ;; something is probably wrong.
     ((or (> (vector-ndims A) 2)
          (> (vector-ndims B) 2))
      (error "One of your arrays has more than 2 dimensions. Only 1 or 2d arrays are supported"))
     ((and (= 1 (vector-ndims A))
           (= 1 (vector-ndims B))
           (not (= (length A) (length B))))
      (error "A and B must be the same length.
len(A) = %d
len(B) = %d" (length A) (length B)))
       (= (vector-ndims A) 2)
       (= (vector-ndims B) 2)
       (not (= (vector-nrows A) (vector-ncols B))))
      (error "Your array shapes are not correct.
The number of rows in array A must equal the number of columns in array B.
There are %d rows in A and %d columns in B" (vector-nrows A) (vector-ncols B)))
       (= (vector-ndims A) 2)
       (= (vector-ndims B) 1)
       (not (= (vector-nrows A) (length B))))
      (error "Your array shapes are not correct.
The number of rows in array A must equal the number of columns in array B.
There are %d rows in A and %d columns in B" (vector-nrows A) (length B)))
      (error "Unknown error"))))))

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.7

Read and Post Comments

Next Page »