A prototype implementation of jasp in emacs-lisp

| categories: lisp, ase, vasp, emacs | tags:

I want to try implementing an interface to VASP in emacs-lisp. VASP is a program that uses density functional theory to calculate properties of materials. VASP was designed for the user to create several text-based input files that contain an atomic geometry, and calculation parameters. Then you run the VASP program, which reads those input files and generates output files. Finally, you parse out the results you want from the output files.

It is moderately tedious to do that, so we already have a very extensive Python interface (http://github.com/jkitchin/jasp ) that automates the input file creation and output file parsing, and with org-mode we have a pretty good literate research environment to document what we do. But, the Python/emacs environment is not as integrated as the emacs-lisp/emacs environment is, particularly when it comes to accessing documentation. So, I want to try out a lisp approach to see what this would look like, and if it has any benefits.

The bare bones implementation will have an Atom object, to store the type of atom and its coordinates, an Atoms object which will be a collection of atoms, and a unit cell, and a Calculator object that will store calculation parameters.

Then, we will try using it to see what advantages it might have. This will be a moderately long post.

1 The Atom class

This is my first serious effort to use the object system in emacs-lisp, so I do not claim it is optimal, or beautiful. It is functional though. We make a simple Atom class that holds the chemical symbol, and xyz coordinates. We do not consider having a magnetic moment at this point, and this class has no methods.

(defclass Atom ()
  ((symbol :initarg :symbol
           :documentation "A string for the chemical symbol")
   (x :initarg :x
      :documentation "The x coordinate")
   (y :initarg :y
      :documentation "The y coordinate")
   (z :initarg :z
      :documentation "The z coordinate"))
 "A class to represent an atom.")

(provide 'Atom)
Atom

Let us try it out. We make an Atom, then get the symbol using the oref function. I don't think we need a "Name" for the atom, so the second argument is nil.

(Atom nil :symbol "C" :x 0 :y 0 :z 0)
[object Atom nil "C" 0 0 0]
(let ((a1 (Atom nil :symbol "C" :x 0 :y 0  :z 0)))
  (oref a1 symbol))
C

It is not difficult to modify an Atom object. We use oset.

(let ((a1 (Atom nil :symbol "C" :x 0 :y 0  :z 0)))
  (oset a1 x 2.5)
  a1)
[object Atom nil "C" 2.5 0 0]

2 The Atoms class

We need at Atoms object, which will store a list of Atom objects, and a unit cell. I do not know how to make this class act like a list the way that the Atoms object in ase acts. We will have to access the list of atoms as a slot. We define one method here to get the ith atom from the object.

(defclass Atoms ()
  ((atoms :initarg :atoms
          :documentation "A list of `Atom' objects")
   (cell :initarg :cell
         :documentation "A 3-tuple of lengths of an orthorhombic unit cell"))
  "A class to represent an atoms object.")


(defmethod get-atom ((atoms Atoms) &rest args)
  "Get a list of atoms in ARGS.
Return atom if ARGS contains one element, a list of atom objects
otherwise."
  (cond
   ((= 1 (length args))
    (elt (oref atoms atoms) (car args)))
   (t
    (mapcar (lambda (i)
              (elt (oref atoms atoms) i))
            args))))

(provide 'Atoms)
get-atom

That seems to do what we need. It has some limitations:

  1. We only allowed for orthorhombic unit cells
  2. We did not enable any constraints or magnetic moments on the atoms.

Here is an example Atoms object. Note we use `, notation to ensure each Atom is created, since Atom is a constructor function that returns the object.

(Atoms nil
       :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
       :cell '(8 9 10))
[object Atoms nil ([object Atom nil "C" 0 0 0] [object Atom nil "O" 1.1 0 0]) (8 9 10)]

We can drill into the object, e.g. to get the second atom:

(let ((A1 (Atoms nil
                 :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                          ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
                 :cell '(8 9 10))))
  (get-atom A1 1))
[object Atom nil "O" 1.1 0 0]

We can modify the atoms in place like this. Suppose we want to change the symbol of the first atom to "O". We use setf for this too.

(let ((A1 (Atoms nil
                 :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                          ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
                 :cell '(8 9 10))))
  (oset (get-atom A1 0) symbol "O")
  A1)
[object Atoms nil ([object Atom nil "O" 0 0 0] [object Atom nil "O" 1.1 0 0]) (8 9 10)]

The only think I do not like about this syntax is the need to get the list of atoms from the object. That is a little clunkier than the Python analog where the object is a list itself. That may be just my inexperience with emacs-lisp. Probably you can define some getter function to smooth this over.

This Atoms class lacks much of the functionality of the ase.Atoms class, but it is sufficient for this prototype.

3 The Calculator class

Next, we need our Calculator. This will store the parameters, and be responsible for creating the INCAR, POSCAR, KPOINTS, and POTCAR files, running a calculation, and getting data from the output. We also create a with-current-directory macro that will temporarily change the working directory since VASP uses the same filenames over and over, but in different directories.

(defmacro with-current-directory (directory &rest body)
  "Set the working directory temporarily set to DIRECTORY and run BODY.
DIRECTORY is expanded, and create it and its parents if needed."
  `(progn
     (unless (file-exists-p (file-name-as-directory
                             (expand-file-name ,directory)))
       (make-directory ,directory t))
     
     (let ((default-directory (file-name-as-directory
                                (expand-file-name ,directory)))) 
        ,@body)))


(defclass Jasp ()
  ((wd :initarg :wd
       :initform "."  ; default to the current working directory
       :documentation "Directory to run calculation in.")
   (encut :initarg :encut
          :documentation "Positive number in eV for planewave cutoff.
See URL `http://cms.mpi.univie.ac.at/vasp/vasp/ENCUT_tag.html'.")
   (nbands :initarg :nbands
           :documentation "Integer number of bands.
See URL `http://cms.mpi.univie.ac.at/vasp/vasp/NBANDS_tag.html'.")
   (kpts :initarg :kpts
         :initform (1 1 1)  ; default value
         :documentation "3-tuple for Monkhorst-Pack K-point grid.")
   (xc :initarg :xc
       :documentation "String of exchange correlation functional.")
   (atoms :initarg :atoms
          :documentation "An `Atoms' object."))
 "A class to represent a calculator that runs VASP.")


(defmethod view-atoms ((calc Jasp))
  "Open the ase-gui"
  (unless (and (file-exists-p "POSCAR")
               (file-exists-p "POTCAR"))
    (write-poscar calc)
    (write-potcar calc))
  (shell-command "ase-gui POSCAR"))


(defmethod write-poscar ((calc Jasp))
  "create a POSCAR file for CALC."
  (with-temp-file "POSCAR"
    (insert "Created by jasp.el\n")
    (insert "  1.0") ; unit cell scale factor
    (let* ((atoms (oref calc atoms))
           (cell (oref atoms cell)))
      (loop for v in cell
            for i below (length cell)     
            do
            (insert "\n")
            (loop for j below (length cell)
                  do
                  (if (equal i j)
                      (insert (format " %f " (float (elt cell i))))
                    (insert (format " %f " 0.0 ))))))
    ;; The next line is counts for each atom type. For each number in
    ;; this line, there will be a copy of the POTCAR in the POTCAR
    ;; file. In ase, we sort the atoms and group them so that there is
    ;; only one POTCAR per atom. We do not do that here yet. We will
    ;; have a POTCAR for each atom.
    (insert "\n")
    (loop for atom in (oref (oref calc atoms) atoms)
          do (insert (format "1 ")))
    
    ;; now we do the atoms
    (insert "\nCartesian\n")
    (loop for atom in (oref (oref calc atoms) atoms)
          do
          (insert
           (format "%f %f %f\n"
                   (oref atom x)
                   (oref atom y)
                   (oref atom z))))
    (buffer-string)))


(defmethod write-kpoints ((calc Jasp))
  "Create KPOINTS file for CALC. 
Limited to automatic generation, and no offset."
  (with-temp-file "KPOINTS"
    (insert "Automatic mesh
0
Monkhorst-Pack
")
    (dolist (k (oref calc kpts))
      (insert (format "%4d " k)))
    (insert "\n0.0 0.0 0.0")
    (buffer-string)))


(defmethod write-potcar ((calc Jasp))
  "Generate the POTCAR file for CALC.
No `Atom' grouping is done, there is one POTCAR for each atom."
  (with-temp-file "POTCAR"
    (let ((xc (oref calc xc))
          (atoms (oref calc atoms))
          (vasp_pp_path (getenv "VASP_PP_PATH")))
      (loop for atom in (oref atoms atoms)
            do
            (insert-file-contents
             (f-join
              vasp_pp_path
              (concat "potpaw_" xc)
              (oref atom symbol)
              "POTCAR"))))
    (buffer-string)))


(defmethod write-incar ((calc Jasp))
  "Generate the INCAR file for CALC."
  (with-temp-file "INCAR"
    (insert (format "ENCUT = %f\n" (oref calc encut)))
    (insert (format "NBANDS = %d\n" (oref calc nbands)))
    (buffer-string)))


(defmethod run ((calc Jasp))
  "Write out input files, and run VASP as a simple shell command"
  (write-poscar calc)
  (write-kpoints calc)
  (write-potcar calc)
  (write-incar calc)
  (shell-command "vasp"))


(defmethod update ((calc Jasp))
  "Run vasp if needed for CALC.
We just check for a properly ended OUTCAR."
  (with-current-directory
   (oref calc wd)
   (unless (and (file-exists-p "OUTCAR")
                (with-temp-buffer
                  (insert-file-contents "OUTCAR")
                  (re-search-forward
                  "                 Voluntary context switches:"
                  (point-max)
                  t)))
     (run calc))))


(defmethod get-potential-energy ((calc Jasp))
  "Get potential energy from CALC."
  (update calc)
  (with-current-directory
   (oref calc wd)
   (with-temp-buffer
     (insert-file-contents "OUTCAR")
     ;; go to last entry
     (while (re-search-forward
             "free  energy   TOTEN  =\\s-*\\([-]?[0-9]*\\.[0-9]*\\) eV"
             (point-max)
             t)
       nil)
     ;; return last match
     (string-to-number  (match-string 1)))))

(provide 'jasp)
get-potential-energy

This is worth some discussion. On one hand, the constructor is a bit more verbose than the implementation in Python. In Python we use a context handler in place of the macro here. On the other hand, that verbosity comes with detailed, accessible documentation for each argument. We only considered the simplest of input arguments. It might be trickier to include lists, and other types of input. But I think those can all be worked out like they were in ase. We only implemented the simplest job control logic, but that also can be worked out. The biggest challenge might be getting more complex data from the output. Nearly everything can be obtained from vasprun.xml also, in the event that parsing is to slow or difficult.

Now, let us test this out. We can make a calculator:

(setq calc (Jasp
            nil
            :xc "PBE"
            :encut 350
            :nbands 6
            :atoms (Atoms
                    nil
                    :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                             ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
                    :cell '(8 9 10))))
[object Jasp nil "." 350 6 (1 1 1) "PBE" [object Atoms nil ([object Atom nil "C" 0 0 0] [object Atom nil "O" 1.1 0 0]) (8 9 10)]]

We can call the class functions like this. Here we write out the corresponding POSCAR:

(write-poscar calc)
Created by jasp.el
  1.0
 8.000000  0.000000  0.000000 
 0.000000  9.000000  0.000000 
 0.000000  0.000000  10.000000 
1 1 
Cartesian
0.000000 0.000000 0.000000
1.100000 0.000000 0.000000

It looks a little backward if you have only seen Python, where this would be calc.writeposcar(). It is almost the same characters, just a different order (and no . in lisp)!

Here we get the KPOINTS file:

(write-kpoints calc)
Automatic mesh
0
Monkhorst-Pack
   1    1    1 
0.0 0.0 0.0

I cannot show the POTCAR file for licensing reasons, but it works.

(write-potcar calc)

and the INCAR file:

(write-incar calc)
ENCUT = 350.000000
NBANDS = 6

We run a calculation like this. This will run vasp directly (not through the queue system).

(run calc)
0

The returned 0 means the shell command finished correctly.

And we retrieve the potential energy like this:

(get-potential-energy calc)
-14.687906

Not bad. That is close to the result we got from a similar calculation here .

4 Putting it all together to run calculations

If we put this all together the way we might use it in practice, it looks like this.

(load-file "Atom.el")
(load-file "Atoms.el")
(load-file "Jasp.el")

(let* ((co (Atoms
            nil
            :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                     ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
            :cell '(8 9 10)))

       (calc (Jasp
              nil
              :xc "PBE"
              :nbands 6
              :encut 350
              :atoms co)))
  
  (get-potential-energy calc))
-14.687906

Compare this with this Python code which does approximately the same thing:

from ase import Atoms, Atom
from jasp import *

co = Atoms([Atom('C', [0,   0, 0]),
            Atom('O', [1.1, 0, 0])],
            cell=(6., 6., 6.))

with jasp('molecules/simple-co', #output dir
          xc='PBE',  # the exchange-correlation functional
          nbands=6,  # number of bands
          encut=350, # planewave cutoff
          atoms=co) as calc:
    print 'energy = {0} eV'.format(co.get_potential_energy())

They look pretty similar. One thing clearly missing from emacs-lisp that Python has is full support for numerics and plotting. Some of this could be addressed via Pymacs , but certainly not all of it. Some of it could also be handled using org-mode to enable data from emacs-lisp to go to other code blocks that can handle it.

Finally, for a little fun, we illustrate mapping over a range of bond lengths. There is more than one way to do this. For example, we could create a list of calculators, and then run over them. Here we create one calculator, and just change the x position in a loop. We use the more general setf approach instead of oset to see what it looks like.

(let* ((co (Atoms
            nil
            :atoms `(,(Atom nil :symbol "C" :x 0 :y 0 :z 0)
                     ,(Atom nil :symbol "O" :x 1.1 :y 0 :z 0))
            :cell '(8 9 10)))
       (calc (Jasp
              nil
              :wd nil
              :xc "PBE"
              :nbands 6
              :encut 350
              :atoms co)))
  (dolist (d '(1.05 1.1 1.15 1.2 1.25))
    ;; change working directory
    (setf (oref calc wd) (format "co-%s" d))
    ;; set x-coordinate on oxygen atom
    (setf (oref (elt (oref co atoms) 1) x) d)
    (print (format "d = %s\nEnergy = %s eV"
                   d
                   (get-potential-energy calc)))))
"d = 1.05
Energy = -14.195892 eV"

"d = 1.1
Energy = -14.698456 eV"

"d = 1.15
Energy = -14.814809 eV"

"d = 1.2
Energy = -14.660395 eV"

"d = 1.25
Energy = -14.319904 eV"

See http://kitchingroup.cheme.cmu.edu/dft-book/dft.html#sec-3-4-1 for how this was done in Python. It looks pretty similar to me.

5 Summary thoughts

We implemented a bare bones emacs-lisp calculator for VASP. The library automates creation of input files, running the calculation, and parsing the output.

It seems pretty feasible to implement a pretty complete interface to VASP in emacs-lisp. The main reasons to do this are:

  1. Integrated access to documentation
  2. Emacs editing of emacs-lisp code
  3. Integration with org-mode

Even with Python editor that had access to documentation as deeply integrated as emacs has with emacs-lisp, it would still just be a Python editor, i.e. you probably could not use the editor to write org-mode, LaTeX, etc… It is time to recognize we need both scientific document creation and code editing capability in the same place! This kind of suggests a need to get a better Python environment going in Emacs, which deeper integration of the documentation. See this post for some progress in that area!

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter

The sqlite variation of ase.db

| categories: ase, vasp | tags:

In a recent post we explored writing VASP calculations to an ase database in json format. Today we explore a similar idea, but writing to sqlite. I have incorporated the code from the previous post into a utils module in jasp.

from jasp.utils import *

print get_jasp_dirs('/home-research/jkitchin/research/rutile-atat')
['/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/0', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/1', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/10', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/11', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/12', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/13', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/14', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/15', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/16', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/17', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/2', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/3', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/4', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/5', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/59', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/6', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/66', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/7', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/73', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/74', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/78', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/8', '/home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/9']

That new function get_jasp_dirs just returns a list of directories that are known finished VASP calculations. We will use a functional style of programming to map a function onto each directory in that list.

from jasp import *
from jasp.utils import *

from ase.db import connect

# I want a sqlite extension, so we have to specify db as the type, which means sqlite
c = connect('vaspdb.sqlite', type='db')

dirs = get_jasp_dirs('/home-research/jkitchin/research/rutile-atat')

def write(directory):
    with jasp(directory) as calc:
        atoms = calc.get_atoms()
        calc.results['energy'] = atoms.get_potential_energy()
        calc.results['forces'] = atoms.get_forces()
    print c.write(atoms), directory

# functional approach to writing
map(write, dirs)
1 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/0
2 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/1
3 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/10
4 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/11
5 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/12
6 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/13
7 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/14
8 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/15
9 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/16
10 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/17
11 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/2
12 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/3
13 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/4
14 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/5
15 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/59
16 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/6
17 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/66
18 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/7
19 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/73
20 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/74
21 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/78
22 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/8
23 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/9

Now, we have a sqlite database. Let us explore that a bit before using the python interface again.

1 Exploring the database with sqlite

First we see the tables.

.tables
keywords           species            text_key_values  
number_key_values  systems          

We can see details of the tables like this.

select * from sqlite_master;
table|systems|systems|2|CREATE TABLE systems (
    id integer primary key autoincrement,
    unique_id text unique,
    ctime real,
    mtime real,
    user text,
    numbers blob,
    positions blob,
    cell blob,
    pbc integer,
    initial_magmoms blob,
    initial_charges blob,
    masses blob,
    tags blob,
    momenta blob,
    constraints text,
    calculator text,
    calculator_parameters text,
    energy real,
    free_energy real,
    forces blob,
    stress blob,
    dipole blob,
    magmoms blob,
    magmom blob,
    charges blob,
    data text)
index|sqlite_autoindex_systems_1|systems|3|
table|sqlite_sequence|sqlite_sequence|4|CREATE TABLE sqlite_sequence(name,seq)
table|species|species|5|CREATE TABLE species (
    Z integer,
    n integer,
    id text,
    foreign key (id) references systems(id))
table|keywords|keywords|6|CREATE TABLE keywords (
    keyword text,
    id text,
    foreign key (id) references systems(id))
table|text_key_values|text_key_values|8|CREATE TABLE text_key_values (
    key text,
    value text,
    id text,
    foreign key (id) references systems(id))
table|number_key_values|number_key_values|10|CREATE TABLE number_key_values (
    key text,
    value real,
    id text,
    foreign key (id) references systems (id))
index|unique_id_index|systems|11|CREATE INDEX unique_id_index on systems(unique_id)
index|ctime_index|systems|12|CREATE INDEX ctime_index on systems(ctime)
index|user_index|systems|13|CREATE INDEX user_index on systems(user)
index|calculator_index|systems|14|CREATE INDEX calculator_index on systems(calculator)
index|species_index|species|15|CREATE INDEX species_index on species(Z)
index|keyword_index|keywords|16|CREATE INDEX keyword_index on keywords(keyword)
index|text_index|text_key_values|17|CREATE INDEX text_index on text_key_values(key)
index|number_index|number_key_values|18|CREATE INDEX number_index on number_key_values(key)

Let us see one entry from the systems table.

select * from systems where id=1;
id|unique_id|ctime|mtime|user|numbers|positions|cell|pbc|initial_magmoms|initial_charges|masses|tags|momenta|constraints|calculator|calculator_parameters|energy|free_energy|forces|stress|dipole|magmoms|magmom|charges|data
1|3ed58bb16897177be0ed56400c90b6f4|14.2361260035084|14.2361260035084|jkitchin|@|7|||||||vasp|{"incar": {"doc": "INCAR parameters", "nbands": 43, "sigma": 0.1, "prec": "high", "encut": 350.0}, "doc": "JSON representation of a VASP calculation.\n\nenergy is in eV\nforces are in eV/\\AA\nstress is in GPa (sxx, syy, szz,  syz, sxz, sxy)\nmagnetic moments are in Bohr-magneton\nThe density of states is reported with E_f at 0 eV.\nVolume is reported in \\AA^3\nCoordinates and cell parameters are reported in \\AA\n\nIf atom-projected dos are included they are in the form:\n{ados:{energy:data, {atom index: {orbital : dos}}}\n", "potcar": [["Ru", "potpaw_PBE/Ru/POTCAR", "dee616f2a1e7a5430bb588f1710bfea3001d54ea"], ["O", "potpaw_PBE/O/POTCAR", "9a0489b46120b0cad515d935f44b5fbe3a3b1dfa"]], "input": {"kpts": [6, 6, 10], "kpts_nintersections": null, "reciprocal": false, "setups": {}, "xc": "PBE", "txt": "-", "gamma": true}, "atoms": {"cell": [[4.526933343669885, 0.0, 0.0], [0.0, 4.526933343669885, 0.0], [0.0, 0.0, 3.095292162609941]], "symbols": ["O", "O", "O", "O", "Ru", "Ru"], "tags": [0, 0, 0, 0, 0, 0], "pbc": [true, true, true], "positions": [[1.3820537023391204, 1.3820537023391204, 0.0], [3.1448796413307645, 3.1448796413307645, 0.0], [3.645520374174063, 0.8814129694958222, 1.5476460813049704], [0.8814129694958222, 3.645520374174063, 1.5476460813049704], [0.0, 0.0, 0.0], [2.2634666718349425, 2.2634666718349425, 1.5476460813049704]]}, "data": {"stress": [0.0884313161515024, 0.0884313161515024, 0.06042693164307849, -0.0, -0.0, -0.0], "doc": "Data from the output of the calculation", "volume": 63.432210741434858, "total_energy": -44.251496, "forces": [[-0.023609, -0.023609, 0.0], [0.023609, 0.023609, 0.0], [-0.023609, 0.023609, 0.0], [0.023609, -0.023609, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "fermi_level": 5.0374}, "metadata": {"date.created": 1395241327.477995, "uuid": "7081ee4a-af77-11e3-a6e6-003048f5e49e", "date.created.ascii": "Wed Mar 19 11:02:07 2014", "user.username": "jkitchin", "atoms.resort": [2, 3, 4, 5, 0, 1], "user.email": "jkitchin@andrew.cmu.edu", "user.fullname": "John Kitchin", "O.potential.git_hash": "9a0489b46120b0cad515d935f44b5fbe3a3b1dfa", "atoms.tags": [0, 0, 0, 0, 0, 0], "O.potential.path": "potpaw_PBE/O/POTCAR", "Ru.potential.path": "potpaw_PBE/Ru/POTCAR", "Ru.potential.git_hash": "dee616f2a1e7a5430bb588f1710bfea3001d54ea"}}|-44.251496||+|||||{"keywords": [], "data": {}, "key_value_pairs": {}}

How about the species table:

select * from species where id=1;
Z|n|id
8|4|1
44|2|1

Now we can find a calculation with two Ru and 4 oxygen atoms, but it takes some sqlite knowledge.

select sp1.id,sys.id 
from species as sp1 
inner join
species as sp2
on sp1.id = sp2.id
inner join systems as sys
on sp1.id=sys.id
where (sp1.Z=44 and sp1.n=2) and (sp2.Z=8 and sp2.n=4);
id|id
1|1

That is an expected result. Let us get back to the python interface.

2 The python interface to the ase-db

Let us search for entries containing 2 Ru atoms and 4 O atoms again. We know this should be the first entry from before. Note how much simpler this syntax is.

from ase.db import connect

# I want a sqlite extension, so we have to specify db as the type, which means sqlite
c = connect('vaspdb.sqlite', type='db')

for row in c.select('Ru=2,O=4'): print row
{'ctime': 14.236126003508412, 'energy': -44.251496, 'tags': array([0, 0, 0, 0, 0, 0], dtype=int32), 'positions': array([[ 1.3820537 ,  1.3820537 ,  0.        ],
       [ 3.14487964,  3.14487964,  0.        ],
       [ 3.64552037,  0.88141297,  1.54764608],
       [ 0.88141297,  3.64552037,  1.54764608],
       [ 0.        ,  0.        ,  0.        ],
       [ 2.26346667,  2.26346667,  1.54764608]]), 'calculator': u'vasp', 'calculator_parameters': {u'incar': {u'doc': u'INCAR parameters', u'prec': u'high', u'nbands': 43, u'sigma': 0.1, u'encut': 350.0}, u'doc': u'JSON representation of a VASP calculation.\n\nenergy is in eV\nforces are in eV/\\AA\nstress is in GPa (sxx, syy, szz,  syz, sxz, sxy)\nmagnetic moments are in Bohr-magneton\nThe density of states is reported with E_f at 0 eV.\nVolume is reported in \\AA^3\nCoordinates and cell parameters are reported in \\AA\n\nIf atom-projected dos are included they are in the form:\n{ados:{energy:data, {atom index: {orbital : dos}}}\n', u'potcar': [[u'Ru', u'potpaw_PBE/Ru/POTCAR', u'dee616f2a1e7a5430bb588f1710bfea3001d54ea'], [u'O', u'potpaw_PBE/O/POTCAR', u'9a0489b46120b0cad515d935f44b5fbe3a3b1dfa']], u'input': {u'kpts': array([ 6,  6, 10]), u'reciprocal': False, u'xc': u'PBE', u'kpts_nintersections': None, u'setups': {}, u'txt': u'-', u'gamma': True}, u'atoms': {u'cell': array([[ 4.52693334,  0.        ,  0.        ],
       [ 0.        ,  4.52693334,  0.        ],
       [ 0.        ,  0.        ,  3.09529216]]), u'symbols': [u'O', u'O', u'O', u'O', u'Ru', u'Ru'], u'tags': array([0, 0, 0, 0, 0, 0]), u'pbc': array([ True,  True,  True], dtype=bool), u'positions': array([[ 1.3820537 ,  1.3820537 ,  0.        ],
       [ 3.14487964,  3.14487964,  0.        ],
       [ 3.64552037,  0.88141297,  1.54764608],
       [ 0.88141297,  3.64552037,  1.54764608],
       [ 0.        ,  0.        ,  0.        ],
       [ 2.26346667,  2.26346667,  1.54764608]])}, u'data': {u'stress': array([ 0.08843132,  0.08843132,  0.06042693, -0.        , -0.        , -0.        ]), u'doc': u'Data from the output of the calculation', u'volume': 63.43221074143486, u'total_energy': -44.251496, u'forces': array([[-0.023609, -0.023609,  0.      ],
       [ 0.023609,  0.023609,  0.      ],
       [-0.023609,  0.023609,  0.      ],
       [ 0.023609, -0.023609,  0.      ],
       [ 0.      ,  0.      ,  0.      ],
       [ 0.      ,  0.      ,  0.      ]]), u'fermi_level': 5.0374}, u'metadata': {u'date.created': 1395241327.477995, u'uuid': u'7081ee4a-af77-11e3-a6e6-003048f5e49e', u'date.created.ascii': u'Wed Mar 19 11:02:07 2014', u'user.username': u'jkitchin', u'atoms.resort': array([2, 3, 4, 5, 0, 1]), u'user.email': u'jkitchin@andrew.cmu.edu', u'user.fullname': u'John Kitchin', u'O.potential.git_hash': u'9a0489b46120b0cad515d935f44b5fbe3a3b1dfa', u'atoms.tags': array([0, 0, 0, 0, 0, 0]), u'O.potential.path': u'potpaw_PBE/O/POTCAR', u'Ru.potential.path': u'potpaw_PBE/Ru/POTCAR', u'Ru.potential.git_hash': u'dee616f2a1e7a5430bb588f1710bfea3001d54ea'}}, 'cell': array([[ 4.52693334,  0.        ,  0.        ],
       [ 0.        ,  4.52693334,  0.        ],
       [ 0.        ,  0.        ,  3.09529216]]), 'numbers': array([ 8,  8,  8,  8, 44, 44], dtype=int32), 'forces': array([[-0.023609, -0.023609,  0.      ],
       [ 0.023609,  0.023609,  0.      ],
       [-0.023609,  0.023609,  0.      ],
       [ 0.023609, -0.023609,  0.      ],
       [ 0.      ,  0.      ,  0.      ],
       [ 0.      ,  0.      ,  0.      ]]), 'mtime': 14.236126003508412, 'pbc': array([ True,  True,  True], dtype=bool), 'id': 1, 'unique_id': u'3ed58bb16897177be0ed56400c90b6f4', 'user': u'jkitchin'}

3 Summary

It is not yet obvious what the advantage of the sqlite format over the json format is. One is that you can use SQL to create queries, which is probably more powerful than the ase-db format. It is a little mysterious how the ase-db searches the json format to me.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.5h

Discuss on Twitter

writing VASP calculations to ase.db formats

| categories: ase, vasp | tags:

The ase development team has created a new database module (https://wiki.fysik.dtu.dk/ase/ase/db/db.html ). In this post, we will examine how to use that in conjunction with jasp to save calculations to a database. To use this you need the latest version of jasp from http://github.com/jkitchin/jasp .

The idea here is to use the code from http://kitchingroup.cheme.cmu.edu/blog/2014/03/20/Finding-VASP-calculations-in-a-directory-tree/ to add each calculation in that subtree to an ase database. We will use the json format for the database.

import os
from jasp import *
from ase.db import connect
c = connect('vaspdb.json')

def vasp_p(directory):
    'returns True if a finished OUTCAR file exists in the current directory, else False'
    outcar = os.path.join(directory, 'OUTCAR')
    if os.path.exists(outcar):
        with open(outcar, 'r') as f:
            contents = f.read()
            if 'General timing and accounting informations for this job:' in contents:
                return True
    return False
            
        
total_time = 0

for root, dirs, files in os.walk('/home-research/jkitchin/research/rutile-atat'):
    for d in dirs:
        # compute absolute path to each directory in the current root
        absd = os.path.join(root, d)
        if vasp_p(absd):
            # we found a vasp directory, so we can do something in it. 
            # here we get the elapsed time from the calculation
            with jasp(absd) as calc:
                atoms = calc.get_atoms()
                calc.results['energy'] = atoms.get_potential_energy()
                calc.results['forces'] = atoms.get_forces()
            # it is important that this line not be inside the jasp
            # context-manager, because c.write writes in the local
            # dir.
            print c.write(atoms, ['atat']), absd
1 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/0
2 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/1
3 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/10
4 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/11
5 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/12
6 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/13
7 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/14
8 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/15
9 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/16
10 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/17
11 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/2
12 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/3
13 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/4
14 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/5
15 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/59
16 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/6
17 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/66
18 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/7
19 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/73
20 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/74
21 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/78
22 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/8
23 /home-research/jkitchin/research/rutile-atat/RuTi_O_rutile/9

The result is in this file: vaspdb.json .

We can see the contents of that database like this:

import os
from jasp import *
from ase.db import connect
c = connect('vaspdb.json')

g = c.select()  # g is a generator
print g.next()
{u'username': u'jkitchin', u'calculator_name': u'vasp', u'tags': array([0, 0, 0, 0, 0, 0]), u'positions': array([[ 1.382,  1.382,  0.   ],
       [ 3.145,  3.145,  0.   ],
       [ 3.646,  0.881,  1.548],
       [ 0.881,  3.646,  1.548],
       [ 0.   ,  0.   ,  0.   ],
       [ 2.263,  2.263,  1.548]]), u'energy': -44.251496, u'calculator_parameters': {u'incar': {u'doc': u'INCAR parameters', u'prec': u'high', u'nbands': 43, u'sigma': 0.1, u'encut': 350.0}, u'doc': u'JSON representation of a VASP calculation.\n\nenergy is in eV\nforces are in eV/\\AA\nstress is in GPa (sxx, syy, szz,  syz, sxz, sxy)\nmagnetic moments are in Bohr-magneton\nThe density of states is reported with E_f at 0 eV.\nVolume is reported in \\AA^3\nCoordinates and cell parameters are reported in \\AA\n\nIf atom-projected dos are included they are in the form:\n{ados:{energy:data, {atom index: {orbital : dos}}}\n', u'potcar': [[u'Ru', u'potpaw_PBE/Ru/POTCAR', u'dee616f2a1e7a5430bb588f1710bfea3001d54ea'], [u'O', u'potpaw_PBE/O/POTCAR', u'9a0489b46120b0cad515d935f44b5fbe3a3b1dfa']], u'input': {u'kpts': array([ 6,  6, 10]), u'reciprocal': False, u'xc': u'PBE', u'kpts_nintersections': None, u'setups': {}, u'txt': u'-', u'gamma': True}, u'atoms': {u'cell': array([[ 4.527,  0.   ,  0.   ],
       [ 0.   ,  4.527,  0.   ],
       [ 0.   ,  0.   ,  3.095]]), u'symbols': [u'O', u'O', u'O', u'O', u'Ru', u'Ru'], u'tags': array([0, 0, 0, 0, 0, 0]), u'pbc': array([ True,  True,  True], dtype=bool), u'positions': array([[ 1.382,  1.382,  0.   ],
       [ 3.145,  3.145,  0.   ],
       [ 3.646,  0.881,  1.548],
       [ 0.881,  3.646,  1.548],
       [ 0.   ,  0.   ,  0.   ],
       [ 2.263,  2.263,  1.548]])}, u'data': {u'stress': array([ 0.088,  0.088,  0.06 , -0.   , -0.   , -0.   ]), u'doc': u'Data from the output of the calculation', u'volume': 63.43221074143486, u'total_energy': -44.251496, u'forces': array([[-0.024, -0.024,  0.   ],
       [ 0.024,  0.024,  0.   ],
       [-0.024,  0.024,  0.   ],
       [ 0.024, -0.024,  0.   ],
       [ 0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ]]), u'fermi_level': 5.0374}, u'metadata': {u'date.created': 1395241327.477995, u'uuid': u'7081ee4a-af77-11e3-a6e6-003048f5e49e', u'date.created.ascii': u'Wed Mar 19 11:02:07 2014', u'user.username': u'jkitchin', u'atoms.resort': array([2, 3, 4, 5, 0, 1]), u'user.email': u'jkitchin@andrew.cmu.edu', u'user.fullname': u'John Kitchin', u'O.potential.git_hash': u'9a0489b46120b0cad515d935f44b5fbe3a3b1dfa', u'atoms.tags': array([0, 0, 0, 0, 0, 0]), u'O.potential.path': u'potpaw_PBE/O/POTCAR', u'Ru.potential.path': u'potpaw_PBE/Ru/POTCAR', u'Ru.potential.git_hash': u'dee616f2a1e7a5430bb588f1710bfea3001d54ea'}}, u'cell': array([[ 4.527,  0.   ,  0.   ],
       [ 0.   ,  4.527,  0.   ],
       [ 0.   ,  0.   ,  3.095]]), u'numbers': array([ 8,  8,  8,  8, 44, 44]), u'pbc': array([ True,  True,  True], dtype=bool), u'timestamp': 14.23343757848325, u'keywords': [u'atat'], u'forces': array([[-0.024, -0.024,  0.   ],
       [ 0.024,  0.024,  0.   ],
       [-0.024,  0.024,  0.   ],
       [ 0.024, -0.024,  0.   ],
       [ 0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ]]), 'id': 1, u'unique_id': u'123901e31734f14418381a23d1ee1072'}

The data stored there comes from the calc.todict() function written for jasp.

We can do some searches like this. Say we want to find all the calculations where there are four oxygen atoms.

import os
from jasp import *
from ase.db import connect
c = connect('vaspdb.json')

g = c.select(O=4)  # g is a generator
for entry in g:
    print c.get_atoms(entry['id']), '\n'
Atoms(symbols='O4Ru2', positions=..., tags=...,
      cell=[4.526933343669885, 4.526933343669885, 3.095292162609941],
      pbc=[True, True, True], calculator=SinglePointCalculator(...)) 

Atoms(symbols='O4Ti2', positions=..., tags=...,
      cell=[4.614336091353763, 4.614336091353763, 2.9555779409837473],
      pbc=[True, True, True], calculator=SinglePointCalculator(...)) 

Atoms(symbols='O4RuTi', positions=..., tags=...,
      cell=[[-0.0151920891931803, -4.604112035041115, 0.0],
      [-4.604112035041115, -0.0151920891931803, 0.0], [0.0, 0.0,
      -3.0110141191854245]], pbc=[True, True, True],
      calculator=SinglePointCalculator(...))

You can see there are three calculations with that criterion.

1 The command-line interface

There is also a command-line interface.

ase-db --help
usage: ase-db [-h] [-n] [-c COLUMNS] [--explain] [-y] [-i INSERT_INTO]
              [-k ADD_KEYWORDS] [-K ADD_KEY_VALUE_PAIRS]
              [--delete-keywords DELETE_KEYWORDS]
              [--delete-key-value-pairs DELETE_KEY_VALUE_PAIRS] [--delete]
              [-v] [-q] [-s SORT] [-r] [-l] [--limit LIMIT]
              [-p PYTHON_EXPRESSION]
              name [selection]

positional arguments:
  name
  selection

optional arguments:
  -h, --help            show this help message and exit
  -n, --count
  -c COLUMNS, --columns COLUMNS
                        short/long+row-row
  --explain
  -y, --yes
  -i INSERT_INTO, --insert-into INSERT_INTO
  -k ADD_KEYWORDS, --add-keywords ADD_KEYWORDS
  -K ADD_KEY_VALUE_PAIRS, --add-key-value-pairs ADD_KEY_VALUE_PAIRS
  --delete-keywords DELETE_KEYWORDS
  --delete-key-value-pairs DELETE_KEY_VALUE_PAIRS
  --delete
  -v, --verbose
  -q, --quiet
  -s SORT, --sort SORT
  -r, --reverse
  -l, --long
  --limit LIMIT
  -p PYTHON_EXPRESSION, --python-expression PYTHON_EXPRESSION

It is not obvious all those options are actually supported. For example, it is not clear there is a function that actually deletes keywords in the source code. You can add keywords, but I cannot figure out the syntax to add to one entry.

Below are some examples that do work. We can list details of the calculation with id=1.

ase-db vaspdb.json id=1
id|age|user    |formula|calc| energy| fmax|pbc|  size|keywords|   mass
 1|9m |jkitchin|O4Ru2  |vasp|-44.251|0.033|111|63.432|atat    |266.138

Get calculations with four oxygen atoms:

ase-db vaspdb.json O=4
id|age|user    |formula|calc| energy| fmax|pbc|  size|keywords|   mass
 1|9m |jkitchin|O4Ru2  |vasp|-44.251|0.033|111|63.432|atat    |266.138
 2|9m |jkitchin|O4Ti2  |vasp|-52.970|0.033|111|62.930|atat    |159.758
11|9m |jkitchin|O4RuTi |vasp|-48.115|0.157|111|63.826|atat    |212.948

Get all calculations tagged with atat

ase-db vaspdb.json atat
id|age|user    |formula  |calc|  energy| fmax|pbc|   size|keywords|   mass
 1|9m |jkitchin|O4Ru2    |vasp| -44.251|0.033|111| 63.432|atat    |266.138
 2|9m |jkitchin|O4Ti2    |vasp| -52.970|0.033|111| 62.930|atat    |159.758
 3|9m |jkitchin|O8Ru2Ti2 |vasp| -96.601|0.086|111|126.719|atat    |425.895
 4|9m |jkitchin|O8RuTi3  |vasp|-100.842|0.075|111|126.846|atat    |372.705
 5|9m |jkitchin|O8Ru3Ti  |vasp| -92.376|0.133|111|127.420|atat    |479.085
 6|9m |jkitchin|O8Ru2Ti2 |vasp| -96.594|0.184|111|127.176|atat    |425.895
 7|9m |jkitchin|O8RuTi3  |vasp|-100.959|0.176|111|126.924|atat    |372.705
 8|9m |jkitchin|O8Ru3Ti  |vasp| -92.314|0.084|111|127.377|atat    |479.085
 9|9m |jkitchin|O8Ru2Ti2 |vasp| -96.612|0.086|111|126.542|atat    |425.895
10|9m |jkitchin|O8RuTi3  |vasp|-100.816|0.080|111|126.557|atat    |372.705
11|9m |jkitchin|O4RuTi   |vasp| -48.115|0.157|111| 63.826|atat    |212.948
12|9m |jkitchin|O8Ru3Ti  |vasp| -92.429|0.163|111|127.291|atat    |479.085
13|9m |jkitchin|O8Ru2Ti2 |vasp| -96.770|0.166|111|126.870|atat    |425.895
14|9m |jkitchin|O8RuTi3  |vasp|-101.014|0.222|111|126.881|atat    |372.705
15|9m |jkitchin|O12Ru4Ti2|vasp|-140.969|0.114|111|190.614|atat    |692.033
16|9m |jkitchin|O8Ru3Ti  |vasp| -92.323|0.125|111|127.541|atat    |479.085
17|9m |jkitchin|O12Ru2Ti4|vasp|-149.516|0.241|111|190.070|atat    |585.653
18|9m |jkitchin|O8Ru2Ti2 |vasp| -96.661|0.064|111|127.038|atat    |425.895
19|9m |jkitchin|O12Ru4Ti2|vasp|-140.472|0.138|111|190.640|atat    |692.033
20|9m |jkitchin|O12Ru3Ti3|vasp|-144.667|0.166|111|190.604|atat    |638.843
21|9m |jkitchin|O12Ru2Ti4|vasp|-148.813|0.055|111|190.084|atat    |585.653
22|9m |jkitchin|O8RuTi3  |vasp|-100.874|0.051|111|126.690|atat    |372.705
23|9m |jkitchin|O8Ru3Ti  |vasp| -92.246|0.102|111|127.383|atat    |479.085

2 Summary thoughts

This is a nice addition to ase. I think it needs some thorough, production work testing to find out exactly how useful it is. We may need to reconsider the calc.todict() function in jasp to remove redundancy, but overall this is a good idea.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.5h

Discuss on Twitter

Finding VASP calculations in a directory tree

| categories: vasp, python | tags:

The goal in this post is to work out a way to find all the directories in some root directory that contain VASP calculations. This is a precursor to doing something with those directories, e.g. creating a summary file, adding entries to a database, doing some analysis, etc… For fun, we will just calculate the total elapsed time in the calculations.

What is challenging about this problem is that the calculations are often nested in a variety of different subdirectories, and we do not always know the structure. We need to recursively descend into those directories to check if they contain VASP calculations.

We will use a function that returns True or False to tell us if a particular directory is a VASP calculation or not. We can tell that because a completed VASP calculation has specific files in it, and specific content in those files. Notably, there is an OUTCAR file that contains the text "General timing and accounting informations for this job:" near the end of the file.

We will also use os.walk as the way to recursively descend into the root directory.

import os
from jasp import *

def vasp_p(directory):
    'returns True if a finished OUTCAR file exists in the current directory, else False'
    outcar = os.path.join(directory, 'OUTCAR')
    if os.path.exists(outcar):
        with open(outcar, 'r') as f:
            contents = f.read()
            if 'General timing and accounting informations for this job:' in contents:
                return True
    return False
            
        
total_time = 0

for root, dirs, files in os.walk('/home-research/jkitchin/research/rutile-atat'):
    for d in dirs:
        # compute absolute path to each directory in the current root
        absd = os.path.join(root, d)
        if vasp_p(absd):
            # we found a vasp directory, so we can do something in it. 
            # here we get the elapsed time from the calculation
            with jasp(absd) as calc:
                total_time += calc.get_elapsed_time()

print 'Total computational time on this project is {0:1.0f} minutes.'.format(total_time / 60)
Total computational time on this project is 231 minutes.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.5h

Discuss on Twitter

Serializing jasp calculations as json data

| categories: ase, vasp, jasp | tags:

We use VASPto calculate materials properties in our research We use the jasppython module we have developed to setup, run and analyze those calculations. One of the things we have worked on developing recently is to more transparently share how do this kind of work by using org-mode supporting information files. Doing this should make our research more reproducible, and allow others to build off of it more easily.

We have run into the following problem trying to share VASP results however. The VASP license prohibits us from sharing the POTCAR files that are used to run the calculations. That is unfortunate, but since these files are also what give VASP some competitive advantage, they are protected, and we agreed to that when we bought the license. The problem is that the jasp module requires the POTCAR files to work, so without them, our scripts are not reproducible by researchers without a VASP license.

So, we have been looking at new ways to share the data from our calculations. In this post, we consider representing the calculation as a JSON file. We will look at a couple of new features built into the development branch of jasp

1 The simplest case of a simple calculation

Here we setup and run a simple calculation, and output the JSON file.

from ase import Atoms, Atom
from jasp import *
import numpy as np
np.set_printoptions(precision=3, suppress=True)

co = Atoms([Atom('C',[0,   0, 0]),
            Atom('O',[1.2, 0, 0])],
            cell=(6., 6., 6.))

with jasp('molecules/simple-co', #output dir
          xc='PBE',  # the exchange-correlation functional
          nbands=6,  # number of bands
          encut=350, # planewave cutoff
          ismear=1,  # Methfessel-Paxton smearing
          sigma=0.01,# very small smearing factor for a molecule
          atoms=co) as calc:
    print 'energy = {0} eV'.format(co.get_potential_energy())
    print co.get_forces()
    with open('JSON', 'w') as f:
        f.write(calc.json)
energy = -14.687906 eV
[[ 5.095  0.     0.   ]
 [-5.095  0.     0.   ]]

Now, we can analyze the JSON file independently of jasp. The json data contains all the inputs we used for the VASP calculation, the atomic geometry, and many of the outputs of the calculation. Here is the JSONfile.

import json
with open('molecules/simple-co/JSON', 'rb') as f:
    d = json.loads(f.read())

print('The energy is {0}'.format(d['data']['total_energy']))
print('The forces are {0}'.format(d['data']['forces']))
The energy is -14.687906
The forces are [[5.095488, 0.0, 0.0], [-5.095488, 0.0, 0.0]]

2 Including extra information in the JSON file

If we use a slightly different syntax, we can also include the total DOS in the JSON file.

from jasp import *

with jasp('molecules/simple-co') as calc:
    with open('JSON-DOS', 'w') as f:
        f.write(calc_to_json(calc, dos=True))

To illustrate that we have done that, let us plot the DOS without using jasp from the JSON-DOSfile.

import json
import matplotlib.pyplot as plt

with open('molecules/simple-co/JSON-DOS', 'rb') as f:
    d = json.loads(f.read())

energies = d['data']['dos']['e']
dos = d['data']['dos']['dos']
plt.plot(energies, dos)
plt.savefig('molecules/simple-co/dos.png')

We are still working on getting atom-projected DOS into the json file, and ensuring that all the spin cases are handled (e.g. the spin-up and spin-down DOS).

3 Limitations?

JSON is flexible, and can store text and numeric data. It does not store numpy arrays, but rather it is limited to storing lists of data. You would have to convert them back to arrays if you want to do array math. You probably wouldn't want to store a 3d array of electron density in this format, although it probably isn't worse than a CUBE file format. We haven't tested these files very significantly yet at a large scale to see how fast it is to read from lots of them.

Nonetheless, this looks like a reasonable format to share data in human and machine readable form, without violating the VASP licence conditions.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter
Next Page ยป