## Finding VASP calculations in a directory tree

| categories: | tags:

The goal in this post is to work out a way to find all the directories in some root directory that contain VASP calculations. This is a precursor to doing something with those directories, e.g. creating a summary file, adding entries to a database, doing some analysis, etc… For fun, we will just calculate the total elapsed time in the calculations.

What is challenging about this problem is that the calculations are often nested in a variety of different subdirectories, and we do not always know the structure. We need to recursively descend into those directories to check if they contain VASP calculations.

We will use a function that returns True or False to tell us if a particular directory is a VASP calculation or not. We can tell that because a completed VASP calculation has specific files in it, and specific content in those files. Notably, there is an OUTCAR file that contains the text "General timing and accounting informations for this job:" near the end of the file.

We will also use os.walk as the way to recursively descend into the root directory.

import os
from jasp import *

def vasp_p(directory):
'returns True if a finished OUTCAR file exists in the current directory, else False'
outcar = os.path.join(directory, 'OUTCAR')
if os.path.exists(outcar):
with open(outcar, 'r') as f:
if 'General timing and accounting informations for this job:' in contents:
return True
return False

total_time = 0

for root, dirs, files in os.walk('/home-research/jkitchin/research/rutile-atat'):
for d in dirs:
# compute absolute path to each directory in the current root
absd = os.path.join(root, d)
if vasp_p(absd):
# we found a vasp directory, so we can do something in it.
# here we get the elapsed time from the calculation
with jasp(absd) as calc:
total_time += calc.get_elapsed_time()

print 'Total computational time on this project is {0:1.0f} minutes.'.format(total_time / 60)

Total computational time on this project is 231 minutes.