Home

Welcome to the SUNCAT Center for Interface Science and Catalysis Wiki. Our work focuses on catalyst and materials design for next-generation energy solutions, and you can find more information about our efforts on our research page.

These pages will focus on sharing the expertise that make our work possible. This site is still under construction, but will soon host information on:

Our computational resources
Background on our calculations
Common/Useful scripts
Recommended productivity software
C ommon errors and possible solutions
Optimizing simulations
Advanced computational techniques

If you have any suggestions for content you'd like to see here, please send an email to doylead@stanford.edu

CatApp Leaderboard

CatApp Leaderboard 10/20/2017 10:53 AM*

roling 7927
bajdich 860
jboes 809
jschum 408
maxjh 296
hangaard 106
mttang 92
schlexer 25
vjbukas 15
jgauth32 14
pplessow 6
sback 2

* Based on number of trajectory files in catapp_data folders and ase-db entries on SUNCAT & Sherlock

Useful Scripts

Electrostatic Potential (Quantum Espresso) - Courtesy of Andrew Doyle

#######################################################
## Tutorial by: Andrew
## Topic: Electrostatic potential in ase-espresso
##
## dependencies: ase, espresso, pickle
##
## Generates the data files necessary to analyze
## the charge density and electrostatic potential
## in and around a Pt(100) slab
#######################################################

# Import necessary packages
from ase import Atoms
from ase.io import write
from ase.lattice.surface import fcc100
from espresso import espresso
import pickle

# Generate the representation of the atoms in ASE
slab = fcc100('Pt', size=(2,2,3), vacuum=10.0)

# Create an espresso calculator object
# (Tailor for desired settings and accuracy)
calc = espresso(pw=400 , # Plane wave cutoff
                dw=4000, # Density wave cutoff
                kpts=(3,3,1), # (rather sparse) k-point (Brillouin) sampling
                xc='RPBE', #Exchange-correlation functional
                dipole={'status':True}, # Includes dipole correction (necessary for asymmetric slabs)
                outdir='pt100_analysis') # Espresso-generated files will be put here

# Attach the calculator object to the atoms
slab.set_calculator(calc)

# Perform a static calculation to generate wavefunctions
slab.get_potential_energy()

# Create an image of the atoms
write('slab.png',slab)

# Get the electrostatic potential
potential = calc.extract_ionic_and_hartree_potential()

# Create and populate output file for electrostatic potential

potential_file=open('potential.pickle','w')
pickle.dump(potential,potential_file)
potential_file.close()

Now we are done with the intensive calculations. The following scripts will interpret the results and can be run easily on any computer in a matter of seconds.

#######################################################
## Tutorial by: Andrew
## Topic: Plotting electrostatic potential data from
##        ase-espresso
##
## dependencies: pickle, pylab, numpy
##
## Plots an XY-averaged electrostatic potential as
## a function of Z position through and outside
## a Pt(100) slab.
#######################################################

# Import necessary packages
import pickle
import pylab as plt
import numpy as np

# Retrieves the data generated in the previous script
potential_file = open('potential.pickle')
potential = pickle.load(potential_file)

origin=potential[0] # [X0,Y0,Z0] - The positions of the origin of the cell.  Here they are [0,0,0]

cell = potential[1] # [3x3 Matrix] Contains the [x,y,z] components of the 3 cell vectors
cell_x = cell[0][0] # The x component of the first cell vector (Remember: This cell is rectangular)
cell_y = cell[1][1] # The y component of the second cell vector (Remember: This cell is rectangular)
cell_z = cell[2][2] # The z component of the third cell vector (Remember: This cell is rectangular)

v = potential[2] # Contains all real-space potential field points.

#######################################################
## Understanding the architecture of the rest of the
## data can be tricky.  Using the variables presented
## here, V(x,y,z)=v[x][y][z].
##
## First we aim to extract all the information
## available to us, based on numpy's architecture.
#######################################################

n_x = len(v) # Returns the number of entries in the first subset (number of X entries)
n_y = len(v[0]) # Returns the number of entries in the second subset (number of Y entries - choice of '0' was arbitrary)
n_z = len(v[0][0]) # Returns the number of entries in the third subset (number of Z entries - choice of '0' was arbitrary)

potential = [] # Initializes a list for average potentials

# Iterate over all space to average and process potential information
for z in range(n_z):
        temp = 0 # Will serve as a partial sum for averaging purposes
        for x in range(n_x):
                for y in range(n_y):
                        temp+=v[x][y][z] # Sums the potential at all points in an X-Y plane
        temp = temp / (n_x*n_y) # Divides by the total number of points
        potential.append( temp ) # Stores the value for average potential at this Z position

# Translates the indices of points along the Z axis into real-space positions
z_coordinate = np.linspace( 0 , cell_z , n_z )

# Plots the data, formats plot
plt.plot(z_coordinate, potential)
plt.xlabel(r'z ($\AA$)',size=18)
plt.ylabel(r'$\langle V \ \rangle_{XY}$ (Volts)',size=18)
plt.title('Electrostatics of Pt(100)',size=18)

# Display plot
plt.show()
# If you'd prefer to save the plot, highlight the line above and uncomment the line below
#plt.savefig('potential_analysis.png')

The end result should look something like this:

Plot self-consistent field convergence data

This script, when run in a quantum espresso job folder, will plot the estimated SCF accuracy and total DFT energy as a function of electronic step. This can be useful for diagnosing weird convergence behavior, or for impatiently waiting for the next ionic step to finish.

#!/home/vossj/suncat/bin/python
import os
import pylab as plt
import sys
import matplotlib.pyplot as plt2
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA

rydberg = 13.6057 #rydberg to eV conversion

# default to outdir/log if no log is given
if len(sys.argv) &gt; 1:
    log = sys.argv[1]
else:
    log = 'outdir/log'

filename = 'scf_temp.txt'
filename2 = 'etmp.txt'
os.system('grep scf %s &gt; %s'%(log,filename))
os.system("grep 'total energy' %s &gt; %s"%(log,filename2))
file = open(filename)
lines = file.read().split('\n')
lines = filter(None,lines)

file2 = open(filename2)
lines2 = file2.read().split('\n')
lines2 = filter(None,lines2)

scf = []
energy = []
iter = []
iter2 = []
n = 0

# append scf convergence data from log file to scf
for line in lines:
    items = filter(None, line.split(' '))
    try:
        scf.append( float(items[4])*rydberg )
        n += 1
        iter.append(n)
    except:
        continue

# append total dft energy data from log to energy
n = 0
for line in lines2:
    items = filter(None, line.split(' '))
    try:
        n += 1
        energy.append(float(items[3])*rydberg )
        iter2.append(n)
    except:
        continue

# find the plane-wave cutoff from the log file
os.system("grep 'kinetic-energy cutoff' %s &gt; %s"%(log,filename))
file = open(filename)
items = filter(None, file.read().split(' '))
pw = float(items[3]) * 13.606
os.system('rm %s'%filename)
os.system('rm %s'%filename2)

# plot everything nicely
host = host_subplot(111,axes_class = AA.Axes)
plt2.subplots_adjust(right=0.75)
par1 = host.twinx()
host.set_xlabel('Number of Iterations')
host.set_ylabel('Estimated SCF Accuracy (eV)',color='red')
par1.set_ylabel('DFT Total-Energy (eV)',color='green')

p1, = host.plot(iter,scf,label='SCF Acc')
p2, = par1.plot(iter2,energy,label='DFT Energy')

host.set_yscale('log')
plt.semilogy(iter,scf)
plt.xlabel('Number of Iterations')
plt.ylabel('Estimated SCF Accuracy (eV)')
plt.title(r'$PW_{Cut}$=%d eV'%pw)
plt.tight_layout()
plt.show()

The output should look something like this, for a nicely converging run with several ionic steps. Multiple ionic step directories will contain all ionic steps in the plot.

Background on our Calculations

This page attempts to deconvolute the different methods used in the group to perform Density Functional Theory (DFT) calculations. The different pathways for DFT calculations are shown in the flow chart below. The green line represents the most common path used in the group. Most members use the Atomic Simulation Environment (ASE) interfaced with the Quantum Espresso calculator to perform DFT calculations. The dashed box represents what is done when using "native" mode, bypassing any use of ASE.

Input files
- ASE trajectories, eg init.traj
Python scripts
- See the example scripts page.
ASE
- See the ASE documentation here for ASE modules.
DFT input files
- Quantum Espresso uses a file called pw.inp which can be found in the output directory, eg 'outdir/' or 'calcdir/'
- VASP uses several files:
  - POSCAR contains the coordinates of all atoms in the unit cell, as well as information about the unit cell geometry. For MD calculations, it can also contain initial velocities.
  - KPOINTS describes the k-point grid used to sample the Brillouin zone.
  - POTCAR contains the pseudopotential used to describe the core electrons of each atom type in the unit cell.
  - INCAR contains optional parameters used in the DFT calculation. See below for some of these parameters.
- GPAW uses ..
Vienna Ab-Initio Simulation Package (VASP), Quantum Espresso, and Grid-based Projector Augmented Wave (GPAW) are different DFT calculators that can be used to calculate energies and forces of periodic systems.

Common Errors and Possible Solutions

This page lists several common errors you will most likely encounter in both Quantum-Espresso and VASP, as well as how some users have had success overcoming them.

Quantum-Espresso

S Matrix not positive definite

This error is usually caused by two atoms in your unit cell overlapping. Check the positions of atoms remembering that periodic boundary conditions are usually enforced. Rattling atoms in the unit cell has fixed this error for some users.

Switching the optimizer may also help, particularly it may help using QE's internal optimizer.

Alternately, you can try increasing dw, the charge density cutoff.

VASP

forrtl: severe (174): SIGSEGV, segmentation fault occurred

This error can be caused by the node running out of memory. You can try the following:

Reducing NPAR (it should still be a number that evenly divides the number of cores on the node)
Reducing ENCUT
Reducing vacuum between periodic slabs
Parallelizing over multiple nodes (with KPAR = # nodes)
Make sure you're requesting as much memory as possible from the node. On sherlock, use #SBATCH --mem-per-cpu=4000
Use a node with more memory (e.g. bigmem nodes on Sherlock)

Convergence Tips

If you're having difficulty with electronic convergence, see some tips below based on the code you're using.

ASE-Quantum Espresso

Try adjusting the smearing method (default is based on a Fermi-Dirac distribution). Other options are listed here, but generally Gaussian smearing is a good idea.

calc = espresso( ...
     smearing='gauss',
     ...)

The default mixing parameters are quite aggressive, and will often fail for more heterogeneous systems (alloys, oxides, et cetera). They are passed to ASE-Espresso in a dictionary with defaults shown below:

calc = espresso(...
     convergence = {'energy': 1e-6,
                    'mixing': 0.7,
                    'nmix': 8,
                    'mixing_mode': 'plain',
                    'maxsteps': 100,
                    'diag': 'david'},

     ...)

You may want to consider changing:
mixing and nmix - 'mixing' controls the relative size of the contribution previous SCF iterations' electronic densities. Lowering the value of 'mixing' is often required to allow the system to converge, but reducing it too far will provide aphysical results. As a general rule, the argument passed for 'mixing' should be at least 0.1, and the arguments passed for 'mixing' and 'nmix' should have a product of at least 1.

mixing_mode - For systems with reduced symmetry (including calculations at a surface) it is often helpful to use the 'local-TF' mixing mode, which more easily accounts for a heterogeneous charge density.

maxsteps - 100 steps is usually enough for many systems, and 200 steps might be more reasonable for the remainder. Unless you can see a clear trend towards electronic convergence, increasing 'maxsteps' beyond 200 is likely just causing your calculation to take longer to fail.

diag - I have yet to encounter a system where changing this parameter has eased convergence. However, in the interest of completeness, it may also be set to 'cg' to follow a conjugate-gradient based approach. This is substantially slower than the default Davidson diagonalization.

In summary, a common setup I use for oxide surfaces is shown below:

calc = espresso(...
     convergence = {'energy': 1e-6,
                    'mixing': 0.2,
                    'mixing_mode': 'local-TF',
                    'nmix': 10,
                    'maxsteps': 200,
                    'diag': 'david'},
     ...)

By default ASE-Espresso adds ten "extra" bands (based on the number of valence electrons in the pseudopotentials). This is often enough, however it doesn't scale with system size. Adding additional empty bands will slightly increase computational cost per SCF iteration, but may reduce overall computational time by reducing the number of SCF steps required. As a general rule, it makes sense to have up to 20-30% more bands than needed based on valence electron count alone.
If you are using more advanced computational techniques (Hubbard "+U" correction, exact exchange, et cetera) you may find it helpful to refer to those user guides (coming soon).

VASP

If you're using spin-polarization or a dipole correction, try starting from the CHGCAR of a non-spin-polarized calculation by setting:
ISTART = 0
ICHARG = 1
Use linear mixing:
BMIX = 0.0001
BMIX_MAG = 0.0001
Reduce AMIX and AMIX_MAG
Reduce MAXMIX (default is 45)
Restart from partly converged results by stopping after 20 electronic steps and restarting from WAVECAR (ie ISTART = 1

For further information, see: https://www.vasp.at/vasp-workshop/slides/optelectron.pdf

GPAW

Check out the guide GPAW has put together for just this purpose! Link

Printing at SUNCAT

SUNCAT users have access to standard (letter size) printing in the Shriram 312 (graduate student office), the multi-function devices (MFD) throughout the Stanford Engineering Quad, and in our offices at SLAC. To print posters, business cards, or for other projects, please contact one of our administrators.

Printing in Shriram 312:

Mac OS X Installation:

Open "System Preferences"
Select "Printers & Scanners"
You should now see a screen like this one:
Click on the + sign in the lower-left area of the window to add a new printer
Click "IP" in the upper-left, and enter "dband.stanford.edu" in the address field. The other fields will autocomplete as necessary. You should now see something like this:
Click "Add" in the lower-right
The printer should now be installed and ready for use. Print from anywhere (with internet access), and pick your documents up in Shriram 312

Computational Resources

Most of our work is carried out by supercomputing clusters managed and/or maintained by SUNCAT members. Here we detail which resources are available, and basic policies for each. To get started using any of these resources, contact Edward Truong (edtruong@stanford.edu) to set up an account.

"SLAC"

Our oldest and (currently) largest system is hosted at SLAC National Accelerator laboratory, and is often referred to by group members as "SLAC." SUNCAT members have access to several different queues based on the demands of their calculations:

Computing resources hosted at SLAC
Queue Name	Processor Type	Cores/GPUs per node	Nodes Available	Memory per core (or GPU)	Interconnect	Cost Factor
suncat-test	Nehalem X5550	8*	2	3 GB	1 Gbit Ethernet	0.0
suncat	Nehalem X5550	8*	284	3 GB	1 Gbit Ethernet	1.0
suncat2	Westmere X5650	12*	46	4 GB	2 Gbit Ethernet	1.1
suncat3	Sandy Bridge E5-2670	16*	64	4 GB	40 Gbit QDR Infiniband	1.8
suncat-gpu	Nvidia M2090	7	17	6 GB	40 Gbit QDR Infiniband	0.0

* These numbers represent the number of logical cores in a system that utilizes hyper-threading. Certain codes may use this technology more efficiently than others.

"Sherlock"

We now additionally have computer resources hosted at Stanford University hosted in the cluster codenamed Sherlock. Please ntoe that not all Sherlock users are SUNCAT affiliates, so you may see unfamiliar usernames. As Sherlock is relatively new, it's also relatively homogeneous, though different use cases are detailed below (and on their own wiki):

Node Type	Processor / GPU	Cores/GPUs per node	Memory per CPU (or GPU)	Notes
CPU / Default	Dual-Socket Xeon E5-2650	16	4 GB	Will be used by default
GPU	NVIDIA Tesla K20Xm (or) NVIDIA GeForce GTX Titan Black	8	32 GB	Will be used by default
Big Data/Increased RAM	Quad-Socket Xeon E5-4640	21	48 GB	Use --qos='bigmem' and --partition='bigmem'

As detailed below, counting available nodes is complicated by the variety of partitions available to SUNCAT users.

Partitions

Please note that there is no "test" queue on Sherlock. To check syntax, and for general python scripting you should enter "sdev" to move from the login node to a production node.

All Sherlock users (including SUNCAT members) are able to submit jobs to the "normal" partition. SUNCAT users are also able to submit to the "iric" partition (which is shared with several other research groups). Most SUNCAT members will notice no effective difference between these partitions, and should generally submit to both.

Additionally, SUNCAT users may submit to the "owners" partition. These are nodes purchased by other groups that would otherwise be left idle. You are free to use them, but your job may be cancelled if that resource is requested by its proper owner. However, there are also generally several nodes available in this partition at any time. Therefore, it is best used for jobs that are either expected to be short, or are equipped to restart efficiently.

Other specialty partitions exist (gpu and bigmem), and are detailed on the Sherlock wiki.

Running on GPUs

At this time there are no quantum chemical software packages configured for GPU use on Sherlock. However, if you are using another script that would benefit specifically from GPU acceleration you can follow the instructions here.

Electrostatic potential example

The following script generates and saves the electrostatic potential data.

#######################################################
## Tutorial by: Andrew
## Topic: Electrostatic potential in ase-espresso
##
## dependencies: ase, espresso, pickle
##
## Generates the data files necessary to analyze
## the charge density and electrostatic potential
## in and around a Pt(100) slab
#######################################################

# Import necessary packages
from ase import Atoms
from ase.io import write
from ase.lattice.surface import fcc100
from espresso import espresso
import pickle

# Generate the representation of the atoms in ASE
slab = fcc100('Pt', size=(2,2,3), vacuum=10.0)

# Create an espresso calculator object
# (Tailor for desired settings and accuracy)
calc = espresso(pw=400 , # Plane wave cutoff
                dw=4000, # Density wave cutoff
                kpts=(3,3,1), # (rather sparse) k-point (Brillouin) sampling
                xc='RPBE', #Exchange-correlation functional
                dipole={'status':True}, # Includes dipole correction (necessary for asymmetric slabs)
                outdir='pt100_analysis') # Espresso-generated files will be put here

# Attach the calculator object to the atoms
slab.set_calculator(calc)

# Perform a static calculation to generate wavefunctions
slab.get_potential_energy()

# Create an image of the atoms
write('slab.png',slab)

# Get the electrostatic potential
potential = calc.extract_total_potential()

# Create and populate output file for electrostatic potential

potential_file=open('potential.pickle','w')
pickle.dump(potential,potential_file)
potential_file.close()

Now we are done with the intensive calculations. The following scripts will interpret the results and can be run easily on any computer in a matter of seconds.

#######################################################
## Tutorial by: Andrew
## Topic: Plotting electrostatic potential data from
##        ase-espresso
##
## dependencies: pickle, pylab, numpy
##
## Plots an XY-averaged electrostatic potential as
## a function of Z position through and outside
## a Pt(100) slab.
#######################################################

# Import necessary packages
import pickle
import pylab as plt
import numpy as np

# Retrieves the data generated in the previous script
potential_file = open('potential.pickle')
potential = pickle.load(potential_file)

origin=potential[0] # [X0,Y0,Z0] - The positions of the origin of the cell.  Here they are [0,0,0]

cell = potential[1] # [3x3 Matrix] Contains the [x,y,z] components of the 3 cell vectors
cell_x = cell[0][0] # The x component of the first cell vector (Remember: This cell is rectangular)
cell_y = cell[1][1] # The y component of the second cell vector (Remember: This cell is rectangular)
cell_z = cell[2][2] # The z component of the third cell vector (Remember: This cell is rectangular)

v = potential[2] # Contains all real-space potential field points.

#######################################################
## Understanding the architecture of the rest of the
## data can be tricky.  Using the variables presented
## here, V(x,y,z)=v[x][y][z].
##
## First we aim to extract all the information
## available to us, based on numpy's architecture.
#######################################################

n_x = len(v) # Returns the number of entries in the first subset (number of X entries)
n_y = len(v[0]) # Returns the number of entries in the second subset (number of Y entries - choice of '0' was arbitrary)
n_z = len(v[0][0]) # Returns the number of entries in the third subset (number of Z entries - choice of '0' was arbitrary)

potential = [] # Initializes a list for average potentials

# Iterate over all space to average and process potential information
for z in range(n_z):
        temp = 0 # Will serve as a partial sum for averaging purposes
        for x in range(n_x):
                for y in range(n_y):
                        temp+=v[x][y][z] # Sums the potential at all points in an X-Y plane
        temp = temp / (n_x*n_y) # Divides by the total number of points
        potential.append( temp ) # Stores the value for average potential at this Z position

# Translates the indices of points along the Z axis into real-space positions
z_coordinate = np.linspace( 0 , cell_z , n_z )

# Plots the data, formats plot
plt.plot(z_coordinate, potential)
plt.xlabel(r'z ($\AA$)',size=18)
plt.ylabel(r'$\langle V \ \rangle_{XY}$ (Volts)',size=18)
plt.title('Electrostatics of Pt(100)',size=18)

# Display plot
plt.show()
# If you'd prefer to save the plot, highlight the line above and uncomment the line below
#plt.savefig('potential_analysis.png')

The end result should look something like this: