Merge pull request #1971 from charlessievers/chem_snap

Chemical specificity for SNAP
2020-06-18 10:11:15 -04:00 · 2020-06-18 10:11:15 -04:00 · 77f6fecc86
parent 586c81b1be a221f13308
commit 77f6fecc86
27 changed files with 2060 additions and 624 deletions
--- a/doc/src/compute_sna_atom.rst
+++ b/doc/src/compute_sna_atom.rst
@ -30,7 +30,7 @@ Syntax
 * R_1, R_2,... = list of cutoff radii, one for each type (distance units)
 * w_1, w_2,... = list of neighbor weights, one for each type
 * zero or more keyword/value pairs may be appended
-* keyword = *rmin0* or *switchflag* or *bzeroflag* or *quadraticflag*
+* keyword = *rmin0* or *switchflag* or *bzeroflag* or *quadraticflag* or *chem* or *bnormflag* or *wselfallflag*

  .. parsed-literal::

@ -44,6 +44,15 @@ Syntax
       *quadraticflag* value = *0* or *1*
          *0* = do not generate quadratic terms
          *1* = generate quadratic terms
+       *chem* values = *nelements* *elementlist*
+          *nelements* = number of SNAP elements
+          *elementlist* = *ntypes* integers in range [0, *nelements*)
+       *bnormflag* value = *0* or *1*
+          *0* = do not normalize
+          *1* = normalize bispectrum components
+       *wselfallflag* value = *0* or *1*
+          *0* = self-contribution only for element of central atom
+          *1* = self-contribution for all elements

 Examples
 """"""""
@ -54,6 +63,7 @@ Examples
   compute db all sna/atom 1.4 0.95 6 2.0 1.0
   compute vb all sna/atom 1.4 0.95 6 2.0 1.0
   compute snap all snap 1.4 0.95 6 2.0 1.0
+   compute snap all snap 1.0 0.99363 6 3.81 3.83 1.0 0.93 chem 2 0 1

 Description
 """""""""""
@ -71,27 +81,26 @@ mathematical definition is given in the paper by Thompson et
 al. :ref:`(Thompson) <Thompson20141>`

 The position of a neighbor atom *i'* relative to a central atom *i* is
-a point within the 3D ball of radius *R_ii' = rcutfac\*(R_i + R_i')*
+a point within the 3D ball of radius :math:`R_{ii'}` = *rcutfac* :math:`(R_i + R_i')`

 Bartok et al. :ref:`(Bartok) <Bartok20101>`, proposed mapping this 3D ball
 onto the 3-sphere, the surface of the unit ball in a four-dimensional
-space.  The radial distance *r* within *R_ii'* is mapped on to a third
-polar angle *theta0* defined by,
+space.  The radial distance *r*  within *R_ii'* is mapped on to a third
+polar angle :math:`\theta_0` defined by,

 .. math::

-  \theta_0 = {\tt rfac0} \frac{r-r_{min0}}{R_{ii'}-r_{min0}} \pi
+  \theta_0 = {\sf rfac0} \frac{r-r_{min0}}{R_{ii'}-r_{min0}} \pi

 In this way, all possible neighbor positions are mapped on to a subset
-of the 3-sphere.  Points south of the latitude *theta0max=rfac0\*Pi*
+of the 3-sphere.  Points south of the latitude :math:`\theta_0` = *rfac0* :math:`\pi`
 are excluded.

-The natural basis for functions on the 3-sphere is formed by the 4D
-hyperspherical harmonics *U\^j_m,m'(theta, phi, theta0).*  These
-functions are better known as *D\^j_m,m',* the elements of the Wigner
+The natural basis for functions on the 3-sphere is formed by the
+representatives of *SU(2)*, the matrices :math:`U^j_{m,m'}(\theta, \phi, \theta_0)`.
+These functions are better known as :math:`D^j_{m,m'}`, the elements of the Wigner
 *D*\ -matrices :ref:`(Meremianin <Meremianin2006>`,
-:ref:`Varshalovich) <Varshalovich1987>`.
-
+:ref:`Varshalovich <Varshalovich1987>`, :ref:`Mason) <Mason2009>`
 The density of neighbors on the 3-sphere can be written as a sum of
 Dirac-delta functions, one for each neighbor, weighted by species and
 radial distance. Expanding this density function as a generalized
@ -100,20 +109,20 @@ coefficient as

 .. math::

-  u^j_{m,m'} = U^j_{m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{f_c(r_{ii'}) w_{i'} U^j_{m,m'}(\theta_0,\theta,\phi)}
+  u^j_{m,m'} = U^j_{m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{f_c(r_{ii'}) w_{\mu_{i'}} U^j_{m,m'}(\theta_0,\theta,\phi)}

-The *w_i'* neighbor weights are dimensionless numbers that are chosen
-to distinguish atoms of different types, while the central atom is
-arbitrarily assigned a unit weight.  The function *fc(r)* ensures that
+The :math:`w_{\mu_{i'}}` neighbor weights are dimensionless numbers that depend on
+:math:`\mu_{i'}`, the SNAP element of atom *i'*, while the central atom is
+arbitrarily assigned a unit weight.  The function :math:`f_c(r)` ensures that
 the contribution of each neighbor atom goes smoothly to zero at
-*R_ii'*:
+:math:`R_{ii'}`:

 .. math::

  f_c(r)   = & \frac{1}{2}(\cos(\pi \frac{r-r_{min0}}{R_{ii'}-r_{min0}}) + 1), r \leq R_{ii'} \\
           = & 0,  r > R_{ii'}

-The expansion coefficients *u\^j_m,m'* are complex-valued and they are
+The expansion coefficients :math:`u^j_{m,m'}` are complex-valued and they are
 not directly useful as descriptors, because they are not invariant
 under rotation of the polar coordinate frame. However, the following
 scalar triple products of expansion coefficients can be shown to be
@ -128,7 +137,8 @@ real-valued and invariant under rotation :ref:`(Bartok) <Bartok20101>`.
        {j_2} {m_2} {m'_2} \end{array}}
        u^{j_1}_{m_1,m'_1} u^{j_2}_{m_2,m'_2}

-The constants *H\^jmm'_j1m1m1'_j2m2m2'* are coupling coefficients,
+The constants :math:`H^{jmm'}_{j_1 m_1 m_{1'},j_2 m_ 2m_{2'}}`
+are coupling coefficients,
 analogous to Clebsch-Gordan coefficients for rotations on the
 2-sphere. These invariants are the components of the bispectrum and
 these are the quantities calculated by the compute *sna/atom*\ . They
@ -136,13 +146,12 @@ characterize the strength of density correlations at three points on
 the 3-sphere. The j2=0 subset form the power spectrum, which
 characterizes the correlations of two points. The lowest-order
 components describe the coarsest features of the density function,
-while higher-order components reflect finer detail.  Note that the
-central atom is included in the expansion, so three point-correlations
-can be either due to three neighbors, or two neighbors and the central
-atom.
+while higher-order components reflect finer detail. Each bispectrum
+component contains terms that depend on the positions of up to 4
+atoms (3 neighbors and the central atom).

 Compute *snad/atom* calculates the derivative of the bispectrum components
-summed separately for each atom type:
+summed separately for each LAMMPS atom type:

 .. math::

@ -165,7 +174,7 @@ Again, the sum is over all atoms *i'* of atom type *I*\ .  For each atom
 virial components, each atom type, and each bispectrum component.  See
 section below on output for a detailed explanation.

-Compute *snap* calculates a global array contains information related
+Compute *snap* calculates a global array containing information related
 to all three of the above per-atom computes *sna/atom*\ , *snad/atom*\ ,
 and *snav/atom*\ . The first row of the array contains the summation of
 *sna/atom* over all atoms, but broken out by type. The last six rows
@ -201,8 +210,8 @@ The argument *rcutfac* is a scale factor that controls the ratio of
 atomic radius to radial cutoff distance.

 The argument *rfac0* and the optional keyword *rmin0* define the
-linear mapping from radial distance to polar angle *theta0* on the
-3-sphere.
+linear mapping from radial distance to polar angle :math:`theta_0` on the
+3-sphere, given above.

 The argument *twojmax* defines which
 bispectrum components are generated. See section below on output for a
@ -210,7 +219,7 @@ detailed explanation of the number of bispectrum components and the
 ordered in which they are listed.

 The keyword *switchflag* can be used to turn off the switching
-function.
+function :math:`f_c(r)`.

 The keyword *bzeroflag* determines whether or not *B0*\ , the bispectrum
 components of an atom with no neighbors, are subtracted from
@ -219,13 +228,72 @@ normally only affects compute *sna/atom*\ . However, when
 *quadraticflag* is on, it also affects *snad/atom* and *snav/atom*\ .

 The keyword *quadraticflag* determines whether or not the
-quadratic analogs to the bispectrum quantities are generated.
+quadratic combinations of bispectrum quantities are generated.
 These are formed by taking the outer product of the vector
 of bispectrum components with itself.
 See section below on output for a
 detailed explanation of the number of quadratic terms and the
 ordered in which they are listed.

+The keyword *chem* activates the explicit multi-element variant
+of the SNAP bispectrum components. The argument *nelements*
+specifies the number of SNAP elements that will be handled.
+This is followed by *elementlist*, a list of integers of
+length *ntypes*, with values in the range [0, *nelements* ),
+which maps each LAMMPS type to one of the SNAP elements.
+Note that multiple LAMMPS types can be mapped to the same element,
+and some elements may be mapped by no LAMMPS type. However, in typical
+use cases (training SNAP potentials) the mapping from LAMMPS types
+to elements is one-to-one.
+
+The explicit multi-element variant invoked by the *chem* keyword
+partitions the density of neighbors into partial densities
+for each chemical element.  This is described in detail in the
+paper by :ref:`Cusentino et al. <Cusentino2020>`
+The bispectrum components are indexed on
+ordered triplets of elements:
+
+.. math::
+
+   B_{j_1,j_2,j}^{\kappa\lambda\mu} =
+   \sum_{m_1,m'_1=-j_1}^{j_1}\sum_{m_2,m'_2=-j_2}^{j_2}\sum_{m,m'=-j}^{j} (u^{\mu}_{j,m,m'})^*
+   H {\scriptscriptstyle \begin{array}{l} {j} {m} {m'} \\
+        {j_1} {m_1} {m'_1} \\
+        {j_2} {m_2} {m'_2} \end{array}}
+        u^{\kappa}_{j_1,m_1,m'_1} u^{\lambda}_{j_2,m_2,m'_2}
+
+where :math:`u^{\mu}_{j,m,m'}` is an expansion coefficient for the partial density of neighbors
+of element :math:`\mu`
+
+.. math::
+
+  u^{\mu}_{j,m,m'} =  w^{self}_{\mu_{i}\mu} U^{j,m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{\delta_{\mu\mu_{i'}}f_c(r_{ii'}) w_{\mu_{i'}} U^{j,m,m'}(\theta_0,\theta,\phi)}
+
+where :math:`w^{self}_{\mu_{i}\mu}` is the self-contribution, which is either 1 or 0
+(see keyword *wselfallflag* below), :math:`\delta_{\mu\mu_{i'}}` indicates
+that the sum is only over neighbor atoms of element :math:`\mu`,
+and all other quantities are the same as those appearing in the
+original equation for :math:`u^j_{m,m'}` given above.
+
+The keyword *wselfallflag* defines the rule used for the self-contribution.
+If *wselfallflag* is on, then :math:`w^{self}_{\mu_{i}\mu}` = 1. If it is
+off then :math:`w^{self}_{\mu_{i}\mu}` = 0, except in the case
+of :math:`{\mu_{i}=\mu}`, when :math:`w^{self}_{\mu_{i}\mu}` = 1.
+When the *chem* keyword is not used, this keyword has no effect.
+
+The keyword *bnormflag* determines whether or not the bispectrum
+component :math:`B_{j_1,j_2,j}` is divided by a factor of :math:`2j+1`.
+This normalization simplifies force calculations because of the
+following symmetry relation
+
+.. math::
+
+ \frac{B_{j_1,j_2,j}}{2j+1} = \frac{B_{j,j_2,j_1}}{2j_1+1} = \frac{B_{j_1,j,j_2}}{2j_2+1}
+
+This option is typically used in conjunction with the *chem* keyword,
+and LAMMPS will generate a warning if both *chem* and *bnormflag*
+are not both set or not both unset.
+
 .. note::

   If you have a bonded system, then the settings of
@ -257,6 +325,8 @@ described by the following piece of python code:
           for j in range(j1-j2,min(twojmax,j1+j2)+1,2):
               if (j>=j1): print j1/2.,j2/2.,j/2.

+For even twojmax = 2(*m*\ -1), :math:`K = m(m+1)(2m+1)/6`, the *m*\ -th pyramidal number. For odd twojmax = 2 *m*\ -1, :math:`K = m(m+1)(m+2)/3`, twice the *m*\ -th tetrahedral number.
+
 .. note::

   the *diagonal* keyword allowing other possible choices
@ -267,16 +337,15 @@ described by the following piece of python code:
 Compute *snad/atom* evaluates a per-atom array. The columns are
 arranged into *ntypes* blocks, listed in order of atom type *I*\ .  Each
 block contains three sub-blocks corresponding to the *x*\ , *y*\ , and *z*
-components of the atom position.  Each of these sub-blocks contains
-one column for each bispectrum component, the same as for compute
-*sna/atom*
+components of the atom position.  Each of these sub-blocks contains *K*
+columns for the *K* bispectrum components, the same as for compute *sna/atom*

 Compute *snav/atom* evaluates a per-atom array. The columns are
 arranged into *ntypes* blocks, listed in order of atom type *I*\ .  Each
 block contains six sub-blocks corresponding to the *xx*\ , *yy*\ , *zz*\ ,
 *yz*\ , *xz*\ , and *xy* components of the virial tensor in Voigt
-notation.  Each of these sub-blocks contains one column for each
-bispectrum component, the same as for compute *sna/atom*
+notation.  Each of these sub-blocks contains *K*
+columns for the *K* bispectrum components, the same as for compute *sna/atom*

 Compute *snap* evaluates a global array.
 The columns are arranged into
@ -312,6 +381,14 @@ of linear terms i.e. linear and quadratic terms are contiguous.
 So the nesting order from inside to outside is bispectrum component,
 linear then quadratic, vector/tensor component, type.

+If the *chem* keyword is used, then the data is arranged into :math:`N_{elem}^3`
+sub-blocks, each sub-block corresponding to a particular chemical labeling
+:math:`\kappa\lambda\mu` with the last label changing fastest.
+Each sub-block contains *K* bispectrum components. For the purposes
+of handling contributions to force, virial, and quadratic combinations,
+these :math:`N_{elem}^3` sub-blocks are treated as a single block
+of :math:`K N_{elem}^3` columns.
+
 These values can be accessed by any command that uses per-atom values
 from a compute as input.  See the :doc:`Howto output <Howto_output>` doc
 page for an overview of LAMMPS output options.
@ -320,7 +397,8 @@ Restrictions
 """"""""""""

 These computes are part of the SNAP package.  They are only enabled if
-LAMMPS was built with that package.  See the :doc:`Build package <Build_package>` doc page for more info.
+LAMMPS was built with that package.  See the :doc:`Build package <Build_package>`
+doc page for more info.

 Related commands
 """"""""""""""""
@ -332,6 +410,7 @@ Default

 The optional keyword defaults are *rmin0* = 0,
 *switchflag* = 1, *bzeroflag* = 1, *quadraticflag* = 0,
+*bnormflag* = 0, *wselfallflag* = 0

 ----------

@ -352,3 +431,12 @@ available at `arXiv:1409.3880 <http://arxiv.org/abs/1409.3880>`_

 **(Varshalovich)** Varshalovich, Moskalev, Khersonskii, Quantum Theory
 of Angular Momentum, World Scientific, Singapore (1987).
+.. _Varshalovich1987:
+
+.. _Mason2009:
+
+**(Mason)** J. K. Mason, Acta Cryst A65, 259 (2009).
+
+.. _Cusentino2020:
+
+**(Cusentino)** Cusentino, Wood, and Thompson, J Phys Chem A, xxx, xxxxx, (2020)
--- a/doc/src/pair_snap.rst
+++ b/doc/src/pair_snap.rst
@ -24,27 +24,30 @@ Examples
 Description
 """""""""""

-Pair style *snap* computes interactions using the spectral
-neighbor analysis potential (SNAP) :ref:`(Thompson) <Thompson20142>`.
-Like the GAP framework of Bartok et al. :ref:`(Bartok2010) <Bartok20102>`,
-:ref:`(Bartok2013) <Bartok2013>` which uses bispectrum components
+Pair style *snap* defines the spectral
+neighbor analysis potential (SNAP), a machine-learning 
+interatomic potential :ref:`(Thompson) <Thompson20142>`.
+Like the GAP framework of Bartok et al. :ref:`(Bartok2010) <Bartok20102>`, 
+SNAP uses bispectrum components
 to characterize the local neighborhood of each atom
 in a very general way. The mathematical definition of the
-bispectrum calculation used by SNAP is identical
-to that used by :doc:`compute sna/atom <compute_sna_atom>`.
+bispectrum calculation and its derivatives w.r.t. atom positions
+is identical to that used by :doc:`compute snap <compute_sna_atom>`,
+which is used to fit SNAP potentials to *ab initio* energy, force,
+and stress data.
 In SNAP, the total energy is decomposed into a sum over
 atom energies. The energy of atom *i* is
 expressed as a weighted sum over bispectrum components.

 .. math::

-   E^i_{SNAP}(B_1^i,...,B_K^i) = \beta^{\alpha_i}_0 + \sum_{k=1}^K \beta_k^{\alpha_i} B_k^i
+   E^i_{SNAP}(B_1^i,...,B_K^i) = \beta^{\mu_i}_0 + \sum_{k=1}^K \beta_k^{\mu_i} B_k^i

 where :math:`B_k^i` is the *k*\ -th bispectrum component of atom *i*\ ,
-and :math:`\beta_k^{\alpha_i}` is the corresponding linear coefficient
-that depends on :math:\alpha_i`, the SNAP element of atom *i*\ . The
+and :math:`\beta_k^{\mu_i}` is the corresponding linear coefficient
+that depends on :math:`\mu_i`, the SNAP element of atom *i*\ . The
 number of bispectrum components used and their definitions
-depend on the value of *twojmax*
+depend on the value of *twojmax* and other parameters
 defined in the SNAP parameter file described below.
 The bispectrum calculation is described in more detail
 in :doc:`compute sna/atom <compute_sna_atom>`.
@ -136,17 +139,51 @@ The SNAP parameter file can contain blank and comment lines (start
 with #) anywhere. Each non-blank non-comment line must contain one
 keyword/value pair. The required keywords are *rcutfac* and
 *twojmax*\ . Optional keywords are *rfac0*\ , *rmin0*\ ,
-*switchflag*\ , *bzeroflag*\, and *chunksize*\.
+*switchflag*\ , *bzeroflag*\ , *quadraticflag*\ , *chemflag*\ , 
+*bnormflag*\ , *wselfallflag*\ , and *chunksize*\ .

 The default values for these keywords are

 * *rfac0* = 0.99363
 * *rmin0* = 0.0
-* *switchflag* = 0
+* *switchflag* = 1
 * *bzeroflag* = 1
-* *quadraticflag* = 1
+* *quadraticflag* = 0
+* *chemflag* = 0
+* *bnormflag* = 0
+* *wselfallflag* = 0
 * *chunksize* = 2000

+If *quadraticflag* is set to 1, then the SNAP energy expression includes additional quadratic terms 
+that have been shown to increase the overall accuracy of the potential without much increase
+in computational cost :ref:`(Wood) <Wood20182>`. 
+
+.. math::
+
+   E^i_{SNAP}(\mathbf{B}^i) = \beta^{\mu_i}_0 + \boldsymbol{\beta}^{\mu_i} \cdot \mathbf{B}_i + \frac{1}{2}\mathbf{B}^t_i \cdot \boldsymbol{\alpha}^{\mu_i} \cdot \mathbf{B}_i
+
+where :math:`\mathbf{B}_i` is the *K*-vector of bispectrum components, 
+:math:`\boldsymbol{\beta}^{\mu_i}` is the *K*-vector of linear coefficients 
+for element :math:`\mu_i`, and :math:`\boldsymbol{\alpha}^{\mu_i}` 
+is the symmetric *K* by *K* matrix of quadratic coefficients.
+The SNAP element file should contain *K*\ (\ *K*\ +1)/2 additional coefficients
+for each element, the upper-triangular elements of :math:`\boldsymbol{\alpha}^{\mu_i}`.
+
+If *chemflag* is set to 1, then the energy expression is written in terms of explicit multi-element bispectrum
+components indexed on ordered triplets of elements, which has been shown to increase the ability of the SNAP
+potential to capture energy differences in chemically complex systems, 
+at the expense of a significant increase in computational cost :ref:`(Cusentino) <Cusentino20202>`.
+
+.. math::
+
+   E^i_{SNAP}(\mathbf{B}^i) = \beta^{\mu_i}_0 + \sum_{\kappa,\lambda,\mu} \boldsymbol{\beta}^{\kappa\lambda\mu}_{\mu_i} \cdot \mathbf{B}^{\kappa\lambda\mu}_i 
+
+where :math:`\mathbf{B}^{\kappa\lambda\mu}_i` is the *K*-vector of bispectrum components 
+for neighbors of elements :math:`\kappa`, :math:`\lambda`, and :math:`\mu` and 
+:math:`\boldsymbol{\beta}^{\kappa\lambda\mu}_{\mu_i}` is the corresponding *K*-vector 
+of linear coefficients for element :math:`\mu_i`. The SNAP element file should contain 
+a total of :math:`K N_{elem}^3` coefficients for each of the :math:`N_{elem}` elements.
+
 The keyword *chunksize* is only applicable when using the
 pair style *snap* with the KOKKOS package and is ignored otherwise.
 This keyword controls
@ -159,10 +196,6 @@ into two passes.
 Detailed definitions for all the other keywords
 are given on the :doc:`compute sna/atom <compute_sna_atom>` doc page.

-If *quadraticflag* is set to 1, then the SNAP energy expression includes the quadratic term, 0.5\*B\^t.alpha.B, where alpha is a symmetric *K* by *K* matrix.
-The SNAP element file should contain *K*\ (\ *K*\ +1)/2 additional coefficients
-for each element, the upper-triangular elements of alpha.
-
 .. note::

   The previously used *diagonalstyle* keyword was removed in 2019,
@ -221,7 +254,8 @@ Related commands

 :doc:`compute sna/atom <compute_sna_atom>`,
 :doc:`compute snad/atom <compute_sna_atom>`,
-:doc:`compute snav/atom <compute_sna_atom>`
+:doc:`compute snav/atom <compute_sna_atom>`,
+:doc:`compute snap <compute_sna_atom>`

 **Default:** none

@ -235,6 +269,10 @@ Related commands

 **(Bartok2010)** Bartok, Payne, Risi, Csanyi, Phys Rev Lett, 104, 136403 (2010).

-.. _Bartok2013:
+.. _Wood20182:

-**(Bartok2013)** Bartok, Gillan, Manby, Csanyi, Phys Rev B 87, 184115 (2013).
+**(Wood)** Wood and Thompson, J Chem Phys, 148, 241721, (2018)
+
+.. _Cusentino20202:
+
+**(Cusentino)** Cusentino, Wood, and Thompson, J Phys Chem A, xxx, xxxxx, (2020)
--- a/doc/utils/sphinx-config/false_positives.txt
+++ b/doc/utils/sphinx-config/false_positives.txt
@ -487,6 +487,7 @@ Critchley
 crossterm
 Crowson
 Crozier
+Cryst
 Crystallogr
 Csanyi
 csh
@ -510,6 +511,7 @@ CuH
 cuFFT
 Cummins
 Curk
+Cusentino
 customIDs
 cutbond
 cuthi
@ -3296,6 +3298,7 @@ xsu
 xtc
 xu
 Xu
+xxxxx
 xy
 xyz
 xz
--- a/examples/snap/InP_JCPA2020.snap
+++ b/examples/snap/InP_JCPA2020.snap
@ -0,0 +1 @@
+../../potentials/InP_JCPA2020.snap
--- a/examples/snap/InP_JCPA2020.snapcoeff
+++ b/examples/snap/InP_JCPA2020.snapcoeff
@ -0,0 +1 @@
+../../potentials/InP_JCPA2020.snapcoeff
--- a/examples/snap/InP_JCPA2020.snapparam
+++ b/examples/snap/InP_JCPA2020.snapparam
@ -0,0 +1 @@
+../../potentials/InP_JCPA2020.snapparam
--- a/examples/snap/in.snap.InP.JCPA2020
+++ b/examples/snap/in.snap.InP.JCPA2020
@ -0,0 +1,46 @@
+# Demonstrate SNAP InP potential
+
+# Initialize simulation
+
+variable nsteps index 100
+variable nrep equal 4
+variable a equal 5.83
+units           metal
+
+# generate the box and atom positions using a FCC lattice
+
+variable nx equal ${nrep}
+variable ny equal ${nrep}
+variable nz equal ${nrep}
+
+boundary        p p p
+
+lattice         diamond $a
+region          box block 0 ${nx} 0 ${ny} 0 ${nz}
+create_box      2 box
+create_atoms    1 box basis 5 2 basis 6 2 basis 7 2 basis 8 2
+
+mass 1 114.76
+mass 2 30.98
+
+# choose potential
+
+include InP_JCPA2020.snap
+
+# Setup output
+
+thermo          10
+thermo_modify norm yes
+
+# Set up NVE run
+
+timestep 0.5e-3
+neighbor 1.0 bin
+neigh_modify once no every 1 delay 0 check yes
+
+# Run MD
+
+velocity all create 300.0 4928459 loop geom
+fix 1 all nve
+run             ${nsteps}
+
--- a/examples/snap/log.15Jun20.snap.InP.JCPA2020.g++.1
+++ b/examples/snap/log.15Jun20.snap.InP.JCPA2020.g++.1
@ -0,0 +1,157 @@
+LAMMPS (15 Jun 2020)
+  using 1 OpenMP thread(s) per MPI task
+# Demonstrate SNAP InP potential
+
+# Initialize simulation
+
+variable nsteps index 100
+variable nrep equal 4
+variable a equal 5.83
+units           metal
+
+# generate the box and atom positions using a FCC lattice
+
+variable nx equal ${nrep}
+variable nx equal 4
+variable ny equal ${nrep}
+variable ny equal 4
+variable nz equal ${nrep}
+variable nz equal 4
+
+boundary        p p p
+
+lattice         diamond $a
+lattice         diamond 5.83
+Lattice spacing in x,y,z = 5.83 5.83 5.83
+region          box block 0 ${nx} 0 ${ny} 0 ${nz}
+region          box block 0 4 0 ${ny} 0 ${nz}
+region          box block 0 4 0 4 0 ${nz}
+region          box block 0 4 0 4 0 4
+create_box      2 box
+Created orthogonal box = (0.0 0.0 0.0) to (23.32 23.32 23.32)
+  1 by 1 by 1 MPI processor grid
+create_atoms    1 box basis 5 2 basis 6 2 basis 7 2 basis 8 2
+Created 512 atoms
+  create_atoms CPU = 0.000 seconds
+
+mass 1 114.76
+mass 2 30.98
+
+# choose potential
+
+include InP_JCPA2020.snap
+# DATE: 2020-06-01 CONTRIBUTOR: Mary Alice Cusentino mcusent@sandia.gov CITATION: M.A. Cusentino, M. A. Wood, and A.P. Thompson, "Explicit Multi-element Extension of the Spectral Neighbor Analysis Potential for Chemically Complex Systems", J. Phys. Chem. A, xxxxxx (2020)
+
+# Definition of SNAP+ZBL potential.
+
+variable zblcutinner index 4
+variable zblcutouter index 4.2
+variable zblz1 index 49
+variable zblz2 index 15
+
+# Specify hybrid with SNAP and ZBL
+
+pair_style hybrid/overlay zbl ${zblcutinner} ${zblcutouter} snap
+pair_style hybrid/overlay zbl 4 ${zblcutouter} snap
+pair_style hybrid/overlay zbl 4 4.2 snap
+pair_coeff 1 1 zbl ${zblz1} ${zblz1}
+pair_coeff 1 1 zbl 49 ${zblz1}
+pair_coeff 1 1 zbl 49 49
+pair_coeff 1 2 zbl ${zblz1} ${zblz2}
+pair_coeff 1 2 zbl 49 ${zblz2}
+pair_coeff 1 2 zbl 49 15
+pair_coeff 2 2 zbl ${zblz2} ${zblz2}
+pair_coeff 2 2 zbl 15 ${zblz2}
+pair_coeff 2 2 zbl 15 15
+pair_coeff * * snap InP_JCPA2020.snapcoeff InP_JCPA2020.snapparam In P
+Reading potential file InP_JCPA2020.snapcoeff with DATE: 2020-06-01
+SNAP Element = In, Radius 3.81205, Weight 1 
+SNAP Element = P, Radius 3.82945, Weight 0.929316 
+Reading potential file InP_JCPA2020.snapparam with DATE: 2020-06-01
+SNAP keyword rcutfac 1.0 
+SNAP keyword twojmax 6 
+SNAP keyword rfac0 0.99363 
+SNAP keyword rmin0 0.0 
+SNAP keyword bzeroflag 1 
+SNAP keyword quadraticflag 0 
+SNAP keyword wselfallflag 1 
+SNAP keyword chemflag 1 
+SNAP keyword bnormflag 1 
+
+# Setup output
+
+thermo          10
+thermo_modify norm yes
+
+# Set up NVE run
+
+timestep 0.5e-3
+neighbor 1.0 bin
+neigh_modify once no every 1 delay 0 check yes
+
+# Run MD
+
+velocity all create 300.0 4928459 loop geom
+fix 1 all nve
+run             ${nsteps}
+run             100
+Neighbor list info ...
+  update every 1 steps, delay 0 steps, check yes
+  max neighbors/atom: 2000, page size: 100000
+  master list distance cutoff = 8.6589
+  ghost atom cutoff = 8.6589
+  binsize = 4.32945, bins = 6 6 6
+  2 neighbor lists, perpetual/occasional/extra = 2 0 0
+  (1) pair zbl, perpetual, half/full from (2)
+      attributes: half, newton on
+      pair build: halffull/newton
+      stencil: none
+      bin: none
+  (2) pair snap, perpetual
+      attributes: full, newton on
+      pair build: full/bin/atomonly
+      stencil: full/bin/3d
+      bin: standard
+Per MPI rank memory allocation (min/avg/max) = 6.027 | 6.027 | 6.027 Mbytes
+Step Temp E_pair E_mol TotEng Press 
+       0          300   -3.4805794            0   -3.4418771    1353.5968 
+      10    285.84677   -3.4787531            0   -3.4418766    1611.7131 
+      20    248.14649   -3.4738884            0   -3.4418756    2312.0308 
+      30    198.94136   -3.4675394            0   -3.4418744    3168.1543 
+      40    152.74831   -3.4615791            0   -3.4418734    3903.5749 
+      50     121.9796   -3.4576091            0   -3.4418728    4387.1254 
+      60    113.27555   -3.4564863            0   -3.4418729    4556.3003 
+      70    125.68089   -3.4580873            0   -3.4418735    4431.2083 
+      80    151.47475   -3.4614159            0   -3.4418745    4107.2369 
+      90    179.18708   -3.4649919            0   -3.4418754    3739.5881 
+     100    197.50662   -3.4673559            0    -3.441876    3492.7778 
+Loop time of 13.3103 on 1 procs for 100 steps with 512 atoms
+
+Performance: 0.325 ns/day, 73.946 hours/ns, 7.513 timesteps/s
+99.7% CPU use with 1 MPI tasks x 1 OpenMP threads
+
+MPI task timing breakdown:
+Section |  min time  |  avg time  |  max time  |%varavg| %total
+---------------------------------------------------------------
+Pair    | 13.309     | 13.309     | 13.309     |   0.0 | 99.99
+Neigh   | 0          | 0          | 0          |   0.0 |  0.00
+Comm    | 0.00068474 | 0.00068474 | 0.00068474 |   0.0 |  0.01
+Output  | 0.00020504 | 0.00020504 | 0.00020504 |   0.0 |  0.00
+Modify  | 0.0003829  | 0.0003829  | 0.0003829  |   0.0 |  0.00
+Other   |            | 0.0004075  |            |       |  0.00
+
+Nlocal:    512 ave 512 max 512 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+Nghost:    1959 ave 1959 max 1959 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+Neighs:    31232 ave 31232 max 31232 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+FullNghs:  62464 ave 62464 max 62464 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+
+Total # of neighbors = 62464
+Ave neighs/atom = 122
+Neighbor list builds = 0
+Dangerous builds = 0
+
+Total wall time: 0:00:13
--- a/examples/snap/log.15Jun20.snap.InP.JCPA2020.g++.4
+++ b/examples/snap/log.15Jun20.snap.InP.JCPA2020.g++.4
@ -0,0 +1,157 @@
+LAMMPS (15 Jun 2020)
+  using 1 OpenMP thread(s) per MPI task
+# Demonstrate SNAP InP potential
+
+# Initialize simulation
+
+variable nsteps index 100
+variable nrep equal 4
+variable a equal 5.83
+units           metal
+
+# generate the box and atom positions using a FCC lattice
+
+variable nx equal ${nrep}
+variable nx equal 4
+variable ny equal ${nrep}
+variable ny equal 4
+variable nz equal ${nrep}
+variable nz equal 4
+
+boundary        p p p
+
+lattice         diamond $a
+lattice         diamond 5.83
+Lattice spacing in x,y,z = 5.83 5.83 5.83
+region          box block 0 ${nx} 0 ${ny} 0 ${nz}
+region          box block 0 4 0 ${ny} 0 ${nz}
+region          box block 0 4 0 4 0 ${nz}
+region          box block 0 4 0 4 0 4
+create_box      2 box
+Created orthogonal box = (0.0 0.0 0.0) to (23.32 23.32 23.32)
+  1 by 2 by 2 MPI processor grid
+create_atoms    1 box basis 5 2 basis 6 2 basis 7 2 basis 8 2
+Created 512 atoms
+  create_atoms CPU = 0.000 seconds
+
+mass 1 114.76
+mass 2 30.98
+
+# choose potential
+
+include InP_JCPA2020.snap
+# DATE: 2020-06-01 CONTRIBUTOR: Mary Alice Cusentino mcusent@sandia.gov CITATION: M.A. Cusentino, M. A. Wood, and A.P. Thompson, "Explicit Multi-element Extension of the Spectral Neighbor Analysis Potential for Chemically Complex Systems", J. Phys. Chem. A, xxxxxx (2020)
+
+# Definition of SNAP+ZBL potential.
+
+variable zblcutinner index 4
+variable zblcutouter index 4.2
+variable zblz1 index 49
+variable zblz2 index 15
+
+# Specify hybrid with SNAP and ZBL
+
+pair_style hybrid/overlay zbl ${zblcutinner} ${zblcutouter} snap
+pair_style hybrid/overlay zbl 4 ${zblcutouter} snap
+pair_style hybrid/overlay zbl 4 4.2 snap
+pair_coeff 1 1 zbl ${zblz1} ${zblz1}
+pair_coeff 1 1 zbl 49 ${zblz1}
+pair_coeff 1 1 zbl 49 49
+pair_coeff 1 2 zbl ${zblz1} ${zblz2}
+pair_coeff 1 2 zbl 49 ${zblz2}
+pair_coeff 1 2 zbl 49 15
+pair_coeff 2 2 zbl ${zblz2} ${zblz2}
+pair_coeff 2 2 zbl 15 ${zblz2}
+pair_coeff 2 2 zbl 15 15
+pair_coeff * * snap InP_JCPA2020.snapcoeff InP_JCPA2020.snapparam In P
+Reading potential file InP_JCPA2020.snapcoeff with DATE: 2020-06-01
+SNAP Element = In, Radius 3.81205, Weight 1 
+SNAP Element = P, Radius 3.82945, Weight 0.929316 
+Reading potential file InP_JCPA2020.snapparam with DATE: 2020-06-01
+SNAP keyword rcutfac 1.0 
+SNAP keyword twojmax 6 
+SNAP keyword rfac0 0.99363 
+SNAP keyword rmin0 0.0 
+SNAP keyword bzeroflag 1 
+SNAP keyword quadraticflag 0 
+SNAP keyword wselfallflag 1 
+SNAP keyword chemflag 1 
+SNAP keyword bnormflag 1 
+
+# Setup output
+
+thermo          10
+thermo_modify norm yes
+
+# Set up NVE run
+
+timestep 0.5e-3
+neighbor 1.0 bin
+neigh_modify once no every 1 delay 0 check yes
+
+# Run MD
+
+velocity all create 300.0 4928459 loop geom
+fix 1 all nve
+run             ${nsteps}
+run             100
+Neighbor list info ...
+  update every 1 steps, delay 0 steps, check yes
+  max neighbors/atom: 2000, page size: 100000
+  master list distance cutoff = 8.6589
+  ghost atom cutoff = 8.6589
+  binsize = 4.32945, bins = 6 6 6
+  2 neighbor lists, perpetual/occasional/extra = 2 0 0
+  (1) pair zbl, perpetual, half/full from (2)
+      attributes: half, newton on
+      pair build: halffull/newton
+      stencil: none
+      bin: none
+  (2) pair snap, perpetual
+      attributes: full, newton on
+      pair build: full/bin/atomonly
+      stencil: full/bin/3d
+      bin: standard
+Per MPI rank memory allocation (min/avg/max) = 4.587 | 4.587 | 4.587 Mbytes
+Step Temp E_pair E_mol TotEng Press 
+       0          300   -3.4805794            0   -3.4418771    1353.5968 
+      10    285.84677   -3.4787531            0   -3.4418766    1611.7131 
+      20    248.14649   -3.4738884            0   -3.4418756    2312.0308 
+      30    198.94136   -3.4675394            0   -3.4418744    3168.1543 
+      40    152.74831   -3.4615791            0   -3.4418734    3903.5749 
+      50     121.9796   -3.4576091            0   -3.4418728    4387.1254 
+      60    113.27555   -3.4564863            0   -3.4418729    4556.3003 
+      70    125.68089   -3.4580873            0   -3.4418735    4431.2083 
+      80    151.47475   -3.4614159            0   -3.4418745    4107.2369 
+      90    179.18708   -3.4649919            0   -3.4418754    3739.5881 
+     100    197.50662   -3.4673559            0    -3.441876    3492.7778 
+Loop time of 3.73974 on 4 procs for 100 steps with 512 atoms
+
+Performance: 1.155 ns/day, 20.776 hours/ns, 26.740 timesteps/s
+98.3% CPU use with 4 MPI tasks x 1 OpenMP threads
+
+MPI task timing breakdown:
+Section |  min time  |  avg time  |  max time  |%varavg| %total
+---------------------------------------------------------------
+Pair    | 3.4687     | 3.5182     | 3.5985     |   2.7 | 94.07
+Neigh   | 0          | 0          | 0          |   0.0 |  0.00
+Comm    | 0.13897    | 0.21952    | 0.26888    |  10.7 |  5.87
+Output  | 0.00018191 | 0.00047094 | 0.0012944  |   0.0 |  0.01
+Modify  | 0.00013065 | 0.00013524 | 0.00014186 |   0.0 |  0.00
+Other   |            | 0.001456   |            |       |  0.04
+
+Nlocal:    128 ave 128 max 128 min
+Histogram: 4 0 0 0 0 0 0 0 0 0
+Nghost:    1099 ave 1099 max 1099 min
+Histogram: 4 0 0 0 0 0 0 0 0 0
+Neighs:    7808 ave 7808 max 7808 min
+Histogram: 4 0 0 0 0 0 0 0 0 0
+FullNghs:  15616 ave 15616 max 15616 min
+Histogram: 4 0 0 0 0 0 0 0 0 0
+
+Total # of neighbors = 62464
+Ave neighs/atom = 122
+Neighbor list builds = 0
+Dangerous builds = 0
+
+Total wall time: 0:00:03
--- a/potentials/InP_JCPA2020.snap
+++ b/potentials/InP_JCPA2020.snap
@ -0,0 +1,18 @@
+# DATE: 2020-06-01 CONTRIBUTOR: Mary Alice Cusentino mcusent@sandia.gov CITATION: M.A. Cusentino, M. A. Wood, and A.P. Thompson, "Explicit Multi-element Extension of the Spectral Neighbor Analysis Potential for Chemically Complex Systems", J. Phys. Chem. A, xxxxxx (2020)
+ 
+# Definition of SNAP+ZBL potential.
+
+variable zblcutinner index 4
+variable zblcutouter index 4.2
+variable zblz1 index 49
+variable zblz2 index 15
+
+# Specify hybrid with SNAP and ZBL
+
+pair_style hybrid/overlay &
+zbl ${zblcutinner} ${zblcutouter} &
+snap
+pair_coeff 1 1 zbl ${zblz1} ${zblz1}
+pair_coeff 1 2 zbl ${zblz1} ${zblz2}
+pair_coeff 2 2 zbl ${zblz2} ${zblz2}
+pair_coeff * * snap InP_JCPA2020.snapcoeff InP_JCPA2020.snapparam In P
--- a/potentials/InP_JCPA2020.snapcoeff
+++ b/potentials/InP_JCPA2020.snapcoeff
@ -0,0 +1,487 @@
+# DATE: 2020-06-01 UNITS: metal CONTRIBUTOR: Mary Alice Cusentino mcusent@sandia.gov CITATION: M.A. Cusentino, M. A. Wood, and A.P. Thompson, "Explicit Multi-element Extension of the Spectral Neighbor Analysis Potential for Chemically Complex Systems", J. Phys. Chem. A, xxxxxx (2020)
+
+2 241
+In 3.81205 1
+  0.000000000000 # B[0] Block = 1 Type = In 
+ -0.000666721868 # B[1, 0, 0, 0] Block = 1 Type = In 
+  0.032408881964 # B[2, 1, 0, 1] Block = 1 Type = In 
+  0.182277739455 # B[3, 1, 1, 2] Block = 1 Type = In 
+  0.001455902168 # B[4, 2, 0, 2] Block = 1 Type = In 
+  0.086259367737 # B[5, 2, 1, 3] Block = 1 Type = In 
+ -0.044840628371 # B[6, 2, 2, 2] Block = 1 Type = In 
+ -0.175973261191 # B[7, 2, 2, 4] Block = 1 Type = In 
+ -0.052429169415 # B[8, 3, 0, 3] Block = 1 Type = In 
+  0.195529228497 # B[9, 3, 1, 4] Block = 1 Type = In 
+  0.078718744520 # B[10, 3, 2, 3] Block = 1 Type = In 
+ -0.688127658121 # B[11, 3, 2, 5] Block = 1 Type = In 
+  0.059084058400 # B[12, 3, 3, 4] Block = 1 Type = In 
+  0.006795099274 # B[13, 3, 3, 6] Block = 1 Type = In 
+ -0.043061553886 # B[14, 4, 0, 4] Block = 1 Type = In 
+ -0.046619800530 # B[15, 4, 1, 5] Block = 1 Type = In 
+ -0.117451659827 # B[16, 4, 2, 4] Block = 1 Type = In 
+ -0.233615100720 # B[17, 4, 2, 6] Block = 1 Type = In 
+  0.015358771114 # B[18, 4, 3, 5] Block = 1 Type = In 
+  0.022474133984 # B[19, 4, 4, 4] Block = 1 Type = In 
+  0.002165850235 # B[20, 4, 4, 6] Block = 1 Type = In 
+  0.003458938546 # B[21, 5, 0, 5] Block = 1 Type = In 
+ -0.053507775670 # B[22, 5, 1, 6] Block = 1 Type = In 
+  0.120989101467 # B[23, 5, 2, 5] Block = 1 Type = In 
+  0.092637875162 # B[24, 5, 3, 6] Block = 1 Type = In 
+  0.071459233521 # B[25, 5, 4, 5] Block = 1 Type = In 
+  0.086291858607 # B[26, 5, 5, 6] Block = 1 Type = In 
+  0.006749966752 # B[27, 6, 0, 6] Block = 1 Type = In 
+  0.144917284093 # B[28, 6, 2, 6] Block = 1 Type = In 
+  0.055178211309 # B[29, 6, 4, 6] Block = 1 Type = In 
+ -0.005619133266 # B[30, 6, 6, 6] Block = 1 Type = In 
+  0.005430513632 # B[1, 0, 0, 0] Block = 2 Type = In 
+  0.057269488101 # B[2, 1, 0, 1] Block = 2 Type = In 
+  0.320412300575 # B[3, 1, 1, 2] Block = 2 Type = In 
+  0.035481869944 # B[4, 2, 0, 2] Block = 2 Type = In 
+  0.111076763087 # B[5, 2, 1, 3] Block = 2 Type = In 
+  0.039770598731 # B[6, 2, 2, 2] Block = 2 Type = In 
+  0.141315510383 # B[7, 2, 2, 4] Block = 2 Type = In 
+  0.067792661762 # B[8, 3, 0, 3] Block = 2 Type = In 
+ -0.080858457946 # B[9, 3, 1, 4] Block = 2 Type = In 
+  0.258942062632 # B[10, 3, 2, 3] Block = 2 Type = In 
+  0.061756985062 # B[11, 3, 2, 5] Block = 2 Type = In 
+ -0.112424676196 # B[12, 3, 3, 4] Block = 2 Type = In 
+  0.168376857205 # B[13, 3, 3, 6] Block = 2 Type = In 
+ -0.029743698629 # B[14, 4, 0, 4] Block = 2 Type = In 
+ -0.093967263289 # B[15, 4, 1, 5] Block = 2 Type = In 
+  0.137229827290 # B[16, 4, 2, 4] Block = 2 Type = In 
+  0.056897919200 # B[17, 4, 2, 6] Block = 2 Type = In 
+  0.095137344320 # B[18, 4, 3, 5] Block = 2 Type = In 
+ -0.008598816416 # B[19, 4, 4, 4] Block = 2 Type = In 
+  0.038890602482 # B[20, 4, 4, 6] Block = 2 Type = In 
+ -0.034624751006 # B[21, 5, 0, 5] Block = 2 Type = In 
+ -0.282625695473 # B[22, 5, 1, 6] Block = 2 Type = In 
+  0.103089891872 # B[23, 5, 2, 5] Block = 2 Type = In 
+ -0.024380802146 # B[24, 5, 3, 6] Block = 2 Type = In 
+ -0.063847809434 # B[25, 5, 4, 5] Block = 2 Type = In 
+ -0.024896682749 # B[26, 5, 5, 6] Block = 2 Type = In 
+  0.000464369553 # B[27, 6, 0, 6] Block = 2 Type = In 
+  0.082229290277 # B[28, 6, 2, 6] Block = 2 Type = In 
+ -0.008875503360 # B[29, 6, 4, 6] Block = 2 Type = In 
+ -0.009039017094 # B[30, 6, 6, 6] Block = 2 Type = In 
+  0.005430513686 # B[1, 0, 0, 0] Block = 3 Type = In 
+ -0.004352445887 # B[2, 1, 0, 1] Block = 3 Type = In 
+  0.149882860704 # B[3, 1, 1, 2] Block = 3 Type = In 
+ -0.015528472583 # B[4, 2, 0, 2] Block = 3 Type = In 
+  0.558662861756 # B[5, 2, 1, 3] Block = 3 Type = In 
+  0.039770598731 # B[6, 2, 2, 2] Block = 3 Type = In 
+  0.179060667136 # B[7, 2, 2, 4] Block = 3 Type = In 
+  0.034759981675 # B[8, 3, 0, 3] Block = 3 Type = In 
+  0.603083480153 # B[9, 3, 1, 4] Block = 3 Type = In 
+  0.176946655350 # B[10, 3, 2, 3] Block = 3 Type = In 
+  0.165639632803 # B[11, 3, 2, 5] Block = 3 Type = In 
+  0.055627509305 # B[12, 3, 3, 4] Block = 3 Type = In 
+  0.049782791218 # B[13, 3, 3, 6] Block = 3 Type = In 
+  0.036078617029 # B[14, 4, 0, 4] Block = 3 Type = In 
+  0.064493563641 # B[15, 4, 1, 5] Block = 3 Type = In 
+  0.149250535822 # B[16, 4, 2, 4] Block = 3 Type = In 
+ -0.060208330201 # B[17, 4, 2, 6] Block = 3 Type = In 
+  0.105119833648 # B[18, 4, 3, 5] Block = 3 Type = In 
+ -0.008598816416 # B[19, 4, 4, 4] Block = 3 Type = In 
+  0.041210118888 # B[20, 4, 4, 6] Block = 3 Type = In 
+ -0.002705345469 # B[21, 5, 0, 5] Block = 3 Type = In 
+  0.170191392493 # B[22, 5, 1, 6] Block = 3 Type = In 
+  0.226897293272 # B[23, 5, 2, 5] Block = 3 Type = In 
+  0.013009034793 # B[24, 5, 3, 6] Block = 3 Type = In 
+ -0.020734586320 # B[25, 5, 4, 5] Block = 3 Type = In 
+ -0.018139074523 # B[26, 5, 5, 6] Block = 3 Type = In 
+ -0.016001848874 # B[27, 6, 0, 6] Block = 3 Type = In 
+  0.016663324316 # B[28, 6, 2, 6] Block = 3 Type = In 
+ -0.024245533697 # B[29, 6, 4, 6] Block = 3 Type = In 
+ -0.009039017094 # B[30, 6, 6, 6] Block = 3 Type = In 
+ -0.005654800687 # B[1, 0, 0, 0] Block = 4 Type = In 
+ -0.071064263981 # B[2, 1, 0, 1] Block = 4 Type = In 
+ -0.009868049046 # B[3, 1, 1, 2] Block = 4 Type = In 
+ -0.061297753855 # B[4, 2, 0, 2] Block = 4 Type = In 
+ -0.239682636759 # B[5, 2, 1, 3] Block = 4 Type = In 
+  0.015954956116 # B[6, 2, 2, 2] Block = 4 Type = In 
+  0.176005610703 # B[7, 2, 2, 4] Block = 4 Type = In 
+ -0.081125948095 # B[8, 3, 0, 3] Block = 4 Type = In 
+ -0.170847987084 # B[9, 3, 1, 4] Block = 4 Type = In 
+  0.242239715395 # B[10, 3, 2, 3] Block = 4 Type = In 
+  0.082507688294 # B[11, 3, 2, 5] Block = 4 Type = In 
+  0.247785108978 # B[12, 3, 3, 4] Block = 4 Type = In 
+ -0.008194303016 # B[13, 3, 3, 6] Block = 4 Type = In 
+  0.014786217107 # B[14, 4, 0, 4] Block = 4 Type = In 
+ -0.096877379511 # B[15, 4, 1, 5] Block = 4 Type = In 
+  0.164908528605 # B[16, 4, 2, 4] Block = 4 Type = In 
+  0.151575252604 # B[17, 4, 2, 6] Block = 4 Type = In 
+  0.099757230122 # B[18, 4, 3, 5] Block = 4 Type = In 
+  0.035047662350 # B[19, 4, 4, 4] Block = 4 Type = In 
+  0.007150552805 # B[20, 4, 4, 6] Block = 4 Type = In 
+  0.019198319779 # B[21, 5, 0, 5] Block = 4 Type = In 
+ -0.127113932870 # B[22, 5, 1, 6] Block = 4 Type = In 
+  0.114478010571 # B[23, 5, 2, 5] Block = 4 Type = In 
+  0.050915227324 # B[24, 5, 3, 6] Block = 4 Type = In 
+  0.096853268510 # B[25, 5, 4, 5] Block = 4 Type = In 
+  0.067894750884 # B[26, 5, 5, 6] Block = 4 Type = In 
+ -0.002405537661 # B[27, 6, 0, 6] Block = 4 Type = In 
+  0.058549926350 # B[28, 6, 2, 6] Block = 4 Type = In 
+  0.009481237049 # B[29, 6, 4, 6] Block = 4 Type = In 
+ -0.008649958571 # B[30, 6, 6, 6] Block = 4 Type = In 
+  0.005430513686 # B[1, 0, 0, 0] Block = 5 Type = In 
+  0.057269488102 # B[2, 1, 0, 1] Block = 5 Type = In 
+  0.149882860704 # B[3, 1, 1, 2] Block = 5 Type = In 
+  0.035481869944 # B[4, 2, 0, 2] Block = 5 Type = In 
+  0.378916788823 # B[5, 2, 1, 3] Block = 5 Type = In 
+  0.039770598731 # B[6, 2, 2, 2] Block = 5 Type = In 
+  0.179060667136 # B[7, 2, 2, 4] Block = 5 Type = In 
+  0.067792661762 # B[8, 3, 0, 3] Block = 5 Type = In 
+  0.272613304171 # B[9, 3, 1, 4] Block = 5 Type = In 
+  0.258942062632 # B[10, 3, 2, 3] Block = 5 Type = In 
+  0.100130474069 # B[11, 3, 2, 5] Block = 5 Type = In 
+  0.055627509305 # B[12, 3, 3, 4] Block = 5 Type = In 
+  0.049782791218 # B[13, 3, 3, 6] Block = 5 Type = In 
+ -0.029743698629 # B[14, 4, 0, 4] Block = 5 Type = In 
+ -0.013420300314 # B[15, 4, 1, 5] Block = 5 Type = In 
+  0.137229827290 # B[16, 4, 2, 4] Block = 5 Type = In 
+ -0.034447269506 # B[17, 4, 2, 6] Block = 5 Type = In 
+ -0.033847124314 # B[18, 4, 3, 5] Block = 5 Type = In 
+ -0.008598816416 # B[19, 4, 4, 4] Block = 5 Type = In 
+  0.041210118888 # B[20, 4, 4, 6] Block = 5 Type = In 
+ -0.034624751006 # B[21, 5, 0, 5] Block = 5 Type = In 
+  0.041662678638 # B[22, 5, 1, 6] Block = 5 Type = In 
+  0.103089891872 # B[23, 5, 2, 5] Block = 5 Type = In 
+ -0.044572198386 # B[24, 5, 3, 6] Block = 5 Type = In 
+ -0.063847809434 # B[25, 5, 4, 5] Block = 5 Type = In 
+ -0.018139074523 # B[26, 5, 5, 6] Block = 5 Type = In 
+  0.000464369553 # B[27, 6, 0, 6] Block = 5 Type = In 
+  0.082229290277 # B[28, 6, 2, 6] Block = 5 Type = In 
+ -0.008875503360 # B[29, 6, 4, 6] Block = 5 Type = In 
+ -0.009039017094 # B[30, 6, 6, 6] Block = 5 Type = In 
+ -0.005654800687 # B[1, 0, 0, 0] Block = 6 Type = In 
+ -0.001217874195 # B[2, 1, 0, 1] Block = 6 Type = In 
+ -0.009868049046 # B[3, 1, 1, 2] Block = 6 Type = In 
+ -0.092827766060 # B[4, 2, 0, 2] Block = 6 Type = In 
+  0.439274283244 # B[5, 2, 1, 3] Block = 6 Type = In 
+  0.015954956116 # B[6, 2, 2, 2] Block = 6 Type = In 
+  0.176005610703 # B[7, 2, 2, 4] Block = 6 Type = In 
+  0.102468480364 # B[8, 3, 0, 3] Block = 6 Type = In 
+  0.674122225402 # B[9, 3, 1, 4] Block = 6 Type = In 
+  0.072529538087 # B[10, 3, 2, 3] Block = 6 Type = In 
+  0.330711171466 # B[11, 3, 2, 5] Block = 6 Type = In 
+  0.247785108978 # B[12, 3, 3, 4] Block = 6 Type = In 
+ -0.008194303016 # B[13, 3, 3, 6] Block = 6 Type = In 
+  0.052250780232 # B[14, 4, 0, 4] Block = 6 Type = In 
+  0.374231060518 # B[15, 4, 1, 5] Block = 6 Type = In 
+  0.326667869620 # B[16, 4, 2, 4] Block = 6 Type = In 
+  0.079031873518 # B[17, 4, 2, 6] Block = 6 Type = In 
+  0.224004472527 # B[18, 4, 3, 5] Block = 6 Type = In 
+  0.035047662350 # B[19, 4, 4, 4] Block = 6 Type = In 
+  0.007150552805 # B[20, 4, 4, 6] Block = 6 Type = In 
+  0.040682917098 # B[21, 5, 0, 5] Block = 6 Type = In 
+  0.046855927526 # B[22, 5, 1, 6] Block = 6 Type = In 
+  0.219695071346 # B[23, 5, 2, 5] Block = 6 Type = In 
+ -0.001426581661 # B[24, 5, 3, 6] Block = 6 Type = In 
+  0.028514699601 # B[25, 5, 4, 5] Block = 6 Type = In 
+  0.067894750884 # B[26, 5, 5, 6] Block = 6 Type = In 
+ -0.049888149225 # B[27, 6, 0, 6] Block = 6 Type = In 
+  0.009259151039 # B[28, 6, 2, 6] Block = 6 Type = In 
+  0.003868002128 # B[29, 6, 4, 6] Block = 6 Type = In 
+ -0.008649958571 # B[30, 6, 6, 6] Block = 6 Type = In 
+ -0.005654800692 # B[1, 0, 0, 0] Block = 7 Type = In 
+ -0.071064263981 # B[2, 1, 0, 1] Block = 7 Type = In 
+ -0.085085203640 # B[3, 1, 1, 2] Block = 7 Type = In 
+ -0.061297753855 # B[4, 2, 0, 2] Block = 7 Type = In 
+  0.223668616358 # B[5, 2, 1, 3] Block = 7 Type = In 
+  0.015954956116 # B[6, 2, 2, 2] Block = 7 Type = In 
+  0.033706085249 # B[7, 2, 2, 4] Block = 7 Type = In 
+ -0.081125948095 # B[8, 3, 0, 3] Block = 7 Type = In 
+ -0.005054494008 # B[9, 3, 1, 4] Block = 7 Type = In 
+  0.242239715395 # B[10, 3, 2, 3] Block = 7 Type = In 
+ -0.000886414104 # B[11, 3, 2, 5] Block = 7 Type = In 
+  0.059178212190 # B[12, 3, 3, 4] Block = 7 Type = In 
+  0.008498646326 # B[13, 3, 3, 6] Block = 7 Type = In 
+  0.014786217107 # B[14, 4, 0, 4] Block = 7 Type = In 
+ -0.178665293356 # B[15, 4, 1, 5] Block = 7 Type = In 
+  0.164908528605 # B[16, 4, 2, 4] Block = 7 Type = In 
+ -0.117717485069 # B[17, 4, 2, 6] Block = 7 Type = In 
+  0.146739677531 # B[18, 4, 3, 5] Block = 7 Type = In 
+  0.035047662350 # B[19, 4, 4, 4] Block = 7 Type = In 
+  0.088770688382 # B[20, 4, 4, 6] Block = 7 Type = In 
+  0.019198319779 # B[21, 5, 0, 5] Block = 7 Type = In 
+ -0.148162265312 # B[22, 5, 1, 6] Block = 7 Type = In 
+  0.114478010571 # B[23, 5, 2, 5] Block = 7 Type = In 
+  0.114731400461 # B[24, 5, 3, 6] Block = 7 Type = In 
+  0.096853268510 # B[25, 5, 4, 5] Block = 7 Type = In 
+  0.031183854107 # B[26, 5, 5, 6] Block = 7 Type = In 
+ -0.002405537661 # B[27, 6, 0, 6] Block = 7 Type = In 
+  0.058549926350 # B[28, 6, 2, 6] Block = 7 Type = In 
+  0.009481237049 # B[29, 6, 4, 6] Block = 7 Type = In 
+ -0.008649958571 # B[30, 6, 6, 6] Block = 7 Type = In 
+  0.017733403092 # B[1, 0, 0, 0] Block = 8 Type = In 
+  0.015168905151 # B[2, 1, 0, 1] Block = 8 Type = In 
+ -0.212358294308 # B[3, 1, 1, 2] Block = 8 Type = In 
+  0.115608035432 # B[4, 2, 0, 2] Block = 8 Type = In 
+  0.128621845177 # B[5, 2, 1, 3] Block = 8 Type = In 
+ -0.055682216710 # B[6, 2, 2, 2] Block = 8 Type = In 
+  0.168986321733 # B[7, 2, 2, 4] Block = 8 Type = In 
+ -0.087186888529 # B[8, 3, 0, 3] Block = 8 Type = In 
+  0.378810692322 # B[9, 3, 1, 4] Block = 8 Type = In 
+  0.036128510376 # B[10, 3, 2, 3] Block = 8 Type = In 
+  0.179888488204 # B[11, 3, 2, 5] Block = 8 Type = In 
+ -0.001405954437 # B[12, 3, 3, 4] Block = 8 Type = In 
+  0.010551104009 # B[13, 3, 3, 6] Block = 8 Type = In 
+ -0.059381370200 # B[14, 4, 0, 4] Block = 8 Type = In 
+  0.475432753620 # B[15, 4, 1, 5] Block = 8 Type = In 
+  0.095868282640 # B[16, 4, 2, 4] Block = 8 Type = In 
+  0.106524975238 # B[17, 4, 2, 6] Block = 8 Type = In 
+  0.058941182257 # B[18, 4, 3, 5] Block = 8 Type = In 
+  0.012512778321 # B[19, 4, 4, 4] Block = 8 Type = In 
+  0.080549204239 # B[20, 4, 4, 6] Block = 8 Type = In 
+ -0.068536821891 # B[21, 5, 0, 5] Block = 8 Type = In 
+  0.089459777664 # B[22, 5, 1, 6] Block = 8 Type = In 
+  0.163187761880 # B[23, 5, 2, 5] Block = 8 Type = In 
+  0.139719330200 # B[24, 5, 3, 6] Block = 8 Type = In 
+  0.145095171389 # B[25, 5, 4, 5] Block = 8 Type = In 
+  0.074157391376 # B[26, 5, 5, 6] Block = 8 Type = In 
+  0.018646775951 # B[27, 6, 0, 6] Block = 8 Type = In 
+  0.035882498943 # B[28, 6, 2, 6] Block = 8 Type = In 
+  0.050420424420 # B[29, 6, 4, 6] Block = 8 Type = In 
+  0.009994821144 # B[30, 6, 6, 6] Block = 8 Type = In 
+P 3.82945 0.929316
+  0.000000000000 # B[0] Block = 1 Type = P 
+ -0.002987589706 # B[1, 0, 0, 0] Block = 1 Type = P 
+ -0.072158389412 # B[2, 1, 0, 1] Block = 1 Type = P 
+ -0.025322250603 # B[3, 1, 1, 2] Block = 1 Type = P 
+ -0.030241379192 # B[4, 2, 0, 2] Block = 1 Type = P 
+ -0.676287334962 # B[5, 2, 1, 3] Block = 1 Type = P 
+ -0.172216278028 # B[6, 2, 2, 2] Block = 1 Type = P 
+ -0.246535097343 # B[7, 2, 2, 4] Block = 1 Type = P 
+ -0.035782528462 # B[8, 3, 0, 3] Block = 1 Type = P 
+ -0.560641780823 # B[9, 3, 1, 4] Block = 1 Type = P 
+ -0.549579515296 # B[10, 3, 2, 3] Block = 1 Type = P 
+ -0.016920837991 # B[11, 3, 2, 5] Block = 1 Type = P 
+ -0.288447245376 # B[12, 3, 3, 4] Block = 1 Type = P 
+  0.069076859372 # B[13, 3, 3, 6] Block = 1 Type = P 
+ -0.052298717507 # B[14, 4, 0, 4] Block = 1 Type = P 
+ -0.434250953671 # B[15, 4, 1, 5] Block = 1 Type = P 
+ -0.322043860507 # B[16, 4, 2, 4] Block = 1 Type = P 
+ -0.096837010372 # B[17, 4, 2, 6] Block = 1 Type = P 
+ -0.213169352789 # B[18, 4, 3, 5] Block = 1 Type = P 
+ -0.019566740546 # B[19, 4, 4, 4] Block = 1 Type = P 
+ -0.029415128740 # B[20, 4, 4, 6] Block = 1 Type = P 
+ -0.036591077655 # B[21, 5, 0, 5] Block = 1 Type = P 
+ -0.300384511072 # B[22, 5, 1, 6] Block = 1 Type = P 
+ -0.111126537447 # B[23, 5, 2, 5] Block = 1 Type = P 
+ -0.000209831053 # B[24, 5, 3, 6] Block = 1 Type = P 
+ -0.000023632674 # B[25, 5, 4, 5] Block = 1 Type = P 
+  0.009497323905 # B[26, 5, 5, 6] Block = 1 Type = P 
+ -0.042287705828 # B[27, 6, 0, 6] Block = 1 Type = P 
+ -0.113311457350 # B[28, 6, 2, 6] Block = 1 Type = P 
+  0.029574563913 # B[29, 6, 4, 6] Block = 1 Type = P 
+ -0.027748295426 # B[30, 6, 6, 6] Block = 1 Type = P 
+ -0.001747658243 # B[1, 0, 0, 0] Block = 2 Type = P 
+ -0.026182047943 # B[2, 1, 0, 1] Block = 2 Type = P 
+  0.089481157533 # B[3, 1, 1, 2] Block = 2 Type = P 
+ -0.076525139004 # B[4, 2, 0, 2] Block = 2 Type = P 
+ -0.107925591359 # B[5, 2, 1, 3] Block = 2 Type = P 
+ -0.059117110271 # B[6, 2, 2, 2] Block = 2 Type = P 
+ -0.256324252168 # B[7, 2, 2, 4] Block = 2 Type = P 
+ -0.020755324452 # B[8, 3, 0, 3] Block = 2 Type = P 
+ -0.337284108142 # B[9, 3, 1, 4] Block = 2 Type = P 
+ -0.073956723908 # B[10, 3, 2, 3] Block = 2 Type = P 
+ -0.149119927857 # B[11, 3, 2, 5] Block = 2 Type = P 
+  0.047627781519 # B[12, 3, 3, 4] Block = 2 Type = P 
+  0.061394929126 # B[13, 3, 3, 6] Block = 2 Type = P 
+ -0.082660360252 # B[14, 4, 0, 4] Block = 2 Type = P 
+ -0.183225932285 # B[15, 4, 1, 5] Block = 2 Type = P 
+ -0.046981555049 # B[16, 4, 2, 4] Block = 2 Type = P 
+ -0.046846685850 # B[17, 4, 2, 6] Block = 2 Type = P 
+ -0.014388943769 # B[18, 4, 3, 5] Block = 2 Type = P 
+  0.012133725790 # B[19, 4, 4, 4] Block = 2 Type = P 
+  0.085321452425 # B[20, 4, 4, 6] Block = 2 Type = P 
+ -0.034945525448 # B[21, 5, 0, 5] Block = 2 Type = P 
+ -0.028326642109 # B[22, 5, 1, 6] Block = 2 Type = P 
+ -0.085701075837 # B[23, 5, 2, 5] Block = 2 Type = P 
+  0.108257997015 # B[24, 5, 3, 6] Block = 2 Type = P 
+  0.045837409910 # B[25, 5, 4, 5] Block = 2 Type = P 
+ -0.014180512722 # B[26, 5, 5, 6] Block = 2 Type = P 
+  0.010756044042 # B[27, 6, 0, 6] Block = 2 Type = P 
+  0.023429477590 # B[28, 6, 2, 6] Block = 2 Type = P 
+ -0.007794133717 # B[29, 6, 4, 6] Block = 2 Type = P 
+  0.002019828318 # B[30, 6, 6, 6] Block = 2 Type = P 
+ -0.001747658242 # B[1, 0, 0, 0] Block = 3 Type = P 
+  0.047070626642 # B[2, 1, 0, 1] Block = 3 Type = P 
+ -0.126595340298 # B[3, 1, 1, 2] Block = 3 Type = P 
+  0.022286899829 # B[4, 2, 0, 2] Block = 3 Type = P 
+ -0.483695340547 # B[5, 2, 1, 3] Block = 3 Type = P 
+ -0.059117110271 # B[6, 2, 2, 2] Block = 3 Type = P 
+ -0.067694089340 # B[7, 2, 2, 4] Block = 3 Type = P 
+  0.034974727122 # B[8, 3, 0, 3] Block = 3 Type = P 
+ -0.226290583408 # B[9, 3, 1, 4] Block = 3 Type = P 
+ -0.184699579267 # B[10, 3, 2, 3] Block = 3 Type = P 
+  0.063122270285 # B[11, 3, 2, 5] Block = 3 Type = P 
+  0.041271206739 # B[12, 3, 3, 4] Block = 3 Type = P 
+ -0.004229157928 # B[13, 3, 3, 6] Block = 3 Type = P 
+  0.050689134214 # B[14, 4, 0, 4] Block = 3 Type = P 
+ -0.138276744014 # B[15, 4, 1, 5] Block = 3 Type = P 
+  0.097985494164 # B[16, 4, 2, 4] Block = 3 Type = P 
+ -0.083537235645 # B[17, 4, 2, 6] Block = 3 Type = P 
+  0.098390585361 # B[18, 4, 3, 5] Block = 3 Type = P 
+  0.012133725790 # B[19, 4, 4, 4] Block = 3 Type = P 
+ -0.038067814334 # B[20, 4, 4, 6] Block = 3 Type = P 
+  0.029636266108 # B[21, 5, 0, 5] Block = 3 Type = P 
+ -0.157117938354 # B[22, 5, 1, 6] Block = 3 Type = P 
+ -0.027712542047 # B[23, 5, 2, 5] Block = 3 Type = P 
+  0.084730212710 # B[24, 5, 3, 6] Block = 3 Type = P 
+  0.023437407693 # B[25, 5, 4, 5] Block = 3 Type = P 
+  0.017747856995 # B[26, 5, 5, 6] Block = 3 Type = P 
+  0.050161344183 # B[27, 6, 0, 6] Block = 3 Type = P 
+ -0.098577816149 # B[28, 6, 2, 6] Block = 3 Type = P 
+ -0.046997533090 # B[29, 6, 4, 6] Block = 3 Type = P 
+  0.002019828318 # B[30, 6, 6, 6] Block = 3 Type = P 
+ -0.003152987881 # B[1, 0, 0, 0] Block = 4 Type = P 
+  0.014621850469 # B[2, 1, 0, 1] Block = 4 Type = P 
+  0.098860641022 # B[3, 1, 1, 2] Block = 4 Type = P 
+  0.069546644549 # B[4, 2, 0, 2] Block = 4 Type = P 
+  0.180804700658 # B[5, 2, 1, 3] Block = 4 Type = P 
+  0.034244852922 # B[6, 2, 2, 2] Block = 4 Type = P 
+  0.044579888557 # B[7, 2, 2, 4] Block = 4 Type = P 
+  0.001526239292 # B[8, 3, 0, 3] Block = 4 Type = P 
+  0.203861850923 # B[9, 3, 1, 4] Block = 4 Type = P 
+  0.021679218740 # B[10, 3, 2, 3] Block = 4 Type = P 
+  0.185899872703 # B[11, 3, 2, 5] Block = 4 Type = P 
+ -0.063472862380 # B[12, 3, 3, 4] Block = 4 Type = P 
+ -0.015662648111 # B[13, 3, 3, 6] Block = 4 Type = P 
+  0.076209869795 # B[14, 4, 0, 4] Block = 4 Type = P 
+ -0.050213789331 # B[15, 4, 1, 5] Block = 4 Type = P 
+  0.038175316256 # B[16, 4, 2, 4] Block = 4 Type = P 
+  0.041946424186 # B[17, 4, 2, 6] Block = 4 Type = P 
+ -0.023902281235 # B[18, 4, 3, 5] Block = 4 Type = P 
+  0.008537822812 # B[19, 4, 4, 4] Block = 4 Type = P 
+  0.039989757491 # B[20, 4, 4, 6] Block = 4 Type = P 
+  0.022210126714 # B[21, 5, 0, 5] Block = 4 Type = P 
+  0.010855258243 # B[22, 5, 1, 6] Block = 4 Type = P 
+  0.021570527219 # B[23, 5, 2, 5] Block = 4 Type = P 
+ -0.119983534986 # B[24, 5, 3, 6] Block = 4 Type = P 
+ -0.019726935513 # B[25, 5, 4, 5] Block = 4 Type = P 
+ -0.015720476443 # B[26, 5, 5, 6] Block = 4 Type = P 
+ -0.024522109113 # B[27, 6, 0, 6] Block = 4 Type = P 
+ -0.051478859538 # B[28, 6, 2, 6] Block = 4 Type = P 
+  0.017216285614 # B[29, 6, 4, 6] Block = 4 Type = P 
+ -0.003565797401 # B[30, 6, 6, 6] Block = 4 Type = P 
+ -0.001747658242 # B[1, 0, 0, 0] Block = 5 Type = P 
+ -0.026182047943 # B[2, 1, 0, 1] Block = 5 Type = P 
+ -0.126595340298 # B[3, 1, 1, 2] Block = 5 Type = P 
+ -0.076525139004 # B[4, 2, 0, 2] Block = 5 Type = P 
+ -0.157814129312 # B[5, 2, 1, 3] Block = 5 Type = P 
+ -0.059117110271 # B[6, 2, 2, 2] Block = 5 Type = P 
+ -0.067694089340 # B[7, 2, 2, 4] Block = 5 Type = P 
+ -0.020755324452 # B[8, 3, 0, 3] Block = 5 Type = P 
+ -0.216746420586 # B[9, 3, 1, 4] Block = 5 Type = P 
+ -0.073956723908 # B[10, 3, 2, 3] Block = 5 Type = P 
+ -0.263593571569 # B[11, 3, 2, 5] Block = 5 Type = P 
+  0.041271206739 # B[12, 3, 3, 4] Block = 5 Type = P 
+ -0.004229157928 # B[13, 3, 3, 6] Block = 5 Type = P 
+ -0.082660360252 # B[14, 4, 0, 4] Block = 5 Type = P 
+ -0.305032662779 # B[15, 4, 1, 5] Block = 5 Type = P 
+ -0.046981555049 # B[16, 4, 2, 4] Block = 5 Type = P 
+ -0.187955078269 # B[17, 4, 2, 6] Block = 5 Type = P 
+ -0.121808447372 # B[18, 4, 3, 5] Block = 5 Type = P 
+  0.012133725790 # B[19, 4, 4, 4] Block = 5 Type = P 
+ -0.038067814334 # B[20, 4, 4, 6] Block = 5 Type = P 
+ -0.034945525448 # B[21, 5, 0, 5] Block = 5 Type = P 
+ -0.226555787648 # B[22, 5, 1, 6] Block = 5 Type = P 
+ -0.085701075837 # B[23, 5, 2, 5] Block = 5 Type = P 
+ -0.121081797087 # B[24, 5, 3, 6] Block = 5 Type = P 
+  0.045837409910 # B[25, 5, 4, 5] Block = 5 Type = P 
+  0.017747856995 # B[26, 5, 5, 6] Block = 5 Type = P 
+  0.010756044042 # B[27, 6, 0, 6] Block = 5 Type = P 
+  0.023429477590 # B[28, 6, 2, 6] Block = 5 Type = P 
+ -0.007794133717 # B[29, 6, 4, 6] Block = 5 Type = P 
+  0.002019828318 # B[30, 6, 6, 6] Block = 5 Type = P 
+ -0.003152987881 # B[1, 0, 0, 0] Block = 6 Type = P 
+ -0.003431824919 # B[2, 1, 0, 1] Block = 6 Type = P 
+  0.098860641022 # B[3, 1, 1, 2] Block = 6 Type = P 
+ -0.049867192647 # B[4, 2, 0, 2] Block = 6 Type = P 
+  0.130247785083 # B[5, 2, 1, 3] Block = 6 Type = P 
+  0.034244852922 # B[6, 2, 2, 2] Block = 6 Type = P 
+  0.044579888557 # B[7, 2, 2, 4] Block = 6 Type = P 
+  0.051064338359 # B[8, 3, 0, 3] Block = 6 Type = P 
+ -0.034769100897 # B[9, 3, 1, 4] Block = 6 Type = P 
+ -0.055923569507 # B[10, 3, 2, 3] Block = 6 Type = P 
+  0.101065442824 # B[11, 3, 2, 5] Block = 6 Type = P 
+ -0.063472862380 # B[12, 3, 3, 4] Block = 6 Type = P 
+ -0.015662648111 # B[13, 3, 3, 6] Block = 6 Type = P 
+ -0.020942037301 # B[14, 4, 0, 4] Block = 6 Type = P 
+  0.057686438719 # B[15, 4, 1, 5] Block = 6 Type = P 
+  0.061281723131 # B[16, 4, 2, 4] Block = 6 Type = P 
+  0.041003214284 # B[17, 4, 2, 6] Block = 6 Type = P 
+  0.104968889582 # B[18, 4, 3, 5] Block = 6 Type = P 
+  0.008537822812 # B[19, 4, 4, 4] Block = 6 Type = P 
+  0.039989757491 # B[20, 4, 4, 6] Block = 6 Type = P 
+  0.058310887739 # B[21, 5, 0, 5] Block = 6 Type = P 
+  0.043642228702 # B[22, 5, 1, 6] Block = 6 Type = P 
+  0.119827018636 # B[23, 5, 2, 5] Block = 6 Type = P 
+ -0.017878741482 # B[24, 5, 3, 6] Block = 6 Type = P 
+  0.013615249763 # B[25, 5, 4, 5] Block = 6 Type = P 
+ -0.015720476443 # B[26, 5, 5, 6] Block = 6 Type = P 
+  0.028210503571 # B[27, 6, 0, 6] Block = 6 Type = P 
+  0.138982983531 # B[28, 6, 2, 6] Block = 6 Type = P 
+  0.020848948259 # B[29, 6, 4, 6] Block = 6 Type = P 
+ -0.003565797401 # B[30, 6, 6, 6] Block = 6 Type = P 
+ -0.003152987876 # B[1, 0, 0, 0] Block = 7 Type = P 
+  0.014621850469 # B[2, 1, 0, 1] Block = 7 Type = P 
+  0.136917412546 # B[3, 1, 1, 2] Block = 7 Type = P 
+  0.069546644549 # B[4, 2, 0, 2] Block = 7 Type = P 
+  0.134471034367 # B[5, 2, 1, 3] Block = 7 Type = P 
+  0.034244852922 # B[6, 2, 2, 2] Block = 7 Type = P 
+  0.073714102880 # B[7, 2, 2, 4] Block = 7 Type = P 
+  0.001526239292 # B[8, 3, 0, 3] Block = 7 Type = P 
+  0.029314077312 # B[9, 3, 1, 4] Block = 7 Type = P 
+  0.021679218740 # B[10, 3, 2, 3] Block = 7 Type = P 
+  0.005384023182 # B[11, 3, 2, 5] Block = 7 Type = P 
+  0.029912954139 # B[12, 3, 3, 4] Block = 7 Type = P 
+  0.036308629380 # B[13, 3, 3, 6] Block = 7 Type = P 
+  0.076209869795 # B[14, 4, 0, 4] Block = 7 Type = P 
+ -0.095659211777 # B[15, 4, 1, 5] Block = 7 Type = P 
+  0.038175316256 # B[16, 4, 2, 4] Block = 7 Type = P 
+ -0.054559433157 # B[17, 4, 2, 6] Block = 7 Type = P 
+ -0.079205893849 # B[18, 4, 3, 5] Block = 7 Type = P 
+  0.008537822812 # B[19, 4, 4, 4] Block = 7 Type = P 
+  0.072688459278 # B[20, 4, 4, 6] Block = 7 Type = P 
+  0.022210126714 # B[21, 5, 0, 5] Block = 7 Type = P 
+  0.032318678024 # B[22, 5, 1, 6] Block = 7 Type = P 
+  0.021570527219 # B[23, 5, 2, 5] Block = 7 Type = P 
+  0.038881258714 # B[24, 5, 3, 6] Block = 7 Type = P 
+ -0.019726935513 # B[25, 5, 4, 5] Block = 7 Type = P 
+  0.030961312127 # B[26, 5, 5, 6] Block = 7 Type = P 
+ -0.024522109113 # B[27, 6, 0, 6] Block = 7 Type = P 
+ -0.051478859538 # B[28, 6, 2, 6] Block = 7 Type = P 
+  0.017216285614 # B[29, 6, 4, 6] Block = 7 Type = P 
+ -0.003565797401 # B[30, 6, 6, 6] Block = 7 Type = P 
+  0.000279543258 # B[1, 0, 0, 0] Block = 8 Type = P 
+  0.031561006068 # B[2, 1, 0, 1] Block = 8 Type = P 
+  0.164297477481 # B[3, 1, 1, 2] Block = 8 Type = P 
+  0.020394103829 # B[4, 2, 0, 2] Block = 8 Type = P 
+ -0.136924810031 # B[5, 2, 1, 3] Block = 8 Type = P 
+  0.011488762740 # B[6, 2, 2, 2] Block = 8 Type = P 
+ -0.174577132596 # B[7, 2, 2, 4] Block = 8 Type = P 
+ -0.104272988787 # B[8, 3, 0, 3] Block = 8 Type = P 
+ -0.126737159959 # B[9, 3, 1, 4] Block = 8 Type = P 
+  0.006355291540 # B[10, 3, 2, 3] Block = 8 Type = P 
+ -0.116847920709 # B[11, 3, 2, 5] Block = 8 Type = P 
+  0.093716628094 # B[12, 3, 3, 4] Block = 8 Type = P 
+ -0.015327516258 # B[13, 3, 3, 6] Block = 8 Type = P 
+ -0.015071645969 # B[14, 4, 0, 4] Block = 8 Type = P 
+  0.054380965184 # B[15, 4, 1, 5] Block = 8 Type = P 
+  0.113826098444 # B[16, 4, 2, 4] Block = 8 Type = P 
+  0.012970945123 # B[17, 4, 2, 6] Block = 8 Type = P 
+ -0.047881183904 # B[18, 4, 3, 5] Block = 8 Type = P 
+ -0.010520024430 # B[19, 4, 4, 4] Block = 8 Type = P 
+ -0.077321883428 # B[20, 4, 4, 6] Block = 8 Type = P 
+ -0.087378280220 # B[21, 5, 0, 5] Block = 8 Type = P 
+ -0.221370705680 # B[22, 5, 1, 6] Block = 8 Type = P 
+  0.004554405520 # B[23, 5, 2, 5] Block = 8 Type = P 
+ -0.164836672985 # B[24, 5, 3, 6] Block = 8 Type = P 
+ -0.015080843808 # B[25, 5, 4, 5] Block = 8 Type = P 
+ -0.010907038616 # B[26, 5, 5, 6] Block = 8 Type = P 
+ -0.022228801431 # B[27, 6, 0, 6] Block = 8 Type = P 
+ -0.055154587470 # B[28, 6, 2, 6] Block = 8 Type = P 
+  0.007347917376 # B[29, 6, 4, 6] Block = 8 Type = P 
+ -0.009369956559 # B[30, 6, 6, 6] Block = 8 Type = P 
--- a/potentials/InP_JCPA2020.snapparam
+++ b/potentials/InP_JCPA2020.snapparam
@ -0,0 +1,14 @@
+# DATE: 2020-06-01 UNITS: metal CONTRIBUTOR: Mary Alice Cusentino mcusent@sandia.gov CITATION: M.A. Cusentino, M. A. Wood, and A.P. Thompson, "Explicit Multi-element Extension of the Spectral Neighbor Analysis Potential for Chemically Complex Systems", J. Phys. Chem. A, xxxxxx (2020)
+
+    # required
+    rcutfac 1.0
+    twojmax 6
+
+    #  optional
+    rfac0 0.99363
+    rmin0 0.0
+    bzeroflag 1
+    quadraticflag 0
+    wselfallflag 1
+    chemflag 1
+    bnormflag 1
--- a/src/KOKKOS/pair_snap_kokkos_impl.h
+++ b/src/KOKKOS/pair_snap_kokkos_impl.h
@ -243,6 +243,7 @@ void PairSNAPKokkos<DeviceType>::compute(int eflag_in, int vflag_in)
        typename Kokkos::TeamPolicy<DeviceType,TagPairSNAPComputeUiCPU> policy_ui_cpu(((chunk_size+team_size-1)/team_size)*max_neighs,team_size,vector_length);

        Kokkos::parallel_for("ComputeUiCPU",policy_ui_cpu,*this);
+
      } else { // GPU, vector parallelism, shared memory, separate ulist and ulisttot to avoid atomics

        vector_length = 32;
@ -514,7 +515,7 @@ void PairSNAPKokkos<DeviceType>::coeff(int narg, char **arg)
  Kokkos::deep_copy(d_map,h_map);

  snaKK = SNAKokkos<DeviceType>(rfac0,twojmax,
-                  rmin0,switchflag,bzeroflag);
+                  rmin0,switchflag,bzeroflag,chemflag,bnormflag,wselfallflag,nelements);
  snaKK.grow_rij(0,0);
  snaKK.init();
 }
@ -554,7 +555,6 @@ void PairSNAPKokkos<DeviceType>::operator() (TagPairSNAPComputeNeigh,const typen

      const int jtype = type(j);
      const F_FLOAT rsq = dx*dx + dy*dy + dz*dz;
-      const int elem_j = d_map[jtype];

      if ( rsq < rnd_cutsq(itype,jtype) )
       count++;
@ -584,6 +584,10 @@ void PairSNAPKokkos<DeviceType>::operator() (TagPairSNAPComputeNeigh,const typen
        my_sna.inside(ii,offset) = j;
        my_sna.wj(ii,offset) = d_wjelem[elem_j];
        my_sna.rcutij(ii,offset) = (radi + d_radelem[elem_j])*rcutfac;
+        if (chemflag)
+          my_sna.element(ii,offset) = elem_j;
+        else
+          my_sna.element(ii,offset) = 0;
      }
      offset++;
    }
@ -598,8 +602,10 @@ void PairSNAPKokkos<DeviceType>::operator() (TagPairSNAPPreUi,const typename Kok
  // Extract the atom number
  const int ii = team.team_rank() + team.team_size() * (team.league_rank() % ((chunk_size+team.team_size()-1)/team.team_size()));
  if (ii >= chunk_size) return;
+  int itype = type(ii);
+  int ielem = d_map[itype];

-  my_sna.pre_ui(team,ii);
+  my_sna.pre_ui(team,ii,ielem);
 }

 template<class DeviceType>
@ -649,7 +655,8 @@ void PairSNAPKokkos<DeviceType>::operator() (TagPairSNAPZeroYi,const typename Ko
  const int ii = team.league_rank() / ((my_sna.idxu_max+team.team_size()-1)/team.team_size());
  if (ii >= chunk_size) return;

-  my_sna.zero_yi(idx,ii);
+  for(int ielem = 0; ielem < nelements; ielem++)
+    my_sna.zero_yi(idx,ii,ielem);
 }

 template<class DeviceType>
--- a/src/KOKKOS/sna_kokkos.h
+++ b/src/KOKKOS/sna_kokkos.h
@ -58,7 +58,7 @@ inline
  SNAKokkos(const SNAKokkos<DeviceType>& sna, const typename Kokkos::TeamPolicy<DeviceType>::member_type& team);

 inline
-  SNAKokkos(double, int, double, int, int);
+  SNAKokkos(double, int, double, int, int, int, int, int, int);

  KOKKOS_INLINE_FUNCTION
  ~SNAKokkos();
@ -75,7 +75,7 @@ inline

  // functions for bispectrum coefficients
  KOKKOS_INLINE_FUNCTION
-  void pre_ui(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team,const int&); // ForceSNAP
+  void pre_ui(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team,const int&, int); // ForceSNAP
  KOKKOS_INLINE_FUNCTION
  void compute_ui(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, const int, const int); // ForceSNAP
  KOKKOS_INLINE_FUNCTION
@ -83,7 +83,7 @@ inline
  KOKKOS_INLINE_FUNCTION
  void compute_zi(const int&);    // ForceSNAP
  KOKKOS_INLINE_FUNCTION
-  void zero_yi(const int&,const int&); // ForceSNAP
+  void zero_yi(const int&, const int&, int); // ForceSNAP
  KOKKOS_INLINE_FUNCTION
  void compute_yi(int,
   const Kokkos::View<F_FLOAT**, DeviceType> &beta); // ForceSNAP
@ -131,6 +131,7 @@ inline
  t_sna_2i inside;
  t_sna_2d wj;
  t_sna_2d rcutij;
+  t_sna_2i element;
  t_sna_3d dedr;
  int natom, nmax;

@ -185,7 +186,7 @@ inline
  void init_rootpqarray();    // init()

  KOKKOS_INLINE_FUNCTION
-  void add_uarraytot(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, int, int, double, double, double); // compute_ui
+  void add_uarraytot(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, int, int, double, double, double, int); // compute_ui

  KOKKOS_INLINE_FUNCTION
  void compute_uarray_cpu(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, int, int,
@ -208,8 +209,16 @@ inline
  // 1 = cosine
  int switch_flag;

+  // Chem snap flags
+  int chem_flag;
+  int bnorm_flag;
+  int nelements;
+  int ndoubles;
+  int ntriples;
+
  // Self-weight
  double wself;
+  int wselfall_flag;

  int bzero_flag; // 1 if bzero subtracted from barray
  Kokkos::View<double*, DeviceType> bzero; // array of B values for isolated atoms
--- a/src/KOKKOS/sna_kokkos_impl.h
+++ b/src/KOKKOS/sna_kokkos_impl.h
@ -28,8 +28,8 @@ static const double MY_PI  = 3.14159265358979323846; // pi
 template<class DeviceType>
 inline
 SNAKokkos<DeviceType>::SNAKokkos(double rfac0_in,
-         int twojmax_in,
-         double rmin0_in, int switch_flag_in, int bzero_flag_in)
+         int twojmax_in, double rmin0_in, int switch_flag_in, int bzero_flag_in,
+         int chem_flag_in, int bnorm_flag_in, int wselfall_flag_in, int nelements_in)
 {
  wself = 1.0;

@ -38,6 +38,14 @@ SNAKokkos<DeviceType>::SNAKokkos(double rfac0_in,
  switch_flag = switch_flag_in;
  bzero_flag = bzero_flag_in;

+  chem_flag = chem_flag_in;
+  if (chem_flag)
+    nelements = nelements_in;
+  else
+    nelements = 1;
+  bnorm_flag = bnorm_flag_in;
+  wselfall_flag = wselfall_flag_in;
+
  twojmax = twojmax_in;

  ncoeff = compute_ncoeff();
@ -57,7 +65,10 @@ SNAKokkos<DeviceType>::SNAKokkos(double rfac0_in,

    double www = wself*wself*wself;
    for(int j = 0; j <= twojmax; j++)
-      h_bzero[j] = www*(j+1);
+      if (bnorm_flag)
+        h_bzero[j] = www;
+      else
+        h_bzero[j] = www*(j+1);
    Kokkos::deep_copy(bzero,h_bzero);
  }
 }
@ -223,13 +234,14 @@ void SNAKokkos<DeviceType>::grow_rij(int newnatom, int newnmax)
  inside = t_sna_2i("sna:inside",natom,nmax);
  wj = t_sna_2d("sna:wj",natom,nmax);
  rcutij = t_sna_2d("sna:rcutij",natom,nmax);
+  element = t_sna_2i("sna:rcutij",natom,nmax);
  dedr = t_sna_3d("sna:dedr",natom,nmax,3);

-  blist = t_sna_2d_ll("sna:blist",idxb_max,natom);
+  blist = t_sna_2d_ll("sna:blist",idxb_max*ntriples,natom);
  //ulisttot = t_sna_2c("sna:ulisttot",natom,idxu_max);
-  ulisttot = t_sna_2c_ll("sna:ulisttot",idxu_max,natom);
+  ulisttot = t_sna_2c_ll("sna:ulisttot",idxu_max*nelements,natom);

-  zlist = t_sna_2c_ll("sna:zlist",idxz_max,natom);
+  zlist = t_sna_2c_ll("sna:zlist",idxz_max*ndoubles,natom);

  //ulist = t_sna_3c("sna:ulist",natom,nmax,idxu_max);
 #ifdef KOKKOS_ENABLE_CUDA
@ -246,7 +258,7 @@ void SNAKokkos<DeviceType>::grow_rij(int newnatom, int newnmax)
 #endif

  //ylist = t_sna_2c_lr("sna:ylist",natom,idxu_max);
-  ylist = t_sna_2c_ll("sna:ylist",idxu_max,natom);
+  ylist = t_sna_2c_ll("sna:ylist",idxu_max*nelements,natom);
 }

 /* ----------------------------------------------------------------------
@ -255,8 +267,9 @@ void SNAKokkos<DeviceType>::grow_rij(int newnatom, int newnmax)

 template<class DeviceType>
 KOKKOS_INLINE_FUNCTION
-void SNAKokkos<DeviceType>::pre_ui(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, const int& iatom)
+void SNAKokkos<DeviceType>::pre_ui(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, const int& iatom, int ielem)
 {
+  for(int jelem = 0; jelem < nelements; jelem++)
  for (int j = 0; j <= twojmax; j++) {
    const int jju = idxu_block(j);

@ -270,9 +283,9 @@ void SNAKokkos<DeviceType>::pre_ui(const typename Kokkos::TeamPolicy<DeviceType>
      // if m is on the "diagonal", initialize it with the self energy.
      // Otherwise zero it out
      SNAcomplex init = {0., 0.};
-      if (m % (j+2) == 0) { init = {wself, 0.0}; }
+      if (m % (j+2) == 0 && (!chem_flag || ielem == jelem || wselfall_flag)) { init = {wself, 0.0}; } //need to map iatom to element

-      ulisttot(jjup, iatom) = init;
+      ulisttot(jelem*idxu_max+jjup, iatom) = init;
    });
  }

@ -309,6 +322,7 @@ void SNAKokkos<DeviceType>::compute_ui(const typename Kokkos::TeamPolicy<DeviceT

  const double wj_local = wj(iatom, jnbor);
  const double rcut = rcutij(iatom, jnbor);
+  const int jpos = element(iatom, jnbor)*idxu_max;

  const double rsq = x * x + y * y + z * z;
  const double r = sqrt(rsq);
@ -335,7 +349,7 @@ void SNAKokkos<DeviceType>::compute_ui(const typename Kokkos::TeamPolicy<DeviceT
  Kokkos::single(Kokkos::PerThread(team), [=]() {
    //ulist(0,iatom,jnbor) = { 1.0, 0.0 };
    buf1[0] = {1.,0.};
-    Kokkos::atomic_add(&(ulisttot(0,iatom).re), sfac);
+    Kokkos::atomic_add(&(ulisttot(jpos,iatom).re), sfac);
  });

  for (int j = 1; j <= twojmax; j++) {
@ -364,7 +378,7 @@ void SNAKokkos<DeviceType>::compute_ui(const typename Kokkos::TeamPolicy<DeviceT
      int mb = m / n_ma;

      // index into global memory array
-      const int jju_index = jju+m;
+      const int jju_index = jju+m+jpos;
      //const int jjup_index = jjup+mb*j+ma;

      // index into shared memory buffer for this level
@ -401,7 +415,7 @@ void SNAKokkos<DeviceType>::compute_ui(const typename Kokkos::TeamPolicy<DeviceT
      if (m < total_iters - 1 || j % 2 == 1) {
        const int sign_factor = (((ma+mb)%2==0)?1:-1);
        const int jju_shared_flip = (j+1-mb)*(j+1)-(ma+1);
-        const int jjup_flip = jju + jju_shared_flip; // jju+(j+1-mb)*(j+1)-(ma+1);
+        const int jjup_flip = jju + jju_shared_flip + jpos; // jju+(j+1-mb)*(j+1)-(ma+1);


        if (sign_factor == 1) {
@ -454,7 +468,7 @@ void SNAKokkos<DeviceType>::compute_ui_cpu(const typename Kokkos::TeamPolicy<Dev
  z0 = r / tan(theta0);

  compute_uarray_cpu(team, iatom, jnbor, x, y, z, z0, r);
-  add_uarraytot(team, iatom, jnbor, r, wj(iatom,jnbor), rcutij(iatom,jnbor));
+  add_uarraytot(team, iatom, jnbor, r, wj(iatom,jnbor), rcutij(iatom,jnbor), element(iatom, jnbor));

 }

@ -469,47 +483,62 @@ void SNAKokkos<DeviceType>::compute_zi(const int& iter)
  const int iatom = iter / idxz_max;
  const int jjz = iter % idxz_max;

-  const int j1 = idxz(jjz,0);
-  const int j2 = idxz(jjz,1);
-  const int j = idxz(jjz,2);
-  const int ma1min = idxz(jjz,3);
-  const int ma2max = idxz(jjz,4);
-  const int mb1min = idxz(jjz,5);
-  const int mb2max = idxz(jjz,6);
-  const int na = idxz(jjz,7);
-  const int nb = idxz(jjz,8);
+  const int j1 = idxz(jjz, 0);
+  const int j2 = idxz(jjz, 1);
+  const int j = idxz(jjz, 2);
+  const int ma1min = idxz(jjz, 3);
+  const int ma2max = idxz(jjz, 4);
+  const int mb1min = idxz(jjz, 5);
+  const int mb2max = idxz(jjz, 6);
+  const int na = idxz(jjz, 7);
+  const int nb = idxz(jjz, 8);

-  const double* cgblock = cglist.data() + idxcg_block(j1,j2,j);
+  const double *cgblock = cglist.data() + idxcg_block(j1, j2, j);

-  zlist(jjz,iatom).re = 0.0;
-  zlist(jjz,iatom).im = 0.0;
+  int idouble = 0;
+  for(int elem1 = 0; elem1 < nelements; elem1++)
+  for(int elem2 = 0; elem2 < nelements; elem2++) {

-  int jju1 = idxu_block[j1] + (j1+1)*mb1min;
-  int jju2 = idxu_block[j2] + (j2+1)*mb2max;
-  int icgb = mb1min*(j2+1) + mb2max;
-  for(int ib = 0; ib < nb; ib++) {
+  const int jjza = idouble*idxz_max + jjz;
+
+  zlist(jjza, iatom).re = 0.0;
+  zlist(jjza, iatom).im = 0.0;
+
+  int jju1 = elem1 * idxu_max + idxu_block[j1] + (j1 + 1) * mb1min;
+  int jju2 = elem2 * idxu_max + idxu_block[j2] + (j2 + 1) * mb2max;
+  int icgb = mb1min * (j2 + 1) + mb2max;
+  for (int ib = 0; ib < nb; ib++) {

    double suma1_r = 0.0;
    double suma1_i = 0.0;

    int ma1 = ma1min;
    int ma2 = ma2max;
-    int icga = ma1min*(j2+1) + ma2max;
-    for(int ia = 0; ia < na; ia++) {
-      suma1_r += cgblock[icga] * (ulisttot(jju1+ma1,iatom).re * ulisttot(jju2+ma2,iatom).re - ulisttot(jju1+ma1,iatom).im * ulisttot(jju2+ma2,iatom).im);
-      suma1_i += cgblock[icga] * (ulisttot(jju1+ma1,iatom).re * ulisttot(jju2+ma2,iatom).im + ulisttot(jju1+ma1,iatom).im * ulisttot(jju2+ma2,iatom).re);
+    int icga = ma1min * (j2 + 1) + ma2max;
+    for (int ia = 0; ia < na; ia++) {
+      suma1_r += cgblock[icga] * (ulisttot(jju1 + ma1, iatom).re * ulisttot(jju2 + ma2, iatom).re -
+                                  ulisttot(jju1 + ma1, iatom).im * ulisttot(jju2 + ma2, iatom).im);
+      suma1_i += cgblock[icga] * (ulisttot(jju1 + ma1, iatom).re * ulisttot(jju2 + ma2, iatom).im +
+                                  ulisttot(jju1 + ma1, iatom).im * ulisttot(jju2 + ma2, iatom).re);
      ma1++;
      ma2--;
      icga += j2;
    } // end loop over ia

-    zlist(jjz,iatom).re += cgblock[icgb] * suma1_r;
-    zlist(jjz,iatom).im += cgblock[icgb] * suma1_i;
+    zlist(jjza, iatom).re += cgblock[icgb] * suma1_r;
+    zlist(jjza, iatom).im += cgblock[icgb] * suma1_i;

-    jju1 += j1+1;
-    jju2 -= j2+1;
+    jju1 += j1 + 1;
+    jju2 -= j2 + 1;
    icgb += j2;
+
  } // end loop over ib
+  if (bnorm_flag) {
+    zlist(jjza, iatom).re /= (j+1);
+    zlist(jjza, iatom).im /= (j+1);
+  }
+  idouble++;
+  }
 }

 /* ----------------------------------------------------------------------
@ -518,9 +547,9 @@ void SNAKokkos<DeviceType>::compute_zi(const int& iter)

 template<class DeviceType>
 KOKKOS_INLINE_FUNCTION
-void SNAKokkos<DeviceType>::zero_yi(const int& idx, const int& iatom)
+void SNAKokkos<DeviceType>::zero_yi(const int& idx, const int& iatom, int ielem)
 {
-  ylist(idx,iatom) = {0.0, 0.0};
+  ylist(ielem*idxu_max+idx,iatom) = {0.0, 0.0};
 }

 /* ----------------------------------------------------------------------
@ -536,75 +565,99 @@ void SNAKokkos<DeviceType>::compute_yi(int iter,
  const int iatom = iter / idxz_max;
  const int jjz = iter % idxz_max;

-  const int j1 = idxz(jjz,0);
-  const int j2 = idxz(jjz,1);
-  const int j = idxz(jjz,2);
-  const int ma1min = idxz(jjz,3);
-  const int ma2max = idxz(jjz,4);
-  const int mb1min = idxz(jjz,5);
-  const int mb2max = idxz(jjz,6);
-  const int na = idxz(jjz,7);
-  const int nb = idxz(jjz,8);
-  const int jju = idxz(jjz,9);
+  const int j1 = idxz(jjz, 0);
+  const int j2 = idxz(jjz, 1);
+  const int j = idxz(jjz, 2);
+  const int ma1min = idxz(jjz, 3);
+  const int ma2max = idxz(jjz, 4);
+  const int mb1min = idxz(jjz, 5);
+  const int mb2max = idxz(jjz, 6);
+  const int na = idxz(jjz, 7);
+  const int nb = idxz(jjz, 8);
+  const int jju = idxz(jjz, 9);

-  const double* cgblock = cglist.data() + idxcg_block(j1,j2,j);
+  const double *cgblock = cglist.data() + idxcg_block(j1, j2, j);
  //int mb = (2 * (mb1min+mb2max) - j1 - j2 + j) / 2;
  //int ma = (2 * (ma1min+ma2max) - j1 - j2 + j) / 2;

-  double ztmp_r = 0.0;
-  double ztmp_i = 0.0;
+  int itriple;
+  for(int elem1 = 0; elem1 < nelements; elem1++)
+    for (int elem2 = 0; elem2 < nelements; elem2++) {

-  int jju1 = idxu_block[j1] + (j1+1)*mb1min;
-  int jju2 = idxu_block[j2] + (j2+1)*mb2max;
-  int icgb = mb1min*(j2+1) + mb2max;
-  for(int ib = 0; ib < nb; ib++) {
+    double ztmp_r = 0.0;
+    double ztmp_i = 0.0;

-    double suma1_r = 0.0;
-    double suma1_i = 0.0;
+    int jju1 = elem1 * idxu_max + idxu_block[j1] + (j1 + 1) * mb1min;
+    int jju2 = elem2 * idxu_max + idxu_block[j2] + (j2 + 1) * mb2max;
+    int icgb = mb1min * (j2 + 1) + mb2max;
+    for (int ib = 0; ib < nb; ib++) {

-    int ma1 = ma1min;
-    int ma2 = ma2max;
-    int icga = ma1min*(j2+1) + ma2max;
+      double suma1_r = 0.0;
+      double suma1_i = 0.0;

-    for(int ia = 0; ia < na; ia++) {
-      suma1_r += cgblock[icga] * (ulisttot(jju1+ma1,iatom).re * ulisttot(jju2+ma2,iatom).re - ulisttot(jju1+ma1,iatom).im * ulisttot(jju2+ma2,iatom).im);
-      suma1_i += cgblock[icga] * (ulisttot(jju1+ma1,iatom).re * ulisttot(jju2+ma2,iatom).im + ulisttot(jju1+ma1,iatom).im * ulisttot(jju2+ma2,iatom).re);
-      ma1++;
-      ma2--;
-      icga += j2;
-    } // end loop over ia
+      int ma1 = ma1min;
+      int ma2 = ma2max;
+      int icga = ma1min * (j2 + 1) + ma2max;

-    ztmp_r += cgblock[icgb] * suma1_r;
-    ztmp_i += cgblock[icgb] * suma1_i;
-    jju1 += j1+1;
-    jju2 -= j2+1;
-    icgb += j2;
-  } // end loop over ib
+      for (int ia = 0; ia < na; ia++) {
+        suma1_r += cgblock[icga] * (ulisttot(jju1 + ma1, iatom).re *
+                                    ulisttot(jju2 + ma2, iatom).re -
+                                    ulisttot(jju1 + ma1, iatom).im *
+                                    ulisttot(jju2 + ma2, iatom).im);
+        suma1_i += cgblock[icga] * (ulisttot(jju1 + ma1, iatom).re *
+                                    ulisttot(jju2 + ma2, iatom).im +
+                                    ulisttot(jju1 + ma1, iatom).im *
+                                    ulisttot(jju2 + ma2, iatom).re);
+        ma1++;
+        ma2--;
+        icga += j2;
+      } // end loop over ia

-  // apply to z(j1,j2,j,ma,mb) to unique element of y(j)
-  // find right y_list[jju] and beta(iatom,jjb) entries
-  // multiply and divide by j+1 factors
-  // account for multiplicity of 1, 2, or 3
+      ztmp_r += cgblock[icgb] * suma1_r;
+      ztmp_i += cgblock[icgb] * suma1_i;
+      jju1 += j1 + 1;
+      jju2 -= j2 + 1;
+      icgb += j2;
+    } // end loop over ib

-  // pick out right beta value
+    if (bnorm_flag) {
+      ztmp_i /= j + 1;
+      ztmp_r /= j + 1;
+    }

-  if (j >= j1) {
-    const int jjb = idxb_block(j1,j2,j);
-    if (j1 == j) {
-      if (j2 == j) betaj = 3*beta(jjb,iatom);
-      else betaj = 2*beta(jjb,iatom);
-    } else betaj = beta(jjb,iatom);
-  } else if (j >= j2) {
-    const int jjb = idxb_block(j,j2,j1);
-    if (j2 == j) betaj = 2*beta(jjb,iatom)*(j1+1)/(j+1.0);
-    else betaj = beta(jjb,iatom)*(j1+1)/(j+1.0);
-  } else {
-    const int jjb = idxb_block(j2,j,j1);
-    betaj = beta(jjb,iatom)*(j1+1)/(j+1.0);
+    // apply to z(j1,j2,j,ma,mb) to unique element of y(j)
+    // find right y_list[jju] and beta(iatom,jjb) entries
+    // multiply and divide by j+1 factors
+    // account for multiplicity of 1, 2, or 3
+
+    // pick out right beta value
+    for (int elem3 = 0; elem3 < nelements; elem3++) {
+      const int jjuy = elem3 * idxu_max + jju;
+      if (j >= j1) {
+        const int jjb = idxb_block(j1, j2, j);
+        itriple = ((elem1 * nelements + elem2) * nelements + elem3) * idxb_max + jjb;
+        if (j1 == j) {
+          if (j2 == j) betaj = 3 * beta(itriple, iatom);
+          else betaj = 2 * beta(itriple, iatom);
+        } else betaj = beta(itriple, iatom);
+      } else if (j >= j2) {
+        const int jjb = idxb_block(j, j2, j1);
+        itriple = ((elem3 * nelements + elem2) * nelements + elem1) * idxb_max + jjb;
+        if (j2 == j) betaj = 2 * beta(itriple, iatom);
+        else betaj = beta(itriple, iatom);
+      } else {
+        const int jjb = idxb_block(j2, j, j1);
+        itriple = ((elem2 * nelements + elem3) * nelements + elem1) * idxb_max + jjb;
+        betaj = beta(itriple, iatom);
+      }
+
+      if (!bnorm_flag && j1 > j)
+        betaj *= (j1 + 1) / (j + 1.0);
+
+      Kokkos::atomic_add(&(ylist(jjuy, iatom).re), betaj * ztmp_r);
+      Kokkos::atomic_add(&(ylist(jjuy, iatom).im), betaj * ztmp_i);
+    }
  }
-
-  Kokkos::atomic_add(&(ylist(jju,iatom).re), betaj*ztmp_r);
-  Kokkos::atomic_add(&(ylist(jju,iatom).im), betaj*ztmp_i);
 }

 /* ----------------------------------------------------------------------
@ -620,6 +673,7 @@ void SNAKokkos<DeviceType>::compute_fused_deidrj(const typename Kokkos::TeamPoli
  const int max_m_tile = (twojmax+1)*(twojmax/2+1);
  const int team_rank = team.team_rank();
  const int scratch_shift = team_rank * max_m_tile;
+  const int jpos = element(iatom, jnbor)*idxu_max;

  // double buffer for ulist
  SNAcomplex* ulist_buf1 = (SNAcomplex*)team.team_shmem( ).get_shmem(team.team_size()*max_m_tile*sizeof(SNAcomplex), 0) + scratch_shift;
@ -669,7 +723,7 @@ void SNAKokkos<DeviceType>::compute_fused_deidrj(const typename Kokkos::TeamPoli

  // Accumulate the full contribution to dedr on the fly
  const double du_prod = dsfac * u; // chain rule
-  const SNAcomplex y_local = ylist(0, iatom);
+  const SNAcomplex y_local = ylist(jpos, iatom);

  // Symmetry factor of 0.5 b/c 0 element is on diagonal for even j==0
  double dedr_full_sum = 0.5 * du_prod * y_local.re;
@ -704,7 +758,7 @@ void SNAKokkos<DeviceType>::compute_fused_deidrj(const typename Kokkos::TeamPoli
      int ma = m % n_ma;
      int mb = m / n_ma;

-      const int jju_index = jju+m;
+      const int jju_index = jpos+jju+m;

      // Load y_local, apply the symmetry scaling factor
      // The "secret" of the shared memory optimization is it eliminates
@ -810,18 +864,21 @@ KOKKOS_INLINE_FUNCTION
 void SNAKokkos<DeviceType>::compute_deidrj_cpu(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, int iatom, int jnbor)
 {
  t_scalar3<double> final_sum;
+  const int jelem = element(iatom, jnbor);

  //for(int j = 0; j <= twojmax; j++) {
  Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team,twojmax+1),
      [&] (const int& j, t_scalar3<double>& sum_tmp) {
    int jju = idxu_block[j];
+    int jjuy = idxu_block[j] + jelem*idxu_max;

    for(int mb = 0; 2*mb < j; mb++)
      for(int ma = 0; ma <= j; ma++) {
-        sum_tmp.x += dulist(jju,iatom,jnbor,0).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jju,iatom).im;
-        sum_tmp.y += dulist(jju,iatom,jnbor,1).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jju,iatom).im;
-        sum_tmp.z += dulist(jju,iatom,jnbor,2).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jju,iatom).im;
+        sum_tmp.x += dulist(jju,iatom,jnbor,0).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jjuy,iatom).im;
+        sum_tmp.y += dulist(jju,iatom,jnbor,1).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jjuy,iatom).im;
+        sum_tmp.z += dulist(jju,iatom,jnbor,2).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jjuy,iatom).im;
        jju++;
+        jjuy++;
      } //end loop over ma mb

    // For j even, handle middle column
@ -830,16 +887,17 @@ void SNAKokkos<DeviceType>::compute_deidrj_cpu(const typename Kokkos::TeamPolicy

      int mb = j/2;
      for(int ma = 0; ma < mb; ma++) {
-        sum_tmp.x += dulist(jju,iatom,jnbor,0).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jju,iatom).im;
-        sum_tmp.y += dulist(jju,iatom,jnbor,1).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jju,iatom).im;
-        sum_tmp.z += dulist(jju,iatom,jnbor,2).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jju,iatom).im;
+        sum_tmp.x += dulist(jju,iatom,jnbor,0).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jjuy,iatom).im;
+        sum_tmp.y += dulist(jju,iatom,jnbor,1).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jjuy,iatom).im;
+        sum_tmp.z += dulist(jju,iatom,jnbor,2).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jjuy,iatom).im;
        jju++;
+        jjuy++;
      }

      //int ma = mb;
-      sum_tmp.x += (dulist(jju,iatom,jnbor,0).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jju,iatom).im)*0.5;
-      sum_tmp.y += (dulist(jju,iatom,jnbor,1).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jju,iatom).im)*0.5;
-      sum_tmp.z += (dulist(jju,iatom,jnbor,2).re * ylist(jju,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jju,iatom).im)*0.5;
+      sum_tmp.x += (dulist(jju,iatom,jnbor,0).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,0).im * ylist(jjuy,iatom).im)*0.5;
+      sum_tmp.y += (dulist(jju,iatom,jnbor,1).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,1).im * ylist(jjuy,iatom).im)*0.5;
+      sum_tmp.z += (dulist(jju,iatom,jnbor,2).re * ylist(jjuy,iatom).re + dulist(jju,iatom,jnbor,2).im * ylist(jjuy,iatom).im)*0.5;
    } // end if jeven

  },final_sum); // end loop over j
@ -869,69 +927,93 @@ void SNAKokkos<DeviceType>::compute_bi(const typename Kokkos::TeamPolicy<DeviceT
  //            b(j1,j2,j) +=
  //              2*Conj(u(j,ma,mb))*z(j1,j2,j,ma,mb)

-  Kokkos::parallel_for(Kokkos::TeamThreadRange(team,idxb_max),
-      [&] (const int& jjb) {
-  //for(int jjb = 0; jjb < idxb_max; jjb++) {
-    const int j1 = idxb(jjb,0);
-    const int j2 = idxb(jjb,1);
-    const int j = idxb(jjb,2);
+  int itriple = 0;
+  int idouble = 0;
+  int jalloy = 0;
+  for (int elem1 = 0; elem1 < nelements; elem1++)
+    for (int elem2 = 0; elem2 < nelements; elem2++) {
+      jalloy = idouble*idxz_max;
+      for (int elem3 = 0; elem3 < nelements; elem3++) {
+        Kokkos::parallel_for(Kokkos::TeamThreadRange(team, idxb_max),
+        [&](const int &jjb) {
+        //for(int jjb = 0; jjb < idxb_max; jjb++) {
+          const int jjballoy = itriple*idxb_max+jjb;
+          const int j1 = idxb(jjb, 0);
+          const int j2 = idxb(jjb, 1);
+          const int j = idxb(jjb, 2);

-    int jjz = idxz_block(j1,j2,j);
-    int jju = idxu_block[j];
-    double sumzu = 0.0;
-    double sumzu_temp = 0.0;
-    const int bound = (j+2)/2;
-    Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team,(j+1)*bound),
-        [&] (const int mbma, double& sum) {
-        //for(int mb = 0; 2*mb < j; mb++)
-          //for(int ma = 0; ma <= j; ma++) {
-        const int ma = mbma%(j+1);
-        const int mb = mbma/(j+1);
-        const int jju_index = jju+mb*(j+1)+ma;
-        const int jjz_index = jjz+mb*(j+1)+ma;
-        if (2*mb == j) return;
-        sum +=
-          ulisttot(jju_index,iatom).re * zlist(jjz_index,iatom).re +
-          ulisttot(jju_index,iatom).im * zlist(jjz_index,iatom).im;
-      },sumzu_temp); // end loop over ma, mb
-      sumzu += sumzu_temp;
+          int jjz = idxz_block(j1, j2, j);
+          int jju = idxu_block[j];
+          double sumzu = 0.0;
+          double sumzu_temp = 0.0;
+          const int bound = (j + 2) / 2;
+          Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team, (j + 1) * bound),
+          [&](const int mbma, double &sum) {
+            //for(int mb = 0; 2*mb < j; mb++)
+            //for(int ma = 0; ma <= j; ma++) {
+            const int ma = mbma % (j + 1);
+            const int mb = mbma / (j + 1);
+            const int jju_index = elem3 * idxu_max + jju + mb * (j + 1) + ma;
+            const int jjz_index = jalloy + jjz + mb * (j + 1) + ma;
+            if (2 * mb == j) return;
+            sum +=
+                ulisttot(jju_index, iatom).re *
+                zlist(jjz_index, iatom).re +
+                ulisttot(jju_index, iatom).im *
+                zlist(jjz_index, iatom).im;
+          }, sumzu_temp); // end loop over ma, mb
+          sumzu += sumzu_temp;

-    // For j even, special treatment for middle column
+          // For j even, special treatment for middle column

-    if (j%2 == 0) {
-      const int mb = j/2;
-      Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team, mb),
-          [&] (const int ma, double& sum) {
-      //for(int ma = 0; ma < mb; ma++) {
-        const int jju_index = jju+(mb-1)*(j+1)+(j+1)+ma;
-        const int jjz_index = jjz+(mb-1)*(j+1)+(j+1)+ma;
-        sum +=
-          ulisttot(jju_index,iatom).re * zlist(jjz_index,iatom).re +
-          ulisttot(jju_index,iatom).im * zlist(jjz_index,iatom).im;
-      },sumzu_temp); // end loop over ma
-      sumzu += sumzu_temp;
+          if (j % 2 == 0) {
+            const int mb = j / 2;
+            Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team, mb),
+            [&](const int ma, double &sum) {
+              //for(int ma = 0; ma < mb; ma++) {
+              const int jju_index =
+                  elem3 * idxu_max + jju + (mb - 1) * (j + 1) + (j + 1) + ma;
+              const int jjz_index =
+                  jalloy + jjz + (mb - 1) * (j + 1) + (j + 1) + ma;
+              sum +=
+                  ulisttot(jju_index, iatom).re *
+                  zlist(jjz_index, iatom).re +
+                  ulisttot(jju_index, iatom).im *
+                  zlist(jjz_index, iatom).im;
+            }, sumzu_temp); // end loop over ma
+            sumzu += sumzu_temp;

-      const int ma = mb;
-      const int jju_index = jju+(mb-1)*(j+1)+(j+1)+ma;
-      const int jjz_index = jjz+(mb-1)*(j+1)+(j+1)+ma;
-      sumzu += 0.5*
-        (ulisttot(jju_index,iatom).re * zlist(jjz_index,iatom).re +
-         ulisttot(jju_index,iatom).im * zlist(jjz_index,iatom).im);
-    } // end if jeven
+            const int ma = mb;
+            const int jju_index = elem3 * idxu_max + jju + (mb - 1) * (j + 1) + (j + 1) + ma;
+            const int jjz_index = jalloy + jjz + (mb - 1) * (j + 1) + (j + 1) + ma;
+            sumzu += 0.5 *
+                     (ulisttot(jju_index, iatom).re *
+                      zlist(jjz_index, iatom).re +
+                      ulisttot(jju_index, iatom).im *
+                      zlist(jjz_index, iatom).im);
+          } // end if jeven

-    Kokkos::single(Kokkos::PerThread(team), [&] () {
-      sumzu *= 2.0;
+          Kokkos::single(Kokkos::PerThread(team), [&]() {
+            sumzu *= 2.0;

-      // apply bzero shift
+            // apply bzero shift

-      if (bzero_flag)
-        sumzu -= bzero[j];
+            if (bzero_flag){
+              if (!wselfall_flag) {
+                if (elem1 == elem2 && elem1 == elem3)
+                  sumzu -= bzero[j];
+              } else sumzu -= bzero[j];
+            }

-      blist(jjb,iatom) = sumzu;
-    });
-  });
-      //} // end loop over j
-    //} // end loop over j1, j2
+            blist(jjballoy, iatom) = sumzu;
+          });
+        });
+        itriple++;
+      }
+      idouble++;
+    }
+    //} // end loop over j
+  //} // end loop over j1, j2
 }

 /* ----------------------------------------------------------------------
@ -967,14 +1049,13 @@ void SNAKokkos<DeviceType>::compute_duidrj_cpu(const typename Kokkos::TeamPolicy
 template<class DeviceType>
 KOKKOS_INLINE_FUNCTION
 void SNAKokkos<DeviceType>::add_uarraytot(const typename Kokkos::TeamPolicy<DeviceType>::member_type& team, int iatom, int jnbor,
-                                          double r, double wj, double rcut)
+                                          double r, double wj, double rcut, int jelem)
 {
  const double sfac = compute_sfac(r, rcut) * wj;
-
-  Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,ulisttot.extent(0)),
+  Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,idxu_max),
      [&] (const int& i) {
-    Kokkos::atomic_add(&(ulisttot(i,iatom).re), sfac * ulist(i,iatom,jnbor).re);
-    Kokkos::atomic_add(&(ulisttot(i,iatom).im), sfac * ulist(i,iatom,jnbor).im);
+    Kokkos::atomic_add(&(ulisttot(jelem*idxu_max+i,iatom).re), sfac * ulist(i,iatom,jnbor).re);
+    Kokkos::atomic_add(&(ulisttot(jelem*idxu_max+i,iatom).im), sfac * ulist(i,iatom,jnbor).im);
  });
 }

@ -1081,7 +1162,7 @@ void SNAKokkos<DeviceType>::compute_duarray_cpu(const typename Kokkos::TeamPolic
                          double z0, double r, double dz0dr,
                          double wj, double rcut)
 {
-double r0inv;
+  double r0inv;
  double a_r, a_i, b_r, b_i;
  double da_r[3], da_i[3], db_r[3], db_i[3];
  double dz0[3], dr0inv[3], dr0invdr;
@ -1544,6 +1625,10 @@ int SNAKokkos<DeviceType>::compute_ncoeff()
           j <= MIN(twojmax, j1 + j2); j += 2)
        if (j >= j1) ncount++;

+  ndoubles = nelements*nelements;
+  ntriples = nelements*nelements*nelements;
+  if (chem_flag) ncount *= ntriples;
+
  return ncount;
 }

@ -1640,13 +1725,13 @@ double SNAKokkos<DeviceType>::memory_usage()
 #ifdef KOKKOS_ENABLE_CUDA
  }
 #endif
-  bytes += natom * idxu_max * sizeof(double) * 2;        // ulisttot
+  bytes += natom * idxu_max * nelements * sizeof(double) * 2;        // ulisttot
  if (!std::is_same<typename DeviceType::array_layout,Kokkos::LayoutRight>::value)
-    bytes += natom * idxu_max * sizeof(double) * 2;        // ulisttot_lr
+    bytes += natom * idxu_max * nelements * sizeof(double) * 2;        // ulisttot_lr

-  bytes += natom * idxz_max * sizeof(double) * 2;        // zlist
-  bytes += natom * idxb_max * sizeof(double);            // blist
-  bytes += natom * idxu_max * sizeof(double) * 2;        // ylist
+  bytes += natom * idxz_max * ndoubles * sizeof(double) * 2;        // zlist
+  bytes += natom * idxb_max * ntriples * sizeof(double);            // blist
+  bytes += natom * idxu_max * nelements * sizeof(double) * 2;        // ylist

  bytes += jdim * jdim * jdim * sizeof(int);             // idxcg_block
  bytes += jdim * sizeof(int);                           // idxu_block
--- a/src/SNAP/compute_sna_atom.cpp
+++ b/src/SNAP/compute_sna_atom.cpp
@ -34,7 +34,7 @@ ComputeSNAAtom::ComputeSNAAtom(LAMMPS *lmp, int narg, char **arg) :
  radelem(NULL), wjelem(NULL)
 {
  double rmin0, rfac0;
-  int twojmax, switchflag, bzeroflag;
+  int twojmax, switchflag, bzeroflag, bnormflag, wselfallflag;
  radelem = NULL;
  wjelem = NULL;

@ -48,7 +48,12 @@ ComputeSNAAtom::ComputeSNAAtom(LAMMPS *lmp, int narg, char **arg) :
  rmin0 = 0.0;
  switchflag = 1;
  bzeroflag = 1;
+  bnormflag = 0;
  quadraticflag = 0;
+  chemflag = 0;
+  bnormflag = 0;
+  wselfallflag = 0;
+  nelements = 1;

  // offset by 1 to match up with types

@ -104,11 +109,35 @@ ComputeSNAAtom::ComputeSNAAtom(LAMMPS *lmp, int narg, char **arg) :
        error->all(FLERR,"Illegal compute sna/atom command");
      quadraticflag = atoi(arg[iarg+1]);
      iarg += 2;
+    } else if (strcmp(arg[iarg],"chem") == 0) {
+      if (iarg+2+ntypes > narg)
+        error->all(FLERR,"Illegal compute sna/atom command");
+      chemflag = 1;
+      memory->create(map,ntypes+1,"compute_sna_atom:map");
+      nelements = force->inumeric(FLERR,arg[iarg+1]);
+      for(int i = 0; i < ntypes; i++) {
+        int jelem = force->inumeric(FLERR,arg[iarg+2+i]);
+        if (jelem < 0 || jelem >= nelements)
+          error->all(FLERR,"Illegal compute sna/atom command");
+        map[i+1] = jelem;
+      }
+      iarg += 2+ntypes;
+    } else if (strcmp(arg[iarg],"bnormflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute sna/atom command");
+      bnormflag = atoi(arg[iarg+1]);
+      iarg += 2;
+    } else if (strcmp(arg[iarg],"wselfallflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute sna/atom command");
+      wselfallflag = atoi(arg[iarg+1]);
+      iarg += 2;
    } else error->all(FLERR,"Illegal compute sna/atom command");
  }

-  snaptr = new SNA(lmp,rfac0,twojmax,
-                   rmin0,switchflag,bzeroflag);
+  snaptr = new SNA(lmp, rfac0, twojmax,
+                   rmin0, switchflag, bzeroflag,
+                   chemflag, bnormflag, wselfallflag, nelements);

  ncoeff = snaptr->ncoeff;
  size_peratom_cols = ncoeff;
@ -203,6 +232,9 @@ void ComputeSNAAtom::compute_peratom()
      const double ytmp = x[i][1];
      const double ztmp = x[i][2];
      const int itype = type[i];
+      int ielem = 0;
+      if (chemflag)
+        ielem = map[itype];
      const double radi = radelem[itype];
      const int* const jlist = firstneigh[i];
      const int jnum = numneigh[i];
@ -225,6 +257,9 @@ void ComputeSNAAtom::compute_peratom()
        const double delz = ztmp - x[j][2];
        const double rsq = delx*delx + dely*dely + delz*delz;
        int jtype = type[j];
+        int jelem = 0;
+        if (chemflag)
+          int jelem = map[jtype];
        if (rsq < cutsq[itype][jtype] && rsq>1e-20) {
          snaptr->rij[ninside][0] = delx;
          snaptr->rij[ninside][1] = dely;
@ -232,13 +267,14 @@ void ComputeSNAAtom::compute_peratom()
          snaptr->inside[ninside] = j;
          snaptr->wj[ninside] = wjelem[jtype];
          snaptr->rcutij[ninside] = (radi+radelem[jtype])*rcutfac;
+          snaptr->element[ninside] = jelem; // element index for multi-element snap
          ninside++;
        }
      }

-      snaptr->compute_ui(ninside);
+      snaptr->compute_ui(ninside, ielem);
      snaptr->compute_zi();
-      snaptr->compute_bi();
+      snaptr->compute_bi(ielem);
      for (int icoeff = 0; icoeff < ncoeff; icoeff++)
        sna[i][icoeff] = snaptr->blist[icoeff];
      if (quadraticflag) {
@ -261,6 +297,7 @@ void ComputeSNAAtom::compute_peratom()
        sna[i][icoeff] = 0.0;
    }
  }
+
 }

 /* ----------------------------------------------------------------------
--- a/src/SNAP/compute_sna_atom.h
+++ b/src/SNAP/compute_sna_atom.h
@ -42,6 +42,8 @@ class ComputeSNAAtom : public Compute {
  double rcutfac;
  double *radelem;
  double *wjelem;
+  int * map;  // map types to [0,nelements)
+  int nelements, chemflag;
  class SNA* snaptr;
  double cutmax;
  int quadraticflag;
--- a/src/SNAP/compute_snad_atom.cpp
+++ b/src/SNAP/compute_snad_atom.cpp
@ -34,7 +34,7 @@ ComputeSNADAtom::ComputeSNADAtom(LAMMPS *lmp, int narg, char **arg) :
  radelem(NULL), wjelem(NULL)
 {
  double rfac0, rmin0;
-  int twojmax, switchflag, bzeroflag;
+  int twojmax, switchflag, bzeroflag, bnormflag, wselfallflag;
  radelem = NULL;
  wjelem = NULL;

@ -48,7 +48,12 @@ ComputeSNADAtom::ComputeSNADAtom(LAMMPS *lmp, int narg, char **arg) :
  rmin0 = 0.0;
  switchflag = 1;
  bzeroflag = 1;
+  bnormflag = 0;
  quadraticflag = 0;
+  chemflag = 0;
+  bnormflag = 0;
+  wselfallflag = 0;
+  nelements = 1;

  // process required arguments

@ -102,11 +107,35 @@ ComputeSNADAtom::ComputeSNADAtom(LAMMPS *lmp, int narg, char **arg) :
        error->all(FLERR,"Illegal compute snad/atom command");
      quadraticflag = atoi(arg[iarg+1]);
      iarg += 2;
+    } else if (strcmp(arg[iarg],"chem") == 0) {
+      if (iarg+2+ntypes > narg)
+        error->all(FLERR,"Illegal compute snad/atom command");
+      chemflag = 1;
+      memory->create(map,ntypes+1,"compute_snad_atom:map");
+      nelements = force->inumeric(FLERR,arg[iarg+1]);
+      for(int i = 0; i < ntypes; i++) {
+        int jelem = force->inumeric(FLERR,arg[iarg+2+i]);
+        if (jelem < 0 || jelem >= nelements)
+          error->all(FLERR,"Illegal compute snad/atom command");
+        map[i+1] = jelem;
+      }
+      iarg += 2+ntypes;
+    } else if (strcmp(arg[iarg],"bnormflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snad/atom command");
+      bnormflag = atoi(arg[iarg+1]);
+      iarg += 2;
+    } else if (strcmp(arg[iarg],"wselfallflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snad/atom command");
+      wselfallflag = atoi(arg[iarg+1]);
+      iarg += 2;
    } else error->all(FLERR,"Illegal compute snad/atom command");
  }

-  snaptr = new SNA(lmp,rfac0,twojmax,
-                   rmin0,switchflag,bzeroflag);
+  snaptr = new SNA(lmp, rfac0, twojmax,
+                   rmin0, switchflag, bzeroflag,
+                   chemflag, bnormflag, wselfallflag, nelements);

  ncoeff = snaptr->ncoeff;
  nperdim = ncoeff;
@ -215,6 +244,9 @@ void ComputeSNADAtom::compute_peratom()
      const double ytmp = x[i][1];
      const double ztmp = x[i][2];
      const int itype = type[i];
+      int ielem = 0;
+      if (chemflag)
+        ielem = map[itype];
      const double radi = radelem[itype];
      const int* const jlist = firstneigh[i];
      const int jnum = numneigh[i];
@ -243,6 +275,9 @@ void ComputeSNADAtom::compute_peratom()
        const double delz = x[j][2] - ztmp;
        const double rsq = delx*delx + dely*dely + delz*delz;
        int jtype = type[j];
+        int jelem = 0;
+        if (chemflag)
+            jelem = map[jtype];
        if (rsq < cutsq[itype][jtype]&&rsq>1e-20) {
          snaptr->rij[ninside][0] = delx;
          snaptr->rij[ninside][1] = dely;
@ -250,21 +285,21 @@ void ComputeSNADAtom::compute_peratom()
          snaptr->inside[ninside] = j;
          snaptr->wj[ninside] = wjelem[jtype];
          snaptr->rcutij[ninside] = (radi+radelem[jtype])*rcutfac;
+          snaptr->element[ninside] = jelem; // element index for multi-element snap
          ninside++;
        }
      }

-      snaptr->compute_ui(ninside);
+      snaptr->compute_ui(ninside, ielem);
      snaptr->compute_zi();
      if (quadraticflag) {
-        snaptr->compute_bi();
+        snaptr->compute_bi(ielem);
      }

      for (int jj = 0; jj < ninside; jj++) {
        const int j = snaptr->inside[jj];
-        snaptr->compute_duidrj(snaptr->rij[jj],
-                                    snaptr->wj[jj],
-                                    snaptr->rcutij[jj],jj);
+        snaptr->compute_duidrj(snaptr->rij[jj], snaptr->wj[jj],
+                                    snaptr->rcutij[jj], jj, snaptr->element[jj]);
        snaptr->compute_dbidrj();

        // Accumulate -dBi/dRi, -dBi/dRj
--- a/src/SNAP/compute_snad_atom.h
+++ b/src/SNAP/compute_snad_atom.h
@ -44,6 +44,8 @@ class ComputeSNADAtom : public Compute {
  double rcutfac;
  double *radelem;
  double *wjelem;
+  int *map;  // map types to [0,nelements)
+  int nelements, chemflag;
  class SNA* snaptr;
  double cutmax;
  int quadraticflag;
--- a/src/SNAP/compute_snap.cpp
+++ b/src/SNAP/compute_snap.cpp
@ -41,7 +41,10 @@ ComputeSnap::ComputeSnap(LAMMPS *lmp, int narg, char **arg) :
  extarray = 0;

  double rfac0, rmin0;
-  int twojmax, switchflag, bzeroflag;
+  int twojmax, switchflag, bzeroflag, bnormflag, wselfallflag;
+  radelem = NULL;
+  wjelem = NULL;
+
  int ntypes = atom->ntypes;
  int nargmin = 6+2*ntypes;

@ -53,6 +56,10 @@ ComputeSnap::ComputeSnap(LAMMPS *lmp, int narg, char **arg) :
  switchflag = 1;
  bzeroflag = 1;
  quadraticflag = 0;
+  chemflag = 0;
+  bnormflag = 0;
+  wselfallflag = 0;
+  nelements = 1;

  // process required arguments

@ -106,11 +113,35 @@ ComputeSnap::ComputeSnap(LAMMPS *lmp, int narg, char **arg) :
        error->all(FLERR,"Illegal compute snap command");
      quadraticflag = atoi(arg[iarg+1]);
      iarg += 2;
+    } else if (strcmp(arg[iarg],"chem") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snap command");
+      chemflag = 1;
+      memory->create(map,ntypes+1,"compute_snap:map");
+      nelements = force->inumeric(FLERR,arg[iarg+1]);
+      for(int i = 0; i < ntypes; i++) {
+        int jelem = force->inumeric(FLERR,arg[iarg+2+i]);
+        if (jelem < 0 || jelem >= nelements)
+          error->all(FLERR,"Illegal compute snap command");
+        map[i+1] = jelem;
+      }
+      iarg += 2+ntypes;
+    } else if (strcmp(arg[iarg],"bnormflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snap command");
+      bnormflag = atoi(arg[iarg+1]);
+      iarg += 2;
+    } else if (strcmp(arg[iarg],"wselfallflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snap command");
+      wselfallflag = atoi(arg[iarg+1]);
+      iarg += 2;
    } else error->all(FLERR,"Illegal compute snap command");
  }

-  snaptr = new SNA(lmp,rfac0,twojmax,
-                   rmin0,switchflag,bzeroflag);
+  snaptr = new SNA(lmp, rfac0, twojmax,
+                   rmin0, switchflag, bzeroflag,
+                   chemflag, bnormflag, wselfallflag, nelements);

  ncoeff = snaptr->ncoeff;
  nperdim = ncoeff;
@ -141,6 +172,8 @@ ComputeSnap::~ComputeSnap()
  memory->destroy(wjelem);
  memory->destroy(cutsq);
  delete snaptr;
+
+  if (chemflag) memory->destroy(map);
 }

 /* ---------------------------------------------------------------------- */
@ -271,6 +304,9 @@ void ComputeSnap::compute_array()
      const double ytmp = x[i][1];
      const double ztmp = x[i][2];
      const int itype = type[i];
+      int ielem = 0;
+      if (chemflag)
+        ielem = map[itype];
      const double radi = radelem[itype];
      const int* const jlist = firstneigh[i];
      const int jnum = numneigh[i];
@ -296,6 +332,9 @@ void ComputeSnap::compute_array()
        const double delz = x[j][2] - ztmp;
        const double rsq = delx*delx + dely*dely + delz*delz;
        int jtype = type[j];
+        int jelem = 0;
+        if (chemflag)
+          jelem = map[jtype];
        if (rsq < cutsq[itype][jtype]&&rsq>1e-20) {
          snaptr->rij[ninside][0] = delx;
          snaptr->rij[ninside][1] = dely;
@ -303,19 +342,19 @@ void ComputeSnap::compute_array()
          snaptr->inside[ninside] = j;
          snaptr->wj[ninside] = wjelem[jtype];
          snaptr->rcutij[ninside] = (radi+radelem[jtype])*rcutfac;
+          snaptr->element[ninside] = jelem; // element index for multi-element snap
          ninside++;
        }
      }

-      snaptr->compute_ui(ninside);
+      snaptr->compute_ui(ninside, ielem);
      snaptr->compute_zi();
-      snaptr->compute_bi();
+      snaptr->compute_bi(ielem);

      for (int jj = 0; jj < ninside; jj++) {
        const int j = snaptr->inside[jj];
-        snaptr->compute_duidrj(snaptr->rij[jj],
-                                    snaptr->wj[jj],
-                                    snaptr->rcutij[jj],jj);
+        snaptr->compute_duidrj(snaptr->rij[jj], snaptr->wj[jj],
+                                    snaptr->rcutij[jj], jj, snaptr->element[jj]);
        snaptr->compute_dbidrj();

        // Accumulate dBi/dRi, -dBi/dRj
@ -507,6 +546,8 @@ double ComputeSnap::memory_usage()
    sizeof(double);                                     // snapall
  bytes += nmax*size_peratom * sizeof(double);          // snap_peratom
  bytes += snaptr->memory_usage();                      // SNA object
+  int n = atom->ntypes+1;
+  bytes += n*sizeof(int);        // map

  return bytes;
 }
--- a/src/SNAP/compute_snap.h
+++ b/src/SNAP/compute_snap.h
@ -44,6 +44,8 @@ class ComputeSnap : public Compute {
  double rcutfac;
  double *radelem;
  double *wjelem;
+  int *map;  // map types to [0,nelements)
+  int nelements, chemflag;
  class SNA* snaptr;
  double cutmax;
  int quadraticflag;
--- a/src/SNAP/compute_snav_atom.cpp
+++ b/src/SNAP/compute_snav_atom.cpp
@ -33,7 +33,7 @@ ComputeSNAVAtom::ComputeSNAVAtom(LAMMPS *lmp, int narg, char **arg) :
  radelem(NULL), wjelem(NULL)
 {
  double rfac0, rmin0;
-  int twojmax, switchflag, bzeroflag;
+  int twojmax, switchflag, bzeroflag, bnormflag, wselfallflag;
  radelem = NULL;
  wjelem = NULL;

@ -47,7 +47,12 @@ ComputeSNAVAtom::ComputeSNAVAtom(LAMMPS *lmp, int narg, char **arg) :
  rmin0 = 0.0;
  switchflag = 1;
  bzeroflag = 1;
+  bnormflag = 0;
  quadraticflag = 0;
+  chemflag = 0;
+  bnormflag = 0;
+  wselfallflag = 0;
+  nelements = 1;

  // process required arguments

@ -97,11 +102,35 @@ ComputeSNAVAtom::ComputeSNAVAtom(LAMMPS *lmp, int narg, char **arg) :
        error->all(FLERR,"Illegal compute snav/atom command");
      quadraticflag = atoi(arg[iarg+1]);
      iarg += 2;
+    } else if (strcmp(arg[iarg],"chem") == 0) {
+      if (iarg+2+ntypes > narg)
+        error->all(FLERR,"Illegal compute sna/atom command");
+      chemflag = 1;
+      memory->create(map,ntypes+1,"compute_sna_atom:map");
+      nelements = force->inumeric(FLERR,arg[iarg+1]);
+      for(int i = 0; i < ntypes; i++) {
+        int jelem = force->inumeric(FLERR,arg[iarg+2+i]);
+        if (jelem < 0 || jelem >= nelements)
+          error->all(FLERR,"Illegal compute snav/atom command");
+        map[i+1] = jelem;
+      }
+      iarg += 2+ntypes;
+    } else if (strcmp(arg[iarg],"bnormflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snav/atom command");
+      bnormflag = atoi(arg[iarg+1]);
+      iarg += 2;
+    } else if (strcmp(arg[iarg],"wselfallflag") == 0) {
+      if (iarg+2 > narg)
+        error->all(FLERR,"Illegal compute snav/atom command");
+      wselfallflag = atoi(arg[iarg+1]);
+      iarg += 2;
    } else error->all(FLERR,"Illegal compute snav/atom command");
  }

-  snaptr = new SNA(lmp,rfac0,twojmax,
-                   rmin0,switchflag,bzeroflag);
+  snaptr = new SNA(lmp, rfac0, twojmax,
+                   rmin0, switchflag, bzeroflag,
+                   chemflag, bnormflag, wselfallflag, nelements);

  ncoeff = snaptr->ncoeff;
  nperdim = ncoeff;
@ -210,6 +239,9 @@ void ComputeSNAVAtom::compute_peratom()
      const double ytmp = x[i][1];
      const double ztmp = x[i][2];
      const int itype = type[i];
+      int ielem = 0;
+      if (chemflag)
+        ielem = map[itype];
      const double radi = radelem[itype];

      const int* const jlist = firstneigh[i];
@ -236,6 +268,9 @@ void ComputeSNAVAtom::compute_peratom()
        const double delz = x[j][2] - ztmp;
        const double rsq = delx*delx + dely*dely + delz*delz;
        int jtype = type[j];
+        int jelem = 0;
+        if (chemflag)
+          jelem = map[jtype];
        if (rsq < cutsq[itype][jtype]&&rsq>1e-20) {
          snaptr->rij[ninside][0] = delx;
          snaptr->rij[ninside][1] = dely;
@ -243,22 +278,22 @@ void ComputeSNAVAtom::compute_peratom()
          snaptr->inside[ninside] = j;
          snaptr->wj[ninside] = wjelem[jtype];
          snaptr->rcutij[ninside] = (radi+radelem[jtype])*rcutfac;
+          snaptr->element[ninside] = jelem; // element index for multi-element snap
          ninside++;
        }
      }

-      snaptr->compute_ui(ninside);
+      snaptr->compute_ui(ninside, ielem);
      snaptr->compute_zi();
      if (quadraticflag) {
-        snaptr->compute_bi();
+        snaptr->compute_bi(ielem);
      }

      for (int jj = 0; jj < ninside; jj++) {
        const int j = snaptr->inside[jj];

-        snaptr->compute_duidrj(snaptr->rij[jj],
-                                    snaptr->wj[jj],
-                                    snaptr->rcutij[jj],jj);
+        snaptr->compute_duidrj(snaptr->rij[jj], snaptr->wj[jj],
+                                    snaptr->rcutij[jj], jj, snaptr->element[jj]);
        snaptr->compute_dbidrj();

        // Accumulate -dBi/dRi*Ri, -dBi/dRj*Rj
--- a/src/SNAP/compute_snav_atom.h
+++ b/src/SNAP/compute_snav_atom.h
@ -44,6 +44,8 @@ class ComputeSNAVAtom : public Compute {
  double rcutfac;
  double *radelem;
  double *wjelem;
+  int *map;  // map types to [0,nelements)
+  int nelements, chemflag;
  class SNA* snaptr;
  int quadraticflag;
 };
--- a/src/SNAP/pair_snap.cpp
+++ b/src/SNAP/pair_snap.cpp
@ -91,7 +91,6 @@ void PairSNAP::compute(int eflag, int vflag)
  double delx,dely,delz,evdwl,rsq;
  double fij[3];
  int *jlist,*numneigh,**firstneigh;
-  evdwl = 0.0;

  ev_init(eflag,vflag);

@ -157,13 +156,17 @@ void PairSNAP::compute(int eflag, int vflag)
        snaptr->inside[ninside] = j;
        snaptr->wj[ninside] = wjelem[jelem];
        snaptr->rcutij[ninside] = (radi + radelem[jelem])*rcutfac;
+        snaptr->element[ninside] = jelem;
        ninside++;
      }
    }

    // compute Ui, Yi for atom I

-    snaptr->compute_ui(ninside);
+    if (chemflag)
+      snaptr->compute_ui(ninside, ielem);
+    else
+      snaptr->compute_ui(ninside, 0);

    // for neighbors of I within cutoff:
    // compute Fij = dEi/dRj = -dEi/dRi
@ -173,8 +176,12 @@ void PairSNAP::compute(int eflag, int vflag)

    for (int jj = 0; jj < ninside; jj++) {
      int j = snaptr->inside[jj];
-      snaptr->compute_duidrj(snaptr->rij[jj],
-                             snaptr->wj[jj],snaptr->rcutij[jj],jj);
+      if(chemflag)
+        snaptr->compute_duidrj(snaptr->rij[jj], snaptr->wj[jj],
+                               snaptr->rcutij[jj],jj, snaptr->element[jj]);
+      else
+        snaptr->compute_duidrj(snaptr->rij[jj], snaptr->wj[jj],
+                               snaptr->rcutij[jj],jj, 0);

      snaptr->compute_deidrj(fij);

@ -202,6 +209,7 @@ void PairSNAP::compute(int eflag, int vflag)

      double* coeffi = coeffelem[ielem];
      evdwl = coeffi[0];
+      // snaptr->copy_bi2bvec();

      // E = beta.B + 0.5*B^t.alpha.B

@ -320,17 +328,26 @@ void PairSNAP::compute_bispectrum()
        snaptr->inside[ninside] = j;
        snaptr->wj[ninside] = wjelem[jelem];
        snaptr->rcutij[ninside] = (radi + radelem[jelem])*rcutfac;
+        snaptr->element[ninside] = jelem;
        ninside++;
      }
    }

-    snaptr->compute_ui(ninside);
+    if (chemflag)
+      snaptr->compute_ui(ninside, ielem);
+    else
+      snaptr->compute_ui(ninside, 0);
    snaptr->compute_zi();
-    snaptr->compute_bi();
+    if (chemflag)
+      snaptr->compute_bi(ielem);
+    else
+      snaptr->compute_bi(0);

-    for (int icoeff = 0; icoeff < ncoeff; icoeff++)
+    for (int icoeff = 0; icoeff < ncoeff; icoeff++){
      bispectrum[ii][icoeff] = snaptr->blist[icoeff];
+    }
  }
+
 }

 /* ----------------------------------------------------------------------
@ -401,7 +418,6 @@ void PairSNAP::coeff(int narg, char **arg)
    ncoeffq = (ncoeff*(ncoeff+1))/2;
    int ntmp = 1+ncoeff+ncoeffq;
    if (ntmp != ncoeffall) {
-      printf("ncoeffall = %d ntmp = %d ncoeff = %d \n",ncoeffall,ntmp,ncoeff);
      error->all(FLERR,"Incorrect SNAP coeff file");
    }
  }
@ -442,8 +458,9 @@ void PairSNAP::coeff(int narg, char **arg)

  if (count == 0) error->all(FLERR,"Incorrect args for pair coefficients");

-  snaptr = new SNA(lmp,rfac0,twojmax,
-                   rmin0,switchflag,bzeroflag);
+  snaptr = new SNA(lmp, rfac0, twojmax,
+                   rmin0, switchflag, bzeroflag,
+                   chemflag, bnormflag, wselfallflag, nelements);

  if (ncoeff != snaptr->ncoeff) {
    if (comm->me == 0)
@ -636,6 +653,9 @@ void PairSNAP::read_files(char *coefffilename, char *paramfilename)
  switchflag = 1;
  bzeroflag = 1;
  quadraticflag = 0;
+  chemflag = 0;
+  bnormflag = 0;
+  wselfallflag = 0;
  chunksize = 2000;

  // open SNAP parameter file on proc 0
@ -700,6 +720,12 @@ void PairSNAP::read_files(char *coefffilename, char *paramfilename)
      bzeroflag = atoi(keyval);
    else if (strcmp(keywd,"quadraticflag") == 0)
      quadraticflag = atoi(keyval);
+    else if (strcmp(keywd,"chemflag") == 0)
+      chemflag = atoi(keyval);
+    else if (strcmp(keywd,"bnormflag") == 0)
+      bnormflag = atoi(keyval);
+    else if (strcmp(keywd,"wselfallflag") == 0)
+      wselfallflag = atoi(keyval);
    else if (strcmp(keywd,"chunksize") == 0)
      chunksize = atoi(keyval);
    else
--- a/src/SNAP/pair_snap.h
+++ b/src/SNAP/pair_snap.h
@ -58,7 +58,8 @@ protected:
  double** beta;                // betas for all atoms in list
  double** bispectrum;          // bispectrum components for all atoms in list
  int *map;                     // mapping from atom types to elements
-  int twojmax, switchflag, bzeroflag;
+  int twojmax, switchflag, bzeroflag, bnormflag;
+  int chemflag, wselfallflag;
  int chunksize;
  double rfac0, rmin0, wj1, wj2;
  int rcutfacflag, twojmaxflag; // flags for required parameters
--- a/src/SNAP/sna.cpp
+++ b/src/SNAP/sna.cpp
--- a/src/SNAP/sna.h
+++ b/src/SNAP/sna.h
@ -33,7 +33,7 @@ struct SNA_BINDICES {
 class SNA : protected Pointers {

 public:
-  SNA(LAMMPS*, double, int, double, int, int);
+  SNA(LAMMPS*, double, int, double, int, int, int, int, int, int);

  SNA(LAMMPS* lmp) : Pointers(lmp) {};
  ~SNA();
@ -45,15 +45,15 @@ public:

  // functions for bispectrum coefficients

-  void compute_ui(int);
+  void compute_ui(int, int);
  void compute_zi();
  void compute_yi(const double*);
  void compute_yterm(int, int, int, const double*);
-  void compute_bi();
+  void compute_bi(int);

  // functions for derivatives

-  void compute_duidrj(double*, double, double, int);
+  void compute_duidrj(double*, double, double, int, int);
  void compute_dbidrj();
  void compute_deidrj(double*);
  double compute_sfac(double, double);
@ -65,11 +65,16 @@ public:
  int* inside;
  double* wj;
  double* rcutij;
+  int* element;  // index on [0,nelements)
  int nmax;

  void grow_rij(int);

  int twojmax;
+  double* ylist_r, * ylist_i;
+  int idxcg_max, idxu_max, idxz_max, idxb_max;
+
+

 private:
  double rmin0, rfac0;
@ -78,7 +83,6 @@ private:

  SNA_ZINDICES* idxz;
  SNA_BINDICES* idxb;
-  int idxcg_max, idxu_max, idxz_max, idxb_max;

  double** rootpqarray;
  double* cglist;
@ -94,7 +98,7 @@ private:
  int*** idxb_block;

  double** dulist_r, ** dulist_i;
-  double* ylist_r, * ylist_i;
+  int elem_duarray; // element of j in derivative

  static const int nmaxfactorial = 167;
  static const double nfac_table[];
@ -105,13 +109,12 @@ private:
  void init_clebsch_gordan();
  void print_clebsch_gordan();
  void init_rootpqarray();
-  void zero_uarraytot();
-  void addself_uarraytot(double);
-  void add_uarraytot(double, double, double, int);
+  void zero_uarraytot(int);
+  void add_uarraytot(double, double, double, int, int);
  void compute_uarray(double, double, double,
                      double, double, int);
  double deltacg(int, int, int);
-  int compute_ncoeff();
+  void compute_ncoeff();
  void compute_duarray(double, double, double,
                       double, double, double, double, double, int);

@ -125,6 +128,12 @@ private:

  int bzero_flag; // 1 if bzero subtracted from barray
  double* bzero;  // array of B values for isolated atoms
+  int bnorm_flag; // 1 if barray divided by j+1
+  int chem_flag; // 1 for multi-element bispectrum components
+  int wselfall_flag; // 1 for adding wself to all element labelings
+  int nelements;  // number of elements
+  int ndoubles;   // number of multi-element pairs
+  int ntriples;   // number of multi-element triplets
 };

 }