git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@8951 f3b2605a-c512-4ea7-a41b-209d697bcdaa

2012-10-12 16:32:42 +00:00 · 08a691bfed
parent 6984d53442
commit 08a691bfed
4 changed files with 174 additions and 74 deletions
--- a/doc/kspace_modify.html
+++ b/doc/kspace_modify.html
@ -17,7 +17,7 @@
 </PRE>
 <UL><LI>one or more keyword/value pairs may be listed 

-<LI>keyword = <I>mesh</I> or <I>order</I> or <I>gewald</I> or <I>slab</I> or (nozforce</I> or <I>compute</I> or <I>diff</I> 
+<LI>keyword = <I>mesh</I> or <I>order</I> or <I>order/disp</I> or <I>overlap</I> or <I>minorder</I> or <I>force</I> or <I>gewald</I> or <I>gewald/disp</I> or <I>slab</I> or <I>nozforce</I> or <I>compute</I> or <I>diff</I> 

 <PRE>  <I>mesh</I> value = x y z
    x,y,z = grid size in each dimension for long-range Coulombics
@ -27,6 +27,9 @@
    N = extent of Gaussian for PPPM or MSM mapping of charge to grid
  <I>order/disp</I> value = N
    N = extent of Gaussian for PPPM mapping of dispersion term to grid
+  <I>overlap</I> value = <I>yes</I> or <I>no</I> = whether the grid stencil for PPPM is allowed to overlap into more than the nearest-neighbor processor
+  <I>minorder</I> value = M
+    M = min allowed extent of Gaussian when auto-adjusting to minimize grid communication
  <I>force</I> value = accuracy (force units)
  <I>gewald</I> value = rinv (1/distance units)
    rinv = G-ewald parameter for Coulombics
@ -37,7 +40,7 @@
      2d approximation compared with the volume of the simulation domain
  <I>nozforce</I> turns off kspace forces in the z direction
  <I>compute</I> value = <I>yes</I> or <I>no</I> 
-  <I>diff</I> value = <I>ik</I> or <I>ad</I> 
+  <I>diff</I> value = <I>ad</I> or <I>ik</I> = 2 or 4 FFTs for PPPM in smoothed or non-smoothed mode 
 </PRE>

 </UL>
@ -70,29 +73,59 @@ Values for x,y,z of 0,0,0 unset the option.
 </P>
 <P>The <I>order</I> keyword determines how many grid spacings an atom's charge
 extends when it is mapped to the grid in kspace style <I>pppm</I> or <I>msm</I>.
-The default for this parameter is 5 for PPPM and 4 for MSM, which means 
-each charge spans 5 or 4 grid cells in each dimension, respectively.  
-For the LAMMPS implementation of MSM, the order can range from 4 to 10
-and must be even. For PPPM, the minimum allowed setting is 2 and the 
-maximum allowed setting is 7.  The larger the value of this parameter, 
-the smaller the grid will need to be to achieve the requested accuracy.  
-Conversely, the smaller the order value, the larger the grid will be.  
-Note that there is an inherent trade-off involved: a small grid will 
-lower the cost of FFTs or MSM direct sum, but a larger order parameter
-will increase the cost of interpolating charge/fields to/from the grid.
+The default for this parameter is 5 for PPPM and 4 for MSM, which
+means each charge spans 5 or 4 grid cells in each dimension,
+respectively.  For the LAMMPS implementation of MSM, the order can
+range from 4 to 10 and must be even. For PPPM, the minimum allowed
+setting is 2 and the maximum allowed setting is 7.  The larger the
+value of this parameter, the smaller the grid size that LAMMPS will set
+to achieve the requested accuracy.  Conversely, the smaller the
+order value, the larger the grid size will be.  Note that there is an
+inherent trade-off involved: a small grid will lower the cost of FFTs
+or MSM direct sum, but a larger order parameter will increase the cost
+of interpolating charge/fields to/from the grid.
 </P>
 <P>The <I>order/disp</I> keyword determines how many grid spacings an atom's
 dispersion term extends when it is mapped to the grid in kspace style
 <I>pppm/disp</I>.  It has the same meaning as the <I>order</I> setting for
 Coulombics.
 </P>
+<P>The <I>overlap</I> keyword can be used in conjunction with the <I>minorder</I>
+keyword with the PPPM styles to adjust the amount of communication
+that occurs when values on the FFT grid are exchanged between
+processors.  This communication is distinct from the communication
+inherent in the parallel FFTs themselves, and is required because
+processors interpolate charge and field values using grid point values
+owned by neighboring processors (i.e. ghost point communication).  If
+the <I>overlap</I> keyword is set to <I>yes</I> then this communication is
+allowed to extend beyond nearest-neighbor processors, e.g. when using
+lots of processors on a small problem.  If it is set to <I>no</I> then the
+communication will be limited to nearest-neighbor processors and the
+<I>order</I> setting will be reduced if necessary, as explained by the
+<I>minorder</I> keyword discussion.
+</P>
+<P>The <I>minorder</I> keyword allows LAMMPS to reduce the <I>order</I> setting if
+necessary to keep the communication of ghost grid points limited to
+exchanges between nearest-neighbor processors.  See the discussion of
+the <I>overlap</I> keyword for details.  If the <I>overlap</I> keyword is set to
+<I>yes</I>, which is the default, this is never needed.  If it is set to <I>no</I>
+and overlap occurs, then LAMMPS will reduce the order setting, one
+step at a time, until the ghost grid overlap only extends to
+nearest-neighbor processors.  The <I>minorder</I> keyword limits how small the
+<I>order</I> setting can become.  The minimum allowed value for PPPM is 2,
+which is the default.  If <I>minorder</I> is set to the same value as
+<I>order</I> then no reduction is allowed, and LAMMPS will generate an
+error if the grid communication is non-nearest-neighbor and <I>overlap</I>
+is set to <I>no</I>.
+</P>
 <P>The PPPM order parameter may be reset by LAMMPS when it sets up the
 FFT grid if the implied grid stencil extends beyond the grid cells
 owned by neighboring processors.  Typically this will only occur when
 small problems are run on large numbers of processors.  A warning will
 be generated indicating the order parameter is being reduced to allow
-LAMMPS to run the problem. Automatic reduction of order is not currently
-implemented in MSM, so an error (instead of a warning) will be generated.
+LAMMPS to run the problem. Automatic reduction of order is not
+currently implemented in MSM, so an error (instead of a warning) will
+be generated.
 </P>
 <P>The <I>force</I> keyword overrides the relative accuracy parameter set by
 the <A HREF = "kspace_style.html">kspace_style</A> command with an absolute
@ -148,17 +181,23 @@ This keyword gives you that option.
 <P>The <I>diff</I> keyword specifies the differentiation scheme used by the
 PPPM method to compute forces on particles given electrostatic
 potentials on the PPPM mesh.  The <I>ik</I> approach is the default for
-PPPM.  It performs differentiation in Kspace, but uses 3 FFTs to
-transfer the computed fields back to real space (total of 4 FFTs per
-timestep). The analytic differentiation, or <I>ad</I> approach uses only 1
-FFT to transfer the computed fields back to real space (total of 2
-FFTs per timestep), but requires a somewhat larger PPPM mesh to
-achieve the same accuracy as the <I>ik</I> approach. Analogous approaches
-have been implemented in MSM and can be specified using the same
-keywords. The <I>ad</I> approach is the default for MSM.
+PPPM and is the original formulation used in <A HREF = "#Hockney">(Hockney)</A>.  It
+performs differentiation in Kspace, and uses 3 FFTs to transfer each
+component of the computed fields back to real space for a total of 4
+FFTs per timestep.
 </P>
-<P>IMPORTANT NOTE: Currently, not all <I>pppm</I> styles support the <I>ad</I>
-option.  Support for those <I>pppm</I> variants will be added later.
+<P>The analytic differentiation <I>ad</I> approach uses only 1 FFT to transfer
+information back to real space for a total of 2 FFTs per timestep.  It
+then performs analytic differentiation on the single quantity to
+generate the 3 components of the electric field at each grid point.
+This is sometimes referred to as "smoothed" PPPM.  This approach
+requires a somewhat larger PPPM mesh to achieve the same accuracy as
+the <I>ik</I> method.  Analogous approaches have been implemented in MSM
+and can be specified using the same keywords.  The <I>ad</I> approach is
+the default for MSM.
+</P>
+<P>IMPORTANT NOTE: Currently, not all PPPM styles support the <I>ad</I>
+option.  Support for those PPPM variants will be added later.
 </P>
 <P><B>Restrictions:</B> none
 </P>
@ -169,11 +208,17 @@ option.  Support for those <I>pppm</I> variants will be added later.
 <P><B>Default:</B>
 </P>
 <P>The option defaults are mesh = mesh/disp = 0 0 0, order = order/disp =
-5 (PPPM), order = 4 (MSM), force = -1.0, gewald = gewald/disp = 0.0, 
-slab = 1.0, compute = yes, and diff = ik (PPPM), diff = ad (MSM).
+5 (PPPM), order = 4 (MSM), minorder = 2, overlap = yes, force = -1.0,
+gewald = gewald/disp = 0.0, slab = 1.0, compute = yes, and diff = ik
+(PPPM), diff = ad (MSM).
 </P>
 <HR>

+<A NAME = "Hockney"></A>
+
+<P><B>(Hockney)</B> Hockney and Eastwood, Computer Simulation Using Particles,
+Adam Hilger, NY (1989).
+</P>
 <A NAME = "Yeh"></A>

 <P><B>(Yeh)</B> Yeh and Berkowitz, J Chem Phys, 111, 3155 (1999).
--- a/doc/kspace_modify.txt
+++ b/doc/kspace_modify.txt
@ -13,7 +13,7 @@ kspace_modify command :h3
 kspace_modify keyword value ... :pre

 one or more keyword/value pairs may be listed :ulb,l
-keyword = {mesh} or {order} or {gewald} or {slab} or (nozforce} or {compute} or {diff} :l
+keyword = {mesh} or {order} or {order/disp} or {overlap} or {minorder} or {force} or {gewald} or {gewald/disp} or {slab} or {nozforce} or {compute} or {diff} :l
  {mesh} value = x y z
    x,y,z = grid size in each dimension for long-range Coulombics
  {mesh/disp} value = x y z
@ -22,6 +22,9 @@ keyword = {mesh} or {order} or {gewald} or {slab} or (nozforce} or {compute} or
    N = extent of Gaussian for PPPM or MSM mapping of charge to grid
  {order/disp} value = N
    N = extent of Gaussian for PPPM mapping of dispersion term to grid
+  {overlap} value = {yes} or {no} = whether the grid stencil for PPPM is allowed to overlap into more than the nearest-neighbor processor
+  {minorder} value = M
+    M = min allowed extent of Gaussian when auto-adjusting to minimize grid communication
  {force} value = accuracy (force units)
  {gewald} value = rinv (1/distance units)
    rinv = G-ewald parameter for Coulombics
@ -32,7 +35,7 @@ keyword = {mesh} or {order} or {gewald} or {slab} or (nozforce} or {compute} or
      2d approximation compared with the volume of the simulation domain
  {nozforce} turns off kspace forces in the z direction
  {compute} value = {yes} or {no} 
-  {diff} value = {ik} or {ad} :pre
+  {diff} value = {ad} or {ik} = 2 or 4 FFTs for PPPM in smoothed or non-smoothed mode :pre
 :ule

 [Examples:]
@ -64,29 +67,59 @@ Values for x,y,z of 0,0,0 unset the option.

 The {order} keyword determines how many grid spacings an atom's charge
 extends when it is mapped to the grid in kspace style {pppm} or {msm}.
-The default for this parameter is 5 for PPPM and 4 for MSM, which means 
-each charge spans 5 or 4 grid cells in each dimension, respectively.  
-For the LAMMPS implementation of MSM, the order can range from 4 to 10
-and must be even. For PPPM, the minimum allowed setting is 2 and the 
-maximum allowed setting is 7.  The larger the value of this parameter, 
-the smaller the grid will need to be to achieve the requested accuracy.  
-Conversely, the smaller the order value, the larger the grid will be.  
-Note that there is an inherent trade-off involved: a small grid will 
-lower the cost of FFTs or MSM direct sum, but a larger order parameter
-will increase the cost of interpolating charge/fields to/from the grid.
+The default for this parameter is 5 for PPPM and 4 for MSM, which
+means each charge spans 5 or 4 grid cells in each dimension,
+respectively.  For the LAMMPS implementation of MSM, the order can
+range from 4 to 10 and must be even. For PPPM, the minimum allowed
+setting is 2 and the maximum allowed setting is 7.  The larger the
+value of this parameter, the smaller the grid size that LAMMPS will set
+to achieve the requested accuracy.  Conversely, the smaller the
+order value, the larger the grid size will be.  Note that there is an
+inherent trade-off involved: a small grid will lower the cost of FFTs
+or MSM direct sum, but a larger order parameter will increase the cost
+of interpolating charge/fields to/from the grid.

 The {order/disp} keyword determines how many grid spacings an atom's
 dispersion term extends when it is mapped to the grid in kspace style
 {pppm/disp}.  It has the same meaning as the {order} setting for
 Coulombics.

+The {overlap} keyword can be used in conjunction with the {minorder}
+keyword with the PPPM styles to adjust the amount of communication
+that occurs when values on the FFT grid are exchanged between
+processors.  This communication is distinct from the communication
+inherent in the parallel FFTs themselves, and is required because
+processors interpolate charge and field values using grid point values
+owned by neighboring processors (i.e. ghost point communication).  If
+the {overlap} keyword is set to {yes} then this communication is
+allowed to extend beyond nearest-neighbor processors, e.g. when using
+lots of processors on a small problem.  If it is set to {no} then the
+communication will be limited to nearest-neighbor processors and the
+{order} setting will be reduced if necessary, as explained by the
+{minorder} keyword discussion.
+
+The {minorder} keyword allows LAMMPS to reduce the {order} setting if
+necessary to keep the communication of ghost grid points limited to
+exchanges between nearest-neighbor processors.  See the discussion of
+the {overlap} keyword for details.  If the {overlap} keyword is set to
+{yes}, which is the default, this is never needed.  If it is set to {no}
+and overlap occurs, then LAMMPS will reduce the order setting, one
+step at a time, until the ghost grid overlap only extends to
+nearest-neighbor processors.  The {minorder} keyword limits how small the
+{order} setting can become.  The minimum allowed value for PPPM is 2,
+which is the default.  If {minorder} is set to the same value as
+{order} then no reduction is allowed, and LAMMPS will generate an
+error if the grid communication is non-nearest-neighbor and {overlap}
+is set to {no}.
+
 The PPPM order parameter may be reset by LAMMPS when it sets up the
 FFT grid if the implied grid stencil extends beyond the grid cells
 owned by neighboring processors.  Typically this will only occur when
 small problems are run on large numbers of processors.  A warning will
 be generated indicating the order parameter is being reduced to allow
-LAMMPS to run the problem. Automatic reduction of order is not currently
-implemented in MSM, so an error (instead of a warning) will be generated.
+LAMMPS to run the problem. Automatic reduction of order is not
+currently implemented in MSM, so an error (instead of a warning) will
+be generated.

 The {force} keyword overrides the relative accuracy parameter set by
 the "kspace_style"_kspace_style.html command with an absolute
@ -142,17 +175,23 @@ This keyword gives you that option.
 The {diff} keyword specifies the differentiation scheme used by the
 PPPM method to compute forces on particles given electrostatic
 potentials on the PPPM mesh.  The {ik} approach is the default for
-PPPM.  It performs differentiation in Kspace, but uses 3 FFTs to
-transfer the computed fields back to real space (total of 4 FFTs per
-timestep). The analytic differentiation, or {ad} approach uses only 1
-FFT to transfer the computed fields back to real space (total of 2
-FFTs per timestep), but requires a somewhat larger PPPM mesh to
-achieve the same accuracy as the {ik} approach. Analogous approaches
-have been implemented in MSM and can be specified using the same
-keywords. The {ad} approach is the default for MSM.
+PPPM and is the original formulation used in "(Hockney)"_#Hockney.  It
+performs differentiation in Kspace, and uses 3 FFTs to transfer each
+component of the computed fields back to real space for a total of 4
+FFTs per timestep.

-IMPORTANT NOTE: Currently, not all {pppm} styles support the {ad}
-option.  Support for those {pppm} variants will be added later.
+The analytic differentiation {ad} approach uses only 1 FFT to transfer
+information back to real space for a total of 2 FFTs per timestep.  It
+then performs analytic differentiation on the single quantity to
+generate the 3 components of the electric field at each grid point.
+This is sometimes referred to as "smoothed" PPPM.  This approach
+requires a somewhat larger PPPM mesh to achieve the same accuracy as
+the {ik} method.  Analogous approaches have been implemented in MSM
+and can be specified using the same keywords.  The {ad} approach is
+the default for MSM.
+
+IMPORTANT NOTE: Currently, not all PPPM styles support the {ad}
+option.  Support for those PPPM variants will be added later.

 [Restrictions:] none

@ -163,10 +202,15 @@ option.  Support for those {pppm} variants will be added later.
 [Default:]

 The option defaults are mesh = mesh/disp = 0 0 0, order = order/disp =
-5 (PPPM), order = 4 (MSM), force = -1.0, gewald = gewald/disp = 0.0, 
-slab = 1.0, compute = yes, and diff = ik (PPPM), diff = ad (MSM).
+5 (PPPM), order = 4 (MSM), minorder = 2, overlap = yes, force = -1.0,
+gewald = gewald/disp = 0.0, slab = 1.0, compute = yes, and diff = ik
+(PPPM), diff = ad (MSM).

 :line

+:link(Hockney) 
+[(Hockney)] Hockney and Eastwood, Computer Simulation Using Particles,
+Adam Hilger, NY (1989).
+
 :link(Yeh)
 [(Yeh)] Yeh and Berkowitz, J Chem Phys, 111, 3155 (1999).
--- a/doc/kspace_style.html
+++ b/doc/kspace_style.html
@ -153,12 +153,13 @@ manual.
 </P>
 <HR>

-<P>The <I>msm</I> style invokes a multi-level summation method MSM solver
-<A HREF = "#Hardy">(Hardy)</A> which maps atom charge to a 3d mesh, and uses a 
-multi-level hierarchy of coarser and coarser meshes on which direct
-coulomb solves are done.  This method does not use FFTs and scales
-as N. It may therefore be faster than the other K-space solvers for 
-relatively large problems when running on large core counts.
+<P>The <I>msm</I> style invokes a multi-level summation method MSM solver,
+<A HREF = "#Hardy">(Hardy)</A> or <A HREF = "#Hardy2">(Hardy2)</A>, which maps atom charge to a 3d
+mesh, and uses a multi-level hierarchy of coarser and coarser meshes
+on which direct coulomb solves are done.  This method does not use
+FFTs and scales as N. It may therefore be faster than the other
+K-space solvers for relatively large problems when running on large
+core counts.
 </P>
 <P>MSM is most competitive versus Ewald and PPPM when only relatively 
 low accuracy forces, about 1e-4 relative error or less accurate, 
@ -284,8 +285,13 @@ Adam Hilger, NY (1989).
 </P>
 <A NAME = "Hardy"></A>

-<P><B>(Hardy)</B> David, Thesis: Multilevel Summation for the Fast Evaluation
-of Forces for the Simulation of Biomolecules, University of Illinois
-at Urbana-Champaign, (2006).
+<P><B>(Hardy)</B> David Hardy thesis: Multilevel Summation for the Fast
+Evaluation of Forces for the Simulation of Biomolecules, University of
+Illinois at Urbana-Champaign, (2006).
+</P>
+<A NAME = "Hardy2"></A>
+
+<P><B>(Hardy2)</B> Hardy, Stone, Schulten, Parallel Computing 35 (2009)
+164-177.
 </P>
 </HTML>
--- a/doc/kspace_style.txt
+++ b/doc/kspace_style.txt
@ -146,12 +146,13 @@ manual.

 :line

-The {msm} style invokes a multi-level summation method MSM solver
-"(Hardy)"_#Hardy which maps atom charge to a 3d mesh, and uses a 
-multi-level hierarchy of coarser and coarser meshes on which direct
-coulomb solves are done.  This method does not use FFTs and scales
-as N. It may therefore be faster than the other K-space solvers for 
-relatively large problems when running on large core counts.
+The {msm} style invokes a multi-level summation method MSM solver,
+"(Hardy)"_#Hardy or "(Hardy2)"_#Hardy2, which maps atom charge to a 3d
+mesh, and uses a multi-level hierarchy of coarser and coarser meshes
+on which direct coulomb solves are done.  This method does not use
+FFTs and scales as N. It may therefore be faster than the other
+K-space solvers for relatively large problems when running on large
+core counts.

 MSM is most competitive versus Ewald and PPPM when only relatively 
 low accuracy forces, about 1e-4 relative error or less accurate, 
@ -269,6 +270,10 @@ Adam Hilger, NY (1989).
 [(Veld)] In 't Veld, Ismail, Grest, J Chem Phys, in press (2007).

 :link(Hardy)
-[(Hardy)] David, Thesis: Multilevel Summation for the Fast Evaluation
-of Forces for the Simulation of Biomolecules, University of Illinois
-at Urbana-Champaign, (2006).
+[(Hardy)] David Hardy thesis: Multilevel Summation for the Fast
+Evaluation of Forces for the Simulation of Biomolecules, University of
+Illinois at Urbana-Champaign, (2006).
+
+:link(Hardy2)
+[(Hardy2)] Hardy, Stone, Schulten, Parallel Computing 35 (2009)
+164-177.
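
Note on the overlap/minorder text added to kspace_modify in this commit: the documented reduction rule can be illustrated with a small Python sketch. This is not LAMMPS source code. The function name `reduce_order` and the predicate `fits_nearest_neighbor` are hypothetical and introduced only for illustration; the sketch assumes some way exists to test whether a stencil of a given order needs ghost grid points only from nearest-neighbor processors.

```python
from typing import Callable


def reduce_order(order: int, minorder: int, overlap: bool,
                 fits_nearest_neighbor: Callable[[int], bool]) -> int:
    """Return the order that would be used after any automatic reduction.

    Hypothetical helper, not LAMMPS code. `fits_nearest_neighbor(n)` is an
    assumed predicate reporting whether a stencil of order n only needs
    ghost grid points from nearest-neighbor processors.
    """
    if overlap:
        # overlap = yes (the default): communication may extend beyond
        # nearest neighbors, so the order is never reduced.
        return order
    while not fits_nearest_neighbor(order):
        if order <= minorder:
            # minorder reached: no further reduction is allowed.
            raise ValueError("grid communication is non-nearest-neighbor "
                             "and overlap is set to no")
        order -= 1  # reduce the order one step at a time
    return order
```

With a predicate that accepts orders up to 3, `reduce_order(5, 2, False, ...)` steps from 5 down to 3, while setting `minorder` equal to `order` raises an error instead of reducing, matching the behavior described in the new paragraphs.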