From d17e06c479e961879a1c3b0a3ba5cefa2e5c1121 Mon Sep 17 00:00:00 2001
From: sjplimp The LAMMPS "version" is the date when it was released, such as 1 May
-2010. LAMMPS is updated continuously. Whenever we fix a bug or add a
-feature, we release it immediately, and post a notice on this page of
-the WWW site. Each dated copy of LAMMPS contains all the
-features and bug-fixes up to and including that version date. The
-version date is printed to the screen and logfile every time you run
-LAMMPS. It is also in the file src/version.h and in the LAMMPS
-directory name created when you unpack a tarball, and at the top of
-the first page of the manual (this page).
-
-
-
-
-LAMMPS Documentation
-
10 May 2014 version
-
Version info:
-
-
LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel -Simulator. -
-LAMMPS is a classical molecular dynamics simulation code designed to -run efficiently on parallel computers. It was developed at Sandia -National Laboratories, a US Department of Energy facility, with -funding from the DOE. It is an open-source code, distributed freely -under the terms of the GNU Public License (GPL). -
-The primary developers of LAMMPS are Steve Plimpton, Aidan -Thompson, and Paul Crozier who can be contacted at -sjplimp,athomps,pscrozi at sandia.gov. The LAMMPS WWW Site at -http://lammps.sandia.gov has more information about the code and its -uses. -
- -The LAMMPS documentation is organized into the following sections. If -you find errors or omissions in this manual or have suggestions for -useful information to add, please send an email to the developers so -we can improve the LAMMPS documentation. -
-Once you are familiar with LAMMPS, you may want to bookmark this -page at Section_commands.html#comm since -it gives quick access to documentation for all LAMMPS commands. -
-PDF file of the entire manual, generated by -htmldoc -
-Settings:
-communicate, group, mass, +
comm_style, group, mass, min_modify, min_style, neigh_modify, neighbor, reset_timestep, run_style, @@ -362,20 +362,21 @@ in the command's documentation.
These are commands contributed by users, which can be used if LAMMPS
diff --git a/doc/Section_commands.txt b/doc/Section_commands.txt
index c9acfcd8f4..133972811f 100644
--- a/doc/Section_commands.txt
+++ b/doc/Section_commands.txt
@@ -305,7 +305,7 @@ Force fields:
Settings:
-"communicate"_communicate.html, "group"_group.html, "mass"_mass.html,
+"comm_style"_comm_style.html, "group"_group.html, "mass"_mass.html,
"min_modify"_min_modify.html, "min_style"_min_style.html,
"neigh_modify"_neigh_modify.html, "neighbor"_neighbor.html,
"reset_timestep"_reset_timestep.html, "run_style"_run_style.html,
@@ -367,7 +367,8 @@ in the command's documentation.
"box"_box.html,
"change_box"_change_box.html,
"clear"_clear.html,
-"communicate"_communicate.html,
+"comm_modify"_comm_modify.html,
+"comm_style"_comm_style.html,
"compute"_compute.html,
"compute_modify"_compute_modify.html,
"create_atoms"_create_atoms.html,
diff --git a/doc/balance.html b/doc/balance.html
index 94088e0e01..22ffe25648 100644
--- a/doc/balance.html
+++ b/doc/balance.html
@@ -13,111 +13,178 @@
Syntax:
Examples:
Description:
This command adjusts the size of processor sub-domains within the
-simulation box, to attempt to balance the number of particles and thus
-the computational cost (load) evenly across processors. The load
-balancing is "static" in the sense that this command performs the
-balancing once, before or between simulations. The processor
-sub-domains will then remain static during the subsequent run. To
-perform "dynamic" balancing, see the fix balance
-command, which can adjust processor sub-domain sizes on-the-fly during
-a run.
+ IMPORTANT NOTE: The rcb style is not yet implemented.
Load-balancing is only useful if the particles in the simulation box
-have a spatially-varying density distribution. E.g. a model of a
-vapor/liquid interface, or a solid with an irregular-shaped geometry
-containing void regions. In this case, the LAMMPS default of dividing
-the simulation box volume into a regular-spaced grid of processor
-sub-domain, with one equal-volume sub-domain per procesor, may assign
-very different numbers of particles per processor. This can lead to
-poor performance in a scalability sense, when the simulation is run in
+ This command adjusts the size and shape of processor sub-domains
+within the simulation box, to attempt to balance the number of
+particles and thus the computational cost (load) evenly across
+processors. The load balancing is "static" in the sense that this
+command performs the balancing once, before or between simulations.
+The processor sub-domains will then remain static during the
+subsequent run. To perform "dynamic" balancing, see the fix
+balance command, which can adjust processor
+sub-domain sizes and shapes on-the-fly during a run.
+ Load-balancing is typically only useful if the particles in the
+simulation box have a spatially-varying density distribution. E.g. a
+model of a vapor/liquid interface, or a solid with an irregular-shaped
+geometry containing void regions. In this case, the LAMMPS default of
+dividing the simulation box volume into a regular-spaced grid of 3d
+bricks, with one equal-volume sub-domain per processor, may assign very
+different numbers of particles per processor. This can lead to poor
+performance in a scalability sense, when the simulation is run in
parallel.
Note that the processors command gives you control
+ Note that the processors command allows some control
over how the box volume is split across processors. Specifically, for
-a Px by Py by Pz grid of processors, it chooses or lets you choose Px,
-Py, and Pz, subject to the constraint that Px * Py * Pz = P, the total
-number of processors. This is sufficient to achieve good load-balance
-for many models on many processor counts. However, all the processor
-sub-domains will still be the same shape and have the same volume.
+a Px by Py by Pz grid of processors, it allows choice of Px, Py, and
+Pz, subject to the constraint that Px * Py * Pz = P, the total number
+of processors. This is sufficient to achieve good load-balance for
+many models on many processor counts. However, all the processor
+sub-domains will still have the same shape and same volume.
This command does not alter the topology of the Px by Py by Pz grid or
-processors. But it shifts the cutting planes between processors (in
-3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
-each processor, as in the following 2d diagram. The left diagram is
-the default partitioning of the simulation box across processors (one
-sub-box for each of 16 processors); the right diagram is after
-balancing.
- When the balance command completes, it prints out the final positions
-of all cutting planes in each of the 3 dimensions (as fractions of the
-box length). It also prints statistics about its results, including
-the change in "imbalance factor". This factor is defined as the
-maximum number of particles owned by any processor, divided by the
+ The requested load-balancing operation is only performed if the
+current "imbalance factor" in particles owned by each processor
+exceeds the specified thresh parameter. This factor is defined as
+the maximum number of particles owned by any processor, divided by the
average number of particles per processor. Thus an imbalance factor
of 1.0 is perfect balance. For 10000 particles running on 10
processors, if the most heavily loaded processor has 1200 particles,
-then the factor is 1.2, meaning there is a 20% imbalance. The change
-in the maximum number of particles (on any processor) is also printed.
+then the factor is 1.2, meaning there is a 20% imbalance. Note that a
+re-balance can be forced even if the current balance is perfect (1.0)
+by specifying a thresh < 1.0.
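
The imbalance factor defined above can be illustrated with a short sketch. This is not LAMMPS code: the function names `imbalance_factor` and `needs_rebalance` are hypothetical, and the per-processor counts are made up to match the 10000-particle, 10-processor example in the text.

```python
def imbalance_factor(counts):
    """Maximum particles owned by any processor divided by the average per processor."""
    average = sum(counts) / len(counts)
    return max(counts) / average


def needs_rebalance(counts, thresh):
    """Hypothetical helper: rebalance only if the factor exceeds thresh."""
    return imbalance_factor(counts) > thresh


# Illustrative counts: 10000 particles on 10 processors,
# with the most heavily loaded processor owning 1200 particles.
counts = [1200] + [978] * 8 + [976]
factor = imbalance_factor(counts)

# A perfectly balanced system has factor 1.0, so any thresh < 1.0
# still triggers a rebalance.
perfect = [1000] * 10
forced = needs_rebalance(perfect, 0.9)
```

With these counts the factor is 1200 / 1000, i.e. a 20% imbalance, so a thresh of 1.1 would trigger balancing and a thresh of 1.3 would not.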
+ When the balance command completes, it prints statistics about its
+results, including the change in the imbalance factor and the change
+in the maximum number of particles (on any processor). For "grid"
+methods (defined below) that create a logical 3d grid of processors,
+the positions of all cutting planes in each of the 3 dimensions (as
+fractions of the box length) are also printed.
IMPORTANT NOTE: This command attempts to minimize the imbalance
-factor, as defined above. But because of the topology constraint that
-only the cutting planes (lines) between processors are moved, there
-are many irregular distributions of particles, where this factor
-cannot be shrunk to 1.0, particuarly in 3d. Also, computational cost
-is not strictly proportional to particle count, and changing the
-relative size and shape of processor sub-domains may lead to
-additional computational and communication overheads, e.g. in the PPPM
-solver used via the kspace_style command. Thus
-you should benchmark the run times of your simulation before and after
-balancing.
+factor, as defined above. But depending on the method a perfect
+balance (1.0) may not be achieved. For example, "grid" methods
+(defined below) that create a logical 3d grid cannot achieve perfect
+balance for many irregular distributions of particles. Likewise, if a
+portion of the system is a perfect lattice, e.g. the initial system is
+generated by the create_atoms command, then "grid"
+methods may be unable to achieve exact balance. This is because
+entire lattice planes will be owned or not owned by a single
+processor.
+ IMPORTANT NOTE: Computational cost is not strictly proportional to
+particle count, and changing the relative size and shape of processor
+sub-domains may lead to additional computational and communication
+overheads, e.g. in the PPPM solver used via the
+kspace_style command. Thus you should benchmark
+the run times of a simulation before and after balancing.
The x, y, and z keywords adjust the position of cutting planes
-between processor sub-domains in a specific dimension. The uniform
-argument spaces the planes evenly, as in the left diagram above. The
-numeric argument requires you to list Ps-1 numbers that specify the
-position of the cutting planes. This requires that you know Ps = Px
-or Py or Pz = the number of processors assigned by LAMMPS to the
-relevant dimension. This assignment is made (and the Px, Py, Pz
-values printed out) when the simulation box is created by the
-"create_box" or "read_data" or "read_restart" command and is
-influenced by the settings of the "processors" command.
+ The method used to perform a load balance is specified by one of the
+listed styles, which are described in detail below. There are 2 kinds
+of styles.
+ The x, y, z, and shift styles are "grid" methods which produce
+a logical 3d grid of processors. They operate by changing the cutting
+planes (or lines) between processors in 3d (or 2d), to adjust the
+volume (area in 2d) assigned to each processor, as in the following 2d
+diagram. The left diagram is the default partitioning of the
+simulation box across processors (one sub-box for each of 16
+processors); the right diagram is after balancing.
+ The rcb style is a "tiling" method which does not produce a logical
+3d grid of processors. Rather it tiles the simulation domain with
+rectangular sub-boxes of varying size and shape in an irregular
+fashion so as to have equal numbers of particles in each sub-box, as
+in the following 2d diagram. Again the left diagram is the default
+partitioning of the simulation box across processors (one sub-box for
+each of 16 processors); the right diagram is after balancing.
+ NOTE: Need a diagram of RCB partitioning.
+ The "grid" methods can be used with either of the
+comm_style command options, brick or tiled. The
+"tiling" methods can only be used with comm_style
+tiled. Note that it can be useful to use a "grid"
+method with comm_style tiled to return the domain
+partitioning to a logical 3d grid of processors so that "comm_style
+brick" can be used for subsequent run commands.
+ When a "grid" method is specified, the current domain partitioning can
+be either a logical 3d grid or a tiled partitioning. In the former
+case, the current logical 3d grid is used as a starting point and
+changes are made to improve the imbalance factor. In the latter case,
+the tiled partitioning is discarded and a logical 3d grid is created
+with uniform spacing in all dimensions. This becomes the starting
+point for the balancing operation.
+ When a "tiling" method is specified, the current domain partitioning
+("grid" or "tiled") is ignored, and a new partitioning is computed
+from scratch.
+ The x, y, and z styles invoke a "grid" method for balancing, as
+described above. Note that any or all of these 3 styles can be
+specified together, one after the other. These styles adjust the
+position of cutting planes between processor sub-domains in specific
+dimensions. Only the specified dimensions are altered.
+ The uniform argument spaces the planes evenly, as in the left
+diagrams above. The numeric argument requires listing Ps-1 numbers
+that specify the position of the cutting planes. This requires
+knowing Ps = Px or Py or Pz = the number of processors assigned by
+LAMMPS to the relevant dimension. This assignment is made (and the
+Px, Py, Pz values printed out) when the simulation box is created by
+the "create_box" or "read_data" or "read_restart" command and is
+influenced by the settings of the processors
+command.
Each of the numeric values must be between 0 and 1, and they must be
listed in ascending order. They represent the fractional position of
@@ -130,12 +197,11 @@ larger than the right processor's sub-domain.
The dynamic keyword changes the cutting planes between processors in
-an iterative fashion, seeking to reduce the imbalance factor, similar
-to how the fix balance command operates. Note that
-this keyword begins its operation from the current processor
-partitioning, which could be uniform or the result of a previous
-balance command.
+ The shift style invokes a "grid" method for balancing, as
+described above. It changes the positions of cutting planes between
+processors in an iterative fashion, seeking to reduce the imbalance
+factor, similar to how the fix balance shift
+command operates.
The dimstr argument is a string of characters, each of which must be
an "x" or "y" or "z". Each character can appear zero or one time,
@@ -147,14 +213,14 @@ to be a density variation in the particles.
dimensions listed in dimstr, one dimension at a time. For a single
dimension, the balancing operation (described below) is iterated on up
to Niter times. After each dimension finishes, the imbalance factor
-is re-computed, and the balancing operation halts if the thresh
+is re-computed, and the balancing operation halts if the stopthresh
criterion is met.
A rebalance operation in a single dimension is performed using a
recursive multisectioning algorithm, where the position of each
cutting plane (line in 2d) in the dimension is adjusted independently.
-This is similar to a recursive bisectioning (RCB) for a single value,
-except that the bounds used for each bisectioning take advantage of
+This is similar to a recursive bisectioning for a single value, except
+that the bounds used for each bisectioning take advantage of
information from neighboring cuts if possible. At each iteration, the
count of particles on either side of each plane is tallied. If the
counts do not match the target value for the plane, the position of
@@ -168,26 +234,27 @@ plane gets closer to the target value.
assigned, particles are migrated to their new owning processor, and
the balance procedure ends.
IMPORTANT NOTE: At each rebalance operation, the RCB for each cutting
-plane (line in 2d) typcially starts with low and high bounds separated
-by the extent of a processor's sub-domain in one dimension. The size
-of this bracketing region shrinks by 1/2 every iteration. Thus if
-Niter is specified as 10, the cutting plane will typically be
-positioned to 1 part in 1000 accuracy (relative to the perfect target
-position). For Niter = 20, it will be accurate to 1 part in a
-million. Tus there is no need ot set Niter to a large value.
+ IMPORTANT NOTE: At each rebalance operation, the bisectioning for each
+cutting plane (line in 2d) typically starts with low and high bounds
+separated by the extent of a processor's sub-domain in one dimension.
+The size of this bracketing region shrinks by 1/2 every iteration.
+Thus if Niter is specified as 10, the cutting plane will typically
+be positioned to 1 part in 1000 accuracy (relative to the perfect
+target position). For Niter = 20, it will be accurate to 1 part in
+a million. Thus there is no need to set Niter to a large value.
LAMMPS will check if the threshold accuracy is reached (in a
dimension) in fewer iterations than Niter and exit early. However,
Niter should also not be set too small, since it will take roughly
the same number of iterations to converge even if the cutting plane is
initially close to the target value.
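
The accuracy figures in the note above follow from halving the bracketing region at every iteration. The sketch below is a plain 1d bisection written for illustration, not the recursive multisectioning algorithm LAMMPS uses; `bracket_width_after` and `bisect_cut` are hypothetical names, and the particle coordinates are made up.

```python
def bracket_width_after(niter, initial_width=1.0):
    """Width of the bracketing region after niter halvings."""
    return initial_width * 0.5 ** niter


def bisect_cut(coords, target, lo, hi, niter):
    """Move a single cut between lo and hi until `target` particles lie to its left.

    Stops early if the count matches the target, otherwise returns the
    midpoint of the final bracket after niter iterations.
    """
    for _ in range(niter):
        mid = 0.5 * (lo + hi)
        left = sum(1 for x in coords if x < mid)
        if left < target:
            lo = mid   # too few particles on the left: move the cut right
        elif left > target:
            hi = mid   # too many particles on the left: move the cut left
        else:
            return mid
    return 0.5 * (lo + hi)


width_10 = bracket_width_after(10)   # 1/1024, roughly 1 part in 1000
width_20 = bracket_width_after(20)   # 1/1048576, roughly 1 part in a million

# Illustrative 1d system: 100 evenly spaced particles in [0, 1).
coords = [i / 100 for i in range(100)]
cut = bisect_cut(coords, target=25, lo=0.0, hi=1.0, niter=10)
```

Because the bracket shrinks geometrically, going from 10 to 20 iterations improves the positioning by a further factor of about 1000, which is why large Niter values bring little benefit.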
IMPORTANT NOTE: If a portion of your system is a perfect lattice,
-e.g. the intiial system is generated by the
-create_atoms command, then the balancer may be
-unable to achieve exact balance. I.e. entire lattice planes will be
-owned or not owned by a single processor. So you you should not
-expect to achieve perfect balance in this case.
+ The rcb style invokes a "tiling" method for balancing, as described
+above. It performs a recursive coordinate bisectioning (RCB) of the
+simulation domain.
+ Need further description of RCB.
Restrictions:
The dynamic keyword cannot be used with the x, y, or z
-arguments.
- For 2d simulations, the z keyword cannot be used. Nor can a "z"
-appear in dimstr for the dynamic keyword.
+ For 2d simulations, the z style cannot be used. Nor can a "z"
+appear in dimstr for the shift style.
Related commands:
Syntax:
Examples:
Description:
This command sets the style of inter-processor communication that
-occurs each timestep as atom coordinates and other properties are
-exchanged between neighboring processors and stored as properties of
-ghost atoms.
+ This command sets parameters that affect the inter-processor
+communication of atom information that occurs each timestep as
+coordinates and other properties are exchanged between neighboring
+processors and stored as properties of ghost atoms.
The default style is single which means each processor acquires
+ IMPORTANT NOTE: These options apply to the currently defined comm
+style. When you specify a comm_style command, all
+communication settings are restored to their default values, including
+those previously reset by a comm_modify command. Thus if your input
+script specifies a comm_style command, you should use the comm_modify
+command after it.
+ The mode keyword determines whether a single cutoff distance or
+multiple cutoff distances are used to determine which atoms to communicate.
+ The default mode is single which means each processor acquires
information for ghost atoms that are within a single distance from its
sub-domain. The distance is the maximum of the neighbor cutoff for
all atom type pairs.
For many systems this is an efficient algorithm, but for systems with
-widely varying cutoffs for different type pairs, the multi style can
+widely varying cutoffs for different type pairs, the multi mode can
be faster. In this case, each atom type is assigned its own distance
cutoff for communication purposes, and fewer atoms will be
communicated. See the neighbor multi command for a
neighbor list construction option that may also be beneficial for
simulations of this kind.
The cutoff option allows you to set a ghost cutoff distance, which
+ The cutoff keyword allows you to set a ghost cutoff distance, which
is the distance from the borders of a processor's sub-domain at which
ghost atoms are acquired from other processors. By default the ghost
cutoff = neighbor cutoff = pairwise force cutoff + neighbor skin. See
@@ -105,14 +114,14 @@ will typically lead to bad dynamics (i.e. the bond length is now the
simulation box length). To detect if this is happening, see the
neigh_modify cluster command.
The group option will limit communication to atoms in the specified
+ The group keyword will limit communication to atoms in the specified
group. This can be useful for models where no ghost atoms are needed
for some kinds of particles. All atoms (not just those in the
specified group) will still migrate to new processors as they move.
The group specified with this option must also be specified via the
atom_modify first command.
The vel option enables velocity information to be communicated with
+ The vel keyword enables velocity information to be communicated with
ghost particles. Depending on the atom_style,
velocity info includes the translational velocity, angular velocity,
and angular momentum of a particle. If the vel option is set to
@@ -131,12 +140,12 @@ that boundary (e.g. due to dilation or shear).
Related commands:
neighbor
+ Default:
The default settings are style = single, group = all, cutoff = 0.0,
-vel = no. The cutoff default of 0.0 means that ghost cutoff =
-neighbor cutoff = pairwise force cutoff + neighbor skin.
+ The option defaults are mode = single, group = all, cutoff = 0.0, vel =
+no. The cutoff default of 0.0 means that ghost cutoff = neighbor
+cutoff = pairwise force cutoff + neighbor skin.
balance keyword args ...
+
balance thresh style args keyword value ...
- x args = uniform or Px-1 numbers between 0 and 1
- uniform = evenly spaced cuts between processors in x dimension
- numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
- y args = uniform or Py-1 numbers between 0 and 1
- uniform = evenly spaced cuts between processors in y dimension
- numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
- z args = uniform or Pz-1 numbers between 0 and 1
- uniform = evenly spaced cuts between processors in z dimension
- numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
- dynamic args = dimstr Niter thresh
- dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
- Niter = # of times to iterate within each dimension of dimstr sequence
- thresh = stop balancing when this imbalance threshhold is reached
- out arg = filename
- filename = output file to write each processor's sub-domain to
+
x args = uniform or Px-1 numbers between 0 and 1
+ uniform = evenly spaced cuts between processors in x dimension
+ numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
+ x can be specified together with y or z
+ y args = uniform or Py-1 numbers between 0 and 1
+ uniform = evenly spaced cuts between processors in y dimension
+ numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
+ y can be specified together with x or z
+ z args = uniform or Pz-1 numbers between 0 and 1
+ uniform = evenly spaced cuts between processors in z dimension
+ numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
+ z can be specified together with x or y
+ shift args = dimstr Niter stopthresh
+ dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
+ Niter = # of times to iterate within each dimension of dimstr sequence
+ stopthresh = stop balancing when this imbalance threshold is reached
+ rcb args = none
+
+ out value = filename
+ filename = write each processor's sub-domain to a file
balance x uniform y 0.4 0.5 0.6
-balance dynamic xz 5 1.1
-balance dynamic x 20 1.0 out tmp.balance
+
balance 0.9 x uniform y 0.4 0.5 0.6
+balance 1.2 shift xz 5 1.1
+balance 1.0 shift xz 5 1.1
+balance 1.1 rcb
+balance 1.0 shift x 20 1.0 out tmp.balance
-
-
+
+
+
-
+
+
@@ -242,11 +309,8 @@ only 10 unique vertices in total.
-communicate command
+
comm_modify command
communicate style keyword value ...
+
comm_modify keyword value ...
- cutoff value = Rcut (distance units) = communicate atoms from this far away
+
mode value = single or multi = communicate atoms within a single or multiple distances
+ cutoff value = Rcut (distance units) = communicate atoms from this far away
group value = group-ID = only communicate atoms in the group
vel value = yes or no = do or do not communicate velocity info with ghost atoms
@@ -29,32 +28,42 @@
communicate multi
-communicate multi group solvent
-communicate single vel yes
-communicate single cutoff 5.0 vel yes
+
comm_modify mode multi
+comm_modify mode multi group solvent
+comm_modify vel yes
+comm_modify cutoff 5.0 vel yes