mirror of https://github.com/lammps/lammps.git
git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@7732 f3b2605a-c512-4ea7-a41b-209d697bcdaa
This commit is contained in:
parent
8b2ffeddaf
commit
e04e54d0d1
Binary file not shown.
After Width: | Height: | Size: 33 KiB |
|
@ -0,0 +1,206 @@
|
|||
<HTML>
|
||||
<CENTER><A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> - <A HREF = "Manual.html">LAMMPS Documentation</A> - <A HREF = "Section_commands.html#comm">LAMMPS Commands</A>
|
||||
</CENTER>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<HR>
|
||||
|
||||
<H3>balance command
|
||||
</H3>
|
||||
<P>NOTE: the fix balance command referred to here for dynamic load
|
||||
balancing has not yet been released.
|
||||
</P>
|
||||
<P><B>Syntax:</B>
|
||||
</P>
|
||||
<PRE>balance group-ID keyword args ...
|
||||
</PRE>
|
||||
<LI>group-ID = ID of group of atoms to balance across processors
|
||||
</UL>
|
||||
<LI>one or more keyword/arg pairs may be appended
|
||||
|
||||
<LI>keyword = <I>x</I> or <I>y</I> or <I>z</I> or <I>dynamic</I>
|
||||
|
||||
<PRE> <I>x</I> args = <I>uniform</I> or Px-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
<I>y</I> args = <I>uniform</I> or Py-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in x dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
<I>z</I> args = <I>uniform</I> or Pz-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in x dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
<I>dynamic</I> args = Nrepeat Niter dimstr thresh
|
||||
Nrepeat = # of times to repeat dimstr sequence
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
dimstr = sequence of letters containing "x" or "y" or "z"
|
||||
thresh = stop balancing when this imbalance threshhold is reached
|
||||
</PRE>
|
||||
|
||||
</UL>
|
||||
<P><B>Examples:</B>
|
||||
</P>
|
||||
<PRE>balance x uniform y 0.4 0.5 0.6
|
||||
balance dynamic 1 5 xzx 1.1
|
||||
balance dynamic 5 10 x 1.0
|
||||
</PRE>
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>This command adjusts the size of processor sub-domains within the
|
||||
simulation box, to attempt to balance the number of particles and thus
|
||||
the computational cost (load) evenly across processors. The load
|
||||
balancing is "static" in the sense that this command performs the
|
||||
balancing once, before or between simulations. The processor
|
||||
sub-domains will then remain static during the subsequent run. To
|
||||
perform "dynammic" balancing, see the <A HREF = "fix_balance.html">fix balance</A>
|
||||
command, which can adjust processor sub-domain sizes on-the-fly during
|
||||
a <A HREF = "run.html">run</A>.
|
||||
</P>
|
||||
<P>Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution. E.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domain, with one equal-volume sub-domain per procesor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
</P>
|
||||
<P>Note that the <A HREF = "processors.html">processors</A> command gives you some
|
||||
control over how the box volume is split across processors.
|
||||
Specifically, for a Px by Py by Pz grid of processors, it lets you
|
||||
choose Px, Py, and Pz, subject to the constraint that Px * Py * Pz =
|
||||
P, the total number of processors. This can be sufficient to achieve
|
||||
good load-balance for some models on some processor counts. However,
|
||||
all the processor sub-domains will still be the same shape and have
|
||||
the same volume.
|
||||
</P>
|
||||
<P>This command does not alter the topology of the Px by Py by Pz grid or
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
</P>
|
||||
<CENTER><IMG SRC = "JPG/balance.jpg">
|
||||
</CENTER>
|
||||
<P>When the balance command completes, it prints out the change in
|
||||
"imbalance factor". The imbalance factor is defined as the maximum
|
||||
number of particles owned by any processor, divided by the average
|
||||
number of particles per processor. Thus an imbalance factor of 1.0 is
|
||||
perfect balance. For 10000 particles running on 10 processors, if the
|
||||
most heavily loaded processor has 1200 particles, then the factor is
|
||||
1.2, meaning there is a 20% imbalance. The change in the maximum
|
||||
number of particles (on any processor) is also printed.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particuarly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the <A HREF = "kspace_style.html">kspace_style</A> command. Thus
|
||||
you should benchmark the run times of your simulation before and after
|
||||
balancing.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>x</I>, <I>y</I>, and <I>z</I> keywords adjust the position of cutting planes
|
||||
between processor sub-domains in a specific dimension. The <I>uniform</I>
|
||||
argument spaces the planes evenly, as in the left diagram above. The
|
||||
<I>numeric</I> argument requires you to list Ps-1 numbers that specify the
|
||||
position of the cutting planes. This requires that you know Ps = Px
|
||||
or Py or Pz = the number of processors assigned by LAMMPS to the
|
||||
relevant dimension. This assignment is made (and the Px, Py, Pz
|
||||
values printed out) when the simulation box is created by the
|
||||
"create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the "processors" command.
|
||||
</P>
|
||||
<P>Each of the numeric values must be between 0 and 1, and they must be
|
||||
listed in ascending order. They represent the fractional position of
|
||||
the cutting place. The left (or lower) edge of the box is 0.0, and
|
||||
the right (or upper) edge is 1.0. Neither of these values is
|
||||
specified. Only the interior Ps-1 positions are specified. Thus is
|
||||
there are 2 procesors in the x dimension, you specify a single value
|
||||
such as 0.75, which would make the left processor's sub-domain 3x
|
||||
larger than the right processor's sub-domain.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>dynamic</I> keyword changes the cutting planes between processors in
|
||||
an iterative fashion, seeking to reduce the imbalance factor, similar
|
||||
to how the <A HREF = "fix_balance.html">fix balance</A> command operates. Note that
|
||||
this keyword begins its operation from the current processor
|
||||
partitioning, which could be uniform or the result of a previous
|
||||
balance command.
|
||||
</P>
|
||||
<P>The <I>dimstr</I> argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". The characters can appear in any order, and can
|
||||
be repeated as many times as desired. These are all valid <I>dimstr</I>
|
||||
arguments: "x" or "xyzyx" or "yyyzzz".
|
||||
</P>
|
||||
<P>Balancing proceeds by adjusting the cutting planes in each of the
|
||||
dimensions listed in <I>dimstr</I>, one dimension at a time. The entire
|
||||
sequence of dimensions is repeated <I>Nrepeat</I> times. For a single
|
||||
dimension, the balancing operation (described below) is iterated on
|
||||
<I>Niter</I> times. After each dimension finishes, the imbalance factor is
|
||||
re-computed, and the balancing operation halts if the <I>thresh</I>
|
||||
criterion is met.
|
||||
</P>
|
||||
<P>The interplay between <I>Nrepeat</I>, <I>Niter</I>, and <I>dimstr</I> means that
|
||||
these commands do essentially the same thing, the only difference
|
||||
being how often the imbalance factor is computed and checked against
|
||||
the threshhold:
|
||||
</P>
|
||||
<PRE>balance y dynamic 5 10 x 1.2
|
||||
balance y dynamic 1 10 xxxxx 1.2
|
||||
balance y dynamic 50 1 x 1.2
|
||||
</PRE>
|
||||
<P>A rebalance operation in a single dimension is performed using an
|
||||
iterative "diffusive" load-balancing algorithm. One iteration (which
|
||||
is repeated <I>Niter</I> times), works as follows. Assume there are Px
|
||||
processors in the x dimension. This defines Px slices of the
|
||||
simulation, each of which contains Py*Pz processors. The task is to
|
||||
adjust the position of the Px-1 cuts between slices, leaving the end
|
||||
cuts unchanged (left and right edges of the simulation box).
|
||||
</P>
|
||||
<P>The iteration beings by calculating the number of atoms within each of
|
||||
the Px slices. Then for each slice, its atom count is compared to its
|
||||
neighbors. If a slice has more atoms than its left (or right)
|
||||
neighbor, the cut is moved towards the center of the slice,
|
||||
effectively shrinking the width of the slice and migrating atoms to
|
||||
the other slice. The distance to move the cut is a function of the
|
||||
"density" of atoms in the donor slice and the difference in counts
|
||||
between the 2 slices. A damping factor is also applied to avoid
|
||||
oscillations in the position of the cutting plane as iterations
|
||||
proceed. Hence the "diffusive" nature of the algorithm as work
|
||||
(atoms) effectively diffuses from highly loaded processors to
|
||||
less-loaded processors.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P><B>Restrictions:</B>
|
||||
</P>
|
||||
<P>The <I>dynamic</I> keyword cannot be used with the <I>x</I>, <I>y</I>, or <I>z</I>
|
||||
arguments.
|
||||
</P>
|
||||
<P>For 2d simulations, the <I>z</I> keyword cannot be used. Nor can a "z"
|
||||
appear in <I>dimstr</I> for the <I>dynamic</I> keyword.
|
||||
</P>
|
||||
<P><B>Related commands:</B>
|
||||
</P>
|
||||
<P><A HREF = "processors.html">processors</A>, <A HREF = "fix_balance.html">fix balance</A>
|
||||
</P>
|
||||
<P><B>Default:</B> none
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>add a citation and image
|
||||
</P>
|
||||
</HTML>
|
|
@ -0,0 +1,197 @@
|
|||
"LAMMPS WWW Site"_lws - "LAMMPS Documentation"_ld - "LAMMPS Commands"_lc :c
|
||||
|
||||
:link(lws,http://lammps.sandia.gov)
|
||||
:link(ld,Manual.html)
|
||||
:link(lc,Section_commands.html#comm)
|
||||
|
||||
:line
|
||||
|
||||
balance command :h3
|
||||
|
||||
NOTE: the fix balance command referred to here for dynamic load
|
||||
balancing has not yet been released.
|
||||
|
||||
[Syntax:]
|
||||
|
||||
balance group-ID keyword args ... :pre
|
||||
|
||||
group-ID = ID of group of atoms to balance across processors :ule,l
|
||||
one or more keyword/arg pairs may be appended :l
|
||||
keyword = {x} or {y} or {z} or {dynamic} :l
|
||||
{x} args = {uniform} or Px-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
{y} args = {uniform} or Py-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in x dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
{z} args = {uniform} or Pz-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in x dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
{dynamic} args = Nrepeat Niter dimstr thresh
|
||||
Nrepeat = # of times to repeat dimstr sequence
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
dimstr = sequence of letters containing "x" or "y" or "z"
|
||||
thresh = stop balancing when this imbalance threshhold is reached :pre
|
||||
:ule
|
||||
|
||||
[Examples:]
|
||||
|
||||
balance x uniform y 0.4 0.5 0.6
|
||||
balance dynamic 1 5 xzx 1.1
|
||||
balance dynamic 5 10 x 1.0 :pre
|
||||
|
||||
[Description:]
|
||||
|
||||
This command adjusts the size of processor sub-domains within the
|
||||
simulation box, to attempt to balance the number of particles and thus
|
||||
the computational cost (load) evenly across processors. The load
|
||||
balancing is "static" in the sense that this command performs the
|
||||
balancing once, before or between simulations. The processor
|
||||
sub-domains will then remain static during the subsequent run. To
|
||||
perform "dynammic" balancing, see the "fix balance"_fix_balance.html
|
||||
command, which can adjust processor sub-domain sizes on-the-fly during
|
||||
a "run"_run.html.
|
||||
|
||||
Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution. E.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domain, with one equal-volume sub-domain per procesor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
|
||||
Note that the "processors"_processors.html command gives you some
|
||||
control over how the box volume is split across processors.
|
||||
Specifically, for a Px by Py by Pz grid of processors, it lets you
|
||||
choose Px, Py, and Pz, subject to the constraint that Px * Py * Pz =
|
||||
P, the total number of processors. This can be sufficient to achieve
|
||||
good load-balance for some models on some processor counts. However,
|
||||
all the processor sub-domains will still be the same shape and have
|
||||
the same volume.
|
||||
|
||||
This command does not alter the topology of the Px by Py by Pz grid or
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
|
||||
:c,image(JPG/balance.jpg)
|
||||
|
||||
When the balance command completes, it prints out the change in
|
||||
"imbalance factor". The imbalance factor is defined as the maximum
|
||||
number of particles owned by any processor, divided by the average
|
||||
number of particles per processor. Thus an imbalance factor of 1.0 is
|
||||
perfect balance. For 10000 particles running on 10 processors, if the
|
||||
most heavily loaded processor has 1200 particles, then the factor is
|
||||
1.2, meaning there is a 20% imbalance. The change in the maximum
|
||||
number of particles (on any processor) is also printed.
|
||||
|
||||
IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particuarly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the "kspace_style"_kspace_style.html command. Thus
|
||||
you should benchmark the run times of your simulation before and after
|
||||
balancing.
|
||||
|
||||
:line
|
||||
|
||||
The {x}, {y}, and {z} keywords adjust the position of cutting planes
|
||||
between processor sub-domains in a specific dimension. The {uniform}
|
||||
argument spaces the planes evenly, as in the left diagram above. The
|
||||
{numeric} argument requires you to list Ps-1 numbers that specify the
|
||||
position of the cutting planes. This requires that you know Ps = Px
|
||||
or Py or Pz = the number of processors assigned by LAMMPS to the
|
||||
relevant dimension. This assignment is made (and the Px, Py, Pz
|
||||
values printed out) when the simulation box is created by the
|
||||
"create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the "processors" command.
|
||||
|
||||
Each of the numeric values must be between 0 and 1, and they must be
|
||||
listed in ascending order. They represent the fractional position of
|
||||
the cutting place. The left (or lower) edge of the box is 0.0, and
|
||||
the right (or upper) edge is 1.0. Neither of these values is
|
||||
specified. Only the interior Ps-1 positions are specified. Thus is
|
||||
there are 2 procesors in the x dimension, you specify a single value
|
||||
such as 0.75, which would make the left processor's sub-domain 3x
|
||||
larger than the right processor's sub-domain.
|
||||
|
||||
:line
|
||||
|
||||
The {dynamic} keyword changes the cutting planes between processors in
|
||||
an iterative fashion, seeking to reduce the imbalance factor, similar
|
||||
to how the "fix balance"_fix_balance.html command operates. Note that
|
||||
this keyword begins its operation from the current processor
|
||||
partitioning, which could be uniform or the result of a previous
|
||||
balance command.
|
||||
|
||||
The {dimstr} argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". The characters can appear in any order, and can
|
||||
be repeated as many times as desired. These are all valid {dimstr}
|
||||
arguments: "x" or "xyzyx" or "yyyzzz".
|
||||
|
||||
Balancing proceeds by adjusting the cutting planes in each of the
|
||||
dimensions listed in {dimstr}, one dimension at a time. The entire
|
||||
sequence of dimensions is repeated {Nrepeat} times. For a single
|
||||
dimension, the balancing operation (described below) is iterated on
|
||||
{Niter} times. After each dimension finishes, the imbalance factor is
|
||||
re-computed, and the balancing operation halts if the {thresh}
|
||||
criterion is met.
|
||||
|
||||
The interplay between {Nrepeat}, {Niter}, and {dimstr} means that
|
||||
these commands do essentially the same thing, the only difference
|
||||
being how often the imbalance factor is computed and checked against
|
||||
the threshhold:
|
||||
|
||||
balance y dynamic 5 10 x 1.2
|
||||
balance y dynamic 1 10 xxxxx 1.2
|
||||
balance y dynamic 50 1 x 1.2 :pre
|
||||
|
||||
A rebalance operation in a single dimension is performed using an
|
||||
iterative "diffusive" load-balancing algorithm. One iteration (which
|
||||
is repeated {Niter} times), works as follows. Assume there are Px
|
||||
processors in the x dimension. This defines Px slices of the
|
||||
simulation, each of which contains Py*Pz processors. The task is to
|
||||
adjust the position of the Px-1 cuts between slices, leaving the end
|
||||
cuts unchanged (left and right edges of the simulation box).
|
||||
|
||||
The iteration beings by calculating the number of atoms within each of
|
||||
the Px slices. Then for each slice, its atom count is compared to its
|
||||
neighbors. If a slice has more atoms than its left (or right)
|
||||
neighbor, the cut is moved towards the center of the slice,
|
||||
effectively shrinking the width of the slice and migrating atoms to
|
||||
the other slice. The distance to move the cut is a function of the
|
||||
"density" of atoms in the donor slice and the difference in counts
|
||||
between the 2 slices. A damping factor is also applied to avoid
|
||||
oscillations in the position of the cutting plane as iterations
|
||||
proceed. Hence the "diffusive" nature of the algorithm as work
|
||||
(atoms) effectively diffuses from highly loaded processors to
|
||||
less-loaded processors.
|
||||
|
||||
:line
|
||||
|
||||
[Restrictions:]
|
||||
|
||||
The {dynamic} keyword cannot be used with the {x}, {y}, or {z}
|
||||
arguments.
|
||||
|
||||
For 2d simulations, the {z} keyword cannot be used. Nor can a "z"
|
||||
appear in {dimstr} for the {dynamic} keyword.
|
||||
|
||||
[Related commands:]
|
||||
|
||||
"processors"_processors.html, "fix balance"_fix_balance.html
|
||||
|
||||
[Default:] none
|
||||
|
||||
:line
|
||||
|
||||
add a citation and image
|
Loading…
Reference in New Issue