git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@7342 f3b2605a-c512-4ea7-a41b-209d697bcdaa

This commit is contained in:
sjplimp 2011-12-13 15:58:47 +00:00
parent 8325ae6954
commit 9b72a103ea
4 changed files with 375 additions and 129 deletions


@ -853,6 +853,7 @@ letter abbreviation can be used:
<LI>-p or -partition <LI>-p or -partition
<LI>-pl or -plog <LI>-pl or -plog
<LI>-ps or -pscreen <LI>-ps or -pscreen
<LI>-r or -reorder
<LI>-sc or -screen <LI>-sc or -screen
<LI>-sf or -suffix <LI>-sf or -suffix
<LI>-v or -var <LI>-v or -var
@ -961,10 +962,78 @@ partition screen files are created. This overrides the filename
specified in the -screen command-line option. This option is useful specified in the -screen command-line option. This option is useful
when working with large numbers of partitions, allowing the partition when working with large numbers of partitions, allowing the partition
screen files to be suppressed (-pscreen none) or placed in a screen files to be suppressed (-pscreen none) or placed in a
sub-directory (-pscreen replica_files/screen) If this option is not sub-directory (-pscreen replica_files/screen). If this option is not
used the screen file for partition N is screen.N or whatever is used the screen file for partition N is screen.N or whatever is
specified by the -screen command-line option. specified by the -screen command-line option.
</P> </P>
<PRE>-reorder nth N
-reorder custom filename
</PRE>
<P>Reorder the processors in the MPI communicator used to instantiate
LAMMPS, in one of several ways. The original MPI communicator ranks
all P processors from 0 to P-1. The mapping of these ranks to
physical processors is done by MPI before LAMMPS begins. It may be
useful in some cases to alter the rank order, e.g. to ensure that
cores within each node are ranked in a desired order, or, when using
the <A HREF = "run_style.html">run_style verlet/split</A> command with 2 partitions,
to ensure that a specific Kspace processor (in the 2nd partition) is
matched up with a specific set of processors in the 1st partition.
See the <A HREF = "Section_accelerate.html">Section_accelerate</A> doc pages for
more details.
</P>
<P>If the keyword <I>nth</I> is used with a setting <I>N</I>, then it means every
Nth processor will be moved to the end of the ranking. This is useful
when using the <A HREF = "run_style.html">run_style verlet/split</A> command with 2
partitions via the -partition command-line switch. The first set of
processors will be in the first partition, the 2nd set in the 2nd
partition. The -reorder command-line switch can alter this so that
groups of N-1 procs from the 1st partition and one proc from the 2nd
partition are ordered consecutively, e.g. as the cores on one physical node.
This can boost performance. For example, if you use "-reorder nth 4"
and "-partition 9 3" and you are running on 12 processors, the
processors will be reordered from
</P>
<PRE>0 1 2 3 4 5 6 7 8 9 10 11
</PRE>
<P>to
</P>
<PRE>0 1 2 4 5 6 8 9 10 3 7 11
</PRE>
<P>so that the processors in each partition will be
</P>
<PRE>0 1 2 4 5 6 8 9 10
3 7 11
</PRE>
<P>See the <A HREF = "processors.html">processors</A> command for how to ensure that
processors from each partition are then grouped optimally on quad-core nodes.
</P>
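As an illustration, the <I>nth</I> rule described above can be sketched in a few lines of Python. This is a hypothetical helper written for this doc page, not LAMMPS source code:

```python
# Hypothetical sketch of the "-reorder nth N" rule (illustration only,
# not LAMMPS source): every Nth rank is moved to the end of the ranking.
def reorder_nth(P, N):
    moved = [r for r in range(P) if (r + 1) % N == 0]   # ranks N-1, 2N-1, ...
    kept = [r for r in range(P) if (r + 1) % N != 0]
    return kept + moved

# The 12-processor example from the text with "-reorder nth 4":
print(reorder_nth(12, 4))  # [0, 1, 2, 4, 5, 6, 8, 9, 10, 3, 7, 11]
```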
<P>If the keyword is <I>custom</I>, then a file that specifies a permutation
of the processor ranks is also specified. The format of the reorder
file is as follows. Any number of initial blank or comment lines
(starting with a "#" character) can be present. These should be
followed by P lines of the form:
</P>
<PRE>I J
</PRE>
<P>where P is the number of processors LAMMPS was launched with. Note
that if running in multi-partition mode (see the -partition switch
above) P is the total number of processors in all partitions. The I
and J values describe a permutation of the P processors. Every I and
J should be values from 0 to P-1 inclusive. In the set of P I values,
every proc ID should appear exactly once. Ditto for the set of P J
values. A single I,J pairing means that the physical processor with
rank I in the original MPI communicator will have rank J in the
reordered communicator.
</P>
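The file format rules above (blank/comment lines, then P lines of I,J pairs forming a permutation) can be sketched as a small parser. This is a hypothetical illustration, not LAMMPS source code:

```python
# Hypothetical parser for a "-reorder custom" file (illustration only,
# not LAMMPS source). Blank and "#" comment lines are skipped; the P
# remaining "I J" lines must each column-wise be a permutation of 0..P-1.
def read_reorder_file(lines, P):
    pairs = []
    for line in lines:
        s = line.strip()
        if not s or s.startswith("#"):
            continue
        i, j = map(int, s.split())
        pairs.append((i, j))
    if len(pairs) != P:
        raise ValueError("expected %d I,J lines" % P)
    if sorted(i for i, _ in pairs) != list(range(P)) or \
       sorted(j for _, j in pairs) != list(range(P)):
        raise ValueError("I and J must each be a permutation of 0..P-1")
    return dict(pairs)  # old rank I -> new rank J

perm = read_reorder_file(["# swap ranks 0 and 1", "", "0 1", "1 0", "2 2", "3 3"], 4)
print(perm)  # {0: 1, 1: 0, 2: 2, 3: 3}
```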
<P>Note that rank ordering can also be specified by many MPI
implementations, either by environment variables that specify how to
order physical processors, or by config files that specify what
physical processors to assign to each MPI rank. The -reorder switch
simply gives you a portable way to do this without relying on MPI
itself. See the <A HREF = "processors.html">processors out</A> command for how to output
info on the final assignment of physical processors to the LAMMPS
simulation domain.
</P>
<PRE>-screen file <PRE>-screen file
</PRE> </PRE>
<P>Specify a file for LAMMPS to write its screen information to. In <P>Specify a file for LAMMPS to write its screen information to. In


@ -844,6 +844,7 @@ letter abbreviation can be used:
-p or -partition -p or -partition
-pl or -plog -pl or -plog
-ps or -pscreen -ps or -pscreen
-r or -reorder
-sc or -screen -sc or -screen
-sf or -suffix -sf or -suffix
-v or -var :ul -v or -var :ul
@ -952,10 +953,78 @@ partition screen files are created. This overrides the filename
specified in the -screen command-line option. This option is useful specified in the -screen command-line option. This option is useful
when working with large numbers of partitions, allowing the partition when working with large numbers of partitions, allowing the partition
screen files to be suppressed (-pscreen none) or placed in a screen files to be suppressed (-pscreen none) or placed in a
sub-directory (-pscreen replica_files/screen) If this option is not sub-directory (-pscreen replica_files/screen). If this option is not
used the screen file for partition N is screen.N or whatever is used the screen file for partition N is screen.N or whatever is
specified by the -screen command-line option. specified by the -screen command-line option.
-reorder nth N
-reorder custom filename :pre
Reorder the processors in the MPI communicator used to instantiate
LAMMPS, in one of several ways. The original MPI communicator ranks
all P processors from 0 to P-1. The mapping of these ranks to
physical processors is done by MPI before LAMMPS begins. It may be
useful in some cases to alter the rank order, e.g. to ensure that
cores within each node are ranked in a desired order, or, when using
the "run_style verlet/split"_run_style.html command with 2 partitions,
to ensure that a specific Kspace processor (in the 2nd partition) is
matched up with a specific set of processors in the 1st partition.
See the "Section_accelerate"_Section_accelerate.html doc pages for
more details.
If the keyword {nth} is used with a setting {N}, then it means every
Nth processor will be moved to the end of the ranking. This is useful
when using the "run_style verlet/split"_run_style.html command with 2
partitions via the -partition command-line switch. The first set of
processors will be in the first partition, the 2nd set in the 2nd
partition. The -reorder command-line switch can alter this so that
groups of N-1 procs from the 1st partition and one proc from the 2nd
partition are ordered consecutively, e.g. as the cores on one physical node.
This can boost performance. For example, if you use "-reorder nth 4"
and "-partition 9 3" and you are running on 12 processors, the
processors will be reordered from
0 1 2 3 4 5 6 7 8 9 10 11 :pre
to
0 1 2 4 5 6 8 9 10 3 7 11 :pre
so that the processors in each partition will be
0 1 2 4 5 6 8 9 10
3 7 11 :pre
See the "processors"_processors.html command for how to ensure that
processors from each partition are then grouped optimally on quad-core nodes.
If the keyword is {custom}, then a file that specifies a permutation
of the processor ranks is also specified. The format of the reorder
file is as follows. Any number of initial blank or comment lines
(starting with a "#" character) can be present. These should be
followed by P lines of the form:
I J :pre
where P is the number of processors LAMMPS was launched with. Note
that if running in multi-partition mode (see the -partition switch
above) P is the total number of processors in all partitions. The I
and J values describe a permutation of the P processors. Every I and
J should be values from 0 to P-1 inclusive. In the set of P I values,
every proc ID should appear exactly once. Ditto for the set of P J
values. A single I,J pairing means that the physical processor with
rank I in the original MPI communicator will have rank J in the
reordered communicator.
Note that rank ordering can also be specified by many MPI
implementations, either by environment variables that specify how to
order physical processors, or by config files that specify what
physical processors to assign to each MPI rank. The -reorder switch
simply gives you a portable way to do this without relying on MPI
itself. See the "processors out"_processors.html command for how to output
info on the final assignment of physical processors to the LAMMPS
simulation domain.
-screen file :pre -screen file :pre
Specify a file for LAMMPS to write its screen information to. In Specify a file for LAMMPS to write its screen information to. In


@ -19,27 +19,31 @@
<LI>zero or more keyword/arg pairs may be appended <LI>zero or more keyword/arg pairs may be appended
<LI>keyword = <I>grid</I> or <I>numa</I> or <I>part</I> <LI>keyword = <I>grid</I> or <I>level2</I> or <I>level3</I> or <I>numa</I> or <I>part</I> or <I>file</I>
<PRE> <I>grid</I> arg = <I>cart</I> or <I>cart/reorder</I> or <I>xyz</I> or <I>xzy</I> or <I>yxz</I> or <I>yzx</I> or <I>zxy</I> or <I>zyx</I> <PRE> <I>grid</I> arg = <I>cart</I> or <I>cart/reorder</I> or <I>xyz</I> or <I>xzy</I> or <I>yxz</I> or <I>yzx</I> or <I>zxy</I> or <I>zyx</I>
cart = use MPI_Cart() methods to layout 3d grid of procs with reorder = 0 cart = use MPI_Cart() methods to layout 3d grid of procs with reorder = 0
cart/reorder = use MPI_Cart() methods to layout 3d grid of procs with reorder = 1 cart/reorder = use MPI_Cart() methods to layout 3d grid of procs with reorder = 1
xyz,xzy,yxz,yzx,zxy,zyx = layout 3d grid of procs in IJK order, where I varies fastest, then J, and K slowest xyz,xzy,yxz,yzx,zxy,zyx = layout 3d grid of procs in IJK order
<I>numa</I> arg = none <I>numa</I> arg = none
<I>part</I> args = Psend Precv cstyle <I>part</I> args = Psend Precv cstyle
Psend = partition # (1 to Np) which will send its processor layout Psend = partition # (1 to Np) which will send its processor layout
Precv = partition # (1 to Np) which will recv the processor layout Precv = partition # (1 to Np) which will recv the processor layout
cstyle = <I>multiple</I> cstyle = <I>multiple</I>
<I>multiple</I> = Psend layout will be multiple of Precv layout in each dimension <I>multiple</I> = Psend layout will be multiple of Precv layout in each dimension
<I>file</I> arg = fname
fname = name of file to write processor mapping info to
</PRE> </PRE>
</UL> </UL>
<P><B>Examples:</B> <P><B>Examples:</B>
</P> </P>
<PRE>processors 2 4 4 <PRE>processors * * 5
processors * * 5 processors 2 4 4
processors * * * grid xyz processors 2 4 4 grid xyz
processors * * 8 grid xyz
processors * * * numa processors * * * numa
processors 4 8 16 custom myfile
processors * * * part 1 2 multiple processors * * * part 1 2 multiple
</PRE> </PRE>
<P><B>Description:</B> <P><B>Description:</B>
@ -49,57 +53,67 @@ simulation box. This involves 2 steps. First if there are P
processors it means choosing a factorization P = Px by Py by Pz so processors it means choosing a factorization P = Px by Py by Pz so
that there are Px processors in the x dimension, and similarly for the that there are Px processors in the x dimension, and similarly for the
y and z dimensions. Second, the P processors (with MPI ranks 0 to y and z dimensions. Second, the P processors (with MPI ranks 0 to
P-1) are mapped to the logical grid so that each grid cell is a P-1) are mapped to the logical 3d grid. The arguments to this command
processor. The arguments to this command control each of these 2 control each of these 2 steps.
steps.
</P> </P>
<P>The Px, Py, Pz parameters affect the factorization. Any of the 3 <P>The Px, Py, Pz parameters affect the factorization. Any of the 3
parameters can be specified with an asterisk "*", which means LAMMPS parameters can be specified with an asterisk "*", which means LAMMPS
will choose the number of processors in that dimension. It will do will choose the number of processors in that dimension of the grid.
this based on the size and shape of the global simulation box so as to It will do this based on the size and shape of the global simulation
minimize the surface-to-volume ratio of each processor's sub-domain. box so as to minimize the surface-to-volume ratio of each processor's
sub-domain.
</P> </P>
<P>Since LAMMPS does not load-balance by changing the grid of 3d <P>Since LAMMPS does not load-balance by changing the grid of 3d
processors on-the-fly, this choosing explicit values for Px or Py or processors on-the-fly, choosing explicit values for Px or Py or Pz can
Pz can be used to override the LAMMPS default if it is known to be be used to override the LAMMPS default if it is known to be
sub-optimal for a particular problem. For example, a problem where sub-optimal for a particular problem. E.g. a problem where the extent
the extent of atoms will change dramatically in a particular dimension of atoms will change dramatically in a particular dimension over the
over the course of the simulation. course of the simulation.
</P> </P>
<P>The product of Px, Py, Pz must equal P, the total # of processors <P>The product of Px, Py, Pz must equal P, the total # of processors
LAMMPS is running on. For a <A HREF = "dimension.html">2d simulation</A>, Pz must LAMMPS is running on. For a <A HREF = "dimension.html">2d simulation</A>, Pz must
equal 1. If multiple partitions are being used then P is the number equal 1.
of processors in this partition; see <A HREF = "Section_start.html#start_6">this
section</A> for an explanation of the
-partition command-line switch.
</P> </P>
<P>Note that if you run on a large, prime number of processors P, then a <P>Note that if you run on a large, prime number of processors P, then a
grid such as 1 x P x 1 will be required, which may incur extra grid such as 1 x P x 1 will be required, which may incur extra
communication costs due to the high surface area of each processor's communication costs due to the high surface area of each processor's
sub-domain. sub-domain.
</P> </P>
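The surface-to-volume criterion described above can be illustrated with a brute-force search over factorizations. This is a hypothetical sketch for this doc page, not the actual LAMMPS algorithm:

```python
# Hypothetical brute-force search (not the actual LAMMPS algorithm) for
# a factorization P = Px*Py*Pz that minimizes the surface area of each
# processor's sub-domain, for a box of size Lx x Ly x Lz.
def best_grid(P, Lx, Ly, Lz):
    best = None
    for px in range(1, P + 1):
        if P % px:
            continue
        for py in range(1, P // px + 1):
            if (P // px) % py:
                continue
            pz = P // (px * py)
            dx, dy, dz = Lx / px, Ly / py, Lz / pz
            half_surf = dx * dy + dy * dz + dx * dz  # half of one sub-domain's surface
            if best is None or half_surf < best[0]:
                best = (half_surf, (px, py, pz))
    return best[1]

print(best_grid(12, 1.0, 1.0, 1.0))  # (2, 2, 3) for a cubic box
```

This also shows why a prime P is bad: the only factorizations are 1 x P x 1 and its permutations, giving thin, high-surface-area slabs.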
<P>Also note that if multiple partitions are being used then P is the
number of processors in this partition; see <A HREF = "Section_start.html#start_6">this
section</A> for an explanation of the
-partition command-line switch. Note as well that you can prefix the
processors command with the <A HREF = "partition.html">partition</A> command to
easily specify different Px,Py,Pz values for different partitions.
</P>
<P>You can use the <A HREF = "partition.html">partition</A> command to specify
different processor grids for different partitions, e.g.
</P>
<PRE>partition yes 1 processors 4 4 4
partition yes 2 processors 2 3 2
</PRE>
<HR> <HR>
<P>The <I>grid</I> keyword affects how processor IDs are mapped to the 3d grid <P>The <I>grid</I> keyword affects how the P processor IDs (from 0 to P-1) are
of processors. mapped to the 3d grid of processors.
</P> </P>
<P>The <I>cart</I> style uses the family of MPI Cartesian functions to do <P>The <I>cart</I> style uses the family of MPI Cartesian functions to perform
this, namely MPI_Cart_create(), MPI_Cart_get(), MPI_Cart_shift(), and the mapping, namely MPI_Cart_create(), MPI_Cart_get(),
MPI_Cart_rank(). It invokes the MPI_Cart_create() function with its MPI_Cart_shift(), and MPI_Cart_rank(). It invokes the
reorder flag = 0, so that MPI is not free to reorder the processors. MPI_Cart_create() function with its reorder flag = 0, so that MPI is
not free to reorder the processors.
</P> </P>
<P>The <I>cart/reorder</I> style does the same thing as the <I>cart</I> style <P>The <I>cart/reorder</I> style does the same thing as the <I>cart</I> style
except it sets the reorder flag to 1, so that MPI is free to reorder except it sets the reorder flag to 1, so that MPI can reorder
processors if it desires. processors if it desires.
</P> </P>
<P>The <I>xyz</I>, <I>xzy</I>, <I>yxz</I>, <I>yzx</I>, <I>zxy</I>, and <I>zyx</I> styles are all <P>The <I>xyz</I>, <I>xzy</I>, <I>yxz</I>, <I>yzx</I>, <I>zxy</I>, and <I>zyx</I> styles are all
similar. If the style is IJK, then it explicitly maps the P similar. If the style is IJK, then it maps the P processors to the
processors to the grid so that the processor ID in the I direction grid so that the processor ID in the I direction varies fastest, the
varies fastest, the processor ID in the J direction varies next processor ID in the J direction varies next fastest, and the processor
fastest, and the processor ID in the K direction varies slowest. For ID in the K direction varies slowest. For example, if you select
example, if you select style <I>xyz</I> and you have a 2x2x2 grid of 8 style <I>xyz</I> and you have a 2x2x2 grid of 8 processors, the assignments
processors, the assignments of the 8 octants of the simulation domain of the 8 octants of the simulation domain will be:
will be:
</P> </P>
<PRE>proc 0 = lo x, lo y, lo z octant <PRE>proc 0 = lo x, lo y, lo z octant
proc 1 = hi x, lo y, lo z octant proc 1 = hi x, lo y, lo z octant
@ -114,21 +128,28 @@ proc 7 = hi x, hi y, hi z octant
should be aware of both the machine's network topology and the should be aware of both the machine's network topology and the
specific subset of processors and nodes that were assigned to your specific subset of processors and nodes that were assigned to your
simulation. Thus its MPI_Cart calls can optimize the assignment of simulation. Thus its MPI_Cart calls can optimize the assignment of
MPI processes to the 3d grid to minimize communication costs. However MPI processes to the 3d grid to minimize communication costs. In
in practice, few if any MPI implementations actually do this. So it practice, however, few if any MPI implementations actually do this.
is likely that the <I>cart</I> and <I>cart/reorder</I> styles simply give the So it is likely that the <I>cart</I> and <I>cart/reorder</I> styles simply give
same result as one of the IJK styles. the same result as one of the IJK styles.
</P> </P>
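The IJK orderings described above amount to three nested loops. A hypothetical sketch (not LAMMPS source code) for the <I>xyz</I> style, reproducing the 2x2x2 octant assignments listed above:

```python
# Hypothetical sketch (not LAMMPS source) of the "xyz" grid ordering:
# the x index varies fastest, then y, and z varies slowest.
def xyz_map(Px, Py, Pz):
    grid = {}
    rank = 0
    for k in range(Pz):            # z varies slowest
        for j in range(Py):        # y next fastest
            for i in range(Px):    # x varies fastest
                grid[rank] = (i, j, k)
                rank += 1
    return grid

g = xyz_map(2, 2, 2)               # the 2x2x2 example from the text
print(g[0], g[1], g[7])  # (0, 0, 0) (1, 0, 0) (1, 1, 1)
```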
<HR> <HR>
<P>The <I>numa</I> keyword affects both the factorization of P into Px,Py,Pz <P>The <I>numa</I> keyword affects both the factorization of P into Px,Py,Pz
and the mapping of processors to the 3d grid. and the mapping of processors to the 3d grid.
</P> </P>
<P>It will perform a two-level factorization of the simulation box to <P>It operates similarly to the <I>level2</I> and <I>level3</I> keywords except that
minimize inter-node communication. This can improve parallel it tries to auto-detect the count and topology of the processors and
efficiency by reducing network traffic. When this keyword is set, the cores within a node. Currently, it does this in only 2 levels
simulation box is first divided across nodes. Then within each node, (assuming one MPI process per core), but it may be extended in the future.
the subdomain is further divided between the cores of each node. </P>
<P>It also uses a different algorithm (iterative) than the <I>level2</I>
keyword for doing the two-level factorization of the simulation box
into a 3d processor grid to minimize off-node communication. Thus it
may give a different or improved mapping of processors to the 3d grid.
</P>
<P>The numa setting will give an error if the number of MPI processes
is not evenly divisible by the number of cores used per node.
</P> </P>
<P>The numa setting will be ignored if (a) there are less than 4 cores <P>The numa setting will be ignored if (a) there are less than 4 cores
per node, or (b) the number of MPI processes is not divisible by the per node, or (b) the number of MPI processes is not divisible by the
@ -137,14 +158,16 @@ any of the Px or Py of Pz values is greater than 1.
</P> </P>
<HR> <HR>
<P>The <I>part</I> keyword can be useful when running in multi-partition mode, <P>The <I>part</I> keyword affects the factorization of P into Px,Py,Pz.
e.g. with the <A HREF = "run_style.html<A HREF = "Section_start.html#start_6">-partition">>run_style verlet/split</A> command. It </P>
specifies a dependency between a sending partition <I>Psend</I> and a <P>It can be useful when running in multi-partition mode, e.g. with the
receiving partition <I>Precv</I> which is enforced when each is setting up <A HREF = "run_style.html">run_style verlet/split</A> command. It specifies a
their own mapping of the partitions processors to the simulation box. dependency between a sending partition <I>Psend</I> and a receiving
Each of <I>Psend</I> and <I>Precv</I> must be integers from 1 to Np, where Np is partition <I>Precv</I> which is enforced when each is setting up their own
the number of partitions you have defined via the <A HREF = </A> mapping of their processors to the simulation box. Each of <I>Psend</I>
command-line switch</A>. and <I>Precv</I> must be integers from 1 to Np, where Np is the number of
partitions you have defined via the <A HREF = "Section_start.html#start_6">-partition command-line
switch</A>.
</P> </P>
<P>A "dependency" means that the sending partition will create its 3d <P>A "dependency" means that the sending partition will create its 3d
logical grid as Px by Py by Pz and after it has done this, it will logical grid as Px by Py by Pz and after it has done this, it will
@ -165,14 +188,6 @@ processors, it could create a 4x2x10 grid, but it will not create a
2x4x10 grid, since in the y-dimension, 6 is not an integer multiple of 2x4x10 grid, since in the y-dimension, 6 is not an integer multiple of
4. 4.
</P> </P>
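The <I>multiple</I> constraint just described is a per-dimension divisibility check. A hypothetical sketch (not LAMMPS source code):

```python
# Hypothetical check (not LAMMPS source) of the {multiple} constraint:
# each dimension of the sending partition's grid must be an integer
# multiple of the corresponding dimension of the receiving grid.
def is_multiple(psend, precv):
    return all(s % r == 0 for s, r in zip(psend, precv))

print(is_multiple((4, 6, 10), (2, 3, 2)))  # True: 4/2, 6/3, 10/2
print(is_multiple((2, 4, 10), (2, 3, 2)))  # False: 4 is not a multiple of 3
```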
<HR>
<P>Note that you can use the <A HREF = "partition.html">partition</A> command to
specify different processor grids for different partitions, e.g.
</P>
<PRE>partition yes 1 processors 4 4 4
partition yes 2 processors 2 3 2
</PRE>
<P>IMPORTANT NOTE: If you use the <A HREF = "partition.html">partition</A> command to <P>IMPORTANT NOTE: If you use the <A HREF = "partition.html">partition</A> command to
invoke different "processors" commands on different partitions, and invoke different "processors" commands on different partitions, and
you also use the <I>part</I> keyword, then you must ensure that both the you also use the <I>part</I> keyword, then you must ensure that both the
@ -183,6 +198,39 @@ setup phase if this error has been made.
</P> </P>
<HR> <HR>
<P>The <I>out</I> keyword writes the factorization of the P
processors and their mapping to the 3d grid to the specified file
<I>fname</I>. This is useful to check that you assigned physical
processors in the manner you desired, which can be tricky to figure
out, especially when running on multiple partitions or on a multicore
machine or when the processor ranks were reordered by use of the
<A HREF = "Section_start.html#start_6">-reorder command-line switch</A> or due to
use of MPI-specific launch options such as a config file.
</P>
<P>If you have multiple partitions you should ensure that each one writes
to a different file, e.g. using a <A HREF = "variable.html">world-style variable</A>
for the filename. The file will have a self-explanatory header,
followed by one line per processor in this format:
</P>
<PRE>I J K: world-ID universe-ID original-ID: name
</PRE>
<P>I,J,K are the indices of the processor in the 3d logical grid. The
IDs are the processor's rank in this simulation (the world), the
universe (of multiple simulations), and the original MPI communicator
used to instantiate LAMMPS, respectively. The world and universe IDs
will only be different if you are running on more than one partition;
see the <A HREF = "Section_start.html#start_6">-partition command-line switch</A>.
The universe and original IDs will only be different if you used the
<A HREF = "Section_start.html#start_6">-reorder command-line switch</A> to reorder
the processors differently than their rank in the original
communicator LAMMPS was instantiated with. The <I>name</I> is what is
returned by a call to MPI_Get_processor_name() and should represent an
identifier relevant to the physical processors in your machine. Note
that depending on the MPI implementation, multiple cores can have the
same <I>name</I>.
</P>
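Assuming the one-line-per-processor format shown above, the file body can be parsed with a short helper. This is a hypothetical illustration, not LAMMPS source code:

```python
# Hypothetical parser (not LAMMPS source) for one body line of the
# mapping file, assuming the format shown above:
#   I J K: world-ID universe-ID original-ID: name
def parse_map_line(line):
    coords, ids, name = line.split(":", 2)
    i, j, k = map(int, coords.split())
    world, universe, orig = map(int, ids.split())
    return (i, j, k), (world, universe, orig), name.strip()

print(parse_map_line("0 1 0: 2 2 2: node17"))
# ((0, 1, 0), (2, 2, 2), 'node17')
```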
<HR>
<P><B>Restrictions:</B> <P><B>Restrictions:</B>
</P> </P>
<P>This command cannot be used after the simulation box is defined by a <P>This command cannot be used after the simulation box is defined by a
@ -190,13 +238,19 @@ setup phase if this error has been made.
It can be used before a restart file is read to change the 3d It can be used before a restart file is read to change the 3d
processor grid from what is specified in the restart file. processor grid from what is specified in the restart file.
</P> </P>
<P>The <I>numa</I> keyword cannot be used with the <I>part</I> keyword, or <P>You cannot use more than one of the <I>level2</I>, <I>level3</I>, or <I>numa</I>
with any <I>grid</I> setting other than <I>cart</I>. keywords.
</P> </P>
<P><B>Related commands:</B> none <P>The <I>numa</I> keyword cannot be used with the <I>part</I> keyword, and it
ignores the <I>grid</I> setting.
</P>
<P><B>Related commands:</B>
</P>
<P><A HREF = "partition.html">partition</A>, <A HREF = "Section_start.html#start_6">-reorder command-line
switch</A>
</P> </P>
<P><B>Default:</B> <P><B>Default:</B>
</P> </P>
<P>The option defaults are Px Py Pz = * * *, grid = cart, numa = 0. <P>The option defaults are Px Py Pz = * * * and grid = cart.
</P> </P>
</HTML> </HTML>


@ -14,25 +14,29 @@ processors Px Py Pz keyword args ... :pre
Px,Py,Pz = # of processors in each dimension of a 3d grid :ulb,l Px,Py,Pz = # of processors in each dimension of a 3d grid :ulb,l
zero or more keyword/arg pairs may be appended :l zero or more keyword/arg pairs may be appended :l
keyword = {grid} or {numa} or {part} :l keyword = {grid} or {level2} or {level3} or {numa} or {part} or {file} :l
{grid} arg = {cart} or {cart/reorder} or {xyz} or {xzy} or {yxz} or {yzx} or {zxy} or {zyx} {grid} arg = {cart} or {cart/reorder} or {xyz} or {xzy} or {yxz} or {yzx} or {zxy} or {zyx}
cart = use MPI_Cart() methods to layout 3d grid of procs with reorder = 0 cart = use MPI_Cart() methods to layout 3d grid of procs with reorder = 0
cart/reorder = use MPI_Cart() methods to layout 3d grid of procs with reorder = 1 cart/reorder = use MPI_Cart() methods to layout 3d grid of procs with reorder = 1
xyz,xzy,yxz,yzx,zxy,zyx = layout 3d grid of procs in IJK order, where I varies fastest, then J, and K slowest xyz,xzy,yxz,yzx,zxy,zyx = layout 3d grid of procs in IJK order
{numa} arg = none {numa} arg = none
{part} args = Psend Precv cstyle {part} args = Psend Precv cstyle
Psend = partition # (1 to Np) which will send its processor layout Psend = partition # (1 to Np) which will send its processor layout
Precv = partition # (1 to Np) which will recv the processor layout Precv = partition # (1 to Np) which will recv the processor layout
cstyle = {multiple} cstyle = {multiple}
{multiple} = Psend layout will be multiple of Precv layout in each dimension :pre {multiple} = Psend layout will be multiple of Precv layout in each dimension
{file} arg = fname
fname = name of file to write processor mapping info to :pre
:ule :ule
[Examples:] [Examples:]
processors 2 4 4
processors * * 5 processors * * 5
processors * * * grid xyz processors 2 4 4
processors 2 4 4 grid xyz
processors * * 8 grid xyz
processors * * * numa processors * * * numa
processors 4 8 16 custom myfile
processors * * * part 1 2 multiple :pre processors * * * part 1 2 multiple :pre
[Description:] [Description:]
@ -42,57 +46,67 @@ simulation box. This involves 2 steps. First if there are P
processors it means choosing a factorization P = Px by Py by Pz so processors it means choosing a factorization P = Px by Py by Pz so
that there are Px processors in the x dimension, and similarly for the that there are Px processors in the x dimension, and similarly for the
y and z dimensions. Second, the P processors (with MPI ranks 0 to y and z dimensions. Second, the P processors (with MPI ranks 0 to
P-1) are mapped to the logical grid so that each grid cell is a P-1) are mapped to the logical 3d grid. The arguments to this command
processor. The arguments to this command control each of these 2 control each of these 2 steps.
steps.
The Px, Py, Pz parameters affect the factorization. Any of the 3 The Px, Py, Pz parameters affect the factorization. Any of the 3
parameters can be specified with an asterisk "*", which means LAMMPS parameters can be specified with an asterisk "*", which means LAMMPS
will choose the number of processors in that dimension. It will do will choose the number of processors in that dimension of the grid.
this based on the size and shape of the global simulation box so as to It will do this based on the size and shape of the global simulation
minimize the surface-to-volume ratio of each processor's sub-domain. box so as to minimize the surface-to-volume ratio of each processor's
sub-domain.
Since LAMMPS does not load-balance by changing the grid of 3d Since LAMMPS does not load-balance by changing the grid of 3d
processors on-the-fly, this choosing explicit values for Px or Py or processors on-the-fly, choosing explicit values for Px or Py or Pz can
Pz can be used to override the LAMMPS default if it is known to be be used to override the LAMMPS default if it is known to be
sub-optimal for a particular problem. For example, a problem where sub-optimal for a particular problem. E.g. a problem where the extent
the extent of atoms will change dramatically in a particular dimension of atoms will change dramatically in a particular dimension over the
over the course of the simulation. course of the simulation.
The product of Px, Py, Pz must equal P, the total # of processors The product of Px, Py, Pz must equal P, the total # of processors
LAMMPS is running on. For a "2d simulation"_dimension.html, Pz must LAMMPS is running on. For a "2d simulation"_dimension.html, Pz must
equal 1. If multiple partitions are being used then P is the number equal 1.
of processors in this partition; see "this
section"_Section_start.html#start_6 for an explanation of the
-partition command-line switch.
Note that if you run on a large, prime number of processors P, then a Note that if you run on a large, prime number of processors P, then a
grid such as 1 x P x 1 will be required, which may incur extra grid such as 1 x P x 1 will be required, which may incur extra
communication costs due to the high surface area of each processor's communication costs due to the high surface area of each processor's
sub-domain. sub-domain.
Also note that if multiple partitions are being used then P is the
number of processors in this partition; see "this
section"_Section_start.html#start_6 for an explanation of the
-partition command-line switch. Note as well that you can prefix the
processors command with the "partition"_partition.html command to
easily specify different Px,Py,Pz values for different partitions.
You can use the "partition"_partition.html command to specify
different processor grids for different partitions, e.g.
partition yes 1 processors 4 4 4
partition yes 2 processors 2 3 2 :pre
:line :line
The {grid} keyword affects how the P processor IDs (from 0 to P-1) are
mapped to the 3d grid of processors.

The {cart} style uses the family of MPI Cartesian functions to perform
the mapping, namely MPI_Cart_create(), MPI_Cart_get(),
MPI_Cart_shift(), and MPI_Cart_rank().  It invokes the
MPI_Cart_create() function with its reorder flag = 0, so that MPI is
not free to reorder the processors.
The {cart/reorder} style does the same thing as the {cart} style
except it sets the reorder flag to 1, so that MPI can reorder
processors if it desires.
The {xyz}, {xzy}, {yxz}, {yzx}, {zxy}, and {zyx} styles are all
similar.  If the style is IJK, then it maps the P processors to the
grid so that the processor ID in the I direction varies fastest, the
processor ID in the J direction varies next fastest, and the processor
ID in the K direction varies slowest.  For example, if you select
style {xyz} and you have a 2x2x2 grid of 8 processors, the assignments
of the 8 octants of the simulation domain will be:

proc 0 = lo x, lo y, lo z octant
proc 1 = hi x, lo y, lo z octant
proc 2 = lo x, hi y, lo z octant
proc 3 = hi x, hi y, lo z octant
proc 4 = lo x, lo y, hi z octant
proc 5 = hi x, lo y, hi z octant
proc 6 = lo x, hi y, hi z octant
proc 7 = hi x, hi y, hi z octant :pre

Note that, in principle, an MPI implementation on a particular machine
should be aware of both the machine's network topology and the
specific subset of processors and nodes that were assigned to your
simulation.  Thus its MPI_Cart calls can optimize the assignment of
MPI processes to the 3d grid to minimize communication costs.  In
practice, however, few if any MPI implementations actually do this.
So it is likely that the {cart} and {cart/reorder} styles simply give
the same result as one of the IJK styles.
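As an illustration (not LAMMPS source code), the IJK ordering described
above amounts to treating the rank as a mixed-radix number whose
fastest-varying digit is the I index:

```python
# Illustration (not LAMMPS source): map ranks 0..P-1 onto a 3d grid in
# IJK order, where the I index varies fastest and the K index slowest.

def ijk_map(rank, pi, pj, pk):
    """Return (i, j, k) grid indices for a rank; I fastest, K slowest."""
    i = rank % pi
    j = (rank // pi) % pj
    k = rank // (pi * pj)
    return i, j, k

# Style {xyz} on a 2x2x2 grid of 8 processors, (i,j,k) = (x,y,z):
# rank 0 -> (0,0,0), the lo-x, lo-y, lo-z octant
# rank 1 -> (1,0,0), the hi-x, lo-y, lo-z octant
for rank in range(8):
    print(rank, ijk_map(rank, 2, 2, 2))
```

The other five styles follow by permuting which of x, y, z plays the I, J,
and K roles.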
:line
The {numa} keyword affects both the factorization of P into Px,Py,Pz
and the mapping of processors to the 3d grid.
It operates similar to the {level2} and {level3} keywords except that
it tries to auto-detect the count and topology of the processors and
cores within a node.  Currently, it does this in only 2 levels (nodes
and the cores within a node), but it may be extended in the future.

It also uses a different algorithm (iterative) than the {level2}
keyword for doing the two-level factorization of the simulation box
into a 3d processor grid to minimize off-node communication.  Thus it
may give a different or improved mapping of processors to the 3d grid.
The numa setting will be ignored if (a) there are less than 4 cores
per node, or (b) the number of MPI processes is not divisible by the
number of cores used per node, or (c) any of the Px or Py or Pz values
is greater than 1.
:line
The {part} keyword affects the factorization of P into Px,Py,Pz.

It can be useful when running in multi-partition mode, e.g. with the
"run_style verlet/split"_run_style.html command.  It specifies a
dependency between a sending partition {Psend} and a receiving
partition {Precv} which is enforced when each is setting up its own
mapping of its processors to the simulation box.  Each of {Psend} and
{Precv} must be an integer from 1 to Np, where Np is the number of
partitions you have defined via the "-partition command-line
switch"_Section_start.html#start_6.
A "dependency" means that the sending partition will create its 3d
logical grid as Px by Py by Pz and after it has done this, it will
communicate its grid to the receiving partition.  The receiving
partition must then create its own logical grid so that each of its
Px,Py,Pz values evenly divides the corresponding value of the sending
partition's grid.  E.g. if the sending partition's grid has Py = 6 and
the receiving partition is setting up a 3d grid for 80
processors, it could create a 4x2x10 grid, but it will not create a
2x4x10 grid, since in the y-dimension, 6 is not an integer multiple of
4.
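The divisibility check behind that example can be sketched as follows (an
illustration, not LAMMPS source; the 8x6x10 sending grid is hypothetical,
chosen only so that its Py matches the 6 in the example):

```python
# Illustration (not LAMMPS source): check the {part} constraint implied
# by the example above, that each dimension of the sending partition's
# grid must be an integer multiple of the receiving partition's.

def compatible(psend_grid, precv_grid):
    """True if each sending dimension is a multiple of the receiving one."""
    return all(s % r == 0 for s, r in zip(psend_grid, precv_grid))

# Hypothetical sending grid 8x6x10; receiving partition has 80 procs:
print(compatible((8, 6, 10), (4, 2, 10)))  # True:  6 is a multiple of 2
print(compatible((8, 6, 10), (2, 4, 10)))  # False: 6 is not a multiple of 4
```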
IMPORTANT NOTE: If you use the "partition"_partition.html command to
invoke different "processors" commands on different partitions, and
you also use the {part} keyword, then you must insure that both the
sending and receiving partitions create consistent 3d processor
grids.  LAMMPS will generate an error during the
setup phase if this error has been made.
:line
The {out} keyword writes the factorization of P into Px,Py,Pz and the
mapping of processors to the 3d grid to the specified file {fname}.
This is useful to check that you assigned physical processors in the
manner you desired, which can be tricky to figure out, especially when
running on multiple partitions or on a multicore machine or when the
processor ranks were reordered by use of the "-reorder command-line
switch"_Section_start.html#start_6 or due to use of MPI-specific
launch options such as a config file.
If you have multiple partitions you should insure that each one writes
to a different file, e.g. using a "world-style variable"_variable.html
for the filename.  The file will have a self-explanatory header,
followed by one line per processor in this format:

I J K: world-ID universe-ID original-ID: name :pre
I,J,K are the indices of the processor in the 3d logical grid. The
IDs are the processor's rank in this simulation (the world), the
universe (of multiple simulations), and the original MPI communicator
used to instantiate LAMMPS, respectively. The world and universe IDs
will only be different if you are running on more than one partition;
see the "-partition command-line switch"_Section_start.html#start_6.
The universe and original IDs will only be different if you used the
"-reorder command-line switch"_Section_start.html#start_6 to reorder
the processors differently than their rank in the original
communicator LAMMPS was instantiated with. The {name} is what is
returned by a call to MPI_Get_processor_name() and should represent an
identifier relevant to the physical processors in your machine. Note
that depending on the MPI implementation, multiple cores can have the
same {name}.
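A line of this format can be picked apart as sketched below (an
illustration, not LAMMPS source; the "node17" name and the ID values are
hypothetical):

```python
# Illustration (not LAMMPS source): parse one line of the {out} file
# format described above, "I J K: world-ID universe-ID original-ID: name".

def parse_out_line(line):
    """Split a line into grid indices, the three rank IDs, and the name."""
    grid_part, ids_part, name = (s.strip() for s in line.split(":", 2))
    i, j, k = (int(x) for x in grid_part.split())
    world, universe, original = (int(x) for x in ids_part.split())
    return (i, j, k), (world, universe, original), name

# A hypothetical entry for the processor at grid position (0, 1, 0),
# where all three IDs happen to agree (single partition, no reordering):
print(parse_out_line("0 1 0: 4 4 4: node17"))
```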
:line
[Restrictions:]
This command cannot be used after the simulation box is defined by a
"read_data"_read_data.html or "create_box"_create_box.html command.
It can be used before a restart file is read to change the 3d
processor grid from what is specified in the restart file.
You cannot use more than one of the {level2}, {level3}, or {numa}
keywords.

The {numa} keyword cannot be used with the {part} keyword, and it
ignores the {grid} setting.

[Related commands:]

"partition"_partition.html, "-reorder command-line
switch"_Section_start.html#start_6
[Default:]
The option defaults are Px Py Pz = * * * and grid = cart.