git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@15228 f3b2605a-c512-4ea7-a41b-209d697bcdaa

sjplimp 2016-06-28 13:30:04 +00:00
parent 8c63302c82
commit 42071be08c
4 changed files with 96 additions and 37 deletions

@ -40,13 +40,16 @@ Syntax
*intel* args = Nphi keyword value ...
Nphi = # of coprocessors per node
zero or more keyword/value pairs may be appended
keywords = *omp* or *mode* or *balance* or *ghost* or *tpc* or *tptask* or *no_affinity*
*omp* value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
keywords = *mode* or *omp* or *lrt* or *balance* or *ghost* or *tpc* or *tptask* or *no_affinity*
*mode* value = *single* or *mixed* or *double*
single = perform force calculations in single precision
mixed = perform force calculations in mixed precision
double = perform force calculations in double precision
*omp* value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
*lrt* value = *yes* or *no*
yes = use additional thread dedicated for some PPPM calculations
no = do not dedicate an extra thread for some PPPM calculations
*balance* value = split
split = fraction of work to offload to coprocessor, -1 for dynamic
*ghost* value = *yes* or *no*
@ -330,6 +333,23 @@ precision, including storage of forces, torques, energies, and virial
quantities. *Double* means double precision is used for the entire
force calculation.
The *lrt* keyword can be used to enable "Long Range Thread (LRT)"
mode. It can take a value of *yes* to enable and *no* to disable.
LRT mode generates an extra thread (in addition to any OpenMP threads
specified with the OMP_NUM_THREADS environment variable or the *omp*
keyword). The extra thread is dedicated to performing part of the
:doc:`PPPM solver <kspace_style>` computations and communications. This
can improve parallel performance on processors supporting
Simultaneous Multithreading (SMT), such as Hyperthreading on Intel
processors. In this mode, one additional thread is generated per MPI
process. LAMMPS will generate a warning if more threads are used than
the SMT hardware on a node provides. If the PPPM solver
from the USER-INTEL package is not used, then the LRT setting is
ignored and no extra threads are generated. Enabling LRT will replace
the :doc:`run_style <run_style>` with the *verlet/lrt/intel* style, which
is identical to the default *verlet* style aside from supporting the
LRT feature.
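The thread accounting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names, the example node size, and the comparison itself are hypothetical and are not taken from the LAMMPS source. It assumes each MPI process runs its OpenMP threads plus one extra thread when LRT mode is enabled.

```python
# Hypothetical helpers, not part of LAMMPS: estimate threads per node with LRT.

def threads_per_node(mpi_per_node, omp_threads, lrt):
    """Return the total number of threads started on one node.

    Assumes each MPI process runs `omp_threads` OpenMP threads plus one
    extra thread when LRT mode is enabled, as the text describes.
    `omp_threads` is the effective count (from the omp keyword or
    OMP_NUM_THREADS), not the keyword default of 0.
    """
    extra = 1 if lrt else 0
    return mpi_per_node * (omp_threads + extra)


def oversubscribed(mpi_per_node, omp_threads, lrt, hw_threads):
    """True if more threads are requested than the node's hardware threads."""
    return threads_per_node(mpi_per_node, omp_threads, lrt) > hw_threads


# Example node (an assumption): 36 cores with 2-way SMT = 72 hardware threads.
HW_THREADS = 36 * 2

# 36 MPI processes x (1 OpenMP thread + 1 LRT thread) = 72 threads: fits.
print(threads_per_node(36, 1, True), oversubscribed(36, 1, True, HW_THREADS))

# 36 MPI processes x (2 OpenMP threads + 1 LRT thread) = 108 threads: too many.
print(threads_per_node(36, 2, True), oversubscribed(36, 2, True, HW_THREADS))
```

In the second case the sketch flags the configuration that would be expected to trigger the warning mentioned above; the exact condition LAMMPS checks may differ.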
The *balance* keyword sets the fraction of :doc:`pair style <pair_style>` work offloaded to the coprocessor for split
values between 0.0 and 1.0 inclusive. While this fraction of work is
running on the coprocessor, other calculations will run on the host,
@ -568,15 +588,15 @@ must invoke the package gpu command in your input script or via the
"-pk gpu" :ref:`command-line switch <start_7>`.
For the USER-INTEL package, the default is Nphi = 1 and the option
defaults are omp = 0, mode = mixed, balance = -1, tpc = 4, tptask =
240. The default ghost option is determined by the pair style being
used. This value is output to the screen in the offload report at the
end of each run. Note that all of these settings, except "omp" and
"mode", are ignored if LAMMPS was not built with Xeon Phi coprocessor
support. These settings are made automatically if the "-sf intel"
:ref:`command-line switch <start_7>` is used. If it is
not used, you must invoke the package intel command in your input
script or via the "-pk intel" :ref:`command-line switch <start_7>`.
defaults are omp = 0, mode = mixed, lrt = no, balance = -1, tpc = 4,
tptask = 240. The default ghost option is determined by the pair
style being used. This value is output to the screen in the offload
report at the end of each run. Note that all of these settings,
except "omp" and "mode", are ignored if LAMMPS was not built with
Xeon Phi coprocessor support. These settings are made automatically
if the "-sf intel" :ref:`command-line switch <start_7>`
is used. If it is not used, you must invoke the package intel
command in your input script or via the "-pk intel" :ref:`command-line switch <start_7>`.
For the KOKKOS package, the option defaults are neigh = full, newton =
off, binsize = 0.0, and comm = device. These settings are made


@ -162,13 +162,16 @@
<em>intel</em> args = Nphi keyword value ...
Nphi = # of coprocessors per node
zero or more keyword/value pairs may be appended
keywords = <em>omp</em> or <em>mode</em> or <em>balance</em> or <em>ghost</em> or <em>tpc</em> or <em>tptask</em> or <em>no_affinity</em>
<em>omp</em> value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
keywords = <em>mode</em> or <em>omp</em> or <em>lrt</em> or <em>balance</em> or <em>ghost</em> or <em>tpc</em> or <em>tptask</em> or <em>no_affinity</em>
<em>mode</em> value = <em>single</em> or <em>mixed</em> or <em>double</em>
single = perform force calculations in single precision
mixed = perform force calculations in mixed precision
double = perform force calculations in double precision
<em>omp</em> value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
<em>lrt</em> value = <em>yes</em> or <em>no</em>
yes = use additional thread dedicated for some PPPM calculations
no = do not dedicate an extra thread for some PPPM calculations
<em>balance</em> value = split
split = fraction of work to offload to coprocessor, -1 for dynamic
<em>ghost</em> value = <em>yes</em> or <em>no</em>
@ -415,6 +418,22 @@ computed in single precision, but accumulated and stored in double
precision, including storage of forces, torques, energies, and virial
quantities. <em>Double</em> means double precision is used for the entire
force calculation.</p>
<p>The <em>lrt</em> keyword can be used to enable &#8220;Long Range Thread (LRT)&#8221;
mode. It can take a value of <em>yes</em> to enable and <em>no</em> to disable.
LRT mode generates an extra thread (in addition to any OpenMP threads
specified with the OMP_NUM_THREADS environment variable or the <em>omp</em>
keyword). The extra thread is dedicated to performing part of the
<a class="reference internal" href="kspace_style.html"><span class="doc">PPPM solver</span></a> computations and communications. This
can improve parallel performance on processors supporting
Simultaneous Multithreading (SMT), such as Hyperthreading on Intel
processors. In this mode, one additional thread is generated per MPI
process. LAMMPS will generate a warning if more threads are used than
the SMT hardware on a node provides. If the PPPM solver
from the USER-INTEL package is not used, then the LRT setting is
ignored and no extra threads are generated. Enabling LRT will replace
the <a class="reference internal" href="run_style.html"><span class="doc">run_style</span></a> with the <em>verlet/lrt/intel</em> style, which
is identical to the default <em>verlet</em> style aside from supporting the
LRT feature.</p>
<p>The <em>balance</em> keyword sets the fraction of <a class="reference internal" href="pair_style.html"><span class="doc">pair style</span></a> work offloaded to the coprocessor for split
values between 0.0 and 1.0 inclusive. While this fraction of work is
running on the coprocessor, other calculations will run on the host,
@ -608,15 +627,15 @@ automatically if the &#8220;-sf gpu&#8221; <a class="reference internal" href="S
must invoke the package gpu command in your input script or via the
&#8220;-pk gpu&#8221; <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a>.</p>
<p>For the USER-INTEL package, the default is Nphi = 1 and the option
defaults are omp = 0, mode = mixed, balance = -1, tpc = 4, tptask =
240. The default ghost option is determined by the pair style being
used. This value is output to the screen in the offload report at the
end of each run. Note that all of these settings, except &#8220;omp&#8221; and
&#8220;mode&#8221;, are ignored if LAMMPS was not built with Xeon Phi coprocessor
support. These settings are made automatically if the &#8220;-sf intel&#8221;
<a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a> is used. If it is
not used, you must invoke the package intel command in your input
script or via the &#8220;-pk intel&#8221; <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a>.</p>
defaults are omp = 0, mode = mixed, lrt = no, balance = -1, tpc = 4,
tptask = 240. The default ghost option is determined by the pair
style being used. This value is output to the screen in the offload
report at the end of each run. Note that all of these settings,
except &#8220;omp&#8221; and &#8220;mode&#8221;, are ignored if LAMMPS was not built with
Xeon Phi coprocessor support. These settings are made automatically
if the &#8220;-sf intel&#8221; <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a>
is used. If it is not used, you must invoke the package intel
command in your input script or via the &#8220;-pk intel&#8221; <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a>.</p>
<p>For the KOKKOS package, the option defaults are neigh = full, newton =
off, binsize = 0.0, and comm = device. These settings are made
automatically by the required &#8220;-k on&#8221; <a class="reference internal" href="Section_start.html#start-7"><span class="std std-ref">command-line switch</span></a>. You can change them by using the

File diff suppressed because one or more lines are too long


@ -40,13 +40,16 @@ args = arguments specific to the style :l
{intel} args = Nphi keyword value ...
Nphi = # of coprocessors per node
zero or more keyword/value pairs may be appended
keywords = {omp} or {mode} or {balance} or {ghost} or {tpc} or {tptask} or {no_affinity}
{omp} value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
keywords = {mode} or {omp} or {lrt} or {balance} or {ghost} or {tpc} or {tptask} or {no_affinity}
{mode} value = {single} or {mixed} or {double}
single = perform force calculations in single precision
mixed = perform force calculations in mixed precision
double = perform force calculations in double precision
{omp} value = Nthreads
Nthreads = number of OpenMP threads to use on CPU (default = 0)
{lrt} value = {yes} or {no}
yes = use additional thread dedicated for some PPPM calculations
no = do not dedicate an extra thread for some PPPM calculations
{balance} value = split
split = fraction of work to offload to coprocessor, -1 for dynamic
{ghost} value = {yes} or {no}
@ -316,6 +319,23 @@ precision, including storage of forces, torques, energies, and virial
quantities. {Double} means double precision is used for the entire
force calculation.
The {lrt} keyword can be used to enable "Long Range Thread (LRT)"
mode. It can take a value of {yes} to enable and {no} to disable.
LRT mode generates an extra thread (in addition to any OpenMP threads
specified with the OMP_NUM_THREADS environment variable or the {omp}
keyword). The extra thread is dedicated to performing part of the
"PPPM solver"_kspace_style.html computations and communications. This
can improve parallel performance on processors supporting
Simultaneous Multithreading (SMT), such as Hyperthreading on Intel
processors. In this mode, one additional thread is generated per MPI
process. LAMMPS will generate a warning if more threads are used than
the SMT hardware on a node provides. If the PPPM solver
from the USER-INTEL package is not used, then the LRT setting is
ignored and no extra threads are generated. Enabling LRT will replace
the "run_style"_run_style.html with the {verlet/lrt/intel} style, which
is identical to the default {verlet} style aside from supporting the
LRT feature.
The {balance} keyword sets the fraction of "pair
style"_pair_style.html work offloaded to the coprocessor for split
values between 0.0 and 1.0 inclusive. While this fraction of work is
@ -551,15 +571,15 @@ must invoke the package gpu command in your input script or via the
"-pk gpu" "command-line switch"_Section_start.html#start_7.
For the USER-INTEL package, the default is Nphi = 1 and the option
defaults are omp = 0, mode = mixed, balance = -1, tpc = 4, tptask =
240. The default ghost option is determined by the pair style being
used. This value is output to the screen in the offload report at the
end of each run. Note that all of these settings, except "omp" and
"mode", are ignored if LAMMPS was not built with Xeon Phi coprocessor
support. These settings are made automatically if the "-sf intel"
"command-line switch"_Section_start.html#start_7 is used. If it is
not used, you must invoke the package intel command in your input
script or via the "-pk intel" "command-line
defaults are omp = 0, mode = mixed, lrt = no, balance = -1, tpc = 4,
tptask = 240. The default ghost option is determined by the pair
style being used. This value is output to the screen in the offload
report at the end of each run. Note that all of these settings,
except "omp" and "mode", are ignored if LAMMPS was not built with
Xeon Phi coprocessor support. These settings are made automatically
if the "-sf intel" "command-line switch"_Section_start.html#start_7
is used. If it is not used, you must invoke the package intel
command in your input script or via the "-pk intel" "command-line
switch"_Section_start.html#start_7.
For the KOKKOS package, the option defaults are neigh = full, newton =