Return to Section accelerate overview

5. USER-INTEL package

The USER-INTEL package was developed by Mike Brown at Intel Corporation. It provides a capability to accelerate simulations by offloading neighbor list and non-bonded force calculations to Intel(R) Xeon Phi(TM) coprocessors (not native mode like the KOKKOS package). Additionally, it supports running simulations in single, mixed, or double precision with vectorization, even if a coprocessor is not present, i.e. on an Intel(R) CPU. The same C++ code is used for both cases. When offloading to a coprocessor, the routine is run twice, once with an offload flag.

The USER-INTEL package can be used in tandem with the USER-OMP package. This is useful when offloading pair style computations to coprocessors, so that other styles not supported by the USER-INTEL package, e.g. bond, angle, dihedral, improper, and long-range electrostatics, can run simultaneously in threaded mode on the CPU cores. Since fewer MPI tasks than CPU cores will typically be invoked when running with coprocessors, this enables the extra CPU cores to be used for useful computation.

If LAMMPS is built with both the USER-INTEL and USER-OMP packages installed, this mode of operation is made easier to use: either the "-suffix hybrid intel omp" command-line switch or the suffix hybrid intel omp command will set a second-choice suffix of "omp", so that styles from the USER-OMP package are used if available, after first testing whether a style from the USER-INTEL package is available.

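As a minimal sketch (the binary name lmp_machine and input file in.script are placeholders), the two forms are:

    mpirun -np 4 lmp_machine -sf hybrid intel omp -in in.script   # command-line switch

or, equivalently, in the input script itself:

    suffix hybrid intel omp
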
When using the USER-INTEL package, you must choose at build time whether the binary will support offload to Xeon Phi coprocessors. Binaries supporting offload can still be run in CPU-only (host-only) mode.

Here is a quick overview of how to use the USER-INTEL package for CPU-only acceleration (a sketch of the corresponding makefile lines follows this list):

- specify these CCFLAGS in your src/MAKE/Makefile.machine: -openmp, -DLAMMPS_MEMALIGN=64, -restrict, -xHost
- specify -openmp with LINKFLAGS in your Makefile.machine
- include the USER-INTEL package and (optionally) the USER-OMP package and build LAMMPS
- specify how many OpenMP threads per MPI task to use
- use USER-INTEL and (optionally) USER-OMP styles in your input script

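For concreteness, here is a sketch of those settings in a hypothetical src/MAKE/Makefile.machine; the -g -O3 optimization flags are typical but assumed, not mandated by this section:

    # hypothetical machine file: CPU-only USER-INTEL build
    CC =        mpicxx
    CCFLAGS =   -g -O3 -openmp -DLAMMPS_MEMALIGN=64 -restrict -xHost
    LINK =      mpicxx
    LINKFLAGS = -g -O3 -openmp
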
Note that many of these settings can only be used with the Intel compiler, as discussed below.

Using the USER-INTEL package to offload work to the Intel(R) Xeon Phi(TM) coprocessor is the same except for these additional steps:

- add the flag -DLMP_INTEL_OFFLOAD to CCFLAGS in your Makefile.machine
- add the flag -offload to LINKFLAGS in your Makefile.machine

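Appended to the makefile sketch above, those two additions would read:

    CCFLAGS +=   -DLMP_INTEL_OFFLOAD    # compile the offload code paths
    LINKFLAGS += -offload               # link the offload runtime
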
The latter two steps in the first case, and the last step in the coprocessor case, can be done using the "-pk intel" and "-sf intel" command-line switches respectively. Or the effect of the "-pk" or "-sf" switches can be duplicated by adding the package intel or suffix intel commands respectively to your input script.

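As a sketch of that equivalence (binary and script names are placeholders), the command line

    mpirun -np 8 lmp_machine -sf intel -pk intel 0 omp 4 -in in.script

has the same effect as launching without the switches and putting these lines at the top of in.script:

    package intel 0 omp 4
    suffix intel
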
Required hardware/software:

To use the offload option, you must have one or more Intel(R) Xeon Phi(TM) coprocessors and use an Intel(R) C++ compiler.

Optimizations for vectorization have only been tested with the Intel(R) compiler. Use of other compilers may not result in vectorization, or may give poor performance.

Use of an Intel C++ compiler is recommended, but not required (though g++ will not recognize some of the settings, so they cannot be used). The compiler must support the OpenMP interface.

The recommended version of the Intel(R) compiler is 14.0.1.106. Versions 15.0.1.133 and later are also supported. If using Intel(R) MPI, versions 15.0.2.044 and later are recommended.

Building LAMMPS with the USER-INTEL package:

You can choose to build with or without support for offload to an Intel(R) Xeon Phi(TM) coprocessor. If you build with support for a coprocessor, the same binary can be used on nodes with and without coprocessors installed. However, if you do not have coprocessors on your system, building without offload support will produce a smaller binary.

You can do either in one line, using the src/Make.py script, described in Section 2.4 of the manual. Type "Make.py -h" for help. If run from the src directory, these commands will create src/lmp_intel_cpu and src/lmp_intel_phi using src/MAKE/Makefile.mpi as the starting Makefile.machine:

<div class="highlight-python"><div class="highlight"><pre>Make.py -p intel omp -intel cpu -o intel_cpu -cc icc -a file mpi
|
|
Make.py -p intel omp -intel phi -o intel_phi -cc icc -a file mpi
|
|
</pre></div>
|
|
</div>
|
|
Note that this assumes that your MPI and its mpicxx wrapper are using the Intel compiler. If they are not, you should leave off the "-cc icc" switch.

Or you can follow these steps:

<div class="highlight-python"><div class="highlight"><pre>cd lammps/src
|
|
make yes-user-intel
|
|
make yes-user-omp (if desired)
|
|
make machine
|
|
</pre></div>
|
|
</div>
|
|
Note that if the USER-OMP package is also installed, you can use styles from both packages, as described below.

The Makefile.machine needs a "-fopenmp" flag for OpenMP support in both the CCFLAGS and LINKFLAGS variables. You also need to add -DLAMMPS_MEMALIGN=64 and -restrict to CCFLAGS.

If you are compiling on the same architecture that will be used for the runs, adding the flag -xHost to CCFLAGS will enable vectorization with the Intel(R) compiler. Otherwise, you must provide the correct compute-node architecture to the -x option (e.g. -xAVX).

In order to build with support for an Intel(R) Xeon Phi(TM) coprocessor, the flag -offload should be added to the LINKFLAGS line, and the flag -DLMP_INTEL_OFFLOAD should be added to the CCFLAGS line.

Example makefiles Makefile.intel_cpu and Makefile.intel_phi are included in the src/MAKE/OPTIONS directory with settings that perform well with the Intel(R) compiler. The latter file has support for offload to coprocessors; the former does not.

Notes on CPU and core affinity:

Setting core affinity is often used to pin MPI tasks and OpenMP threads to a core or group of cores so that memory access can be uniform. Unless disabled at build time, affinity for MPI tasks and OpenMP threads on the host will be set by default when using offload to a coprocessor. In this case, it is unnecessary to use other methods to control affinity (e.g. taskset, numactl, I_MPI_PIN_DOMAIN, etc.). This can be disabled in an input script with the no_affinity option to the package intel command, or by disabling the option at build time (by adding -DINTEL_OFFLOAD_NOAFFINITY to the CCFLAGS line of your Makefile). Disabling this option is not recommended, especially when running on a machine with hyperthreading disabled.

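A sketch of the two disable mechanisms just described (the coprocessor count of 1 is illustrative):

    # in the input script: keep offload, but skip the automatic host affinity
    package intel 1 no_affinity

    # or at build time, in Makefile.machine:
    CCFLAGS += -DINTEL_OFFLOAD_NOAFFINITY
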
Running with the USER-INTEL package from the command line:

The mpirun or mpiexec command sets the total number of MPI tasks used by LAMMPS (one or multiple per compute node) and the number of MPI tasks used per node. E.g. the mpirun command in MPICH does this via its -np and -ppn switches. Ditto for OpenMPI via -np and -npernode.

If you plan to compute (any portion of) pairwise interactions using USER-INTEL pair styles on the CPU, or use USER-OMP styles on the CPU, you need to choose how many OpenMP threads per MPI task to use. Note that the product of MPI tasks * OpenMP threads/task should not exceed the physical number of cores (on a node), otherwise performance will suffer.

If LAMMPS was built with coprocessor support for the USER-INTEL package, you also need to specify the number of coprocessors/node and, optionally, the number of coprocessor threads per MPI task to use. Note that coprocessor threads (which run on the coprocessor) are totally independent from OpenMP threads (which run on the CPU). The default values for the settings that affect coprocessor threads are typically fine, as discussed below.

Use the "-sf intel" command-line switch, which will automatically append "intel" to styles that support it. OpenMP threads per MPI task can be set via the "-pk intel Nphi omp Nt" or "-pk omp Nt" command-line switches, which set Nt = # of OpenMP threads per MPI task to use. The "-pk omp" form is only allowed if LAMMPS was also built with the USER-OMP package.

Use the "-pk intel Nphi" command-line switch to set Nphi = # of Xeon Phi(TM) coprocessors/node, if LAMMPS was built with coprocessor support. All the available coprocessor threads on each Phi will be divided among MPI tasks, unless the tptask option of the "-pk intel" command-line switch is used to limit the coprocessor threads per MPI task. See the package intel command for details.

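For example, a hypothetical layout limiting coprocessor threads per task (all counts are illustrative only):

    mpirun -np 16 -ppn 8 lmp_machine -sf intel -pk intel 2 tptask 30 -in in.script   # 2 coprocessors/node, at most 30 coprocessor threads per MPI task
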
<div class="highlight-python"><div class="highlight"><pre>CPU-only without USER-OMP (but using Intel vectorization on CPU):
|
|
mpirun -np 32 lmp_machine -sf intel -in in.script # 32 MPI tasks on as many nodes as needed (e.g. 2 16-core nodes)
|
|
lmp_machine -sf intel -pk intel 0 omp 16 -in in.script # 1 MPI task and 16 threads
|
|
</pre></div>
|
|
</div>
|
|
<div class="highlight-python"><div class="highlight"><pre>CPU-only with USER-OMP (and Intel vectorization on CPU):
|
|
lmp_machine -sf hybrid intel omp -pk intel 0 omp 16 -in in.script # 1 MPI task on a 16-core node with 16 threads
|
|
mpirun -np 4 lmp_machine -sf hybrid intel omp -pk omp 4 -in in.script # 4 MPI tasks each with 4 threads on a single 16-core node
|
|
</pre></div>
|
|
</div>
|
|
<div class="highlight-python"><div class="highlight"><pre>CPUs + Xeon Phi(TM) coprocessors with or without USER-OMP:
|
|
mpirun -np 32 -ppn 16 lmp_machine -sf intel -pk intel 1 -in in.script # 2 nodes with 16 MPI tasks on each, 240 total threads on coprocessor
|
|
mpirun -np 16 -ppn 8 lmp_machine -sf intel -pk intel 1 omp 2 -in in.script # 2 nodes, 8 MPI tasks on each node, 2 threads for each task, 240 total threads on coprocessor
|
|
mpirun -np 16 -ppn 8 lmp_machine -sf hybrid intel omp -pk intel 1 omp 2 -in in.script # 2 nodes, 8 MPI tasks on each node, 2 threads for each task, 240 total threads on coprocessor, USER-OMP package for some styles
|
|
</pre></div>
|
|
</div>
|
|
Note that if the "-sf intel" switch is used, it also invokes a default command: package intel 1. If the "-sf hybrid intel omp" switch is used, the default USER-OMP command package omp 0 is also invoked. Both set the number of OpenMP threads per MPI task via the OMP_NUM_THREADS environment variable. The first command sets the number of Xeon Phi(TM) coprocessors/node to 1 (and the precision mode to "mixed", as one of its option defaults). The latter command is not invoked if LAMMPS was not built with the USER-OMP package. The Nphi = 1 value for the first command is ignored if LAMMPS was not built with coprocessor support.

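In input-script form, those implied defaults correspond to:

    package intel 1   # implied by "-sf intel"; threads from OMP_NUM_THREADS, precision mode mixed
    package omp 0     # additionally implied by "-sf hybrid intel omp" on USER-OMP builds
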
Using the "-pk intel" or "-pk omp" switches explicitly allows for direct setting of the number of OpenMP threads per MPI task, and additional options for either of the USER-INTEL or USER-OMP packages. In particular, the "-pk intel" switch sets the number of coprocessors/node and can limit the number of coprocessor threads per MPI task. The syntax for these two switches is the same as the package omp and package intel commands. See the package command doc page for details, including the default values used for all its options if these switches are not specified, and how to set the number of OpenMP threads via the OMP_NUM_THREADS environment variable if desired.

Or run with the USER-INTEL package by editing an input script:

The discussion above for the mpirun/mpiexec command, MPI tasks/node, OpenMP threads per MPI task, and coprocessor threads per MPI task is the same.

Use the suffix intel command, or you can explicitly add an "intel" suffix to individual styles in your input script, e.g.

    pair_style lj/cut/intel 2.5

You must also use the package intel command, unless the "-sf intel" or "-pk intel" command-line switches were used. It specifies how many coprocessors/node to use, as well as other OpenMP threading and coprocessor options. Its doc page explains how to set the number of OpenMP threads via an environment variable if desired.

If LAMMPS was also built with the USER-OMP package, you must also use the package omp command to enable that package, unless the "-sf hybrid intel omp" or "-pk omp" command-line switches were used. It specifies how many OpenMP threads per MPI task to use, as well as other options. Its doc page explains how to set the number of OpenMP threads via an environment variable if desired.

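Putting the pieces together, a minimal input-script header for a build with both packages might read (thread and coprocessor counts are illustrative only):

    package intel 1 omp 2     # 1 coprocessor/node, 2 OpenMP threads per MPI task
    package omp 2             # enable USER-OMP styles with 2 threads per task
    suffix hybrid intel omp   # prefer intel-suffixed styles, fall back to omp
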
Speed-ups to expect:

If LAMMPS was not built with coprocessor support when including the USER-INTEL package, then accelerated styles will run on the CPU using vectorization optimizations and the specified precision. This may give a substantial speed-up for a pair style, particularly if mixed or single precision is used.

If LAMMPS was built with coprocessor support, the pair styles will run on one or more Intel(R) Xeon Phi(TM) coprocessors (per node). The performance of a Xeon Phi versus a multi-core CPU is a function of your hardware, which pair style is used, the number of atoms/coprocessor, and the precision used on the coprocessor (double, single, mixed).

See the Benchmark page (http://lammps.sandia.gov/bench.html) of the LAMMPS web site for performance of the USER-INTEL package on different hardware.

Guidelines for best performance on an Intel(R) Xeon Phi(TM) coprocessor:

- The default for the package intel command is to have all the MPI tasks on a given compute node use a single Xeon Phi(TM) coprocessor. In general, running with a large number of MPI tasks on each node will perform best with offload. Each MPI task will automatically get affinity to a subset of the hardware threads available on the coprocessor. For example, if your card has 61 cores, with 60 cores available for offload and 4 hardware threads per core (240 total threads), running with 24 MPI tasks per node will cause each MPI task to use a subset of 10 threads on the coprocessor. Fine tuning of the number of threads to use per MPI task, or the number of threads to use per core, can be accomplished with keyword settings of the package intel command.
- If desired, only a fraction of the pair style computation can be offloaded to the coprocessors. This is accomplished by using the balance keyword in the package intel command. A balance of 0 runs all calculations on the CPU. A balance of 1 runs all calculations on the coprocessor. A balance of 0.5 runs half of the calculations on the coprocessor. Setting the balance to -1 (the default) will enable dynamic load balancing that continuously adjusts the fraction of offloaded work throughout the simulation. This option typically produces results within 5 to 10 percent of the optimal fixed balance. (A short sketch of these settings follows this list.)
- When using offload with CPU hyperthreading disabled, it may help performance to use fewer MPI tasks and OpenMP threads than available cores. This is because additional threads are generated internally to handle the asynchronous offload tasks.
- If running short benchmark runs with dynamic load balancing, adding a short warm-up run (10-20 steps) will allow the load balancer to find a near-optimal setting that will carry over to additional runs.
- If pair computations are being offloaded to an Intel(R) Xeon Phi(TM) coprocessor, a diagnostic line is printed to the screen (not to the log file) during the setup phase of a run, indicating that offload mode is being used and giving the number of coprocessor threads per MPI task. Additionally, an offload timing summary is printed at the end of each run. When offloading, the frequency for atom sorting is changed to 1, so that the per-atom data is effectively sorted at every rebuild of the neighbor lists.
- For simulations with long-range electrostatics or bond, angle, dihedral, or improper calculations, computation and data transfer to the coprocessor will run concurrently with computations and MPI communications for these calculations on the host CPU. The USER-INTEL package has two modes for deciding which atoms will be handled by the coprocessor. This choice is controlled with the ghost keyword of the package intel command. When set to 0, ghost atoms (atoms at the borders between MPI tasks) are not offloaded to the card. This allows for overlap of MPI communication of forces with computation on the coprocessor when the newton setting is "on". The default is dependent on the style being used; however, better performance may be achieved by setting this option explicitly.

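The sketch promised in the balance item above (values are illustrative, and the keyword spellings follow the text in this section):

    package intel 1 balance 0.5   # offload half of the pair computation
    package intel 1 balance -1    # dynamic load balancing (the default)
    package intel 1 ghost 0       # do not offload ghost atoms; overlaps MPI force communication
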
<div class="section" id="restrictions">
|
|
<h2>Restrictions<a class="headerlink" href="#restrictions" title="Permalink to this headline">¶</a></h2>
|
|
<p>When offloading to a coprocessor, <a class="reference internal" href="pair_hybrid.html"><em>hybrid</em></a> styles
|
|
that require skip lists for neighbor builds cannot be offloaded.
|
|
Using <a class="reference internal" href="pair_hybrid.html"><em>hybrid/overlay</em></a> is allowed. Only one intel
|
|
accelerated style may be used with hybrid styles.
|
|
<a class="reference internal" href="special_bonds.html"><em>Special_bonds</em></a> exclusion lists are not currently
|
|
supported with offload, however, the same effect can often be
|
|
accomplished by setting cutoffs for excluded atom types to 0. None of
|
|
the pair styles in the USER-INTEL package currently support the
|
|
“inner”, “middle”, “outer” options for rRESPA integration via the
|
|
<a class="reference internal" href="run_style.html"><em>run_style respa</em></a> command; only the “pair” option is
|
|
supported.</p>
|
|