git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@14267 f3b2605a-c512-4ea7-a41b-209d697bcdaa
sjplimp 2015-11-18 18:25:02 +00:00
parent 77f8955d4e
commit 0bdd0e36cc
1 changed file with 8 additions and 8 deletions


@@ -38,7 +38,7 @@
<I>gpu</I> args = Ngpu keyword value ...
Ngpu = # of GPUs per node
zero or more keyword/value pairs may be appended
keywords = <I>neigh</I> or <I>newton</I> or <I>binsize</I> or <I>split</I> or <I>gpuID</I> or <I>tpa</I> or <I>device</I>
keywords = <I>neigh</I> or <I>newton</I> or <I>binsize</I> or <I>split</I> or <I>gpuID</I> or <I>tpa</I> or <I>device</I> or <I>blocksize</I>
<I>neigh</I> value = <I>yes</I> or <I>no</I>
yes = neighbor list build on GPU (default)
no = neighbor list build on CPU
@@ -320,13 +320,6 @@ large cutoffs or with a small number of particles per GPU, increasing
the value can improve performance. The number of threads per atom must
be a power of 2 and currently cannot be greater than 32.
</P>
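<P>(Note, not part of this commit: as an illustrative sketch of the <I>tpa</I> keyword discussed above, assuming the <I>package gpu</I> syntax listed at the top of this page, 8 threads per atom, a power of 2 no greater than 32, could be requested as follows.)
</P>
<PRE>package gpu 1 tpa 8
</PRE>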
<P>The <I>blocksize</I> keyword allows you to tweak the number of threads used
per thread block. This number should be a multiple of 32 (for GPUs)
and its maximum depends on the specific GPU hardware. Typical choices
are 64, 128, or 256. A larger blocksize increases occupancy of
individual GPU cores, but reduces the total number of thread blocks,
and thus may lead to load imbalance.
</P>
<P>The <I>device</I> keyword can be used to tune parameters optimized for a
specific accelerator, when using OpenCL. For CUDA, the <I>device</I>
keyword is ignored. Currently, the device type is limited to NVIDIA
@@ -335,6 +328,13 @@ may be added later. The default device type can be specified when
building LAMMPS with the GPU library, via settings in the
lib/gpu/Makefile that is used.
</P>
<P>The <I>blocksize</I> keyword allows you to tweak the number of threads used
per thread block. This number should be a multiple of 32 (for GPUs)
and its maximum depends on the specific GPU hardware. Typical choices
are 64, 128, or 256. A larger blocksize increases occupancy of
individual GPU cores, but reduces the total number of thread blocks,
and thus may lead to load imbalance.
</P>
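<P>(Note, not part of this commit: as an illustrative sketch combining keywords documented above, and assuming the <I>package gpu</I> syntax listed at the top of this page, a block size of 128, one of the typical multiple-of-32 choices mentioned above, could be selected together with GPU neighbor list builds as follows.)
</P>
<PRE>package gpu 2 neigh yes blocksize 128
</PRE>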
<HR>
<P>The <I>intel</I> style invokes settings associated with the use of the