git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@12394 f3b2605a-c512-4ea7-a41b-209d697bcdaa

This commit is contained in:
sjplimp 2014-09-04 15:39:04 +00:00
parent f3e11380a7
commit 0c0e5c356f
1 changed file with 49 additions and 54 deletions


@@ -1,68 +1,63 @@
These are input scripts used to run versions of several of the
benchmarks in the top-level bench directory using the GPU and
USER-CUDA accelerator packages. The results of running these scripts
on two different machines (a desktop with 2 Tesla GPUs and the ORNL
Titan supercomputer) are shown on the "GPU (Fermi)" section of the
Benchmark page of the LAMMPS WWW site: lammps.sandia.gov/bench.
These are build, input, and run scripts used to run the LJ benchmark
in the top-level bench directory using all the various accelerator
packages currently available in LAMMPS. The results of running these
benchmarks on a GPU cluster with Kepler GPUs are shown on the "GPU
(Kepler)" section of the Benchmark page of the LAMMPS WWW site:
lammps.sandia.gov/bench.
Examples are shown below of how to run these scripts. This assumes
you have built 3 executables with both the GPU and USER-CUDA packages
installed, e.g.
The specifics of the benchmark machine are as follows:
lmp_linux_single
lmp_linux_mixed
lmp_linux_double
The precision (single, mixed, double) refers to the GPU and USER-CUDA
package precision. See the README files in the lib/gpu and lib/cuda
directories for instructions on how to build the packages with
different precisions. The doc/Section_accelerate.html file also has a
summary description.
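As a rough sketch only (Makefile names and precision flags vary by platform
and LAMMPS version; the lib/gpu and lib/cuda READMEs are authoritative), a
build of one precision might look like:
cd lib/gpu && make -f Makefile.linux   # GPU library; precision chosen via CUDA_PRECISION in the Makefile
cd ../cuda && make                     # USER-CUDA library; precision set as described in its README
cd ../../src
make yes-gpu yes-user-cuda             # install both accelerator packages
make linux                             # repeat per precision, renaming the executable each time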
It is a small GPU cluster at Sandia National Labs called "shannon". It
has 32 nodes, each with two 8-core Sandy Bridge Xeon CPUs (E5-2670,
2.6GHz, HT deactivated), for a total of 512 cores. Twenty-four of the
nodes have two NVIDIA Kepler GPUs (K20x, 2688 cores at 732 MHz). LAMMPS
was compiled with the Intel icc compiler, using module
openmpi/1.8.1/intel/13.1.SP1.106/cuda/6.0.37.
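For reference, loading that toolchain on such a cluster typically amounts to
a single module command (exact module names vary from system to system):
module load openmpi/1.8.1/intel/13.1.SP1.106/cuda/6.0.37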
------------------------------------------------------------------------
If the script has "cpu" in its name, it is meant to be run in CPU-only
mode (without using the GPU or USER-CUDA styles). For example:
You can of course build LAMMPS yourself with any of the accelerator
packages for your platform.
mpirun -np 1 ../lmp_linux_double -v x 8 -v y 8 -v z 8 -v t 100 < in.lj.cpu
mpirun -np 12 ../lmp_linux_double -v x 16 -v y 16 -v z 16 -v t 100 < in.lj.cpu
The build.py script will build LAMMPS for the various accelerator
packages using the Makefile.* files in this dir, which you can edit if
necessary for your platform. You must set the "lmpdir" variable at
the top of build.py to the home directory of LAMMPS as installed on
your system. Then typing, for example,
The "xyz" settings determine the problem size. The "t" setting
determines the number of timesteps.
python build.py cpu gpu
will build executables for the CPU (no accelerators), and 3 GPU
variants (double, mixed, single precision). See the list
of possible targets at the top of the build.py script.
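The lmpdir setting mentioned above is a plain Python assignment at the top
of build.py; as a hypothetical illustration:
lmpdir = "/home/me/lammps"    # hypothetical path; point it at your LAMMPS home directory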
Note that the build.py script will un-install all packages in LAMMPS,
then only install the ones needed for the benchmark. The Makefile.*
files in this dir are copied into lammps/src/MAKE, as a dummy
Makefile.foo, so they will not conflict with makefiles that may
already be there. The build.py script also builds the auxiliary
GPU and USER-CUDA library as needed.
The various LAMMPS executables are copied into this directory
when the build.py script finishes each build.
------------------------------------------------------------------------
If the script has "gpu" in its name, it is meant to be run using
the GPU package. For example:
mpirun -np 12 ../lmp_linux_single -sf gpu -v g 1 -v x 32 -v y 32 -v z 64 -v t 100 < in.lj.gpu
mpirun -np 8 ../lmp_linux_mixed -sf gpu -v g 2 -v x 32 -v y 32 -v z 64 -v t 100 < in.lj.gpu
The "xyz" settings determine the problem size. The "t" setting
determines the number of timesteps. The "np" setting determines how
many MPI tasks per compute node the problem will run on, and the "g"
setting determines how many GPUs per compute node the problem will run
on, i.e. 1 or 2 in this case. Note that you can use more MPI tasks
than GPUs (both per compute node) with the GPU package.
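For instance, on a 16-core node with 2 GPUs, an oversubscribed run (a
hypothetical variation on the commands above) would look like:
mpirun -np 16 ../lmp_linux_mixed -sf gpu -v g 2 -v x 32 -v y 32 -v z 64 -v t 100 < in.lj.gpu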
The in.* files have settings for the benchmark appropriate to each
accelerator package. Many of them, including the problem size
and the number of timesteps, must be set as command-line arguments
when the input script is run.
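As a rough sketch only (the actual in.* files differ in their details), those
command-line variables are consumed with standard LAMMPS variable syntax, e.g.:
variable  xx equal 20*$x                     # -v x 32 scales the box in x
variable  yy equal 20*$y
variable  zz equal 20*$z
region    box block 0 ${xx} 0 ${yy} 0 ${zz}  # problem size from x,y,z
run       $t                                 # -v t 100 sets the number of timesteps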
------------------------------------------------------------------------
If the script has "cuda" in its name, it is meant to be run using
the USER-CUDA package. For example:
The run*.sh scripts have sample mpirun commands for running the input
scripts on a single node. These are provided for illustration
purposes, to show what command-line arguments are used with each
accelerator package, in combination with settings in the input scripts
themselves.
mpirun -np 1 ../lmp_linux_single -c on -sf cuda -v g 1 -v x 16 -v y 16 -v z 16 -v t 100 < in.lj.cuda
mpirun -np 2 ../lmp_linux_double -c on -sf cuda -v g 2 -v x 32 -v y 64 -v z 64 -v t 100 < in.eam.cuda
The "xyz" settings determine the problem size. The "t" setting
determines the number of timesteps. The "np" setting determines how
many MPI tasks per compute node the problem will run on, and the "g"
setting determines how many GPUs per compute node the problem will run
on, i.e. 1 or 2 in this case. For the USER-CUDA package, the number
of MPI tasks and GPUs (both per compute node) must be equal.
------------------------------------------------------------------------
If the script has "titan" in its name, it was run on the Titan supercomputer
at ORNL.
Note that we generate these run scripts, either for interactive or
batch submission, via Python scripts which produce a long list of runs
to exercise a combination of options. To perform a quick benchmark
calculation on your platform, you will typically only want to run a
few commands out of any run*.sh script.
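For a quick test it is usually enough to copy one or two mpirun lines out of
a run*.sh file by hand, e.g. (script name hypothetical):
grep mpirun run_gpu.sh | head -2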