git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@13943 f3b2605a-c512-4ea7-a41b-209d697bcdaa
This commit is contained in:
sjplimp 2015-08-28 20:40:07 +00:00
parent abcda0a1fc
commit b0215cc367
3 changed files with 351 additions and 103 deletions

View File

@ -1753,36 +1753,91 @@ thermodynamic state and a total run time for the simulation. It then
appends statistics about the CPU time and storage requirements for the
simulation. An example set of statistics is shown here:
</P>
<PRE>Loop time of 49.002 on 2 procs for 2004 atoms
<PRE>Loop time of 2.81192 on 4 procs for 300 steps with 2004 atoms
97.0% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 18.436 ns/day 1.302 hours/ns 106.689 timesteps/s
</PRE>
<PRE>Pair time (%) = 35.0495 (71.5267)
Bond time (%) = 0.092046 (0.187841)
Kspce time (%) = 6.42073 (13.103)
Neigh time (%) = 2.73485 (5.5811)
Comm time (%) = 1.50291 (3.06703)
Outpt time (%) = 0.013799 (0.0281601)
Other time (%) = 2.13669 (4.36041)
<PRE>MPI task timings breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 1.9808 | 2.0134 | 2.0318 | 1.4 | 71.60
Bond | 0.0021894 | 0.0060319 | 0.010058 | 4.7 | 0.21
Kspace | 0.3207 | 0.3366 | 0.36616 | 3.1 | 11.97
Neigh | 0.28411 | 0.28464 | 0.28516 | 0.1 | 10.12
Comm | 0.075732 | 0.077018 | 0.07883 | 0.4 | 2.74
Output | 0.00030518 | 0.00042665 | 0.00078821 | 1.0 | 0.02
Modify | 0.086606 | 0.086631 | 0.086668 | 0.0 | 3.08
Other | | 0.007178 | | | 0.26
</PRE>
<PRE>Nlocal: 1002 ave, 1015 max, 989 min
Histogram: 1 0 0 0 0 0 0 0 0 1
Nghost: 8720 ave, 8724 max, 8716 min
Histogram: 1 0 0 0 0 0 0 0 0 1
Neighs: 354141 ave, 361422 max, 346860 min
Histogram: 1 0 0 0 0 0 0 0 0 1
<PRE>Nlocal: 501 ave 508 max 490 min
Histogram: 1 0 0 0 0 0 1 1 0 1
Nghost: 6586.25 ave 6628 max 6548 min
Histogram: 1 0 1 0 0 0 1 0 0 1
Neighs: 177007 ave 180562 max 170212 min
Histogram: 1 0 0 0 0 0 0 1 1 1
</PRE>
<PRE>Total # of neighbors = 708282
Ave neighs/atom = 353.434
<PRE>Total # of neighbors = 708028
Ave neighs/atom = 353.307
Ave special neighs/atom = 2.34032
Number of reneighborings = 42
Dangerous reneighborings = 2
Neighbor list builds = 26
Dangerous builds = 0
</PRE>
<P>The first section gives the breakdown of the CPU run time (in seconds)
into major categories. The second section lists the number of owned
atoms (Nlocal), ghost atoms (Nghost), and pair-wise neighbors stored
per processor. The max and min values give the spread of these values
across processors with a 10-bin histogram showing the distribution.
The total number of histogram counts is equal to the number of
processors.
<P>The first section provides a global loop timing summary. The loop time
is the total wall time for the section. The second line provides the
CPU utilization per MPI task; it should be close to 100% times the number
of OpenMP threads (or 100% if no OpenMP threads are used). Lower numbers
correspond to delays due to file I/O or insufficient thread utilization.
The <I>Performance</I> line is provided for convenience to help predict
the number of loop continuations required and to compare performance
with other similar MD codes.
</P>
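<P>As a rough illustration of how the <I>Performance</I> line relates to the
loop time, the sketch below recomputes it from the loop time and step count
shown in the example above. The 2 fs timestep is an assumption (it is not
printed in the output), and the function name is hypothetical.
</P>

```python
SECONDS_PER_DAY = 86400.0


def performance_line(loop_time_s, nsteps, timestep_fs):
    """Return (ns/day, hours/ns, timesteps/s) for a finished run.

    loop_time_s : wall time of the loop in seconds ("Loop time of ...")
    nsteps      : number of timesteps in the loop
    timestep_fs : timestep size in femtoseconds (assumed, not in the output)
    """
    steps_per_s = nsteps / loop_time_s
    ns_per_step = timestep_fs * 1.0e-6  # 1 fs = 1e-6 ns
    ns_per_day = steps_per_s * ns_per_step * SECONDS_PER_DAY
    hours_per_ns = 24.0 / ns_per_day
    return ns_per_day, hours_per_ns, steps_per_s


# Values taken from the example output; the 2.0 fs timestep is assumed.
ns_per_day, hours_per_ns, steps_per_s = performance_line(2.81192, 300, 2.0)
print(f"Performance: {ns_per_day:.3f} ns/day "
      f"{hours_per_ns:.3f} hours/ns {steps_per_s:.3f} timesteps/s")
```

<P>With the assumed 2 fs timestep, the three numbers agree with the example
output to the printed precision.
</P>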
<P>The second section gives the breakdown of the CPU run time (in seconds)
into major categories:
</P>
<UL><LI><I>Pair</I> stands for all non-bonded force computation
<LI><I>Bond</I> stands for bonded interactions: bonds, angles, dihedrals, impropers
<LI><I>Kspace</I> stands for reciprocal space interactions: Ewald, PPPM, MSM
<LI><I>Neigh</I> stands for neighbor list construction
<LI><I>Comm</I> stands for communicating atoms and their properties
<LI><I>Output</I> stands for writing dumps and thermo output
<LI><I>Modify</I> stands for fixes and computes called by them
<LI><I>Other</I> is the remaining time
</UL>
<P>For each category, there is a breakdown of the least, average and most
amount of wall time a processor spent on this section. The variation
from the average time is also listed. Together these numbers allow you
to gauge the amount of load imbalance in this segment of the calculation.
Ideally the difference between minimum, maximum and average is small and
thus the variation from the average is close to zero. The final column
shows the percentage of the total loop time spent in this section.
</P>
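<P>One possible way to use these columns is sketched below: the rows of the
example table are parsed and a simple relative spread, (max - min) / avg, is
computed per section. This spread is an illustrative measure chosen here; it
is not the formula behind the %varavg column, and the helper names are
hypothetical.
</P>

```python
TIMING_TABLE = """\
Pair | 1.9808 | 2.0134 | 2.0318 | 1.4 | 71.60
Bond | 0.0021894 | 0.0060319 | 0.010058 | 4.7 | 0.21
Kspace | 0.3207 | 0.3366 | 0.36616 | 3.1 | 11.97
Neigh | 0.28411 | 0.28464 | 0.28516 | 0.1 | 10.12
Comm | 0.075732 | 0.077018 | 0.07883 | 0.4 | 2.74
Output | 0.00030518 | 0.00042665 | 0.00078821 | 1.0 | 0.02
Modify | 0.086606 | 0.086631 | 0.086668 | 0.0 | 3.08
Other | | 0.007178 | | | 0.26
"""


def parse_rows(text):
    """Map section name -> (min, avg, max, percent of total)."""
    rows = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 6:
            continue
        name, tmin, tavg, tmax, _varavg, pct = fields
        if not tmin or not tmax:
            continue  # "Other" only reports an average
        rows[name] = (float(tmin), float(tavg), float(tmax), float(pct))
    return rows


def relative_spread(tmin, tavg, tmax):
    """Illustrative imbalance measure: (max - min) / avg."""
    return (tmax - tmin) / tavg


rows = parse_rows(TIMING_TABLE)
spreads = {name: relative_spread(*vals[:3]) for name, vals in rows.items()}
for name, spread in sorted(spreads.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} spread = {spread:6.3f}  ({rows[name][3]:5.2f}% of loop)")
```

<P>A large spread only matters when the section also accounts for a
noticeable share of the loop time, which is why the sketch prints both.
</P>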
<P>When using the <A HREF = "timers.html">timers full</A> setting, an additional column
is present that also prints the CPU utilization in percent. In addition,
when <I>timers full</I> and the <A HREF = "package.html">package omp</A> command are both
active, a similar timing summary of the time spent in threaded regions is
provided to monitor thread utilization and load balance. A new entry is
the <I>Reduce</I> section, which lists the time spent in reducing the per-thread
data elements to the storage for non-threaded computation. These thread
timings are taken from the first MPI rank only. Since the
breakdown for MPI tasks can change from MPI rank to MPI rank, this
breakdown can be very different for individual ranks. Here is an example
of this optional output section:
</P>
<PRE>Thread timings breakdown (MPI rank 0):
Total threaded time 0.6846 / 90.6%
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.5127 | 0.5147 | 0.5167 | 0.3 | 75.18
Bond | 0.0043139 | 0.0046779 | 0.0050418 | 0.5 | 0.68
Kspace | 0.070572 | 0.074541 | 0.07851 | 1.5 | 10.89
Neigh | 0.084778 | 0.086969 | 0.089161 | 0.7 | 12.70
Reduce | 0.0036485 | 0.003737 | 0.0038254 | 0.1 | 0.55
</PRE>
<P>The third section lists the number of owned atoms (Nlocal), ghost atoms
(Nghost), and pair-wise neighbors stored per processor. The max and min
values give the spread of these values across processors with a 10-bin
histogram showing the distribution. The total number of histogram counts
is equal to the number of processors.
</P>
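<P>The histogram lines can be read as in the following sketch, which bins
per-processor counts into 10 equal-width bins between the min and max values.
The four Nlocal values are hypothetical: they were chosen only to be
consistent with the ave/max/min and histogram shown in the example, and the
binning rule is an assumption rather than a statement of how LAMMPS computes
it.
</P>

```python
def histogram(values, nbins=10):
    """Count values in nbins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * nbins
    if hi == lo:
        # All processors hold the same count; put everything in one bin.
        counts[0] = len(values)
        return counts
    for v in values:
        idx = int((v - lo) * nbins / (hi - lo))
        counts[min(idx, nbins - 1)] += 1  # the max value goes in the last bin
    return counts


# Hypothetical per-processor owned-atom counts for 4 MPI tasks.
nlocal = [490, 502, 504, 508]
counts = histogram(nlocal)

print("Nlocal:", sum(nlocal) / len(nlocal), "ave", max(nlocal), "max",
      min(nlocal), "min")
print("Histogram:", " ".join(str(c) for c in counts))
```

<P>Each processor contributes exactly one count, so the entries always sum
to the number of processors.
</P>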
<P>The last section gives aggregate statistics for pair-wise neighbors
and special neighbors that LAMMPS keeps track of (see the
@ -1802,20 +1857,23 @@ takes place.
e.g.
</P>
<PRE>Minimization stats:
E initial, next-to-last, final = -0.895962 -2.94193 -2.94342
Gradient 2-norm init/final= 1920.78 20.9992
Gradient inf-norm init/final= 304.283 9.61216
Iterations = 36
Force evaluations = 177
Stopping criterion = linesearch alpha is zero
Energy initial, next-to-last, final =
-6372.3765206 -8328.46998942 -8328.46998942
Force two-norm initial, final = 1059.36 5.36874
Force max component initial, final = 58.6026 1.46872
Final line search alpha, max atom move = 2.7842e-10 4.0892e-10
Iterations, force evaluations = 701 1516
</PRE>
<P>The first line lists the initial and final energy, as well as the
energy on the next-to-last iteration. The next 2 lines give a measure
of the gradient of the energy (force on all atoms). The 2-norm is the
"length" of this force vector; the inf-norm is the largest component.
The last 2 lines are statistics on how many iterations and
force-evaluations the minimizer required. Multiple force evaluations
are typically done at each iteration to perform a 1d line minimization
in the search direction.
<P>The first line prints the criterion that caused the minimization
to stop. The third line lists the initial and final energy,
as well as the energy on the next-to-last iteration. The next 2 lines
give a measure of the gradient of the energy (force on all atoms).
The two-norm is the "length" of this force vector; the max component
(the inf-norm) is its largest component. The remaining lines give some
information about the line search and statistics on how many iterations
and force evaluations the minimizer required. Multiple force evaluations
are typically done at each iteration to perform a 1d line minimization
in the search direction.
</P>
<P>If a <A HREF = "kspace_style.html">kspace_style</A> long-range Coulombics solve was
performed during the run (PPPM, Ewald), then additional information is

View File

@ -2069,18 +2069,22 @@
<dt><a href="thermo.html#index-0">thermo</a>
</dt>
</dl></td>
<td style="width: 33%" valign="top"><dl>
<dt><a href="thermo_modify.html#index-0">thermo_modify</a>
</dt>
</dl></td>
<td style="width: 33%" valign="top"><dl>
<dt><a href="thermo_style.html#index-0">thermo_style</a>
</dt>
<dt><a href="timers.html#index-0">timestep</a>, <a href="timestep.html#index-0">[1]</a>
<dt><a href="timer.html#index-0">timer</a>
</dt>
<dt><a href="timestep.html#index-0">timestep</a>
</dt>
</dl></td>

View File

@ -1,77 +1,263 @@
<HTML>
<CENTER><A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> - <A HREF = "Manual.html">LAMMPS Documentation</A> - <A HREF = "Section_commands.html#comm">LAMMPS Commands</A>
</CENTER>
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>timers command &mdash; LAMMPS 15 May 2015 version documentation</title>
<link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
<link rel="stylesheet" href="_static/sphinxcontrib-images/LightBox2/lightbox2/css/lightbox.css" type="text/css" />
<link rel="top" title="LAMMPS 15 May 2015 version documentation" href="index.html"/>
<script src="_static/js/modernizr.min.js"></script>
</head>
<body class="wy-body-for-nav" role="document">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-nav-search">
<a href="Manual.html" class="icon icon-home"> LAMMPS
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
<input type="text" name="q" placeholder="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="Section_intro.html">1. Introduction</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_start.html">2. Getting Started</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_commands.html">3. Commands</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_packages.html">4. Packages</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_accelerate.html">5. Accelerating LAMMPS performance</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_howto.html">6. How-to discussions</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_example.html">7. Example problems</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_perf.html">8. Performance &amp; scalability</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_tools.html">9. Additional tools</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_modify.html">10. Modifying &amp; extending LAMMPS</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_python.html">11. Python interface to LAMMPS</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_errors.html">12. Errors</a></li>
<li class="toctree-l1"><a class="reference internal" href="Section_history.html">13. Future and history</a></li>
</ul>
</div>
&nbsp;
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
<nav class="wy-nav-top" role="navigation" aria-label="top navigation">
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="Manual.html">LAMMPS</a>
</nav>
<HR>
<H3>timers command
</H3>
<P><B>Syntax:</B>
</P>
<PRE>timers args
</PRE>
<LI><I>args</I> = one or more of <I>off</I> or <I>loop</I> or <I>normal</I> or <I>full</I> or <I>sync</I> or <I>nosync</I>
<PRE> <I>off</I> = do not collect and print timing information
<I>loop</I> = collect only the total time for the simulation loop
<I>normal</I> = collect timer information broken down in sections (default)
<I>full</I> = like <I>normal</I> but also include CPU and thread utilization
<I>sync</I> = explicitly synchronize MPI tasks between sections
<I>nosync</I> = do not synchronize MPI tasks when collecting timer info (default)
</PRE>
<P><B>Examples:</B>
</P>
<PRE>timers full sync
timers loop
</PRE>
<P><B>Description:</B>
</P>
<P>Select the level of detail at which LAMMPS performs its internal profiling.
</P>
<P>During regular runs LAMMPS will collect information about how much time is
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="breadcrumbs navigation">
<ul class="wy-breadcrumbs">
<li><a href="Manual.html">Docs</a> &raquo;</li>
<li>timers command</li>
<li class="wy-breadcrumbs-aside">
<a href="http://lammps.sandia.gov">Website</a>
<a href="Section_commands.html#comm">Commands</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<div class="section" id="timestep-command">
<span id="index-0"></span><h1>timers command<a class="headerlink" href="#timestep-command" title="Permalink to this headline"></a></h1>
<div class="section" id="syntax">
<h2>Syntax<a class="headerlink" href="#syntax" title="Permalink to this headline"></a></h2>
<div class="highlight-python"><div class="highlight"><pre>timers args
</pre></div>
</div>
<ul class="simple">
<li><em>args</em> = one or more of <em>off</em> or <em>loop</em> or <em>normal</em> or <em>full</em> or <em>sync</em> or <em>nosync</em></li>
</ul>
<pre class="literal-block">
<em>off</em> = do not collect and print timing information
<em>loop</em> = collect only the total time for the simulation loop
<em>normal</em> = collect timer information broken down in sections (default)
<em>full</em> = like <em>normal</em> but also include CPU and thread utilization
<em>sync</em> = explicitly synchronize MPI tasks between sections
<em>nosync</em> = do not synchronize MPI tasks when collecting timer info (default)
</pre>
</div>
<div class="section" id="examples">
<h2>Examples<a class="headerlink" href="#examples" title="Permalink to this headline"></a></h2>
<div class="highlight-python"><div class="highlight"><pre>timers full sync
timers loop
</pre></div>
</div>
</div>
<div class="section" id="description">
<h2>Description<a class="headerlink" href="#description" title="Permalink to this headline"></a></h2>
<p>Select the level of detail at which LAMMPS performs its internal profiling.</p>
<p>During regular runs LAMMPS will collect information about how much time is
spent in different sections of the code and thus can provide valuable
information for determining performance and load imbalance problems. This
can be done at different levels of detail and accuracy. For more
information about the timing output, please have a look at the <A HREF = "Section_start.html#start_8">discussion
of screen output</A>.
</P>
<P>The <I>off</I> setting will turn all time measurements off. The <I>loop</I> setting
can be done at different levels of detail and accuracy. For more
information about the timing output, please have a look at the <a class="reference internal" href="Section_start.html#start-8"><span>discussion of screen output</span></a>.</p>
<p>The <em>off</em> setting will turn all time measurements off. The <em>loop</em> setting
will only measure the total time of the run loop and not collect any detailed
per-section information. With the <I>normal</I> setting, timing information for
per-section information. With the <em>normal</em> setting, timing information for
individual sections of the code is collected, and information about
load imbalances inside those sections is also presented. The <I>full</I> setting adds
load imbalances inside those sections is also presented. The <em>full</em> setting adds
information about CPU utilization and thread utilization, when multi-threading
is enabled.
</P>
<P>With the <I>sync</I> setting, all MPI tasks are synchronized at each timer call
is enabled.</p>
<p>With the <em>sync</em> setting, all MPI tasks are synchronized at each timer call,
which allows load imbalance to be studied more accurately, but this usually
has some performance impact. Using the <I>nosync</I> setting this can be turned
off (which is the default).
</P>
<P>Multiple keywords can be provided; for keywords that are mutually
exclusive, the last one given in that group takes effect.
</P>
<P>IMPORTANT NOTE: Using the <I>full</I> and <I>sync</I> options provides the most
has some performance impact. The <em>nosync</em> setting turns this synchronization
off (which is the default).</p>
<p>Multiple keywords can be provided; for keywords that are mutually
exclusive, the last one given in that group takes effect.</p>
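<p>The "last one wins" rule can be pictured with the following sketch. It is
not the LAMMPS parser; the grouping of keywords into a detail level
(<em>off</em>, <em>loop</em>, <em>normal</em>, <em>full</em>) and a
synchronization mode (<em>sync</em>, <em>nosync</em>) follows the syntax
description above, and the function name is hypothetical.</p>

```python
LEVEL_KEYWORDS = ("off", "loop", "normal", "full")
SYNC_KEYWORDS = ("sync", "nosync")


def resolve_timers(args, level="normal", sync="nosync"):
    """Return the effective (level, sync) pair for a list of keywords.

    The defaults mirror the documented default "timers normal nosync".
    Within each mutually exclusive group, a later keyword overrides an
    earlier one.
    """
    for arg in args:
        if arg in LEVEL_KEYWORDS:
            level = arg
        elif arg in SYNC_KEYWORDS:
            sync = arg
        else:
            raise ValueError(f"unknown timers keyword: {arg!r}")
    return level, sync


print(resolve_timers(["full", "sync"]))
print(resolve_timers(["loop"]))
print(resolve_timers(["full", "loop", "sync", "nosync"]))
```

<p>In the third call both groups are given twice, so only the last keyword
of each group (<em>loop</em> and <em>nosync</em>) remains in effect.</p>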
<div class="admonition warning">
<p class="first admonition-title">Warning</p>
<p class="last">Using the <em>full</em> and <em>sync</em> options provides the most
detailed and accurate timing information, but also can have a significant
negative performance impact due to the overhead of the many required system
calls. It is thus recommended to use these settings only when running tests
to assess performance. For calculations with few atoms or a very
large number of processors, even using the <I>normal</I> setting can have
large number of processors, even using the <em>normal</em> setting can have
a measurable performance impact. It is recommended in those cases to use
the <I>loop</I> or <I>off</I> setting.
</P>
<P><B>Restrictions:</B> none
</P>
<P><B>Related commands:</B>
<A HREF = "run.html">run post no</A>, <A HREF = "kspace_modify.html">kspace_modify fftbench</A>
</P>
<P><B>Default:</B>
</P>
<P>timers normal nosync
</P>
</HTML>
the <em>loop</em> or <em>off</em> setting.</p>
</div>
</div>
<div class="section" id="restrictions">
<h2>Restrictions<a class="headerlink" href="#restrictions" title="Permalink to this headline"></a></h2>
<blockquote>
<div>none</div></blockquote>
</div>
<div class="section" id="related-commands">
<h2>Related commands<a class="headerlink" href="#related-commands" title="Permalink to this headline"></a></h2>
<p><a class="reference internal" href="run.html"><em>run post no</em></a>, <a class="reference internal" href="kspace_modify.html"><em>kspace_modify fftbench</em></a></p>
</div>
<div class="section" id="default">
<h2>Default<a class="headerlink" href="#default" title="Permalink to this headline"></a></h2>
<p>timers normal nosync</p>
</div>
</div>
</div>
</div>
<footer>
<hr/>
<div role="contentinfo">
<p>
&copy; Copyright .
</p>
</div>
Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT:'./',
VERSION:'15 May 2015 version',
COLLAPSE_INDEX:false,
FILE_SUFFIX:'.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="_static/jquery.js"></script>
<script type="text/javascript" src="_static/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="_static/sphinxcontrib-images/LightBox2/lightbox2/js/jquery-1.11.0.min.js"></script>
<script type="text/javascript" src="_static/sphinxcontrib-images/LightBox2/lightbox2/js/lightbox.min.js"></script>
<script type="text/javascript" src="_static/sphinxcontrib-images/LightBox2/lightbox2-customize/jquery-noconflict.js"></script>
<script type="text/javascript" src="_static/js/theme.js"></script>
<script type="text/javascript">
jQuery(function () {
SphinxRtdTheme.StickyNav.enable();
});
</script>
</body>
</html>