mirror of https://github.com/lammps/lammps.git
git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@11976 f3b2605a-c512-4ea7-a41b-209d697bcdaa
This commit is contained in:
parent
d07b9f70e4
commit
d17e06c479
|
@ -1,266 +0,0 @@
|
|||
<HTML>
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<TITLE>LAMMPS Users Manual</TITLE>
|
||||
<META NAME="docnumber" CONTENT="10 May 2014 version">
|
||||
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
|
||||
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
|
||||
</HEAD>
|
||||
|
||||
<BODY>
|
||||
|
||||
<CENTER><A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> - <A HREF = "Manual.html">LAMMPS Documentation</A> - <A HREF = "Section_commands.html#comm">LAMMPS Commands</A>
|
||||
</CENTER>
|
||||
|
||||
<HR>
|
||||
|
||||
<H1></H1>
|
||||
|
||||
<P><CENTER><H3>LAMMPS Documentation
|
||||
</H3></CENTER>
|
||||
<CENTER><H4>10 May 2014 version
|
||||
</H4></CENTER>
|
||||
<H4>Version info:
|
||||
</H4>
|
||||
<P>The LAMMPS "version" is the date when it was released, such as 1 May
|
||||
2010. LAMMPS is updated continuously. Whenever we fix a bug or add a
|
||||
feature, we release it immediately, and post a notice on <A HREF = "http://lammps.sandia.gov/bug.html">this page of
|
||||
the WWW site</A>. Each dated copy of LAMMPS contains all the
|
||||
features and bug-fixes up to and including that version date. The
|
||||
version date is printed to the screen and logfile every time you run
|
||||
LAMMPS. It is also in the file src/version.h and in the LAMMPS
|
||||
directory name created when you unpack a tarball, and at the top of
|
||||
the first page of the manual (this page).
|
||||
</P>
|
||||
<UL><LI>If you browse the HTML doc pages on the LAMMPS WWW site, they always
|
||||
describe the most current version of LAMMPS.
|
||||
</P>
|
||||
<P><LI>If you browse the HTML doc pages included in your tarball, they
|
||||
describe the version you have.
|
||||
</P>
|
||||
<P><LI>The <A HREF = "Manual.pdf">PDF file</A> on the WWW site or in the tarball is updated
|
||||
about once per month. This is because it is large, and we don't want
|
||||
it to be part of every patch.
|
||||
</P>
|
||||
<LI>There is also a <A HREF = "Developer.pdf">Developer.pdf</A> file in the doc
|
||||
directory, which describes the internal structure and algorithms of
|
||||
LAMMPS.
|
||||
</UL>
|
||||
<P>LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel
|
||||
Simulator.
|
||||
</P>
|
||||
<P>LAMMPS is a classical molecular dynamics simulation code designed to
|
||||
run efficiently on parallel computers. It was developed at Sandia
|
||||
National Laboratories, a US Department of Energy facility, with
|
||||
funding from the DOE. It is an open-source code, distributed freely
|
||||
under the terms of the GNU General Public License (GPL).
|
||||
</P>
|
||||
<P>The primary developers of LAMMPS are <A HREF = "http://www.sandia.gov/~sjplimp">Steve Plimpton</A>, Aidan
|
||||
Thompson, and Paul Crozier, who can be contacted at
|
||||
sjplimp,athomps,pscrozi at sandia.gov. The <A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> at
|
||||
http://lammps.sandia.gov has more information about the code and its
|
||||
uses.
|
||||
</P>
|
||||
|
||||
<HR>
|
||||
|
||||
<P>The LAMMPS documentation is organized into the following sections. If
|
||||
you find errors or omissions in this manual or have suggestions for
|
||||
useful information to add, please send an email to the developers so
|
||||
we can improve the LAMMPS documentation.
|
||||
</P>
|
||||
<P>Once you are familiar with LAMMPS, you may want to bookmark <A HREF = "Section_commands.html#comm">this
|
||||
page</A> at Section_commands.html#comm since
|
||||
it gives quick access to documentation for all LAMMPS commands.
|
||||
</P>
|
||||
<P><A HREF = "Manual.pdf">PDF file</A> of the entire manual, generated by
|
||||
<A HREF = "http://www.easysw.com/htmldoc">htmldoc</A>
|
||||
</P>
|
||||
<OL><LI><A HREF = "Section_intro.html">Introduction</A>
|
||||
|
||||
<UL> 1.1 <A HREF = "Section_intro.html#intro_1">What is LAMMPS</A>
|
||||
<BR>
|
||||
1.2 <A HREF = "Section_intro.html#intro_2">LAMMPS features</A>
|
||||
<BR>
|
||||
1.3 <A HREF = "Section_intro.html#intro_3">LAMMPS non-features</A>
|
||||
<BR>
|
||||
1.4 <A HREF = "Section_intro.html#intro_4">Open source distribution</A>
|
||||
<BR>
|
||||
1.5 <A HREF = "Section_intro.html#intro_5">Acknowledgments and citations</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_start.html">Getting started</A>
|
||||
|
||||
<UL> 2.1 <A HREF = "Section_start.html#start_1">What's in the LAMMPS distribution</A>
|
||||
<BR>
|
||||
2.2 <A HREF = "Section_start.html#start_2">Making LAMMPS</A>
|
||||
<BR>
|
||||
2.3 <A HREF = "Section_start.html#start_3">Making LAMMPS with optional packages</A>
|
||||
<BR>
|
||||
2.4 <A HREF = "Section_start.html#start_4">Building LAMMPS via the Make.py script</A>
|
||||
<BR>
|
||||
2.5 <A HREF = "Section_start.html#start_5">Building LAMMPS as a library</A>
|
||||
<BR>
|
||||
2.6 <A HREF = "Section_start.html#start_6">Running LAMMPS</A>
|
||||
<BR>
|
||||
2.7 <A HREF = "Section_start.html#start_7">Command-line options</A>
|
||||
<BR>
|
||||
2.8 <A HREF = "Section_start.html#start_8">Screen output</A>
|
||||
<BR>
|
||||
2.9 <A HREF = "Section_start.html#start_9">Tips for users of previous versions</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_commands.html">Commands</A>
|
||||
|
||||
<UL> 3.1 <A HREF = "Section_commands.html#cmd_1">LAMMPS input script</A>
|
||||
<BR>
|
||||
3.2 <A HREF = "Section_commands.html#cmd_2">Parsing rules</A>
|
||||
<BR>
|
||||
3.3 <A HREF = "Section_commands.html#cmd_3">Input script structure</A>
|
||||
<BR>
|
||||
3.4 <A HREF = "Section_commands.html#cmd_4">Commands listed by category</A>
|
||||
<BR>
|
||||
3.5 <A HREF = "Section_commands.html#cmd_5">Commands listed alphabetically</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_packages.html">Packages</A>
|
||||
|
||||
<UL> 4.1 <A HREF = "Section_packages.html#pkg_1">Standard packages</A>
|
||||
<BR>
|
||||
4.2 <A HREF = "Section_packages.html#pkg_2">User packages</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_accelerate.html">Accelerating LAMMPS performance</A>
|
||||
|
||||
<UL> 5.1 <A HREF = "Section_accelerate.html#acc_1">Measuring performance</A>
|
||||
<BR>
|
||||
5.2 <A HREF = "Section_accelerate.html#acc_2">General strategies</A>
|
||||
<BR>
|
||||
5.3 <A HREF = "Section_accelerate.html#acc_3">Packages with optimized styles</A>
|
||||
<BR>
|
||||
5.4 <A HREF = "Section_accelerate.html#acc_4">OPT package</A>
|
||||
<BR>
|
||||
5.5 <A HREF = "Section_accelerate.html#acc_5">USER-OMP package</A>
|
||||
<BR>
|
||||
5.6 <A HREF = "Section_accelerate.html#acc_6">GPU package</A>
|
||||
<BR>
|
||||
5.7 <A HREF = "Section_accelerate.html#acc_7">USER-CUDA package</A>
|
||||
<BR>
|
||||
5.8 <A HREF = "Section_accelerate.html#acc_8">Comparison of GPU and USER-CUDA packages</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_howto.html">How-to discussions</A>
|
||||
|
||||
<UL> 6.1 <A HREF = "Section_howto.html#howto_1">Restarting a simulation</A>
|
||||
<BR>
|
||||
6.2 <A HREF = "Section_howto.html#howto_2">2d simulations</A>
|
||||
<BR>
|
||||
6.3 <A HREF = "Section_howto.html#howto_3">CHARMM and AMBER force fields</A>
|
||||
<BR>
|
||||
6.4 <A HREF = "Section_howto.html#howto_4">Running multiple simulations from one input script</A>
|
||||
<BR>
|
||||
6.5 <A HREF = "Section_howto.html#howto_5">Multi-replica simulations</A>
|
||||
<BR>
|
||||
6.6 <A HREF = "Section_howto.html#howto_6">Granular models</A>
|
||||
<BR>
|
||||
6.7 <A HREF = "Section_howto.html#howto_7">TIP3P water model</A>
|
||||
<BR>
|
||||
6.8 <A HREF = "Section_howto.html#howto_8">TIP4P water model</A>
|
||||
<BR>
|
||||
6.9 <A HREF = "Section_howto.html#howto_9">SPC water model</A>
|
||||
<BR>
|
||||
6.10 <A HREF = "Section_howto.html#howto_10">Coupling LAMMPS to other codes</A>
|
||||
<BR>
|
||||
6.11 <A HREF = "Section_howto.html#howto_11">Visualizing LAMMPS snapshots</A>
|
||||
<BR>
|
||||
6.12 <A HREF = "Section_howto.html#howto_12">Triclinic (non-orthogonal) simulation boxes</A>
|
||||
<BR>
|
||||
6.13 <A HREF = "Section_howto.html#howto_13">NEMD simulations</A>
|
||||
<BR>
|
||||
6.14 <A HREF = "Section_howto.html#howto_14">Finite-size spherical and aspherical particles</A>
|
||||
<BR>
|
||||
6.15 <A HREF = "Section_howto.html#howto_15">Output from LAMMPS (thermo, dumps, computes, fixes, variables)</A>
|
||||
<BR>
|
||||
6.16 <A HREF = "Section_howto.html#howto_16">Thermostatting, barostatting, and compute temperature</A>
|
||||
<BR>
|
||||
6.17 <A HREF = "Section_howto.html#howto_17">Walls</A>
|
||||
<BR>
|
||||
6.18 <A HREF = "Section_howto.html#howto_18">Elastic constants</A>
|
||||
<BR>
|
||||
6.19 <A HREF = "Section_howto.html#howto_19">Library interface to LAMMPS</A>
|
||||
<BR>
|
||||
6.20 <A HREF = "Section_howto.html#howto_20">Calculating thermal conductivity</A>
|
||||
<BR>
|
||||
6.21 <A HREF = "Section_howto.html#howto_21">Calculating viscosity</A>
|
||||
<BR>
|
||||
6.22 <A HREF = "Section_howto.html#howto_22">Calculating a diffusion coefficient</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_example.html">Example problems</A>
|
||||
|
||||
<LI><A HREF = "Section_perf.html">Performance & scalability</A>
|
||||
|
||||
<LI><A HREF = "Section_tools.html">Additional tools</A>
|
||||
|
||||
<LI><A HREF = "Section_modify.html">Modifying & extending LAMMPS</A>
|
||||
|
||||
<UL> 10.1 <A HREF = "Section_modify.html#mod_1">Atom styles</A>
|
||||
<BR>
|
||||
10.2 <A HREF = "Section_modify.html#mod_2">Bond, angle, dihedral, improper potentials</A>
|
||||
<BR>
|
||||
10.3 <A HREF = "Section_modify.html#mod_3">Compute styles</A>
|
||||
<BR>
|
||||
10.4 <A HREF = "Section_modify.html#mod_4">Dump styles</A>
|
||||
<BR>
|
||||
10.5 <A HREF = "Section_modify.html#mod_5">Dump custom output options</A>
|
||||
<BR>
|
||||
10.6 <A HREF = "Section_modify.html#mod_6">Fix styles</A>
|
||||
<BR>
|
||||
10.7 <A HREF = "Section_modify.html#mod_7">Input script commands</A>
|
||||
<BR>
|
||||
10.8 <A HREF = "Section_modify.html#mod_8">Kspace computations</A>
|
||||
<BR>
|
||||
10.9 <A HREF = "Section_modify.html#mod_9">Minimization styles</A>
|
||||
<BR>
|
||||
10.10 <A HREF = "Section_modify.html#mod_10">Pairwise potentials</A>
|
||||
<BR>
|
||||
10.11 <A HREF = "Section_modify.html#mod_11">Region styles</A>
|
||||
<BR>
|
||||
10.12 <A HREF = "Section_modify.html#mod_12">Body styles</A>
|
||||
<BR>
|
||||
10.13 <A HREF = "Section_modify.html#mod_13">Thermodynamic output options</A>
|
||||
<BR>
|
||||
10.14 <A HREF = "Section_modify.html#mod_14">Variable options</A>
|
||||
<BR>
|
||||
10.15 <A HREF = "Section_modify.html#mod_15">Submitting new features for inclusion in LAMMPS</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_python.html">Python interface</A>
|
||||
|
||||
<UL> 11.1 <A HREF = "Section_python.html#py_1">Building LAMMPS as a shared library</A>
|
||||
<BR>
|
||||
11.2 <A HREF = "Section_python.html#py_2">Installing the Python wrapper into Python</A>
|
||||
<BR>
|
||||
11.3 <A HREF = "Section_python.html#py_3">Extending Python with MPI to run in parallel</A>
|
||||
<BR>
|
||||
11.4 <A HREF = "Section_python.html#py_4">Testing the Python-LAMMPS interface</A>
|
||||
<BR>
|
||||
11.5 <A HREF = "Section_python.html#py_5">Using LAMMPS from Python</A>
|
||||
<BR>
|
||||
11.6 <A HREF = "Section_python.html#py_6">Example Python scripts that use LAMMPS</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_errors.html">Errors</A>
|
||||
|
||||
<UL> 12.1 <A HREF = "Section_errors.html#err_1">Common problems</A>
|
||||
<BR>
|
||||
12.2 <A HREF = "Section_errors.html#err_2">Reporting bugs</A>
|
||||
<BR>
|
||||
12.3 <A HREF = "Section_errors.html#err_3">Error & warning messages</A>
|
||||
<BR></UL>
|
||||
<LI><A HREF = "Section_history.html">Future and history</A>
|
||||
|
||||
<UL> 13.1 <A HREF = "Section_history.html#hist_1">Coming attractions</A>
|
||||
<BR>
|
||||
13.2 <A HREF = "Section_history.html#hist_2">Past versions</A>
|
||||
<BR></UL>
|
||||
|
||||
</OL>
|
||||
|
||||
</BODY>
|
||||
|
||||
</HTML>
|
||||
|
||||
</HTML>
|
|
@ -309,7 +309,7 @@ in the command's documentation.
|
|||
</P>
|
||||
<P>Settings:
|
||||
</P>
|
||||
<P><A HREF = "communicate.html">communicate</A>, <A HREF = "group.html">group</A>, <A HREF = "mass.html">mass</A>,
|
||||
<P><A HREF = "comm_style.html">comm_style</A>, <A HREF = "group.html">group</A>, <A HREF = "mass.html">mass</A>,
|
||||
<A HREF = "min_modify.html">min_modify</A>, <A HREF = "min_style.html">min_style</A>,
|
||||
<A HREF = "neigh_modify.html">neigh_modify</A>, <A HREF = "neighbor.html">neighbor</A>,
|
||||
<A HREF = "reset_timestep.html">reset_timestep</A>, <A HREF = "run_style.html">run_style</A>,
|
||||
|
@ -362,20 +362,21 @@ in the command's documentation.
|
|||
</P>
|
||||
<DIV ALIGN=center><TABLE BORDER=1 >
|
||||
<TR ALIGN="center"><TD ><A HREF = "angle_coeff.html">angle_coeff</A></TD><TD ><A HREF = "angle_style.html">angle_style</A></TD><TD ><A HREF = "atom_modify.html">atom_modify</A></TD><TD ><A HREF = "atom_style.html">atom_style</A></TD><TD ><A HREF = "balance.html">balance</A></TD><TD ><A HREF = "bond_coeff.html">bond_coeff</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "bond_style.html">bond_style</A></TD><TD ><A HREF = "boundary.html">boundary</A></TD><TD ><A HREF = "box.html">box</A></TD><TD ><A HREF = "change_box.html">change_box</A></TD><TD ><A HREF = "clear.html">clear</A></TD><TD ><A HREF = "communicate.html">communicate</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "compute.html">compute</A></TD><TD ><A HREF = "compute_modify.html">compute_modify</A></TD><TD ><A HREF = "create_atoms.html">create_atoms</A></TD><TD ><A HREF = "create_box.html">create_box</A></TD><TD ><A HREF = "delete_atoms.html">delete_atoms</A></TD><TD ><A HREF = "delete_bonds.html">delete_bonds</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "dielectric.html">dielectric</A></TD><TD ><A HREF = "dihedral_coeff.html">dihedral_coeff</A></TD><TD ><A HREF = "dihedral_style.html">dihedral_style</A></TD><TD ><A HREF = "dimension.html">dimension</A></TD><TD ><A HREF = "displace_atoms.html">displace_atoms</A></TD><TD ><A HREF = "dump.html">dump</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "dump_image.html">dump image</A></TD><TD ><A HREF = "dump_modify.html">dump_modify</A></TD><TD ><A HREF = "dump_image.html">dump movie</A></TD><TD ><A HREF = "echo.html">echo</A></TD><TD ><A HREF = "fix.html">fix</A></TD><TD ><A HREF = "fix_modify.html">fix_modify</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "group.html">group</A></TD><TD ><A HREF = "if.html">if</A></TD><TD ><A HREF = "improper_coeff.html">improper_coeff</A></TD><TD ><A HREF = "improper_style.html">improper_style</A></TD><TD ><A HREF = "include.html">include</A></TD><TD ><A HREF = "jump.html">jump</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "kspace_modify.html">kspace_modify</A></TD><TD ><A HREF = "kspace_style.html">kspace_style</A></TD><TD ><A HREF = "label.html">label</A></TD><TD ><A HREF = "lattice.html">lattice</A></TD><TD ><A HREF = "log.html">log</A></TD><TD ><A HREF = "mass.html">mass</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "minimize.html">minimize</A></TD><TD ><A HREF = "min_modify.html">min_modify</A></TD><TD ><A HREF = "min_style.html">min_style</A></TD><TD ><A HREF = "molecule.html">molecule</A></TD><TD ><A HREF = "neb.html">neb</A></TD><TD ><A HREF = "neigh_modify.html">neigh_modify</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "neighbor.html">neighbor</A></TD><TD ><A HREF = "newton.html">newton</A></TD><TD ><A HREF = "next.html">next</A></TD><TD ><A HREF = "package.html">package</A></TD><TD ><A HREF = "pair_coeff.html">pair_coeff</A></TD><TD ><A HREF = "pair_modify.html">pair_modify</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "pair_style.html">pair_style</A></TD><TD ><A HREF = "pair_write.html">pair_write</A></TD><TD ><A HREF = "partition.html">partition</A></TD><TD ><A HREF = "prd.html">prd</A></TD><TD ><A HREF = "print.html">print</A></TD><TD ><A HREF = "processors.html">processors</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "quit.html">quit</A></TD><TD ><A HREF = "read_data.html">read_data</A></TD><TD ><A HREF = "read_dump.html">read_dump</A></TD><TD ><A HREF = "read_restart.html">read_restart</A></TD><TD ><A HREF = "region.html">region</A></TD><TD ><A HREF = "replicate.html">replicate</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "rerun.html">rerun</A></TD><TD ><A HREF = "reset_timestep.html">reset_timestep</A></TD><TD ><A HREF = "restart.html">restart</A></TD><TD ><A HREF = "run.html">run</A></TD><TD ><A HREF = "run_style.html">run_style</A></TD><TD ><A HREF = "set.html">set</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "shell.html">shell</A></TD><TD ><A HREF = "special_bonds.html">special_bonds</A></TD><TD ><A HREF = "suffix.html">suffix</A></TD><TD ><A HREF = "tad.html">tad</A></TD><TD ><A HREF = "temper.html">temper</A></TD><TD ><A HREF = "thermo.html">thermo</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "thermo_modify.html">thermo_modify</A></TD><TD ><A HREF = "thermo_style.html">thermo_style</A></TD><TD ><A HREF = "timestep.html">timestep</A></TD><TD ><A HREF = "uncompute.html">uncompute</A></TD><TD ><A HREF = "undump.html">undump</A></TD><TD ><A HREF = "unfix.html">unfix</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "units.html">units</A></TD><TD ><A HREF = "variable.html">variable</A></TD><TD ><A HREF = "velocity.html">velocity</A></TD><TD ><A HREF = "write_data.html">write_data</A></TD><TD ><A HREF = "write_dump.html">write_dump</A></TD><TD ><A HREF = "write_restart.html">write_restart</A>
|
||||
<TR ALIGN="center"><TD ><A HREF = "bond_style.html">bond_style</A></TD><TD ><A HREF = "boundary.html">boundary</A></TD><TD ><A HREF = "box.html">box</A></TD><TD ><A HREF = "change_box.html">change_box</A></TD><TD ><A HREF = "clear.html">clear</A></TD><TD ><A HREF = "comm_modify.html">comm_modify</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "comm_style.html">comm_style</A></TD><TD ><A HREF = "compute.html">compute</A></TD><TD ><A HREF = "compute_modify.html">compute_modify</A></TD><TD ><A HREF = "create_atoms.html">create_atoms</A></TD><TD ><A HREF = "create_box.html">create_box</A></TD><TD ><A HREF = "delete_atoms.html">delete_atoms</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "delete_bonds.html">delete_bonds</A></TD><TD ><A HREF = "dielectric.html">dielectric</A></TD><TD ><A HREF = "dihedral_coeff.html">dihedral_coeff</A></TD><TD ><A HREF = "dihedral_style.html">dihedral_style</A></TD><TD ><A HREF = "dimension.html">dimension</A></TD><TD ><A HREF = "displace_atoms.html">displace_atoms</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "dump.html">dump</A></TD><TD ><A HREF = "dump_image.html">dump image</A></TD><TD ><A HREF = "dump_modify.html">dump_modify</A></TD><TD ><A HREF = "dump_image.html">dump movie</A></TD><TD ><A HREF = "echo.html">echo</A></TD><TD ><A HREF = "fix.html">fix</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "fix_modify.html">fix_modify</A></TD><TD ><A HREF = "group.html">group</A></TD><TD ><A HREF = "if.html">if</A></TD><TD ><A HREF = "improper_coeff.html">improper_coeff</A></TD><TD ><A HREF = "improper_style.html">improper_style</A></TD><TD ><A HREF = "include.html">include</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "jump.html">jump</A></TD><TD ><A HREF = "kspace_modify.html">kspace_modify</A></TD><TD ><A HREF = "kspace_style.html">kspace_style</A></TD><TD ><A HREF = "label.html">label</A></TD><TD ><A HREF = "lattice.html">lattice</A></TD><TD ><A HREF = "log.html">log</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "mass.html">mass</A></TD><TD ><A HREF = "minimize.html">minimize</A></TD><TD ><A HREF = "min_modify.html">min_modify</A></TD><TD ><A HREF = "min_style.html">min_style</A></TD><TD ><A HREF = "molecule.html">molecule</A></TD><TD ><A HREF = "neb.html">neb</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "neigh_modify.html">neigh_modify</A></TD><TD ><A HREF = "neighbor.html">neighbor</A></TD><TD ><A HREF = "newton.html">newton</A></TD><TD ><A HREF = "next.html">next</A></TD><TD ><A HREF = "package.html">package</A></TD><TD ><A HREF = "pair_coeff.html">pair_coeff</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "pair_modify.html">pair_modify</A></TD><TD ><A HREF = "pair_style.html">pair_style</A></TD><TD ><A HREF = "pair_write.html">pair_write</A></TD><TD ><A HREF = "partition.html">partition</A></TD><TD ><A HREF = "prd.html">prd</A></TD><TD ><A HREF = "print.html">print</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "processors.html">processors</A></TD><TD ><A HREF = "quit.html">quit</A></TD><TD ><A HREF = "read_data.html">read_data</A></TD><TD ><A HREF = "read_dump.html">read_dump</A></TD><TD ><A HREF = "read_restart.html">read_restart</A></TD><TD ><A HREF = "region.html">region</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "replicate.html">replicate</A></TD><TD ><A HREF = "rerun.html">rerun</A></TD><TD ><A HREF = "reset_timestep.html">reset_timestep</A></TD><TD ><A HREF = "restart.html">restart</A></TD><TD ><A HREF = "run.html">run</A></TD><TD ><A HREF = "run_style.html">run_style</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "set.html">set</A></TD><TD ><A HREF = "shell.html">shell</A></TD><TD ><A HREF = "special_bonds.html">special_bonds</A></TD><TD ><A HREF = "suffix.html">suffix</A></TD><TD ><A HREF = "tad.html">tad</A></TD><TD ><A HREF = "temper.html">temper</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "thermo.html">thermo</A></TD><TD ><A HREF = "thermo_modify.html">thermo_modify</A></TD><TD ><A HREF = "thermo_style.html">thermo_style</A></TD><TD ><A HREF = "timestep.html">timestep</A></TD><TD ><A HREF = "uncompute.html">uncompute</A></TD><TD ><A HREF = "undump.html">undump</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "unfix.html">unfix</A></TD><TD ><A HREF = "units.html">units</A></TD><TD ><A HREF = "variable.html">variable</A></TD><TD ><A HREF = "velocity.html">velocity</A></TD><TD ><A HREF = "write_data.html">write_data</A></TD><TD ><A HREF = "write_dump.html">write_dump</A></TD></TR>
|
||||
<TR ALIGN="center"><TD ><A HREF = "write_restart.html">write_restart</A>
|
||||
</TD></TR></TABLE></DIV>
|
||||
|
||||
<P>These are commands contributed by users, which can be used if <A HREF = "Section_start.html#start_3">LAMMPS
|
||||
|
|
|
@ -305,7 +305,7 @@ Force fields:
|
|||
|
||||
Settings:
|
||||
|
||||
"communicate"_communicate.html, "group"_group.html, "mass"_mass.html,
|
||||
"comm_style"_comm_style.html, "group"_group.html, "mass"_mass.html,
|
||||
"min_modify"_min_modify.html, "min_style"_min_style.html,
|
||||
"neigh_modify"_neigh_modify.html, "neighbor"_neighbor.html,
|
||||
"reset_timestep"_reset_timestep.html, "run_style"_run_style.html,
|
||||
|
@ -367,7 +367,8 @@ in the command's documentation.
|
|||
"box"_box.html,
|
||||
"change_box"_change_box.html,
|
||||
"clear"_clear.html,
|
||||
"communicate"_communicate.html,
|
||||
"comm_modify"_comm_modify.html,
|
||||
"comm_style"_comm_style.html,
|
||||
"compute"_compute.html,
|
||||
"compute_modify"_compute_modify.html,
|
||||
"create_atoms"_create_atoms.html,
|
||||
|
|
282
doc/balance.html
|
@ -13,111 +13,178 @@
|
|||
</H3>
|
||||
<P><B>Syntax:</B>
|
||||
</P>
|
||||
<PRE>balance keyword args ...
|
||||
<PRE>balance thresh style args keyword value ...
|
||||
</PRE>
|
||||
<UL><LI>one or more keyword/arg pairs may be appended
|
||||
<UL><LI>thresh = imbalance threshold that must be exceeded to perform a re-balance
|
||||
|
||||
<LI>keyword = <I>x</I> or <I>y</I> or <I>z</I> or <I>dynamic</I> or <I>out</I>
|
||||
<LI>style = <I>x</I> or <I>y</I> or <I>z</I> or <I>shift</I> or <I>rcb</I>
|
||||
|
||||
<PRE> <I>x</I> args = <I>uniform</I> or Px-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
<I>y</I> args = <I>uniform</I> or Py-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in y dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
<I>z</I> args = <I>uniform</I> or Pz-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in z dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
<I>dynamic</I> args = dimstr Niter thresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
thresh = stop balancing when this imbalance threshold is reached
|
||||
<I>out</I> arg = filename
|
||||
filename = output file to write each processor's sub-domain to
|
||||
<PRE> <I>x</I> args = <I>uniform</I> or Px-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
<I>x</I> can be specified together with <I>y</I> or <I>z</I>
|
||||
<I>y</I> args = <I>uniform</I> or Py-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in y dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
<I>y</I> can be specified together with <I>x</I> or <I>z</I>
|
||||
<I>z</I> args = <I>uniform</I> or Pz-1 numbers between 0 and 1
|
||||
<I>uniform</I> = evenly spaced cuts between processors in z dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
<I>z</I> can be specified together with <I>x</I> or <I>y</I>
|
||||
<I>shift</I> args = dimstr Niter stopthresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
stopthresh = stop balancing when this imbalance threshold is reached
|
||||
<I>rcb</I> args = none
|
||||
</PRE>
|
||||
<LI>zero or more keyword/value pairs may be appended
|
||||
|
||||
<LI>keyword = <I>out</I>
|
||||
|
||||
<PRE> <I>out</I> value = filename
|
||||
filename = write each processor's sub-domain to a file
|
||||
</PRE>
|
||||
|
||||
</UL>
|
||||
<P><B>Examples:</B>
|
||||
</P>
|
||||
<PRE>balance x uniform y 0.4 0.5 0.6
|
||||
balance dynamic xz 5 1.1
|
||||
balance dynamic x 20 1.0 out tmp.balance
|
||||
<PRE>balance 0.9 x uniform y 0.4 0.5 0.6
|
||||
balance 1.2 shift xz 5 1.1
|
||||
balance 1.0 shift xz 5 1.1
|
||||
balance 1.1 rcb
|
||||
balance 1.0 shift x 20 1.0 out tmp.balance
|
||||
</PRE>
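<P>The <I>x</I>, <I>y</I>, and <I>z</I> arguments above can be illustrated with a minimal Python sketch. It is not LAMMPS code: the function names <I>uniform_cuts</I> and <I>check_cuts</I> are hypothetical, and reading "between 0 and 1" and "ascending" as strict conditions is an assumption. The sketch only shows that a dimension with P processors needs P-1 cut positions, so the list 0.4 0.5 0.6 in the first example implies 4 processors in the y dimension.
</P>

```python
def uniform_cuts(nprocs_dim):
    """Evenly spaced cut positions (fractions of the box length) for one dimension."""
    return [i / nprocs_dim for i in range(1, nprocs_dim)]


def check_cuts(cuts, nprocs_dim):
    """Validate a user-supplied list of cut positions for one dimension.

    Assumption: 'between 0 and 1' and 'ascending' are both read as strict.
    """
    if len(cuts) != nprocs_dim - 1:
        raise ValueError(f"expected {nprocs_dim - 1} cuts, got {len(cuts)}")
    if any(not 0.0 < c < 1.0 for c in cuts):
        raise ValueError("every cut must lie between 0 and 1")
    if any(later <= earlier for earlier, later in zip(cuts, cuts[1:])):
        raise ValueError("cuts must be in ascending order")
    return True


# 4 processors in a dimension need 3 cuts
print(uniform_cuts(4))
print(check_cuts([0.4, 0.5, 0.6], 4))
```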
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>This command adjusts the size of processor sub-domains within the
|
||||
simulation box, to attempt to balance the number of particles and thus
|
||||
the computational cost (load) evenly across processors. The load
|
||||
balancing is "static" in the sense that this command performs the
|
||||
balancing once, before or between simulations. The processor
|
||||
sub-domains will then remain static during the subsequent run. To
|
||||
perform "dynamic" balancing, see the <A HREF = "fix_balance.html">fix balance</A>
|
||||
command, which can adjust processor sub-domain sizes on-the-fly during
|
||||
a <A HREF = "run.html">run</A>.
|
||||
<P>IMPORTANT NOTE: The <I>rcb</I> style is not yet implemented.
|
||||
</P>
|
||||
<P>Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution, e.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domains, with one equal-volume sub-domain per processor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
<P>This command adjusts the size and shape of processor sub-domains
|
||||
within the simulation box, to attempt to balance the number of
|
||||
particles and thus the computational cost (load) evenly across
|
||||
processors. The load balancing is "static" in the sense that this
|
||||
command performs the balancing once, before or between simulations.
|
||||
The processor sub-domains will then remain static during the
|
||||
subsequent run. To perform "dynamic" balancing, see the <A HREF = "fix_balance.html">fix
|
||||
balance</A> command, which can adjust processor
|
||||
sub-domain sizes and shapes on-the-fly during a <A HREF = "run.html">run</A>.
|
||||
</P>
|
||||
<P>Load-balancing is typically only useful if the particles in the
|
||||
simulation box have a spatially-varying density distribution, e.g. a
|
||||
model of a vapor/liquid interface, or a solid with an irregular-shaped
|
||||
geometry containing void regions. In this case, the LAMMPS default of
|
||||
dividing the simulation box volume into a regular-spaced grid of 3d
|
||||
bricks, with one equal-volume sub-domain per processor, may assign very
|
||||
different numbers of particles per processor. This can lead to poor
|
||||
performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
</P>
|
||||
<P>Note that the <A HREF = "processors.html">processors</A> command gives you control
|
||||
<P>Note that the <A HREF = "processors.html">processors</A> command allows some control
|
||||
over how the box volume is split across processors. Specifically, for
|
||||
a Px by Py by Pz grid of processors, it chooses or lets you choose Px,
|
||||
Py, and Pz, subject to the constraint that Px * Py * Pz = P, the total
|
||||
number of processors. This is sufficient to achieve good load-balance
|
||||
for many models on many processor counts. However, all the processor
|
||||
sub-domains will still be the same shape and have the same volume.
|
||||
a Px by Py by Pz grid of processors, it allows choice of Px, Py, and
|
||||
Pz, subject to the constraint that Px * Py * Pz = P, the total number
|
||||
of processors. This is sufficient to achieve good load-balance for
|
||||
many models on many processor counts. However, all the processor
|
||||
sub-domains will still have the same shape and same volume.
|
||||
</P>
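<P>The constraint Px * Py * Pz = P can be made concrete with a short Python sketch that lists every candidate grid for a given processor count. This is only an illustration of the constraint; it does not reproduce how the processors command chooses among the candidates, and the function name <I>processor_grids</I> is hypothetical.
</P>

```python
def processor_grids(nprocs):
    """Return all (Px, Py, Pz) triples with Px * Py * Pz == nprocs."""
    grids = []
    for px in range(1, nprocs + 1):
        if nprocs % px:
            continue  # px must divide the processor count
        rest = nprocs // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                grids.append((px, py, rest // py))
    return grids


# Every triple is a regular grid of equal-volume sub-domains
for grid in processor_grids(16):
    print(grid)
```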
|
||||
<P>This command does not alter the topology of the Px by Py by Pz grid of
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
</P>
|
||||
<CENTER><IMG SRC = "JPG/balance.jpg">
|
||||
</CENTER>
|
||||
<P>When the balance command completes, it prints out the final positions
|
||||
of all cutting planes in each of the 3 dimensions (as fractions of the
|
||||
box length). It also prints statistics about its results, including
|
||||
the change in "imbalance factor". This factor is defined as the
|
||||
maximum number of particles owned by any processor, divided by the
|
||||
<P>The requested load-balancing operation is only performed if the
|
||||
current "imbalance factor" in particles owned by each processor
|
||||
exceeds the specified <I>thresh</I> parameter. This factor is defined as
|
||||
the maximum number of particles owned by any processor, divided by the
|
||||
average number of particles per processor. Thus an imbalance factor
|
||||
of 1.0 is perfect balance. For 10000 particles running on 10
|
||||
processors, if the most heavily loaded processor has 1200 particles,
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. The change
|
||||
in the maximum number of particles (on any processor) is also printed.
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. Note that a
|
||||
re-balance can be forced even if the current balance is perfect (1.0)
|
||||
by specifying a <I>thresh</I> < 1.0.
|
||||
</P>
|
||||
<P>When the balance command completes, it prints statistics about its
|
||||
results, including the change in the imbalance factor and the change
|
||||
in the maximum number of particles (on any processor). For "grid"
|
||||
methods (defined below) that create a logical 3d grid of processors,
|
||||
the positions of all cutting planes in each of the 3 dimensions (as
|
||||
fractions of the box length) are also printed.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particularly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the <A HREF = "kspace_style.html">kspace_style</A> command. Thus
|
||||
you should benchmark the run times of your simulation before and after
|
||||
balancing.
|
||||
factor, as defined above. But depending on the method a perfect
|
||||
balance (1.0) may not be achieved. For example, "grid" methods
|
||||
(defined below) that create a logical 3d grid cannot achieve perfect
|
||||
balance for many irregular distributions of particles. Likewise, if a
|
||||
portion of the system is a perfect lattice, e.g. the initial system is
|
||||
generated by the <A HREF = "create_atoms.html">create_atoms</A> command, then "grid"
|
||||
methods may be unable to achieve exact balance. This is because
|
||||
entire lattice planes will be owned or not owned by a single
|
||||
processor.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: Computational cost is not strictly proportional to
|
||||
particle count, and changing the relative size and shape of processor
|
||||
sub-domains may lead to additional computational and communication
|
||||
overheads, e.g. in the PPPM solver used via the
|
||||
<A HREF = "kspace_style.html">kspace_style</A> command. Thus you should benchmark
|
||||
the run times of a simulation before and after balancing.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>x</I>, <I>y</I>, and <I>z</I> keywords adjust the position of cutting planes
|
||||
between processor sub-domains in a specific dimension. The <I>uniform</I>
|
||||
argument spaces the planes evenly, as in the left diagram above. The
|
||||
<I>numeric</I> argument requires you to list Ps-1 numbers that specify the
|
||||
position of the cutting planes. This requires that you know Ps = Px
|
||||
or Py or Pz = the number of processors assigned by LAMMPS to the
|
||||
relevant dimension. This assignment is made (and the Px, Py, Pz
|
||||
values printed out) when the simulation box is created by the
|
||||
"create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the "processors" command.
|
||||
<P>The method used to perform a load balance is specified by one of the
|
||||
listed styles, which are described in detail below. There are 2 kinds
|
||||
of styles.
|
||||
</P>
|
||||
<P>The <I>x</I>, <I>y</I>, <I>z</I>, and <I>shift</I> styles are "grid" methods which produce
|
||||
a logical 3d grid of processors. They operate by changing the cutting
|
||||
planes (or lines) between processors in 3d (or 2d), to adjust the
|
||||
volume (area in 2d) assigned to each processor, as in the following 2d
|
||||
diagram. The left diagram is the default partitioning of the
|
||||
simulation box across processors (one sub-box for each of 16
|
||||
processors); the right diagram is after balancing.
|
||||
</P>
|
||||
<CENTER><IMG SRC = "JPG/balance.jpg">
|
||||
</CENTER>
|
||||
<P>The <I>rcb</I> style is a "tiling" method which does not produce a logical
|
||||
3d grid of processors. Rather it tiles the simulation domain with
|
||||
rectangular sub-boxes of varying size and shape in an irregular
|
||||
fashion so as to have equal numbers of particles in each sub-box, as
|
||||
in the following 2d diagram. Again the left diagram is the default
|
||||
partitioning of the simulation box across processors (one sub-box for
|
||||
each of 16 processors); the right diagram is after balancing.
|
||||
</P>
|
||||
<P>NOTE: Need a diagram of RCB partitioning.
|
||||
</P>
|
||||
<P>The "grid" methods can be used with either of the
|
||||
<A HREF = "comm_style.html">comm_style</A> command options, <I>brick</I> or <I>tiled</I>. The
|
||||
"tiling" methods can only be used with <A HREF = "comm_style.html">comm_style
|
||||
tiled</A>. Note that it can be useful to use a "grid"
|
||||
method with <A HREF = "comm_style.html">comm_style tiled</A> to return the domain
|
||||
partitioning to a logical 3d grid of processors so that "comm_style
|
||||
brick" can be used for subsequent <A HREF = "run.html">run</A> commands.
|
||||
</P>
|
||||
<P>When a "grid" method is specified, the current domain partitioning can
|
||||
be either a logical 3d grid or a tiled partitioning. In the former
|
||||
case, the current logical 3d grid is used as a starting point and
|
||||
changes are made to improve the imbalance factor. In the latter case,
|
||||
the tiled partitioning is discarded and a logical 3d grid is created
|
||||
with uniform spacing in all dimensions. This becomes the starting
|
||||
point for the balancing operation.
|
||||
</P>
|
||||
<P>When a "tiling" method is specified, the current domain partitioning
|
||||
("grid" or "tiled") is ignored, and a new partitioning is computed
|
||||
from scratch.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>x</I>, <I>y</I>, and <I>z</I> styles invoke a "grid" method for balancing, as
|
||||
described above. Note that any or all of these 3 styles can be
|
||||
specified together, one after the other. These styles adjust the
|
||||
position of cutting planes between processor sub-domains in specific
|
||||
dimensions. Only the specified dimensions are altered.
|
||||
</P>
|
||||
<P>The <I>uniform</I> argument spaces the planes evenly, as in the left
|
||||
diagrams above. The <I>numeric</I> argument requires listing Ps-1 numbers
|
||||
that specify the position of the cutting planes. This requires
|
||||
knowing Ps = Px or Py or Pz = the number of processors assigned by
|
||||
LAMMPS to the relevant dimension. This assignment is made (and the
|
||||
Px, Py, Pz values printed out) when the simulation box is created by
|
||||
the "create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the <A HREF = "processors.html">processors</A>
|
||||
command.
|
||||
</P>
|
||||
<P>Each of the numeric values must be between 0 and 1, and they must be
|
||||
listed in ascending order. They represent the fractional position of
|
||||
|
@ -130,12 +197,11 @@ larger than the right processor's sub-domain.
|
|||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>dynamic</I> keyword changes the cutting planes between processors in
|
||||
an iterative fashion, seeking to reduce the imbalance factor, similar
|
||||
to how the <A HREF = "fix_balance.html">fix balance</A> command operates. Note that
|
||||
this keyword begins its operation from the current processor
|
||||
partitioning, which could be uniform or the result of a previous
|
||||
balance command.
|
||||
<P>The <I>shift</I> style invokes a "grid" method for balancing, as
|
||||
described above. It changes the positions of cutting planes between
|
||||
processors in an iterative fashion, seeking to reduce the imbalance
|
||||
factor, similar to how the <A HREF = "fix_balance.html">fix balance shift</A>
|
||||
command operates.
|
||||
</P>
|
||||
<P>The <I>dimstr</I> argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". Each character can appear zero or one time,
|
||||
|
@ -147,14 +213,14 @@ to be a density variation in the particles.
|
|||
dimensions listed in <I>dimstr</I>, one dimension at a time. For a single
|
||||
dimension, the balancing operation (described below) is iterated on up
|
||||
to <I>Niter</I> times. After each dimension finishes, the imbalance factor
|
||||
is re-computed, and the balancing operation halts if the <I>thresh</I>
|
||||
is re-computed, and the balancing operation halts if the <I>stopthresh</I>
|
||||
criterion is met.
|
||||
</P>
|
||||
<P>A rebalance operation in a single dimension is performed using a
|
||||
recursive multisectioning algorithm, where the position of each
|
||||
cutting plane (line in 2d) in the dimension is adjusted independently.
|
||||
This is similar to a recursive bisectioning (RCB) for a single value,
|
||||
except that the bounds used for each bisectioning take advantage of
|
||||
This is similar to a recursive bisectioning for a single value, except
|
||||
that the bounds used for each bisectioning take advantage of
|
||||
information from neighboring cuts if possible. At each iteration, the
|
||||
count of particles on either side of each plane is tallied. If the
|
||||
counts do not match the target value for the plane, the position of
|
||||
|
@ -168,26 +234,27 @@ plane gets closer to the target value.
|
|||
assigned, particles are migrated to their new owning processor, and
|
||||
the balance procedure ends.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: At each rebalance operation, the RCB for each cutting
|
||||
plane (line in 2d) typically starts with low and high bounds separated
|
||||
by the extent of a processor's sub-domain in one dimension. The size
|
||||
of this bracketing region shrinks by 1/2 every iteration. Thus if
|
||||
<I>Niter</I> is specified as 10, the cutting plane will typically be
|
||||
positioned to 1 part in 1000 accuracy (relative to the perfect target
|
||||
position). For <I>Niter</I> = 20, it will be accurate to 1 part in a
|
||||
million. Thus there is no need to set <I>Niter</I> to a large value.
|
||||
<P>IMPORTANT NOTE: At each rebalance operation, the bisectioning for each
|
||||
cutting plane (line in 2d) typically starts with low and high bounds
|
||||
separated by the extent of a processor's sub-domain in one dimension.
|
||||
The size of this bracketing region shrinks by 1/2 every iteration.
|
||||
Thus if <I>Niter</I> is specified as 10, the cutting plane will typically
|
||||
be positioned to 1 part in 1000 accuracy (relative to the perfect
|
||||
target position). For <I>Niter</I> = 20, it will be accurate to 1 part in
|
||||
a million. Thus there is no need to set <I>Niter</I> to a large value.
|
||||
LAMMPS will check if the threshold accuracy is reached (in a
|
||||
dimension) in fewer iterations than <I>Niter</I> and exit early. However,
|
||||
<I>Niter</I> should also not be set too small, since it will take roughly
|
||||
the same number of iterations to converge even if the cutting plane is
|
||||
initially close to the target value.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: If a portion of your system is a perfect lattice,
|
||||
e.g. the initial system is generated by the
|
||||
<A HREF = "create_atoms.html">create_atoms</A> command, then the balancer may be
|
||||
unable to achieve exact balance. I.e. entire lattice planes will be
|
||||
owned or not owned by a single processor. So you should not
|
||||
expect to achieve perfect balance in this case.
|
||||
<HR>
|
||||
|
||||
<P>The <I>rcb</I> style invokes a "tiling" method for balancing, as described
|
||||
above. It performs a recursive coordinate bisectioning (RCB) of the
|
||||
simulation domain.
|
||||
</P>
|
||||
<P>Need further description of RCB.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
|
@ -242,11 +309,8 @@ only 10 unique vertices in total.
|
|||
|
||||
<P><B>Restrictions:</B>
|
||||
</P>
|
||||
<P>The <I>dynamic</I> keyword cannot be used with the <I>x</I>, <I>y</I>, or <I>z</I>
|
||||
arguments.
|
||||
</P>
|
||||
<P>For 2d simulations, the <I>z</I> keyword cannot be used. Nor can a "z"
|
||||
appear in <I>dimstr</I> for the <I>dynamic</I> keyword.
|
||||
<P>For 2d simulations, the <I>z</I> style cannot be used. Nor can a "z"
|
||||
appear in <I>dimstr</I> for the <I>shift</I> style.
|
||||
</P>
|
||||
<P><B>Related commands:</B>
|
||||
</P>
|
||||
|
|
279
doc/balance.txt
|
@ -10,108 +10,172 @@ balance command :h3
|
|||
|
||||
[Syntax:]
|
||||
|
||||
balance keyword args ... :pre
|
||||
balance thresh style args keyword value ... :pre
|
||||
|
||||
one or more keyword/arg pairs may be appended :ulb,l
|
||||
keyword = {x} or {y} or {z} or {dynamic} or {out} :l
|
||||
{x} args = {uniform} or Px-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
{y} args = {uniform} or Py-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in y dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
{z} args = {uniform} or Pz-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in z dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
{dynamic} args = dimstr Niter thresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
thresh = stop balancing when this imbalance threshold is reached
|
||||
{out} arg = filename
|
||||
filename = output file to write each processor's sub-domain to :pre
|
||||
thresh = imbalance threshold that must be exceeded to perform a re-balance :ulb,l
|
||||
style = {x} or {y} or {z} or {shift} or {rcb} :l
|
||||
{x} args = {uniform} or Px-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in x dimension
|
||||
numbers = Px-1 ascending values between 0 and 1, Px - # of processors in x dimension
|
||||
{x} can be specified together with {y} or {z}
|
||||
{y} args = {uniform} or Py-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in y dimension
|
||||
numbers = Py-1 ascending values between 0 and 1, Py - # of processors in y dimension
|
||||
{y} can be specified together with {x} or {z}
|
||||
{z} args = {uniform} or Pz-1 numbers between 0 and 1
|
||||
{uniform} = evenly spaced cuts between processors in z dimension
|
||||
numbers = Pz-1 ascending values between 0 and 1, Pz - # of processors in z dimension
|
||||
{z} can be specified together with {x} or {y}
|
||||
{shift} args = dimstr Niter stopthresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
stopthresh = stop balancing when this imbalance threshold is reached
|
||||
{rcb} args = none :pre
|
||||
zero or more keyword/value pairs may be appended :l
|
||||
keyword = {out} :l
|
||||
{out} value = filename
|
||||
filename = write each processor's sub-domain to a file :pre
|
||||
:ule
|
||||
|
||||
[Examples:]
|
||||
|
||||
balance x uniform y 0.4 0.5 0.6
|
||||
balance dynamic xz 5 1.1
|
||||
balance dynamic x 20 1.0 out tmp.balance :pre
|
||||
balance 0.9 x uniform y 0.4 0.5 0.6
|
||||
balance 1.2 shift xz 5 1.1
|
||||
balance 1.0 shift xz 5 1.1
|
||||
balance 1.1 rcb
|
||||
balance 1.0 shift x 20 1.0 out tmp.balance :pre
|
||||
|
||||
[Description:]
|
||||
|
||||
This command adjusts the size of processor sub-domains within the
|
||||
simulation box, to attempt to balance the number of particles and thus
|
||||
the computational cost (load) evenly across processors. The load
|
||||
balancing is "static" in the sense that this command performs the
|
||||
balancing once, before or between simulations. The processor
|
||||
sub-domains will then remain static during the subsequent run. To
|
||||
perform "dynamic" balancing, see the "fix balance"_fix_balance.html
|
||||
command, which can adjust processor sub-domain sizes on-the-fly during
|
||||
a "run"_run.html.
|
||||
IMPORTANT NOTE: The {rcb} style is not yet implemented.
|
||||
|
||||
Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution. E.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domains, with one equal-volume sub-domain per processor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
This command adjusts the size and shape of processor sub-domains
|
||||
within the simulation box, to attempt to balance the number of
|
||||
particles and thus the computational cost (load) evenly across
|
||||
processors. The load balancing is "static" in the sense that this
|
||||
command performs the balancing once, before or between simulations.
|
||||
The processor sub-domains will then remain static during the
|
||||
subsequent run. To perform "dynamic" balancing, see the "fix
|
||||
balance"_fix_balance.html command, which can adjust processor
|
||||
sub-domain sizes and shapes on-the-fly during a "run"_run.html.
|
||||
|
||||
Load-balancing is typically only useful if the particles in the
|
||||
simulation box have a spatially-varying density distribution. E.g. a
|
||||
model of a vapor/liquid interface, or a solid with an irregular-shaped
|
||||
geometry containing void regions. In this case, the LAMMPS default of
|
||||
dividing the simulation box volume into a regular-spaced grid of 3d
|
||||
bricks, with one equal-volume sub-domain per processor, may assign very
|
||||
different numbers of particles per processor. This can lead to poor
|
||||
performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
|
||||
Note that the "processors"_processors.html command gives you control
|
||||
Note that the "processors"_processors.html command allows some control
|
||||
over how the box volume is split across processors. Specifically, for
|
||||
a Px by Py by Pz grid of processors, it chooses or lets you choose Px,
|
||||
Py, and Pz, subject to the constraint that Px * Py * Pz = P, the total
|
||||
number of processors. This is sufficient to achieve good load-balance
|
||||
for many models on many processor counts. However, all the processor
|
||||
sub-domains will still be the same shape and have the same volume.
|
||||
a Px by Py by Pz grid of processors, it allows choice of Px, Py, and
|
||||
Pz, subject to the constraint that Px * Py * Pz = P, the total number
|
||||
of processors. This is sufficient to achieve good load-balance for
|
||||
many models on many processor counts. However, all the processor
|
||||
sub-domains will still have the same shape and same volume.
|
||||
|
||||
This command does not alter the topology of the Px by Py by Pz grid of
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
|
||||
:c,image(JPG/balance.jpg)
|
||||
|
||||
When the balance command completes, it prints out the final positions
|
||||
of all cutting planes in each of the 3 dimensions (as fractions of the
|
||||
box length). It also prints statistics about its results, including
|
||||
the change in "imbalance factor". This factor is defined as the
|
||||
maximum number of particles owned by any processor, divided by the
|
||||
The requested load-balancing operation is only performed if the
|
||||
current "imbalance factor" in particles owned by each processor
|
||||
exceeds the specified {thresh} parameter. This factor is defined as
|
||||
the maximum number of particles owned by any processor, divided by the
|
||||
average number of particles per processor. Thus an imbalance factor
|
||||
of 1.0 is perfect balance. For 10000 particles running on 10
|
||||
processors, if the most heavily loaded processor has 1200 particles,
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. The change
|
||||
in the maximum number of particles (on any processor) is also printed.
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. Note that a
|
||||
re-balance can be forced even if the current balance is perfect (1.0)
|
||||
by specifying a {thresh} < 1.0.
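
The imbalance factor and the {thresh} test described above can be
checked by hand with a few lines of Python. This is only an
illustrative sketch, not LAMMPS code: the function names and the
per-processor particle counts are made up for the example, and the
comparison assumes "exceeds" means strictly greater than.

```python
def imbalance_factor(counts):
    """Max particles owned by any processor divided by the per-processor average."""
    average = sum(counts) / len(counts)
    return max(counts) / average


def should_rebalance(factor, thresh):
    """Assumption: a re-balance is performed only if factor strictly exceeds thresh."""
    return factor > thresh


# Hypothetical counts: 10000 particles on 10 processors, heaviest owns 1200.
counts = [1200, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 800]
factor = imbalance_factor(counts)

print(factor)                          # ratio of 1200 to the average of 1000
print(should_rebalance(factor, 1.1))   # imbalance is above a 1.1 threshold
print(should_rebalance(1.0, 0.9))      # thresh < 1.0 forces a re-balance
```

With these counts the average is 1000 particles per processor, so the
factor matches the 20% imbalance quoted in the text.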
|
||||
|
||||
When the balance command completes, it prints statistics about its
|
||||
results, including the change in the imbalance factor and the change
|
||||
in the maximum number of particles (on any processor). For "grid"
|
||||
methods (defined below) that create a logical 3d grid of processors,
|
||||
the positions of all cutting planes in each of the 3 dimensions (as
|
||||
fractions of the box length) are also printed.
|
||||
|
||||
IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particularly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the "kspace_style"_kspace_style.html command. Thus
|
||||
you should benchmark the run times of your simulation before and after
|
||||
balancing.
|
||||
factor, as defined above. But depending on the method a perfect
|
||||
balance (1.0) may not be achieved. For example, "grid" methods
|
||||
(defined below) that create a logical 3d grid cannot achieve perfect
|
||||
balance for many irregular distributions of particles. Likewise, if a
|
||||
portion of the system is a perfect lattice, e.g. the initial system is
|
||||
generated by the "create_atoms"_create_atoms.html command, then "grid"
|
||||
methods may be unable to achieve exact balance. This is because
|
||||
entire lattice planes will be owned or not owned by a single
|
||||
processor.
|
||||
|
||||
IMPORTANT NOTE: Computational cost is not strictly proportional to
|
||||
particle count, and changing the relative size and shape of processor
|
||||
sub-domains may lead to additional computational and communication
|
||||
overheads, e.g. in the PPPM solver used via the
|
||||
"kspace_style"_kspace_style.html command. Thus you should benchmark
|
||||
the run times of a simulation before and after balancing.
|
||||
|
||||
:line
|
||||
|
||||
The {x}, {y}, and {z} keywords adjust the position of cutting planes
|
||||
between processor sub-domains in a specific dimension. The {uniform}
|
||||
argument spaces the planes evenly, as in the left diagram above. The
|
||||
{numeric} argument requires you to list Ps-1 numbers that specify the
|
||||
position of the cutting planes. This requires that you know Ps = Px
|
||||
or Py or Pz = the number of processors assigned by LAMMPS to the
|
||||
relevant dimension. This assignment is made (and the Px, Py, Pz
|
||||
values printed out) when the simulation box is created by the
|
||||
"create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the "processors" command.
|
||||
The method used to perform a load balance is specified by one of the
|
||||
listed styles, which are described in detail below. There are 2 kinds
|
||||
of styles.
|
||||
|
||||
The {x}, {y}, {z}, and {shift} styles are "grid" methods which produce
|
||||
a logical 3d grid of processors. They operate by changing the cutting
|
||||
planes (or lines) between processors in 3d (or 2d), to adjust the
|
||||
volume (area in 2d) assigned to each processor, as in the following 2d
|
||||
diagram. The left diagram is the default partitioning of the
|
||||
simulation box across processors (one sub-box for each of 16
|
||||
processors); the right diagram is after balancing.
|
||||
|
||||
:c,image(JPG/balance.jpg)
|
||||
|
||||
The {rcb} style is a "tiling" method which does not produce a logical
|
||||
3d grid of processors. Rather it tiles the simulation domain with
|
||||
rectangular sub-boxes of varying size and shape in an irregular
|
||||
fashion so as to have equal numbers of particles in each sub-box, as
|
||||
in the following 2d diagram. Again the left diagram is the default
|
||||
partitioning of the simulation box across processors (one sub-box for
|
||||
each of 16 processors); the right diagram is after balancing.
|
||||
|
||||
NOTE: Need a diagram of RCB partitioning.
|
||||
|
||||
The "grid" methods can be used with either of the
|
||||
"comm_style"_comm_style.html command options, {brick} or {tiled}. The
|
||||
"tiling" methods can only be used with "comm_style
|
||||
tiled"_comm_style.html. Note that it can be useful to use a "grid"
|
||||
method with "comm_style tiled"_comm_style.html to return the domain
|
||||
partitioning to a logical 3d grid of processors so that "comm_style
|
||||
brick" can be used for subsequent "run"_run.html commands.
|
||||
|
||||
When a "grid" method is specified, the current domain partitioning can
|
||||
be either a logical 3d grid or a tiled partitioning. In the former
|
||||
case, the current logical 3d grid is used as a starting point and
|
||||
changes are made to improve the imbalance factor. In the latter case,
|
||||
the tiled partitioning is discarded and a logical 3d grid is created
|
||||
with uniform spacing in all dimensions. This becomes the starting
|
||||
point for the balancing operation.
|
||||
|
||||
When a "tiling" method is specified, the current domain partitioning
|
||||
("grid" or "tiled") is ignored, and a new partitioning is computed
|
||||
from scratch.
|
||||
|
||||
:line
|
||||
|
||||
The {x}, {y}, and {z} styles invoke a "grid" method for balancing, as
|
||||
described above. Note that any or all of these 3 styles can be
|
||||
specified together, one after the other. These styles adjust the
|
||||
position of cutting planes between processor sub-domains in specific
|
||||
dimensions. Only the specified dimensions are altered.
|
||||
|
||||
The {uniform} argument spaces the planes evenly, as in the left
|
||||
diagrams above. The {numeric} argument requires listing Ps-1 numbers
|
||||
that specify the position of the cutting planes. This requires
|
||||
knowing Ps = Px or Py or Pz = the number of processors assigned by
|
||||
LAMMPS to the relevant dimension. This assignment is made (and the
|
||||
Px, Py, Pz values printed out) when the simulation box is created by
|
||||
the "create_box" or "read_data" or "read_restart" command and is
|
||||
influenced by the settings of the "processors"_processors.html
|
||||
command.
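
As a rough illustration of the {uniform} and numeric arguments, the
following Python sketch computes the Ps-1 evenly spaced cut fractions
for one dimension and checks a user-supplied list. It is not part of
LAMMPS. The helper names are hypothetical, and the checks assume the
values must lie strictly between 0 and 1 and be strictly ascending.

```python
def uniform_cuts(ps):
    """Ps-1 evenly spaced cut positions, as fractions of the box length."""
    return [i / ps for i in range(1, ps)]


def check_cuts(cuts, ps):
    """Validate a numeric cut list for a dimension with ps processors."""
    if len(cuts) != ps - 1:
        raise ValueError("expected %d values, got %d" % (ps - 1, len(cuts)))
    if any(not 0.0 < c < 1.0 for c in cuts):
        raise ValueError("cut positions must be between 0 and 1")
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut positions must be in ascending order")
    return True


print(uniform_cuts(4))                 # three cuts for four processors
print(check_cuts([0.4, 0.5, 0.6], 4))  # the "y 0.4 0.5 0.6" example implies Py = 4
```

Three numeric values, as in the "y 0.4 0.5 0.6" example, are only
consistent with four processors in that dimension.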
|
||||
|
||||
Each of the numeric values must be between 0 and 1, and they must be
|
||||
listed in ascending order. They represent the fractional position of
|
||||
|
@ -124,12 +188,11 @@ larger than the right processor's sub-domain.
|
|||
|
||||
:line
|
||||
|
||||
The {dynamic} keyword changes the cutting planes between processors in
|
||||
an iterative fashion, seeking to reduce the imbalance factor, similar
|
||||
to how the "fix balance"_fix_balance.html command operates. Note that
|
||||
this keyword begins its operation from the current processor
|
||||
partitioning, which could be uniform or the result of a previous
|
||||
balance command.
|
||||
The {shift} style invokes a "grid" method for balancing, as
|
||||
described above. It changes the positions of cutting planes between
|
||||
processors in an iterative fashion, seeking to reduce the imbalance
|
||||
factor, similar to how the "fix balance shift"_fix_balance.html
|
||||
command operates.
|
||||
|
||||
The {dimstr} argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". Each character can appear zero or one time,
|
||||
|
@ -141,14 +204,14 @@ Balancing proceeds by adjusting the cutting planes in each of the
|
|||
dimensions listed in {dimstr}, one dimension at a time. For a single
|
||||
dimension, the balancing operation (described below) is iterated on up
|
||||
to {Niter} times. After each dimension finishes, the imbalance factor
|
||||
is re-computed, and the balancing operation halts if the {thresh}
|
||||
is re-computed, and the balancing operation halts if the {stopthresh}
|
||||
criterion is met.
|
||||
|
||||
A rebalance operation in a single dimension is performed using a
|
||||
recursive multisectioning algorithm, where the position of each
|
||||
cutting plane (line in 2d) in the dimension is adjusted independently.
|
||||
This is similar to a recursive bisectioning (RCB) for a single value,
|
||||
except that the bounds used for each bisectioning take advantage of
|
||||
This is similar to a recursive bisectioning for a single value, except
|
||||
that the bounds used for each bisectioning take advantage of
|
||||
information from neighboring cuts if possible. At each iteration, the
|
||||
count of particles on either side of each plane is tallied. If the
|
||||
counts do not match the target value for the plane, the position of
|
||||
|
@ -162,26 +225,27 @@ Once the rebalancing is complete and final processor sub-domains
|
|||
assigned, particles are migrated to their new owning processor, and
|
||||
the balance procedure ends.
|
||||
|
||||
IMPORTANT NOTE: At each rebalance operation, the RCB for each cutting
|
||||
plane (line in 2d) typically starts with low and high bounds separated
|
||||
by the extent of a processor's sub-domain in one dimension. The size
|
||||
of this bracketing region shrinks by 1/2 every iteration. Thus if
|
||||
{Niter} is specified as 10, the cutting plane will typically be
|
||||
positioned to 1 part in 1000 accuracy (relative to the perfect target
|
||||
position). For {Niter} = 20, it will be accurate to 1 part in a
|
||||
million. Thus there is no need to set {Niter} to a large value.
|
||||
IMPORTANT NOTE: At each rebalance operation, the bisectioning for each
|
||||
cutting plane (line in 2d) typically starts with low and high bounds
|
||||
separated by the extent of a processor's sub-domain in one dimension.
|
||||
The size of this bracketing region shrinks by 1/2 every iteration.
|
||||
Thus if {Niter} is specified as 10, the cutting plane will typically
|
||||
be positioned to 1 part in 1000 accuracy (relative to the perfect
|
||||
target position). For {Niter} = 20, it will be accurate to 1 part in
|
||||
a million. Thus there is no need to set {Niter} to a large value.
|
||||
LAMMPS will check if the threshold accuracy is reached (in a
|
||||
dimension) in fewer iterations than {Niter} and exit early. However,
|
||||
{Niter} should also not be set too small, since it will take roughly
|
||||
the same number of iterations to converge even if the cutting plane is
|
||||
initially close to the target value.
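
The halving argument in the note above can be checked numerically.
The sketch below is a plain one-dimensional bisection written for
illustration only. It is not the LAMMPS multisectioning code, and the
function names and the evenly spaced test coordinates are assumptions
made for the example.

```python
def bracket_fraction(niter):
    """Fraction of the initial bracket that remains after niter halvings."""
    return 0.5 ** niter


def bisect_cut(coords, target_fraction, lo, hi, niter):
    """Move one cut so that roughly target_fraction of coords lie below it."""
    for _ in range(niter):
        mid = 0.5 * (lo + hi)
        below = sum(1 for x in coords if x < mid) / len(coords)
        if below < target_fraction:
            lo = mid   # too few particles below the cut: move it up
        else:
            hi = mid   # enough particles below the cut: move it down
    return 0.5 * (lo + hi)


for niter in (10, 20):
    print(niter, 1.0 / bracket_fraction(niter))

# Hypothetical data: 1000 evenly spaced particles in a unit-length box.
coords = [i / 1000 for i in range(1000)]
print(bisect_cut(coords, 0.25, 0.0, 1.0, 20))
```

Ten halvings shrink the bracket by a factor of 1024 and twenty by
1048576, which is where the "1 part in 1000" and "1 part in a million"
figures come from.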
|
||||
|
||||
IMPORTANT NOTE: If a portion of your system is a perfect lattice,
|
||||
e.g. the initial system is generated by the
|
||||
"create_atoms"_create_atoms.html command, then the balancer may be
|
||||
unable to achieve exact balance. I.e. entire lattice planes will be
|
||||
owned or not owned by a single processor.  So you should not
|
||||
expect to achieve perfect balance in this case.
|
||||
:line
|
||||
|
||||
The {rcb} style invokes a "tiled" method for balancing, as described
|
||||
above. It performs a recursive coordinate bisectioning (RCB) of the
|
||||
simulation domain.
|
||||
|
||||
Need further description of RCB.
|
||||
|
||||
:line
|
||||
|
||||
|
@ -236,11 +300,8 @@ For a 3d problem, the syntax is similar with "SQUARES" replaced by
|
|||
|
||||
[Restrictions:]
|
||||
|
||||
The {dynamic} keyword cannot be used with the {x}, {y}, or {z}
|
||||
arguments.
|
||||
|
||||
For 2d simulations, the {z} keyword cannot be used. Nor can a "z"
|
||||
appear in {dimstr} for the {dynamic} keyword.
|
||||
For 2d simulations, the {z} style cannot be used. Nor can a "z"
|
||||
appear in {dimstr} for the {shift} style.
|
||||
|
||||
[Related commands:]
|
||||
|
||||
|
|
|
@ -9,19 +9,18 @@
|
|||
|
||||
<HR>
|
||||
|
||||
<H3>communicate command
|
||||
<H3>comm_modify command
|
||||
</H3>
|
||||
<P><B>Syntax:</B>
|
||||
</P>
|
||||
<PRE>communicate style keyword value ...
|
||||
<PRE>comm_modify keyword value ...
|
||||
</PRE>
|
||||
<UL><LI>style = <I>single</I> or <I>multi</I>
|
||||
<UL><LI>zero or more keyword/value pairs may be appended
|
||||
|
||||
<LI>zero or more keyword/value pairs may be appended
|
||||
<LI>keyword = <I>mode</I> or <I>cutoff</I> or <I>group</I> or <I>vel</I>
|
||||
|
||||
<LI>keyword = <I>cutoff</I> or <I>group</I> or <I>vel</I>
|
||||
|
||||
<PRE> <I>cutoff</I> value = Rcut (distance units) = communicate atoms from this far away
|
||||
<PRE> <I>mode</I> value = <I>single</I> or <I>multi</I> = communicate atoms within a single or multiple distances
|
||||
<I>cutoff</I> value = Rcut (distance units) = communicate atoms from this far away
|
||||
<I>group</I> value = group-ID = only communicate atoms in the group
|
||||
<I>vel</I> value = <I>yes</I> or <I>no</I> = do or do not communicate velocity info with ghost atoms
|
||||
</PRE>
|
||||
|
@ -29,32 +28,42 @@
|
|||
</UL>
|
||||
<P><B>Examples:</B>
|
||||
</P>
|
||||
<PRE>communicate multi
|
||||
communicate multi group solvent
|
||||
communicate single vel yes
|
||||
communicate single cutoff 5.0 vel yes
|
||||
<PRE>comm_modify mode multi
|
||||
comm_modify mode multi group solvent
|
||||
comm_modify vel yes
|
||||
comm_modify cutoff 5.0 vel yes
|
||||
</PRE>
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>This command sets the style of inter-processor communication that
|
||||
occurs each timestep as atom coordinates and other properties are
|
||||
exchanged between neighboring processors and stored as properties of
|
||||
ghost atoms.
|
||||
<P>This command sets parameters that affect the inter-processor
|
||||
communication of atom information that occurs each timestep as
|
||||
coordinates and other properties are exchanged between neighboring
|
||||
processors and stored as properties of ghost atoms.
|
||||
</P>
|
||||
<P>The default style is <I>single</I> which means each processor acquires
|
||||
<P>IMPORTANT NOTE: These options apply to the currently defined comm
|
||||
style. When you specify a <A HREF = "comm_style.html">comm_style</A> command, all
|
||||
communication settings are restored to their default values, including
|
||||
those previously reset by a comm_modify command. Thus if your input
|
||||
script specifies a comm_style command, you should use the comm_modify
|
||||
command after it.
|
||||
</P>
|
||||
<P>The <I>mode</I> keyword determines whether a single or multiple cutoff
|
||||
distances are used to determine which atoms to communicate.
|
||||
</P>
|
||||
<P>The default mode is <I>single</I> which means each processor acquires
|
||||
information for ghost atoms that are within a single distance from its
|
||||
sub-domain. The distance is the maximum of the neighbor cutoff for
|
||||
all atom type pairs.
|
||||
</P>
|
||||
<P>For many systems this is an efficient algorithm, but for systems with
|
||||
widely varying cutoffs for different type pairs, the <I>multi</I> style can
|
||||
widely varying cutoffs for different type pairs, the <I>multi</I> mode can
|
||||
be faster. In this case, each atom type is assigned its own distance
|
||||
cutoff for communication purposes, and fewer atoms will be
|
||||
communicated. See the <A HREF = "neighbor.html">neighbor multi</A> command for a
|
||||
neighbor list construction option that may also be beneficial for
|
||||
simulations of this kind.
|
||||
</P>
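<P>The difference between the <I>single</I> and <I>multi</I> modes can be illustrated with a small calculation. This is a sketch under stated assumptions, not the LAMMPS implementation: the function names and the table of per-pair neighbor cutoffs are hypothetical, and it assumes that in <I>multi</I> mode the distance assigned to an atom type is the largest neighbor cutoff of any pair that type takes part in.
</P>

```python
def single_cutoff(pair_cutoffs):
    """One communication distance: the maximum neighbor cutoff over all type pairs."""
    return max(pair_cutoffs.values())


def multi_cutoffs(pair_cutoffs):
    """One communication distance per atom type (assumed rule:
    the largest cutoff among the pairs that involve that type)."""
    per_type = {}
    for (i, j), rc in pair_cutoffs.items():
        per_type[i] = max(per_type.get(i, 0.0), rc)
        per_type[j] = max(per_type.get(j, 0.0), rc)
    return per_type


if __name__ == "__main__":
    # Hypothetical neighbor cutoffs for two atom types (distance units).
    cutoffs = {(1, 1): 2.5, (1, 2): 3.0, (2, 2): 10.0}
    print(single_cutoff(cutoffs))
    print(multi_cutoffs(cutoffs))
```

<P>With these illustrative numbers, <I>single</I> mode would communicate every atom within 10.0 of a sub-domain border, while the assumed <I>multi</I> rule would communicate type 1 atoms only within 3.0, which is why fewer atoms are exchanged.
</P>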
|
||||
<P>The <I>cutoff</I> option allows you to set a ghost cutoff distance, which
|
||||
<P>The <I>cutoff</I> keyword allows you to set a ghost cutoff distance, which
|
||||
is the distance from the borders of a processor's sub-domain at which
|
||||
ghost atoms are acquired from other processors. By default the ghost
|
||||
cutoff = neighbor cutoff = pairwise force cutoff + neighbor skin. See
|
||||
|
@ -105,14 +114,14 @@ will typically lead to bad dynamics (i.e. the bond length is now the
|
|||
simulation box length). To detect if this is happening, see the
|
||||
<A HREF = "neigh_modify.html">neigh_modify cluster</A> command.
|
||||
</P>
|
||||
<P>The <I>group</I> option will limit communication to atoms in the specified
|
||||
<P>The <I>group</I> keyword will limit communication to atoms in the specified
|
||||
group. This can be useful for models where no ghost atoms are needed
|
||||
for some kinds of particles. All atoms (not just those in the
|
||||
specified group) will still migrate to new processors as they move.
|
||||
The group specified with this option must also be specified via the
|
||||
<A HREF = "atom_modify.html">atom_modify first</A> command.
|
||||
</P>
|
||||
<P>The <I>vel</I> option enables velocity information to be communicated with
|
||||
<P>The <I>vel</I> keyword enables velocity information to be communicated with
|
||||
ghost particles. Depending on the <A HREF = "atom_style.html">atom_style</A>,
|
||||
velocity info includes the translational velocity, angular velocity,
|
||||
and angular momentum of a particle. If the <I>vel</I> option is set to
|
||||
|
@ -131,12 +140,12 @@ that boundary (e.g. due to dilation or shear).
|
|||
</P>
|
||||
<P><B>Related commands:</B>
|
||||
</P>
|
||||
<P><A HREF = "neighbor.html">neighbor</A>
|
||||
<P><A HREF = "comm_style.html">comm_style</A>, <A HREF = "neighbor.html">neighbor</A>
|
||||
</P>
|
||||
<P><B>Default:</B>
|
||||
</P>
|
||||
<P>The default settings are style = single, group = all, cutoff = 0.0,
|
||||
vel = no. The cutoff default of 0.0 means that ghost cutoff =
|
||||
neighbor cutoff = pairwise force cutoff + neighbor skin.
|
||||
<P>The option defaults are mode = single, group = all, cutoff = 0.0, vel =
|
||||
no. The cutoff default of 0.0 means that ghost cutoff = neighbor
|
||||
cutoff = pairwise force cutoff + neighbor skin.
|
||||
</P>
|
||||
</HTML>
|
|
@ -6,15 +6,15 @@
|
|||
|
||||
:line
|
||||
|
||||
communicate command :h3
|
||||
comm_modify command :h3
|
||||
|
||||
[Syntax:]
|
||||
|
||||
communicate style keyword value ... :pre
|
||||
comm_modify keyword value ... :pre
|
||||
|
||||
style = {single} or {multi} :ulb,l
|
||||
zero or more keyword/value pairs may be appended :l
|
||||
keyword = {cutoff} or {group} or {vel} :l
|
||||
zero or more keyword/value pairs may be appended :ulb,l
|
||||
keyword = {mode} or {cutoff} or {group} or {vel} :l
|
||||
{mode} value = {single} or {multi} = communicate atoms within a single or multiple distances
|
||||
{cutoff} value = Rcut (distance units) = communicate atoms from this far away
|
||||
{group} value = group-ID = only communicate atoms in the group
|
||||
{vel} value = {yes} or {no} = do or do not communicate velocity info with ghost atoms :pre
|
||||
|
@ -22,32 +22,42 @@ keyword = {cutoff} or {group} or {vel} :l
|
|||
|
||||
[Examples:]
|
||||
|
||||
communicate multi
|
||||
communicate multi group solvent
|
||||
communicate single vel yes
|
||||
communicate single cutoff 5.0 vel yes :pre
|
||||
comm_modify mode multi
|
||||
comm_modify mode multi group solvent
|
||||
comm_modify vel yes
|
||||
comm_modify cutoff 5.0 vel yes :pre
|
||||
|
||||
[Description:]
|
||||
|
||||
This command sets the style of inter-processor communication that
|
||||
occurs each timestep as atom coordinates and other properties are
|
||||
exchanged between neighboring processors and stored as properties of
|
||||
ghost atoms.
|
||||
This command sets parameters that affect the inter-processor
|
||||
communication of atom information that occurs each timestep as
|
||||
coordinates and other properties are exchanged between neighboring
|
||||
processors and stored as properties of ghost atoms.
|
||||
|
||||
The default style is {single} which means each processor acquires
|
||||
IMPORTANT NOTE: These options apply to the currently defined comm
|
||||
style. When you specify a "comm_style"_comm_style.html command, all
|
||||
communication settings are restored to their default values, including
|
||||
those previously reset by a comm_modify command. Thus if your input
|
||||
script specifies a comm_style command, you should use the comm_modify
|
||||
command after it.
|
||||
|
||||
The {mode} keyword determines whether a single or multiple cutoff
|
||||
distances are used to determine which atoms to communicate.
|
||||
|
||||
The default mode is {single} which means each processor acquires
|
||||
information for ghost atoms that are within a single distance from its
|
||||
sub-domain. The distance is the maximum of the neighbor cutoff for
|
||||
all atom type pairs.
|
||||
|
||||
For many systems this is an efficient algorithm, but for systems with
|
||||
widely varying cutoffs for different type pairs, the {multi} style can
|
||||
widely varying cutoffs for different type pairs, the {multi} mode can
|
||||
be faster. In this case, each atom type is assigned its own distance
|
||||
cutoff for communication purposes, and fewer atoms will be
|
||||
communicated. See the "neighbor multi"_neighbor.html command for a
|
||||
neighbor list construction option that may also be beneficial for
|
||||
simulations of this kind.
|
||||
|
||||
The {cutoff} option allows you to set a ghost cutoff distance, which
|
||||
The {cutoff} keyword allows you to set a ghost cutoff distance, which
|
||||
is the distance from the borders of a processor's sub-domain at which
|
||||
ghost atoms are acquired from other processors. By default the ghost
|
||||
cutoff = neighbor cutoff = pairwise force cutoff + neighbor skin. See
|
||||
|
@ -98,14 +108,14 @@ will typically lead to bad dynamics (i.e. the bond length is now the
|
|||
simulation box length). To detect if this is happening, see the
|
||||
"neigh_modify cluster"_neigh_modify.html command.
|
||||
|
||||
The {group} option will limit communication to atoms in the specified
|
||||
The {group} keyword will limit communication to atoms in the specified
|
||||
group. This can be useful for models where no ghost atoms are needed
|
||||
for some kinds of particles. All atoms (not just those in the
|
||||
specified group) will still migrate to new processors as they move.
|
||||
The group specified with this option must also be specified via the
|
||||
"atom_modify first"_atom_modify.html command.
|
||||
|
||||
The {vel} option enables velocity information to be communicated with
|
||||
The {vel} keyword enables velocity information to be communicated with
|
||||
ghost particles. Depending on the "atom_style"_atom_style.html,
|
||||
velocity info includes the translational velocity, angular velocity,
|
||||
and angular momentum of a particle. If the {vel} option is set to
|
||||
|
@ -124,10 +134,10 @@ that boundary (e.g. due to dilation or shear).
|
|||
|
||||
[Related commands:]
|
||||
|
||||
"neighbor"_neighbor.html
|
||||
"comm_style"_comm_style.html, "neighbor"_neighbor.html
|
||||
|
||||
[Default:]
|
||||
|
||||
The default settings are style = single, group = all, cutoff = 0.0,
|
||||
vel = no. The cutoff default of 0.0 means that ghost cutoff =
|
||||
neighbor cutoff = pairwise force cutoff + neighbor skin.
|
||||
The option defaults are mode = single, group = all, cutoff = 0.0, vel =
|
||||
no. The cutoff default of 0.0 means that ghost cutoff = neighbor
|
||||
cutoff = pairwise force cutoff + neighbor skin.
|
|
@ -0,0 +1,70 @@
|
|||
<HTML>
|
||||
<CENTER><A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> - <A HREF = "Manual.html">LAMMPS Documentation</A> - <A HREF = "Section_commands.html#comm">LAMMPS Commands</A>
|
||||
</CENTER>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<HR>
|
||||
|
||||
<H3>comm_style command
|
||||
</H3>
|
||||
<P><B>Syntax:</B>
|
||||
</P>
|
||||
<PRE>comm_style style
|
||||
</PRE>
|
||||
<UL><LI>style = <I>brick</I> or <I>tiled</I>
|
||||
</UL>
|
||||
<P><B>Examples:</B>
|
||||
</P>
|
||||
<PRE>comm_style brick
|
||||
comm_style tiled
|
||||
</PRE>
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>This command sets the style of inter-processor communication of atom
|
||||
information that occurs each timestep as coordinates and other
|
||||
properties are exchanged between neighboring processors and stored as
|
||||
properties of ghost atoms.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: The <I>tiled</I> style is not yet implemented.
|
||||
</P>
|
||||
<P>For the default <I>brick</I> style, the domain decomposition used by LAMMPS
|
||||
to partition the simulation box must be a regular 3d grid of bricks,
|
||||
one per processor. Each processor communicates with its 6 Cartesian
|
||||
neighbors in the grid to acquire information for nearby atoms.
|
||||
</P>
|
||||
<P>For the <I>tiled</I> style, a more general domain decomposition can be
|
||||
used, as triggered by the <A HREF = "balance.html">balance</A> or <A HREF = "fix_balance.html">fix
|
||||
balance</A> commands. The simulation box can be
|
||||
partitioned into non-overlapping rectangular-shaped "tiles" of varying
|
||||
sizes and shapes. Again there is one tile per processor. To acquire
|
||||
information for nearby atoms, communication must now be done with a
|
||||
more complex pattern of neighboring processors.
|
||||
</P>
|
||||
<P>Note that this command does not actually define a partitioning of the
|
||||
simulation box (a domain decomposition), rather it determines what
|
||||
kinds of decompositions are allowed and the pattern of communication
|
||||
used to enable the decomposition. A decomposition is created when the
|
||||
simulation box is first created, via the <A HREF = "create_box.html">create_box</A>
|
||||
or <A HREF = "read_data.html">read_data</A> or <A HREF = "read_restart.html">read_restart</A>
|
||||
commands. For both the <I>brick</I> and <I>tiled</I> styles, the initial
|
||||
decomposition will be the same, as described by
|
||||
<A HREF = "create_box.html">create_box</A> and <A HREF = "processors.html">processors</A>
|
||||
commands. The decomposition can be changed via the
|
||||
<A HREF = "balance.html">balance</A> or <A HREF = "fix_balance.html">fix_balance</A> commands.
|
||||
</P>
|
||||
<P><B>Restrictions:</B> none
|
||||
</P>
|
||||
<P><B>Related commands:</B>
|
||||
</P>
|
||||
<P><A HREF = "comm_modify.html">comm_modify</A>, <A HREF = "processors.html">processors</A>,
|
||||
<A HREF = "balance.html">balance</A>, <A HREF = "fix_balance.html">fix balance</A>
|
||||
</P>
|
||||
<P><B>Default:</B>
|
||||
</P>
|
||||
<P>The default style is brick.
|
||||
</P>
|
||||
</HTML>
|
|
@ -0,0 +1,65 @@
|
|||
"LAMMPS WWW Site"_lws - "LAMMPS Documentation"_ld - "LAMMPS Commands"_lc :c
|
||||
|
||||
:link(lws,http://lammps.sandia.gov)
|
||||
:link(ld,Manual.html)
|
||||
:link(lc,Section_commands.html#comm)
|
||||
|
||||
:line
|
||||
|
||||
comm_style command :h3
|
||||
|
||||
[Syntax:]
|
||||
|
||||
comm_style style :pre
|
||||
|
||||
style = {brick} or {tiled} :ul
|
||||
|
||||
[Examples:]
|
||||
|
||||
comm_style brick
|
||||
comm_style tiled :pre
|
||||
|
||||
[Description:]
|
||||
|
||||
This command sets the style of inter-processor communication of atom
|
||||
information that occurs each timestep as coordinates and other
|
||||
properties are exchanged between neighboring processors and stored as
|
||||
properties of ghost atoms.
|
||||
|
||||
IMPORTANT NOTE: The {tiled} style is not yet implemented.
|
||||
|
||||
For the default {brick} style, the domain decomposition used by LAMMPS
|
||||
to partition the simulation box must be a regular 3d grid of bricks,
|
||||
one per processor. Each processor communicates with its 6 Cartesian
|
||||
neighbors in the grid to acquire information for nearby atoms.
|
||||
|
||||
For the {tiled} style, a more general domain decomposition can be
|
||||
used, as triggered by the "balance"_balance.html or "fix
|
||||
balance"_fix_balance.html commands. The simulation box can be
|
||||
partitioned into non-overlapping rectangular-shaped "tiles" of varying
|
||||
sizes and shapes. Again there is one tile per processor. To acquire
|
||||
information for nearby atoms, communication must now be done with a
|
||||
more complex pattern of neighboring processors.
|
||||
|
||||
Note that this command does not actually define a partitioning of the
|
||||
simulation box (a domain decomposition), rather it determines what
|
||||
kinds of decompositions are allowed and the pattern of communication
|
||||
used to enable the decomposition. A decomposition is created when the
|
||||
simulation box is first created, via the "create_box"_create_box.html
|
||||
or "read_data"_read_data.html or "read_restart"_read_restart.html
|
||||
commands. For both the {brick} and {tiled} styles, the initial
|
||||
decomposition will be the same, as described by
|
||||
"create_box"_create_box.html and "processors"_processors.html
|
||||
commands. The decomposition can be changed via the
|
||||
"balance"_balance.html or "fix_balance"_fix_balance.html commands.
|
||||
|
||||
[Restrictions:] none
|
||||
|
||||
[Related commands:]
|
||||
|
||||
"comm_modify"_comm_modify.html, "processors"_processors.html,
|
||||
"balance"_balance.html, "fix balance"_fix_balance.html
|
||||
|
||||
[Default:]
|
||||
|
||||
The default style is brick.
|
|
@ -44,7 +44,12 @@ create_box 2 mybox bond/types 2 extra/bond/per/atom 1
|
|||
</P>
|
||||
<P>This command creates a simulation box based on the specified region.
|
||||
Thus a <A HREF = "region.html">region</A> command must first be used to define a
|
||||
geometric domain.
|
||||
geometric domain. It also partitions the simulation box into a
|
||||
regular 3d grid of rectangular bricks, one per processor, based on the
|
||||
number of processors being used and the settings of the
|
||||
<A HREF = "processors.html">processors</A> command. The partitioning can later be
|
||||
changed by the <A HREF = "balance.html">balance</A> or <A HREF = "fix_balance.html">fix
|
||||
balance</A> commands.
|
||||
</P>
|
||||
<P>The argument N is the number of atom types that will be used in the
|
||||
simulation.
|
||||
|
@ -94,13 +99,14 @@ you should not make the lo/hi box dimensions (as defined in your
|
|||
of the atoms you eventually plan to create, e.g. via the
|
||||
<A HREF = "create_atoms.html">create_atoms</A> command. For example, if your atoms
|
||||
extend from 0 to 50, you should not specify the box bounds as -10000
|
||||
and 10000. This is because LAMMPS uses the specified box size to
|
||||
layout the 3d grid of processors. A huge (mostly empty) box will be
|
||||
sub-optimal for performance when using "fixed" boundary conditions
|
||||
(see the <A HREF = "boundary.html">boundary</A> command). When using "shrink-wrap"
|
||||
boundary conditions (see the <A HREF = "boundary.html">boundary</A> command), a huge
|
||||
(mostly empty) box may cause a parallel simulation to lose atoms the
|
||||
first time that LAMMPS shrink-wraps the box around the atoms.
|
||||
and 10000. This is because as described above, LAMMPS uses the
|
||||
specified box size to lay out the 3d grid of processors.  A huge
|
||||
(mostly empty) box will be sub-optimal for performance when using
|
||||
"fixed" boundary conditions (see the <A HREF = "boundary.html">boundary</A>
|
||||
command). When using "shrink-wrap" boundary conditions (see the
|
||||
<A HREF = "boundary.html">boundary</A> command), a huge (mostly empty) box may cause
|
||||
a parallel simulation to lose atoms the first time that LAMMPS
|
||||
shrink-wraps the box around the atoms.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
|
|
|
@ -36,7 +36,12 @@ create_box 2 mybox bond/types 2 extra/bond/per/atom 1 :pre
|
|||
|
||||
This command creates a simulation box based on the specified region.
|
||||
Thus a "region"_region.html command must first be used to define a
|
||||
geometric domain.
|
||||
geometric domain. It also partitions the simulation box into a
|
||||
regular 3d grid of rectangular bricks, one per processor, based on the
|
||||
number of processors being used and the settings of the
|
||||
"processors"_processors.html command. The partitioning can later be
|
||||
changed by the "balance"_balance.html or "fix
|
||||
balance"_fix_balance.html commands.
|
||||
|
||||
The argument N is the number of atom types that will be used in the
|
||||
simulation.
|
||||
|
@ -86,13 +91,14 @@ you should not make the lo/hi box dimensions (as defined in your
|
|||
of the atoms you eventually plan to create, e.g. via the
|
||||
"create_atoms"_create_atoms.html command. For example, if your atoms
|
||||
extend from 0 to 50, you should not specify the box bounds as -10000
|
||||
and 10000. This is because LAMMPS uses the specified box size to
|
||||
layout the 3d grid of processors. A huge (mostly empty) box will be
|
||||
sub-optimal for performance when using "fixed" boundary conditions
|
||||
(see the "boundary"_boundary.html command). When using "shrink-wrap"
|
||||
boundary conditions (see the "boundary"_boundary.html command), a huge
|
||||
(mostly empty) box may cause a parallel simulation to lose atoms the
|
||||
first time that LAMMPS shrink-wraps the box around the atoms.
|
||||
and 10000. This is because as described above, LAMMPS uses the
|
||||
specified box size to lay out the 3d grid of processors.  A huge
|
||||
(mostly empty) box will be sub-optimal for performance when using
|
||||
"fixed" boundary conditions (see the "boundary"_boundary.html
|
||||
command). When using "shrink-wrap" boundary conditions (see the
|
||||
"boundary"_boundary.html command), a huge (mostly empty) box may cause
|
||||
a parallel simulation to lose atoms the first time that LAMMPS
|
||||
shrink-wraps the box around the atoms.
|
||||
|
||||
:line
|
||||
|
||||
|
|
|
@ -13,7 +13,7 @@
|
|||
</H3>
|
||||
<P><B>Syntax:</B>
|
||||
</P>
|
||||
<PRE>fix ID group-ID balance Nfreq dimstr Niter thresh keyword value ...
|
||||
<PRE>fix ID group-ID balance Nfreq thresh style args keyword value ...
|
||||
</PRE>
|
||||
<UL><LI>ID, group-ID are documented in <A HREF = "fix.html">fix</A> command
|
||||
|
||||
|
@ -21,76 +21,130 @@
|
|||
|
||||
<LI>Nfreq = perform dynamic load balancing every this many steps
|
||||
|
||||
<LI>dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
<LI>thresh = imbalance threshold that must be exceeded to perform a re-balance
|
||||
|
||||
<LI>Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
<LI>style = <I>shift</I> or <I>rcb</I>
|
||||
|
||||
<LI>thresh = stop balancing when this imbalance threshhold is reached
|
||||
<PRE> shift args = dimstr Niter stopthresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
    stopthresh = stop balancing when this imbalance threshold is reached
|
||||
rcb args = none
|
||||
</PRE>
|
||||
<LI>zero or more keyword/value pairs may be appended
|
||||
|
||||
<LI>zero or more keyword/arg pairs may be appended
|
||||
</UL>
|
||||
<LI>keyword = <I>out</I>
|
||||
|
||||
<PRE> <I>out</I> arg = filename
|
||||
filename = output file to write each processor's sub-domain to
|
||||
<PRE> <I>out</I> value = filename
|
||||
filename = write each processor's sub-domain to a file, at each re-balancing
|
||||
</PRE>
|
||||
|
||||
</UL>
|
||||
<P><B>Examples:</B>
|
||||
</P>
|
||||
<PRE>fix 2 all balance 1000 x 10 1.05
|
||||
fix 2 all balance 0 xy 20 1.1 out tmp.balance
|
||||
<PRE>fix 2 all balance 1000 1.05 shift x 10 1.05
|
||||
fix 2 all balance 100 0.9 shift xy 20 1.1 out tmp.balance
|
||||
fix 2 all balance 1000 1.1 rcb
|
||||
</PRE>
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>This command adjusts the size of processor sub-domains within the
|
||||
simulation box dynamically as a simulation runs, to attempt to balance
|
||||
the number of particles and thus the computational cost (load) evenly
|
||||
across processors. The load balancing is "dynamic" in the sense that
|
||||
<P>This command adjusts the size and shape of processor sub-domains
|
||||
within the simulation box, to attempt to balance the number of
|
||||
particles and thus the computational cost (load) evenly across
|
||||
processors. The load balancing is "dynamic" in the sense that
|
||||
rebalancing is performed periodically during the simulation. To
|
||||
perform "static" balancing, before of between runs, see the
|
||||
perform "static" balancing, before or between runs, see the
|
||||
<A HREF = "balance.html">balance</A> command.
|
||||
</P>
|
||||
<P>Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution. E.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domain, with one equal-volume sub-domain per procesor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
<P>Load-balancing is typically only useful if the particles in the
|
||||
simulation box have a spatially-varying density distribution. E.g. a
|
||||
model of a vapor/liquid interface, or a solid with an irregular-shaped
|
||||
geometry containing void regions. In this case, the LAMMPS default of
|
||||
dividing the simulation box volume into a regular-spaced grid of 3d
|
||||
bricks, with one equal-volume sub-domain per processor, may assign very
|
||||
different numbers of particles per processor. This can lead to poor
|
||||
performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
</P>
|
||||
<P>Note that the <A HREF = "processors.html">processors</A> command gives you some
|
||||
control over how the box volume is split across
|
||||
processors. Specifically, for a Px by Py by Pz grid of processors, it
|
||||
lets you choose Px, Py, and Pz, subject to the constraint that Px * Py
|
||||
* Pz = P, the total number of processors. This can be sufficient to
|
||||
achieve good load-balance for some models on some processor
|
||||
counts. However, all the processor sub-domains will still be the same
|
||||
shape and have the same volume.
|
||||
<P>Note that the <A HREF = "processors.html">processors</A> command allows some control
|
||||
over how the box volume is split across processors. Specifically, for
|
||||
a Px by Py by Pz grid of processors, it allows choice of Px, Py, and
|
||||
Pz, subject to the constraint that Px * Py * Pz = P, the total number
|
||||
of processors. This is sufficient to achieve good load-balance for
|
||||
many models on many processor counts. However, all the processor
|
||||
sub-domains will still have the same shape and same volume.
|
||||
</P>
|
||||
<P>This command does not alter the topology of the Px by Py by Pz grid or
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
<P>On a particular timestep, a load-balancing operation is only performed
|
||||
if the current "imbalance factor" in particles owned by each processor
|
||||
exceeds the specified <I>thresh</I> parameter. This factor is defined as
|
||||
the maximum number of particles owned by any processor, divided by the
|
||||
average number of particles per processor. Thus an imbalance factor
|
||||
of 1.0 is perfect balance. For 10000 particles running on 10
|
||||
processors, if the most heavily loaded processor has 1200 particles,
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. Note that
|
||||
re-balances can be forced even if the current balance is perfect (1.0)
|
||||
by specifying a <I>thresh</I> < 1.0.
|
||||
</P>
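<P>The imbalance factor defined above can be written out as a short calculation. This is a minimal sketch, not the LAMMPS source: the function names and the per-processor particle counts are hypothetical, chosen to reproduce the example of 10000 particles on 10 processors.
</P>

```python
def imbalance_factor(counts):
    """Maximum particles on any processor divided by the average per processor."""
    average = sum(counts) / len(counts)
    return max(counts) / average


def needs_rebalance(counts, thresh):
    """A re-balance is performed only if the factor exceeds thresh."""
    return imbalance_factor(counts) > thresh


if __name__ == "__main__":
    # Hypothetical counts: 10000 particles on 10 processors, busiest has 1200.
    counts = [1200, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 900, 900]
    print(imbalance_factor(counts))
    print(needs_rebalance(counts, 1.05))
```

<P>For these counts the factor is 1200 / 1000 = 1.2, so a <I>thresh</I> of 1.05 would trigger a re-balance. A perfectly balanced system has a factor of 1.0, which still exceeds any <I>thresh</I> below 1.0, consistent with the note about forcing re-balances.
</P>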
|
||||
<P>IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But depending on the method a perfect
|
||||
balance (1.0) may not be achieved. For example, "grid" methods
|
||||
(defined below) that create a logical 3d grid cannot achieve perfect
|
||||
balance for many irregular distributions of particles. Likewise, if a
|
||||
portion of the system is a perfect lattice, e.g. the initial system is
|
||||
generated by the <A HREF = "create_atoms.html">create_atoms</A> command, then "grid"
|
||||
methods may be unable to achieve exact balance. This is because
|
||||
entire lattice planes will be owned or not owned by a single
|
||||
processor.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: Computational cost is not strictly proportional to
|
||||
particle count, and changing the relative size and shape of processor
|
||||
sub-domains may lead to additional computational and communication
|
||||
overheads, e.g. in the PPPM solver used via the
|
||||
<A HREF = "kspace_style.html">kspace_style</A> command. Thus you should benchmark
|
||||
the run times of a simulation before and after balancing.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The method used to perform a load balance is specified by one of the
|
||||
listed styles, which are described in detail below. There are 2 kinds
|
||||
of styles.
|
||||
</P>
|
||||
<P>The <I>shift</I> style is a "grid" method which produces a logical 3d grid
|
||||
of processors. It operates by changing the cutting planes (or lines)
|
||||
between processors in 3d (or 2d), to adjust the volume (area in 2d)
|
||||
assigned to each processor, as in the following 2d diagram. The left
|
||||
diagram is the default partitioning of the simulation box across
|
||||
processors (one sub-box for each of 16 processors); the right diagram
|
||||
is after balancing.
|
||||
</P>
|
||||
<CENTER><IMG SRC = "JPG/balance.jpg">
|
||||
</CENTER>
|
||||
<P>IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particuarly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the <A HREF = "kspace_style.html">kspace_style</A> command. Thus
|
||||
you should benchmark the run times of your simulation with and without
|
||||
balancing.
|
||||
<P>The <I>rcb</I> style is a "tiling" method which does not produce a logical
|
||||
3d grid of processors. Rather it tiles the simulation domain with
|
||||
rectangular sub-boxes of varying size and shape in an irregular
|
||||
fashion so as to have equal numbers of particles in each sub-box, as
|
||||
in the following 2d diagram. Again the left diagram is the default
|
||||
partitioning of the simulation box across processors (one sub-box for
|
||||
each of 16 processors); the right diagram is after balancing.
|
||||
</P>
|
||||
<P>NOTE: Need a diagram of RCB partitioning.
|
||||
</P>
|
||||
<P>The "grid" methods can be used with either of the
|
||||
<A HREF = "comm_style.html">comm_style</A> command options, <I>brick</I> or <I>tiled</I>. The
|
||||
"tiling" methods can only be used with <A HREF = "comm_style.html">comm_style
|
||||
tiled</A>.
|
||||
</P>
|
||||
<P>When a "grid" method is specified, the current domain partitioning can
|
||||
be either a logical 3d grid or a tiled partitioning. In the former
|
||||
case, the current logical 3d grid is used as a starting point and
|
||||
changes are made to improve the imbalance factor. In the latter case,
|
||||
the tiled partitioning is discarded and a logical 3d grid is created
|
||||
with uniform spacing in all dimensions. This becomes the starting
|
||||
point for the balancing operation.
|
||||
</P>
|
||||
<P>When a "tiling" method is specified, the current domain partitioning
|
||||
("grid" or "tiled") is ignored, and a new partitioning is computed
|
||||
from scratch.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
|
@ -103,8 +157,8 @@ particles.
|
|||
</P>
|
||||
<P>The <I>Nfreq</I> setting determines how often a rebalance is performed. If
|
||||
<I>Nfreq</I> > 0, then rebalancing will occur every <I>Nfreq</I> steps. Each
|
||||
time a rebalance occurs, a reneighboring is triggered, so you should
|
||||
not make <I>Nfreq</I> too small. If <I>Nfreq</I> = 0, then rebalancing will be
|
||||
time a rebalance occurs, a reneighboring is triggered, so <I>Nfreq</I>
|
||||
should not be too small. If <I>Nfreq</I> = 0, then rebalancing will be
|
||||
done every time reneighboring normally occurs, as determined by the
|
||||
<A HREF = "neighbor.html">neighbor</A> and <A HREF = "neigh_modify.html">neigh_modify</A>
|
||||
command settings.
|
||||
|
@ -112,6 +166,12 @@ command settings.
|
|||
<P>On rebalance steps, rebalancing will only be attempted if the current
|
||||
imbalance factor, as defined above, exceeds the <I>thresh</I> setting.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>shift</I> style invokes a "grid" method for balancing, as described
|
||||
above. It changes the positions of cutting planes between processors
|
||||
in an iterative fashion, seeking to reduce the imbalance factor.
|
||||
</P>
|
||||
<P>The <I>dimstr</I> argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". Each character can appear zero or one time,
|
||||
since there is no advantage to balancing on a dimension more than
|
||||
|
@ -122,61 +182,61 @@ to be a density variation in the particles.
|
|||
dimensions listed in <I>dimstr</I>, one dimension at a time. For a single
|
||||
dimension, the balancing operation (described below) is iterated on up
|
||||
to <I>Niter</I> times. After each dimension finishes, the imbalance factor
|
||||
is re-computed, and the balancing operation halts if the <I>thresh</I>
|
||||
is re-computed, and the balancing operation halts if the <I>stopthresh</I>
|
||||
criterion is met.
|
||||
</P>
|
||||
<P>A rebalance operation in a single dimension is performed using a
|
||||
density-dependent recursive multisectioning algorithm, where the
|
||||
position of each cutting plane (line in 2d) in the dimension is
|
||||
adjusted independently. This is similar to a recursive bisectioning
|
||||
(RCB) for a single value, except that the bounds used for each
|
||||
bisectioning take advantage of information from neighboring cuts if
|
||||
possible, as well as counts of particles at the bounds on either side
|
||||
of each cuts, which themselves were cuts in previous iterations. The
|
||||
latter is used to infer a density of pariticles near each of the
|
||||
current cuts. At each iteration, the count of particles on either
|
||||
side of each plane is tallied. If the counts do not match the target
|
||||
value for the plane, the position of the cut is adjusted based on the
|
||||
local density. The low and high bounds are adjusted on each
|
||||
iteration, using new count information, so that they become closer
|
||||
together over time. Thus as the recustion progresses, the count of
|
||||
particles on either side of the plane gets closer to the target value.
|
||||
for a single value, except that the bounds used for each bisectioning
|
||||
take advantage of information from neighboring cuts if possible, as
|
||||
well as counts of particles at the bounds on either side of each cut,
|
||||
which themselves were cuts in previous iterations. The latter is used
|
||||
to infer a density of particles near each of the current cuts. At
|
||||
each iteration, the count of particles on either side of each plane is
|
||||
tallied. If the counts do not match the target value for the plane,
|
||||
the position of the cut is adjusted based on the local density. The
|
||||
low and high bounds are adjusted on each iteration, using new count
|
||||
information, so that they become closer together over time. Thus as
|
||||
the recursion progresses, the count of particles on either side of the
|
||||
plane gets closer to the target value.
|
||||
</P>
|
||||
<P>The density-dependent part of this algorithm is often an advantage
|
||||
when you rebalance a system that is already nearly balanced. It
|
||||
typically converges more quickly than the geometric bisectioning
|
||||
algorithm used by the <A HREF = "balance.html">balance</A> command. However, it can
|
||||
be a disadvants if you attempt to rebalance a system that is far from
|
||||
balanced, and converge more slowly. In this case you probably want to
|
||||
use the <A HREF = "balance.html">balance</A> command before starting a run, so that
|
||||
you begin the run with a balanced system.
|
||||
be a disadvantage if you attempt to rebalance a system that is far
|
||||
from balanced, and converge more slowly. In this case you probably
|
||||
want to use the <A HREF = "balance.html">balance</A> command before starting a run,
|
||||
so that you begin the run with a balanced system.
|
||||
</P>
|
||||
<P>Once the rebalancing is complete and final processor sub-domains
|
||||
assigned, particles migrate to their new owning processor as part of
|
||||
the normal reneighboring procedure.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: At each rebalance operation, the RCB operation for
|
||||
each cutting plane (line in 2d) typcially starts with low and high
|
||||
bounds separated by the extent of a processor's sub-domain in one
|
||||
dimension. The size of this bracketing region shrinks based on the
|
||||
local density, as described above, which should typically be 1/2 or
|
||||
more every iteration. Thus if <I>Niter</I> is specified as 10, the cutting
|
||||
plane will typically be positioned to better than 1 part in 1000
|
||||
accuracy (relative to the perfect target position). For <I>Niter</I> = 20,
|
||||
it will be accurate to better than 1 part in a million. Thus there is
|
||||
no need to set <I>Niter</I> to a large value. This is especially true if
|
||||
you are rebalancing often enough that each time you expect only an
|
||||
incremental adjustement in the cutting planes is necessary. LAMMPS
|
||||
will check if the threshold accuracy is reached (in a dimension) is
|
||||
less iterations than <I>Niter</I> and exit early.
|
||||
<P>IMPORTANT NOTE: At each rebalance operation, the bisectioning for each
|
||||
cutting plane (line in 2d) typically starts with low and high bounds
|
||||
separated by the extent of a processor's sub-domain in one dimension.
|
||||
The size of this bracketing region shrinks based on the local density,
|
||||
as described above, which should typically be 1/2 or more every
|
||||
iteration. Thus if <I>Niter</I> is specified as 10, the cutting plane will
|
||||
typically be positioned to better than 1 part in 1000 accuracy
|
||||
(relative to the perfect target position). For <I>Niter</I> = 20, it will
|
||||
be accurate to better than 1 part in a million. Thus there is no need
|
||||
to set <I>Niter</I> to a large value. This is especially true if you are
|
||||
rebalancing often enough that each time you expect only an incremental
|
||||
adjustment in the cutting planes is necessary. LAMMPS will check if
|
||||
the threshold accuracy is reached (in a dimension) in fewer iterations
|
||||
than <I>Niter</I> and exit early.
|
||||
</P>
|
||||
<P>IMPORTANT NOTE: If a portion of your system is a perfect lattice,
|
||||
e.g. a frozen substrate, then the balancer may be unable to achieve
|
||||
exact balance. I.e. entire lattice planes will be owned or not owned
|
||||
by a single processor. So you should not expect to achieve
|
||||
perfect balance in this case. Nor will it be helpful to use a large
|
||||
value for <I>Niter</I>, since it will simply cause the balancer to iterate
|
||||
until <I>Niter</I> is reached, without improving the imbalance factor.
|
||||
<HR>
|
||||
|
||||
<P>The <I>rcb</I> style invokes a "tiled" method for balancing, as described
|
||||
above. It performs a recursive coordinate bisectioning (RCB) of the
|
||||
simulation domain.
|
||||
</P>
|
||||
<P>Need further description of RCB.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
|
@ -262,7 +322,10 @@ minimization</A>.
|
|||
</P>
|
||||
<HR>
|
||||
|
||||
<P><B>Restrictions:</B> none
|
||||
<P><B>Restrictions:</B>
|
||||
</P>
|
||||
<P>For 2d simulations, a "z" cannot appear in <I>dimstr</I> for the <I>shift</I>
|
||||
style.
|
||||
</P>
|
||||
<P><B>Related commands:</B>
|
||||
</P>
|
||||
|
|
|
@ -10,75 +10,129 @@ fix balance command :h3
|
|||
|
||||
[Syntax:]
|
||||
|
||||
fix ID group-ID balance Nfreq dimstr Niter thresh keyword value ... :pre
|
||||
fix ID group-ID balance Nfreq thresh style args keyword value ... :pre
|
||||
|
||||
ID, group-ID are documented in "fix"_fix.html command :ulb,l
|
||||
balance = style name of this fix command :l
|
||||
Nfreq = perform dynamic load balancing every this many steps :l
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once :l
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence :l
|
||||
thresh = stop balancing when this imbalance threshhold is reached :l
|
||||
zero or more keyword/arg pairs may be appended :ule,l
|
||||
thresh = imbalance threshold that must be exceeded to perform a re-balance :l
|
||||
style = {shift} or {rcb} :l
|
||||
shift args = dimstr Niter stopthresh
|
||||
dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
|
||||
Niter = # of times to iterate within each dimension of dimstr sequence
|
||||
stopthresh = stop balancing when this imbalance threshold is reached
|
||||
rcb args = none :pre
|
||||
zero or more keyword/value pairs may be appended :l
|
||||
keyword = {out} :l
|
||||
{out} arg = filename
|
||||
filename = output file to write each processor's sub-domain to :pre
|
||||
{out} value = filename
|
||||
filename = write each processor's sub-domain to a file, at each re-balancing :pre
|
||||
:ule
|
||||
|
||||
[Examples:]
|
||||
|
||||
fix 2 all balance 1000 x 10 1.05
|
||||
fix 2 all balance 0 xy 20 1.1 out tmp.balance :pre
|
||||
fix 2 all balance 1000 1.05 shift x 10 1.05
|
||||
fix 2 all balance 100 0.9 shift xy 20 1.1 out tmp.balance
|
||||
fix 2 all balance 1000 1.1 rcb :pre
|
||||
|
||||
[Description:]
|
||||
|
||||
This command adjusts the size of processor sub-domains within the
|
||||
simulation box dynamically as a simulation runs, to attempt to balance
|
||||
the number of particles and thus the computational cost (load) evenly
|
||||
across processors. The load balancing is "dynamic" in the sense that
|
||||
This command adjusts the size and shape of processor sub-domains
|
||||
within the simulation box, to attempt to balance the number of
|
||||
particles and thus the computational cost (load) evenly across
|
||||
processors. The load balancing is "dynamic" in the sense that
|
||||
rebalancing is performed periodically during the simulation. To
|
||||
perform "static" balancing, before of between runs, see the
|
||||
perform "static" balancing, before or between runs, see the
|
||||
"balance"_balance.html command.
|
||||
|
||||
Load-balancing is only useful if the particles in the simulation box
|
||||
have a spatially-varying density distribution. E.g. a model of a
|
||||
vapor/liquid interface, or a solid with an irregular-shaped geometry
|
||||
containing void regions. In this case, the LAMMPS default of dividing
|
||||
the simulation box volume into a regular-spaced grid of processor
|
||||
sub-domain, with one equal-volume sub-domain per procesor, may assign
|
||||
very different numbers of particles per processor. This can lead to
|
||||
poor performance in a scalability sense, when the simulation is run in
|
||||
Load-balancing is typically only useful if the particles in the
|
||||
simulation box have a spatially-varying density distribution. E.g. a
|
||||
model of a vapor/liquid interface, or a solid with an irregular-shaped
|
||||
geometry containing void regions. In this case, the LAMMPS default of
|
||||
dividing the simulation box volume into a regular-spaced grid of 3d
|
||||
bricks, with one equal-volume sub-domain per processor, may assign very
|
||||
different numbers of particles per processor. This can lead to poor
|
||||
performance in a scalability sense, when the simulation is run in
|
||||
parallel.
|
||||
|
||||
Note that the "processors"_processors.html command gives you some
|
||||
control over how the box volume is split across
|
||||
processors. Specifically, for a Px by Py by Pz grid of processors, it
|
||||
lets you choose Px, Py, and Pz, subject to the constraint that Px * Py
|
||||
* Pz = P, the total number of processors. This can be sufficient to
|
||||
achieve good load-balance for some models on some processor
|
||||
counts. However, all the processor sub-domains will still be the same
|
||||
shape and have the same volume.
|
||||
Note that the "processors"_processors.html command allows some control
|
||||
over how the box volume is split across processors. Specifically, for
|
||||
a Px by Py by Pz grid of processors, it allows choice of Px, Py, and
|
||||
Pz, subject to the constraint that Px * Py * Pz = P, the total number
|
||||
of processors. This is sufficient to achieve good load-balance for
|
||||
many models on many processor counts. However, all the processor
|
||||
sub-domains will still have the same shape and same volume.
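As an illustration of the Px * Py * Pz = P constraint, the following Python sketch enumerates every factorization of P and picks the one whose sub-domains have the smallest surface area. The surface-to-volume criterion is the one the processors doc page describes for the default grid; the function names and the brute-force search are assumptions made for this example and are not the LAMMPS implementation.

```python
def factorizations(p):
    """Return every ordered triple (px, py, pz) with px * py * pz == p."""
    triples = []
    for px in range(1, p + 1):
        if p % px:
            continue
        rest = p // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                triples.append((px, py, rest // py))
    return triples


def subdomain_surface(grid, box):
    """Surface area of one sub-domain when box is split evenly by grid."""
    sx, sy, sz = (length / n for length, n in zip(box, grid))
    return 2.0 * (sx * sy + sy * sz + sx * sz)


def best_grid(p, box):
    """Pick the triple with the smallest sub-domain surface area.

    All sub-domains have the same volume (box volume / p), so this also
    minimizes the surface-to-volume ratio. Ties go to the first triple found.
    """
    return min(factorizations(p), key=lambda grid: subdomain_surface(grid, box))


if __name__ == "__main__":
    print(best_grid(8, (1.0, 1.0, 1.0)))  # cubic box
    print(best_grid(4, (4.0, 1.0, 1.0)))  # box elongated in x
```

Whatever triple is chosen, every sub-domain in this sketch has the same shape and volume, which is why a factorization alone cannot compensate for a non-uniform particle density.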
|
||||
|
||||
This command does not alter the topology of the Px by Py by Pz grid or
|
||||
processors. But it shifts the cutting planes between processors (in
|
||||
3d, or lines in 2d), which adjusts the volume (area in 2d) assigned to
|
||||
each processor, as in the following 2d diagram. The left diagram is
|
||||
the default partitioning of the simulation box across processors (one
|
||||
sub-box for each of 16 processors); the right diagram is after
|
||||
balancing.
|
||||
On a particular timestep, a load-balancing operation is only performed
|
||||
if the current "imbalance factor" in particles owned by each processor
|
||||
exceeds the specified {thresh} parameter. This factor is defined as
|
||||
the maximum number of particles owned by any processor, divided by the
|
||||
average number of particles per processor. Thus an imbalance factor
|
||||
of 1.0 is perfect balance. For 10000 particles running on 10
|
||||
processors, if the most heavily loaded processor has 1200 particles,
|
||||
then the factor is 1.2, meaning there is a 20% imbalance. Note that
|
||||
re-balances can be forced even if the current balance is perfect (1.0)
|
||||
by specifying a {thresh} < 1.0.
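The imbalance factor and the {thresh} test described above can be sketched in a few lines of Python. This is a minimal illustration of the definition only; the function names and the list of per-processor counts are assumptions for the example, not LAMMPS internals.

```python
def imbalance_factor(counts):
    """Maximum particles on any processor divided by the average per processor."""
    if not counts:
        raise ValueError("need at least one processor")
    average = sum(counts) / len(counts)
    if average == 0:
        raise ValueError("no particles to balance")
    return max(counts) / average


def should_rebalance(counts, thresh):
    """A rebalance is attempted only if the factor exceeds thresh."""
    return imbalance_factor(counts) > thresh


if __name__ == "__main__":
    # 10000 particles on 10 processors, the heaviest one owning 1200
    counts = [1200, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 900, 900]
    print(imbalance_factor(counts))
    print(should_rebalance(counts, 1.05))
    # a perfectly balanced system still rebalances when thresh < 1.0
    print(should_rebalance([1000] * 10, 0.9))
```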
|
||||
|
||||
IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But depending on the method a perfect
|
||||
balance (1.0) may not be achieved. For example, "grid" methods
|
||||
(defined below) that create a logical 3d grid cannot achieve perfect
|
||||
balance for many irregular distributions of particles. Likewise, if a
|
||||
portion of the system is a perfect lattice, e.g. the initial system is
|
||||
generated by the "create_atoms"_create_atoms.html command, then "grid"
|
||||
methods may be unable to achieve exact balance. This is because
|
||||
entire lattice planes will be owned or not owned by a single
|
||||
processor.
|
||||
|
||||
IMPORTANT NOTE: Computational cost is not strictly proportional to
|
||||
particle count, and changing the relative size and shape of processor
|
||||
sub-domains may lead to additional computational and communication
|
||||
overheads, e.g. in the PPPM solver used via the
|
||||
"kspace_style"_kspace_style.html command. Thus you should benchmark
|
||||
the run times of a simulation before and after balancing.
|
||||
|
||||
:line
|
||||
|
||||
The method used to perform a load balance is specified by one of the
|
||||
listed styles, which are described in detail below. There are 2 kinds
|
||||
of styles.
|
||||
|
||||
The {shift} style is a "grid" method which produces a logical 3d grid
|
||||
of processors. It operates by changing the cutting planes (or lines)
|
||||
between processors in 3d (or 2d), to adjust the volume (area in 2d)
|
||||
assigned to each processor, as in the following 2d diagram. The left
|
||||
diagram is the default partitioning of the simulation box across
|
||||
processors (one sub-box for each of 16 processors); the right diagram
|
||||
is after balancing.
|
||||
|
||||
:c,image(JPG/balance.jpg)
|
||||
|
||||
IMPORTANT NOTE: This command attempts to minimize the imbalance
|
||||
factor, as defined above. But because of the topology constraint that
|
||||
only the cutting planes (lines) between processors are moved, there
|
||||
are many irregular distributions of particles, where this factor
|
||||
cannot be shrunk to 1.0, particularly in 3d. Also, computational cost
|
||||
is not strictly proportional to particle count, and changing the
|
||||
relative size and shape of processor sub-domains may lead to
|
||||
additional computational and communication overheads, e.g. in the PPPM
|
||||
solver used via the "kspace_style"_kspace_style.html command. Thus
|
||||
you should benchmark the run times of your simulation with and without
|
||||
balancing.
|
||||
The {rcb} style is a "tiling" method which does not produce a logical
|
||||
3d grid of processors. Rather it tiles the simulation domain with
|
||||
rectangular sub-boxes of varying size and shape in an irregular
|
||||
fashion so as to have equal numbers of particles in each sub-box, as
|
||||
in the following 2d diagram. Again the left diagram is the default
|
||||
partitioning of the simulation box across processors (one sub-box for
|
||||
each of 16 processors); the right diagram is after balancing.
|
||||
|
||||
NOTE: Need a diagram of RCB partitioning.
|
||||
|
||||
The "grid" methods can be used with either of the
|
||||
"comm_style"_comm_style.html command options, {brick} or {tiled}. The
|
||||
"tiling" methods can only be used with "comm_style
|
||||
tiled"_comm_style.html.
|
||||
|
||||
When a "grid" method is specified, the current domain partitioning can
|
||||
be either a logical 3d grid or a tiled partitioning. In the former
|
||||
case, the current logical 3d grid is used as a starting point and
|
||||
changes are made to improve the imbalance factor. In the latter case,
|
||||
the tiled partitioning is discarded and a logical 3d grid is created
|
||||
with uniform spacing in all dimensions. This becomes the starting
|
||||
point for the balancing operation.
|
||||
|
||||
When a "tiling" method is specified, the current domain partitioning
|
||||
("grid" or "tiled") is ignored, and a new partitioning is computed
|
||||
from scratch.
|
||||
|
||||
:line
|
||||
|
||||
|
@ -91,8 +145,8 @@ particles.
|
|||
|
||||
The {Nfreq} setting determines how often a rebalance is performed. If
|
||||
{Nfreq} > 0, then rebalancing will occur every {Nfreq} steps. Each
|
||||
time a rebalance occurs, a reneighboring is triggered, so you should
|
||||
not make {Nfreq} too small. If {Nfreq} = 0, then rebalancing will be
|
||||
time a rebalance occurs, a reneighboring is triggered, so {Nfreq}
|
||||
should not be too small. If {Nfreq} = 0, then rebalancing will be
|
||||
done every time reneighboring normally occurs, as determined by the
|
||||
"neighbor"_neighbor.html and "neigh_modify"_neigh_modify.html
|
||||
command settings.
|
||||
|
@ -100,6 +154,12 @@ command settings.
|
|||
On rebalance steps, rebalancing will only be attempted if the current
|
||||
imbalance factor, as defined above, exceeds the {thresh} setting.
|
||||
|
||||
:line
|
||||
|
||||
The {shift} style invokes a "grid" method for balancing, as described
|
||||
above. It changes the positions of cutting planes between processors
|
||||
in an iterative fashion, seeking to reduce the imbalance factor.
|
||||
|
||||
The {dimstr} argument is a string of characters, each of which must be
|
||||
an "x" or "y" or "z". Each character can appear zero or one time,
|
||||
since there is no advantage to balancing on a dimension more than
|
||||
|
@ -110,61 +170,61 @@ Balancing proceeds by adjusting the cutting planes in each of the
|
|||
dimensions listed in {dimstr}, one dimension at a time. For a single
|
||||
dimension, the balancing operation (described below) is iterated on up
|
||||
to {Niter} times. After each dimension finishes, the imbalance factor
|
||||
is re-computed, and the balancing operation halts if the {thresh}
|
||||
is re-computed, and the balancing operation halts if the {stopthresh}
|
||||
criterion is met.
|
||||
|
||||
A rebalance operation in a single dimension is performed using a
|
||||
density-dependent recursive multisectioning algorithm, where the
|
||||
position of each cutting plane (line in 2d) in the dimension is
|
||||
adjusted independently. This is similar to a recursive bisectioning
|
||||
(RCB) for a single value, except that the bounds used for each
|
||||
bisectioning take advantage of information from neighboring cuts if
|
||||
possible, as well as counts of particles at the bounds on either side
|
||||
of each cuts, which themselves were cuts in previous iterations. The
|
||||
latter is used to infer a density of pariticles near each of the
|
||||
current cuts. At each iteration, the count of particles on either
|
||||
side of each plane is tallied. If the counts do not match the target
|
||||
value for the plane, the position of the cut is adjusted based on the
|
||||
local density. The low and high bounds are adjusted on each
|
||||
iteration, using new count information, so that they become closer
|
||||
together over time. Thus as the recustion progresses, the count of
|
||||
particles on either side of the plane gets closer to the target value.
|
||||
for a single value, except that the bounds used for each bisectioning
|
||||
take advantage of information from neighboring cuts if possible, as
|
||||
well as counts of particles at the bounds on either side of each cut,
|
||||
which themselves were cuts in previous iterations. The latter is used
|
||||
to infer a density of particles near each of the current cuts. At
|
||||
each iteration, the count of particles on either side of each plane is
|
||||
tallied. If the counts do not match the target value for the plane,
|
||||
the position of the cut is adjusted based on the local density. The
|
||||
low and high bounds are adjusted on each iteration, using new count
|
||||
information, so that they become closer together over time. Thus as
|
||||
the recursion progresses, the count of particles on either side of the
|
||||
plane gets closer to the target value.
|
||||
|
||||
The density-dependent part of this algorithm is often an advantage
|
||||
when you rebalance a system that is already nearly balanced. It
|
||||
typically converges more quickly than the geometric bisectioning
|
||||
algorithm used by the "balance"_balance.html command. However, it can
|
||||
be a disadvants if you attempt to rebalance a system that is far from
|
||||
balanced, and converge more slowly. In this case you probably want to
|
||||
use the "balance"_balance.html command before starting a run, so that
|
||||
you begin the run with a balanced system.
|
||||
be a disadvantage if you attempt to rebalance a system that is far
|
||||
from balanced, and converge more slowly. In this case you probably
|
||||
want to use the "balance"_balance.html command before starting a run,
|
||||
so that you begin the run with a balanced system.
|
||||
|
||||
Once the rebalancing is complete and final processor sub-domains
|
||||
assigned, particles migrate to their new owning processor as part of
|
||||
the normal reneighboring procedure.
|
||||
|
||||
IMPORTANT NOTE: At each rebalance operation, the RCB operation for
|
||||
each cutting plane (line in 2d) typcially starts with low and high
|
||||
bounds separated by the extent of a processor's sub-domain in one
|
||||
dimension. The size of this bracketing region shrinks based on the
|
||||
local density, as described above, which should typically be 1/2 or
|
||||
more every iteration. Thus if {Niter} is specified as 10, the cutting
|
||||
plane will typically be positioned to better than 1 part in 1000
|
||||
accuracy (relative to the perfect target position). For {Niter} = 20,
|
||||
it will be accurate to better than 1 part in a million. Thus there is
|
||||
no need to set {Niter} to a large value. This is especially true if
|
||||
you are rebalancing often enough that each time you expect only an
|
||||
incremental adjustement in the cutting planes is necessary. LAMMPS
|
||||
will check if the threshold accuracy is reached (in a dimension) is
|
||||
less iterations than {Niter} and exit early.
|
||||
IMPORTANT NOTE: At each rebalance operation, the bisectioning for each
|
||||
cutting plane (line in 2d) typically starts with low and high bounds
|
||||
separated by the extent of a processor's sub-domain in one dimension.
|
||||
The size of this bracketing region shrinks based on the local density,
|
||||
as described above, which should typically be 1/2 or more every
|
||||
iteration. Thus if {Niter} is specified as 10, the cutting plane will
|
||||
typically be positioned to better than 1 part in 1000 accuracy
|
||||
(relative to the perfect target position). For {Niter} = 20, it will
|
||||
be accurate to better than 1 part in a million. Thus there is no need
|
||||
to set {Niter} to a large value. This is especially true if you are
|
||||
rebalancing often enough that each time you expect only an incremental
|
||||
adjustment in the cutting planes is necessary. LAMMPS will check if
|
||||
the threshold accuracy is reached (in a dimension) in fewer iterations
|
||||
than {Niter} and exit early.
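The accuracy figures quoted above follow from the bracketing region shrinking by a factor of 1/2 (or better) per iteration. The Python sketch below works through that arithmetic, assuming the worst case of exactly 1/2 per iteration; the function names are hypothetical and this is not the LAMMPS bisectioning code.

```python
def bracket_fraction(niter, shrink=0.5):
    """Bracket width after niter iterations, relative to the starting width."""
    return shrink ** niter


def iterations_needed(tolerance, shrink=0.5):
    """Smallest iteration count that brings the relative width to tolerance or below."""
    width = 1.0
    niter = 0
    while width > tolerance:
        width *= shrink
        niter += 1
    return niter


if __name__ == "__main__":
    print(bracket_fraction(10))     # 1/1024, better than 1 part in 1000
    print(bracket_fraction(20))     # 1/1048576, better than 1 part in a million
    print(iterations_needed(1e-3))
    print(iterations_needed(1e-6))
```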
|
||||
|
||||
IMPORTANT NOTE: If a portion of your system is a perfect lattice,
|
||||
e.g. a frozen substrate, then the balancer may be unable to achieve
|
||||
exact balance. I.e. entire lattice planes will be owned or not owned
|
||||
by a single processor. So you should not expect to achieve
|
||||
perfect balance in this case. Nor will it be helpful to use a large
|
||||
value for {Niter}, since it will simply cause the balancer to iterate
|
||||
until {Niter} is reached, without improving the imbalance factor.
|
||||
:line
|
||||
|
||||
The {rcb} style invokes a "tiled" method for balancing, as described
|
||||
above. It performs a recursive coordinate bisectioning (RCB) of the
|
||||
simulation domain.
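For orientation, a generic textbook form of recursive coordinate bisection is sketched below in Python. It is an assumption-laden illustration and not the LAMMPS implementation: it groups a list of point coordinates rather than constructing sub-box boundaries, it always cuts along the dimension with the largest extent, and the function name {rcb} is borrowed from the style name only for readability.

```python
import random


def rcb(points, nprocs):
    """Split points into nprocs groups by recursive coordinate bisection."""
    if nprocs == 1:
        return [list(points)]
    if not points:
        return [[] for _ in range(nprocs)]
    ndim = len(points[0])
    # cut along the dimension in which the points are most spread out
    extents = [
        max(p[d] for p in points) - min(p[d] for p in points)
        for d in range(ndim)
    ]
    dim = extents.index(max(extents))
    ordered = sorted(points, key=lambda p: p[dim])
    # give each side a share of points proportional to its processor count
    left_procs = nprocs // 2
    cut = len(ordered) * left_procs // nprocs
    return rcb(ordered[:cut], left_procs) + rcb(ordered[cut:], nprocs - left_procs)


if __name__ == "__main__":
    rng = random.Random(0)
    # 2d points with a strongly non-uniform density in x
    pts = [(rng.random() ** 3, rng.random()) for _ in range(1600)]
    parts = rcb(pts, 16)
    print([len(part) for part in parts])
```

Because every cut is placed by particle count rather than by position, each group ends up with an equal share of particles (to within one), at the cost of sub-domains that no longer form a logical 3d grid.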
|
||||
|
||||
Need further description of RCB.
|
||||
|
||||
:line
|
||||
|
||||
|
@ -250,7 +310,10 @@ minimization"_minimize.html.
|
|||
|
||||
:line
|
||||
|
||||
[Restrictions:] none
|
||||
[Restrictions:]
|
||||
|
||||
For 2d simulations, a "z" cannot appear in {dimstr} for the {shift}
|
||||
style.
|
||||
|
||||
[Related commands:]
|
||||
|
||||
|
|
|
@ -57,12 +57,12 @@ processors * * * part 1 2 multiple
|
|||
</PRE>
|
||||
<P><B>Description:</B>
|
||||
</P>
|
||||
<P>Specify how processors are mapped as a 3d logical grid to the global
|
||||
simulation box. This involves 2 steps. First if there are P
|
||||
<P>Specify how processors are mapped as a regular 3d grid to the global
|
||||
simulation box. The mapping involves 2 steps. First if there are P
|
||||
processors it means choosing a factorization P = Px by Py by Pz so
|
||||
that there are Px processors in the x dimension, and similarly for the
|
||||
y and z dimensions. Second, the P processors are mapped to the
|
||||
logical 3d grid. The arguments to this command control each of these
|
||||
regular 3d grid. The arguments to this command control each of these
|
||||
2 steps.
|
||||
</P>
|
||||
<P>The Px, Py, Pz parameters affect the factorization. Any of the 3
|
||||
|
@ -72,12 +72,11 @@ It will do this based on the size and shape of the global simulation
|
|||
box so as to minimize the surface-to-volume ratio of each processor's
|
||||
sub-domain.
|
||||
</P>
|
||||
<P>Since LAMMPS does not load-balance by changing the grid of 3d
|
||||
processors on-the-fly, choosing explicit values for Px or Py or Pz can
|
||||
be used to override the LAMMPS default if it is known to be
|
||||
sub-optimal for a particular problem. E.g. a problem where the extent
|
||||
of atoms will change dramatically in a particular dimension over the
|
||||
course of the simulation.
|
||||
<P>Choosing explicit values for Px or Py or Pz can be used to override
|
||||
the default manner in which LAMMPS will create the regular 3d grid of
|
||||
processors, if it is known to be sub-optimal for a particular problem.
|
||||
E.g. a problem where the extent of atoms will change dramatically in a
|
||||
particular dimension over the course of the simulation.
|
||||
</P>
|
||||
<P>The product of Px, Py, Pz must equal P, the total # of processors
|
||||
LAMMPS is running on. For a <A HREF = "dimension.html">2d simulation</A>, Pz must
|
||||
|
@ -101,6 +100,28 @@ different processor grids for different partitions, e.g.
|
|||
<PRE>partition yes 1 processors 4 4 4
|
||||
partition yes 2 processors 2 3 2
|
||||
</PRE>
|
||||
<P>IMPORTANT NOTE: This command only affects the initial regular 3d grid
|
||||
created when the simulation box is first specified via a
|
||||
<A HREF = "create_box.html">create_box</A> or <A HREF = "read_data.html">read_data</A> or
|
||||
<A HREF = "read_restart.html">read_restart</A> command. Or if the simulation box is
|
||||
re-created via the <A HREF = "replicate.html">replicate</A> command. The same
|
||||
regular grid is initially created, regardless of which
|
||||
<A HREF = "comm_style.html">comm_style</A> command is in effect.
|
||||
</P>
|
||||
<P>If load-balancing is never invoked via the <A HREF = "balance.html">balance</A> or
|
||||
<A HREF = "fix_balance.html">fix balance</A> commands, then the initial regular grid
|
||||
will persist for all simulations. If balancing is performed, some of
|
||||
the methods invoked by those commands retain the logical topology of
|
||||
the initial 3d grid, and the mapping of processors to the grid
|
||||
specified by the processors command. However the grid spacings in
|
||||
different dimensions may change, so that processors own sub-domains of
|
||||
different sizes. If the <A HREF = "comm_style.html">comm_style tiled</A> command is
|
||||
used, methods invoked by the balancing commands may discard the 3d
|
||||
grid of processors and tile the simulation domain with sub-domains of
|
||||
different sizes and shapes which no longer have a logical 3d
|
||||
connectivity. If that occurs, all the information specified by the
|
||||
processors command is ignored.
|
||||
</P>
|
||||
<HR>
|
||||
|
||||
<P>The <I>grid</I> keyword affects the factorization of P into Px,Py,Pz and it
|
||||
|
@ -144,7 +165,7 @@ access (NUMA) costs. It also uses a different algorithm than the
|
|||
<I>twolevel</I> keyword for doing the two-level factorization of the
|
||||
simulation box into a 3d processor grid to minimize off-node
|
||||
communication, and it does its own MPI-based mapping of nodes and
|
||||
cores to the logical 3d grid. Thus it may produce a different layout
|
||||
cores to the regular 3d grid. Thus it may produce a different layout
|
||||
of the processors than the <I>twolevel</I> options.
|
||||
</P>
|
||||
<P>The <I>numa</I> style will give an error if the number of MPI processes is
|
||||
|
@ -239,11 +260,11 @@ and <I>Precv</I> must be integers from 1 to Np, where Np is the number of
|
|||
partitions you have defined via the <A HREF = "Section_start.html#start_7">-partition command-line
|
||||
switch</A>.
|
||||
</P>
|
||||
<P>A "dependency" means that the sending partition will create its 3d
|
||||
logical grid as Px by Py by Pz and after it has done this, it will
|
||||
<P>A "dependency" means that the sending partition will create its
|
||||
regular 3d grid as Px by Py by Pz and after it has done this, it will
|
||||
send the Px,Py,Pz values to the receiving partition. The receiving
|
||||
partition will wait to receive these values before creating its own 3d
|
||||
logical grid and will use the sender's Px,Py,Pz values as a
|
||||
partition will wait to receive these values before creating its own
|
||||
regular 3d grid and will use the sender's Px,Py,Pz values as a
|
||||
constraint. The nature of the constraint is determined by the
|
||||
<I>cstyle</I> argument.
|
||||
</P>
|
||||
|
@ -294,7 +315,7 @@ The universe and original IDs will only be different if you used the
|
|||
the processors differently than their rank in the original
|
||||
communicator LAMMPS was instantiated with.
|
||||
</P>
|
||||
<P>I,J,K are the indices of the processor in the 3d logical grid, each
|
||||
<P>I,J,K are the indices of the processor in the regular 3d grid, each
|
||||
from 1 to Nd, where Nd is the number of processors in that dimension
|
||||
of the grid.
|
||||
</P>
|
||||
|
|
|
@ -50,12 +50,12 @@ processors * * * part 1 2 multiple :pre
|
|||
|
||||
[Description:]
|
||||
|
||||
Specify how processors are mapped as a 3d logical grid to the global
|
||||
simulation box. This involves 2 steps. First if there are P
|
||||
Specify how processors are mapped as a regular 3d grid to the global
|
||||
simulation box. The mapping involves 2 steps. First if there are P
|
||||
processors it means choosing a factorization P = Px by Py by Pz so
|
||||
that there are Px processors in the x dimension, and similarly for the
|
||||
y and z dimensions. Second, the P processors are mapped to the
|
||||
logical 3d grid. The arguments to this command control each of these
|
||||
regular 3d grid. The arguments to this command control each of these
|
||||
2 steps.
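
The factorization step described above can be illustrated with a short Python sketch. It is not the LAMMPS implementation; it is a brute-force illustration that assumes an orthogonal box and picks the grid whose sub-domains have the smallest surface area (equivalent to the smallest surface-to-volume ratio, since every sub-domain has the same volume). The names factorizations, best_grid, box, and fixed are hypothetical and chosen only for this example; a fixed entry of None plays the role of a "*" argument.

```python
def factorizations(p):
    """Yield every ordered triple (px, py, pz) of positive integers with px*py*pz == p."""
    for px in range(1, p + 1):
        if p % px:
            continue
        rest = p // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                yield px, py, rest // py


def best_grid(p, box, fixed=(None, None, None)):
    """Return the (px, py, pz) that minimizes the surface area of one sub-domain.

    box   -- (lx, ly, lz) edge lengths of an orthogonal simulation box (assumption)
    fixed -- explicit Px, Py, Pz values; None leaves that dimension free
    """
    lx, ly, lz = box
    best = None
    for grid in factorizations(p):
        # Skip grids that violate an explicitly requested Px, Py or Pz.
        if any(f is not None and f != g for f, g in zip(fixed, grid)):
            continue
        sx, sy, sz = lx / grid[0], ly / grid[1], lz / grid[2]
        area = 2.0 * (sx * sy + sy * sz + sx * sz)
        if best is None or area < best[0]:
            best = (area, grid)
    if best is None:
        raise ValueError("no factorization of P satisfies the fixed values")
    return best[1]
```

For instance, best_grid(8, (1.0, 1.0, 1.0)) selects a 2 by 2 by 2 grid for a cubic box, while an elongated box such as (4.0, 1.0, 1.0) on 4 processors favors splitting only the long dimension.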
|
||||
|
||||
The Px, Py, Pz parameters affect the factorization. Any of the 3
|
||||
|
@ -65,12 +65,11 @@ It will do this based on the size and shape of the global simulation
|
|||
box so as to minimize the surface-to-volume ratio of each processor's
|
||||
sub-domain.
|
||||
|
||||
Since LAMMPS does not load-balance by changing the grid of 3d
|
||||
processors on-the-fly, choosing explicit values for Px or Py or Pz can
|
||||
be used to override the LAMMPS default if it is known to be
|
||||
sub-optimal for a particular problem. E.g. a problem where the extent
|
||||
of atoms will change dramatically in a particular dimension over the
|
||||
course of the simulation.
|
||||
Choosing explicit values for Px or Py or Pz can be used to override
|
||||
the default manner in which LAMMPS will create the regular 3d grid of
|
||||
processors, if it is known to be sub-optimal for a particular problem.
|
||||
E.g. a problem where the extent of atoms will change dramatically in a
|
||||
particular dimension over the course of the simulation.
|
||||
|
||||
The product of Px, Py, Pz must equal P, the total # of processors
|
||||
LAMMPS is running on. For a "2d simulation"_dimension.html, Pz must
|
||||
|
@ -94,6 +93,28 @@ different processor grids for different partitions, e.g.
|
|||
partition yes 1 processors 4 4 4
|
||||
partition yes 2 processors 2 3 2 :pre
|
||||
|
||||
IMPORTANT NOTE: This command only affects the initial regular 3d grid
|
||||
created when the simulation box is first specified via a
|
||||
"create_box"_create_box.html or "read_data"_read_data.html or
|
||||
"read_restart"_read_restart.html command. Or if the simulation box is
|
||||
re-created via the "replicate"_replicate.html command. The same
|
||||
regular grid is initially created, regardless of which
|
||||
"comm_style"_comm_style.html command is in effect.
|
||||
|
||||
If load-balancing is never invoked via the "balance"_balance.html or
|
||||
"fix balance"_fix_balance.html commands, then the initial regular grid
|
||||
will persist for all simulations. If balancing is performed, some of
|
||||
the methods invoked by those commands retain the logical topology of
|
||||
the initial 3d grid, and the mapping of processors to the grid
|
||||
specified by the processors command. However the grid spacings in
|
||||
different dimensions may change, so that processors own sub-domains of
|
||||
different sizes. If the "comm_style tiled"_comm_style.html command is
|
||||
used, methods invoked by the balancing commands may discard the 3d
|
||||
grid of processors and tile the simulation domain with sub-domains of
|
||||
different sizes and shapes which no longer have a logical 3d
|
||||
connectivity. If that occurs, all the information specified by the
|
||||
processors command is ignored.
|
||||
|
||||
:line
|
||||
|
||||
The {grid} keyword affects the factorization of P into Px,Py,Pz and it
|
||||
|
@ -137,7 +158,7 @@ access (NUMA) costs. It also uses a different algorithm than the
|
|||
{twolevel} keyword for doing the two-level factorization of the
|
||||
simulation box into a 3d processor grid to minimize off-node
|
||||
communication, and it does its own MPI-based mapping of nodes and
|
||||
cores to the logical 3d grid. Thus it may produce a different layout
|
||||
cores to the regular 3d grid. Thus it may produce a different layout
|
||||
of the processors than the {twolevel} options.
|
||||
|
||||
The {numa} style will give an error if the number of MPI processes is
|
||||
|
@ -232,11 +253,11 @@ and {Precv} must be integers from 1 to Np, where Np is the number of
|
|||
partitions you have defined via the "-partition command-line
|
||||
switch"_Section_start.html#start_7.
|
||||
|
||||
A "dependency" means that the sending partition will create its 3d
|
||||
logical grid as Px by Py by Pz and after it has done this, it will
|
||||
A "dependency" means that the sending partition will create its
|
||||
regular 3d grid as Px by Py by Pz and after it has done this, it will
|
||||
send the Px,Py,Pz values to the receiving partition. The receiving
|
||||
partition will wait to receive these values before creating its own 3d
|
||||
logical grid and will use the sender's Px,Py,Pz values as a
|
||||
partition will wait to receive these values before creating its own
|
||||
regular 3d grid and will use the sender's Px,Py,Pz values as a
|
||||
constraint. The nature of the constraint is determined by the
|
||||
{cstyle} argument.
|
||||
|
||||
|
@ -287,7 +308,7 @@ The universe and original IDs will only be different if you used the
|
|||
the processors differently than their rank in the original
|
||||
communicator LAMMPS was instantiated with.
|
||||
|
||||
I,J,K are the indices of the processor in the 3d logical grid, each
|
||||
I,J,K are the indices of the processor in the regular 3d grid, each
|
||||
from 1 to Nd, where Nd is the number of processors in that dimension
|
||||
of the grid.
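
As a rough illustration of what the I,J,K indices mean, the following Python sketch converts a processor rank into 1-based grid indices. It assumes, purely for illustration, that ranks are laid out with x varying fastest, then y, then z; the ordering LAMMPS actually uses depends on the processors command settings and the MPI library, so this is not a statement of the real mapping. The function name grid_indices is hypothetical.

```python
def grid_indices(rank, px, py, pz):
    """Return 1-based (I, J, K) for a 0-based rank in a px by py by pz grid.

    Assumption for this sketch: x varies fastest, then y, then z.
    """
    if not 0 <= rank < px * py * pz:
        raise ValueError("rank is outside the processor grid")
    i = rank % px
    j = (rank // px) % py
    k = rank // (px * py)
    # Shift to the 1..Nd convention used in the text above.
    return i + 1, j + 1, k + 1
```

Under this assumed ordering, rank 0 of a 2 by 3 by 2 grid sits at I,J,K = 1,1,1 and the last rank (11) sits at 2,3,2.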
|
||||
|
||||
|
|
|
@ -120,7 +120,12 @@ is different than the default.
|
|||
</UL>
|
||||
<P>The initial simulation box size is determined by the lo/hi settings.
|
||||
In any dimension, the system may be periodic or non-periodic; see the
|
||||
<A HREF = "boundary.html">boundary</A> command.
|
||||
<A HREF = "boundary.html">boundary</A> command. When the simulation box is created
|
||||
it is also partitioned into a regular 3d grid of rectangular bricks,
|
||||
one per processor, based on the number of processors being used and
|
||||
the settings of the <A HREF = "processors.html">processors</A> command. The
|
||||
partitioning can later be changed by the <A HREF = "balance.html">balance</A> or
|
||||
<A HREF = "fix_balance.html">fix balance</A> commands.
|
||||
</P>
|
||||
<P>If the <I>xy xz yz</I> line does not appear, LAMMPS will set up an
|
||||
axis-aligned (orthogonal) simulation box. If the line does appear,
|
||||
|
|
|
@ -114,7 +114,12 @@ is different than the default.
|
|||
|
||||
The initial simulation box size is determined by the lo/hi settings.
|
||||
In any dimension, the system may be periodic or non-periodic; see the
|
||||
"boundary"_boundary.html command.
|
||||
"boundary"_boundary.html command. When the simulation box is created
|
||||
it is also partitioned into a regular 3d grid of rectangular bricks,
|
||||
one per processor, based on the number of processors being used and
|
||||
the settings of the "processors"_processors.html command. The
|
||||
partitioning can later be changed by the "balance"_balance.html or
|
||||
"fix balance"_fix_balance.html commands.
|
||||
|
||||
If the {xy xz yz} line does not appear, LAMMPS will set up an
|
||||
axis-aligned (orthogonal) simulation box. If the line does appear,
|
||||
|
|
|
@ -30,7 +30,15 @@ read_restart poly.*.%
|
|||
</P>
|
||||
<P>Read in a previously saved simulation from a restart file. This
|
||||
allows continuation of a previous run. Information about what is
|
||||
stored in a restart file is given below.
|
||||
stored in a restart file is given below. Basically this operation
|
||||
will re-create the simulation box with all its atoms and their
|
||||
attributes, at the point in time it was written to the restart file by
|
||||
a previous simulation. The simulation box will be partitioned into a
|
||||
regular 3d grid of rectangular bricks, one per processor, based on the
|
||||
number of processors in the current simulation and the settings of the
|
||||
<A HREF = "processors.html">processors</A> command. The partitioning can later be
|
||||
changed by the <A HREF = "balance.html">balance</A> or <A HREF = "fix_balance.html">fix
|
||||
balance</A> commands.
|
||||
</P>
|
||||
<P>Restart files are saved in binary format to enable exact restarts,
|
||||
meaning that the trajectories of a restarted run will precisely match
|
||||
|
|
|
@ -27,7 +27,15 @@ read_restart poly.*.% :pre
|
|||
|
||||
Read in a previously saved simulation from a restart file. This
|
||||
allows continuation of a previous run. Information about what is
|
||||
stored in a restart file is given below.
|
||||
stored in a restart file is given below. Basically this operation
|
||||
will re-create the simulation box with all its atoms and their
|
||||
attributes, at the point in time it was written to the restart file by
|
||||
a previous simulation. The simulation box will be partitioned into a
|
||||
regular 3d grid of rectangular bricks, one per processor, based on the
|
||||
number of processors in the current simulation and the settings of the
|
||||
"processors"_processors.html command. The partitioning can later be
|
||||
changed by the "balance"_balance.html or "fix
|
||||
balance"_fix_balance.html commands.
|
||||
|
||||
Restart files are saved in binary format to enable exact restarts,
|
||||
meaning that the trajectories of a restarted run will precisely match
|
||||
|
|
|
@ -27,7 +27,12 @@
|
|||
For example, replication factors of 2,2,2 will create a simulation
|
||||
with 8x as many atoms by doubling the simulation domain in each
|
||||
dimension. A replication factor of 1 in a dimension leaves the
|
||||
simulation domain unchanged.
|
||||
simulation domain unchanged. When the new simulation box is created
|
||||
it is also partitioned into a regular 3d grid of rectangular bricks,
|
||||
one per processor, based on the number of processors being used and
|
||||
the settings of the <A HREF = "processors.html">processors</A> command. The
|
||||
partitioning can later be changed by the <A HREF = "balance.html">balance</A> or
|
||||
<A HREF = "fix_balance.html">fix balance</A> commands.
|
||||
</P>
|
||||
<P>All properties of the atoms are replicated, including their
|
||||
velocities, which may or may not be desirable. New atom IDs are
|
||||
|
|
|
@ -24,7 +24,12 @@ Replicate the current simulation one or more times in each dimension.
|
|||
For example, replication factors of 2,2,2 will create a simulation
|
||||
with 8x as many atoms by doubling the simulation domain in each
|
||||
dimension. A replication factor of 1 in a dimension leaves the
|
||||
simulation domain unchanged.
|
||||
simulation domain unchanged. When the new simulation box is created
|
||||
it is also partitioned into a regular 3d grid of rectangular bricks,
|
||||
one per processor, based on the number of processors being used and
|
||||
the settings of the "processors"_processors.html command. The
|
||||
partitioning can later be changed by the "balance"_balance.html or
|
||||
"fix balance"_fix_balance.html commands.
|
||||
|
||||
All properties of the atoms are replicated, including their
|
||||
velocities, which may or may not be desirable. New atom IDs are
|
||||
|
|