mirror of https://github.com/lammps/lammps.git
changes to imbalance weight factors
This commit is contained in:
parent
11c2892e54
commit
c46be7db62
|
@ -1,7 +1,7 @@
|
|||
<!-- HTML_ONLY -->
|
||||
<HEAD>
|
||||
<TITLE>LAMMPS Users Manual</TITLE>
|
||||
<META NAME="docnumber" CONTENT="30 Sep 2016 version">
|
||||
<META NAME="docnumber" CONTENT="5 Oct 2016 version">
|
||||
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
|
||||
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
|
||||
</HEAD>
|
||||
|
@ -21,7 +21,7 @@
|
|||
<H1></H1>
|
||||
|
||||
LAMMPS Documentation :c,h3
|
||||
30 Sep 2016 version :c,h4
|
||||
5 Oct 2016 version :c,h4
|
||||
|
||||
Version info: :h4
|
||||
|
||||
|
|
|
@ -319,24 +319,25 @@ accurately would be impractical and slow down the computation.
|
|||
Instead the {weight} keyword implements several ways to influence the
|
||||
per-particle weights empirically by properties readily available or
|
||||
using the user's knowledge of the system. Note that the absolute
|
||||
value of the weights are not important; their ratio is what is used to
|
||||
assign particles to processors. A particle with a weight of 2.5 is
|
||||
assumed to require 5x more computational than a particle with a weight
|
||||
of 0.5.
|
||||
value of the weights are not important; only their relative ratios
|
||||
affect which particle is assigned to which processor. A particle with
|
||||
a weight of 2.5 is assumed to require 5x more computational than a
|
||||
particle with a weight of 0.5. For all the options below the weight
|
||||
assigned to a particle must be a positive value; an error will be be
|
||||
generated if a weight is <= 0.0.
|
||||
|
||||
Below is a list of possible weight options with a short description of
|
||||
their usage and some example scenarios where they might be applicable.
|
||||
It is possible to apply multiple weight flags and the weightins they
|
||||
It is possible to apply multiple weight flags and the weightings they
|
||||
induce will be combined through multiplication. Most of the time,
|
||||
however, it is sufficient to use just one method.
|
||||
|
||||
The {group} weight style assigns weight factors to specified
|
||||
"groups"_group.html of particles. The {group} style keyword is
|
||||
followed by the number of groups, then pairs of group IDs and the
|
||||
corresponding weight factor. If a particle belongs to none of the
|
||||
corresponding weight factor. If a particle belongs to none of the
|
||||
specified groups, its weight is not changed. If it belongs to
|
||||
multiple groups, its weight is the product of the weight factors.
|
||||
The weight factors have to be positive.
|
||||
|
||||
This weight style is useful in combination with pair style
|
||||
"hybrid"_pair_hybrid.html, e.g. when combining a more costly manybody
|
||||
|
@ -347,14 +348,24 @@ the computational cost for each group remains constant over time.
|
|||
This is a purely empirical weighting, so a series test runs to tune
|
||||
the assigned weight factors for optimal performance is recommended.
|
||||
|
||||
The {neigh} weight style assigns a weight to each particle equal to
|
||||
its number of neighbors divided by the avergage number of neighbors
|
||||
for all particles. The {factor} setting is then appied as an overall
|
||||
scale factor to all the {neigh} weights which allows tuning of the
|
||||
impact of this style. A {factor} smaller than 1.0 (e.g. 0.8) often
|
||||
results in the best performance, since the number of neighbors is
|
||||
likely to overestimate the ideal weight. The factor has to be between
|
||||
0.0 and 2.0.
|
||||
The {neigh} weight style assigns the same weight to each particle
|
||||
owned by a processor based on the total count of neighbors in the
|
||||
neighbor list owned by that processor. The motivation is that more
|
||||
neighbors means a higher computational cost. The style does not use
|
||||
neighbors per atom to assign a unique weight to each atom, because
|
||||
that value can vary depending on how the neighbor list is built.
|
||||
|
||||
The {factor} setting is applied as an overall scale factor to the
|
||||
{neigh} weights which allows adjustment of their impact on the
|
||||
balancing operation. The specified {factor} value must be positive.
|
||||
A value > 1.0 will increase the weights so that the ratio of max
|
||||
weight to min weight increases by {factor}. A value < 1.0 will
|
||||
decrease the weights so that the ratio of max weight to min weight
|
||||
decreases by {factor}. In both cases the intermediate weight values
|
||||
increase/decrease proportionally as well. A value = 1.0 has no effect
|
||||
on the {neigh} weights. As a rule of thumb, we have found a {factor}
|
||||
of about 0.8 often results in the best performance, since the number
|
||||
of neighbors is likely to overestimate the ideal weight.
|
||||
|
||||
This weight style is useful for systems where there are different
|
||||
cutoffs used for different pairs of interations, or the density
|
||||
|
@ -370,35 +381,48 @@ weights are computed. Inserting a "run 0 post no"_run.html command
|
|||
before issuing the {balance} command, may be a workaround for this
|
||||
case, as it will induce the neighbor list to be built.
|
||||
|
||||
The {time} weight style uses "timer data"_timer.html to estimate a
|
||||
weight for each particle. It uses the same information as is used for
|
||||
the "MPI task timing breakdown"_Section_start.html#start_8, namely,
|
||||
the timings for sections {Pair}, {Bond}, {Kspace}, and {Neigh}. The
|
||||
time spent in these sections of the timestep are measured for each MPI
|
||||
rank, summed up, then converted into a cost for each MPI rank relative
|
||||
to the average cost over all MPI ranks for the same sections. That
|
||||
cost then evenly distributed over all the particles owned by that
|
||||
rank. Finally, the {factor} setting is then appied as an overall
|
||||
scale factor to all the {time} weights as a way to fine tune the
|
||||
impact of this weight style. Good {factor} values to use are
|
||||
typically between 0.5 and 1.2. Allowed are values between 0.0 and 2.0.
|
||||
The {time} weight style uses "timer data"_timer.html to estimate
|
||||
weights. It assigns the same weight to each particle owned by a
|
||||
processor based on the total computational time spent by that
|
||||
processor. See details below on what time window is used. It uses
|
||||
the same timing information as is used for the "MPI task timing
|
||||
breakdown"_Section_start.html#start_8, namely, for sections {Pair},
|
||||
{Bond}, {Kspace}, and {Neigh}. The time spent in those portions of
|
||||
the timestep are measured for each MPI rank, summed, then divided by
|
||||
the number of particles owned by that processor. I.e. the weight is
|
||||
an effective CPU time/particle averaged over the particles on that
|
||||
processor.
|
||||
|
||||
For the {balance} command the timing data is taken from the preceding
|
||||
run command, i.e. the timings are for the entire previous run. For
|
||||
the {fix balance} command the timing data is for only the timesteps
|
||||
since the last balancing operation was performed. If timing
|
||||
information for the required sections is not available, e.g. at the
|
||||
beginning of a run, or when the "timer"_timer.html command is set to
|
||||
either {loop} or {off}, a warning is issued. In this case no weights
|
||||
are computed.
|
||||
The {factor} setting is applied as an overall scale factor to the
|
||||
{time} weights which allows adjustment of their impact on the
|
||||
balancing operation. The specified {factor} value must be positive.
|
||||
A value > 1.0 will increase the weights so that the ratio of max
|
||||
weight to min weight increases by {factor}. A value < 1.0 will
|
||||
decrease the weights so that the ratio of max weight to min weight
|
||||
decreases by {factor}. In both cases the intermediate weight values
|
||||
increase/decrease proportionally as well. A value = 1.0 has no effect
|
||||
on the {time} weights. As a rule of thumb, effective values to use
|
||||
are typicall between 0.5 and 1.2. Note that the timer quantities
|
||||
mentioned above can be affected by communication which occurs in the
|
||||
middle of the operations, e.g. pair styles with intermediate exchange
|
||||
of data witin the force computation, and likewise for KSpace solves.
|
||||
|
||||
This weight style is the most generic one, and should be tried first,
|
||||
if neither the {group} or {neigh} styles are easily applicable.
|
||||
However, since the computed cost function is averaged over all local
|
||||
particles this weight style may not be highly accurate. This style
|
||||
can also be effective as a secondary weight in combination with either
|
||||
{group} or {neigh} to offset some of inaccuracies in either of those
|
||||
heuristics.
|
||||
When using the {time} weight style with the {balance} command, the
|
||||
timing data is taken from the preceding run command, i.e. the timings
|
||||
are for the entire previous run. For the {fix balance} command the
|
||||
timing data is for only the timesteps since the last balancing
|
||||
operation was performed. If timing information for the required
|
||||
sections is not available, e.g. at the beginning of a run, or when the
|
||||
"timer"_timer.html command is set to either {loop} or {off}, a warning
|
||||
is issued. In this case no weights are computed.
|
||||
|
||||
NOTE: The {time} weight style is the most generic option, and should
|
||||
be tried first, unless the {group} style is easily applicable.
|
||||
However, since the computed cost function is averaged over all
|
||||
particles on a processor, the weights may not be highly accurate.
|
||||
This style can also be effective as a secondary weight in combination
|
||||
with either {group} or {neigh} to offset some of inaccuracies in
|
||||
either of those heuristics.
|
||||
|
||||
The {var} weight style assigns per-particle weights by evaluating an
|
||||
"atom-style variable"_variable.html specified by {name}. This is
|
||||
|
|
|
@ -49,8 +49,8 @@ keyword = {append} or {buffer} or {element} or {every} or {fileper} or {first} o
|
|||
-N = sort per-atom lines in descending order by the Nth column
|
||||
{thresh} args = attribute operation value
|
||||
attribute = same attributes (x,fy,etotal,sxx,etc) used by dump custom style
|
||||
operation = "<" or "<=" or ">" or ">=" or "==" or "!="
|
||||
value = numeric value to compare to
|
||||
operation = "<" or "<=" or ">" or ">=" or "==" or "!=" or "|^"
|
||||
value = numeric value to compare to, or LAST
|
||||
these 3 args can be replaced by the word "none" to turn off thresholding
|
||||
{unwrap} arg = {yes} or {no} :pre
|
||||
these keywords apply only to the {image} and {movie} "styles"_dump_image.html :l
|
||||
|
@ -458,16 +458,59 @@ as well as memory, versus unsorted output.
|
|||
|
||||
The {thresh} keyword only applies to the dump {custom}, {cfg},
|
||||
{image}, and {movie} styles. Multiple thresholds can be specified.
|
||||
Specifying "none" turns off all threshold criteria. If thresholds are
|
||||
Specifying {none} turns off all threshold criteria. If thresholds are
|
||||
specified, only atoms whose attributes meet all the threshold criteria
|
||||
are written to the dump file or included in the image. The possible
|
||||
attributes that can be tested for are the same as those that can be
|
||||
specified in the "dump custom"_dump.html command, with the exception
|
||||
of the {element} attribute, since it is not a numeric value. Note
|
||||
that different attributes can be output by the dump custom command
|
||||
than are used as threshold criteria by the dump_modify command.
|
||||
E.g. you can output the coordinates and stress of atoms whose energy
|
||||
is above some threshold.
|
||||
that a different attributes can be used than those output by the "dump
|
||||
custom"_dump.html command. E.g. you can output the coordinates and
|
||||
stress of atoms whose energy is above some threshold.
|
||||
|
||||
If an atom-style variable is used as the attribute, then it can
|
||||
produce continuous numeric values or effective Boolean 0/1 values
|
||||
which may be useful for the comparision operation. Boolean values can
|
||||
be generated by variable formulas that use comparison or Boolean math
|
||||
operators or special functions like gmask() and rmask() and grmask().
|
||||
See the "variable"_variable.html command doc page for details.
|
||||
|
||||
NOTE: The LAST option, discussed below, is not yet implemented. It
|
||||
will be soon.
|
||||
|
||||
The specified value must be a simple numeric value or the word LAST.
|
||||
If LAST is used, it refers to the value of the attribute the last time
|
||||
the dump command was invoked to produce a snapshot. This is a way to
|
||||
only dump atoms whose attribute has changed (or not changed).
|
||||
Three examples follow.
|
||||
|
||||
dump_modify ... thresh ix != LAST :pre
|
||||
|
||||
This will dump atoms which have crossed the periodic x boundary of the
|
||||
simulation box since the last dump. (Note that atoms that crossed
|
||||
once and then crossed back between the two dump timesteps would not be
|
||||
included.)
|
||||
|
||||
region foo sphere 10 20 10 15
|
||||
variable inregion atom rmask(foo)
|
||||
dump_modify ... thresh v_inregion |^ LAST
|
||||
|
||||
This will dump atoms which crossed the boundary of the spherical
|
||||
region since the last dump.
|
||||
|
||||
variable charge atom "(q > 0.5) || (q < -0.5)"
|
||||
dump_modify ... thresh v_charge |^ LAST
|
||||
|
||||
This will dump atoms whose charge has changed from an absolute value
|
||||
less than 1/2 to greater than 1/2 (or vice versa) since the last dump.
|
||||
E.g. due to reactions and subsequent charge equilibration in a
|
||||
reactive force field.
|
||||
|
||||
The choice of operations are the usual comparison operators. The XOR
|
||||
operation (exclusive or) is also included as "|^". In this context,
|
||||
XOR means that if either the attribute or value is 0.0 and the other
|
||||
is non-zero, then the result is "true" and the threshold criterion is
|
||||
met. Otherwise it is not met.
|
||||
|
||||
:line
|
||||
|
||||
|
|
|
@ -11,13 +11,19 @@ velocity all create 1.44 87287 loop geom
|
|||
pair_style body 5.0
|
||||
pair_coeff * * 1.0 1.0
|
||||
|
||||
neighbor 0.3 bin
|
||||
neighbor 0.5 bin
|
||||
neigh_modify every 1 delay 0 check yes
|
||||
|
||||
fix 1 all nve/body
|
||||
#fix 1 all nvt/body temp 1.44 1.44 1.0
|
||||
fix 2 all enforce2d
|
||||
|
||||
#compute 1 all body/local type 1 2 3
|
||||
#dump 1 all local 100 dump.body index c_1[1] c_1[2] c_1[3] c_1[4]
|
||||
|
||||
thermo 500
|
||||
#dump 2 all image 1000 image.*.jpg type type &
|
||||
# zoom 1.6 adiam 1.5 body type 1.0 0
|
||||
#dump_modify 2 pad 5
|
||||
|
||||
thermo 100
|
||||
run 10000
|
||||
|
|
|
@ -40,9 +40,6 @@ Angle::Angle(LAMMPS *lmp) : Pointers(lmp)
|
|||
vatom = NULL;
|
||||
setflag = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -29,10 +29,9 @@ class Angle : protected Pointers {
|
|||
double energy; // accumulated energies
|
||||
double virial[6]; // accumlated virial
|
||||
double *eatom,**vatom; // accumulated per-atom energy/virial
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
int copymode;
|
||||
|
@ -51,9 +50,6 @@ class Angle : protected Pointers {
|
|||
virtual double single(int, int, int, int) = 0;
|
||||
virtual double memory_usage();
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
protected:
|
||||
int suffix_flag; // suffix compatibility flag
|
||||
|
||||
|
|
|
@ -208,9 +208,6 @@ Atom::Atom(LAMMPS *lmp) : Pointers(lmp)
|
|||
atom_style = NULL;
|
||||
avec = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
avec_map = new AtomVecCreatorMap();
|
||||
|
||||
#define ATOM_CLASS
|
||||
|
|
|
@ -124,11 +124,6 @@ class Atom : protected Pointers {
|
|||
char **iname,**dname;
|
||||
int nivector,ndvector;
|
||||
|
||||
// used by USER-CUDA to flag used per-atom arrays
|
||||
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// atom style and per-atom array existence flags
|
||||
// customize by adding new flag
|
||||
|
||||
|
|
|
@ -156,10 +156,6 @@ E: Invalid atom_style command
|
|||
|
||||
Self-explanatory.
|
||||
|
||||
E: USER-CUDA package requires a cuda enabled atom_style
|
||||
|
||||
Self-explanatory.
|
||||
|
||||
E: KOKKOS package requires a kokkos enabled atom_style
|
||||
|
||||
Self-explanatory.
|
||||
|
|
|
@ -44,9 +44,6 @@ Bond::Bond(LAMMPS *lmp) : Pointers(lmp)
|
|||
vatom = NULL;
|
||||
setflag = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -29,10 +29,9 @@ class Bond : protected Pointers {
|
|||
double energy; // accumulated energies
|
||||
double virial[6]; // accumlated virial
|
||||
double *eatom,**vatom; // accumulated per-atom energy/virial
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
int copymode;
|
||||
|
@ -51,9 +50,6 @@ class Bond : protected Pointers {
|
|||
virtual double single(int, double, int, int, double &) = 0;
|
||||
virtual double memory_usage();
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
void write_file(int, char**);
|
||||
|
||||
protected:
|
||||
|
|
|
@ -155,10 +155,6 @@ class CommTiled : public Comm {
|
|||
|
||||
/* ERROR/WARNING messages:
|
||||
|
||||
E: USER-CUDA package does not yet support comm_style tiled
|
||||
|
||||
Self-explanatory.
|
||||
|
||||
E: KOKKOS package does not yet support comm_style tiled
|
||||
|
||||
Self-explanatory.
|
||||
|
|
|
@ -99,9 +99,6 @@ Compute::Compute(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp),
|
|||
|
||||
// data masks
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -84,9 +84,6 @@ class Compute : protected Pointers {
|
|||
int comm_reverse; // size of reverse communication (0 if none)
|
||||
int dynamic_group_allow; // 1 if can be used with dynamic group, else 0
|
||||
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
|
@ -140,9 +137,6 @@ class Compute : protected Pointers {
|
|||
double, double, double,
|
||||
double, double, double) {}
|
||||
|
||||
virtual int unsigned data_mask() {return datamask;}
|
||||
virtual int unsigned data_mask_ext() {return datamask_ext;}
|
||||
|
||||
protected:
|
||||
int instance_me; // which Compute class instantiation I am
|
||||
|
||||
|
|
|
@ -41,9 +41,6 @@ Dihedral::Dihedral(LAMMPS *lmp) : Pointers(lmp)
|
|||
vatom = NULL;
|
||||
setflag = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -29,10 +29,9 @@ class Dihedral : protected Pointers {
|
|||
double energy; // accumulated energy
|
||||
double virial[6]; // accumlated virial
|
||||
double *eatom,**vatom; // accumulated per-atom energy/virial
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
int copymode;
|
||||
|
@ -49,9 +48,6 @@ class Dihedral : protected Pointers {
|
|||
virtual void write_data(FILE *) {}
|
||||
virtual double memory_usage();
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
protected:
|
||||
int suffix_flag; // suffix compatibility flag
|
||||
|
||||
|
|
|
@ -43,7 +43,7 @@ enum{ID,MOL,PROC,PROCP1,TYPE,ELEMENT,MASS,
|
|||
OMEGAX,OMEGAY,OMEGAZ,ANGMOMX,ANGMOMY,ANGMOMZ,
|
||||
TQX,TQY,TQZ,
|
||||
COMPUTE,FIX,VARIABLE,INAME,DNAME};
|
||||
enum{LT,LE,GT,GE,EQ,NEQ};
|
||||
enum{LT,LE,GT,GE,EQ,NEQ,XOR};
|
||||
enum{INT,DOUBLE,STRING,BIGINT}; // same as in DumpCFG
|
||||
|
||||
#define INVOKED_PERATOM 8
|
||||
|
@ -947,6 +947,11 @@ int DumpCustom::count()
|
|||
} else if (thresh_op[ithresh] == NEQ) {
|
||||
for (i = 0; i < nlocal; i++, ptr += nstride)
|
||||
if (choose[i] && *ptr == value) choose[i] = 0;
|
||||
} else if (thresh_op[ithresh] == XOR) {
|
||||
for (i = 0; i < nlocal; i++, ptr += nstride)
|
||||
if (choose[i] && (*ptr == 0.0 && value == 0.0) ||
|
||||
(*ptr != 0.0 && value != 0.0))
|
||||
choose[i] = 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -1835,6 +1840,7 @@ int DumpCustom::modify_param(int narg, char **arg)
|
|||
else if (strcmp(arg[2],">=") == 0) thresh_op[nthresh] = GE;
|
||||
else if (strcmp(arg[2],"==") == 0) thresh_op[nthresh] = EQ;
|
||||
else if (strcmp(arg[2],"!=") == 0) thresh_op[nthresh] = NEQ;
|
||||
else if (strcmp(arg[2],"|^") == 0) thresh_op[nthresh] = XOR;
|
||||
else error->all(FLERR,"Invalid dump_modify threshold operator");
|
||||
|
||||
// set threshold value
|
||||
|
|
|
@ -95,10 +95,7 @@ id(NULL), style(NULL), eatom(NULL), vatom(NULL)
|
|||
maxeatom = maxvatom = 0;
|
||||
vflag_atom = 0;
|
||||
|
||||
// CUDA and KOKKOS per-fix data masks
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
// KOKKOS per-fix data masks
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
|
|
|
@ -99,11 +99,6 @@ class Fix : protected Pointers {
|
|||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
|
||||
// USER-CUDA per-fix data masks
|
||||
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
Fix(class LAMMPS *, int, char **);
|
||||
virtual ~Fix();
|
||||
void modify_params(int, char **);
|
||||
|
@ -211,9 +206,6 @@ class Fix : protected Pointers {
|
|||
|
||||
virtual double memory_usage() {return 0.0;}
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
protected:
|
||||
int instance_me; // which Fix class instantiation I am
|
||||
|
||||
|
|
|
@ -18,12 +18,11 @@
|
|||
#include "error.h"
|
||||
|
||||
using namespace LAMMPS_NS;
|
||||
#define SMALL 0.001
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
ImbalanceGroup::ImbalanceGroup(LAMMPS *lmp) : Imbalance(lmp),
|
||||
id(0), factor(0), num(0) {}
|
||||
ImbalanceGroup::ImbalanceGroup(LAMMPS *lmp) : Imbalance(lmp), id(0), factor(0)
|
||||
{}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
|
@ -50,7 +49,7 @@ int ImbalanceGroup::options(int narg, char **arg)
|
|||
if (id[i] < 0)
|
||||
error->all(FLERR,"Unknown group in balance weight command");
|
||||
factor[i] = force->numeric(FLERR,arg[2*i+2]);
|
||||
if (factor[i] < 0.0) error->all(FLERR,"Illegal balance weight command");
|
||||
if (factor[i] <= 0.0) error->all(FLERR,"Illegal balance weight command");
|
||||
}
|
||||
return 2*num+1;
|
||||
}
|
||||
|
@ -67,13 +66,10 @@ void ImbalanceGroup::compute(double *weight)
|
|||
|
||||
for (int i = 0; i < nlocal; ++i) {
|
||||
const int imask = mask[i];
|
||||
double iweight = weight[i];
|
||||
for (int j = 0; j < num; ++j) {
|
||||
if (imask & bitmask[id[j]])
|
||||
iweight *= factor[j];
|
||||
weight[i] *= factor[j];
|
||||
}
|
||||
if (iweight < SMALL) weight[i] = SMALL;
|
||||
else weight[i] = iweight;
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -22,14 +22,14 @@
|
|||
#include "error.h"
|
||||
|
||||
using namespace LAMMPS_NS;
|
||||
#define SMALL 0.001
|
||||
|
||||
#define BIG 1.0e20
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
ImbalanceNeigh::ImbalanceNeigh(LAMMPS *lmp) : Imbalance(lmp)
|
||||
{
|
||||
did_warn = 0;
|
||||
factor = 1.0;
|
||||
}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
@ -38,8 +38,7 @@ int ImbalanceNeigh::options(int narg, char **arg)
|
|||
{
|
||||
if (narg < 1) error->all(FLERR,"Illegal balance weight command");
|
||||
factor = force->numeric(FLERR,arg[0]);
|
||||
if ((factor < 0.0) || (factor > 2.0))
|
||||
error->all(FLERR,"Illegal balance weight command");
|
||||
if (factor <= 0.0) error->all(FLERR,"Illegal balance weight command");
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
@ -52,7 +51,7 @@ void ImbalanceNeigh::compute(double *weight)
|
|||
if (factor == 0.0) return;
|
||||
|
||||
// find suitable neighbor list
|
||||
// we can only make use of certain (conventional) neighbor lists
|
||||
// can only use certain conventional neighbor lists
|
||||
|
||||
for (req = 0; req < neighbor->old_nrequest; ++req) {
|
||||
if ((neighbor->old_requests[req]->half ||
|
||||
|
@ -65,37 +64,46 @@ void ImbalanceNeigh::compute(double *weight)
|
|||
|
||||
if (req >= neighbor->old_nrequest || neighbor->ago < 0) {
|
||||
if (comm->me == 0 && !did_warn)
|
||||
error->warning(FLERR,"No suitable neighbor list found. "
|
||||
"Neighbor weighted balancing skipped");
|
||||
error->warning(FLERR,"Balance weight neigh skipped b/c no list found");
|
||||
did_warn = 1;
|
||||
return;
|
||||
}
|
||||
|
||||
// neighsum = total neigh count for atoms on this proc
|
||||
// localwt = weight assigned to each owned atom
|
||||
|
||||
NeighList *list = neighbor->lists[req];
|
||||
bigint neighsum = 0;
|
||||
|
||||
const int inum = list->inum;
|
||||
const int * const ilist = list->ilist;
|
||||
const int * const numneigh = list->numneigh;
|
||||
int nlocal = atom->nlocal;
|
||||
|
||||
// first pass: get local number of neighbors
|
||||
|
||||
bigint neighsum = 0;
|
||||
for (int i = 0; i < inum; ++i) neighsum += numneigh[ilist[i]];
|
||||
double localwt = 0.0;
|
||||
if (nlocal) localwt = 1.0*neighsum/nlocal;
|
||||
|
||||
double allatoms = static_cast <double>(atom->natoms);
|
||||
if (allatoms == 0.0) allatoms = 1.0;
|
||||
double allavg;
|
||||
double myavg = static_cast<double>(neighsum)/allatoms;
|
||||
MPI_Allreduce(&myavg,&allavg,1,MPI_DOUBLE,MPI_SUM,world);
|
||||
|
||||
// second pass: compute and apply weights
|
||||
if (nlocal && localwt <= 0.0) error->one(FLERR,"Balance weight <= 0.0");
|
||||
|
||||
double scale = 1.0/allavg;
|
||||
for (int ii = 0; ii < inum; ++ii) {
|
||||
const int i = ilist[ii];
|
||||
weight[i] *= (1.0-factor) + factor*scale*numneigh[i];
|
||||
if (weight[i] < SMALL) weight[i] = SMALL;
|
||||
// apply factor if specified != 1.0
|
||||
// wtlo,wthi = lo/hi values excluding 0.0 due to no atoms on this proc
|
||||
// lo value does not change
|
||||
// newhi = new hi value to give hi/lo ratio factor times larger/smaller
|
||||
// expand/contract all localwt values from lo->hi to lo->newhi
|
||||
|
||||
if (factor != 1.0) {
|
||||
double wtlo,wthi;
|
||||
if (localwt == 0.0) localwt = BIG;
|
||||
MPI_Allreduce(&localwt,&wtlo,1,MPI_DOUBLE,MPI_MIN,world);
|
||||
if (localwt == BIG) localwt = 0.0;
|
||||
MPI_Allreduce(&localwt,&wthi,1,MPI_DOUBLE,MPI_MAX,world);
|
||||
if (wtlo == wthi) return;
|
||||
|
||||
double newhi = wthi*factor;
|
||||
localwt = wtlo + ((localwt-wtlo)/(wthi-wtlo)) * (newhi-wtlo);
|
||||
}
|
||||
|
||||
for (int i = 0; i < nlocal; i++) weight[i] *= localwt;
|
||||
}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
|
|
@ -19,15 +19,16 @@
|
|||
#include "timer.h"
|
||||
#include "error.h"
|
||||
|
||||
// DEBUG
|
||||
#include "update.h"
|
||||
|
||||
using namespace LAMMPS_NS;
|
||||
#define SMALL 0.001
|
||||
|
||||
#define BIG 1.0e20
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
ImbalanceTime::ImbalanceTime(LAMMPS *lmp) : Imbalance(lmp)
|
||||
{
|
||||
factor = 1.0;
|
||||
}
|
||||
ImbalanceTime::ImbalanceTime(LAMMPS *lmp) : Imbalance(lmp) {}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
|
@ -35,8 +36,7 @@ int ImbalanceTime::options(int narg, char **arg)
|
|||
{
|
||||
if (narg < 1) error->all(FLERR,"Illegal balance weight command");
|
||||
factor = force->numeric(FLERR,arg[0]);
|
||||
if ((factor < 0.0) || (factor > 2.0))
|
||||
error->all(FLERR,"Illegal balance weight command");
|
||||
if (factor <= 0.0) error->all(FLERR,"Illegal balance weight command");
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
@ -53,37 +53,60 @@ void ImbalanceTime::init()
|
|||
|
||||
void ImbalanceTime::compute(double *weight)
|
||||
{
|
||||
const int nlocal = atom->nlocal;
|
||||
const bigint natoms = atom->natoms;
|
||||
if (!timer->has_normal()) return;
|
||||
|
||||
if (factor == 0.0) return;
|
||||
// cost = CPU time for relevant timers since last invocation
|
||||
// localwt = weight assigned to each owned atom
|
||||
// just return if no time yet tallied
|
||||
|
||||
// compute the cost function of based on relevant timers
|
||||
|
||||
if (timer->has_normal()) {
|
||||
double cost = -last;
|
||||
cost += timer->get_wall(Timer::PAIR);
|
||||
cost += timer->get_wall(Timer::NEIGH);
|
||||
cost += timer->get_wall(Timer::BOND);
|
||||
cost += timer->get_wall(Timer::KSPACE);
|
||||
double cost = -last;
|
||||
cost += timer->get_wall(Timer::PAIR);
|
||||
cost += timer->get_wall(Timer::NEIGH);
|
||||
cost += timer->get_wall(Timer::BOND);
|
||||
cost += timer->get_wall(Timer::KSPACE);
|
||||
|
||||
double allcost;
|
||||
MPI_Allreduce(&cost,&allcost,1,MPI_DOUBLE,MPI_SUM,world);
|
||||
/*
|
||||
printf("TIME %ld %d %g %g: %g %g %g %g\n",
|
||||
update->ntimestep,atom->nlocal,last,cost,
|
||||
timer->get_wall(Timer::PAIR),
|
||||
timer->get_wall(Timer::NEIGH),
|
||||
timer->get_wall(Timer::BOND),
|
||||
timer->get_wall(Timer::KSPACE));
|
||||
*/
|
||||
|
||||
if ((allcost > 0.0) && (nlocal > 0)) {
|
||||
const double avgcost = allcost/natoms;
|
||||
const double localcost = cost/nlocal;
|
||||
const double scale = (1.0-factor) + factor*localcost/avgcost;
|
||||
for (int i = 0; i < nlocal; ++i) {
|
||||
weight[i] *= scale;
|
||||
if (weight[i] < SMALL) weight[i] = SMALL;
|
||||
}
|
||||
}
|
||||
double maxcost;
|
||||
MPI_Allreduce(&cost,&maxcost,1,MPI_DOUBLE,MPI_MAX,world);
|
||||
if (maxcost <= 0.0) return;
|
||||
|
||||
// record time up to this point
|
||||
int nlocal = atom->nlocal;
|
||||
double localwt = 0.0;
|
||||
if (nlocal) localwt = cost/nlocal;
|
||||
|
||||
last += cost;
|
||||
if (nlocal && localwt <= 0.0) error->one(FLERR,"Balance weight <= 0.0");
|
||||
|
||||
// apply factor if specified != 1.0
|
||||
// wtlo,wthi = lo/hi values excluding 0.0 due to no atoms on this proc
|
||||
// lo value does not change
|
||||
// newhi = new hi value to give hi/lo ratio factor times larger/smaller
|
||||
// expand/contract all localwt values from lo->hi to lo->newhi
|
||||
|
||||
if (factor != 1.0) {
|
||||
double wtlo,wthi;
|
||||
if (localwt == 0.0) localwt = BIG;
|
||||
MPI_Allreduce(&localwt,&wtlo,1,MPI_DOUBLE,MPI_MIN,world);
|
||||
if (localwt == BIG) localwt = 0.0;
|
||||
MPI_Allreduce(&localwt,&wthi,1,MPI_DOUBLE,MPI_MAX,world);
|
||||
if (wtlo == wthi) return;
|
||||
|
||||
double newhi = wthi*factor;
|
||||
localwt = wtlo + ((localwt-wtlo)/(wthi-wtlo)) * (newhi-wtlo);
|
||||
}
|
||||
|
||||
for (int i = 0; i < nlocal; i++) weight[i] *= localwt;
|
||||
|
||||
// record time up to this point
|
||||
|
||||
last += cost;
|
||||
}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
|
|
@ -24,11 +24,10 @@
|
|||
#include "update.h"
|
||||
|
||||
using namespace LAMMPS_NS;
|
||||
#define SMALL 0.001
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
ImbalanceVar::ImbalanceVar(LAMMPS *lmp) : Imbalance(lmp), name(0), id(0) {}
|
||||
ImbalanceVar::ImbalanceVar(LAMMPS *lmp) : Imbalance(lmp), name(0) {}
|
||||
|
||||
/* -------------------------------------------------------------------- */
|
||||
|
||||
|
@ -76,10 +75,15 @@ void ImbalanceVar::compute(double *weight)
|
|||
memory->create(values,nlocal,"imbalance:values");
|
||||
|
||||
input->variable->compute_atom(id,all,values,1,0);
|
||||
for (int i = 0; i < nlocal; ++i) {
|
||||
weight[i] *= values[i];
|
||||
if (weight[i] < SMALL) weight[i] = SMALL;
|
||||
}
|
||||
|
||||
int flag = 0;
|
||||
for (int i = 0; i < nlocal; i++)
|
||||
if (values[i] <= 0.0) flag = 1;
|
||||
int flagall;
|
||||
MPI_Allreduce(&flag,&flagall,1,MPI_INT,MPI_SUM,world);
|
||||
if (flagall) error->one(FLERR,"Balance weight <= 0.0");
|
||||
|
||||
for (int i = 0; i < nlocal; i++) weight[i] *= values[i];
|
||||
|
||||
memory->destroy(values);
|
||||
}
|
||||
|
|
|
@ -38,9 +38,6 @@ Improper::Improper(LAMMPS *lmp) : Pointers(lmp)
|
|||
vatom = NULL;
|
||||
setflag = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -29,10 +29,9 @@ class Improper : protected Pointers {
|
|||
double energy; // accumulated energies
|
||||
double virial[6]; // accumlated virial
|
||||
double *eatom,**vatom; // accumulated per-atom energy/virial
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
int copymode;
|
||||
|
@ -49,9 +48,6 @@ class Improper : protected Pointers {
|
|||
virtual void write_data(FILE *) {}
|
||||
virtual double memory_usage();
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
protected:
|
||||
int suffix_flag; // suffix compatibility flag
|
||||
|
||||
|
|
|
@ -328,12 +328,6 @@ E: Package command after simulation box is defined
|
|||
The package command cannot be used afer a read_data, read_restart, or
|
||||
create_box command.
|
||||
|
||||
E: Package cuda command without USER-CUDA package enabled
|
||||
|
||||
The USER-CUDA package must be installed via "make yes-user-cuda"
|
||||
before LAMMPS is built, and the "-c on" must be used to enable the
|
||||
package.
|
||||
|
||||
E: Package gpu command without GPU package installed
|
||||
|
||||
The GPU package must be installed via "make yes-gpu" before LAMMPS is
|
||||
|
|
|
@ -88,9 +88,6 @@ KSpace::KSpace(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp)
|
|||
eatom = NULL;
|
||||
vatom = NULL;
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -80,10 +80,8 @@ class KSpace : protected Pointers {
|
|||
|
||||
int group_group_enable; // 1 if style supports group/group calculation
|
||||
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
// KOKKOS host/device flag and data masks
|
||||
|
||||
ExecutionSpace execution_space;
|
||||
unsigned int datamask_read,datamask_modify;
|
||||
int copymode;
|
||||
|
|
|
@ -168,19 +168,10 @@ E: Cannot use -cuda on and -kokkos on together
|
|||
|
||||
This is not allowed since both packages can use GPUs.
|
||||
|
||||
E: Cannot use -cuda on without USER-CUDA installed
|
||||
|
||||
The USER-CUDA package must be installed via "make yes-user-cuda"
|
||||
before LAMMPS is built.
|
||||
|
||||
E: Cannot use -kokkos on without KOKKOS installed
|
||||
|
||||
Self-explanatory.
|
||||
|
||||
E: Using suffix cuda without USER-CUDA package enabled
|
||||
|
||||
Self-explanatory.
|
||||
|
||||
E: Using suffix gpu without GPU package installed
|
||||
|
||||
Self-explanatory.
|
||||
|
|
|
@ -100,9 +100,6 @@ Pair::Pair(LAMMPS *lmp) : Pointers(lmp)
|
|||
|
||||
// KOKKOS per-fix data masks
|
||||
|
||||
datamask = ALL_MASK;
|
||||
datamask_ext = ALL_MASK;
|
||||
|
||||
execution_space = Host;
|
||||
datamask_read = ALL_MASK;
|
||||
datamask_modify = ALL_MASK;
|
||||
|
|
|
@ -97,9 +97,6 @@ class Pair : protected Pointers {
|
|||
class NeighList *listmiddle;
|
||||
class NeighList *listouter;
|
||||
|
||||
unsigned int datamask;
|
||||
unsigned int datamask_ext;
|
||||
|
||||
int allocated; // 0/1 = whether arrays are allocated
|
||||
// public so external driver can check
|
||||
int compute_flag; // 0 if skip compute()
|
||||
|
@ -191,9 +188,6 @@ class Pair : protected Pointers {
|
|||
virtual void min_xf_get(int) {}
|
||||
virtual void min_x_set(int) {}
|
||||
|
||||
virtual unsigned int data_mask() {return datamask;}
|
||||
virtual unsigned int data_mask_ext() {return datamask_ext;}
|
||||
|
||||
// management of callbacks to be run from ev_tally()
|
||||
|
||||
protected:
|
||||
|
|
|
@ -20,9 +20,8 @@ namespace Suffix {
|
|||
static const int NONE = 0;
|
||||
static const int OPT = 1<<0;
|
||||
static const int GPU = 1<<1;
|
||||
static const int CUDA = 1<<2;
|
||||
static const int OMP = 1<<3;
|
||||
static const int INTEL = 1<<4;
|
||||
static const int OMP = 1<<2;
|
||||
static const int INTEL = 1<<3;
|
||||
}
|
||||
|
||||
}
|
||||
|
|
|
@ -81,14 +81,6 @@ class Update : protected Pointers {
|
|||
|
||||
/* ERROR/WARNING messages:
|
||||
|
||||
E: USER-CUDA mode requires CUDA variant of run style
|
||||
|
||||
CUDA mode is enabled, so the run style must include a cuda suffix.
|
||||
|
||||
E: USER-CUDA mode requires CUDA variant of min style
|
||||
|
||||
CUDA mode is enabled, so the min style must include a cuda suffix.
|
||||
|
||||
E: Illegal ... command
|
||||
|
||||
Self-explanatory. Check the input script syntax and compare to the
|
||||
|
|
|
@ -4813,72 +4813,6 @@ double Variable::evaluate_boolean(char *str)
|
|||
return argstack[0].value;
|
||||
}
|
||||
|
||||
/* ---------------------------------------------------------------------- */
|
||||
|
||||
unsigned int Variable::data_mask(int ivar)
|
||||
{
|
||||
if (eval_in_progress[ivar]) return EMPTY_MASK;
|
||||
eval_in_progress[ivar] = 1;
|
||||
unsigned int datamask = data_mask(data[ivar][0]);
|
||||
eval_in_progress[ivar] = 0;
|
||||
return datamask;
|
||||
}
|
||||
|
||||
/* ---------------------------------------------------------------------- */
|
||||
|
||||
unsigned int Variable::data_mask(char *str)
|
||||
{
|
||||
unsigned int datamask = EMPTY_MASK;
|
||||
|
||||
for (unsigned int i = 0; i < strlen(str)-2; i++) {
|
||||
int istart = i;
|
||||
while (isalnum(str[i]) || str[i] == '_') i++;
|
||||
int istop = i-1;
|
||||
|
||||
int n = istop - istart + 1;
|
||||
char *word = new char[n+1];
|
||||
strncpy(word,&str[istart],n);
|
||||
word[n] = '\0';
|
||||
|
||||
// ----------------
|
||||
// compute
|
||||
// ----------------
|
||||
|
||||
if ((strncmp(word,"c_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
|
||||
if (domain->box_exist == 0)
|
||||
error->all(FLERR,
|
||||
"Variable evaluation before simulation box is defined");
|
||||
|
||||
int icompute = modify->find_compute(word+2);
|
||||
if (icompute < 0)
|
||||
error->all(FLERR,"Invalid compute ID in variable formula");
|
||||
|
||||
datamask &= modify->compute[icompute]->data_mask();
|
||||
}
|
||||
|
||||
if ((strncmp(word,"f_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
|
||||
if (domain->box_exist == 0)
|
||||
error->all(FLERR,
|
||||
"Variable evaluation before simulation box is defined");
|
||||
|
||||
int ifix = modify->find_fix(word+2);
|
||||
if (ifix < 0) error->all(FLERR,"Invalid fix ID in variable formula");
|
||||
|
||||
datamask &= modify->fix[ifix]->data_mask();
|
||||
}
|
||||
|
||||
if ((strncmp(word,"v_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
|
||||
int ivar = find(word+2);
|
||||
if (ivar < 0) error->all(FLERR,"Invalid variable name in variable formula");
|
||||
datamask &= data_mask(ivar);
|
||||
}
|
||||
|
||||
delete [] word;
|
||||
}
|
||||
|
||||
return datamask;
|
||||
}
|
||||
|
||||
/* ----------------------------------------------------------------------
|
||||
class to read variable values from a file
|
||||
for flag = SCALARFILE, reads one value per line
|
||||
|
|
|
@ -49,9 +49,6 @@ class Variable : protected Pointers {
|
|||
tagint int_between_brackets(char *&, int);
|
||||
double evaluate_boolean(char *);
|
||||
|
||||
unsigned int data_mask(int ivar);
|
||||
unsigned int data_mask(char *str);
|
||||
|
||||
private:
|
||||
int me;
|
||||
int nvar; // # of defined variables
|
||||
|
|
|
@ -1 +1 @@
|
|||
#define LAMMPS_VERSION "30 Sep 2016"
|
||||
#define LAMMPS_VERSION "5 Oct 2016"
|
||||
|
|
Loading…
Reference in New Issue