Since we compute dvdw as d vdw / d rij, we have to also compute
dslw as d slw / d rij. Currently, we compute -1/r d slw/d rij,
which leads to incorrect results when the two are later combined.
Alternatively, one could modify dvdw to be -1/r d vdw/d rij,
which would be a more standard way to do LJ calculations, but the
approach taken here seems more consistent.
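
A minimal sketch of why the two derivatives must share one convention, assuming a plain 12-6 LJ term and a hypothetical cubic switching function (the functional forms, parameters, and helper names below are illustrative and not taken from the actual code). The product rule only reproduces the finite-difference derivative of vdw * slw when both factors are differentiated the same way:

```python
def vdw(r, eps=1.0, sigma=1.0):
    # 12-6 Lennard-Jones energy (illustrative parameters)
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

def dvdw(r, eps=1.0, sigma=1.0):
    # plain derivative d vdw / d r
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (-12.0 * s6 * s6 + 6.0 * s6) / r

def slw(r, rmin=2.0, rmax=3.0):
    # hypothetical smooth switch: 1 below rmin, 0 above rmax
    if r <= rmin:
        return 1.0
    if r >= rmax:
        return 0.0
    t = (r - rmin) / (rmax - rmin)
    return 1.0 - t * t * (3.0 - 2.0 * t)

def dslw(r, rmin=2.0, rmax=3.0):
    # plain derivative d slw / d r, same convention as dvdw
    if r <= rmin or r >= rmax:
        return 0.0
    t = (r - rmin) / (rmax - rmin)
    return -6.0 * t * (1.0 - t) / (rmax - rmin)

def d_product_consistent(r):
    # product rule with both factors as plain d/dr derivatives
    return dvdw(r) * slw(r) + vdw(r) * dslw(r)

def d_product_mixed(r):
    # mixed conventions: dslw scaled by -1/r while dvdw is not
    return dvdw(r) * slw(r) + vdw(r) * (-dslw(r) / r)

def finite_difference(r, h=1e-6):
    # central difference of the switched energy vdw(r) * slw(r)
    def f(x):
        return vdw(x) * slw(x)
    return (f(r + h) - f(r - h)) / (2.0 * h)
```

Evaluated inside the switching region (for example r = 2.5 with the defaults above), d_product_consistent should agree with finite_difference, while d_product_mixed should not.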
Overall performance improvements range from 2% to 18% on our benchmarks:
1) Newton has to be turned on for SSA, so remove those conditionals
2) Rework the math in ssa_update() to eliminate many ops and temporaries
3) Split ssa_update() into two versions, based on DPD vs. DPDE
4) Reorder code in ssa_update_*() to reduce register pressure
The code tries to distinguish between the real distance (r23) and the fictitious one (rij), but does not do so very well.
It is better if those two variables have the same value everywhere and the correction is applied where necessary.
The current way the values are used is incorrect.
Remove those calculations that are effectively derivatives w.r.t. |rij| (the fictitious distance): |rij| is constant, and thus the chained derivative (d|rij|/dRij) is always zero.
Apply the corrections due to drij/dRij in the sum omega term.
The bondorderLJ function operates on a fictitious distance |rij|, i.e. everything gets calculated "as if" atoms i and j were a given distance alpha apart.
Mathematically, bondorderLJ is a function of rij (a vector), which in terms of the real separation vector Rij is rij = alpha * Rij/|Rij|.
When we calculate the forces in bondorderLJ, we have to make sure to chain in this derivative whenever we calculate derivatives w.r.t. rij.
The right correction, as it turns out, is Fij = alpha / |Rij| * (Identity(3,3) - Rij * Rij^T / |Rij|^2) * fij.
This commit only fixes this for the p_ij^sigma pi terms, which were modified to separate out the d/drij derivative in the cosine calculation.
Now, derivatives are taken w.r.t. the connecting edges instead of the edge points.
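
The chain-rule correction Fij = alpha / |Rij| * (Identity(3,3) - Rij * Rij^T / |Rij|^2) * fij can be checked numerically. The following is a sketch only: the test energy, the vectors A and B, and all helper names are hypothetical and stand in for whatever bondorderLJ actually computes. It works with gradients rather than forces; since the correction is linear, the sign convention does not affect the check.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fictitious(Rij, alpha):
    # rij = alpha * Rij / |Rij|
    n = math.sqrt(dot(Rij, Rij))
    return [alpha * c / n for c in Rij]

def project_force(Rij, fij, alpha):
    # Fij = alpha/|Rij| * (I - Rij Rij^T / |Rij|^2) * fij
    R2 = dot(Rij, Rij)
    scale = alpha / math.sqrt(R2)
    along = dot(Rij, fij) / R2
    return [scale * (f - R * along) for R, f in zip(Rij, fij)]

# Hypothetical smooth test energy as a function of the fictitious vector rij
A = [0.3, -1.2, 0.7]
B = [1.1, 0.4, -0.5]

def energy(rij):
    return dot(rij, A) + dot(rij, B) ** 2

def grad_energy(rij):
    # derivative of energy w.r.t. rij (plays the role of fij)
    return [a + 2.0 * dot(rij, B) * b for a, b in zip(A, B)]

def numerical_gradient(Rij, alpha, h=1e-6):
    # central differences of energy(fictitious(Rij)) w.r.t. the real Rij
    g = []
    for k in range(3):
        Rp, Rm = list(Rij), list(Rij)
        Rp[k] += h
        Rm[k] -= h
        g.append((energy(fictitious(Rp, alpha))
                  - energy(fictitious(Rm, alpha))) / (2.0 * h))
    return g
```

Because |rij| is fixed at alpha, the corrected vector has no component along Rij; only the direction of Rij can change the energy.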
Since Etmp (representing sum_kijl omega_kijl * w_ik * w_jl) is not reset between the forward and reverse pass, the value used by later calculations will be twice the expected value.
One could instead reset Etmp between these passes, but there really is no reason to calculate it twice.
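
The effect of the missing reset can be illustrated with a toy accumulator. This is a simplified sketch, assuming both passes visit the same (omega, w_ik, w_jl) terms; the loop structure and function names are hypothetical and do not mirror the real code.

```python
def accumulate_two_passes(terms, reset_between_passes):
    # Returns the value of etmp seen at the end of each pass.
    etmp = 0.0
    seen = []
    for _pass in ("forward", "reverse"):
        if reset_between_passes:
            etmp = 0.0
        for omega, w_ik, w_jl in terms:
            etmp += omega * w_ik * w_jl
        seen.append(etmp)
    return seen

def accumulate_once(terms):
    # Computing the sum a single time avoids the issue entirely.
    return sum(omega * w_ik * w_jl for omega, w_ik, w_jl in terms)
```

Without the reset, the value left after the second pass is double the single-pass sum; computing it once and reusing it sidesteps both the bug and the redundant work.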