D75801 removed the last and only user of this option, so we can
drop it now. The original idea behind this was to only run expensive
transforms under -O3, but apart from the one known bits transform,
this has never really taken off. I believe nowadays the recommendation
is to put expensive transforms in AggressiveInstCombine instead,
though that isn't terribly popular either :)
Differential Revision: https://reviews.llvm.org/D76540
If ExpensiveCombines is enabled (which is the case with -O3 on the
legacy PM and always on the new PM), InstCombine tries to compute
the known bits of all instructions in the hope that all bits end up
being known, which is fairly expensive.
How effective is it? If we add some statistics on how often the
constant folding succeeds and how many KnownBits calculations are
performed and run test-suite we get:
"instcombine.NumConstPropKnownBits": 642,
"instcombine.NumConstPropKnownBitsComputed": 18744965,
In other words, we get one fold for every 30000 KnownBits calculations.
However, the truth is actually much worse: Currently, known bits are
computed before performing other folds, so there is a high chance
that cases that get folded by known bits would also have been
handled by other folds.
What happens if we compute known bits after all other folds
(hacky implementation: https://gist.github.com/nikic/751f25b3b9d9e0860db5dde934f70f46)?
"instcombine.NumConstPropKnownBits": 0,
"instcombine.NumConstPropKnownBitsComputed": 18105547,
So it turns out despite doing 18 million known bits calculations,
the known bits fold does not do anything useful on test-suite.
I was originally planning to move this into AggressiveInstCombine
so it only runs once in the pipeline, but seeing this, I think
we're better off removing it entirely.
As this is the only use of the "expensive combines" mechanism,
it may be removed afterwards, but I'll leave that to a separate patch.
Differential Revision: https://reviews.llvm.org/D75801
Ideally SimplifyDemanded should compute the same known bits as
computeKnownBits(). This patch addresses one discrepancy, where
ValueTracking is more powerful: If we have a shl nsw shift, we
know that the sign bit of the input and output must be the same.
If this results in a conflict, the result is poison.
This is implemented in
2c4ca6832f/lib/Analysis/ValueTracking.cpp (L1175-L1179)
and
2c4ca6832f/lib/Analysis/ValueTracking.cpp (L904-L908).
This implements the same basic logic in SimplifyDemanded. It's
slightly stronger, because I return undef instead of zero for the
poison case (which is not an option inside ValueTracking).
As mentioned in https://reviews.llvm.org/D75801#inline-698484,
we could detect poison in more cases, this just establishes parity
with the existing logic.
Differential Revision: https://reviews.llvm.org/D76489