Summary:
Patch for capture tracking broke
bootstrap of clang with -fstict-vtable-pointers
which resulted in debbugging nightmare. It was fixed
https://reviews.llvm.org/D46900 but as it turned
out, there were other parts like inliner (computing of
noalias metadata) that I found after bootstraping with enabled
assertions.
Reviewers: hfinkel, rsmith, chandlerc, amharc, kuhar
Subscribers: JDevlieghere, eraman, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47088
llvm-svn: 333070
This enables us to detect more fast path sdiv cases under cost analysis.
This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs.
Found while working on D46276
Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases.
Differential Revision: https://reviews.llvm.org/D46637
llvm-svn: 332969
Summary:
invariant.group.launder should not stop propagation
of nonnull and dereferenceable, because e.g. we would not be
able to hoist loads speculatively.
Reviewers: rsmith, amharc, kuhar, xbolva00, hfinkel
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D46972
llvm-svn: 332788
Summary:
Memdep had funny bug related to invariant.groups - because it did not
invalidated cache, in some very rare cases it was possible to show memory
dependence of the instruction that was deleted, but because other
instruction took it's place it resulted in call to vtable!
Thanks @amharc for repro!.
Reviewers: dberlin, kuhar, amharc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45320
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 332781
CanProveNotTakenFirstIteration utility does not handle the case when
condition of the branch is a constant. Add its handling.
Reviewers: reames, anna, mkazantsev
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46996
llvm-svn: 332695
Summary:
There was some unfinished work started for offset tracking in CFLGraph by the author of implementation of Andersen algorithm. This work was completed and support for field sensitivity was added to the core of Andersen algorithm.
The performance results seem promising.
SPEC2006 int_base score was increased by 1.1 % (I compared clang 6.0 with clang 6.0 with this patch). The avergae compile time was increased by +- 1 % according my measures with small and medium C/C++ projects (I did not tested it on the large projects with milions of lines of code)
Reviewers: chandlerc, george.burgess.iv, rja
Reviewed By: rja
Subscribers: rja, llvm-commits
Differential Revision: https://reviews.llvm.org/D46282
llvm-svn: 332657
Summary:
A recent patch ([[ https://reviews.llvm.org/rL331587 | rL331587 ]]) to Capture Tracking taught it that the `launder_invariant_group` intrinsic captures its argument only by returning it. Unfortunately, BasicAA still considered every call instruction as a possible escape source and hence concluded that the result of a `launder_invariant_group` call cannot alias any local non-escaping value. This led to [[ https://bugs.llvm.org/show_bug.cgi?id=37458 | bug 37458 ]].
This patch updates the relevant check for escape sources in BasicAA.
Reviewers: Prazek, kuhar, rsmith, hfinkel, sanjoy, xbolva00
Reviewed By: hfinkel, xbolva00
Subscribers: JDevlieghere, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D46900
llvm-svn: 332466
Summary:
After r332167 we started to sort the IDF blocks inside IDF calculation, so
there is no need to re-sort them on the user site. The test changes are due to
a slightly different order we're using now (originally we used DFSInNumber and
now the blocks are sorted by a pair (LevelFromRoot, DFSInNumber)).
Reviewers: dberlin, mgrang
Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D46899
llvm-svn: 332385
In general, it's difficult to poke the ConstantExpr code in CFLAA, since
LLVM is so great at eagerly reducing ConstantExprs. :)
Sadly, this only shows a functional difference from before the patch
because CFLAA has some special logic around taking loads of non-pointers
into account. Namely, with the broken select behavior, CFLAA will
completely fail to take note of @g3. Since CFLAA doesn't have any record
about @g3 when we do an alias query for @g3 and %a, it conservatively
answers MayAlias. When we properly take @g3 into account with the new
select logic, we get NoAlias for this query.
I suspect that the aforementioned "special logic" isn't completely
correct, but this test-case should prevent future wonky aliasing results
from appearing for these flavors of ConstantExprs, so I think it's still
worth having.
llvm-svn: 332017
With custom lowering for vector MULLH{S,U}, it is now profitable to
vectorize a divide by constant loop for the custom types (v16i8, v8i16,
and v4i32). The cost if based on TargetLowering::Build{S,U}DIV which
uses a multiply by constant plus adjustment to express a divide by
constant.
Both {u,s}mull{2} are expressed as Instruction::Mul and shifts by
Instruction::AShr.
llvm-svn: 331873
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
llvm-svn: 331841
This patch was temporarily reverted because it has exposed bug 37229 on
PowerPC platform. The bug is unrelated to the patch and was just a general
bug in the optimization done for PowerPC platform only. The bug was fixed
by the patch rL331410.
This patch returns the disabled commit since the bug was fixed.
llvm-svn: 331427
It turned out that readonly argmemonly is not enough.
store 42, %p
%b = barrier(%p)
store 43, %b
the first store is dead, but because barrier was marked as
reading argument memory, it was considered alive. With
inaccessiblememonly it doesn't read the argument, but
it also can't be CSEd.
based on: https://reviews.llvm.org/D32006
llvm-svn: 331338
Summary:
The value tracking analysis uses function alignment to infer that the
least significant bits of function pointers are known to be zero.
Unfortunately, this is not correct for ARM targets: the least
significant bit of a function pointer stores the ARM/Thumb state
information (i.e., the LSB is set for Thumb functions and cleared for
ARM functions).
The original approach (https://reviews.llvm.org/D44781) introduced a
new field for function pointer alignment in the DataLayout structure
to address this. But it seems unlikely that optimizations based on
function pointer alignment would bring much benefit in practice to
justify the additional maintenance burden, so this patch simply
assumes that function pointer alignment is always unknown.
Reviewers: javed.absar, efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, llvm-commits, hfinkel, rogfer01
Differential Revision: https://reviews.llvm.org/D46110
llvm-svn: 331025
This patch adds a new shuffle kind useful for transposing a 2xn matrix. These
transpose shuffle masks read corresponding even- or odd-numbered vector
elements from two n-dimensional source vectors and write each result into
consecutive elements of an n-dimensional destination vector. The transpose
shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such,
this patch also considers transpose shuffles in the AArch64 implementation of
getShuffleCost.
Differential Revision: https://reviews.llvm.org/D45982
llvm-svn: 330941
This reverts commit 023c8be90980e0180766196cba86f81608b35d38.
This patch triggers miscompile of zlib on PowerPC platform. Most likely it is
caused by some pre-backend PPC-specific pass, but we don't clearly know the
reason yet. So we temporally revert this patch with intention to return it
once the problem is resolved. See bug 37229 for details.
llvm-svn: 330893
This patch adds a cost model test case for vector shuffles having transpose
masks. The given costs are inaccurate and will be updated in a follow-on patch.
llvm-svn: 330625
Improve the alias analysis to account for cases where we
know that src/dst pairs cannot alias due to things like
TBAA. As we know they are noalias, we know no dependency
can occur. Also fixes issues around the size parameter
to AA being incorrect.
Differential Revision: https://reviews.llvm.org/D42381
llvm-svn: 329692
The script allows the auto-generation of checks for cost model tests to speed up their creation and help improve coverage, which will help a lot with PR36550.
If the need arises we can add support for other analyze passes as well, but the cost models was the one I needed to get done - at the moment it just warns that any other analysis mode is unsupported.
I've regenerated a couple of x86 test files to show the effect.
Differential Revision: https://reviews.llvm.org/D45272
llvm-svn: 329390
Summary:
These new image intrinsics contain the texture type as part of
their name and have each component of the address/coordinate as
individual parameters.
This is a preparatory step for implementing the A16 feature, where
coordinates are passed as half-floats or -ints, but the Z compare
value and texel offsets are still full dwords, making it difficult
or impossible to distinguish between A16 on or off in the old-style
intrinsics.
Additionally, these intrinsics pass the 'texfailpolicy' and
'cachectrl' as i32 bit fields to reduce operand clutter and allow
for future extensibility.
v2:
- gather4 supports 2darray images
- fix a bug with 1D images on SI
Change-Id: I099f309e0a394082a5901ea196c3967afb867f04
Reviewers: arsenm, rampitec, b-sumner
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D44939
llvm-svn: 329166