Summary:
If we don't have the first and last access of an interleaved load group,
the first and last wide load in the loop can do an out of bounds
access. Even though we discard results from speculative loads,
this can cause problems, since it can technically generate page faults
(or worse).
We now discard interleaved load groups that don't have the first and
load in the group.
Reviewers: hfinkel, rengolin
Subscribers: rengolin, llvm-commits, mzolotukhin, anemet
Differential Revision: http://reviews.llvm.org/D17332
llvm-svn: 261331
Summary:
This was broken in r260694 which swapped the address and data operands
for flat store instructions. The code in SIInsertWaits assumes
that the data operand always comes before the address operand, so
we need to add a special case for flat.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17366
llvm-svn: 261330
As discussed on PR24580, this patch adds some (more to come) initial fast-isel codegen tests to match the IR generated in clang/test/CodeGen/avx-builtins.c
llvm-svn: 261329
According to the SystemZ ABI, 128-bit integer types should be
passed and returned via implicit reference. However, this is
not currently implemented at the LLVM IR level for the i128
type. This does not matter when compiling C/C++ code, since
clang will implement the implicit reference itself.
However, it turns out that when calling libgcc helper routines
operating on 128-bit integers, LLVM will use i128 argument and
return value types; the resulting code is not compatible with
the ABI used in libgcc, leading to crashes (see PR26559).
This should be simple to fix, except that i128 currently is not
even a legal type for the SystemZ back end. Therefore, common
code will already split arguments and return values into multiple
parts. The bulk of this patch therefore consists of detecting
such parts, and correctly handling passing via implicit reference
of a value split into multiple parts. If at some time in the
future, i128 becomes a legal type, this code can be removed again.
This fixes PR26559.
llvm-svn: 261325
routine.
We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.
The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.
With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers is down to three.
I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.
LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.
Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.
Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!
Differential Revision: http://reviews.llvm.org/D17435
llvm-svn: 261316
especially the *structure* of it with respect to various pass managers.
This uncovers an absolute horror show of problems. This test shows just
how bad PR24804 is: we have a totaly of *seven* loop pass managers in
the main optimization pipeline.
I've tried to comment the various bits to the best of my knowledge, but
more enhancements here would be great.
Also great would be folks adding various test for other pipelines, I'm
focused on trying to fix the O2 pipeline. I just wanted a test to show
what I'm changing.
llvm-svn: 261305
Certain optimization passes (like globaldce) can prune function
declaration that SjLjEHPrepare assumed would exit when it'd
runOnFunction.
This fixes PR26669.
llvm-svn: 261303
more places to prevent gratuitous re-"runs" of these passes.
The passes themselves don't do any work when run, but we keep spending
time scheduling and running these needlessly when we really don't need
to do so.
This is the first patch towards fixing the really horrible loop pass
pipeline fragmentation pointed out by Sanjoy in PR24804.
llvm-svn: 261302
Summary:
This change adds 3 tables to html report:
- list of covered files with number of functions covered.
- list of not covered files
- list of not covered functions.
I tried to put most coverage-calculating functionality into
SourceCoverageData.
Differential Revision: http://reviews.llvm.org/D17421
llvm-svn: 261287
Summary:
Without this, this command
$ llvm-run llc -stop-after machine-cp -o - <( echo '' )
outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().
Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.
Reviewers: jroelofs, tstellarAMD
Subscribers: jholewinski, qcolombet, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D17422
llvm-svn: 261286
This test builds on 261250 (IR support for cmpxchg of pointers) and 261245 (capture tracking support for cmpxchg) to show that correctly analyze the capturing of pointers in a cmpxchg of pointer type.
llvm-svn: 261284
Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it make capture tracking overly conservative)
2) It pushes work onto the frontend authors for no real gain
This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes.
Differential Revision: http://reviews.llvm.org/D17413
llvm-svn: 261281
This is effectively NFC because Atom is the only in-order x86 subtarget currently,
but the predicate would have become wrong if any other in-order CPU came along.
See related discussion in:
http://reviews.llvm.org/D16836
llvm-svn: 261275
Old compilers don't like constexpr, but we're only going to use this in one
place anyway: this file. Everyone else should go through PointerLikeTypeTraits.
Update to r261259.
llvm-svn: 261268
This patch is part of the work to make PPCLoopDataPrefetch
target-independent
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758).
Obviously the pass still only used from PPC at this point. Subsequent
patches will start driving this from ARM64 as well.
Due to the previous patch most lines should show up as moved lines.
llvm-svn: 261265
This is done only to make the next patch that move the pass out PPC to
Transforms easier to read. After this most line should show up as moved
lines in that patch.
This patch is part of the work to make PPCLoopDataPrefetch
target-independent
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758).
llvm-svn: 261264
If we know that all of our successors want to be in the exact same
state, it makes sense to hoist the state transition into their common
predecessor.
Differential Revision: http://reviews.llvm.org/D17391
llvm-svn: 261262
IRBuilder has two ways of putting bundle operands on calls: the default
operand bundle, and an overload of CreateCall that takes an operand
bundle list.
Previously, this overload used a default argument of None. This made it
impossible to distinguish between the case were the caller doesn't care
about bundles, and the case where the caller explicitly wants no
bundles. We behaved as if they wanted the latter behavior rather than
the former, which led to problems with simplifylibcalls and WinEH.
This change fixes it by making the parameter non-optional, so we can
distinguish these two cases.
llvm-svn: 261258
Summary: As per title. There was a lot of part missing in the C API, so I had to extend the invoke and landingpad API.
Reviewers: echristo, joker.eph, Wallbraker
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17359
llvm-svn: 261254
These atomic operations are conceptually both a load and store from the same location. As such, we can treat them as the most conservative of those two components which in practice, means we can treat them like stores. An cmpxchg or atomicrmw captures the values, but not the locations accessed.
Note: We can probably be more aggressive about the comparison value in an cmpxhg since to have it be in memory, it must already be captured, but I figured it was better to avoid that for the moment.
Note 2: It turns out that since we don't actually support cmpxchg of pointer type, writing a negative test is impossible.
Differential Revision: http://reviews.llvm.org/D17400
llvm-svn: 261245
In r260133, LLVM was changed to no longer extend i8/i16 return values,
as it's not required by the ABI. However, code was found in the wild
that relies on the old behaviour on Darwin, so this commit reverts
back to that old behaviour for Darwin.
On other platforms, it's less likely that code would be depending on
the old behaviour, as GCC and MSVC haven't been extending such return
values.
llvm-svn: 261235
covmap needs to created as non allocatable, but not with
SHT_NOTE. The latter was needed to workaround a problem
of BFD linker with gc, which is no longer needed. (A more
proper longer term fix requires changing FE driver to force
referencing the section using linker script).
Differential Revision: http://reviews.llvm.org/D17309
llvm-svn: 261228
Summary:
These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa
for the GL_ARB_shader_image_load_store extension.
IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has
a legacy name and pretends not to read memory.
Differential Revision: http://reviews.llvm.org/D17276
llvm-svn: 261224
Compiling Hexagon target with GCC 6 produces "error: should have been
declared inside" due to GCC PR c++/69657 which was merged.
Properly wrapping operator<<() definitions within the namespace llvm
fixes the issue.
Author: domagoj.stolfa
Differential Revision: http://reviews.llvm.org/D17281
llvm-svn: 261220
Commit r259357 was reverted because it caused PR26629. We were assuming all
roots of a vectorizable tree could be truncated to the same width, which is not
the case in general. This commit reapplies the patch along with a fix and a new
test case to ensure we don't regress because of this issue again. This should
fix PR26629.
llvm-svn: 261212