handles Intrinsic::trap if TargetOptions::TrapFuncName is set.
This fixes a bug in which the trap function was not taken into consideration
when a program was compiled without optimization (at -O0).
<rdar://problem/16291933>
llvm-svn: 206323
This adds a warning that triggers when profile data doesn't match for
the source that's being compiled with -fprofile-instr-use=. This fires
only once per translation unit, as warning on every mismatched
function would be quite noisy.
llvm-svn: 206322
The following example shows a non-parallel loop
void f(int a[]) {
int i;
for (i = 0; i < 10; ++i)
A[i] = A[i+5];
}
which, in case we import a schedule that limits the iteration domain
to 0 <= i < 5, becomes parallel. Previously we crashed in such cases, now we
just recognize it as parallel.
This fixes http://llvm.org/PR19435
Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206318
When instantiating an array that has an alignment attribute on it, we
were looking through the array type and only considering the element
type for the resulting alignment. We need to make sure we take the
array's requirements into account too.
llvm-svn: 206317
This patch teaches the backend how to efficiently lower logical and
arithmetic packed shifts on both SSE and AVX/AVX2 machines.
When possible, instead of scalarizing a vector shift, the backend should try
to expand the shift into a sequence of two packed shifts by immedate count
followed by a MOVSS/MOVSD.
Example
(v4i32 (srl A, (build_vector < X, Y, Y, Y>)))
Can be rewritten as:
(v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>)))
[with X and Y ConstantInt]
The advantage is that the two new shifts from the example would be lowered into
X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into
four scalar shifts plus four pairs of vector insert/extract.
llvm-svn: 206316
Sema does have a CUDALaunchBoundsAttr, but CodeGen was doing nothing with it.
This change translates CUDALaunchBoundsAttr to maxntidx and minctasm
metadata, which NVPTX then translates to the correct PTX directives.
Patch by Manjunath Kudlur.
llvm-svn: 206302
Implement DebugInfoVerifier, which steals verification relying on
DebugInfoFinder from Verifier.
- Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps
DebugInfoVerifier. Uses -verify-di command-line flag.
- Change verifyModule() to invoke DebugInfoVerifier as well as
Verifier.
- Add a call to createDebugInfoVerifierPass() wherever there was a
call to createVerifierPass().
This implementation as a module pass should sidestep efficiency issues,
allowing us to turn debug info verification back on.
<rdar://problem/15500563>
llvm-svn: 206300
This implements clause C.8 of the AAPCS in the front-end, so that Clang
accurately knows when the registers run out and it has to insert padding before
the stack objects begin.
PR19432.
llvm-svn: 206296
Sometimes we need emit the bits that would actually be a MOVN when producing a
relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got
wrong until now.
llvm-svn: 206289
Code is mostly copied directly across, with a slight extension of the
ISelDAGToDAG function so that it can cope with the floating-point constants
being behind a litpool.
llvm-svn: 206285
ARM64 suffered multiple -verify-machineinstr failures (principally over the
xsp/xzr issue) because FastISel was completely ignoring which subset of the
general-purpose registers each instruction required.
More fixes are coming in ARM64 specific FastISel, but this should cover the
generic problems.
llvm-svn: 206283
In the case of a CHECK failure the program tries to fork and launch llvm-symbolizer,
but hangs in mz_force_lock because one of the allocator locks is already acquired.
llvm-svn: 206274