Commit Graph

37452 Commits

Author SHA1 Message Date
Tom Stellard 14416ae6cd Support/ELF: Add R_AMDGPU_GOTPCREL relocation
Summary:
We will start generating this in a future patch.

Reviewers: arsenm, kzhuravl, rafael, ruiu, tony-tye

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21482

llvm-svn: 273628
2016-06-23 23:11:29 +00:00
Hans Wennborg 4b63a98de3 [codeview] Add classes and unions to the Local/Global UDTs lists
Differential Revision: http://reviews.llvm.org/D21655

llvm-svn: 273626
2016-06-23 22:57:25 +00:00
Chris Bieneman 10dcd3bd3b [yaml2macho] Removing asserts in favor of explicit yaml parse error
32-bit Mach headers don't have reserved fields. When generating the
mapping for 32-bit headers leaving off the reserved field will result in
parse errors if the field is present in the yaml.

Added a CHECK-NOT line to ensure that mach_header.yaml isn't adding a
reserved field, and a test to ensure that the parser error gets hit with
32-bit headers.

llvm-svn: 273623
2016-06-23 22:36:31 +00:00
Kyle Butt 991df7889b Codegen: [X86] preservere memory refs for folded umul_lohi
Memory references were not being propagated for this folded load. This
prevented optimizations like LICM from hoisting the load.

Added test to verify that this allows LICM to proceed.

llvm-svn: 273617
2016-06-23 21:40:35 +00:00
Kyle Butt 178314ab52 Codegen: LICM Remove check for exactly 1 register def.
When considering whether to split an instruction with a memory operand
into an explicit load and a register-based instruction, we currently
check that the resulting instruction has exactly 1 def. This prevents 2
important LICM optimizations: compares with memory operands, and double
indirect calls. All the tests and the test-suite pass without the check.
My guess as to original intent is to limit the additional register pressure
created by the new instruction, but given that we only split out a single
register, it is already limited.

The licm-dominance test now checks actual memory loads for hoisting instead of
undef, and it tests compares.
hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus
from calling a got-relative function in a loop.

llvm-svn: 273616
2016-06-23 21:38:49 +00:00
Rafael Espindola 2d3cce71ee Uses shouldAssumeDSOLocal.
With that SystemZ knows to avoid a GOT for PIE.

llvm-svn: 273614
2016-06-23 21:18:59 +00:00
Rafael Espindola f2898d73a5 Convert test to FileCheck.
llvm-svn: 273609
2016-06-23 20:37:49 +00:00
Anna Thomas 31a0b2088f InstCombine rule to fold trunc when value available
Summary:
This instcombine rule folds away trunc operations that have value available from a prior load or store.
This kind of code can be generated as a result of GVN widening the load or from source code as well.

Reviewers: reames, majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21246

llvm-svn: 273608
2016-06-23 20:22:22 +00:00
George Burgess IV 1f99da54c2 [CFLAA] Use better interprocedural function summaries.
Previously, we just unified any arguments that seemed to be related to
each other. With this patch, we now respect dereference levels, etc.
which should make us substantially more accurate. Proper handling of
StratifiedAttrs will be done in a later patch.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21536

llvm-svn: 273596
2016-06-23 18:55:23 +00:00
Hans Wennborg a21a263101 [codeview] Fix letter casing in FileCheck regexes
We print those hex numbers with uppercase letters.

llvm-svn: 273594
2016-06-23 18:23:28 +00:00
Michael Kuperstein 0194d30e09 [X86] Extract HiPE prologue constants into metadata
X86FrameLowering::adjustForHiPEPrologue() contains a hard-coded offset
into an Erlang Runtime System-internal data structure (the PCB). As the
layout of this data structure is prone to change, this poses problems
for maintaining compatibility.

To address this problem, the compiler can produce this information as
module-level named metadata. For example (where P_NSP_LIMIT is the
offending offset):

!hipe.literals = !{ !2, !3, !4 }
!2 = !{ !"P_NSP_LIMIT", i32 152 }
!3 = !{ !"X86_LEAF_WORDS", i32 24 }
!4 = !{ !"AMD64_LEAF_WORDS", i32 24 }

Patch by Magnus Lang

Differential Revision: http://reviews.llvm.org/D20363

llvm-svn: 273593
2016-06-23 18:17:25 +00:00
Nirav Dave 38bb1c15fd Prevent generation of temp file in test from r273585.
llvm-svn: 273588
2016-06-23 18:06:35 +00:00
Nirav Dave bfdb483755 Preserve DebugInfo when replacing values in DAGCombiner
Recommiting after correcting over-eager Debug Value transfer fixing PR28270.

[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.

Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.

This refixes PR9817 which was being incompletely checked in the
testsuite.

Reviewers: jyknight

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D21037

llvm-svn: 273585
2016-06-23 17:52:57 +00:00
Pablo Barrio 7a64346533 [ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x)
Summary:
SSAT saturates an integer, making sure that its value lies within
an interval [-k, k]. Since the constant is given to SSAT as the
number of bytes set to one, k + 1 must be a power of 2, otherwise
the optimization is not possible. Also, the select_cc must use <
and > respectively so that they define an interval.

Reviewers: mcrosier, jmolloy, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21372

llvm-svn: 273581
2016-06-23 16:53:49 +00:00
Artur Pilipenko 80771b9ad9 Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568.
llvm-svn: 273580
2016-06-23 16:34:52 +00:00
Hans Wennborg b510b458b9 [codeview] Emit retained types
Differential Revision: http://reviews.llvm.org/D21630

llvm-svn: 273579
2016-06-23 16:33:53 +00:00
Hans Wennborg a63b50afb8 Revert r273568 "Remangle intrinsics names when types are renamed"
It broke 2008-07-15-Bswap.ll and 2009-09-01-PostRAProlog.ll

llvm-svn: 273574
2016-06-23 16:13:23 +00:00
Artur Pilipenko 4fec7b7131 Fix an old memset signature in 2009-09-01-PostRAProlog.ll test causing a buildbot failure
llvm-svn: 273573
2016-06-23 16:07:10 +00:00
Artur Pilipenko f0c9f81379 Remangle intrinsics names when types are renamed
This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html

Reviewers: mehdi_amini, reames

Differential Revision: http://reviews.llvm.org/D19373

llvm-svn: 273568
2016-06-23 15:25:09 +00:00
Michael Zolotukhin 2d3592d481 [LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad.
When simplifying a load we need to make sure that the type of the
simplified value matches the type of the instruction we're processing.
In theory, we can handle casts here as we deal with constant data, but
since it's not implemented at the moment, we at least need to bail out.

This fixes PR28262.

llvm-svn: 273562
2016-06-23 14:31:31 +00:00
Valery Pykhtin a852d695b8 [AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields.
Differential Revision: http://reviews.llvm.org/D21380

llvm-svn: 273561
2016-06-23 14:13:06 +00:00
Simon Pilgrim 595dddb103 [X86][AVX512] Added AVX512F vector sign extend tests
Now that Elena has confirmed that PR26474 has been fixed

llvm-svn: 273560
2016-06-23 14:01:45 +00:00
Hal Finkel a1271036c5 Allow DeadStoreElimination to track combinations of partial later wrties
DeadStoreElimination can currently remove a small store rendered unnecessary by
a later larger one, but could not remove a larger store rendered unnecessary by
a series of later smaller ones. This adds that capability.

It works by keeping a map, which is used as an effective interval map, for each
store later overwritten only partially, and filling in that interval map as
more such stores are discovered. No additional walking or aliasing queries are
used. In the map forms an interval covering the the entire earlier store, then
it is dead and can be removed. The map is used as an interval map by storing a
mapping between the ending offset and the beginning offset of each interval.

I discovered this problem when investigating a performance issue with code like
this on PowerPC:

  #include <complex>
  using namespace std;

  complex<float> bar(complex<float> C);
  complex<float> foo(complex<float> C) {
    return bar(C)*C;
  }

which produces this:

  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
  entry:
    %ref.tmp = alloca i64, align 8
    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
    %3 = load i64, i64* %ref.tmp, align 8
    store i64 %3, i64* %2, align 4 ; <--- ***** THIS SHOULD NOT BE HERE ****
    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
    %4 = lshr i64 %3, 32
    %5 = trunc i64 %4 to i32
    %6 = bitcast i32 %5 to float
    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
    %7 = trunc i64 %3 to i32
    %8 = bitcast i32 %7 to float
    %mul_ad.i.i = fmul fast float %6, %1
    %mul_bc.i.i = fmul fast float %8, %0
    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
    %mul_ac.i.i = fmul fast float %6, %0
    %mul_bd.i.i = fmul fast float %8, %1
    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
    ret void
  }

the problem here is not just that the i64 store is unnecessary, but also that
it blocks further backend optimizations of the other uses of that i64 value in
the backend.

In the future, we might want to add a special case for handling smaller
accesses (e.g. using a bit vector) if the map mechanism turns out to be
noticeably inefficient. A sorted vector is also a possible replacement for the
map for small numbers of tracked intervals.

Differential Revision: http://reviews.llvm.org/D18586

llvm-svn: 273559
2016-06-23 13:46:39 +00:00
Daniel Sanders de393329b9 [mips] Don't derive the default ABI from the CPU in the backend.
Summary:
The backend has no reason to behave like a driver and should generally do
as it's told (and error out if it can't) instead of trying to figure out
what the API user meant. The default ABI is still derived from the arch
component as a concession to backwards compatibility.

API-users that previously passed an explicit CPU and a triple that was
inconsistent with the CPU (e.g. mips-linux-gnu and mips64r2) may get a
different ABI to what they got before. However, it's expected that there
are no such users on the basis that CodeGen has been asserting that the
triple is consistent with the selected ABI for several releases. API-users
that were consistent or passed '' or 'generic' as the CPU will see no
difference.

Reviewers: sdardis, rafael

Subscribers: rafael, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21466

llvm-svn: 273557
2016-06-23 12:42:53 +00:00
Daniel Sanders 8e17bea7d5 [mips][ias] Integers are not registers.
Summary:
When parseAnyRegister() encounters a symbol alias, it parses integers and adds
a corresponding expression to the operand list. This is clearly wrong since the
only operands that parseAnyRegister() should be accepting are registers.

It's not clear why this code was added and there are no test cases that cover
it. I think it might be leftover from when searchSymbolAlias() was more widely
used.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21377

llvm-svn: 273555
2016-06-23 10:54:09 +00:00
Diana Picus e440f99913 [AMDGPU] Remove exit-on-error in test (PR27761)
The exit-on-error flag was necessary in order to avoid an assertion when
handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize.

We can avoid the assertion by creating some dummy nodes. This enables us to
remove the exit-on-error flag on the first 2 run lines (SI), but on the third
run line (R600) we would run into another assertion when trying to reserve
indirect registers. This patch also replaces that assertion with an early exit
from the function.

Fixes PR27761.

Differential Revision: http://reviews.llvm.org/D20852

llvm-svn: 273550
2016-06-23 09:19:16 +00:00
Simon Dardis 724e530296 [mips] Fix dext/dins definitions
dext and dins, along with their 'm' and 'u' variants are defined in mips64r2,
not mips64.

Reviewers: dsanders, vkalintiris

Differential Review: http://reviews.llvm.org/D21608

llvm-svn: 273549
2016-06-23 09:06:20 +00:00
Craig Topper 597aa42fec [AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects.
llvm-svn: 273543
2016-06-23 07:37:33 +00:00
David Majnemer d1fbf48566 [SCCP] Don't assume all Constants are ConstantInt
This fixes PR28269.

llvm-svn: 273521
2016-06-23 00:14:29 +00:00
Peter Collingbourne 6717803485 Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270.
llvm-svn: 273518
2016-06-23 00:06:17 +00:00
Vedant Kumar cae1cff94a [llvm-cov] Fix a buggy lit test
There is no check prefix for "WHOLE-FILE": this particular line was
supposed to use the "ALL" prefix.

llvm-svn: 273517
2016-06-22 23:58:03 +00:00
Matt Arsenault 3cb4ddeb4e AMDGPU: Fix liveness when expanding m0 loop
llvm-svn: 273514
2016-06-22 23:40:57 +00:00
Sanjoy Das e57bf680ec [ImplicitNullChecks] Hoist trivial depdendencies if possible
When trying to convert a loading instruction into a FAULTING_LOAD, we
sometimes face code like this:

  if %R10 is not null:
    %R9<def> = MOV32ri Immediate
    %R9<def, tied> = AND32rm %R9, 0x20(%R10)
  else:
    goto TRAP

In these cases we would like to use the AND32rm instruction as the
faulting operation by hoisting the "depedency" def-ing %R9 also above
the control flow, transforming the program into:

  %R9<def> = MOV32ri Immediate
  %R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP)

This change teaches ImplicitNullChecks to do the above, when safe.

llvm-svn: 273501
2016-06-22 22:16:51 +00:00
Rafael Espindola 928a95d0b0 Use shouldAssumeDSOLocal.
With this it handle -fPIE.

llvm-svn: 273499
2016-06-22 22:09:17 +00:00
Changpeng Fang 47efe1f6db AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32
Reviewers: tstellarAMD, arsenm

Differential Revision: http://reviews.llvm.org/D21533

llvm-svn: 273496
2016-06-22 21:33:49 +00:00
Hans Wennborg 9a519a099e [codeview] Write LF_UDT_SRC_LINE records (PR28251)
Differential Revision: http://reviews.llvm.org/D21621

llvm-svn: 273495
2016-06-22 21:22:13 +00:00
Reid Kleckner ac460619d2 [codeview] Fix the alignment padding that we add to list records
Tweak the big-types.ll test case to catch this bug. We just need an
enumerator name that doesn't have a length that is a multiple of 4.

llvm-svn: 273477
2016-06-22 20:59:17 +00:00
Davide Italiano ec7e29e941 [IRObjectFile] Propagate .weak attribute correctly for ASM symbols.
PR: 28256
Differential Revision:  http://reviews.llvm.org/D21616

llvm-svn: 273474
2016-06-22 20:48:15 +00:00
Peter Collingbourne 6d88fde3af IR: Introduce Module::global_objects().
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.

Also start using it in a few places where it is obviously the right thing
to use.

Differential Revision: http://reviews.llvm.org/D21580

llvm-svn: 273470
2016-06-22 20:29:42 +00:00
Matt Arsenault 9babdf4265 AMDGPU: Fix verifier errors in SILowerControlFlow
The main sin this was committing was using terminator
instructions in the middle of the block, and then
not updating the block successors / predecessors.
Split the blocks up to avoid this and introduce new
pseudo instructions for branches taken with exec masking.

Also use a pseudo instead of emitting s_endpgm and erasing
it in the special case of a non-void return.

llvm-svn: 273467
2016-06-22 20:15:28 +00:00
Krzysztof Parzyszek f7f7068109 [Hexagon] Add SDAG preprocessing step to expose shifted addressing modes
Transform: (store ch addr (add x (add (shl y c) e)))
       to: (store ch addr (add x (shl (add y d) c))),
where e = (shl d c) for some integer d.
The purpose of this is to enable generation of loads/stores with
shifted addressing mode, i.e. mem(x+y<<#c). For that, the shift
value c must be 0, 1 or 2.

llvm-svn: 273466
2016-06-22 20:08:27 +00:00
Sanjay Patel a06d989552 [ValueTracking] improve ComputeNumSignBits for vector constants
This is similar to the computeKnownBits improvement in rL268479. 
There's probably more we can do for vector logic instructions, but 
this should let us see non-splat constant masking ops that can
become vector selects instead of and/andn/or sequences.

Differential Revision: http://reviews.llvm.org/D21610

llvm-svn: 273459
2016-06-22 19:20:59 +00:00
Chad Rosier 8c106bcbe8 [AArch64] Remove an overly aggressive assert.
llvm-svn: 273458
2016-06-22 19:18:52 +00:00
Rafael Espindola 8474fdf90d Start using shouldAssumeDSOLocal on Hexagon.
Include a token test showing that access to private is now the same as
to internal.

llvm-svn: 273457
2016-06-22 19:09:14 +00:00
Nirav Dave 96beb7dee5 Preserve DebugInfo when replacing values in DAGCombiner
Recommiting after fixing over-aggressive assertion

[DAG] Previously debug values would transfer debuginfo for the selected
start node for a replacement which allows for debug to be dropped.

Push debug value transfer to occur with node/value replacement in
SelectionDAG, remove now extraneous transfers of debug values.

This refixes PR9817 which was being incompletely checked in the
testsuite.

Reviewers: jyknight

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D21037

llvm-svn: 273456
2016-06-22 19:03:26 +00:00
Wei Ding 0526e7f8d9 AMDGPU: Add convergent flag to INLINEASM instruction.
Differential Revision: http://reviews.llvm.org/D21214

llvm-svn: 273455
2016-06-22 18:51:08 +00:00
Reid Kleckner 156a7239c1 [codeview] Add IntroducingVirtual debug info flag
CodeView needs to know if a virtual method was introduced in the current
class, and base classes may not have complete type information, so we
need to thread this bit through from the frontend.

llvm-svn: 273453
2016-06-22 18:31:14 +00:00
Vedant Kumar f5ac6d49e4 [asan] Do not instrument accesses to profiling globals
It's only useful to asan-itize profiling globals while debugging llvm's
profiling instrumentation passes. Enabling asan along with instrprof or
gcov instrumentation shouldn't incur extra overhead.

This patch is in the same spirit as r264805 and r273202, which disabled
tsan instrumentation of instrprof/gcov globals.

Differential Revision: http://reviews.llvm.org/D21541

llvm-svn: 273444
2016-06-22 17:30:58 +00:00
Reid Kleckner 643dd83661 [codeview] Defer emission of all referenced complete records
This is the motivating example:
  struct B { int b; };
  struct A { B *b; };
  int f(A *p) { return p->b->b; }

Clang emits complete types for both A and B because they are required to
be complete, but our CodeView emission would only emit forward
declarations of A and B. This was a consequence of the fact that the A*
type must reference the forward declaration of A, which doesn't
reference B at all.

We can't eagerly emit complete definitions of A and B when we request
the forward declaration's type index because of recursive types like
linked lists. If we did that, our stack usage could get out of hand, and
it would be possible to lower a type while attempting to lower a type,
and we would need to double check if our type is already present in the
TypeIndexMap after all recursive getTypeIndex calls.

Instead, defer complete type emission until after all type lowering has
completed. This ensures that all referenced complete types are emitted,
and that type lowering is not re-entrant.

llvm-svn: 273443
2016-06-22 17:15:28 +00:00
Zhan Jun Liau 0df350589f [SystemZ] Recognize RISBG opportunities involving a truncate
Summary:
Recognize RISBG opportunities where the end result is narrower than the
original input - where a truncate separates the shift/and operations.

The motivating case is some code in postgres which looks like:

	srlg	%r2, %r0, 11
	nilh	%r2, 255

Reviewers: uweigand

Author: RolandF

Differential Revision: http://reviews.llvm.org/D21452

llvm-svn: 273433
2016-06-22 16:16:27 +00:00