Commit Graph

44438 Commits

Author SHA1 Message Date
Petr Hosek 86611a078f [llvm-objdump] Don't attempt to print lines beyond the end of file
This may trigger a segfault in llvm-objdump when the line number stored
in debug infromation points beyond the end of file; lines in LineBuffer
are stored in std::vector which is allocated in chunks, so even if the
debug info points beyond the end of the file, this doesn't necessarily
trigger the segfault unless the line number points beyond the allocated
space.

Differential Revision: https://reviews.llvm.org/D32466

llvm-svn: 301347
2017-04-25 18:56:33 +00:00
Gil Rapaport 5c875c3d6f [LV] Make LIT test insensitive to basic block numbering
This patch is part of D28975's breakdown.

induction.ll encodes the specific (and rather arbitrary) numbers given to
predicated basic blocks by the unique naming mechanism, which makes it
sensitive to changes in LV's instruction generation order. This patch replaces
those specific numbers with a numeric pattern.

Differential Revision: https://reviews.llvm.org/D32404

llvm-svn: 301345
2017-04-25 18:14:24 +00:00
Stanislav Mekhanoshin f2db5434be Skip bitcasts while looking for GEP in LoadStoreVectorizer
Differential Revisison: https://reviews.llvm.org/D32101

llvm-svn: 301343
2017-04-25 18:00:08 +00:00
Simon Pilgrim 58641e4529 [X86][AVX2] Add shuffle test for PR27320 showing current codegen.
llvm-svn: 301342
2017-04-25 18:00:04 +00:00
Craig Topper b3b3c29c87 [InstCombine] Fix CHECK-LABEL in two tests.
llvm-svn: 301337
2017-04-25 17:40:58 +00:00
Simon Pilgrim 6f775ba188 [X86][SSE] Add tests for PR14657 showing current codegen.
llvm-svn: 301334
2017-04-25 17:22:34 +00:00
Adrian Prantl de1a8b4efb Print complete DIExpressions in the assembler output DEBUG_VALUE comments.
The previous code was complex, incorrect, and couldn't print everything.

llvm-svn: 301333
2017-04-25 17:22:09 +00:00
Sam Clegg 7fb391fea3 [WebAssembly] Read global index in init expression as LEB
Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D32462

llvm-svn: 301330
2017-04-25 17:11:56 +00:00
Craig Topper 0b650d3569 [InstSimplify] Handle (~A & ~B) | (~A ^ B) -> ~A ^ B
The code Sanjay Patel moved over from InstCombine doesn't work properly if the 'and' has both inputs as nots because we used a commuted op matcher on the 'and' first. But this will bind to the first 'not' on 'and' when there could be two 'not's. InstCombine could rely on DeMorgan to ensure the 'and' wouldn't have two 'not's eventually, but InstSimplify can't rely on that.

This patch matches the xor first then checks for the ands and allows a not of either operand of the xor.

Differential Revision: https://reviews.llvm.org/D32458

llvm-svn: 301329
2017-04-25 17:01:32 +00:00
Andrew Ng 10ebfe0684 Resubmit r301309: [DebugInfo][X86] Fix handling of DBG_VALUE's in post-RA scheduler.
This patch reapplies r301309 with the fix to the MIR test to fix the assertion
triggered by r301309. Had trimmed a little bit too much from the MIR!

llvm-svn: 301317
2017-04-25 15:39:57 +00:00
Craig Topper ba01143193 [InstCombine] Add missing commute handling to (A | B) & (B ^ (~A)) -> (A & B)
The matching here wasn't able to handle all the possible commutes. It always assumed the not would be on the left of the xor, but that's not guaranteed.

Differential Revision: https://reviews.llvm.org/D32474

llvm-svn: 301316
2017-04-25 15:19:04 +00:00
Dylan McKay 8f515b1ef7 [AVR] Support the LDWRdPtr instruction with the same Src+Dst register
llvm-svn: 301313
2017-04-25 15:09:04 +00:00
Andrew Ng 049ed153af Revert "[DebugInfo][X86] Fix handling of DBG_VALUE's in post-RA scheduler."
This reverts commit r301309 which is causing buildbot assertion failures.

llvm-svn: 301312
2017-04-25 14:36:01 +00:00
Andrew Ng 178c369456 [DebugInfo][X86] Fix handling of DBG_VALUE's in post-RA scheduler.
This patch fixes a bug with the updating of DBG_VALUE's in
BreakAntiDependencies. Previously, it would only attempt to update the first
DBG_VALUE following the instruction whose register is being changed,
potentially leaving DBG_VALUE's referring to the wrong register. Now the code
will update all DBG_VALUE's that immediately follow the instruction.

This issue was detected as a result of an optimized codegen difference with
"-g" where an X86 byte/word fixup was not performed due to a DBG_VALUE
referencing the wrong register.

Differential Revision: https://reviews.llvm.org/D31755

llvm-svn: 301309
2017-04-25 13:39:49 +00:00
Simon Pilgrim 7d65b66962 [DAGCombiner] Add vector support for (srl (trunc (srl x, c1)), c2) combine.
llvm-svn: 301305
2017-04-25 12:40:45 +00:00
Andrew Ng 1606fc0bf9 [SimplifyLibCalls] Fix infinite loop with fast-math optimization.
One of the fast-math optimizations is to replace calls to standard double
functions with their float equivalents, e.g. exp -> expf. However, this can
cause infinite loops for the following:

  float expf(float val) { return (float) exp((double) val); }

A similar inline declaration exists in the MinGW-w64 math.h header file which
when compiled with -O2/3 and fast-math generates infinite loops.

So this fix checks that the calling function to the standard double function
that is being replaced does not match the float equivalent.

Differential Revision: https://reviews.llvm.org/D31806

llvm-svn: 301304
2017-04-25 12:36:14 +00:00
Simon Pilgrim ab0446332e [SelectionDAG] Recognise splat vector isKnownToBeAPowerOfTwo one/sign bit shift cases.
llvm-svn: 301303
2017-04-25 12:29:07 +00:00
Sanjoy Das 561247a823 [IVUsers] Don't bail out of normalizing non-affine add recs
Summary:
In a previous change I changed SCEV's normalization / denormalization
to work with non-affine add recs.  So the bailout in IVUsers can be
removed.

Reviewers: atrick, efriedma

Reviewed By: atrick

Subscribers: davide, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D32105

llvm-svn: 301298
2017-04-25 06:53:25 +00:00
Craig Topper d5775617c8 [InstCombine] Add test cases for missing commute handling in ((A ^ C) ^ B) & (B ^ A) -> (B ^ A) & ~C
llvm-svn: 301297
2017-04-25 06:47:49 +00:00
Craig Topper e4d7ac4cb1 [InstCombine] Add test cases showing failures to handle commuted patterns after tricking the operand complexity sorting.
llvm-svn: 301296
2017-04-25 06:22:17 +00:00
Akira Hatanaka 490397fc08 [ObjCARC] Do not sink an objc_retain past a clang.arc.use.
We need to do this to prevent a miscompile which sinks an objc_retain
past an objc_release that releases the object objc_retain retains. This
happens because the top-down and bottom-up traversals each determines
the insert point for retain or release individually without knowing
where the other instruction is moved.

For example, when the following IR is fed to the ARC optimizer, the
top-down traversal decides to insert objc_retain right before
objc_release and the bottom-up traversal decides to insert objc_release
right after clang.arc.use.

(IR before ARC optimizer)
%11 = call i8* @objc_retain(i8* %10)
call void (...) @clang.arc.use(%0* %5)
call void @llvm.dbg.value(...)
call void @objc_release(i8* %6)

This reverses the order of objc_release and objc_retain, which causes
the object to be destructed prematurely.

(IR after ARC optimizer)
call void (...) @clang.arc.use(%0* %5)
call void @objc_release(i8* %6)
call void @llvm.dbg.value(...)
%11 = call i8* @objc_retain(i8* %10)

rdar://problem/30530580

llvm-svn: 301289
2017-04-25 04:06:35 +00:00
Sanjay Patel 6b01b4f5a6 [ARM, x86] add more vector tests for bool math; NFC
I'm proposing a fold for increment-of-sexted-bool in:
https://reviews.llvm.org/D31944
...so we need to know what happens in more cases like these.

llvm-svn: 301269
2017-04-24 22:42:34 +00:00
Sanjay Patel 35c362ebbb [InstSimplify] use ConstantRange to simplify more and-of-icmps
We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases. 
I had to check some of these with Alive to prove to myself it's right, but everything 
seems to check out. Eg, the code in instcombine was completely ignoring predicates with 
mismatched signedness.

Handling or-of-icmps would be a follow-up step.

Differential Revision: https://reviews.llvm.org/D32143

llvm-svn: 301260
2017-04-24 21:52:39 +00:00
Artem Tamazov d6656b945e [AMDGPU][mc][tests][NFC] Bulk ISA tests: update for Gfx7/Gfx8, add for Gfx9.
llvm-svn: 301247
2017-04-24 20:42:27 +00:00
Teresa Johnson b2c390e9f5 Update profile during memory instrinsic optimization
Summary:
Ensure that the new merge BB (which contains the rest of the original BB
after the mem op being optimized) gets a profile frequency, in case
there are additional mem ops later in the BB. Otherwise they get skipped
as the merge BB looks cold.

Reviewers: davidxl, xur

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32447

llvm-svn: 301244
2017-04-24 20:30:42 +00:00
Matt Arsenault 4474652c95 Revert "StructurizeCFG: Directly invert cmp instructions"
This reverts commit r300732. This breaks a few tests.
I think the problem is related to adding more uses of
the condition that don't yet exist at this point.

llvm-svn: 301242
2017-04-24 20:25:01 +00:00
Davide Italiano 0f62eea7ff [LoopUnroll] Don't try to unroll non canonical loops.
The current Loop Unroll implementation works with loops having a
single latch that contains a conditional branch to a block outside
the loop (the other successor is, by defition of latch, the header).
If this precondition doesn't hold, avoid unrolling the loop as
the code is not ready to handle such circumstances.

Differential Revision:  https://reviews.llvm.org/D32261

llvm-svn: 301239
2017-04-24 20:14:11 +00:00
Sanjoy Das 206f65c049 [LIR] Obey non-integral pointer semantics
Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type

Reviewers: haicheng

Reviewed By: haicheng

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32196

llvm-svn: 301238
2017-04-24 20:12:10 +00:00
Matt Arsenault 0774ea267a AMDGPU: Select scratch mubuf offsets when pointer is a constant
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.

llvm-svn: 301230
2017-04-24 19:40:59 +00:00
Stanislav Mekhanoshin bd5394be3d [AMDGPU] Merge M0 initializations
Merges equivalent initializations of M0 and hoists them into a common
dominator block. Technically the same code can be used with any
register, physical or virtual.

Differential Revision: https://reviews.llvm.org/D32279

llvm-svn: 301228
2017-04-24 19:37:54 +00:00
Piotr Padlewski 610c966a4e Handle invariant.group.barrier in BasicAA
Summary:
llvm.invariant.group.barrier returns pointer that mustalias
pointer it takes. It can't be marked with `returned` attribute,
because it would be remove easily. The other reason is that
only Alias Analysis can know about this, because if any other
pass would know it, then the result would be replaced with it's
argument, which would be invalid.

We can think about returned pointer as something that mustalias, but
it doesn't have to be bitwise the same as the argument.

Reviewers: dberlin, chandlerc, hfinkel, sanjoy

Subscribers: reames, nlewycky, rsmith, anna, amharc

Differential Revision: https://reviews.llvm.org/D31585

llvm-svn: 301227
2017-04-24 19:37:17 +00:00
Evgeniy Stepanov 9e536081fe [asan] Let the frontend disable gc-sections optimization for asan globals.
Also extend -asan-globals-live-support flag to all binary formats.

llvm-svn: 301226
2017-04-24 19:34:13 +00:00
Adrian Prantl 083e6a5b5c Don't emit CFI instructions at the end of a function
When functions are terminated by unreachable instructions, the last
instruction might trigger a CFI instruction to be generated. However,
emitting it would be be illegal since the function (and thus the FDE
the CFI is in) has already ended with the previous instruction.

Darwin's dwarfdump --verify --eh-frame complains about this and the
specification supports this.
Relevant bits from the DWARF 5 standard (6.4 Call Frame Information):

"[The] address_range [field in an FDE]: The number of bytes of
 program instructions described by this entry."

"Row creation instructions: [...]
 The new location value is always greater than the current one."
The first quotation implies that a CFI cannot describe a target
address outside of the enclosing FDE's range.

rdar://problem/26244988

Differential Revision: https://reviews.llvm.org/D32246

llvm-svn: 301219
2017-04-24 18:45:59 +00:00
Yaxun Liu fd23a0c095 CodeGen: Add a hook for getFenceOperandTy
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186

llvm-svn: 301215
2017-04-24 18:26:27 +00:00
Evgeniy Stepanov 58ccc0949a Revert "Compute safety information in a much finer granularity."
Use-after-free in llvm::isGuaranteedToExecute.

llvm-svn: 301214
2017-04-24 18:25:07 +00:00
Sanjay Patel 0889225f51 [InstSimplify] move (A & ~B) | (A ^ B) -> (A ^ B) from InstCombine
This is a straight cut and paste, but there's a bigger problem: if this
fold exists for simplifyOr, there should be a DeMorganized version for
simplifyAnd. But more than that, we have a patchwork of ad hoc logic
optimizations in InstCombine. There should be some structure to ensure 
that we're not missing sibling folds across and/or/xor.
 

llvm-svn: 301213
2017-04-24 18:24:36 +00:00
Adrian Prantl f2c7997013 Use DW_OP_stack_value when reconstructing variable values with arithmetic.
When the location description of a source variable involves arithmetic
on the value itself, it needs to be marked with DW_OP_stack_value since it
is not describing the variable's location, but rather its value.

This is a follow-up to r297971 and fixes the source testcase quoted in
the comment in debuginfo-dce.ll.

rdar://problem/30725338

This reapplies r301093 without modifications.

llvm-svn: 301210
2017-04-24 18:11:42 +00:00
Adrian Prantl 283833d022 Add a testcase for DIExpression(DW_OP_stack_value)
and relax the assertion that prohibited its emission.

This fixes the assertion failure uncovered by r301093.

llvm-svn: 301209
2017-04-24 18:11:38 +00:00
Matt Arsenault 3e02538a02 AMDGPU: Move trap lowering to DAG
Fixes traps in any block besides the entry block,
and fixes depending on a live-in physical register
by using a virtual register copy.

Also happens to stop emitting a nop in the case
debug trap is not supported.

llvm-svn: 301206
2017-04-24 17:49:13 +00:00
Zachary Turner da949c1804 [llvm-pdbdump] Merge functionality of graphical and text dumpers.
The *real* difference between these two was that

a) The "graphical" dumper could recurse, while the text one could
   not.
b) The "text" dumper could display nested types and functions,
   while the graphical one could not.

Merge these two so that there is only one dumper that can recurse
arbitrarily deep and optionally display nested types or not.

llvm-svn: 301204
2017-04-24 17:47:52 +00:00
Zachary Turner 1690164cac [llvm-pdbdump] Re-write the record layout code to be more resilient.
This reworks the way virtual bases are handled, and also the way
padding is detected across multiple levels of aggregates, producing
a much more accurate result.

llvm-svn: 301203
2017-04-24 17:47:24 +00:00
Matt Arsenault 02907f3039 InstCombine: Fix assert when reassociating fsub with undef
There is logic to track the expected number of instructions
produced. It thought in this case an instruction would
be necessary to negate the result, but here it folded
into a ConstantExpr fneg when the non-undef value operand
was cancelled out by the second fsub.

I'm not sure why we don't fold constant FP ops with undef currently,
but I think that would also avoid this problem.

llvm-svn: 301199
2017-04-24 17:24:37 +00:00
Nicolai Haehnle 5dea645138 AMDGPU: Move v_readlane lane select from VGPR to SGPR
Summary:
Fix a compiler bug when the lane select happens to end up in a VGPR.

Clarify the semantic of the corresponding intrinsic to be that of
the corresponding GLSL: the lane select must be uniform across a
wave front, otherwise results are undefined.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D32343

llvm-svn: 301197
2017-04-24 17:17:36 +00:00
Xin Tong a266923d57 Compute safety information in a much finer granularity.
Summary:
Instead of keeping a variable indicating whether there are early exits
in the loop.  We keep all the early exits. This improves LICM's ability to
move instructions out of the loop based on is-guaranteed-to-execute.

I am going to update compilation time as well soon.

Reviewers: hfinkel, sanjoy, efriedma, mkuper

Reviewed By: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D32433

llvm-svn: 301196
2017-04-24 17:12:22 +00:00
Nicolai Haehnle 9c66185315 InstCombine/AMDGPU: Fix constant folding of llvm.amdgcn.{icmp,fcmp}
Summary:
The return value of these intrinsics should always have 0 bits for
inactive threads. This means that when all arguments are constant
and the comparison evaluates to true, the intrinsic should return
the current exec mask.

Fixes some GL_ARB_shader_ballot tests.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D32344

llvm-svn: 301195
2017-04-24 17:08:43 +00:00
Igor Breger 87aafa073f [GlobalISel][X86] Lower FormalArgument/Ret using G_MERGE_VALUES/G_UNMERGE_VALUES.
Summary: [GlobalISel][X86] Lower FormalArgument/Ret using G_MERGE_VALUES/G_UNMERGE_VALUES.

Reviewers: zvi, t.p.northover, guyblank

Reviewed By: t.p.northover

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D32288

llvm-svn: 301194
2017-04-24 17:05:52 +00:00
Nicolai Haehnle ef449787d8 AMDGPU: Fix crash when scheduling non-memory SMRD instructions
Summary: Fixes piglit spec/arb_shader_clock/execution/*

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D32345

llvm-svn: 301191
2017-04-24 16:53:52 +00:00
Nirav Dave c799f3a809 [SDAG] Teach Chain Analysis about BaseIndexOffset addressing.
While we use BaseIndexOffset in FindBetterNeighborChains to
appropriately realize they're almost the same address and should be
improved concurrently we do not use it in isAlias using the non-index
understanding FindBaseOffset instead. Adding a BaseIndexOffset check
in isAlias like should allow indexed stores to be merged.

FindBaseOffset to be excised in subsequent patch.

Reviewers: jyknight, aditya_nandakumar, bogner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31987

llvm-svn: 301187
2017-04-24 15:37:20 +00:00
Simon Pilgrim 9111cd950d [X86][AVX] Add scheduling latency/throughput tests for missing AVX1 instructions
Had to split btver2/znver1 checks as only btver2 suppresses zeroupper

llvm-svn: 301181
2017-04-24 14:26:30 +00:00
Jonas Paulsson 1e8648577c [SystemZ] Update kill-flag in splitMove().
EarlierMI needs to clear the kill flag on the first operand in case of a store.

Review: Ulrich Weigand
llvm-svn: 301177
2017-04-24 12:40:28 +00:00
Renato Golin 54c736f833 [DWARF] Move test to x86 directory
llvm-svn: 301176
2017-04-24 12:37:11 +00:00
George Rimar ca53211beb [DWARF] - Take relocations in account when extracting ranges from .debug_ranges
I found this when investigated "Bug 32319 - .gdb_index is broken/incomplete" for LLD.

When we have object file with .debug_ranges section it may be filled with zeroes.
Relocations are exist in file to relocate this zeroes into real values later, but until that
a pair of zeroes is treated as terminator. And DWARF parser thinks there is no ranges at all
when I am trying to collect address ranges for building .gdb_index.

Solution implemented in this patch is to take relocations in account when parsing ranges.

Differential revision: https://reviews.llvm.org/D32228

llvm-svn: 301170
2017-04-24 10:19:45 +00:00
Diana Picus f53865daa4 [ARM] GlobalISel: Legalize s8 and s16 G_(S|U)DIV
We have to widen the operands to 32 bits and then we can either use
hardware division if it is available or lower to a libcall otherwise.

At the moment it is not enough to set the Legalizer action to
WidenScalar, since for libcalls it won't know what to do (it won't be
able to find what size to widen to, because it will find Libcall and not
Legal for 32 bits). To hack around this limitation, we request Custom
lowering, and as part of that we widen first and then we run another
legalizeInstrStep on the widened DIV.

llvm-svn: 301166
2017-04-24 09:12:19 +00:00
Sjoerd Meijer e5b8557d5b [Arch64AsmParser] better diagnostic for isb
Instruction isb takes as an operand either 'sy' or an immediate value. This
improves the diagnostic when the string is not 'sy' and adds a test case for
this which was missing. This also adds tests to check invalid inputs for dsb
and dmb.

Differential Revision: https://reviews.llvm.org/D32227

llvm-svn: 301165
2017-04-24 08:22:20 +00:00
Diana Picus b70e88bdec [ARM] GlobalISel: Support G_(S|U)DIV for s32
Add support for both targets with hardware division and without. For
hardware division we have to add support throughout the pipeline
(legalizer, reg bank select, instruction select). For targets without
hardware division, we only need to mark it as a libcall.

llvm-svn: 301164
2017-04-24 08:20:05 +00:00
Diana Picus 95a8aa93e2 [ARM] GlobalISel: Select G_CONSTANT with CImm operands
When selecting a G_CONSTANT to a MOVi, we need the value to be an Imm
operand. We used to just leave the G_CONSTANT operand unchanged, which
works in some cases (such as the GEP offsets that we create when
referring to stack slots). However, in many other places the G_CONSTANTs
are created with CImm operands. This patch makes sure to handle those as
well, and to error out gracefully if in the end we don't end up with an
Imm operand.

Thanks to Oliver Stannard for reporting this issue.

llvm-svn: 301162
2017-04-24 06:30:56 +00:00
Dean Michael Berris ca780b5a27 [XRay] A tool for Comparing xray function call graphs
Summary:
This is a tool for comparing the function graphs produced by the
llvm-xray graph too. It takes the form of a new subcommand of the
llvm-xray tool 'graph-diff'.

This initial version of the patch is very rough, but it is close to
feature complete.

Depends on D29363

Reviewers: dblaikie, dberris

Reviewed By: dberris

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D29320

llvm-svn: 301160
2017-04-24 05:54:33 +00:00
Sanjoy Das 0cdcdf018e Revert "[SCEV] Enable SCEV verification by default in EXPENSIVE_CHECKS builds"
This reverts commit r301150.  It breaks CodeGen/Hexagon/hwloop-wrap2.ll, reverting
while I investigate.

llvm-svn: 301154
2017-04-24 02:35:19 +00:00
Sanjoy Das 8919303b0a [SCEV] Enable SCEV verification by default in EXPENSIVE_CHECKS builds
llvm-svn: 301150
2017-04-24 00:41:58 +00:00
Sanjoy Das bdbc4938f9 [SCEV] Fix exponential time complexity by caching
llvm-svn: 301149
2017-04-24 00:09:46 +00:00
Xinliang David Li db8d09b6c2 [PartialInine]: add triaging options
There are more bugs (runtime failures) triggered when partial
inlining is turned on. Add options to help triaging problems.

llvm-svn: 301148
2017-04-23 23:39:04 +00:00
Simon Pilgrim 12df01c3c7 [X86][AVX] Add scheduling latency/throughput tests for some AVX1 instructions
More instructions will be added in future commits

llvm-svn: 301145
2017-04-23 22:08:17 +00:00
Sanjay Patel e0c26e0640 [InstCombine] add/move folds for [not]-xor
We handled all of the commuted variants for plain xor already,
although they were scattered around and sometimes folded less
efficiently using distributive laws. We had no folds for not-xor.

Handling all of these patterns consistently is part of trying to 
reinstate:
https://reviews.llvm.org/rL300977

llvm-svn: 301144
2017-04-23 22:00:02 +00:00
Simon Pilgrim 06d6263309 [X86][SSE] Add scheduler class support for SSE42 (PCMPGT) instructions
llvm-svn: 301142
2017-04-23 21:23:27 +00:00
Simon Pilgrim 7d71ed503d [X86][SSE] Add scheduling latency/throughput tests for (most) SSE42 instructions
llvm-svn: 301141
2017-04-23 21:00:25 +00:00
Sanjay Patel afa371fd1d [InstCombine] add tests for not-xor and remove redundant tests; NFC
llvm-svn: 301140
2017-04-23 20:59:00 +00:00
Xin Tong f98602a1ab [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor.
Summary:
In case all predecessor go to a single successor of current BB. We want to fold (not thread).

I failed to update the phi nodes properly in the last patch https://reviews.llvm.org/rL300657.

Phi nodes values are per predecessor in LLVM.

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32400

llvm-svn: 301139
2017-04-23 20:56:29 +00:00
Simon Pilgrim 19a173ac23 [X86][SSE] Add scheduling latency/throughput tests for (most) SSE41 instructions
llvm-svn: 301137
2017-04-23 20:05:21 +00:00
Simon Pilgrim 57fea6879b [X86][SSE] Add missing scheduling latency/throughput test for PINSRW
llvm-svn: 301136
2017-04-23 19:56:49 +00:00
Sanjay Patel 42a84ac710 [InstCombine] add tests for or-to-xor; NFC
llvm-svn: 301131
2017-04-23 16:37:36 +00:00
Sanjay Patel d13b0bfdac [InstCombine] add pattern matches for commuted variants of xor-to-xor
There's probably some better way to write this that eliminates the
code duplication without hurting readability, but at least this
eliminates the logic holes and is hopefully slightly more efficient
than creating new instructions.

llvm-svn: 301129
2017-04-23 16:03:00 +00:00
Sanjay Patel 9081808521 [InstCombine] add tests for xor-to-xor; NFC
Besides missing 2 commuted patterns, the way we handle these folds is inefficient.

llvm-svn: 301128
2017-04-23 14:51:03 +00:00
Simon Pilgrim c781c0f630 [X86][SSE] Add scheduling latency/throughput tests for SSSE3 instructions
llvm-svn: 301127
2017-04-23 14:01:55 +00:00
Simon Pilgrim e8f8422fe5 [X86][SSE] Add scheduling latency/throughput tests for SSE3 instructions
llvm-svn: 301126
2017-04-23 13:59:29 +00:00
Sanjay Patel 794c34dc35 [InstCombine] add tests for add-to-xor commuted variants; NFC
1 out of the 4 tests commuted the operands, so there's an asymmetry
somewhere under this in how we handle these transforms.

llvm-svn: 301125
2017-04-23 13:37:05 +00:00
Ayman Musa 544988f34d [X86] Convert test checks to generated checks of update_llc_test_checks.py. NFC
llvm-svn: 301107
2017-04-23 07:41:40 +00:00
Artyom Skrobov 53cf1897cc [ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs
Summary:
D30400 has enabled tADC and tSBC instructions to be unglued, thereby allowing CPSR to remain live between Thumb1 scheduling units.

Most Thumb1 instructions have an OptionalDef for CPSR; but the scheduler ignored the OptionalDefs, and could unwittingly insert a flag-setting instruction in between an ADDS and the corresponding ADC.

Reviewers: javed.absar, atrick, MatzeB, t.p.northover, jmolloy, rengolin

Reviewed By: javed.absar

Subscribers: rogfer01, efriedma, aemerson, rengolin, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D31081

llvm-svn: 301106
2017-04-23 06:58:08 +00:00
Adrian Prantl 4677205010 Revert "Use DW_OP_stack_value when reconstructing variable values with arithmetic."
This reverts commit r301093 while investigating stage2 bot breakage.

llvm-svn: 301099
2017-04-23 00:44:40 +00:00
Jonathan Roelofs 1233fe5ac3 Fix testcase: s/CHECKNEXT/CHECK-NEXT/
llvm-svn: 301098
2017-04-22 23:43:44 +00:00
Sanjay Patel ceff20fe50 [InstCombine] clean up tests and regenerate checks; NFC
llvm-svn: 301097
2017-04-22 23:36:47 +00:00
Adrian Prantl a2d25ac14a Use DW_OP_stack_value when reconstructing variable values with arithmetic.
When the location description of a source variable involves arithmetic
on the value itself, it needs to be marked with DW_OP_stack_value since it
is not describing the variable's location, but rather its value.

This is a follow-up to r297971 and fixes the source testcase quoted in
the comment in debuginfo-dce.ll.

rdar://problem/30725338

llvm-svn: 301093
2017-04-22 20:54:06 +00:00
Simon Pilgrim f27a714a9e [X86] Regenerate TLS tests
Use the correct check prefix for X86/X32/X64 target types.

llvm-svn: 301092
2017-04-22 20:13:58 +00:00
Daniel Sanders 658541fe69 [globalisel][tablegen] Add support for RegisterOperand.
Summary:
It functions just like RegisterClass except that the class is obtained
from a field.

Depends on D31761.

Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar

Reviewed By: ab

Subscribers: dberris, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D32229

llvm-svn: 301080
2017-04-22 15:53:21 +00:00
Daniel Sanders 2deea1878e [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

llvm-svn: 301079
2017-04-22 15:11:04 +00:00
Daniel Sanders 3016d3c6c9 [globalisel][tablegen] Fix PR32733 by checking which instruction operands belong to.
canMutate() was returning true when the operands were all in the same order as
the matched instruction. However, it wasn't checking the operands were actually
on that instruction. This worked when we could only match a single instruction
but the addition of nested instruction matching led to cases where the operands
could be split across multiple instructions. canMutate() now returns false if
operands belong to instructions other than the root of the match.

llvm-svn: 301077
2017-04-22 14:31:28 +00:00
David Blaikie 5477b97d45 Fix test to handle .rel and .rela sections (& to actually specify the target architecture as X86)
llvm-svn: 301073
2017-04-22 08:17:39 +00:00
David Blaikie 85366acf15 Avoid using relocations for ref_addr in .dwo files
In dwo files the fixed offset can be used - if the dwos are linked into
a dwp, the dwo consumer must use the dwp tables to find out where the
original range of the debug_info was and resolve the "section relative"
value relative to that original range - effectively
avoiding/reimplementing the relocation handling.

llvm-svn: 301072
2017-04-22 07:53:44 +00:00
David Blaikie 6cce69020c Fix test from polluting the source tree
(though this seems like a "does this not crash" test - which isn't very
good. Should be fixed)

llvm-svn: 301071
2017-04-22 07:53:40 +00:00
Artur Pilipenko 0632bdc648 Fix for PR32740 - Invalid floating type, unreachable between r300969 and r301029
The bug was introduced by r301018 "[InstCombine] fadd double (sitofp x), y check that the promotion is valid". The patch didn't expect that fadd can be on vectors not necessarily scalars. Add vector support along with the test.

llvm-svn: 301070
2017-04-22 07:24:52 +00:00
Matt Arsenault 01d17e7c5f LowerSwitch: Fix producing invalid IR on unreachable code
If a switch was in an unreachable block that branched
to a block with a phi, it would leave phis with missing
predecessors.

llvm-svn: 301064
2017-04-21 23:54:12 +00:00
David Blaikie 96b1ed50e8 Move Split DWARF handling to an MC option/command line argument rather than using metadata
Since Split DWARF needs to name the actual .dwo file that is generated,
it can't be known at the time the llvm::Module is produced as it may be
merged with other Modules before the object is generated and that object
may be generated with any name.

By passing the Split DWARF file name when LLVM is producing object code
the .dwo file name in the object file can match correctly.

The support for Split DWARF for implicit modules remains the same -
using metadata to store the dwo name and dwo id so that potentially
multiple skeleton CUs referring to different dwo files can be generated
from one llvm::Module.

llvm-svn: 301062
2017-04-21 23:35:26 +00:00
Matthias Braun d78597ec08 AArch64FrameLowering: Check if the ExtraCSSpill register is actually unused
The code assumed that when saving an additional CSR register
(ExtraCSSpill==true) we would have a free register throughout the
function. This was not true if this CSR register is also used to pass
values as in the swiftself case.

rdar://31451816

llvm-svn: 301057
2017-04-21 22:42:08 +00:00
Adrian Prantl ff384546f5 Add test coverage for mem2reg dbg.declare lowering.
llvm-svn: 301050
2017-04-21 22:13:55 +00:00
Hans Wennborg 9b9a5358dd Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows.  This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.

llvm-svn: 301047
2017-04-21 21:48:41 +00:00
Matt Arsenault c07bda7b87 InferAddressSpaces: Infer for just GEPs
Fixes leaving intermediate flat addressing computations
where a GEP instruction's source is a constant expression.

Still leaves behind a trivial addrspacecast + gep pair that
instcombine is able to handle, which ideally could be folded
here directly.

llvm-svn: 301044
2017-04-21 21:35:04 +00:00
Xinliang David Li 0e9f6df169 [PartialInliner] Partial inliner needs to check use kind before transformation
Differential Revision: https://reviews.llvm.org/D32373

llvm-svn: 301042
2017-04-21 21:20:56 +00:00
Hans Wennborg 04593000d8 Revert r301040 "X86: Don't emit zero-byte functions on Windows"
This broke almost all bots. Reverting while fixing.

llvm-svn: 301041
2017-04-21 21:10:37 +00:00
Hans Wennborg cb3e810714 X86: Don't emit zero-byte functions on Windows
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.

We had a terrific bug due to this in Chromium.

It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.

Differential Revision: https://reviews.llvm.org/D32330

llvm-svn: 301040
2017-04-21 20:58:12 +00:00
Zachary Turner 0fc009b008 Add a dependency from llvm/test to llvm-cvtres.
llvm-svn: 301038
2017-04-21 20:45:11 +00:00
Tim Northover 1efaa3a88f AArch64: add test for "fence singlethread"
Forgot a git add yesterday.

llvm-svn: 301037
2017-04-21 20:36:08 +00:00