Commit Graph

151687 Commits

Author SHA1 Message Date
Matt Arsenault b34635550a AMDGPU: Return correct type during argument lowering
The type needs to be cast back to the original argument type.
Fixes an assert that for some reason is only run when
using -debug.

Includes an additional combine to avoid test regressions
from having conversions mixed with multiple Assert[SZ]ext
nodes. On subtargets where i16 is legal, this was producing an i32
register with an i16 AssertZExt, truncated to i16 with another i8
AssertZExt.

t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
t3: i16 = truncate t2
t5: i16 = AssertZext t3, ValueType:ch:i8
t6: i8 = truncate t5
t7: i32 = zero_extend t6
llvm-svn: 308082
2017-07-15 05:52:59 +00:00
Dinar Temirbulatov 3c64077c82 [SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI.
llvm-svn: 308081
2017-07-15 05:43:54 +00:00
Yonghong Song 9276ef05c8 bpf: generate better lowering code for certain select/setcc instructions
Currently, for code like below,
===
  inner_map = bpf_map_lookup_elem(outer_map, &port_key);
  if (!inner_map) {
    inner_map = &fallback_map;
  }
===
the compiler generates (pseudo) code like the below:
===
  I1: r1 = bpf_map_lookup_elem(outer_map, &port_key);
  I2: r2 = 0
  I3: if (r1 == r2)
  I4:   r6 = &fallback_map
  I5: ...
===

During the kernel verification process, after I1, r1 holds the state
map_ptr_or_null. If the I3 condition is not taken
(path [I1, I2, I3, I5]), r1 should become map_ptr.
Unfortunately, the kernel does not recognize this pattern,
and r1 remains map_ptr_or_null at insn I5. This causes a
verification failure later on.

The kernel, however, is able to recognize the pattern "if (r1 == 0)"
properly and gives a map_ptr state to r1 in the above case.

LLVM here generates suboptimal code which causes kernel verification
failure. This patch fixes the issue by changing BPF insn pattern
matching and lowering to generate proper code if the right-hand
operand of the above condition is a constant. A test case
is also added.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 308080
2017-07-15 05:41:42 +00:00
Teresa Johnson 16798558ba Require asserts in new test that uses debug flag
This should fix bot failures from r308078.

llvm-svn: 308079
2017-07-15 05:27:57 +00:00
Teresa Johnson 82b4fb1afe [ThinLTO] Ensure we always select the same function copy to import
Summary:
Check if the first eligible callee is under the instruction threshold.
Checking this on the first eligible callee ensures that we don't end
up selecting different callees to import when we invoke this routine
with different thresholds due to reaching the callee via paths that
are shallower or hotter (when there are multiple copies, i.e. with
weak or linkonce linkage). We don't want to leave the decision of which
copy to import up to the backend.

Reviewers: mehdi_amini

Subscribers: inglorion, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D35436

llvm-svn: 308078
2017-07-15 04:53:05 +00:00
Haicheng Wu abdef9ee7e [TTI] Refine the cost of EXT in getUserCost()
At present, getUserCost() only checks the src and dst types of EXT to decide
whether it is free. This change first checks the types, then calls
isExtFreeImpl(), and finally checks whether EXT can form an ExtLoad. Currently,
only AArch64 has a customized implementation of isExtFreeImpl() to check whether
EXT can be folded into its use.

Differential Revision: https://reviews.llvm.org/D34458

llvm-svn: 308076
2017-07-15 02:12:16 +00:00
Kostya Serebryany e9838cdcc5 [libFuzzer] remove stale code
llvm-svn: 308075
2017-07-15 01:31:40 +00:00
Jakub Kuderski 663603490b [Dominators] Fix reachable visitation and reenable a unit test
This fixes a minor bug in insertion to a reachable node that caused
DominatorTree.InsertDeleteExhaustive flakiness. The patch also adds
a new testcase for this exact failure.

llvm-svn: 308074
2017-07-15 01:27:16 +00:00
Jakub Kuderski 5996b1c330 [Dominators] Temporarily disable a flaky unit test
The DominatorTree.InsertDeleteExhaustive uses a RNG with a
constant seed to generate different sequences of updates. The test
fails on some buildbots and this patch disables it for now.

llvm-svn: 308070
2017-07-14 23:49:12 +00:00
Justin Bogner c27a70d048 [libFuzzer] Allow non-fuzzer args after -ignore_remaining_args=1
With this change, libFuzzer will ignore any arguments after a sigil
argument, but it will preserve these arguments at the end of the
command line when launching subprocesses. Using this, it's possible to
handle positional and single-dash arguments to the program under test
by discarding everything up to -ignore_remaining_args=1 in
LLVMFuzzerInitialize.
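
A minimal sketch of the idea above, assuming the program under test only
wants argv[0] plus whatever follows the sigil. The helper name
argsForProgramUnderTest and the ProgramArgs variable are hypothetical and
are not part of libFuzzer; only the LLVMFuzzerInitialize hook and the
-ignore_remaining_args=1 flag come from the commit message.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of libFuzzer): collect argv[0] plus every
// argument that follows the -ignore_remaining_args=1 sigil.
std::vector<std::string> argsForProgramUnderTest(int Argc, char **Argv) {
  assert(Argc >= 1 && "argv[0] is expected to be present");
  std::vector<std::string> Args{Argv[0]};
  bool SeenSigil = false;
  for (int I = 1; I < Argc; ++I) {
    if (SeenSigil)
      Args.push_back(Argv[I]); // Belongs to the program under test.
    else if (std::strcmp(Argv[I], "-ignore_remaining_args=1") == 0)
      SeenSigil = true;        // Everything before this is for the fuzzer.
  }
  return Args;
}

// Assumption: the program under test keeps its own arguments here.
static std::vector<std::string> ProgramArgs;

// Optional startup hook that libFuzzer calls with pointers to argc/argv.
// This sketch leaves argc/argv untouched and only copies what it needs.
extern "C" int LLVMFuzzerInitialize(int *Argc, char ***Argv) {
  ProgramArgs = argsForProgramUnderTest(*Argc, *Argv);
  return 0;
}
```

With a command line such as "fuzzer -runs=10 -ignore_remaining_args=1 -x
input.txt", the helper keeps "fuzzer", "-x" and "input.txt".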

llvm-svn: 308069
2017-07-14 23:33:04 +00:00
Adrian Prantl b9a8f7ab1f Add missing space to comment
llvm-svn: 308068
2017-07-14 23:23:58 +00:00
Jakub Kuderski 23497b4758 [Dominators] Remove an extra semicolon and add a missing include.
llvm-svn: 308065
2017-07-14 22:24:15 +00:00
Jakub Kuderski eb59ff22e4 [Dominators] Implement incremental deletions
Summary:
This patch implements incremental edge deletions.

It also makes DominatorTreeBase store a pointer to the parent function. The parent function is needed to perform full rebuilds during some deletions, but it is also used to verify that inserted and deleted edges come from the same function.

Reviewers: dberlin, davide, grosser, sanjoy, brzycki

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35342

llvm-svn: 308062
2017-07-14 21:58:53 +00:00
Kostya Serebryany e6823ced65 [libFuzzer] fix stats during merge
llvm-svn: 308061
2017-07-14 21:48:19 +00:00
Yi Kong 3b680d8d81 [AArch64] Avoid selecting XZR inline ASM memory operand
Restrict the register class to PointerRegClass for memory operands.

Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since
XZR cannot hold a memory pointer while SP can.

Fixes PR33134.

Differential Revision: https://reviews.llvm.org/D34999

llvm-svn: 308060
2017-07-14 21:46:16 +00:00
Geoff Berry b1e8714af9 [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1)
Summary:
This patch is the first step in reducing HW prefetcher instruction tag
collisions in inner loops for Falkor.  It adds a pass that annotates IR
loads with metadata to indicate that they are known to be strided loads,
and adds a target lowering hook that translates this metadata to a
target-specific MachineMemOperand flag.

A follow on change will use this MachineMemOperand flag to re-write
instructions to reduce tag collisions.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34963

llvm-svn: 308059
2017-07-14 21:44:12 +00:00
Jakub Kuderski 2b9b9c87d0 [Dominators] Add a missing include
llvm-svn: 308058
2017-07-14 21:38:15 +00:00
Davide Italiano 6fdfede10d [AMDGPU] Throw away more dead code. NFCI.
llvm-svn: 308055
2017-07-14 21:20:29 +00:00
Jakub Kuderski 13e9ef1716 [Dominators] Implement incremental insertions
Summary:
This patch introduces incremental edge insertions based on the Depth Based Search algorithm.

Insertions should work for both dominators and postdominators.

Reviewers: dberlin, grosser, davide, sanjoy, brzycki

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35341

llvm-svn: 308054
2017-07-14 21:17:33 +00:00
Dimitry Andric e4b97459f1 Fix mixed line terminators. NFC.
llvm-svn: 308052
2017-07-14 21:14:58 +00:00
Geoff Berry f7d5daa0c0 [EarlyCSE] Handle calls with no MemorySSA info.
Summary:
When checking for memory dependencies between calls using MemorySSA,
handle cases where the calls have no MemoryAccess associated with them
because the AA analysis being used has determined that the call does not
read/write memory.

Fixes PR33756

Reviewers: dberlin, davide

Subscribers: mcrosier, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D35317

llvm-svn: 308051
2017-07-14 20:13:21 +00:00
Haicheng Wu 476adcca6b [JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB()
Add the following pattern to TryToUnfoldSelectInCurrBB()

bb:
   %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ...
   %c = cmp %p, 0
   %s = select %c, trueval, falseval

The Select in the above pattern will be unfolded and then jump-threaded. The
current implementation does not allow CMP in the middle of PHI and Select.

Differential Revision: https://reviews.llvm.org/D34762

llvm-svn: 308050
2017-07-14 19:16:47 +00:00
Krzysztof Parzyszek 302a9d41c6 [Hexagon] Replace ISD opcode VPACK with VPACKE/VPACKO, NFC
This breaks up pack-even and pack-odd into two separate operations.

llvm-svn: 308049
2017-07-14 19:02:32 +00:00
Davide Italiano 502ac724ac [AMDGPU] Garbage collect dead code. NFCI.
Unbreaks the build with GCC7.

llvm-svn: 308047
2017-07-14 18:47:29 +00:00
Craig Topper 4fa0cdbb74 [TableGen][MC] Fix a few places where we didn't hide the underlying type of LaneBitmask very well.
One place compared with 32, which I've replaced with LaneBitmask::BitWidth.

The other places are shifts of a constant 1 by a lane number. But if LaneBitmask were to be a larger type than 32 bits, such as 64 bits, the 1 would need to be 1ULL to do a 64-bit shift. To hide this I've added a LaneBitmask::getLane that hides the shift and makes sure the 1 is cast to the correct type first.
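
The shift-width issue can be sketched as follows. LaneMaskSketch is a
hypothetical stand-in, not LLVM's LaneBitmask class; the 64-bit underlying
type is an assumption chosen to show why the constant 1 has to be widened
before shifting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a lane mask wrapper; not LLVM's actual class.
struct LaneMaskSketch {
  using Type = std::uint64_t; // Assumed wider-than-32-bit underlying type.
  static constexpr unsigned BitWidth = 8 * sizeof(Type);

  Type Mask;

  explicit LaneMaskSketch(Type M) : Mask(M) {}

  // Build a mask with a single lane set. Converting 1 to Type first makes
  // the shift happen at the full width of the underlying type, so callers
  // never need to know whether a plain 1, 1U or 1ULL is required.
  static LaneMaskSketch getLane(unsigned Lane) {
    assert(Lane < BitWidth && "lane index out of range");
    return LaneMaskSketch(Type(1) << Lane);
  }
};
```

Callers write LaneMaskSketch::getLane(40) instead of an open-coded shift,
and compare against LaneMaskSketch::BitWidth instead of a literal 32.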

llvm-svn: 308042
2017-07-14 18:30:09 +00:00
Jakub Kuderski b292c22c8d [Dominators] Make IsPostDominator a template parameter
Summary:
DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere.

This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to accidentally assign a postdominator tree to a dominator tree (and vice versa), and further simplifies template code in GenericDominatorTreeConstruction.
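
A rough sketch of the design, under the assumption that a tree only needs to
remember a root. TreeSketch, Direction and walkDirection are hypothetical
names for illustration and do not mirror the real DominatorTreeBase
interface.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tree type: the dominator/postdominator distinction is part
// of the type instead of a runtime bool member.
template <bool IsPostDom> class TreeSketch {
  int Root = -1;

public:
  static constexpr bool IsPostDominator = IsPostDom;

  void setRoot(int Id) {
    assert(Id >= 0 && "root id must be non-negative");
    Root = Id;
  }
  int getRoot() const { return Root; }
};

using DomTreeSketch = TreeSketch<false>;
using PostDomTreeSketch = TreeSketch<true>;

enum class Direction { Successors, Predecessors };

// Selected at compile time from the template argument: a postdominator
// tree is built over the reversed graph, so it follows predecessor edges.
template <bool IsPostDom>
Direction walkDirection(const TreeSketch<IsPostDom> &) {
  return IsPostDom ? Direction::Predecessors : Direction::Successors;
}

// The two instantiations are unrelated types, so mixing them up is a
// compile error instead of a runtime surprise.
static_assert(
    !std::is_assignable<DomTreeSketch &, const PostDomTreeSketch &>::value,
    "a postdominator tree must not be assignable to a dominator tree");
```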

Reviewers: dberlin, sanjoy, davide, grosser

Reviewed By: dberlin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35315

llvm-svn: 308040
2017-07-14 18:26:09 +00:00
Alfred Huang 5b27072f57 [AMDGPU] Do not insert an instruction into worklist twice in movetovalu
moveToVALU() moves instructions to the vector ALU and visits all
instructions in the use chain. We do not want the same node to be
pushed to the visit worklist more than once.
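
One way to get this behavior is a worklist that remembers which entries are
currently queued. The sketch below uses only the standard library and
integer ids in place of instructions; UniqueWorklist is a hypothetical
name, and whether the patch uses this exact structure is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical worklist that refuses to queue an id that is already queued.
class UniqueWorklist {
  std::vector<int> Items;        // Visit order (LIFO).
  std::unordered_set<int> Queued; // Ids currently present in Items.

public:
  // Returns true if the id was added, false if it was already queued.
  bool push(int Id) {
    if (!Queued.insert(Id).second)
      return false;
    Items.push_back(Id);
    return true;
  }

  // Removes and returns the most recently pushed id.
  int pop() {
    assert(!Items.empty() && "pop from empty worklist");
    int Id = Items.back();
    Items.pop_back();
    Queued.erase(Id);
    return Id;
  }

  bool empty() const { return Items.empty(); }
  std::size_t size() const { return Items.size(); }
};
```

In this sketch an id can be queued again after it has been popped; a
variant that never revisits would keep the id in the set permanently.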

Differential Revision: https://reviews.llvm.org/D34726

llvm-svn: 308039
2017-07-14 17:56:55 +00:00
Jakub Kuderski e1c46554a2 [Dominators] Simplify block and node printing
Summary:
This patch adds `BlockPrinter`-- a small wrapper for printing CFG nodes and DomTree nodes to `raw_ostream`. It is meant to be only used internally, for debugging and printing errors.

Reviewers: dberlin, sanjoy, grosser, davide

Reviewed By: grosser, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35286

llvm-svn: 308036
2017-07-14 16:56:35 +00:00
George Rimar efd3ffb2b6 [llvm-readobj] - Teach readobj to print DT_FILTER dynamic tag in human readable form.
Nothing special here, output format is similar to the format
used by binutils readelf and ELF Tool Chain readelf.

Differential revision: https://reviews.llvm.org/D35351

llvm-svn: 308033
2017-07-14 16:00:16 +00:00
Krzysztof Parzyszek 9c084fc55d [Hexagon] Add intrinsics for data cache operations
This is the LLVM part, adding definitions for
  void @llvm.hexagon.Y2.dccleana(i8*)
  void @llvm.hexagon.Y2.dccleaninva(i8*)
  void @llvm.hexagon.Y2.dcinva(i8*)
  void @llvm.hexagon.Y2.dczeroa(i8*)
  void @llvm.hexagon.Y4.l2fetch(i8*, i32)
  void @llvm.hexagon.Y5.l2fetch(i8*, i64)
The clang part will follow.

llvm-svn: 308032
2017-07-14 15:58:48 +00:00
Sanjay Patel 3f4db3ea97 [InstCombine] convert bitwise (in)equality checks to logical ops (PR32401)
As discussed in:
https://bugs.llvm.org/show_bug.cgi?id=32401

we have a backend transform to undo this:
https://reviews.llvm.org/rL299542

when it's likely that the xor version leads to better codegen, but we want 
this form in IR for better analysis and simplification potential.

llvm-svn: 308031
2017-07-14 15:09:49 +00:00
Simon Dardis 45b2277a33 Revert "Reland "[mips][mt][6/7] Add support for mftr, mttr instructions.""
FileCheck is crashing on the input file, so reverting again while
I investigate.

This reverts r308023.

llvm-svn: 308030
2017-07-14 15:08:05 +00:00
Sanjay Patel 22abfdfe47 [InstCombine] add tests for PR32401; NFC
Also, add comments to a couple of tests that could be moved out of instcombine.

llvm-svn: 308029
2017-07-14 14:43:28 +00:00
Jonas Paulsson b144af49c1 [SystemZ] Minor fixing in SystemZScheduleZ196.td
Some minor corrections for the recently added instructions.

Review: Ulrich Weigand
llvm-svn: 308028
2017-07-14 14:30:46 +00:00
Sanjay Patel 0439d76497 [InstCombine] auto-generate complete test checks; NFC
llvm-svn: 308027
2017-07-14 14:29:11 +00:00
Nirav Dave a8f63af9d1 Improve Aliasing of operations to static alloca
Recommitting after adding a check to avoid miscomputing alias information
on addresses of the same base but different subindices.

Memory accesses offset from frame indices may alias, e.g., we
may merge writes from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocas are realized as stack objects in SelectionDAG, but their
offsets are not set until post-DAG, causing DAGCombiner's alias check to
frequently consider accesses to static allocas as aliasing. Modify isAlias
to consider access between static allocas and access from other frame
objects as aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation even though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 308025
2017-07-14 13:56:21 +00:00
Jonas Paulsson 89ca10de33 [SystemZ] Enable LoopDataPrefetch pass.
Loop data prefetching has shown some improvements on benchmarks, and is
enabled at -O1 and above.

Review: Ulrich Weigand
llvm-svn: 308024
2017-07-14 13:52:38 +00:00
Simon Dardis b3529841db Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.

For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.

Reviewers: slthakur, atanasyan

Differential Revision: https://reviews.llvm.org/D35253

The last version of this patch broke one of the expensive checks buildbots;
this version changes the failing test/MC/Mips/mt/invalid.s and other invalid
tests to write the errors to a file and run FileCheck on that, rather than
relying on the 'not llvm-mc ... <%s 2>&1 | FileCheck %s' idiom.

Hopefully this will satisfy the buildbot.

llvm-svn: 308023
2017-07-14 13:44:12 +00:00
Zoran Jovanovic 0e03935182 Reverting commit 308011.
llvm-svn: 308017
2017-07-14 10:52:22 +00:00
Zoran Jovanovic d374c5993b [mips][microMIPS] Extending size reduction pass with ADDIUSP and ADDIUR1SP
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
ADDIU instruction is transformed into 16-bit instruction ADDIUSP
ADDIU instruction is transformed into 16-bit instruction ADDIUR1SP
Function InRange is changed to avoid left shifting of negative values, since 
that caused some sanitizer tests to fail (so the previous patch 
Differential Revision: https://reviews.llvm.org/D34511

llvm-svn: 308011
2017-07-14 10:13:11 +00:00
Diana Picus 87a7067983 [ARM] GlobalISel: Support G_BRCOND
Insert a TSTri to set the flags and a Bcc to branch based on their
values. This is a bit inefficient in the (common) cases where the
condition for the branch comes from a compare right before the branch,
since we set the flags both as part of the compare lowering and as part
of the branch lowering. We're going to live with that until we settle on
a principled way to handle this kind of situation, which occurs with
other patterns as well (combines might be the way forward here).

llvm-svn: 308009
2017-07-14 09:46:06 +00:00
Jonas Paulsson a84f9f5364 [SystemZ] Minor fixing in SystemZScheduleZEC12.td
Some minor corrections for the recently added instructions.

Review: Ulrich Weigand
llvm-svn: 308007
2017-07-14 09:18:18 +00:00
Renato Golin d806b49899 [RelTest] Diana is doing both releases now
llvm-svn: 308006
2017-07-14 08:33:52 +00:00
Sam Parker 2893448576 [ARM] Allow rematerialization of ARM Thumb literal pool loads
Constants are crucial for code size in the ARM Thumb-1 instruction
set. The 16 bit instruction size often does not offer enough space
for immediate arguments. This means that additional instructions are
frequently used to load constants into registers. Since constants are
hoisted, this can lead to significant register spillage if they are
used multiple times in a single function. This can be avoided by
rematerialization, i.e. recomputing a constant instead of reloading
it from the stack. This patch fixes the rematerialization of literal
pool loads in the ARM Thumb instruction set.

Patch by Philip Ginsbach

Differential Revision: https://reviews.llvm.org/D33936

llvm-svn: 308004
2017-07-14 08:23:56 +00:00
Max Kazantsev f80ffa1a78 [IRCE] Fix corner case with Start = INT_MAX
When iterating through the loop

  for (int i = INT_MAX; i > 0; i--)

we fail to generate the pre-loop for it. It happens because we use the
overflowed value in a comparison predicate when identifying whether or not
we need it.

In the old logic, we used the SLE predicate against the Greatest value, which
exceeds all seen values of the IV and might have overflowed. Now we use the
GreatestSeen value of this IV with the SLT predicate.
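
The difference between the two checks can be illustrated on plain 32-bit
integers. This is only a sketch under assumptions: it takes Greatest to be
GreatestSeen + 1 computed with wraparound, and HighLimit, wrappingAddOne,
oldStyleCheck and newStyleCheck are hypothetical names, not the pass's
actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Two's-complement increment that wraps around, written without relying on
// signed overflow (which would be undefined behavior in C++).
std::int32_t wrappingAddOne(std::int32_t X) {
  std::uint32_t U = static_cast<std::uint32_t>(X) + 1u;
  if (U <= static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max()))
    return static_cast<std::int32_t>(U);
  return static_cast<std::int32_t>(static_cast<std::int64_t>(U) - 4294967296LL);
}

// Old style (assumed): Greatest = GreatestSeen + 1, compared with SLE.
// When GreatestSeen is INT_MAX, Greatest wraps to INT_MIN and the check
// succeeds for every HighLimit.
bool oldStyleCheck(std::int32_t GreatestSeen, std::int32_t HighLimit) {
  return wrappingAddOne(GreatestSeen) <= HighLimit;
}

// New style: compare GreatestSeen itself with SLT, so nothing can wrap.
bool newStyleCheck(std::int32_t GreatestSeen, std::int32_t HighLimit) {
  return GreatestSeen < HighLimit;
}
```

Away from INT_MAX the two checks agree; at INT_MAX only the SLT form gives
the answer that matches the mathematical comparison.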

Also added a test that ensures that a pre-loop is generated for such loops.

Differential Revision: https://reviews.llvm.org/D35347

llvm-svn: 308001
2017-07-14 06:35:03 +00:00
Adam Nemet c5bcc587ae [opt-viewer] Flush stdout after progress update
Without this, there was no progress shown during parsing but only during
rendering on macOS.

llvm-svn: 308000
2017-07-14 04:54:26 +00:00
Eric Christopher 4e332c7cf1 Add a set of comments explaining why getSubtargetImpl() is deleted on these targets.
llvm-svn: 307999
2017-07-14 04:33:43 +00:00
Dinar Temirbulatov 21599fe2de [SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI.
llvm-svn: 307996
2017-07-14 03:48:29 +00:00
Eric Christopher f73870eefa Remove set but not used variables from the debug info verifier code.
llvm-svn: 307987
2017-07-14 01:40:47 +00:00
Leo Li 7641d962da [CMake]Use LLVM_LIBRARY_DIR for lib path.
Summary:
This makes sure the correct lib path is being used when `CMAKE_CFG_INTDIR` or
`LLVM_LIBDIR_SUFFIX` is set.

Reviewers: beanz

Subscribers: mgorny, srhines, pirama, llvm-commits

Differential Revision: https://reviews.llvm.org/D35318

llvm-svn: 307985
2017-07-14 00:35:21 +00:00