354517 Commits

Author SHA1 Message Date
Roman Lebedev fde8eb00e1
[InstCombine] visitMaskedMerge(): when unfolding, sanitize undef constants (PR45955)
We can't leave undef vector element constants as-is,
it is a miscompile, so we need to sanitize them.

We have two vectors (C and ~C):
* We can't replace undef with 0 in both of them
* We can't replace undef with 0 in only one of them
* We could replace undef with -1 in both of them
* We could replace undef with -1 in only one(!) of them
* We could replace undef with -1 in one and 0 in another one of them.

Therefore, it seems best to go with the last option, since otherwise
we'd lose knowledge that C and ~C have no common bits set,
which seems more important than preserving partial undef knowledge.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45955
2020-05-17 22:53:03 +03:00
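The reasoning in fde8eb00e1 can be sketched on a single 32-bit lane. This is a toy model, not the InstCombine code: `MaskLane`, `sanitize_lane`, `merge_folded` and `merge_unfolded` are hypothetical names, and the choice of which side receives 0 and which receives -1 for an undef lane is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// One lane of the mask constant C. `is_undef` models an undef element;
// `value` is ignored in that case.
struct MaskLane {
  bool is_undef;
  std::uint32_t value;
};

// Choose concrete lane values for C and ~C. For an undef lane this sketch
// picks C = 0 and ~C = -1 (assumption: the commit only says "-1 in one and
// 0 in another", not which side gets which).
std::pair<std::uint32_t, std::uint32_t> sanitize_lane(MaskLane lane) {
  const std::uint32_t c = lane.is_undef ? 0u : lane.value;
  const std::uint32_t not_c = ~c;
  assert((c & not_c) == 0u);  // C and ~C share no bits
  return std::make_pair(c, not_c);
}

// Folded form of the masked merge: ((x ^ y) & C) ^ y.
std::uint32_t merge_folded(std::uint32_t x, std::uint32_t y, std::uint32_t c) {
  return ((x ^ y) & c) ^ y;
}

// Unfolded form: (x & C) | (y & ~C), built from the sanitized lane pair.
std::uint32_t merge_unfolded(std::uint32_t x, std::uint32_t y, MaskLane lane) {
  const std::pair<std::uint32_t, std::uint32_t> cs = sanitize_lane(lane);
  return (x & cs.first) | (y & cs.second);
}
```

Because both lane values come from one sanitized choice, the two masks stay complementary, which is the property the commit message wants to keep.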
David Blaikie a055e3856f DebugInfo: Reduce long-distance dependence on what will/won't emit a debug_addr section
This is a no-op/NFC at the moment & generally makes the code /somewhat/
cleaner/less reliant on assumptions about what will produce a debug_addr
section.

It's still a bit "spooky action at a distance" - the add ranges code
pre-emptively inserts addresses into the address pool it knows will
eventually be used by the range emission code (or low/high pc).

The 'ideal' would be either to actually compute the addresses needed for
range (& loc) emission earlier - which would mean decanonicalizing the
range/loc representation earlier to account for whether it was going to
use addrx encodings or not (which would be unfortunate, but could be
refactored to be relatively unobtrusive).

Alternatively, emitting the range/loc sections earlier would cause them
to request the needed addresses sooner - but then you end up having to
split finalizeModuleInfo because some things need to be handled there
before the ranges/locs are emitted, I think...
2020-05-17 12:45:56 -07:00
Nikita Popov 39beeeff20 [LVI] Don't use dominator tree in isValidAssumeForContext()
LVI and its consumers currently have quite a bit of complexity
related to dominator tree management. However, it doesn't look
like it is actually needed...

The only use of the dominator tree is inside isValidAssumeForContext().
However, due to the way LVI queries work, it is not needed:
If we query a value for some block, we will first get the edge values
from all predecessor blocks, which also includes an intersection with
assumptions that apply to the terminator of the predecessor. As such,
we will already have processed all assumptions from predecessor blocks
(this is actually stronger than what isValidAssumeForContext() does
with a DT, because this is capable of combining non-dominating
assumptions). The only additional assumptions we need to take into
account are those in the block being queried. And we don't need a
dominator tree for that.

This patch only removes the use of DT, I will drop the machinery
around it in a followup.

Differential Revision: https://reviews.llvm.org/D76797
2020-05-17 21:39:35 +02:00
Fedor Sergeev a39faacca1 Add missing include Host.h in llvm-mc-* fuzzers. NFC.
Fixes build failure in these fuzzers.
2020-05-18 02:21:22 +07:00
Nathan James 74bcb00e00 [ASTMatchers] Added BinaryOperator hasOperands matcher
Summary: Adds a matcher called `hasOperands` for `BinaryOperator`s, for when you need to match both sides but the order isn't important, usually on commutative operators.

Reviewers: klimek, aaron.ballman, gribozavr2, alexfh

Reviewed By: aaron.ballman

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80054
2020-05-17 19:54:14 +01:00
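The order-insensitive matching described in 74bcb00e00 can be illustrated without Clang. The sketch below is not the ASTMatchers API: `BinaryOp`, `OperandMatcher` and `has_operands` are hypothetical stand-ins that only show the "either order" logic.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a binary operator node: just two operand values.
struct BinaryOp {
  int lhs;
  int rhs;
};

using OperandMatcher = std::function<bool(int)>;

// Succeeds if the two operand matchers match the two operands in either
// order: (m1 on lhs, m2 on rhs) or (m1 on rhs, m2 on lhs).
bool has_operands(const BinaryOp& op, const OperandMatcher& m1,
                  const OperandMatcher& m2) {
  assert(m1 && m2);  // both matchers must be callable
  return (m1(op.lhs) && m2(op.rhs)) || (m1(op.rhs) && m2(op.lhs));
}
```

With a matcher for 0 and a matcher for 7, both `{0, 7}` and `{7, 0}` match, while `{7, 7}` does not.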
JinGu Kang 8120562ba6 test commit 2020-05-17 19:49:37 +01:00
Simon Pilgrim 090cf4591f Revert rGca18ce1a00cd8b7cb7ce0e130440f5ae1ffe86ee "GlobPattern.h - remove unnecessary BitVector.h/StringRef.h includes. NFC"
Causes lld build errors
2020-05-17 18:51:21 +01:00
Simon Pilgrim ca18ce1a00 GlobPattern.h - remove unnecessary BitVector.h/StringRef.h includes. NFC
Use forward declarations (BitVector already had one) and add headers to source files that were implicitly using them.
2020-05-17 18:29:41 +01:00
Simon Pilgrim 897e926bb0 ImmutableGraph.h - remove unused raw_ostream.h include. NFC 2020-05-17 18:29:41 +01:00
Fangrui Song 02cdbc349f [XRay] Migrate xray_naive_log=true tests to xray_mode=xray-basic 2020-05-17 09:32:52 -07:00
Sanjay Patel 57c3fe76a3 [x86] favor vector constant load to avoid GPR to XMM transfer
This build vector lowering pattern came up in D79886.
I've tried to limit the improvement to cases where it looks
clearly better to load, but we could remove the 'TODO'
predicates already if we are willing to overlook some
corner cases.

Differential Revision: https://reviews.llvm.org/D80013
2020-05-17 11:56:26 -04:00
Sanjay Patel 130a2356ae [InstCombine] add tests for FP cast of cast; NFC
A fold of casts is proposed as a backend transform in D79187,
but we can also do that in IR (and that may obsolete the need
for a backend transform).
2020-05-17 11:42:07 -04:00
Xing GUO 42011fb1c8 [ObjectYAML][DWARF] Take into account other debug sections in DWARFYAML::Data::isEmpty(). 2020-05-17 22:53:27 +08:00
Dylan McKay ede6005e70 [AVR] Explicitly set the address of the data section when invoking the linker
This is required to get avr-gdb correctly showing values at the right
addresses. This problem was discovered by using debug symbols in an
external program to look up values in an AVR simulator.
2020-05-18 02:24:51 +12:00
Nicolas Vasilache 1d6eb09d22 [mlir] NFC - VectorTransforms use OpBuilder where relevant
Summary: This will allow using unrolling outside of only rewrite patterns.

Differential Revision: https://reviews.llvm.org/D80083
2020-05-17 10:17:12 -04:00
Simon Pilgrim 6f02633a4f [X86] Add getTargetConstantFromBasePtr helper. NFC.
Allows us to share code from LoadSDNode and MemIntrinsicSDNode constant pool loads.
2020-05-17 14:58:31 +01:00
Simon Pilgrim 9aca5b68ee [X86] getTargetConstantBitsFromNode - remove unnecessary X86ISD::VBROADCAST handling.
We create X86ISD::VBROADCAST_LOAD for constant pool folds now.
2020-05-17 14:58:30 +01:00
Sanjay Patel bfd512160f [InstCombine] improve analysis of FP->int->FP to eliminate fpextend
This was originally in D79116.
Converting from a narrow-enough FP source value to integer and
back to FP guarantees that the conversion to FP is exact because
of UB/poison-on-overflow.

This was suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
2020-05-17 09:06:57 -04:00
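The exactness claim in bfd512160f can be checked in plain C++ for the float/int32 case. This is a sketch under stated assumptions, not the InstCombine analysis: `int_round_trip_is_exact` is a hypothetical helper, and the input must be finite and in int32 range because the conversion is undefined otherwise.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// float -> int32 -> float: the integer comes from truncating a float, so it
// has at most 24 significant bits and converts back to float without
// rounding. Widening that float to double then agrees with converting the
// integer to double directly.
bool int_round_trip_is_exact(float x) {
  // Precondition: finite and inside int32 range (out of range is UB in C++).
  assert(std::isfinite(x) && std::fabs(x) < 2147483648.0f);
  const std::int32_t i = static_cast<std::int32_t>(x);  // truncates toward zero
  const float back = static_cast<float>(i);
  const double widened = static_cast<double>(back);
  const double direct = static_cast<double>(i);
  return back == std::trunc(x) && widened == direct;
}
```

An arbitrary int32 such as 16777217 does not have this property, since it needs 25 significant bits. The guarantee comes from the integer having been produced by the FP-to-int conversion.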
Florian Hahn b54a663312 [LoopUnroll] Extend test case with additional loop with larger TC. 2020-05-17 13:55:11 +01:00
Florian Hahn 9e2a99e5b7 [LoopUnroll] Precommit test for PR459393. 2020-05-17 13:29:36 +01:00
Christudasan Devadasan 7c4e711ef8 [AMDGPU] Enable base pointer.
When the callee requires a dynamic stack realignment,
it is not possible to correctly access the incoming
stack arguments using the stack pointer. We reserve a
base pointer in such cases to access the function arguments
inside the callee. The base pointer will hold the incoming
stack pointer value before any kind of delta is added to it.

Reviewed By: arsenm, scott.linder

Differential Revision: https://reviews.llvm.org/D78811
2020-05-17 16:13:55 +05:30
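The need for a base pointer in 7c4e711ef8 can be seen from the address arithmetic alone. The sketch below is a generic toy model, not AMDGPU frame lowering: `Frame`, `align_up`, `enter_with_realignment` and `incoming_arg_address` are hypothetical, and the layout (arguments at fixed offsets from the incoming stack pointer, realignment rounding upward) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
std::uint64_t align_up(std::uint64_t addr, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

struct Frame {
  std::uint64_t base_ptr;   // incoming stack pointer, saved before realignment
  std::uint64_t stack_ptr;  // stack pointer after dynamic realignment
};

// On entry, remember the incoming SP and then realign the working SP.
Frame enter_with_realignment(std::uint64_t incoming_sp, std::uint64_t align) {
  return Frame{incoming_sp, align_up(incoming_sp, align)};
}

// Incoming arguments sit at fixed offsets from the *incoming* SP, so they
// are addressed through the base pointer, not the realigned stack pointer.
std::uint64_t incoming_arg_address(const Frame& f, std::uint64_t arg_offset) {
  return f.base_ptr + arg_offset;
}
```

Two calls entering with stack pointers 0x1004 and 0x1010 both realign to 0x1040 for a 64-byte alignment, so the distance from the realigned pointer back to the arguments is only known at run time.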
Joachim Protze d23131a3c0 [OpenMP] Fix race condition in the completion/freeing of detached tasks
Spurious assertion failures are symptoms of a race condition for the handling
of detached tasks:
Assertion failure at kmp_tasking.cpp(3744): taskdata->td_flags.complete == 1.
Assertion failure at kmp_tasking.cpp(710): taskdata->td_flags.executing == 0.

In the case of detach=true, all accesses to taskdata in __kmp_task_finish need
to happen before (~line 873):

taskdata->td_flags.proxy = TASK_PROXY;

This assignment signals to __kmp_fulfill_event that the task will need to be
freed there. So, conceptually, the ownership of taskdata is moved.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D79702
2020-05-17 12:28:38 +02:00
Fedor Sergeev f93a6aaebc [Inliner][NFC] silence gcc 'overloaded-virtual' warning on hiding of Pass::doInitialization
When compiling with -Werror=overloaded-virtual, gcc emits this:
====
llvm/include/llvm/Pass.h:102:16: error: ‘virtual bool llvm::Pass::doInitialization(llvm::Module&)’ was hidden [-Werror=overloaded-virtual]
   virtual bool doInitialization(Module &)  { return false; }
                ^~~~~~~~~~~~~~~~
In file included from llvm/lib/Transforms/IPO/Inliner.cpp:20:0:
llvm/include/llvm/Transforms/IPO/Inliner.h:38:8: error:   by ‘virtual bool llvm::LegacyInlinerBase::doInitialization(llvm::CallGraph&)’ [-Werror=overloaded-virtual]
   bool doInitialization(CallGraph &CG) override;
        ^~~~~~~~~~~~~~~~
====

This is an old issue which has just started biting our downstream after
a slight rearrangement of includes around Inliner.
Fixing it similarly to how doFinalization was done years ago.
2020-05-17 16:31:33 +07:00
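The warning quoted in f93a6aaebc is easy to reproduce in isolation. The sketch below uses hypothetical stand-in types (`Module`, `CallGraph`, `Pass`, `InlinerLike`) and not the LLVM headers. The commit text does not show the fix itself, so the using-declaration here is an assumption about one common way to silence the warning.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::Module and llvm::CallGraph.
struct Module {};
struct CallGraph {};

struct Pass {
  virtual ~Pass() = default;
  virtual bool doInitialization(Module&) { return false; }
};

struct InlinerLike : Pass {
  // Without this using-declaration, the overload below hides
  // Pass::doInitialization(Module&) and gcc's -Woverloaded-virtual fires.
  using Pass::doInitialization;
  bool doInitialization(CallGraph&) { return true; }
};

// Both overloads remain callable on the derived type.
void check_both_overloads() {
  InlinerLike pass;
  Module module;
  CallGraph graph;
  assert(!pass.doInitialization(module));  // base-class overload
  assert(pass.doInitialization(graph));    // derived-class overload
}
```

Removing the `using` line makes the call with a `Module` fail to compile, which is the hiding that the diagnostic describes.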
Dylan McKay 1335737ee1 [LLVM][AVR] Support for R_AVR_6 fixup
Summary: Handle the emission of `R_AVR_6` ELF relocation type.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: hiraditya, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78721

Patch by @LemonBoy https://reviews.llvm.org/p/LemonBoy/
2020-05-17 19:46:09 +12:00
Dylan McKay 1420f4efbe [AVR] Fix I/O instructions on XMEGA
Summary:
On XMEGA, the I/O address space is the same as the data address space - there is no 0x20 offset,
because CPU General Purpose Registers are not mapped in data address space.

From https://en.wikipedia.org/wiki/AVR_microcontrollers
> In the XMEGA variant, the working register file is not mapped into the data address space; as such, it is not possible to treat any of the XMEGA's working registers as though they were SRAM. Instead, the I/O registers are mapped into the data address space starting at the very beginning of the address space.

Reviewers: dylanmckay

Reviewed By: dylanmckay

Subscribers: hiraditya, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77207

Patch by Vlastimil Labsky.
2020-05-17 19:46:09 +12:00
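The address mapping described in 1420f4efbe amounts to a conditional offset. A minimal sketch, assuming a hypothetical helper `io_to_data_address` and a boolean `is_xmega` flag (neither is taken from the AVR backend):

```cpp
#include <cassert>
#include <cstdint>

// Translate an I/O-space address into a data-space address.
// Classic AVR maps the 32 working registers at the start of the data space,
// so I/O registers start at 0x20; on XMEGA the I/O registers start at 0.
std::uint16_t io_to_data_address(std::uint16_t io_addr, bool is_xmega) {
  assert(io_addr < 0x40);  // IN/OUT encode a 6-bit I/O address
  const std::uint16_t offset = is_xmega ? 0x00 : 0x20;
  return static_cast<std::uint16_t>(io_addr + offset);
}
```

For example, I/O address 0x3F maps to data address 0x5F with the offset and stays 0x3F without it.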
Fangrui Song 3841ed4104 [Driver] Render -T for Gnu.cpp
clang -T a.lds a.c currently does not render -T.
2020-05-16 23:54:31 -07:00
Stephen Neuendorffer efa70843aa [MLIR][cmake] use LINK_LIBS PUBLIC for MLIRStandardOpsTransforms
Without this LLVM_LINK_LLVM_DYLIB is broken

Differential Revision: https://reviews.llvm.org/D80074
2020-05-16 22:45:14 -07:00
Fangrui Song 3dbbbcc80e [llvm-xray] consumeError when trying big-endian
Follow-up of rL341226.

Fixes "Expected<T> must be checked before access or destruction"
2020-05-16 22:44:48 -07:00
Arthur Eubanks 8092c8fec0 [NFC] Run clang-format on ISDOpcodes.h
Subscribers: jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80050
2020-05-16 22:33:00 -07:00
Yi Kong 2fe66bdb2e [Compiler-rt] Emit error if builtins library cannot be found
Since setting COMPILER_RT_USE_BUILTINS_LIBRARY removes the -z,defs
flag, a build with a missing builtins library would continue unnoticed.
Explicitly emit an error in that case.

Differential Revision: https://reviews.llvm.org/D79470
2020-05-17 10:54:53 +08:00
Nico Weber 3735505e4f Fix a few doc typos to cycle bots. 2020-05-16 20:38:28 -04:00
Nico Weber bc98dc12d8 Try to heal bots after https://reviews.llvm.org/D79655 2020-05-16 20:32:58 -04:00
Craig Topper 796ae8cf82 [LegalizeDAG] Use MachinePointerInfo::getUnknownStack in place of MachinePointerInfo() in a couple places. NFC
We know the pointer is somewhere on the stack, we just don't know
exactly where since the index may be variable.

Differential Revision: https://reviews.llvm.org/D80060
2020-05-16 15:48:16 -07:00
Eli Friedman 4f04db4b54 AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968.  Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044
2020-05-16 14:53:16 -07:00
Craig Topper 135b877874 [X86] Replace selectScalarSSELoad ComplexPattern with PatFrags to handle the 3 types of loads we currently match.
This ensures we create mem operands for these instructions fixing PR45949.

Unfortunately, it increases the size of X86GenDAGISel.inc, but some dag
combine canonicalization could reduce the types of load we need to match.
2020-05-16 14:30:45 -07:00
Eli Friedman 0ec5f50196 Harden IR and bitcode parsers against infinite size types.
If isSized is passed a SmallPtrSet, it uses that set to catch infinitely
recursive types (for example, a struct that has itself as a member).
Otherwise, it just crashes on such types.
2020-05-16 14:24:51 -07:00
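The set-based check described in 0ec5f50196 can be sketched with a toy type graph. This is not the LLVM `isSized` implementation: `Type` and `is_sized` are hypothetical, and the sketch tracks only the types on the current recursion path, so a struct that is used twice without recursion still counts as sized.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy type: scalars are always sized; structs are sized if all members are.
struct Type {
  bool is_struct;
  std::vector<const Type*> members;
};

// Returns false for a struct that (directly or indirectly) contains itself.
// `in_progress` holds the structs currently being examined on this path.
bool is_sized(const Type* ty, std::unordered_set<const Type*>& in_progress) {
  assert(ty != nullptr);
  if (!ty->is_struct) return true;
  // Seeing the same struct again on the current path means infinite size.
  if (!in_progress.insert(ty).second) return false;
  bool sized = true;
  for (const Type* member : ty->members) {
    if (!is_sized(member, in_progress)) {
      sized = false;
      break;
    }
  }
  in_progress.erase(ty);
  return sized;
}
```

Without the set, the recursion on a self-containing struct would never terminate, which corresponds to the crash the commit message mentions.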
faisal vali accd9af838 Revert "[nfc] test commit"
This reverts commit 0ee46e857d.
2020-05-16 15:12:04 -05:00
faisal vali 0ee46e857d [nfc] test commit 2020-05-16 15:08:30 -05:00
John McCall 32870a84d9 Expose IRGen API to add the default IR attributes to a function definition.
I've also made a stab at imposing some more order on where and how we add
attributes; this part should be NFC.  I wasn't sure whether the CUDA use
case for libdevice should propagate CPU/features attributes, so there's a
bit of unnecessary duplication.
2020-05-16 14:44:54 -04:00
mydeveloperday 49c9a68d7f The release note for ObjCBreakBeforeNestedBlockParam was placed between the release note for IndentCaseBlocks and its example code
Also remove other whitespace, line-limit, and double-blank-line issues
2020-05-16 18:52:13 +01:00
Sanjay Patel 81e9ede3a2 [VectorCombine] forward walk through instructions to improve chaining of transforms
This is split off from D79799 - where I was proposing to fully iterate
over a function until there are no more transforms. I suspect we are
still going to want to do something like that eventually.

But we can achieve the same gains much more efficiently on the current
set of regression tests just by reversing the order that we visit the
instructions.

This may also reduce the motivation for D79078, but we are still not
getting the optimal pattern for a reduction.
2020-05-16 13:08:01 -04:00
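Why visit order matters for chaining, as discussed in 81e9ede3a2, can be shown with a toy model. This is not VectorCombine: `Inst` and `combine_one_pass` are hypothetical, and the rule "an instruction can be combined once its operand has been combined" is an assumption chosen only to make each transform depend on an earlier one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: `operand` is the index of an earlier instruction it uses,
// or -1 if it has none.
struct Inst {
  int operand;
  bool combined;
};

// Make a single pass over the block, forward or backward, and return how
// many instructions were combined. An instruction is ready when it has no
// operand or its operand has already been combined.
int combine_one_pass(std::vector<Inst> block, bool forward) {
  int count = 0;
  const std::size_t n = block.size();
  for (std::size_t k = 0; k < n; ++k) {
    Inst& inst = block[forward ? k : n - 1 - k];
    assert(inst.operand < static_cast<int>(n));
    const bool ready =
        inst.operand < 0 ||
        block[static_cast<std::size_t>(inst.operand)].combined;
    if (ready && !inst.combined) {
      inst.combined = true;
      ++count;
    }
  }
  return count;
}
```

On a chain of four dependent instructions, one forward pass combines all four, while one backward pass combines only the first. In this model the backward order would need repeated passes to reach the same result.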
Sanjay Patel 43017ceb78 [PhaseOrdering] add vector reduction tests; NFC
These are based on tests originally included in:
D79078
2020-05-16 12:51:10 -04:00
Nikita Popov 604f44977b [InstCombine] Clean up alignment handling (NFC)
Now that load/store alignment is required, we can simplify code
in some places.
2020-05-16 18:47:29 +02:00
David Green 2123bb843e [ARM] Patterns for VQSHRN
Given a VQMOVN(VSHR), we can fold that into a VQSHRN simply enough using
a few tablegen patterns.

Differential Revision: https://reviews.llvm.org/D77720
2020-05-16 17:46:43 +01:00
Sanjay Patel 6211830fba [VectorCombine] add reduction-like patterns; NFC
These are based on tests originally included in:
D79078
2020-05-16 12:45:01 -04:00
Jay Foad 9a05547954 [AArch64] Precommit tests for D77316 2020-05-16 16:00:02 +01:00
Sanjay Patel 5be37cb124 [x86][CGP] try to hoist funnel shift above select-of-splats
This is basically the same patch as D63233, but converted to
funnel shifts rather than regular shifts. I did not see a
way to effectively share code for these 2 cases though.

This follows D79718 and D79827 to re-fix PR37426 because
that gets canonicalized to funnel shift intrinsics in IR.

I did draft an alternative patch as an enhancement to
"shouldSinkOperands()", but that was awkward because
we have to key the transform from the select, but then
look at both its users and its operands.
2020-05-16 10:44:47 -04:00
David Green 72f1fb2edf [ARM] Combines for VMOVN
This adds two combines for VMOVN, one to fold
VMOVN[tb](c, VQMOVNb(a, b)) => VQMOVN[tb](c, b)
The other to perform demand bits analysis on the lanes of a VMOVN. We
know that only the bottom lanes of the second operand and the top or
bottom lanes of the Qd operand are needed in the result, depending on whether
the VMOVN is bottom or top.

Differential Revision: https://reviews.llvm.org/D77718
2020-05-16 15:13:16 +01:00
David Green 2e1fbf85b6 [ARM] MVE saturating truncates
This adds some custom lowering for VQMOVN, an instruction that can be
used to perform saturating truncates from a pair of min(max(X, -0x8000),
0x7fff), providing those constants are correct. This leaves a VQMOVNBs
which saturates the value and inserts that into the bottom lanes of an
existing vector. We then need to do something with the other lanes,
extending the value using a vmovlb.

Ideally, as will often be the case, only the bottom lane of what remains
will be demanded, allowing the vmovlb to be removed. That should mean
the instruction is either equal or a win most of the time, and allows
some extra follow-up folding to happen.

Differential Revision: https://reviews.llvm.org/D77590
2020-05-16 15:10:20 +01:00
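The min/max pattern named in 2e1fbf85b6 is a signed saturating truncate from 32 to 16 bits. A scalar sketch follows; `saturating_truncate` is a hypothetical helper that models one lane and says nothing about how the MVE lowering is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp x to the int16 range with min(max(x, -0x8000), 0x7fff), then
// truncate. After clamping, the truncation cannot change the value.
std::int16_t saturating_truncate(std::int32_t x) {
  const std::int32_t clamped =
      std::min(std::max(x, std::int32_t{-0x8000}), std::int32_t{0x7fff});
  assert(clamped >= -0x8000 && clamped <= 0x7fff);
  return static_cast<std::int16_t>(clamped);
}
```

The pattern only corresponds to a saturating truncate when the two constants are exactly the bounds of the narrower type, which is the "providing those constants are correct" condition in the message.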
Simon Pilgrim 228913780b DIEHash.cpp - remove headers explicitly included in DIEHash.h. NFC.
Don't duplicate module header includes.
2020-05-16 15:00:57 +01:00