Commit Graph

362924 Commits

Author SHA1 Message Date
Roman Lebedev 8633a0d985
[NFC][InstCombine] Better tests for x s/EXACT (1 << y) pattern 2020-08-06 23:37:15 +03:00
Roman Lebedev 1c21635c94
[NFC][InstCombine] Tests for x s/EXACT (-1 << y) pattern 2020-08-06 23:37:15 +03:00
Adrian Prantl 0fa520af67 Unify the code that updates the ArchSpec after finding a fat binary
with how it is done for a lean binary

In particular this affects how target create --arch is handled — it
allowed us to override the deployment target (a useful feature for the
expression evaluator), but the fat binary case didn't.

rdar://problem/66024437

Differential Revision: https://reviews.llvm.org/D85049

(cherry picked from commit 470bdd3caaab0b6e0ffed4da304244be40b78668)
2020-08-06 13:30:17 -07:00
Richard Smith d6492d8744 Add -Wtautological-value-range-compare warning.
This warning diagnoses cases where an expression is compared to a
constant, and the comparison is tautological due to the form of the
expression (but not merely due to its type). This applies in cases such
as comparisons of bit-fields and the result of bit-masks.

The new warning is added to the Clang diagnostic group
-Wtautological-constant-in-range-compare but not to the
formerly-equivalent GCC-compatibility diagnostic group -Wtype-limits,
which retains its old meaning of diagnosing only tautological
comparisons to extremal values of a type (eg, int > INT_MAX).

Reviewed By: rtrieu

Differential Revision: https://reviews.llvm.org/D85256
2020-08-06 13:28:50 -07:00
Craig Topper ffc248f3b8 [LegalTypes] Move VSELECT node creation out of WidenVSELECTAndMask and push to 2 of the 3 callers.
One of the callers only wants the condition, but the vselect can
be simplified by getNode making it hard or impossible to retrieve
the condition.

Instead, return the condition and make the other 2 callers
responsible for creating the vselect node using the condition.
Rename the function to WidenVSELECTMask accordingly.

Differential Revision: https://reviews.llvm.org/D85468
2020-08-06 13:18:16 -07:00
Craig Topper 4df38a5589 [X86] Optimize out a few extra strlen calls in getX86TargetCPU. NFCI
We had a conversion from const char * to StringRef and const char *
to std::string conversion. These both do their own
strlen call if the compiler doens't figure out how to share them.
By adding the temporary StringRef we can convert it to std::string
instead.

The other case is to use a StringSwitch<StringRef> instead of
StringSwitch<const char *> since the output values of the switch
are string literals. This allows the length to be computed at
compile time. Otherwise we have to convert from const char *
to std::string after the StringSwitch.
2020-08-06 13:18:15 -07:00
Craig Topper e1cad4234c [X86] Make getX86TargetCPU return std::string instead of const char *. Remove call to MakeArgString. NFCI
I believe this function used to be called directly from X86
specific code and was used to immediately create -target-cpu
command line. A later refactoring changed it to to be called from
a generic getCPU function that returns std::string. So on some
paths we created a string using MakeArgString converted that to
std::string then called MakeArgString again from that.

Instead just return std::string directly like the other targets.
2020-08-06 13:18:15 -07:00
Yonghong Song 87cba43402 BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization
The following bpf linux kernel selftest failed with latest
llvm:
  $ ./test_progs -n 7/10
  ...
  The sequence of 8193 jumps is too complex.
  verification time 126272 usec
  stack depth 320
  processed 114799 insns (limit 1000000)
  ...
  libbpf: failed to load object 'pyperf600_nounroll.o'
  test_bpf_verif_scale:FAIL:110
  #7/10 pyperf600_nounroll.o:FAIL
  #7 bpf_verif_scale:FAIL

After some investigation, I found the following llvm patch
  https://reviews.llvm.org/D84108
is responsible. The patch disabled hoisting common instructions
in SimplifyCFG by default. Later on, the code changes and a
SimplifyCFG phase with hoisting on cannot do the work any more.

A test is provided to demonstrate the problem.
The IR before simplifyCFG looks like:
  for.cond:
    %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
    %cmp = icmp ult i32 %i.0, 6
    br i1 %cmp, label %for.body, label %for.cond.cleanup

  for.cond.cleanup:
    %2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
    %cmp2 = icmp eq i8* %2, null
    %conv = zext i1 %cmp2 to i32
    call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3
    call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3
    ret i32 %conv

  for.body:
    %3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
    %tobool.not = icmp eq i8* %3, null
    br i1 %tobool.not, label %for.inc, label %land.lhs.true

The first two insns of `for.cond.cleanup` and `for.body`, load and
icmp, can be hoisted to `for.cond` block. With Patch D84108, the
optimization is delayed. But unfortunately, later on loop rotation
added addition phi nodes to `for.body` and hoisting cannot
be done any more.

Note such a hoisting is beneficial to bpf programs as
bpf verifier does path sensitive analysis and verification.
The hoisting preverts reloading from stack which will assume
conservative value and increase exploited insns. In this case,
it caused verifier failure.

To fix this problem, I added an IR pass from bpf target
to performance additional simplifycfg with hoisting common inst
enabled.

Differential Revision: https://reviews.llvm.org/D85434
2020-08-06 13:16:00 -07:00
Snehasish Kumar 8d943a928d [NFC] Rename BBSectionsPrepare -> BasicBlockSections.
Rename the BBSectionsPrepare pass as suggested by the review comment in
https://reviews.llvm.org/D85368.

Differential Revision: https://reviews.llvm.org/D85380
2020-08-06 13:12:06 -07:00
Adrian Prantl f406a90a08 Add missing override to Makefile 2020-08-06 13:07:16 -07:00
Sanjay Patel 250a167c41 [InstSimplify] avoid crashing by trying to rem-by-zero
Bug was noted in the post-commit comments for:
rGe8760bb9a8a3
2020-08-06 16:06:31 -04:00
Jonas Devlieghere ba37b144e6 [LLDB] Skip test_launch_simple from TestTargetAPI.py when remote 2020-08-06 13:03:18 -07:00
Matt Arsenault 30eeb742f1 clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.

Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.

Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).

This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.

This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
2020-08-06 15:52:26 -04:00
Sanjay Patel c9bcc237a2 [VectorCombine] add tests for load+insert; NFC 2020-08-06 15:45:02 -04:00
Adrian Prantl 05df9cc703 Correctly detect legacy iOS simulator Mach-O objectfiles
The code in ObjectFileMachO didn't disambiguate between ios and
ios-simulator object files for Mach-O objects using the legacy
ambiguous LC_VERSION_MIN load commands. This used to not matter before
taught ArchSpec that ios and ios-simulator are no longer compatible.

rdar://problem/66545307

Differential Revision: https://reviews.llvm.org/D85358
2020-08-06 12:40:45 -07:00
cgyurgyik 128bf458ab [libc] Add tolower, toupper implementation.
Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D85326
2020-08-06 15:21:38 -04:00
Anton Afanasyev a7478fab6c [SLP] Fix order of `insertelement`/`insertvalue` seed operands
Summary:
This patch takes the indices operands of `insertelement`/`insertvalue`
into account while generation of seed elements for `findBuildAggregate()`.
This function has kept the original order of `insert`s before.
Also this patch optimizes `findBuildAggregate()` preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83779
2020-08-06 22:09:24 +03:00
Evgenii Stepanov 189ba3db86 Fix CFI issues in <future>
This change fixes errors reported by Control Flow Integrity (CFI) checking when using `std::packaged_task`.  The errors mostly stem from casting the underlying storage (`__buf_`) to `__base*`, even if it is uninitialized.  The solution is to wrap `__base*` access to `__buf_` behind a getter marked with _LIBCPP_NO_CFI.

Differential Revision: https://reviews.llvm.org/D82627
2020-08-06 12:05:22 -07:00
Matt Arsenault 87ce06e315 Add freeze keyword to IR emacs mode 2020-08-06 14:58:28 -04:00
MaheshRavishankar 25e8668e88 [mlir][SPIR-V] Fix wrongly placed Rationale section.
Differential Revision: https://reviews.llvm.org/D85461
2020-08-06 11:51:42 -07:00
Jonas Devlieghere 86aa8e6363 [lldb] Use target.GetLaunchInfo() instead of creating an empty one.
Update tests that were creating an empty LaunchInfo instead of using the
one coming from the target. This ensures target properties are honored.
2020-08-06 11:51:26 -07:00
Aleksandr Platonov 9f24148b21 [clangd] Fix crash in bugprone-bad-signal-to-kill-thread clang-tidy check.
Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to `Token::PtrData == nullptr` for tokens that the macro is defined to.
E.g. `#define SIGTERM 15`:
- Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
- Token::UintData == 2
- Token::PtrData == nullptr

As the result of this, bugprone-bad-signal-to-kill-thread check crashes at null-dereference inside clangd.

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D85417
2020-08-06 21:45:21 +03:00
dfukalov 4ccc38813e [AMDGPU][CostModel] Add f16, f64 and contract cases to fused costs estimation.
Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.

Fixed line endings in test.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D84995
2020-08-06 21:43:27 +03:00
Matt Arsenault e00201539f GlobalISel: Implement fewerElementsVector for G_EXTRACT_VECTOR_ELT
Use the same basic strategy as LegalizeVectorTypes. Try to index into
smaller pieces if there's a constant index, and otherwise fall back to
a stack temporary.
2020-08-06 14:33:16 -04:00
Arthur Eubanks d0acd97c68 [NewPM][LoopUnswitch] Pin loop-unswitch to legacy PM or use simple-loop-unswitch
As mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html,
loop-unswitch has not been ported to the NPM. Instead people are using
simple-loop-unswitch.

Pin all tests in Transforms/LoopUnswitch to legacy PM and replace all
other uses of loop-unswitch with simple-loop-unswitch.

One test that didn't fit into the above was
2014-06-21-congruent-constant.ll which seems to only pass with
loop-unswitch. That is also pinned to legacy PM.

Now all tests containing "-loop-unswitch" anywhere in the test succeed with
NPM turned on by default.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85360
2020-08-06 10:56:00 -07:00
Aaron En Ye Shi 96c2d5e99e [HIP] Ignore invalid ar linker options
Instead of accepting the same arguments as regular linker,
the static linker will only accept input files.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D85442
2020-08-06 17:39:41 +00:00
Fred Riss 99298c7fc5 [lldb/testsuite] Change get_debugserver_exe to support Rosetta
In order to be able to run the debugserver tests against the Rosetta
debugserver, detect the Rosetta run configuration and return the
system Rosetta debugserver.
2020-08-06 10:38:30 -07:00
Arthur Eubanks 5bb6b8250a [NewPM] Pin -assumption-cache-tracker tests to legacy PM
All tests have corresponding NPM RUN lines.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85395
2020-08-06 10:38:03 -07:00
Matt Arsenault 1a0c0944c6 AMDGPU: Define raw/struct variants of buffer atomic fadd
Somehow the new FP atomic buffer intrinsics ended up using the legacy
style for buffer intrinsics.
2020-08-06 13:36:19 -04:00
Sterling Augustine 9dbdaea9a0 Remove unused variable "saved_opts".
wattr_get is a macro, and the documentation states:
"The parameter opts is reserved for  future use,
applications must supply a null pointer."

In practice, passing a variable there is harmless, except
that it is unused inside the macro, which causes unused
variable warnings.

The various places where
2020-08-06 10:20:21 -07:00
Matt Arsenault eae9c54148 AArch64/GlobalISel: Fix verifier error after selecting returnaddress
This was caching the wrong register to re-use later.
2020-08-06 13:18:05 -04:00
Simon Pilgrim 3b93464dcf [SLP][X86] Regenerate sdiv test noticed in D83779. NFC. 2020-08-06 18:00:21 +01:00
Mircea Trofin ca7973cf18 [NFC]{MLInliner] Point out the tests' model dependencies 2020-08-06 09:57:26 -07:00
Matt Arsenault 90eb7d5283 AMDGPU: Fix spilling of 96-bit AGPRs 2020-08-06 12:42:07 -04:00
Matt Arsenault 56270d1d42 AMDGPU/GlobalISel: Start trying to handle AGPR bank
Try to use AGPR banks for the various merge/unmerge type
operations. Previously these would introduce copies to VGPR.
2020-08-06 12:39:50 -04:00
Matt Arsenault 34040a4f61 GlobalISel: Define InvalidRegBankID enum value 2020-08-06 12:39:49 -04:00
Alexey Bataev 8d072a4405 [OPENMP]Fix for Windows buildbots, NFC. 2020-08-06 12:36:52 -04:00
Alexey Bataev 0af7835eae [OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation.
Summary:
Introduced OMPChildren class to handle all associated clauses, statement
and child expressions/statements. It allows to represent some directives
more correctly (like flush, depobj etc. with pseudo clauses, ordered
depend directives, which are standalone, and target data directives).
Also, it will make easier to avoid using of CapturedStmt in directives,
if required (atomic, tile etc. directives).
Also, it simplifies serialization/deserialization of the
executable/declarative directives.
Reduces number of allocation operations for mapper declarations.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83261
2020-08-06 12:25:19 -04:00
Simon Pilgrim 8f5b2cb828 [InstCombine] Add tests for mul(add(x,c),negpow2) -> mul(sub(-c,x),pow2) fold
Also fix some undef vector elements in the similar vector tests that I missed.
2020-08-06 17:13:28 +01:00
Mircea Trofin 87fb7aa137 [llvm][MLInliner] Don't log 'mandatory' events
We don't want mandatory events in the training log. We do want to handle
them, to keep the native size accounting accurate, but that's all.

Fixed the code, also expanded the test to capture this.

Differential Revision: https://reviews.llvm.org/D85373
2020-08-06 09:04:15 -07:00
Joel E. Denny 518a27e559 [OpenMP] Fix ref count dec for implicit map of partial data
D85342 broke this case.  The new test case presents an example.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D85369
2020-08-06 11:39:29 -04:00
Raphael Isemann f6913e7440 [lldb][NFC] Document and encapsulate OriginMap in ASTContextMetadata
Just adds the respective accessor functions to ASTContextMetadata instead
of directly exposing the OriginMap to the whole world.
2020-08-06 17:37:29 +02:00
Simon Pilgrim d1a91d947f [InstCombine] Add tests for mul(sub(x,y),negpow2) -> mul(sub(y,x),pow2) fold
Add full vector coverage (that currently are not folded).
2020-08-06 16:31:57 +01:00
Simon Pilgrim b7b1a38d41 PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI.
We already need to include raw_ostream.h, also add missing StringRef.h implicit dependency.
2020-08-06 16:31:56 +01:00
Fangrui Song a6db64ef4a [ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD
GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD
(PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to
SHF_ALLOC NOBITS sections. The location counter is not advanced).

This patch tries to fix PR37607 (remove a special case in
`Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot
reset dot to 0 for a middle non-SHF_ALLOC output section. This results in
removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC
non-orphan sections can have non-zero addresses like in GNU ld.

The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan
only. This results in a special case in createSection and findOrphanPos, respectively.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85100
2020-08-06 08:27:15 -07:00
Matt Arsenault 63cdc9a49f AMDGPU/GlobalISel: Handle llvm.amdgcn.ds.{fadd|fmin|fmax}
These intrinsics are missing mangling for both the pointer and data
type.
2020-08-06 11:09:08 -04:00
Matt Arsenault 63c4be53cf AMDGPU/GlobalISel: Try to promote to use packed saturating add/sub
This produces worse results right now for i8 vectors, but that should
be addressed when we actually try to optimize packed vectors.
2020-08-06 11:08:45 -04:00
Sanjay Patel 60f2c6a94c [PatternMatch] allow intrinsic form of min/max with existing matchers
I skimmed the existing users of these matchers and don't see any problems
(eg, the caller assumes the matched value was a select instruction without checking).

So I think we can generalize the matching to allow the new intrinsics or the cmp+select idioms.

I did not find any unit tests for the matchers, so added some basics there. The instsimplify
tests are adapted from existing tests for the cmp+select pattern and cover the folds in
simplifyICmpWithMinMax().

Differential Revision: https://reviews.llvm.org/D85230
2020-08-06 10:50:24 -04:00
Matt Arsenault dcf3ffb0a8 AMDGPU/GlobalISel: Move frame index selection to patterns
Doesn't really save any code until global value is handled too.
2020-08-06 10:42:15 -04:00
Matt Arsenault d188a608bd AMDGPU: Fix code duplication between the selectors
Not sure this is the right place for this helper.
2020-08-06 10:42:15 -04:00