Commit Graph

372717 Commits

Author SHA1 Message Date
William S. Moses f5c5fd1c50 [MLIR] Correct block merge bug
Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D91745
2020-11-20 19:12:59 +01:00
Yitzhak Mandelbaum 88e6208562 [libTooling] Update Transformer's `node` combinator to include the trailing semicolon for decls.
Currently, `node` only includes the semicolon for (some) statements. However,
declarations have the same issue of (potentially) trailing semicolons, so `node`
should behave the same for them.
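For illustration, a minimal sketch (not part of the patch) of a Transformer rule that relies on this behaviour, assuming the usual makeRule/changeTo/cat combinators and a hypothetical binding name "victim"; with this change, deleting the range selected by node("victim") for a matched declaration should also remove the trailing semicolon, mirroring the existing behaviour for statements:

  // Sketch only: deletes matched using-declarations, now including the `;`.
  #include "clang/ASTMatchers/ASTMatchers.h"
  #include "clang/Tooling/Transformer/RangeSelector.h"
  #include "clang/Tooling/Transformer/RewriteRule.h"
  #include "clang/Tooling/Transformer/Stencil.h"

  using namespace clang;
  using namespace clang::ast_matchers;
  using namespace clang::transformer;

  RewriteRule removeUsingDecls() {
    return makeRule(usingDecl().bind("victim"),
                    changeTo(node("victim"), cat("")),
                    cat("remove using-declaration"));
  }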

Differential Revision: https://reviews.llvm.org/D91872
2020-11-20 18:11:50 +00:00
Craig Topper a7eae62a42 [SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version.
The default version only works if the returned node has a single
result. The X86 and PowerPC versions support multiple results
and allow a single result to be returned from a node with
multiple outputs. They also allow a single result that is not
result 0 of the node.

Also replace the Mips version since the new version should work
for it. The original version handled multiple results, but only
if the new node and original node had the same number of results.

Differential Revision: https://reviews.llvm.org/D91846
2020-11-20 10:06:53 -08:00
Alex Zinenko 18d0f7d5c3 [mlir] add canonicalization patterns for trivial SCF 'for' and 'if'
Add canonicalization patterns to remove zero-iteration 'for' loops, replace
single-iteration 'for' loops with their bodies, remove known-false conditionals
with no 'else' branch, and replace conditionals with a known condition value by
the respective region. Although similar transformations are performed at the CFG
level, not all flows reach that level; e.g., the GPU flow may want to remove
single-iteration loops before deciding on loop mapping to thread dimensions.
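As a rough illustration only, a hypothetical sketch (not the in-tree pattern; accessor spellings may differ between MLIR versions) of one of the simplest of these folds, erasing a result-less 'if' whose condition is statically false and which has no 'else' region:

  #include "mlir/Dialect/SCF/SCF.h"
  #include "mlir/IR/Matchers.h"
  #include "mlir/IR/PatternMatch.h"

  using namespace mlir;

  struct RemoveKnownFalseIf : public OpRewritePattern<scf::IfOp> {
    using OpRewritePattern<scf::IfOp>::OpRewritePattern;

    LogicalResult matchAndRewrite(scf::IfOp op,
                                  PatternRewriter &rewriter) const override {
      // Only the simple case: statically-false condition, no 'else' region
      // and no results, so the whole op can be erased outright.
      if (!matchPattern(op.condition(), m_Zero()) ||
          !op.elseRegion().empty() ||
          op.getOperation()->getNumResults() != 0)
        return failure();
      rewriter.eraseOp(op);
      return success();
    }
  };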

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91865
2020-11-20 19:04:39 +01:00
Dave Lee dbcc69217a [lldb] Add examples and reword source-map help string
Update the help string for `target.source-map` to remove the use of the word
"duple" and to add examples. Additionally I rewrote parts with the goal of
making the description more concrete.

rdar://68736012

Differential Revision: https://reviews.llvm.org/D91742
2020-11-20 10:01:36 -08:00
Arthur Eubanks ac7419bb4f [Hexagon][NewPM] Port -hexagon-loop-idiom and add to pipeline
Fixes pmpy-mod.ll under NPM

Reviewed By: kparzysz

Differential Revision: https://reviews.llvm.org/D91829
2020-11-20 09:34:37 -08:00
Stella Stamenova 370d0bac90 [mlir] Expose parseDimAndSymbolList from affineops.h
This was removed from ops.h, but it is used by onnx-mlir

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D91830
2020-11-20 09:26:58 -08:00
Andrew Wei 1cd19fc568 [DeadMachineInstructionElim] Visit all blocks in post order and iteratively run the DeadMachineInstructionElim pass until nothing is dead
Patched by: guopeilin
Reviewed By: hliao,rampitec

Differential Revision: https://reviews.llvm.org/D91513
2020-11-21 00:43:23 +08:00
Simon Pilgrim 09a081f221 [X86][SSE] LowerADDSAT_SUBSAT - avoid X86ISD::BLENDV in UADDSAT/USUBSAT custom lowering
Use the OR(CMP,ADD) / AND(CMP,SUB) patterns like we do on pre-SSE4 targets.

We're still using X86ISD::BLENDV on some AVX targets as we don't do custom lowering for >= 256-bit vectors.

Really this (and combineVSelectWithAllOnesOrZeros) needs moving to DAGCombiner, but pre-SSE42 we see the vXi64 comparison type as a 2 x 32-bits result so we can't just rely on ComputeNumSignBits to give us the 'all bits' result we need.
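For reference, a scalar sketch of the OR(CMP,ADD) / AND(CMP,SUB) idiom mentioned above (illustrative only; the lowering itself operates on SelectionDAG vector nodes, not C++):

  #include <cstdint>

  std::uint32_t uadd_sat(std::uint32_t x, std::uint32_t y) {
    std::uint32_t sum = x + y;
    // On wraparound, (sum < x) is 1 and the mask becomes all ones,
    // clamping the result to the maximum value.
    std::uint32_t mask = 0u - static_cast<std::uint32_t>(sum < x);
    return sum | mask;
  }

  std::uint32_t usub_sat(std::uint32_t x, std::uint32_t y) {
    std::uint32_t diff = x - y;
    // The mask is all ones when no underflow occurs and zero otherwise,
    // clamping the result to 0.
    std::uint32_t mask = 0u - static_cast<std::uint32_t>(x >= y);
    return diff & mask;
  }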
2020-11-20 16:53:01 +00:00
Andrzej Warzynski 1b749c0cb5 [flang][driver] Remove unnecessary CMake dependencies (nfc)
Remove clangFrontend from the list of dependencies. These should have
been removed in: 8d51d37e06. See also
https://reviews.llvm.org/D87774.
2020-11-20 16:44:11 +00:00
Sanjay Patel e32bd35120 [CostModel] mostly remove cost-kind predicate for intrinsics in basic TTI implementation
This is re-applying a combination of f7eac51b9b and 8ec7ea3ddc as one patch
to avoid regressions now that we have better testing in place.

Those were reverted with 32dd5870ee because of crashing in experimental intrinsics.
That bug should be fixed with 7ae346434.

Paraphrased original commit messages:

This is the last step in removing cost-kind as a consideration in the
basic class model for intrinsics.
See D89461 for the start of that.
Subsequent commits dealt with each of the special-case intrinsics that
had customization here in the basic class. This should remove a barrier
to retrying D87188 (canonicalization to the abs intrinsic).

The ARM and x86 cost diffs seen here may be wrong because the
target-specific overrides have their own bugs, but we hope this is
less wrong - if something has a significant throughput cost, then it
should have a significant size / blended cost too by default.

The only behavioral diff in current regression tests is shown in the
x86 scatter-gather test (which is misplaced or broken because it runs
the entire -O3 pipeline) - we unrolled less, and we assume that is
an improvement.

Exception: in general, we want the *size* cost for a scalar call to be
cheap even if the other costs are expensive - we expect it to just be
a branch with some optional stack manipulation.

It is likely that we will want to carve out some
exceptions/overrides to this rule as follow-up patches for
calls that have some general and/or target-specific difference
to the expected lowering.

This was noticed as a regression in unrolling, so we have a test
for that now along with a couple of direct cost model tests.

If the assumed scalarization costs for the oversized vector
calls are not realistic, that would be another follow-up
refinement of the cost models.

Differential Revision: https://reviews.llvm.org/D90554
2020-11-20 11:21:10 -05:00
Simon Pilgrim e3f0177deb [X86] Add SSE42 sat-add test coverage
Check SSE42 targets which have PCMPGTQ
2020-11-20 16:00:24 +00:00
Alex Richardson 51e09e1d5a [AMDGPU] Set the default globals address space to 1
This will ensure that passes that add new global variables will create them
in address space 1 once the passes have been updated to no longer default
to the implicit address space zero.
This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't
already present, to ensure bitcode backwards compatibility.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D84345
2020-11-20 15:46:53 +00:00
Alex Richardson 3bc4157556 Add a default address space for globals to DataLayout
This is similar to the existing alloca and program address spaces (D37052)
and should be used when creating/accessing global variables.
We need this in our CHERI fork of LLVM to place all globals in address space 200.
This ensures that values are accessed using CHERI load/store instructions
instead of the normal MIPS/RISC-V ones.

The problem this is trying to fix is that most of the time the type of
globals is created using a simple PointerType::getUnqual() (or ::get() with
the default address-space value of 0). This does not work for us and we get
assertion/compilation/instruction selection failures whenever a new call
is added that uses the default value of zero.

In our fork we have removed the default parameter value of zero for most
address space arguments and use DL.getProgramAddressSpace() or
DL.getGlobalsAddressSpace() whenever possible. If this change is accepted,
I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead
of relying on the default value of 0 for PointerType::get(), etc.

This patch and the follow-up changes will not have any functional changes
for existing backends with the default globals address space of zero.
A follow-up commit will change the default globals address space for
AMDGPU to 1.
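A minimal sketch of the intended follow-up usage, assuming the DL.getGlobalsAddressSpace() accessor named above (the in-tree spelling may differ): create globals in the DataLayout's default globals address space rather than hard-coding address space 0.

  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/Module.h"

  using namespace llvm;

  GlobalVariable *createGlobalInDefaultAS(Module &M, Type *Ty, Constant *Init) {
    const DataLayout &DL = M.getDataLayout();
    // Previously such code would implicitly end up in address space 0 via
    // PointerType::getUnqual() or the default address-space argument.
    return new GlobalVariable(M, Ty, /*isConstant=*/false,
                              GlobalValue::ExternalLinkage, Init, "g",
                              /*InsertBefore=*/nullptr,
                              GlobalValue::NotThreadLocal,
                              DL.getGlobalsAddressSpace());
  }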

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D70947
2020-11-20 15:46:52 +00:00
Siva Chandra Reddy 4766a86cf2 [libc] Combine all math differential fuzzers into one target.
Also added diffing of a few more math functions. Combining the diff check
for all of these functions helps us meet the OSS fuzz bar of a minimum of
100 program edges.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D91817
2020-11-20 07:46:15 -08:00
Anton Afanasyev 6f1c07b23a [SLP][Test] Update pr47269.ll test. NFC
Expand test for PR47269 to better demonstrate changes introduced by D90445.
2020-11-20 18:33:57 +03:00
Jamie Schmeiser 7f6360cdc6 Reland: Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.
Summary:
Expand existing loopsink testing to also test loopsinking using the new pass
manager.  Enable memoryssa for loopsink with the new pass manager.  This
combination exposed a bug that was previously fixed for loopsink
without memoryssa.  When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation.  This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.

Respond to review comments.  Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control.  Update tests accordingly.

Respond to review comments.  Add options controlling whether memoryssa is
used for loop sink, defaulting to off.  Expand testing based on these
options.

Respond to review comments.  Properly indicated preserved analyses.

This relanding addresses a compile-time performance problem by moving the
test for profile data earlier to avoid unnecessary computations.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
2020-11-20 10:26:33 -05:00
Sanjay Patel 7ae346434a [CostModel] avoid crashing while finding scalarization overhead
The constrained intrinsics have metadata arguments, so the
tests here were crashing as noted in D90554 (and that was
reverted even though this bug exists independently of that
change).
2020-11-20 10:18:29 -05:00
Chris Kennelly e4f9b5d442 [clang-tidy] Include std::basic_string_view in readability-redundant-string-init.
std::string_view("") produces a string_view instance that compares
equal to std::string_view(), but requires more complex initialization
(storing the address of the string literal, rather than zeroing).
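An illustrative example of the cases this adds (the suggested replacement is as I understand the check; wording of the diagnostic is not quoted from it):

  #include <string_view>

  std::string_view a("");   // now flagged: redundant initialization with ""
  std::string_view b = "";  // likewise flagged
  std::string_view c;       // suggested form: default construction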

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D91009
2020-11-20 10:06:57 -05:00
Jamie Schmeiser 621efa6a5a [NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager.
Summary:
[NFC intended] Refactor the code for printChanged for reuse and to facilitate
subsequent reporters of changes to the IR in the new pass manager.

Create abstract template base classes for common functionality and give
classes more appropriate names.  The base classes handle all of the
determination of when a function or pass is "interesting" and should be
reported or filtered out. They have pure virtual functions which are called
when a change by a pass has been recognized so the derived class need only
provide the overrides to present the information about the changing IR.
There are at least 2 more change reporters to come (which were presented
in my tutorial at the 2020 llvm developer's meeting) that derive from
these classes.
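A simplified, hypothetical sketch of that division of labour (names and signatures are illustrative, not the in-tree ones): the base template decides whether a change is interesting; derived classes only provide the presentation.

  #include <string>

  template <typename IRUnitT> class ChangeReporterBase {
  public:
    virtual ~ChangeReporterBase() = default;

    // Called by the instrumentation framework after each pass has run.
    void reportAfterPass(const std::string &PassID, const IRUnitT &Before,
                         const IRUnitT &After) {
      if (isFiltered(PassID) || Before == After)
        return; // filtered out or no change: nothing to report
      handleChange(PassID, Before, After);
    }

  protected:
    // Derived reporters only decide how a recognized change is presented.
    virtual void handleChange(const std::string &PassID, const IRUnitT &Before,
                              const IRUnitT &After) = 0;

  private:
    // Placeholder for the "interesting function/pass" filtering logic.
    bool isFiltered(const std::string & /*PassID*/) const { return false; }
  };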

Respond to review comments:  move function out of line, remove inline keyword,
remove unneeded qualifiers, simplify comparison.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks), madhur13490 (Madhur Amilkanthwar)
Differential Revision: https://reviews.llvm.org/D87000
2020-11-20 09:43:06 -05:00
Adam Czachorowski 95ce9fbc23 [clang] Do not crash on wchar_t pointer assignment.
wchar_t can be signed (thus hasSignedIntegerRepresentation() returns
true), but it doesn't have an unsigned type, which would lead to a crash
when trying to get it.

With this fix, we special-case WideChar types in the pointer assignment
code.

Differential Revision: https://reviews.llvm.org/D91625
2020-11-20 15:27:15 +01:00
Sjoerd Meijer 412237dcd0 [AArch64] Enable post RA scheduler for Cortex-R82
Just something I forgot when I added the R82. Need to have a look
at crypto and fusing, but will do that as a follow up.

Differential Revision: https://reviews.llvm.org/D91848
2020-11-20 14:04:26 +00:00
Yafei Liu 2ce6352e46 Add a call super attribute plugin example
If a virtual method is marked as call_super, the
overriding method must call it, similar to the @CallSuper annotation
in Android Java.
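A hypothetical usage sketch (the attribute spelling below is an assumption; see the example plugin for the exact syntax):

  struct Base {
    [[clang::call_super]] virtual void onCreate();
    virtual ~Base() = default;
  };

  struct Good : Base {
    void onCreate() override {
      Base::onCreate(); // OK: the overridden method is called
    }
  };

  struct Bad : Base {
    void onCreate() override {} // plugin warns: Base::onCreate() is not called
  };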
2020-11-20 08:51:12 -05:00
Stephen Kelly 2033fa29b0 Add documentation illustrating use of IgnoreUnlessSpelledInSource
Differential Revision: https://reviews.llvm.org/D91639
2020-11-20 13:49:25 +00:00
David Green f08c37da7b [ARM] Disable WLSTP loops
This checks to see if the loop will likely become a tail predicated loop
and disables wls loop generation if so, as the likelihood for reverting
is currently too high. These should be fairly rare situations anyway due
to the way iterations and element counts are used during lowering. Just
not trying can alter how SCEV's are materialized however, leading to
different codegen.

It also adds an option to disable all while low overhead loops, for
debugging.

Differential Revision: https://reviews.llvm.org/D91663
2020-11-20 13:30:44 +00:00
Pavel Iliin 4d7df43ffd [AArch64] Out-of-line atomics (-moutline-atomics) implementation.
This patch implements out-of-line atomics as an LSE deployment
mechanism. Details of how it works can be found in llvm/docs/Atomics.rst.
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to the clang driver. This is the clang and llvm part of the
out-of-line atomics interface; the library part is already supported by libgcc.
Compiler-rt support is provided in a separate patch.
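Illustrative only: a translation unit whose atomic RMW becomes a candidate for out-of-line lowering when built with -moutline-atomics. With the option enabled, the compiler is expected to call a runtime helper (from libgcc, or from compiler-rt once the separate patch lands) that selects an LSE or LL/SC implementation at run time rather than committing to one at compile time.

  // Assumed build invocation:
  //   clang++ --target=aarch64-linux-gnu -O2 -moutline-atomics -c outline.cpp
  #include <atomic>

  int fetch_add_one(std::atomic<int> &counter) {
    return counter.fetch_add(1, std::memory_order_acq_rel);
  }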

Differential Revision: https://reviews.llvm.org/D91157
2020-11-20 13:30:12 +00:00
Sanjay Patel 1285781fc5 [CostModel] add tests for math library calls; NFC
This is a partial un-revert of 32dd5870ee (originally df09f82599 ).

I'm adding back the baseline tests first, so we don't have
to back-track as much in case there are still problems.
2020-11-20 08:24:49 -05:00
Sanjay Patel 99cf39bfed [LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC
See discussion in D90554.

This is a partial un-revert of 32dd5870ee. I'm adding
back the baseline tests first, so we don't have to
back-track as much in case there are still problems.
2020-11-20 08:15:46 -05:00
Rainer Orth 03d593dd7e [sanitizers][test] Test sanitizer_common and ubsan_minimal on Solaris
During the initial Solaris sanitizer port, I neglected to enable the
`sanitizer_common` and `ubsan_minimal` testsuites.  This patch fixes that,
correcting a few portability issues:

- `Posix/getpass.cpp` failed to link since Solaris lacks `libutil`.
  Omitting the library lets the test `PASS`, but I thought adding `%libutil`
  along the lines of `%librt` would be overkill.
- One subtest of `Posix/getpw_getgr.cpp` is disabled because Solaris
  `getpwent_r` has a different signature than expected.
- `/dev/null` is a symlink on Solaris.
- XPG7 specifies that `uname` returns a non-negative value on success.

Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D91606
2020-11-20 14:06:25 +01:00
Stephan Herhut a89e55ca57 [mlir][std] Canonicalize a dim(memref_reshape) into a load from the shape operand
This canonicalization helps propagate shape information through the program.

Differential Revision: https://reviews.llvm.org/D91854
2020-11-20 14:03:02 +01:00
Sanjay Patel dfd2858b7f [InstCombine] add test comments for negative tests; NFC 2020-11-20 07:59:46 -05:00
Stephan Herhut 6af81ea1d6 [mlir][std] Fold load(tensor_to_memref) into extract_element
This canonicalization is useful to resolve loads into scalar values when
doing partial bufferization.

Differential Revision: https://reviews.llvm.org/D91855
2020-11-20 13:42:11 +01:00
Raphael Isemann ffb3fd8f18 Partially revert '[MachO] Update embedded part of ObjectFileMachO for Mangled API change'
Commit f3aa9e36d9 fixed the embedded OS
build by removing all passed args for `GetName`/`GetDemangledName`. The motivation
for this was that these arguments were apparently removed in
commit 22b044877d. However, only `GetName`'s language
argument was removed but the mangling preference argument was *not* removed
(and unfortunately had a default argument). So when that commit removed all
the args it didn't just fix the build but it also changed all the mangling
preferences to 'demangled' for all `GetName` calls.

Also some `GetName` calls were outside the TARGET_OS_EMBEDDED ifdef, so
this change ended up breaking the following tests on macOS:

  lldb-api :: lang/objc/objc-static-method-stripped/TestObjCStaticMethodStripped.py
  lldb-api :: lang/objc/objc-super/TestObjCSuper.py

From what I can see f3aa9e36d9 removed 12 ePreferMangled args and this patch
re-adds 12 args with roughly the same line numbers, so this *should* restore the
old behaviour and also keep the embedded build working. On the other hand,
ObjectFileMachO::ParseSymtab is a very successful attempt at writing
the longest possible function within LLVM, so this fix is partly based
on the engineering principle known as "hoping for the best".
2020-11-20 13:31:36 +01:00
Kazushi (Jam) Marukawa 42389f1e96 [VE] Change threshold for jump table generation
Implement getMinimumJumpTableEntries() to specify the threshold for jump
table generation.  We use 8 in the case of PIC mode to relieve the
impact of the PIC calculation required to implement PIC-mode jump tables.
Update the jump table regression test also.
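A hypothetical sketch of such an override (not the actual VE code; the class name, how PIC mode is detected, and the base-class fallback are assumptions here):

  #include "llvm/CodeGen/TargetLowering.h"
  #include "llvm/Target/TargetMachine.h"

  namespace {
  // Illustrative target lowering class, standing in for VETargetLowering.
  class ExampleTargetLowering : public llvm::TargetLowering {
  public:
    explicit ExampleTargetLowering(const llvm::TargetMachine &TM)
        : llvm::TargetLowering(TM) {}

    unsigned getMinimumJumpTableEntries() const override {
      // Require more switch cases before emitting a jump table under PIC,
      // since PIC jump tables need extra address computation (8 is the
      // threshold named in the commit message).
      if (getTargetMachine().isPositionIndependent())
        return 8;
      return llvm::TargetLowering::getMinimumJumpTableEntries();
    }
  };
  } // namespace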

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91785
2020-11-20 21:27:18 +09:00
Stephan Herhut cb778c3423 [mlir][std] Fold comparisons when the operands are equal
For equal operands, comparisons can be decided statically.

Differential Revision: https://reviews.llvm.org/D91856
2020-11-20 13:26:41 +01:00
Mikhail Goncharov 0caa82e2ac Revert "[mlir][Linalg] Fuse sequence of Linalg operation (on buffers)"
This reverts commit f8284d21a8.

Revert "[mlir][Linalg] NFC: Expose some utility functions used for promotion."

This reverts commit 0c59f51592.

Revert "Remove unused isZero function"

This reverts commit 0f9f0a4046.

Change f8284d21 led to multiple failures in IREE compilation.
2020-11-20 13:12:54 +01:00
Simon Pilgrim 822c5c5084 [clang][CodeGen] Move WebAssembly specific tests to WebAssembly subtarget folder
Minor cleanup to move more target specific tests out of the root codegen test folder
2020-11-20 12:03:28 +00:00
Simon Pilgrim 2f1fe9a3a6 [clang][CodeGen] Move riscv specific tests to RISCV subtarget folder
Minor cleanup to move more target specific tests out of the root codegen test folder
2020-11-20 12:03:28 +00:00
Rainer Orth 0f69cbe269 [sanitizer_common][test] Disable CombinedAllocator32Compact etc. on Solaris/sparcv9
As reported in PR 48202, two allocator tests `FAIL` on Solaris/sparcv9,
presumably because Solaris uses the full 64-bit address space and the
allocator cannot deal with that:

  SanitizerCommon-Unit :: ./Sanitizer-sparcv9-Test/SanitizerCommon.CombinedAllocator32Compact
  SanitizerCommon-Unit :: ./Sanitizer-sparcv9-Test/SanitizerCommon.SizeClassAllocator32Iteration

This patch disables the tests.

Tested on `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D91622
2020-11-20 13:02:15 +01:00
Rainer Orth ce6524d127 [sanitizer_common][test] Disable FastUnwindTest.* on SPARC
Many of the `FastUnwindTest.*` tests `FAIL` on SPARC, both Solaris and
Linux.  The issue is that the fake stacks used in those tests don't match
the requirements of the SPARC unwinder in `sanitizer_stacktrace_sparc.cpp`
which has to look at the register window save area.

I'm disabling the failing tests.

Tested on `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D91618
2020-11-20 12:52:18 +01:00
Simon Pilgrim 44c96becc9 Fix MSVC "not all control paths return a value" warnings. NFCI. 2020-11-20 11:41:20 +00:00
David Spickett 32541685b2 [lldb][AArch64/Linux] Show memory tagged memory regions
This extends the "memory region" command to
show tagged regions on AArch64 Linux when the MTE
extension is enabled.

(lldb) memory region the_page
[0x0000fffff7ff8000-0x0000fffff7ff9000) rw-
memory tagging: enabled

This is done by adding an optional "flags" field to
the qMemoryRegion packet. The only supported flag is
"mt" but this can be extended.

This "mt" flag is read from /proc/{pid}/smaps on Linux,
other platforms will leave out the "flags" field.
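A hypothetical sketch of that check, assuming the flag appears on the region's "VmFlags:" line in smaps (the actual lldb parsing code is more involved):

  #include <sstream>
  #include <string>

  bool regionHasMemoryTagging(const std::string &vmFlagsLine) {
    // Example input: "VmFlags: rd wr mr mw me ac mt"
    std::istringstream stream(vmFlagsLine);
    std::string token;
    stream >> token; // consume the "VmFlags:" prefix
    while (stream >> token)
      if (token == "mt")
        return true;
    return false;
  }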

Where this "mt" flag is received "memory region" will
show that it is enabled. If it is not or the target
doesn't support memory tagging, the line is not shown.
(since majority of the time tagging will not be enabled)

Testing is added for the existing /proc/{pid}/maps
parsing and the new smaps parsing.
Minidump parsing has been updated where needed,
though it only uses maps not smaps.

Target specific tests can be run with QEMU and I have
added MTE flags to the existing helper scripts.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D87442
2020-11-20 11:21:59 +00:00
Eugene Zhulenev a86a9b5ef7 [mlir] Automatic reference counting for Async values + runtime support for ref counted objects
Depends On D89963

**Automatic reference counting algorithm outline:**

1. `ReturnLike` operations forward the reference counted values without
    modifying the reference count.
2. Use liveness analysis to find blocks in the CFG where the lifetime of
   reference counted values ends, and insert `drop_ref` operations after
   the last use of the value.
3. Insert `add_ref` before the `async.execute` operation capturing the
   value, and a pairing `drop_ref` before the async body region terminator,
   to release the captured reference counted value when execution
   completes.
4. If the reference counted value is passed only to some of the block
   successors, insert `drop_ref` operations at the beginning of the blocks
   that do not have reference counted value uses.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D90716
2020-11-20 03:08:44 -08:00
Sebastian Neubauer 7a18bdb350 [AMDGPU] Implement flat scratch init for pal
Extract the scratch offset from the scratch buffer descriptor that is
stored in the global table.

Differential Revision: https://reviews.llvm.org/D91701
2020-11-20 11:14:30 +01:00
QingShan Zhang 1b5921f4d8 [NFC][Test] Update test for IEEE Long Double 2020-11-20 09:57:45 +00:00
Max Kazantsev 2290daa938 [Test] Auto-update checks in a test 2020-11-20 16:53:51 +07:00
Georgii Rymar 343dceb831 [llvm-readelf/obj] - Improve error reporting when dumping group sections.
Our code that dumps groups has 3 noticeable issues:
1) It uses `unwrapOrError` in many places.
2) It doesn't allow reporting unique warnings, because the `getGroups` helper is not
   a member of `DumpStyle<ELFT>`.
3) It might just crash. See the comment for `StrTableOrErr->data() + Sym.st_name` line.

In this patch I start addressing these points.
For a start, I've converted one of the `unwrapOrError` calls to a unique warning.

Differential revision: https://reviews.llvm.org/D91798
2020-11-20 12:40:23 +03:00
Kirill Bobyrev da14ae23a5 [clangd] NFC: Reorder headers in tests according to Clang-Tidy 2020-11-20 10:38:41 +01:00
Georgii Rymar aadbe20622 [llvm-readobj] - Introduce `forEachRelocationDo` helper.
Our `printStackSize` implementation currently uses
API like `RelocationRef` and `object::symbol_iterator`.
This is not ideal, as it doesn't allow
possible error conditions to be handled properly.

Some time ago I started rewriting it, and this NFC patch is
one more step toward that. Here I am introducing the
`forEachRelocationDo` helper. With it, it is possible to iterate
over all kinds of relocations, which is helpful for improving
the code in `printStackSize` and around it.

Differential revision: https://reviews.llvm.org/D91530
2020-11-20 12:21:42 +03:00
AndreyChurbanov 5644f734d6 Revert "[OpenMP] Add support for Intel's umonitor/umwait"
This reverts commit 9cfad5f9c5.
2020-11-20 12:16:34 +03:00