Commit Graph

364454 Commits

Author SHA1 Message Date
Craig Topper ba319ac47e [X86] Remove a redundant COPY_TO_REGCLASS for VK16 after a KMOVWkr in an isel output pattern.
KMOVWkr produces VK16, there's no reason to copy it to VK16 again.

Test changes are presumably because we were scheduling based on
the COPY that is no longer there.
2020-08-25 15:19:27 -07:00
Shoaib Meenai 22cd6bee4a [llvm-libtool-darwin] Address post-commit feedback
Address James Henderson's comments on https://reviews.llvm.org/D86359.
2020-08-25 15:04:23 -07:00
Dave Lee 66c4880291 Remove unused/misnamed SetObjectModificationTime
Remove `SetObjectModificationTime` which is not currently used, and assigns to the wrong member.

Differential Revision: https://reviews.llvm.org/D86493
2020-08-25 14:49:34 -07:00
Mircea Trofin 7cfcecece0 [MLInliner] Simplify TFUTILS_SUPPORTED_TYPES
We only need the C++ type and the corresponding TF Enum. The other
parameter was used for the output spec json file, but we can just
standardize on the C++ type name there.

Differential Revision: https://reviews.llvm.org/D86549
2020-08-25 14:19:39 -07:00
Stanislav Mekhanoshin b7760c3e5d [AMDGPU] Remove unsound dependency on ISA version in waitcnt
Differential Revision: https://reviews.llvm.org/D86566
2020-08-25 14:01:42 -07:00
Fangrui Song 82d0749749 [TargetLoweringObjectFileImpl] Make .llvmbc and .llvmcmd non-SHF_ALLOC
There are two ways .llvmbc can be produced:

* clang -c -fembed-bitcode=all (which also produces .llvmcmd)
* LTO backend: ld.lld -mllvm -lto-embed-bitcode or -plugin-opt=-lto-embed-bitcode

.llvmbc and .llvmcmd have the SHF_ALLOC flag, so they can be dropped by
--gc-sections.

This patch sets SectionKind::Metadata to drop the SHF_ALLOC flag. This
is conceptually correct: the two sections are not part of the process
image, so SHF_ALLOC is not appropriate.

`test/LTO/X86/embed-bitcode.ll`: changed `llvm-objcopy -O binary --only-section` to
`llvm-objcopy --dump-section`. `-O binary` does not dump non-SHF_ALLOC sections.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D86374
2020-08-25 13:37:29 -07:00
Krzysztof Parzyszek 514d6e9a8d [SDAG] Improve MemSDNode::getBasePtr
It returned getOperand(1), except for STORE for which it returned
getOperand(2). Handle MSTORE, MGATHER, and MSCATTER as well.
2020-08-25 15:19:52 -05:00
aartbik 66e536bc36 [mlir] [LLVMIR] Mark reductions as side-effect free
Attribute was missing from original base class.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D86569
2020-08-25 13:09:19 -07:00
Shilei Tian 0775c1dfbc [OpenMP] Pack first-private arguments to improve efficiency of data transfer
In this patch, we pack all small first-private arguments, allocate and transfer them all at once to reduce the number of data transfer which is very expensive.

Let's take the test case as example.
```
int main() {
  int data1[3] = {1}, data2[3] = {2}, data3[3] = {3};
  int sum[16] = {0};
#pragma omp target teams distribute parallel for map(tofrom: sum) firstprivate(data1, data2, data3)
  for (int i = 0; i < 16; ++i) {
    for (int j = 0; j < 3; ++j) {
      sum[i] += data1[j];
      sum[i] += data2[j];
      sum[i] += data3[j];
    }
  }
}
```
Here `data1`, `data2`, and `data3` are three first-private arguments of the target region. In the previous `libomptarget`, it called data allocation and data transfer three times, each of which allocated and transferred 12 bytes. With this patch, it only calls allocation and transfer once. The size is `(12+4)*3=48` where 12 is the size of each array and 4 is the padding to keep the address aligned with 8. It is implemented in this way:
1. First collect all information for those *first*-private arguments. _private_ arguments are not the case because private arguments don't need to be mapped to target device. It just needs a data allocation. With the patch for memory manager, the data allocation could be very cheap, especially for the small size. For each qualified argument, push a place holder pointer `nullptr` to the `vector` for kernel arguments, and we will update them later.
2. After we have all information, create a buffer that can accommodate all arguments plus their paddings. Copy the arguments to the buffer at the right place, i.e. aligned address.
3. Allocate a target memory with the same size as the host buffer, transfer the host buffer to target device, and finally update all place holder pointers in the arguments `vector`.

The reason we only consider small arguments is, the data transfer is asynchronous. Therefore, for the large argument, we could continue to do things on the host side meanwhile, hopefully, the data is also being transferred. The "small" is defined by that the argument size is less than a predefined value. Currently it is 1024. I'm not sure whether it is a good one, and that is an open question. Another question is, do we need to make it configurable via an environment variable?

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D86307
2020-08-25 16:06:29 -04:00
Stanislav Mekhanoshin 817c831f02 [AMDGPU] Switch to named simm16 in vscnt insertion
Differential Revision: https://reviews.llvm.org/D86568
2020-08-25 13:05:27 -07:00
Ankit Aggarwal 2da1eefb58 [Hexagon] Check if EVT is simple type in HVX lowering 2020-08-25 15:02:44 -05:00
Jonas Devlieghere 521220690a [lldb] Make Reproducer compatbile with SubsystemRAII (NFC)
Make Reproducer compatbile with SubsystemRAII and use it in
LocateSymbolFileTest.
2020-08-25 13:00:04 -07:00
Abhina Sreeskantharajan 97ccf93b36 [SystemZ][z/OS] Add z/OS Target and define macros
This patch adds the z/OS target and defines macros as a stepping stone
towards enabling a native build on z/OS.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D85324
2020-08-25 15:51:59 -04:00
Juneyoung Lee f753f5b050 [ValueTracking] Let getGuaranteedNonPoisonOp find multiple non-poison operands
This patch helps getGuaranteedNonPoisonOp find multiple non-poison operands.

Instead of special-casing llvm.assume, I think it is also a viable option to
add noundef to Intrinsics.td. If it makes sense, I'll make a patch for that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86477
2020-08-26 04:40:21 +09:00
Juneyoung Lee 8e51bb249b [ValueTracking] Add a noundef test for D86477; NFC 2020-08-26 04:40:21 +09:00
Amy Huang b1009ee84f Reland "[DebugInfo] Move constructor homing case in shouldOmitDefinition."
For some reason the ctor homing case was before the template
specialization case, and could have returned false too early.
I moved the code out into a separate function to avoid this.

This reverts commit 05777ab941.
2020-08-25 12:36:11 -07:00
Nikita Popov 3a54b6a4b7 [MemDep] Use BatchAA when computing pointer dependencies
We're not changing IR while running a single MemDep query, so it's
safe to cache alias analysis results using BatchAA. This adds BatchAA
usage to getSimplePointerDependencyFrom(), which is non-intrusive --
covering larger parts (like a whole processNonLocalLoad query) is
also possible, but requires threading BatchAA through a bunch of APIs.

For the ThinLTO configuration, this is a 1% geomean improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D85583
2020-08-25 21:34:34 +02:00
aartbik 84fdc33f47 [mlir] [LLVMIR] Add get active lane mask intrinsic
Provides fast, generic way of setting a mask up to a certain
point. Potential use cases that may benefit are create_mask
and transfer_read/write operations in the vector dialect.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D86501
2020-08-25 12:19:17 -07:00
Wolfgang Pieb e02920fe55 [llvm-mca][NFC] Refactor handling of views that examine individual instructions,
including printing them.

Reviewers: andreadb, lebedev.ri

Differential Review: https://reviews.llvm.org/D86390

Introduces a new base class "InstructionView" that such views derive from.
Other views still use the "View" base class.
2020-08-25 12:12:37 -07:00
clementval 4d69bcb12f [mlir][openacc][NFC] Fix comment about OpenACCExecMapping 2020-08-25 15:11:05 -04:00
peter klausler bce7a7edf3 [flang] Check that various variables referenced in I/O statements may be defined
A number of I/O syntax rules involve variables that will be written to,
and must therefore be definable.  This includes internal file variables,
IOSTAT= and IOMSG= specifiers, most INQUIRE statement specifiers, a few
other specifiers, and input variables.  This patch checks for
these violations, and implements several additional I/O TODO constraint
checks.

Differential Revision: https://reviews.llvm.org/D86557
2020-08-25 12:06:18 -07:00
Kuba Mracek e713b0ecbc [tsan] On arm64e, strip out ptrauth bits from incoming PCs
Differential Revision: https://reviews.llvm.org/D86378
2020-08-25 11:59:36 -07:00
Craig Topper 01eb1233db [X86] Mention -march=sapphirerapids in the release notes.
This was just added in e02d081f2b.
2020-08-25 11:57:34 -07:00
Arthur Eubanks df5576a852 [test] Add -inject-tli-mapping to -loop-vectorize -vector-library tests
The legacy LoopVectorize has a dependency on InjectTLIMappingsLegacy.
That cannot be expressed in the new PM since they are both normal
passes. Explicitly add -inject-tli-mappings as a pass.

Follow-up to https://reviews.llvm.org/D86492.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86561
2020-08-25 11:55:11 -07:00
Lang Hames f436bef507 [examples] Fix dependencies for OrcV2Examples/LLJITWithThinLTOSummaries. 2020-08-25 11:51:20 -07:00
Lang Hames 594107d488 [ORC] Fix an endif comment. 2020-08-25 11:51:20 -07:00
peter klausler a0a1a4e5c8 [flang] Improve error handling for bad characters in source
When an illegal character appears in Fortran source (after
preprocessing), catch and report it in the prescanning phase
rather than leaving it for the parser to cope with.

Differential Revision: https://reviews.llvm.org/D86553
2020-08-25 11:42:19 -07:00
peter klausler 13cee14bb1 [flang] Parse global compiler directives
Accept and represent "global" compiler directives that appear
before and between program units in a source file.

Differential Revision: https://reviews.llvm.org/D86555
2020-08-25 11:41:11 -07:00
Raphael Isemann ef76686916 [lldb] Initialize reproducers in LocateSymbolFileTest
Since a842950b62 this test started using
the reproducer subsystem but we never initialized it in the test. The
Subsystem takes an argument, so we can't use the usual SubsystemRAII at the
moment to do this for us.

This just adds the initialize/terminate calls to get the test passing again.
2020-08-25 20:26:43 +02:00
Raphael Isemann 7de7fe5d0e [lldb] Don't ask for QOS_CLASS_UNSPECIFIED queue in TestQueues
TestQueues is curiously failing for me as my queue for QOS_CLASS_UNSPECIFIED
is named "Utility" and not "User Initiated" or "Default". While debugging, this
I noticed that this test isn't actually using this API right from what I understand. The API documentation
for `dispatch_get_global_queue` specifies for the parameter: "You may specify the value
QOS_CLASS_USER_INTERACTIVE, QOS_CLASS_USER_INITIATED, QOS_CLASS_UTILITY, or QOS_CLASS_BACKGROUND."

QOS_CLASS_UNSPECIFIED isn't listed as one of the supported values. swift-corelibs-libdispatch
even checks for this value and returns a DISPATCH_BAD_INPUT. The
libdispatch shipped on macOS seems to also check for QOS_CLASS_UNSPECIFIED and seems to
instead cause a "client crash", but somehow this doesn't trigger in this test and instead we just
get whatever queue

This patch just removes that part of the test as it appears the code is just incorrect.

Reviewed By: jasonmolenda

Differential Revision: https://reviews.llvm.org/D86211
2020-08-25 20:13:33 +02:00
Wei Wang ae90df8e5a [FIX] Avoid creating BFI when emitting remarks for dead functions
Dead function has its body stripped away, and can cause various
analyses to panic. Also it does not make sense to apply analyses on
such function.

Reviewed By: xazax.hun, MaskRay, wenlei, hoy

Differential Revision: https://reviews.llvm.org/D84715
2020-08-25 11:12:38 -07:00
Kazuaki Ishizaki 40cbb2484d [mlir] NFC: fix typo in FileCheck prefix
CHECL -> CHECK

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D86550
2020-08-26 03:12:14 +09:00
Matt Arsenault 1b3de8812d AArch64: Fix hardcoded register in test 2020-08-25 13:56:39 -04:00
peter klausler ba4cc3b380 [flang] Don't completely left-justify fixed-form tokenization
If the label field is empty, and macro replacement occurs,
the rescanned text might be misclassified as a comment card
if it happens to begin with a C or a D.  Insert a leading
space into these otherwise empty label fields.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47173
2020-08-25 10:53:56 -07:00
Arthur Eubanks 78e4aeb783 [NewPM][test] Fix accelerate-vector-functions.ll under NPM
The legacy SLPVectorizer has a dependency on InjectTLIMappingsLegacy.
That cannot be expressed in the new PM since they are both normal
passes. Explicitly add -inject-tli-mappings as a pass.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86492
2020-08-25 10:50:14 -07:00
David Green 5b7a889a67 [ARM] Additional test for tailpred reductions. NFC 2020-08-25 18:29:15 +01:00
Krzysztof Parzyszek dcef5e0c37 [Hexagon] Remove (redundant) HexagonISelLowering::isHvxOperation(SDValue)
Use isHvxOperation(SDNode*) instead.
2020-08-25 11:45:08 -05:00
Sanjay Patel 21a008bbba [x86] add AVX shuffle test for PR47262; NFC
Goes with D86429
2020-08-25 12:43:09 -04:00
Ta-Wei Tu abbd652dd6 [LoopNest] False negative of `arePerfectlyNested` with LCSSA loops
Summary: The LCSSA pass (required for all loop passes) sometimes adds
additional blocks containing LCSSA variables, and checkLoopsStructure
may return false even when the loops are perfectly nested in this case.
This is because the successor of the exit block of the inner loop now
points to the LCSSA block instead of the latch block of the outer loop.
Examples are shown in the test nests-with-lcssa.ll.

To fix the issue, the successor of the exit block of the inner loop can
now point to a block in which all instructions are LCSSA phi node
(except the terminator), and the sole successor of that block should
point to the latch block of the outer loop.

Reviewed By: Whitney, etiotto

Differential Revision: https://reviews.llvm.org/D86133
2020-08-25 16:20:52 +00:00
David Tenty f8454d60b8 [AIX][compiler-rt][builtins] Don't add ppc builtin implementations that require __int128 on AIX
since __int128 currently isn't supported on AIX.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D85972
2020-08-25 11:35:38 -04:00
Sjoerd Meijer 2002bb4878 [LangRef] Revise semantics of intrinsic get.active.lane.mask
A first version of get.active.lane.mask was committed in rG7fb8a40e5220. One of
the main purposes and uses of this intrinsic is to communicate information from
the middle-end to the back-end, but its current definition and semantics make
this actually very difficult. The intrinsic was defined as:

  @llvm.get.active.lane.mask(%IV, %BTC)

where %BTC is the Backedge-Taken Count (variable names are different in the
LangRef spec). This allows to implicitly communicate the loop tripcount, which
can be reconstructed by calculating BTC + 1. But it has been very difficult to
prove that calculating BTC + 1 is safe and doesn't overflow. We need
complicated range and SCEV analysis, and thus the problem is that this
intrinsic isn't really doing what it was supposed to solve. Examples of the
overflow checks that are required in the (ARM) back-end are D79175 and D86074,
which aren't even complete/correct yet.

To solve this problem, we are revising the definitions/semantics for
get.active.lane.mask to avoid all the complicated overflow analysis. This means
that instead of communicating the BTC, we are now using the loop tripcount. Now
using LangRef's variable names, its semantics is changed from:

  icmp ule (%base + i), %n

to:

  icmp ult (%base + i), %n

with %n > 0 and corresponding to the loop tripcount. The intrinsic signature
remains the same.

Differential Revision: https://reviews.llvm.org/D86147
2020-08-25 16:23:51 +01:00
Sanjay Patel c4f0a0896f [InstCombine] improve demanded element analysis for vector insert-of-extract (2nd try)
The 1st attempt (rG557b890) was reverted because it caused miscompiles.
That bug is avoided here by changing the order of folds and as verified
in the new tests.

Original commit message:
InstCombine currently has odd rules for folding insert-extract chains to shuffles,
so we miss collapsing seemingly simple cases as shown in the tests here.

But poison makes this not quite as easy as we might have guessed. Alive2 tests to
show the subtle difference (similar to the regression tests):
https://alive2.llvm.org/ce/z/hp4hv3 (this is ok)
https://alive2.llvm.org/ce/z/ehEWaN (poison leakage)

SLP tends to create these patterns (as shown in the SLP tests), and this could
help with solving PR16739.

Differential Revision: https://reviews.llvm.org/D86460
2020-08-25 11:19:36 -04:00
Sanjay Patel 11f8d4aa10 [InstCombine] add vector demanded elements tests with shuffles; NFC
The 1st draft of D86460 (reverted) would show miscompiles with these tests
because the undef element tracking went wrong and became visible in the
shuffle masks.
2020-08-25 11:19:35 -04:00
Fangrui Song 25863cc512 [ELF] .note.gnu.property: error for invalid pr_datasize
A n_type==NT_GNU_PROPERTY_TYPE_0 note encodes a program property.
If pr_datasize is invalid, LLD may crash
(https://github.com/ClangBuiltLinux/linux/issues/1141)

This patch adds some error checking, supports big-endian, and add some tests
for invalid n_descsz.

Differential Revision: https://reviews.llvm.org/D86422
2020-08-25 08:05:39 -07:00
Jay Foad 8a1926c67a AMDGPU/GlobalISel: re-auto-generate some test checks 2020-08-25 15:54:22 +01:00
Sjoerd Meijer 8d5f64c4ed [Verifier] Additional check for intrinsic get.active.lane.mask
This adapts the verifier checks for intrinsic get.active.lane.mask to the new
semantics of it as described in D86147. I.e., the second argument %n, which
corresponds to the loop tripcount, must be greater than 0 if it is a constant,
so check that.

Differential Revision: https://reviews.llvm.org/D86301
2020-08-25 15:44:33 +01:00
Kostya Kortchinsky bd5ca4f0ed [scudo][standalone] Skip irrelevant regions during release
With the 'new' way of releasing on 32-bit, we iterate through all the
regions in between `First` and `Last`, which covers regions that do not
belong to the class size we are working with. This is effectively wasted
cycles.

With this change, we add a `SkipRegion` lambda to `releaseFreeMemoryToOS`
that will allow the release function to know when to skip a region.
For the 64-bit primary, since we are only working with 1 region, we never
skip.

Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D86399
2020-08-25 07:41:02 -07:00
Xing GUO 1dc57ada0c [DWARFYAML] Make the 'Attributes' field optional.
This patch makes the 'Attributes' field optional. We don't need to
explicitly specify the 'Attributes' field in the future.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D86537
2020-08-25 22:37:43 +08:00
Sjoerd Meijer 39522b1e10 [SelectionDAG] Legalize intrinsic get.active.lane.mask
This adapts legalization of intrinsic get.active.lane.mask to the new semantics
as described in D86147. Because the second argument is now the loop tripcount,
we legalize this intrinsic to an 'icmp ULT' instead of an ULE when it was the
backedge-taken count.

Differential Revision: https://reviews.llvm.org/D86302
2020-08-25 15:00:10 +01:00
Jeremy Morse 121a49d839 [LiveDebugValues] Add switches for using instr-ref variable locations
This patch adds the -Xclang option
"-fexperimental-debug-variable-locations" and same LLVM CodeGen option,
to pick which variable location tracking solution to use.

Right now all the switch does is pick which LiveDebugValues
implementation to use, the normal VarLoc one or the instruction
referencing one in rGae6f78824031. Over time, the aim is to add fragments
of support in aid of the value-tracking RFC:

  http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html

also controlled by this command line switch. That will slowly move
variable locations to be defined by an instruction calculating a value,
and a DBG_INSTR_REF instruction referring to that value. Thus, this is
going to grow into a "use the new kind of variable locations" switch,
rather than just "use the new LiveDebugValues implementation".

Differential Revision: https://reviews.llvm.org/D83048
2020-08-25 14:58:48 +01:00