Commit Graph

389273 Commits

Author SHA1 Message Date
Lang Hames 2b45895df4 [JITLink] Move some Block bitfields into Addressable to improve packing.
Keeping these bitfields from Block to Addressable allows them to be packed with
the bitfields at the end of Addressable, reducing the size of Block by eight
bytes.
2021-05-22 07:59:24 -07:00
Yaxun (Sam) Liu bf6124580d [HIP] support ThinLTO
Add options -[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for AMDGPU target.

AMDGPU target does not support codegen of object files containing
call of external functions, therefore the LLVM module passed to
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow function importer to import
functions with noinline attribute.

HIP toolchain passes proper LLVM options to lld to make sure
function importer imports definitions of all the callees.

Reviewed by: Teresa Johnson, Artem Belevich

Differential Revision: https://reviews.llvm.org/D99683
2021-05-22 10:48:34 -04:00
Butygin 0dd36f81b9 [mlir][linalg][nfc] Fix signed/unsigned comparison warning in header
Differential Revision: https://reviews.llvm.org/D102968
2021-05-22 17:18:01 +03:00
Nikita Popov 9a9421a461 Reapply [InstCombine] Fold multiuse shr eq zero
This was reverted due to performance regressions in ARM benchmarks,
which have since been addressed by D101196 (SCEV analysis improvement)
and D101778 (CGP reverse transform).

-----

The single-use case is handled implicity by converting the icmp
into a mask check first. When comparing with zero in particular,
we don't need the one-use restriction, as we only produce a single
icmp.

https://alive2.llvm.org/ce/z/MSixcm
https://alive2.llvm.org/ce/z/GwpG0M
2021-05-22 14:46:50 +02:00
David Green 211ce51f27 [ARM] Clean up some tests, removing dead instructions. NFC 2021-05-22 13:38:00 +01:00
Butygin 4184018253 [mlir][SCF] Canonicalize nested ParallelOp's
Differential Revision: https://reviews.llvm.org/D102799
2021-05-22 14:00:00 +03:00
Simon Pilgrim 7a898477bb [CostModel][X86] vXi8 MUL is always promoted to vXi16 2021-05-22 11:56:49 +01:00
Navdeep Kumar e552fa28da [MLIR][GPU] Add CUDA Tensor core WMMA test
Add a test case to test the complete execution of WMMA ops on a Nvidia
GPU with tensor cores. These tests are enabled under
MLIR_RUN_CUDA_TENSOR_CORE_TESTS.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95334
2021-05-22 16:19:36 +05:30
Uday Bondhugula 3597b2c37d [MLIR] Drop stale reference to mlir-edsc-builder-api-test
Drop stale reference to mlir-edsc-builder-api-test.

Differential Revision: https://reviews.llvm.org/D102967
2021-05-22 16:11:29 +05:30
Florian Hahn a6de8d95db
[Matrix] Bail out early if there are no matrix intrinsics.
If there are no matrix intrinsics in a function, we can directly bail
out, as there's nothing left to do.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D102931
2021-05-22 11:37:25 +01:00
Simon Pilgrim 02918f1079 [CostModel][X86] Add test coverage for sub-64bit vXi8 multiplication costs
These can be cheaply promoted to a single v8i16 vector for multiplication
2021-05-22 11:33:36 +01:00
Simon Pilgrim 9bd0dc83b5 [CostModel][X86] Improve v8i32 MUL costs on AVX1 targets to account for slower btver2
BTVER2 has a 2 cycle throughput for v4i32 multiplies (same as SSE41 targets), which is only partially hidden by the subvector extracts/insert when splitting v8i32.
2021-05-22 11:13:07 +01:00
Butygin 9afbca746b [mlir] ConvertStandardToLLVM: make AllocLikeOpLowering public
It is useful for someone who wants to implement custom AllocOp LLVM lowering

Differential Revision: https://reviews.llvm.org/D102932
2021-05-22 12:57:45 +03:00
Tomasz Miąsko 75cc1cf018 [Demangle][Rust] Parse function signatures
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102581
2021-05-22 11:49:08 +02:00
Tomasz Miąsko e4fa6c95ac [Demangle][Rust] Parse references
Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580
2021-05-22 11:49:08 +02:00
Tomasz Miąsko 6aac56336d [Demangle][Rust] Parse raw pointers
Reviewed By: dblaikie

Part of https://reviews.llvm.org/D102580
2021-05-22 11:49:08 +02:00
Nikita Popov 069174a634 [CVP] Add test for PR50399 (NFC) 2021-05-22 11:21:34 +02:00
Roman Lebedev 8ed0864fd7
Reland [X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()
Now that getMemoryOpCost() correctly handles all the vector variants,
we should no longer hand-roll our own version of it, but use it directly.

The AVX512 variant probably needs a similar change,
but there it is less obvious.

This was initially landed in 69ed93a435,
but was reverted in 6b95fd199d
because the patch it depends on was reverted.
2021-05-22 11:47:08 +03:00
Roman Lebedev 05a4e4a89c
Reland [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Instead of handling power-of-two sized vector chunks,
try handling the large vector in a stream mode,
decreasing the operational vector size
once it no longer works for the elements left to process.

Notably, this improves costs for overaligned loads - loading padding is fine.
This more directly tracks when we need to insert/extract the YMM/XMM subvector,
some costs fluctuate because of that.

This was initially landed in c02476f315,
but reverted in 5fddc3312b,
because the code made some very optimistic assumptions about invariants
that didn't hold in practice.

Reviewed By: RKSimon, ABataev

Differential Revision: https://reviews.llvm.org/D100684
2021-05-22 11:46:32 +03:00
LemonBoy fd5cc41818 [SelectionDAG] Fix argument copy elision with irregular types
D29668 enabled to avoid a useless copy of the argument value into an alloca if the caller places it in memory (as it often happens on x86) by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced the upper bits are filled with garbage and produce code misbehaving at runtime.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102153
2021-05-22 09:43:37 +02:00
Serge Pavlov c9c05a91c4 [ConstantFolding] Use APFloat for constant folding. NFC
Replace use of host floating types with operations on APFloat when it is
possible. Use of APFloat makes analysis more convenient and facilitates
constant folding in the case of non-default FP environment.

Differential Revision: https://reviews.llvm.org/D102672
2021-05-22 13:00:20 +07:00
Michael Kruse 86008477a4 [Polly] Avoid compiler warning. NFC.
Avoid the warning

    /polly/lib/Support/RegisterPasses.cpp:833:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
      default:
      ^

since all cases are now handled.

Thanks to Luke Benes for reporting.
2021-05-22 00:21:20 -05:00
Lang Hames 4272fca2db [ORC] Check for underflow on SymbolStringPtr ref-counts. 2021-05-21 21:11:54 -07:00
Lang Hames 20634ece15 [ORC] Fix debugging output: printDescription should not have a newline. 2021-05-21 21:11:54 -07:00
Lang Hames fda4300da8 [ORC] Fix race condtition in CoreAPIsTest.
This test has been failing intermittently on some builders, probably due to a
race on the WorkThreads vector. This patch should fix that.
2021-05-21 21:11:54 -07:00
Fangrui Song 7f0acc4e4f [docs] ld.lld.1: Mention -z nostart-stop-gc 2021-05-21 19:57:51 -07:00
Fangrui Song 5d9ea36baf [UpdateTestChecks] Default --x86_scrub_rip to False
True is a bad default: the useful symbol names and `@GOTPCREL` are scrubbed.

Change the default and add global variable tests to x86-basic.ll
(renamed from x86_function_name.ll since we now also test variables).
I updated some tests to show the differences.

Updated LCPI regex to include Darwin style `LCPI_[0-9]+_[0-9]+` (no
leading dot).

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D102588
2021-05-21 19:26:15 -07:00
peter klausler e162dc6f28 [flang] Fix symbol table bugs with ENTRY statements
Dummy arguments of ENTRY statements in execution parts were
not being created as objects, nor were they being implicitly
typed.

When the symbol corresponding to an alternate ENTRY point
already exists (by that name) due to having been referenced
in an earlier call, name resolution used to delete the extant
symbol.  This isn't the right thing to do -- the extant
symbol will be pointed to by parser::Name nodes in the parse
tree while no longer being part of any Scope.

Differential Review: https://reviews.llvm.org/D102948
2021-05-21 17:45:37 -07:00
Lang Hames 40df1b15b4 [ORC][C-bindings] Replace LLVMOrcJITTargetMachineBuilderDisposeTargetTriple.
The implementation and intent behind freeing the triple string here is the same
as LLVMGetDefaultTargetTriple (and any other owned c string returned from the C
API), so we should use LLVMDisposeMessage for to free the string for
consistency.

Patch by Mats Larsen -- thanks Mats!

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D102957
2021-05-21 17:38:06 -07:00
Arthur Eubanks f7788e1bff Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes"
This reverts commit d14d84af2f.

Causes unacceptable memory regressions.
2021-05-21 16:38:03 -07:00
Arthur Eubanks a52530dd6a Revert "[NPM] Do not run function simplification pipeline unnecessarily"
This reverts commit 97ab068034.

Depends on D100917, which is to be reverted.
2021-05-21 16:38:02 -07:00
Eli Friedman f8e7b28c99 [NewPM] Mark BitcodeWriter as required.
The textual IR writer has an equivalent marking.  It looks like this got
missed in e6ea877.
2021-05-21 16:14:09 -07:00
Vitaly Buka 5992823008 [NFC][sanitizer] Remove unused variable 2021-05-21 16:11:51 -07:00
Vitaly Buka 01c5904907 [lit] Print full googletest commad line
Similar to regular output of LIT tests:
c162f086ba/llvm/utils/lit/lit/TestRunner.py (L1569)

Differential Revision: https://reviews.llvm.org/D102899
2021-05-21 16:11:51 -07:00
Nick Desaulniers 033138ea45 [IR] make stack-protector-guard-* flags into module attrs
D88631 added initial support for:

- -mstack-protector-guard=
- -mstack-protector-guard-reg=
- -mstack-protector-guard-offset=

flags, and D100919 extended these to AArch64. Unfortunately, these flags
aren't retained for LTO. Make them module attributes rather than
TargetOptions.

Link: https://github.com/ClangBuiltLinux/linux/issues/1378

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D102742
2021-05-21 15:53:30 -07:00
Andrew Young ab3cd2601b
[mlir][docs] Add memref and sparse_tensor to Passes.md
These pass documents belong on the main pass page, and not generated as
top level pages.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D102947
2021-05-21 15:23:39 -07:00
Sam Clegg 8544b40b6e [lld][WebAssembly] Fix for PIC output + TLS + non-shared-memory
Prior to this change build with `-shared/-pie` and using TLS (but
without -shared-memory) would hit this assert:

  "Currenly only a single data segment is supported in PIC mode"

This is because we were not including TLS data when merging data
segments.  However, when we build without shared-memory (i.e.  without
threads) we effectively lower away TLS into a normal active data
segment.. so we were ending up with two active data segments: the merged
data, and the lowered TLS data.

To fix this problem we can instead avoid combining data segments at
all when running in shared memory mode (because in this case all
segment initialization is passive).  And then in non-shared memory
mode we know that TLS has been lowered and therefore we can can
and should combine all segments.

So with this new behavior we have two different modes:

1. With shared memory / mutli-threaded: Never combine data segments
   since it is not necessary.  (All data segments as passive already).

2. Wihout shared memory / single-threaded: Combine *all* data segments
   since we treat TLS as normal data.  (We end up with a single
   active data segment).

Differential Revision: https://reviews.llvm.org/D102937
2021-05-21 15:16:47 -07:00
Jon Roelofs cc9c895d88 [compiler-rt][profile] Explicitly specify PROFILE_SOURCES extensions. NFC 2021-05-21 14:46:08 -07:00
Yaxun (Sam) Liu 91dfd68e90 [NFC][HIP] fix comments in __clang_hip_cmath.h 2021-05-21 17:44:18 -04:00
Vitaly Buka f50b87e9ef [NFC][sanitizer] Fix android bot after D102815
https://lab.llvm.org/buildbot/#/builders/77/builds/6519
2021-05-21 14:08:04 -07:00
Martin Storsjö 4468e5b899 [clang] Don't pass multiple backend options if mixing -mimplicit-it and -Wa,-mimplicit-it
If multiple instances of the -arm-implicit-it option is passed to
the backend, it errors out.

Also fix cases where there are multiple -Wa,-mimplicit-it; the existing
tests indicate that the last one specified takes effect, while in
practice it passed double options, which didn't work as intended.

Differential Revision: https://reviews.llvm.org/D102812
2021-05-22 00:05:31 +03:00
Saleem Abdulrasool 6c6b3e3afe RISCV: add a few deprecated aliases for CSRs
This adds the {s,u,m}badaddr CSR aliases as well as the sptbr alias.
These are for compatibility with binutils.  Furthermore, these are used
by the RISC-V Proxy Kernel and are required to enable building the Proxy
Kernel with the LLVM IAS.

The aliases here are deprecated.  These are being introduced in order to
provide a compatibility story for the existing GNU toolchain, which
still supports the deprecated spelling in the assembler.  However, in
order to encourage the migration of existing coding, we provide warnings
indicating that the aliased CSRs are deprecated and should be replaced.

Differential Revision: https://reviews.llvm.org/D101919
Reviewed By: Craig Topper
2021-05-21 13:52:58 -07:00
AndreyChurbanov aa6e7e8da8 [OpenMP] libomp: move warnings to after library initialization
Warnings on deprecated api cannot be suppressed if the library is not initialized.
With this change it is possible to set KMP_WARNINGS=false to suppress the warnings.

Differential Revision: https://reviews.llvm.org/D102676
2021-05-21 23:47:23 +03:00
Axel Y. Rivera 4fb131b497 [LLD][COFF] PR49068: Include the IMAGE_REL_BASED_HIGHLOW relocation base type when the machine is 64 bits and the relocation type is ADDR32
The COFF driver produces an ABSOLUTE relocation base for an ADDR32
relocation type and the system is 64 bits (machine=AMD64). The
relocation information won't be added in the output and could
produce an incorrect address access during run-time. This change
set checks if the relocation type is IMAGE_REL_AMD64_ADDR32 and
if so, adds the relocated symbol as IMAGE_REL_BASED_HIGHLOW base.

Differential Revision: https://reviews.llvm.org/D96619
2021-05-21 23:45:55 +03:00
Arthur Eubanks 7a29a12301 [Verifier] Move some atomicrmw/cmpxchg checks to instruction creation
These checks already exist as asserts when creating the corresponding
instruction. Anybody creating these instructions already need to take
care to not break these checks.

Move the checks for success/failure ordering in cmpxchg from the
verifier to the LLParser and BitcodeReader plus an assert.

Add some tests for cmpxchg ordering. The .bc files are created from the
.ll files with an llvm-as with these checks disabled.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102803
2021-05-21 13:41:17 -07:00
zoecarver 8110a73164 [libcxx][gardening] Re-order includes across libcxx.
This commit alphabetizes all includes in libcxx. This is a NFC.

This can also serve as a pseudo "announcement" for how we should order these headers going forward (note: double underscores go before other headers).

Differential Revision: https://reviews.llvm.org/D102941
2021-05-21 13:22:10 -07:00
David Goldblatt 3c4b79481d [InstSimplify] add tests for rem-of-mul; NFC
These are baseline tests for D102864
2021-05-21 15:46:39 -04:00
Aart Bik c194b49c9c [mlir][sparse] add full dimension ordering support
This revision completes the "dimension ordering" feature
of sparse tensor types that enables the programmer to
define a preferred order on dimension access (other than
the default left-to-right order). This enables e.g. selection
of column-major over row-major storage for sparse matrices,
but generalized to any rank, as in:

dimOrdering = affine_map<(i,j,k,l,m,n,o,p) -> (p,o,j,k,i,l,m,n)>

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102856
2021-05-21 12:35:13 -07:00
Vitaly Buka bbdabb044d [NFC][lit] Add missing UNRESOLVED test
D102899 will change it behavour.
2021-05-21 11:34:00 -07:00
Vitaly Buka 3294001304 [NFC][lit] Add skipped test into upstream format
Missing from D102694
2021-05-21 11:34:00 -07:00