Commit Graph

345183 Commits

Author SHA1 Message Date
Stephen Neuendorffer dc120bae46 [MLIR] Do not link mlir-cpu-runner with X86 libs
The three libs where recently added to the `mlir-cpu-runner`'s
`CMakeLists.txt` file. This prevent the runner to compile on other
platform (e.g. Power in my case).  Native codegen is pulled in
by the ExecutionEngine library, so this is redundant in any case.

Differential Revision: https://reviews.llvm.org/D75916
2020-03-11 11:35:59 -07:00
Sergej Jaskiewicz 0197eac333 Temporarily re-apply https://reviews.llvm.org/D74347
It was reverted in 35367e06b8
because it broke the buildbot due to missing libc++abi headers.

https://reviews.llvm.org/D75991 improves the diagnostics, so I hope
the build log will be more informative.
2020-03-11 21:35:14 +03:00
Francesco Petrogalli 4dde9e9b02 [llvm][CodeGen] IR intrinsics for SVE2 contiguous conflict detection instructions.
Summary:
The IR intrinsics are mapped to the following SVE2 instructions:

* WHILERW <Pd>.<T>, <Xn>, <Xm>
* WHILEWR <Pd>.<T>, <Xn>, <Xm>

The intrinsics introduced in this patch are the IR counterpart of the
SVE ACLE functions `svwhilerw` and `svwhilewr` (all data type
variants).

Patch by Maciej Gąbka <maciej.gabka@arm.com>.

Reviewers: kmclaughlin, rengolin

Reviewed By: kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75862
2020-03-11 18:28:02 +00:00
Stanislav Mekhanoshin 9801e5469b [AMDGPU] Disable nested endcf collapse
The assumption is that conditional regions are perfectly nested
and a mask restored at the exit from the inner block will be
completely covered by a mask restored in the outer.

It turns out with our current structurizer this is not always
the case.

Disable the optimization for now, but I want to keep it around
for a while to either try after further structurizer changes or
to move it into control flow lowering where we have more info
and reuse the test.

Differential Revision: https://reviews.llvm.org/D75958
2020-03-11 11:24:20 -07:00
Tim Shen ced0dd8e51 [MLIR] Guard DMA-specific logic with DMA option
Differential Revision: https://reviews.llvm.org/D75963
2020-03-11 11:23:13 -07:00
Juneyoung Lee 8eb2f865c3 [CodeGenPrepare] Fold br(freeze(icmp x, const)) to br(icmp(freeze x, const))
Summary:
This patch helps CodeGenPrepare move freeze into the icmp when it is used by branch.
It reenables generation of efficient conditional jumps.

This is only done when at least one of icmp's operands is constant to prevent the transformation from increasing # of freeze instructions.

Performance degradation of MultiSource/Benchmarks/Ptrdist/yacr2/yacr2.test is resolved with this patch.

Checked with Alive2

Reviewers: reames, fhahn, nlopes

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75859
2020-03-12 03:16:15 +09:00
Sergej Jaskiewicz ed77efeff1 [libc++] [cmake] Better diagnostics for missing abi library headers
Summary:
This is NFC. We only add additional information to the log.

Reviewers: EricWF, ldionne, mclow.lists

Reviewed By: ldionne

Subscribers: kristof.beyls, dexonsmith, danielkiss, mgorny, ldionne, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D75991
2020-03-11 21:02:45 +03:00
Jay Foad a46dba24fa [AMDGPU] Extend macro fusion for ADDC and SUBB to SUBBREV
Summary:
There's a lot of test case churn but the overall effect is to increase
the number of back-to-back v_sub,v_subbrev pairs, which can execute with
no delay even on gfx10.

Reviewers: arsenm, rampitec, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75999
2020-03-11 17:59:21 +00:00
Florian Hahn bc6c8c4bbb [Matrix] Add remark propagation along the inlined-at chain.
This patch adds support for propagating matrix expressions along the
inlined-at chain and emitting remarks at the traversed function scopes.

To motivate this new behavior, consider the example below. Without the
remark 'up-leveling', we would only get remarks in load.h and store.h,
but we cannot generate a remark describing the full expression in
toplevel.cpp, which is the place where the user has the best chance of
spotting/fixing potential problems.

With this patch, we generate a remark for the load in load.h, one for
the store in store.h and one for the complete expression in
toplevel.cpp. For a bigger example, please see remarks-inlining.ll.

    load.h:
    template <typename Ty, unsigned R, unsigned C> Matrix<Ty, R, C> load(Ty *Ptr) {
      Matrix<Ty, R, C> Result;
      Result.value = *reinterpret_cast <typename Matrix<Ty, R, C>::matrix_t *>(Ptr);
      return Result;
    }

    store.h:
    template <typename Ty, unsigned R, unsigned C> void store(Matrix<Ty, R, C> M1, Ty *Ptr) {
       *reinterpret_cast<typename decltype(M1)::matrix_t *>(Ptr) = M1.value;
    }

    toplevel.cpp
    void test(double *A, double *B, double *C) {
      store(add(load<double, 3, 5>(A), load<double, 3, 5>(B)), C);
    }

For a given function, we traverse the inlined-at chain for each
matrix instruction (= instructions with shape information). We collect
the matrix instructions in each DISubprogram we visit. This produces a
mapping of DISubprogram -> (List of matrix instructions visible in the
subpogram). We then generate remarks using the list of instructions for
each subprogram in the inlined-at chain. Note that the list of instructions
for a subprogram includes the instructions from its own subprograms
recursively. For example using the example above, for the subprogram
'test' this includes inline functions 'load' and 'store'. This allows
surfacing the remarks at a level useful to users.

Please note that the current approach may create a lot of extra remarks.
Additional heuristics to cut-off the traversal can be implemented in the
future. For example, it might make sense to stop 'up-leveling' once all
matrix instructions are at the same debug location.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D73600
2020-03-11 17:40:08 +00:00
Alexey Bataev 0d7c8c07d2 [OPENMP][DOCS]Mark depobj as implemented, NFC. 2020-03-11 13:26:01 -04:00
Sterling Augustine 8ffdabdb61 Lazily save initialState of registers during unwind.
Summary:
Copying all of the saved register state on every entry to
parseInstruction is a severe performance contraint, especially
because most of this saved state is never used. On x86 linux
this is about 560 bytes, and will be more on other platforms.

When performance testing libunwind, this memcpy appears at the
top of nearly all our tests.

By only saving this state as needed, we see increasing in performance
of around 2.5% for the ctak test here.

https://github.com/clasp-developers/ctak

Certain internal extremely exception-heavy tasks run in about 2/3
the time.

Note that by stashing the new boolean inside what had been padding in
the original structure, this uses no additional memory.

Subscribers: fedor.sergeev, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D75692
2020-03-11 10:13:33 -07:00
Andrzej Warzynski a9f1583228 [AArch64][SVE] Add the @llvm.aarch64.sve.sel intrinsic
Reviewers: sdesmalen, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75928
2020-03-11 17:05:21 +00:00
Philip Reames e671641844 [GC] Remove buggy untested optimization from statepoint lowering
A downstream test case (see included reduced test) revealed that we have a bug in how we handle duplicate relocations. If we have the same SDValue relocated twice, and that value happens to be a constant (such as null), we only export one of the two llvm::Values. Exporting on a per llvm::Value basis is required to allow lowering of gc.relocates in following basic blocks (e.g. invokes). Without it, we end up with a use of an undefined vreg and bad things happen.

Rather than fixing the optimization - which appears to be hard - I propose we simply remove it. There are no tests in tree that change with this code removed. If we find out later that this did matter for something, we can reimplement a variation of this in CodeGenPrepare to catch the easy cases without complicating the lowering code.

Thanks to Denis and Serguei who did all the hard work of figuring out what went wrong here. The patch is by far the easy part. :)

Differential Revision: https://reviews.llvm.org/D75964
2020-03-11 10:03:24 -07:00
Adrian Prantl 0396aa4c05 Add a decorator option to skip tests based on a default setting.
This patch allows skipping a test based on a default setting, which is
useful when running the testsuite in different "modes" based on a
default setting. This is a feature I need for the Swift testsuite, but
I think it's generally useful.

Differential Revision: https://reviews.llvm.org/D75864
2020-03-11 10:00:36 -07:00
Fangrui Song fbf41b5267 [ELF] Simplify sh_addr computation and warn if sh_addr is not a multiple of sh_addralign
See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign.
`ALIGN(section_align)` now means: "increase alignment to section_align"
(like yet another input section requirement).

The "start of section .foo changes from 0x11 to 0x20" warning no longer
makes sense. Change it to warn if sh_addr%sh_addralign!=0.

To decrease the alignment from the default max_input_align,
use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}`
See linkerscript/section-address-align.test as an example.

When both an output section address and ALIGN are set (can be seen as an
"undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html),
lld may align more than GNU ld, but it makes a linker script working
with GNU ld hard to break with lld.

This patch can be considered as restoring part of the behavior before D74736.

Differential Revision: https://reviews.llvm.org/D75724
2020-03-11 09:35:42 -07:00
James Henderson 2150a6d0d6 [Object][unittest] Skip tests on machines with non-64 bit size_t
Speculative fix for build bot failures such as
http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/14317/
2020-03-11 15:31:30 +00:00
David Green 72bf26feb3 [ARM] Extra VFMA tests. NFC 2020-03-11 15:14:07 +00:00
Haojian Wu d83ade4506 [clangd] Improve the "max limit" error message in rename, NFC.
previously, we emited "exceeds the max limit 49" which was weird, now we
emit "exceeds the max limit 50".
2020-03-11 16:13:46 +01:00
Matt Arsenault a2202f6a3f AMDGPU/GlobalISel: Manually RegBankSelect copies
This was failng on any pre-assigned copy to the VCC bank.

This is something of a workaround for the default implementation in
getInstrMappingImpl, and how it treats copy-like operations in
general.

Copy-like operations are considered to only have one result register
bank, rather than separate banks for each source like a normal
instruction. To avoid potentially mishandling reg_sequence with
impossible operand combinations, the generic implementation errors on
impossible costs. If the bank was already assigned, is treated it
as-if it were an unsatisfiable REG_SEQUENCE mapping. We really don't
get any value from any of what getInstrMappingImpl tries to do for
copies, so just directly emit the simple mapping we really want.
2020-03-11 11:12:12 -04:00
Christian Sigg fc421d7ca3 [MLIR] Remove all-reduce lowering from GPU to NVVM. Use in-dialect lowering instead.
Reviewers: herhut, mravishankar

Reviewed By: herhut

Subscribers: merge_guards_bot, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73794
2020-03-11 15:17:54 +01:00
Christian Sigg f3ad6eb5d3 Change to individual pretty printer classes, remove generic `make_printer`.
Summary: Follow-up from D72589.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: merge_guards_bot, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73609
2020-03-11 15:04:03 +01:00
Hubert Tong b94d4b1903 [unittests][Object] Use matching signedness for expected value
Speculative fix for buildbot breakage:
http://lab.llvm.org:8011/builders/clang-ppc64le-rhel/builds/1899/steps/ninja%20check%201/logs/stdio

D75742 introduces checks that cause bots to complain about comparing
values where the integer types mismatch on signedness.

This patch makes the expected value unsigned in various cases (since the
value being tested is unsigned).
2020-03-11 09:58:10 -04:00
Artem Dergachev edbf2fde14 [analyzer] Fix a strange compile error on a certain Clang-7.0.0
error: default initialization of an object of const type
       'const clang::QualType' without a user-provided
       default constructor

  Irrelevant; // A placeholder, whenever we do not care about the type.
  ^
            {}
2020-03-11 16:54:34 +03:00
Joachim Protze 31c85ca06d [compiler-rt][tsan] Make fiber support in thread sanitizer dynamic linkable
This patch will allow dynamic libraries to call into the fiber support functions
introduced in https://reviews.llvm.org/D54889

Differential Revision: https://reviews.llvm.org/D74487
2020-03-11 14:14:33 +01:00
Alexey Bataev c422d69b1a [LIBOMPTARGET]Fix PR45139: Bug in mixing Python and OpenMP target offload.
Summary: Explicitly initialize data members of RTLsTy class upon construction.

Reviewers: grokos

Subscribers: guansong, openmp-commits, caomhin, kkwli0

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75946
2020-03-11 09:12:02 -04:00
Valentin Clement c7380995f8 [MLIR] Add `and`, `or`, `xor`, `min`, `max` too gpu.all_reduce and the nvvm lowering
Summary:
This patch add some builtin operation for the gpu.all_reduce ops.
- for Integer only: `and`, `or`, `xor`
- for Float and Integer: `min`, `max`

This is useful for higher level dialect like OpenACC or OpenMP that can lower to the GPU dialect.

Differential Revision: https://reviews.llvm.org/D75766
2020-03-11 14:07:04 +01:00
Stephan Herhut f6790a1c63 Revert "[MLIR] Add `and`, `or`, `xor`, `min`, `max` too gpu.all_reduce and the nvvm lowering"
Attribution to original author got lost.
2020-03-11 14:07:04 +01:00
Jonathan Coe 1fb9c29833 [clang-format] Improved identification of C# nullables
Summary:
Allow `?` inside C# generics.

Do not mistake casts like `(Type?)` as conditional operators.

Reviewers: krasimir

Subscribers: cfe-commits, MyDeveloperDay

Tags: #clang-format, #clang

Differential Revision: https://reviews.llvm.org/D75983
2020-03-11 12:58:52 +00:00
Jonathan Coe 5c917bd9a7 [clang-format] No space in `new()` and `this[Type x]` in C#
Reviewers: krasimir

Reviewed By: krasimir

Subscribers: cfe-commits, MyDeveloperDay

Tags: #clang-format, #clang

Differential Revision: https://reviews.llvm.org/D75984
2020-03-11 12:53:53 +00:00
Sam Parker 51cad66e97 [NFC][ARM] Add test
Precommit test for LowOverheadLoops.
2020-03-11 11:51:52 +00:00
Sam Parker d941df363d [NFC][ARM] Reorder some logic
Move some logic around in LowOverheadLoop::ValidateLiveOut
2020-03-11 11:40:09 +00:00
Simon Pilgrim b3b4727a3e [X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic opcodes (PR39467)
For i32 and i64 cases, X86ISD::SHLD/SHRD are close enough to ISD::FSHL/FSHR that we can use them directly, we just need to account for the operand commutation for SHRD.

The i16 SHLD/SHRD case is annoying as the shift amount is modulo-32 (vs funnel shift modulo-16), so I've added X86ISD::FSHL/FSHR equivalents, which matches the generic implementation in all other terms.

Something I'm slightly concerned with is that ISD::FSHL/FSHR legality is controlled by the Subtarget.isSHLDSlow() feature flag - we don't normally use non-ISA features for this but it allows the DAG combines to continue to operate after legalization in a lot more cases.

The X86 *bits.ll changes are all affected by the same issue - we now have a "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" simplification that reduces the dependencies enough for the branch fall through code to mess up.

Differential Revision: https://reviews.llvm.org/D75748
2020-03-11 11:17:49 +00:00
Peter Smith 6d5603e2d2 [LLD][ELF] Add initial LLD LinkerScript docs page
LLD implements Linker Scripts as they are described in the GNU ld manual.
This description is far from a specification, with the only true reference
the GNU ld implementation, which has undocumented behaviour that can vary
from release to release.

To make it easy for people to switch between linkers we try to follow GNU
ld implementation details wherever possible. We reserve the right to make
our own decisions where the undocumented GNU ld behaviour is not
appropriate for LLD. We don't have a place to document these decisions and
it can be difficult for users to find out this information.

This file is a statement of the LLD implementation policy and will contain
intentional deviations from GNU ld.

The first patch that will add concrete details to this file is D75724

Differential Revision: https://reviews.llvm.org/D75921
2020-03-11 10:56:12 +00:00
LLVM GN Syncbot 8d9886f893 [gn build] Port 326bc1da45 2020-03-11 10:47:56 +00:00
James Henderson 326bc1da45 [Object] Fix handling of large archive members
The archive library truncated the size of archive members whose size was
greater than max uint32_t. This patch fixes the issue and adds some unit
tests to verify.

Reviewed by: ruiu, MaskRay, grimar, rupprecht

Differential Revision: https://reviews.llvm.org/D75742
2020-03-11 10:29:45 +00:00
Anna Welker a6d3bec83f [TTI][ARM][MVE] Refine gather/scatter cost model
Refines the gather/scatter cost model, but also changes the TTI
function getIntrinsicInstrCost to accept an additional parameter
which is needed for the gather/scatter cost evaluation.
This did require trivial changes in some non-ARM backends to
adopt the new parameter.
Extending gathers and truncating scatters are now priced cheaper.

Differential Revision: https://reviews.llvm.org/D75525
2020-03-11 10:23:41 +00:00
Victor Campos 8a12553223 [ARM] Improve codegen of volatile load/store of i64
Summary:
Instead of generating two i32 instructions for each load or store of a volatile
i64 value (two LDRs or STRs), now emit LDRD/STRD.

These improvements cover architectures implementing ARMv5TE or Thumb-2.

The code generation explicitly deviates from using the register-offset
variant of LDRD/STRD. In this variant, the register allocated to the
register-offset cannot be reused in any of the remaining operands. Such
restriction seems to be non-trivial to implement in LLVM, thus it is
left as a to-do.

Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers

Reviewed By: efriedma, nickdesaulniers

Subscribers: danielkiss, alanphipps, hans, nathanchance, nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70072
2020-03-11 10:19:27 +00:00
QingShan Zhang 9304decdee [NFC][Test] Add a PowerPC test to verify the behavior of a*b +/- c*d 2020-03-11 09:35:40 +00:00
Sebastian Neubauer 2f857eadf5 [AMDGPU] Use script to generate atomic optimizations test
This is a preparation for introducing a llvm.amdgcn.ballot intrinsic in
D65088.
2020-03-11 09:59:36 +01:00
QingShan Zhang 5edf900da0 [NFC][Test] Format the test PowerPC/recipest.ll with update_llc_test_checks.py 2020-03-11 08:49:53 +00:00
Jonas Devlieghere 4016c6b07f [lldb/Reproducer] Prevent crash when GDB multi-loader can't be created.
Check that the multi loader isn't null and print an error otherwise.
This patch also extends the test to cover these error paths.
2020-03-10 23:16:55 -07:00
Akira Hatanaka 37fa9d65ea [CodeGen][ObjC] Don't extend lifetime of ObjC pointers passed to calls
to __builtin_os_log_format if ARC isn't enabled

Fixes a bug introduced in this commit:
f4d791f833

rdar://problem/60301219
2020-03-10 22:10:32 -07:00
Serge Pavlov 14a1b80e04 Make IEEEFloat::roundToIntegral more standard conformant
Behavior of IEEEFloat::roundToIntegral is aligned with IEEE-754
operation roundToIntegralExact. In partucular this function now:
- returns opInvalid for signaling NaNs,
- returns opInexact if the result of rounding differs from argument.

Differential Revision: https://reviews.llvm.org/D75246
2020-03-11 10:38:46 +07:00
Matt Arsenault c0ad75e758 GlobalISel: Don't try to narrow extending loads/trunc store
If the loaded memory size was smaller than the result size, this would
produce out of bounds memory accesses. I'm wondering if we need a
distinct narrow memory legalize action type, since a case I care about
is decomposing a 4-byte unaligned access into 4 extending loads, which
would leave the original result register type. I'm currently awkwardly
using narrowScalar to handle unaligned accesses that need to be split.
2020-03-10 23:34:10 -04:00
Matt Arsenault b17a81f8b2 GlobalISel: Add missing add/sub with carries to MachineIRBuilder 2020-03-10 22:39:55 -04:00
Matt Arsenault 206d46a192 AMDGPU/GlobalISel: Add some tests that used to infinite loop 2020-03-10 22:12:56 -04:00
Leonard Chan 1c70dec18c [libunwind] Remove __FILE__ and __LINE__ from error reporting
We were seeing non-deterministic binary size differences depending on which
toolchain was used to build fuchsia. This is because libunwind embeded the
FILE path into a logging macro, even for release builds, which makes the code
dependent on the build directory.

This removes the file and line number from the error message. This is
consistent with how other runtimes report error, e.g.
https://github.com/llvm/llvm-project/blob/master/libcxxabi/src/abort_message.cpp#L30.

Differential Revision: https://reviews.llvm.org/D75890
2020-03-10 18:58:41 -07:00
Hubert Tong 48121a5743 [cmake] Link libclangDaemonTweaks with clangFormat
Speculative fix for buildbot failure in
http://lab.llvm.org:8011/builders/clang-ppc64le-rhel/builds/1881/steps/build%20stage%201/logs/stdio

Cause appears to be D75716.
2020-03-10 21:31:10 -04:00
Paula Toth 54928ba0ec [clang-tidy] Use more widely available headers for protability-restrict-system-includes-check's test 2020-03-10 16:54:11 -07:00
Richard Smith 4cba668ac1 Fix crash-on-invalid when trying to recover from a function template
being deleted on its second or subsequent declaration.
2020-03-10 16:34:27 -07:00