Commit Graph

389495 Commits

Author SHA1 Message Date
Sanjay Patel 9e43b1e9a1 [InstCombine] avoid 'tmp' usage in test files; NFC
The update script ( utils/update_test_checks.py ) warns against this.
2021-05-26 08:32:07 -04:00
Sanjay Patel b70fe92f08 [InstCombine] avoid 'tmp' usage in test file; NFC
The update script ( utils/update_test_checks.py ) warns against this.
2021-05-26 08:32:07 -04:00
Max Kazantsev 0de553dce0 Revert "Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration""
This reverts commit 43d2e51c2e.

Commited wrong version.
2021-05-26 19:29:07 +07:00
Max Kazantsev 43d2e51c2e Return "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration"
The patch was reverted due to compile time impact of contextual SCEV
queries. It also appeared that it introduced a miscompile on irreducible CFG.

Changes made:
1. isKnownPredicateAt is replaced with more lightweight isKnownPredicate;
2. Irreducible CFG in live code is now detected and excluded from processing.

Differential Revision: https://reviews.llvm.org/D102615
2021-05-26 19:23:21 +07:00
Adrian Kuegel dee46d0829 [mlir] Fold complex.create(complex.re(op), complex.im(op))
Differential Revision: https://reviews.llvm.org/D103148
2021-05-26 14:02:53 +02:00
Andrew Savonichev 8ac66d61ea [AArch64] Generate LD1 for anyext i8 or i16 vector load
The existing LD1 patterns do not cover cases where result type does
not match the memory type. This happens when illegal vector types are
extended and scalarized, for example:

  load <2 x i16>* %v2i16

is lowered into:

  // first element
  (v4i32 (insert_subvector (v2i32 (scalar_to_vector (load anyext from i16)))))
  // other elements
  (v4i32 (insert_vector_elt (i32 (load anyext from i16)) idx))

Before this patch these patterns were compiled into LDR + INS.
Now they are compiled into LD1.

The problem was reported in
PR24820: LLVM Generates abysmal code in simple situation.

Differential Revision: https://reviews.llvm.org/D102938
2021-05-26 14:44:21 +03:00
Max Kazantsev 5fb58d4598 [Test] Add Loop Deletion test with irreducible CFG
Authored by Mikael Holmén. It demonstrated miscompile on irreducible
CFG with patch "[LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration".
The patch is reverted. Checking in the test to make sure this bug
does not return.
2021-05-26 18:40:14 +07:00
Sven van Haastregt ba0fe85ec0 [OpenCL] Include header for atomic-ops test
Avoid duplicating the memory_order and memory_scope enum definitions.
2021-05-26 12:32:07 +01:00
Tomas Matheson ab8c44112c [MC] Move elf-unique-sections-by-flags.ll to X86/ 2021-05-26 12:28:17 +01:00
pooja2299 cebdf5d846 [Docs] Updated the content of getting started documentation under llvm/lib/MC
Wrote about llvm/lib/MC subproject on https://llvm.org/docs/GettingStarted.html page.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D101047
2021-05-26 16:25:26 +05:30
Tomas Matheson 165321b3d2 [MC][ELF] Emit unique sections for different flags
Global values imply flags such as readable, writable, executable for the
sections that they will be placed in. Currently MC places all such
entries into the same section, using the first set of flags seen. This
can lead to situations in LTO where a writable global is placed in the
same named section as a readable global from another file, and the
section may not be marked writable.

D72194 ensures that mergeable globals with explicit sections are placed
in separate sections with compatible entry size, by emitting the
`unique` assembly syntax where appropriate. This change extends that
approach to include section flags, so that globals with different
section flags are emitted in separate unique sections.

Differential revision: https://reviews.llvm.org/D100944
2021-05-26 11:51:29 +01:00
Tomas Matheson e79e8041c5 [MC][NFCI] Factor out ELF section unique ID calculation
Precursor to D100944. The logic for determining the unique ID had become
quite difficult to reason about, so I have factored this out into a
separate function.

Differential Revision: https://reviews.llvm.org/D102336
2021-05-26 11:51:29 +01:00
Pushpinder Singh a2d6ef5876 [AMDGPU][Libomptarget] Inline atmi_init/atmi_finalize
After D102847, these functions can be inlined.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103075
2021-05-26 10:50:08 +00:00
Pushpinder Singh cc8661ac4a [AMDGPU][Libomptarget] Delete g_atmi_initialized
This patch drops g_atmi_initialized and inlines the Initialize &
Finalize methods from Runtime class.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102847
2021-05-26 10:46:54 +00:00
Raphael Isemann 76e47d4887 [lldb][NFC] Use C++ versions of the deprecated C standard library headers
The C headers are deprecated so as requested in D102845, this is replacing them
all with their (not deprecated) C++ equivalent.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D103084
2021-05-26 12:46:12 +02:00
Simon Pilgrim 21aec4fdc5 [X86][SLM] Fix vector PSHUFB + variable shift resource/throughputs
Match whats documented in the Intel AOM (+Agner) - PSHUFB xmm is really slow, and mmx/xmm vector shifts are half rate.

Noticed while working to get the cost tables to more closely match llvm-mca analysis, in this case for shifts and truncations.
2021-05-26 11:14:21 +01:00
Florian Hahn 2a41d702be
[SCEV] Add tests with signed predicates for applyLoopGuards. 2021-05-26 11:10:11 +01:00
Pushpinder Singh 7648b6978e [AMDGPU][Libomptarget] Move Kernel/Symbol info tables to RTLDeviceInfoTy
Two globals KernelInfoTable & SymbolInfoTable are moved
into RTLDeviceInfoTy class.
This builds on the top of D102691.
[2/2]

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102692
2021-05-26 10:02:28 +00:00
Kerry McLaughlin 6b0fe3c63b [NFC] Add CHECK lines for unordered FP reductions
An additional RUN line has been added to both strict-fadd.ll &
scalable-strict-fadd.ll to ensure the correct behaviour of these
tests where `-enable-strict-reductions` is false.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D103015
2021-05-26 11:00:20 +01:00
Mirko Brkusanin 9601849984 [AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks
This function can change regbank for registers which already have a selected
bank. Depending on the instruction where these registers were used it can
cause instruction selection to fail.

Differential Revision: https://reviews.llvm.org/D98515
2021-05-26 11:57:41 +02:00
Mirko Brkusanin 7386ad4e9e Revert "[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg banks"
This reverts commit 18c5444702.
2021-05-26 11:57:41 +02:00
Fraser Cormack 7e27e4273d [RISCV] Pre-commit fixed-length mask vselect tests
These are default-expanded but later unrolled due to RISC-V's vector
boolean content policy. A patch to improve this codegen will follow
shortly.
2021-05-26 10:44:45 +01:00
Max Kazantsev 7ee863b8eb [Test] Add simplified versions of tests for loop deletion that don't need context 2021-05-26 16:39:00 +07:00
Tim Northover 8c5ac18d71 AArch64: support post-indexed stores to bfloat types. 2021-05-26 10:35:52 +01:00
Simon Pilgrim 942e01de89 [CostModel][X86] Remove old testshift* tests
The vector shift cost tests are better covered (more cpu/sse levels) by the vshift-*-*cost files, and we're trying to avoid codegen tests in here as it makes it harder to maintain the test files.
2021-05-26 10:31:00 +01:00
Simon Pilgrim 66978466ba [X86][Atom] Fix vector variable shift resource/throughputs
Match whats documented in the Intel AOM - the non-immediate variants of the PSLL*/PSRA*/PSRL* shift instructions requires BOTH ports - this was being incorrectly modelled as EITHER port.

Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.
2021-05-26 10:30:59 +01:00
Max Kazantsev 794fb5482e [Test] Add test on unrolling to make sure it won't fail
Initially it failed an assertion with "Do actual DCE in LoopUnroll (try 2)"
which was later reverted. Make sure that when this patch is returned, the
test works fine.
2021-05-26 16:30:41 +07:00
Roman Lebedev 8c86161a0b
[NFC][X86] clang-format X86TTIImpl::getInterleavedMemoryOpCostAVX2()
I plan to make changes to it, and undoing formatting each time is not going to be fun.
2021-05-26 12:27:47 +03:00
David Sherwood 70d8365e33 Fix warning introduced by 9c766f4090 2021-05-26 10:20:39 +01:00
Bjorn Pettersson a3b3f7e631 [HIP] Adjust check in hip-include-path.hip test case
The changes in commit 722c39fef5 caused the test case to fail
when building with -DLLVM_LIBDIR_SUFFIX=64. This patch makes the
checks a bit more relaxed to support libdir suffixes again.

Also adjusting the regular expressions to avoid mathes including
double quotes.
2021-05-26 11:08:05 +02:00
Butygin 91e0cb6598 [mlir] LocalAliasAnalysis: Assume allocation scope to function scope if cannot determine better
It helps when checking aliasing between AllocOp result and function arguments.

Differential Revision: https://reviews.llvm.org/D102557
2021-05-26 12:06:56 +03:00
Adrian Kuegel cb65419b1a [mlir] Simplify folding code (NFC) 2021-05-26 11:00:07 +02:00
David Sherwood 9c766f4090 [InstCombine] Fold extractelement + vector GEP with one use
We sometimes see code like this:

Case 1:
  %gep = getelementptr i32, i32* %a, <2 x i64> %splat
  %ext = extractelement <2 x i32*> %gep, i32 0

or this:

Case 2:
  %gep = getelementptr i32, <4 x i32*> %a, i64 1
  %ext = extractelement <4 x i32*> %gep, i32 0

where there is only one use of the GEP. In such cases it makes
sense to fold the two together such that we create a scalar GEP:

Case 1:
  %ext = extractelement <2 x i64> %splat, i32 0
  %gep = getelementptr i32, i32* %a, i64 %ext

Case 2:
  %ext = extractelement <2 x i32*> %a, i32 0
  %gep = getelementptr i32, i32* %ext, i64 1

This may create further folding opportunities as a result, i.e.
the extract of a splat vector can be completely eliminated. Also,
even for the general case where the vector operand is not a splat
it seems beneficial to create a scalar GEP and extract the scalar
element from the operand. Therefore, in this patch I've assumed
that a scalar GEP is always preferrable to a vector GEP and have
added code to unconditionally fold the extract + GEP.

I haven't added folds for the case when we have both a vector of
pointers and a vector of indices, since this would require
generating an additional extractelement operation.

Tests have been added here:

  Transforms/InstCombine/gep-vector-indices.ll

Differential Revision: https://reviews.llvm.org/D101900
2021-05-26 09:54:26 +01:00
Adrian Kuegel b99f892b02 [mlir] Fold complex.re(complex.create) and complex.im(complex.create)
This extends the folding we already have. A test needs to be adjusted.

Differential Revision: https://reviews.llvm.org/D103141
2021-05-26 10:53:05 +02:00
Esme-Yi bf809cd165 [NFC][object] Change the input parameter of the method isDebugSection.
Summary: This is a NFC patch to change the input parameter of the method SectionRef::isDebugSection(), by replacing the StringRef SectionName with DataRefImpl Sec. This allows us to determine if a section is debug type in more ways than just by section name.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D102601
2021-05-26 08:47:53 +00:00
David Green 2cf0e52b85 [ARM] Add patterns for vmulh
Now that vmulh can be selected, this adds the MVE patterns to make it
legal and generate instructions.

Differential Revision: https://reviews.llvm.org/D88011
2021-05-26 09:22:12 +01:00
Björn Schäpers 9ef66ed437 [clang-format][NFC] correctly sort StatementAttributeLike-macros' IO.map 2021-05-26 07:59:08 +02:00
LLVM GN Syncbot dde123993f [gn build] Port 36d0fdf9ac 2021-05-26 04:31:12 +00:00
Christopher Di Bella 36d0fdf9ac [libcxx][iterator] adds `std::ranges::advance`
Implements part of P0896 'The One Ranges Proposal'.
Implements [range.iter.op.advance].

Differential Revision: https://reviews.llvm.org/D101922
2021-05-26 04:27:30 +00:00
Arthur Eubanks 1202f559bd [OpaquePtr] Make atomicrmw work with opaque pointers
FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102788
2021-05-25 20:16:21 -07:00
Jonas Devlieghere 564eb20e0d Revert "[lldb] Avoid format string in LLDB_SCOPED_TIMER"
Right after pushing, I remembered that this was added to silence a GCC
warning (https://reviews.llvm.org/D99120). This reverts my patch and
adds a comment.
2021-05-25 17:22:51 -07:00
Jonas Devlieghere bbcb3433d4 [lldb] Avoid format string in LLDB_SCOPED_TIMER
Pass LLVM_PRETTY_FUNCTION directly for the no-argument macro.
2021-05-25 17:14:08 -07:00
Teresa Johnson d35fe04fa3 [LTT] Handle merged llvm.assume when dropping type tests
When the lower type test pass is invoked a second time with
DropTypeTests set to true, it expects that all remaining type tests feed
assume instructions, which are removed along with the type tests.

In some cases the llvm.assume might have been merged with another one,
i.e. from a builtin_assume instruction, in which case the type test
would actually feed a phi that in turn feeds the merged assume
instruction. In this case we can simply replace that operand of the phi
with "true" before removing the type test.

Differential Revision: https://reviews.llvm.org/D103073
2021-05-25 17:02:13 -07:00
Arthur Eubanks ad90a6be21 [OpaquePtr] Create new bitcode encoding for atomicrmw
Since the opaque pointer type won't contain the pointee type, we need to
separately encode the value type for an atomicrmw.

Emit this new code for atomicrmw.

Handle this new code and the old one in the bitcode reader.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103123
2021-05-25 16:30:34 -07:00
Fangrui Song e67259531d [sanitizer] Let glibc aarch64 use O(1) GetTls
The generic approach can still be used by musl and FreeBSD. Note: on glibc
2.31, TLS_PRE_TCB_SIZE is 0x700, larger than ThreadDescriptorSize() by 16, but
this is benign: as long as the range includes pthread::{specific_1stblock,specific}
pthread_setspecific will not cause false positives.

Note: the state before afec953857 underestimated
the TLS size a lot (nearly ThreadDescriptorSize() = 1776).
That may explain why afec953857 actually made some
tests pass.
2021-05-25 16:28:17 -07:00
Kevin Athey 52ac114771 LLVM Detailed IR tests for introduction of flag -fsanitize-address-detect-stack-use-after-return-mode.
Rework all tests that interact with use after return to correctly handle the case where the mode has been explicitly set to Never or Always.

for issue: https://github.com/google/sanitizers/issues/1394

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D102462
2021-05-25 16:17:39 -07:00
Alexandre Ganea 20c9a44ac0 [benchmark] Silence 'suggest override' and 'missing override' warnings
When building with Clang 11 on Windows, silence the following:

F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(955,8): warning: 'Run' overrides a member function but is not marked 'override' [-Wsuggest-override]
  void Run(State& st);
       ^
F:\aganea\llvm-project\llvm\utils\benchmark\include\benchmark/benchmark.h(895,16): note: overridden virtual function is here
  virtual void Run(State& state) = 0;
               ^
1 warning generated.
2021-05-25 18:46:37 -04:00
Alexandre Ganea dd2be15ff9 [gcov] Silence warning: comparison of integers of different signs
When building with Clang 11 on Windows, silence the following:

[432/5643] Building C object projects\compiler-rt\lib\profile\CMakeFiles\clang_rt.profile-x86_64.dir\GCDAProfiling.c.obj
F:\aganea\llvm-project\compiler-rt\lib\profile\GCDAProfiling.c(464,13): warning: comparison of integers of different signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
    if (val != (gcov_version >= 90 ? GCOV_TAG_OBJECT_SUMMARY
        ~~~ ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
2021-05-25 18:46:37 -04:00
Rob Suderman e5d227e95c [NFC][MLIR][TOSA] Replaced tosa linalg.indexed_generic lowerings with linalg.index
Indexed Generic should be going away in the future. Migrate to linalg.index.

Reviewed By: NatashaKnk, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D103110
2021-05-25 15:34:28 -07:00
Vitaly Buka e14696bfd7 [NFC][SCUDO] Fix unittest for -gtest_repeat=10
Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D103122
2021-05-25 15:32:42 -07:00