Commit Graph

416964 Commits

Author SHA1 Message Date
Roman Lebedev 703240c71f
[SROA] Maintain shadow/backing alloca when some slices are noncapturnig read-only calls to allow alloca partitioning/promotion
This is inspired by the original variant of D109749 by Graham Hunter,
but is a more general version.

Roughly, instead of promoting the alloca, we call it
a shadow/backing alloca, go through all it's slices,
clone(!) instructions that operated on it,
but make them operate on the cloned alloca,
and promote cloned alloca instead.

This keeps the shadow/backing alloca, and all the original instructions
around, which results in said shadow/backing alloca being
a perfect mirror/representation of the promoted alloca's content,
so calls that take the alloca as arguments (non-capturingly!)
can be supported.

For now, we require that the calls also don't modify the alloca's content,
but that is only to simplify the initial implementation,
and that will be supported in a follow-up.

Overall, this leads to *smaller* codesize:
https://llvm-compile-time-tracker.com/compare.php?from=a8b4f5bbab62091835205f3d648902432a4a5b58&to=aeae054055b125b011c1122f82c86457e159436f&stat=size-total
and is roughly neutral compile-time wise:
https://llvm-compile-time-tracker.com/compare.php?from=a8b4f5bbab62091835205f3d648902432a4a5b58&to=aeae054055b125b011c1122f82c86457e159436f&stat=instructions

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D113520
2022-03-04 21:08:43 +03:00
Arthur O'Dwyer f0891cd61b [clang] [concepts] Check constrained-auto return types for void-returning functions
Fixes #49188.

Differential Revision: https://reviews.llvm.org/D119184
2022-03-04 12:43:06 -05:00
Arthur O'Dwyer adf6703f75 [clang] [NFC] Add `const` to a parameter that's not modified.
Reviewed as part of D119184.
2022-03-04 12:43:05 -05:00
Simon Pilgrim 588d97e246 [X86] getTargetVShiftNode - peek through any zext node
If the shift amount has been zero-extended, peek through as this might help us further canonicalize the shift amount.

Fixes regression mentioned in rG147cfcbef1255ba2b4875b76708dab1a685085f5
2022-03-04 17:41:45 +00:00
Colin Cross bcc65fb491 Pass through more LIBCXX_* variables to libfuzzer's custom lib++
Pass LIBCXX_HAS_PTHREAD_LIB, LIBCXX_HAS_RT_LIB  and LIBCXXABI_HAS_PTHREAD_LIB
through to the custom lib++ builds so that libfuzzer  doesn't end up with a .deplibs section that
links against those libraries when the variables are set to false.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D120946
2022-03-04 09:31:37 -08:00
Siva Chandra Reddy dd33f9cdef [libc] Make the errno macro resolve to the thread local variable directly.
With modern architectures having a thread pointer and language supporting
thread locals, there is no reason to use a function intermediary to access
the thread local errno value.

The entrypoint corresponding to errno has been replaced with an object
library as there is no formal entrypoint for errno anymore.

Reviewed By: jeffbailey, michaelrj

Differential Revision: https://reviews.llvm.org/D120920
2022-03-04 17:29:49 +00:00
LLVM GN Syncbot fa8293bbc7 [gn build] Port c88deef0a7 2022-03-04 17:22:23 +00:00
LLVM GN Syncbot a1e91b53f6 [gn build] Port 7ee97c24ef 2022-03-04 17:22:23 +00:00
Yitzhak Mandelbaum c88deef0a7 [clang][dataflow] Add `MatchSwitch` utility library.
Adds `MatchSwitch`, a library for simplifying implementation of transfer
functions. `MatchSwitch` supports constructing a "switch" statement, where each
case of the switch is defined by an AST matcher. The cases are considered in
order, like pattern matching in functional languages.

Differential Revision: https://reviews.llvm.org/D120900
2022-03-04 17:19:51 +00:00
Krzysztof Drewniak 4e817b3fa3 [MLIR][AMDGPU] Fix typo and add comment to SerializeToHsaco
Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D120943
2022-03-04 17:15:11 +00:00
Yitzhak Mandelbaum 7ee97c24ef [clang][dataflow] Add a lattice to track source locations.
This patch adds a simpe lattice used to collect source loctions. An intended application is to track errors found in code during an analysis.

Differential Revision: https://reviews.llvm.org/D120890
2022-03-04 17:13:24 +00:00
William S. Moses 62f84c73d2 [MLIR][SCF] Allow combining subsequent if statements that yield & negated condition
This patch extends the existing if combining canonicalization to also handle the case where a value returned by the first if is used within the body of the second if.

This patch also extends if combining to support if's whose conditions are logical negations of each other.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120924
2022-03-04 12:07:47 -05:00
Jeremy Morse 0e96d95d13 [DebugInfo][InstrRef] Accept register-reads after isel in any block
When lowering LLVM-IR to instruction referencing stuff, if a value is
defined by a COPY, we try and follow the register definitions back to where
the value was defined, and build an instruction reference to that
instruction. In a few scenarios (such as arguments), this isn't possible.
I added some assertions to catch cases that weren't explicitly whitelisted.

Over the course of a few months, several more scenarios have cropped up,
the lastest is the llvm.read_register intrinsic, which lets LLVM-IR read an
arbitary register at any point. In the face of this, there's little point
in validating whether debug-info reads a register in an expected scenario.
Thus: this patch just deletes those assertions, and adds a regression test
to check that something is done with the llvm.read_register intrinsic.

Fixes #54190

Differential Revision: https://reviews.llvm.org/D121001
2022-03-04 17:01:12 +00:00
William S. Moses 1d1791572c [MLIR][MemRef] Ensure alloca_scope is inlined with no allocating ops
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120841
2022-03-04 11:58:59 -05:00
Simon Pilgrim 147cfcbef1 [X86] LowerShiftByScalarVariable - find splat patterns with getSplatSourceVector instead of getSplatValue
This completes the removal of uses of SelectionDAG::getSplatValue started in D119090 - by avoiding extracting the splatted element we make it a lot easier to zero-extend the bottom 64-bits of the shift amount and fixes issues we had on 32-bit targets where i64 isn't legal.

I've removed the old version of getTargetVShiftNode that took the scalar shift amount argument and LowerRotate can finally efficiently handle vXi16 rotates-by-scalar (using the same code as general funnel-shifts).

The only regression we see is in the X86-AVX2 PR52719 test case in vector-shift-ashr-256.ll - this is now hitting the same problem as the X86-AVX1 case (failure to simplify a multi-use X86ISD::VBROADCAST_LOAD) which I intend to address in a follow up patch.
2022-03-04 16:47:35 +00:00
Hans Wennborg 85c53c7092 Revert "[AArch64] Async unwind - function prologues"
It caused builds to assert with:

  (StackSize == 0 && "We already have the CFA offset!"),
  function generateCompactUnwindEncoding, file AArch64AsmBackend.cpp, line 624.

when targeting iOS. See comment on the code review for reproducer.

> This patch rearranges emission of CFI instructions, so the resulting
> DWARF and `.eh_frame` information is precise at every instruction.
>
> The current state is that the unwind info is emitted only after the
> function prologue. This is fine for synchronous (e.g. C++) exceptions,
> but the information is generally incorrect when the program counter is
> at an instruction in the prologue or the epilogue, for example:
>
> ```
> stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
> mov     x29, sp
> .cfi_def_cfa w29, 16
> ...
> ```
>
> after the `stp` is executed the (initial) rule for the CFA still says
> the CFA is in the `sp`, even though it's already offset by 16 bytes
>
> A correct unwind info could look like:
> ```
> stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
> .cfi_def_cfa_offset 16
> mov     x29, sp
> .cfi_def_cfa w29, 16
> ...
> ```
>
> Having this information precise up to an instruction is useful for
> sampling profilers that would like to get a stack backtrace. The end
> goal (towards this patch is just a step) is to have fully working
> `-fasynchronous-unwind-tables`.
>
> Reviewed By: danielkiss, MaskRay
>
> Differential Revision: https://reviews.llvm.org/D111411

This reverts commit 32e8b550e5.
2022-03-04 17:36:26 +01:00
zhongyunde 7a605ab7bf [AArch64] Use simd mov to materialize big fp constants
mov w8, #1325400064 + fmov s0, w8 ==> movi v0.2s, 0x4f, lsl 24
Fix https://github.com/llvm/llvm-project/issues/53651

Reviewed By: dmgreen, fhahn

Differential Revision: https://reviews.llvm.org/D120452
2022-03-04 11:34:20 -05:00
Richard Howell 8ba84ceda0 [llvm] fix bitcode-strip.test on windows
Remove the executable name from the test match as this will have
a `.exe` suffix on windows.

Reviewed By: drodriguez

Differential Revision: https://reviews.llvm.org/D121000
2022-03-04 08:30:50 -08:00
Michel Weber 21dc4ad56a [MLIR][Presburger] skip IntegerPolyhedrons with LocalIds in coalesce
This patch makes coalesce skip the comparison of all pairs of IntegerPolyhedrons with LocalIds rather than crash. The heuristics to handle these cases will be upstreamed later on.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D120995
2022-03-04 16:12:04 +00:00
Richard Howell 8e6d2fe4d4 [llvm] add -o flag to llvm-bitcode-strip
Add the -o flag to specify an output path for llvm-bitcode-strip.
This matches the interface to the Xcode bitcode_strip tool.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D120731
2022-03-04 08:03:51 -08:00
William S. Moses 4a94a33ca6 [MLIR][LLVM] Fold extractvalue to ignore insertvalue at distinct index
We can simplify an extractvalue of an insertvalue to extract out of the base of the insertvalue, if the insert and extract are at distinct and non-prefix'd indices

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120915
2022-03-04 11:03:34 -05:00
Louis Dionne 9fee527eca [runtimes] Trigger CI jobs when only the runtimes/ subdirectory is touched 2022-03-04 10:59:27 -05:00
Balazs Benics e86324f800 [clang-tidy][NFC] Document bugprone-narrowing-conversions check alias 2022-03-04 16:47:11 +01:00
Augie Fackler 5e4c75db3b InstructionCombining: avoid eliding mismatched alloc/free pairs
Prior to this change LLVM would happily elide a call to any allocation
function and a call to any free function operating on the same unused
pointer. This can cause problems in some obscure cases, for example if
the body of operator::new can be inlined but the body of
operator::delete can't, as in this example from jyknight:

    #include <stdlib.h>
    #include <stdio.h>

    int allocs = 0;

    void *operator new(size_t n) {
        allocs++;
        void *mem = malloc(n);
        if (!mem) abort();
        return mem;
    }

    __attribute__((noinline)) void operator delete(void *mem) noexcept {
        allocs--;
        free(mem);
    }

    void deleteit(int*i) { delete i; }
    int main() {
        int*i = new int;
        deleteit(i);
        if (allocs != 0)
          printf("MEMORY LEAK! allocs: %d\n", allocs);
    }

This patch addresses the issue by introducing the concept of an
allocator function family and uses it to make sure that alloc/free
function pairs are only removed if they're in the same family.

Differential Revision: https://reviews.llvm.org/D117356
2022-03-04 10:41:10 -05:00
Karl Meakin 43a0016f3d Extend `performANDCSELCombine` to `performANDORCSELCombine`
Differential Revision: https://reviews.llvm.org/D120422
2022-03-04 15:09:59 +00:00
Nikita Popov 6467d1d275 [CoroFrame] Remove unused insertSpills() return value (NFC) 2022-03-04 15:11:24 +01:00
Simon Pilgrim 940d7cd59f [X86] SimplifyDemandedVectorElts - adjust X86ISD::ANDNP demanded elts based off constant masks
Similar to what we already do in combineAndnp, if either operand is a constant then we can improve the demanded elts/bits.
2022-03-04 13:40:56 +00:00
David Spickett ffca16c3dc Revert "[WebAssembly] Update WebAssemblyAsmTypeCheck for table.get"
This reverts commit 6b2482f6f4 due to
test failures on AArch64 bots:
https://lab.llvm.org/buildbot/#/builders/183/builds/3684
2022-03-04 13:33:55 +00:00
Andrzej Warzynski bbcc0f6006 [flang] Fix standalone builds
In dd875dd88b I added a missing MLIR
dependency in Flang. However, that particular CMake target is not
exported as something available to standalone builds. In this patch is
switch to `MLIRIR` instead, which depends on
`MLIRBuiltinAttributeInterfacesIncGen` - the missing dependency added
previously.

Differential Revision: https://reviews.llvm.org/D120986
2022-03-04 13:05:44 +00:00
Jay Foad e8e301ed92 [AMDGPU] Extra test cases in hard-clauses.mir
Add some cases where different kinds of instruction might be
combined in the same hard clause.
2022-03-04 12:46:59 +00:00
Jay Foad b79840a472 [AMDGPU] Regenerate checks in hard-clauses.mir 2022-03-04 12:46:59 +00:00
Nathan Sidwell 64221645a8 [demangler] Make OutputBuffer non-copyable
In addressing the buffer ownership API, I discovered a rogue member
function that returned by value rather than by reference. It clearly
intended to return by reference, but because the copy ctor wasn't
deleted this wasn't caught.

It is not necessary to make this a move-only type, although that would
be an alternative.

Reviewed By: bruno

Differential Revision: https://reviews.llvm.org/D120901
2022-03-04 04:43:37 -08:00
Aaron Ballman 6afe035404 Revert "[analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions"
This reverts commit 9c300c18a4.

This broke the sphinx bot and seems like an unintentional commit.
2022-03-04 07:21:52 -05:00
4vtomat 5a148869d3 [NFC] Divide tests into smaller files
This commit divides the large test files(over 30k lines) under clang/test/CodeGen/RISCV including:
rvv-intrinsics/vloxseg.c
rvv-intrinsics/vluxseg.c
rvv-intrinsics-overloaded/vloxseg.c
rvv-intrinsics-overloaded/vluxseg.c
into "non-masked" version and "masked" version which can reduce the test cases by 50% in a single file.

Differential Revision: https://reviews.llvm.org/D120967
2022-03-04 04:16:52 -08:00
Jay Foad d7d4ed0847 [AMDGPU] Tweak predicates for image_bvh_intersect_ray instructions
Don't override SubtargetPredicate since that is already set in the
base classes for the appropriate subtarget like MIMG_gfx10. Use
OtherPredicates instead for consistency with the way we handle
features like HasImageInsts and HasExtendedImageInsts. NFC.

Differential Revision: https://reviews.llvm.org/D120909
2022-03-04 12:05:23 +00:00
nokotan 6b2482f6f4 [WebAssembly] Update WebAssemblyAsmTypeCheck for table.get
This patch is aimed to resolve [[ https://github.com/llvm/llvm-project/issues/53789 | GitHub Issue #53789 ]].

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D120229
2022-03-04 13:02:02 +01:00
Paul Walker 42b4a6227e [DAGCombine] Prevent illegal ISD::SPLAT_VECTOR operations post legalisation.
When triggered during operation legalisation the affected combine
generates a splat_vector that when custom lowered for SVE fixed
length code generation, results in the original precombine sequence
and thus we enter a legalisation/combine hang.

NOTE: The patch contains no tests because I observed this issue
only when combined with other work that might never become public.
The current way AArch64 lowers ISD::SPLAT_VECTOR meant a specific
test was not possible so I'm hoping the DAGCombiner fix can be seen
as obvious. The AArch64ISelLowering change is requirted to maintain
existing code quality.

Differential Revision: https://reviews.llvm.org/D120735
2022-03-04 11:54:03 +00:00
Florian Hahn fb42e557d8
[Driver] Split up huge arm-cortex-cpus.c test.
This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832724/

While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.

This patch splits up the test file roughly in the middle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120876
2022-03-04 11:37:00 +00:00
Florian Hahn 8f5bdaf481
[Driver] Split up huge aarch64-cpus.c test.
This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832723/

While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.

This patch splits up the test file roughly in the middle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120875
2022-03-04 11:24:12 +00:00
David Green e348b09bb5 [AArch64] Turn UZP1 with undef operand into truncate
This turns upz1(x, undef) to concat(truncate(x), undef), as the truncate
is simpler and can often be optimized away, and it helps some of the
insert-subvector tests optimize more cleanly.

Differential Revision: https://reviews.llvm.org/D120879
2022-03-04 11:12:26 +00:00
Nikita Popov 6b5b367858 [Attributor] Remove function pointer type check (NFCI)
This check is not relevant for correctness, it can only avoid
walking some recursive uses if the cast is to a non-function
pointer type. As this distinction will no longer be possible
with opaque pointers and all users will have to be walked
anyway, I'm dropping the check in advance.
2022-03-04 12:09:51 +01:00
Florian Hahn 5a60260efe
[IVDescriptor] Use DT to check order of Previous, OtherPrev.
Previous and OhterPrev may not be in the same block. Use DT::dominates
instead of local comesBefore. DT::dominates is already used earlier to
check the order of Previous and SinkCandidate.

Fixes https://github.com/llvm/llvm-project/issues/54195
2022-03-04 11:07:42 +00:00
Nikita Popov d3a52089eb Reapply [MergeICmps] Don't require GEP
Recommit without changes over 53abe3ff66,
which addressed the cause of the reported crash.

-----

With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.
2022-03-04 11:39:11 +01:00
David Sherwood f9331c9a2c [AArch64] Fix the TuneExynosM4 entry in lib/Target/AArch64/AArch64.td
A bug was introduced in 5ea35791e6 that
gave the wrong name and description for TuneExynosM4. This patch fixes
that and changes it back to m3.

Differential Revision: https://reviews.llvm.org/D120665
2022-03-04 10:27:21 +00:00
Nikita Popov 53abe3ff66 [MergeICmp] Make instruction move robust against empty block (NFCI)
Use the overload that support moving into an empty block. I don't
think that this situation can occur right now, but it can happen
with the change from e7fb1c15cb,
and the test is derived from the issue reported there.
2022-03-04 11:15:08 +01:00
Sander de Smalen 7c65d2288b [AArch64] Improve access to fixed-width object when stack has SVE.
When the stack has SVE objects, fixed-width objects are often better accessed
from the SP, instead of the FP, because part/all of the fixed-width offset
can be folded into the (non-scalable) addressing mode, where otherwise an
ADDVL would be required.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D120738
2022-03-04 09:33:59 +00:00
Sander de Smalen d363bddac5 [AArch64] NFC: Add test for access to fixed-width stack object when stack has SVE.
In this case, the access would benefit from being accessed from the SP, as that
would avoid the redundant ADDVL, since most of the offset can currently be
folded into the addressing mode.
2022-03-04 09:33:59 +00:00
Nikita Popov 7a258c6a37 [Bitcode] Move x86_intrcc upgrade to bitcode reader
This upgrade requires access the legacy pointer element type, so
it needs to happen inside the bitcode reader.
2022-03-04 10:30:50 +01:00
Nikita Popov e3a9f68e2c [Bitcode] Fully support opaque pointer auto upgrade
This completes the propagation of type IDs through bitcode reading,
and switches remaining uses of getPointerElementType() to use
contained type IDs.

The main new thing here is that sometimes we need to create a type
ID for a type that was not explicitly encoded in bitcode (or we
don't know its ID at the current point). For such types we create a
"virtual" type ID, which is cached based on the type and the
contained type IDs. Luckily, we generally only need zero or one
contained type IDs, and in the one case where we need two, we can
get away with not including it in the cache key.

With this change, we pass the entirety of llvm-test-suite at O3
with opaque pointers.

Differential Revision: https://reviews.llvm.org/D120471
2022-03-04 10:23:06 +01:00
David Green 04661a4d8e [AArch64] Additional insert-subvector codegen tests. NFC 2022-03-04 09:04:09 +00:00