Commit Graph

422949 Commits

Author SHA1 Message Date
Thomas Preud'homme 68dee83923 [MachinePipeliner] Fix unscheduled instruction
Prior to ordering instructions to be scheduled, the machine pipeliner
update recurrence node sets in groupRemainingNodes() by adding in a
given node set any node on the dependency path from a node set with
higher priority to the given node set. The function computePath() that
determine what constitutes a path follows artificial dependencies.

However, when ordering the nodes in the resulting node sets,
computeNodeOrder() calls ignoreDependence when looking at dependencies
which ignores artificial dependencies. This can cause a node not to be
scheduled which then causes wrong code generation and in the case of a
debug build will lead to an assert failure in generatePhis() in
ModuloScheduler.cpp.

This commit adds calls to ignoreDependence() in computePath() to not add
any node in groupRemainingNodes() that would not be ordered by
computeNodeOrder().

Reviewed By: sgundapa

Differential Revision: https://reviews.llvm.org/D124267
2022-05-05 16:01:41 +01:00
David Green 1f37d94838 [PowerPC] Add extra v2i64 splat load tests. NFC
In service of D123801, this add some tests targetting a v2i64 splat of a
load, and regenerates vsx_shuffle_le.ll for easier updating.
2022-05-05 15:56:55 +01:00
Sam McCall 04b4190489 [Driver] Make "upgrade" of -include to include-pch optional; disable in clangd
If clang is passed "-include foo.h", it will rewrite to "-include-pch foo.h.pch"
before passing it to cc1, if foo.h.pch exists.

Existence is checked, but validity is not. This is probably a reasonable
assumption for the compiler itself, but not for clang-based tools where the
actual compiler may be a different version of clang, or even GCC.
In the end, we lose our -include, we gain a -include-pch that can't be used,
and the file often fails to parse.

I would like to turn this off for all non-clang invocations (i.e.
createInvocationFromCommandLine), but we have explicit tests of this behavior
for libclang and I can't work out the implications of changing it.

Instead this patch:
 - makes it optional in the driver, default on (no change)
 - makes it optional in createInvocationFromCommandLine, default on (no change)
 - changes driver to do IO through the VFS so it can be tested
 - tests the option
 - turns the option off in clangd where the problem was reported

Subsequent patches should make libclang opt in explicitly and flip the default
for all other tools. It's probably also time to extract an options struct
for createInvocationFromCommandLine.

Fixes https://github.com/clangd/clangd/issues/856
Fixes https://github.com/clangd/vscode-clangd/issues/324

Differential Revision: https://reviews.llvm.org/D124970
2022-05-05 16:47:17 +02:00
Philip Reames 042a7a5f0d [riscv] Use X0 for destination of VSETVLI instruction if result unused
If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure.

Since after the core insertion/lowering algorithm has run, many user written VSETVLIs will have their GPR result unused (as VTYPE/VLEN is now explicitly read instead), this kicks in for most tests which involve a vsetvli intrinsic for fixed length vectorization. (vscale vectorization generally uses the GPR result to know how far to e.g. advance pointers in a loop and these uses are not removed.)  When inserting VSETVLIs to lower psuedos, we prefer the X0 form anyways.

Differential Revision: https://reviews.llvm.org/D124961
2022-05-05 07:39:45 -07:00
David Green c7a6b11b7e [ARM][AArch64] Add some extra shuffle conversion test coverage. NFC
This adds a big endian run line for the AArch64 TRN tests and
regenerated the check lines, along with adding an extra MVE VMOVN case
and regenerating vector-DAGCombine.ll for easier updating.
2022-05-05 15:27:44 +01:00
Peter Steinfeld d134442200 [flang][nfc] Use a message class for "not yet implemented" messages
Following a previous suggestion from Peter Klausler.

Differential Revision: https://reviews.llvm.org/D124972
2022-05-05 07:12:22 -07:00
Benjamin Kramer 17d27d926b [IR] Simplify code. NFCI. 2022-05-05 16:06:59 +02:00
Andrzej Warzynski bb177edc44 [flang][driver] Re-organise the code-gen actions (nfc)
All frontend actions that generate code  (MLIR, LLVM IR/BC,
Assembly/Object Code) are re-factored as essentially one action,
`CodeGenAction`, with minor specialisations. To facilate all this,
`CodeGenAction` is extended to hold `TargetMachine` and backend action
type (MLIR vs LLVM IR vs LLVM BC vs Assembly vs Object Code).

`CodeGenAction` is no longer a pure abstract class and the
corresponding `ExecuteAction` is implemented so that it covers all use
cases. All this allows a much better code re-use.

Key functionality is extracted into some helpful hooks:
  * `SetUpTargetMachine`
  * `GetOutputStream`
  * `EmitObjectCodeHelper`
  * `EmitBCHelper`
I hope that this clarifies the overall structure. I suspect that we may
need to revisit this again as the functionality grows in complexity.

Differential Revision: https://reviews.llvm.org/D124665
2022-05-05 14:05:06 +00:00
Fred Tingaud c894e85fc6 In MSVC compatibility mode, handle unqualified templated base class initialization
Before C++20, MSVC was supporting not mentioning the template argument of the base class when initializing a class inheriting a templated base class.
So the following code compiled correctly:
```
template <class T>
class Base {
};

template <class T>
class Derived : public Base<T> {
public:
  Derived() : Base() {}
};

void test() {
    Derived<int> d;
}
```
See https://godbolt.org/z/Pxxe7nccx for a conformance view.

This patch adds support for such construct when in MSVC compatibility mode.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D124666
2022-05-05 16:03:39 +02:00
Benjamin Kramer 08b20f20d2 [ConstantFold] Use getFltSemantics instead of manually checking the type
Simplifies the code and makes fpext/fptrunc constant folding not crash
when the result is bf16.
2022-05-05 15:52:19 +02:00
Marco Elver 47bdea3f7e [ThreadSanitizer] Add fallback DebugLocation for instrumentation calls
When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.

This is problematic for inserted calls, where the enclosing function has
debug info but the call ends up without a DebugLocation in e.g. LTO
builds that verify that both the enclosing function and calls to
inlinable functions have debug info attached.

This issue was noticed in Linux kernel KCSAN builds with LTO and debug
info enabled:

  | ...
  | inlinable function call in a function with debug info must have a !dbg location
  |   call void @__tsan_read8(i8* %432)
  | ...

To fix, ensure that all calls to the runtime have a DebugLocation
attached, where the possibility exists that the insertion-point might
not have any DebugLocation attached to it.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D124937
2022-05-05 15:21:35 +02:00
Sam McCall 40c13720a4 [Frontend] give createInvocationFromCommandLine an options struct
It's accumulating way too many optional params (see D124970)

While here, improve the name and the documentation.

Differential Revision: https://reviews.llvm.org/D124971
2022-05-05 15:12:07 +02:00
Alexey Bataev 99f31acfce [SLP]Further improvement of the cost model for scalars used in buildvectors.
Further improvement of the cost model for the scalars used in
buildvectors sequences. The main functionality is outlined into
a separate function.
The cost is calculated in the following way:
1. If the Base vector is not undef vector, resizing the very first mask to
have common VF and perform action for 2 input vectors (including non-undef
Base). Other shuffle masks are combined with the resulting after the 1 stage and processed as a shuffle of 2 elements.
2. If the Base is undef vector and have only 1 shuffle mask, perform the
action only for 1 vector with the given mask, if it is not the identity
mask.
3. If > 2 masks are used, perform serie of shuffle actions for 2 vectors,
combing the masks properly between the steps.

The original implementation misses the very first analysis for the Base
vector, so the cost might too optimistic in some cases. But it improves
the cost for the insertelements which are part of the current SLP graph.

Part of D107966.

Differential Revision: https://reviews.llvm.org/D115750
2022-05-05 06:04:25 -07:00
Xing Xue e5926906eb [XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections
Summary:
When -ffunction-sections is on, this patch makes the compiler to generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused function.

Reviewed by: MaskRay, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D124855
2022-05-05 09:01:36 -04:00
Peter Waller 75f9e83ace [AArch64] Add -aarch64-insert-extract-base-cost
The new flag -aarch64-insert-extract-base-cost can be used to
set the value of AArch64Subtarget::getVectorInsertExtractBaseCost(),
for the purposes of experimentation.

Differential Revision: https://reviews.llvm.org/D124835
2022-05-05 10:35:45 +00:00
Jay Foad ba6c8d42d4 [AMDGPU] Combine DPP mov even if old reg def is in different BB
Given a DPP mov like this:

  %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
  ...
  %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec

this patch just removes a check that %2 (the "old reg") was defined in
the same BB as the DPP mov instruction. GCNDPPCombine requires that the
MIR is in SSA form so I don't understand why the BB matters.

This lets the optimization work in more real world cases when the
definition of %2 gets hoisted out of a loop.

Differential Revision: https://reviews.llvm.org/D124182
2022-05-05 11:30:31 +01:00
Tobias Burnus 6f095babc2 sanitizer_common: Define FP_XSTATE_MAGIC1 for old glibc
D116208 (commit 1298273e82) added FP_XSTATE_MAGIC1.
However, when building with glibc < 2.16 for backward-dependency
compatibility, it is not defined - and the build breaks.

Note: The define comes from Linux's asm/sigcontext.h but the
file uses signal.h which includes glibc's bits/sigcontext.h - which
is synced from the kernel's file but lags behind.

Solution: For backward compatility with ancient systems, define
FP_XSTATE_MAGIC1 if undefined.

//For the old systems, we were building with Linux kernel 3.19 but to support really old glibc systems, we build with a sysroot of glibc 2.12. While our kernel (and the users' kernels) have FP_XSTATE_MAGIC1, glibc 2.12 is too old. – With this patch, building the sanitizer libs works again. This showed up for us today as GCC mainline/13 has now synced the sanitizer libs.//

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124927
2022-05-05 11:05:27 +01:00
einvbri df5801806d [analyzer] Get direct binding for specific punned case
Region store was not able to see through this case to the actual
initialized value of STRUCT ff. This change addresses this case by
getting the direct binding. This was found and debugged in a downstream
compiler, with debug guidance from @steakhal. A positive and negative
test case is added.

The specific case where this issue was exposed.

  typedef struct {
    int a:1;
    int b[2];
  } STRUCT;

  int main() {
    STRUCT ff = {0};
    STRUCT* pff = &ff;
    int a = ((int)pff + 1);
    return a;
  }

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D124349
2022-05-05 04:53:45 -05:00
Florian Hahn 3497a4f396
[LICM] Add test to exercise assertion from D123473.
Add a test case that triggers an assertion with earlier versions of
D123473.
2022-05-05 10:49:52 +01:00
Jay Foad 9ebbe25034 RegAllocGreedy: Common up part of the priority calculation. NFC. 2022-05-05 10:35:33 +01:00
Nikita Popov 9678936f18 [DAGCombine] Fold (X & ~Y) | Y with truncated not
This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is
a truncated not (when taking into account the mask X). This is
done by exporting the infrastructure added in D124856 and reusing
it here.

I've retained the old value of AllowUndefs=false, though probably
this can be switched to true with extra test coverage.

Differential Revision: https://reviews.llvm.org/D124930
2022-05-05 11:10:11 +02:00
Florian Hahn 6bd2b70877
[SimpleLoopUnswitch] Add freeze if branch execs for partial unswitching.
We cannot skip the freezing the condition if the unswitched branch
executes, if the condition is a chain of ANDs/ORs. For example, if if we
have an AND %c1, %c2 with  %c1 == undef and %c2 == 0, there would be no
branch on undef in the original code, but a branch on undef if we
unswitch %c1.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D124603
2022-05-05 09:44:07 +01:00
Jean Perier b910cf986a [flang] use 1-based dim in transformational runtime error msg
Flang transformational runtime was previously reporting conformity
issues in a zero based fashion to describe which dimension is non
conformant. This may confuse Fortran user, especially when the message
is about a dimension other than the first one.

Differential Revision: https://reviews.llvm.org/D124941
2022-05-05 10:33:14 +02:00
Adrian Kuegel cc344d262a [clang] Add static_cast to fix Bazel build.
Differential Revision: https://reviews.llvm.org/D124995
2022-05-05 10:29:47 +02:00
Matthias Springer e300682597 [mlir][scf][bufferize] Update verifyAnalysis error message
The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.

Differential Revision: https://reviews.llvm.org/D124934
2022-05-05 16:56:50 +09:00
Matthias Springer 417e1c7d52 [mlir][scf][bufferize][NFC] Split ForOp bufferization into smaller functions
This is in preparation of WhileOp bufferization, which reuses these functions.

Differential Revision: https://reviews.llvm.org/D124933
2022-05-05 16:55:44 +09:00
Matthias Springer f178c386f5 [mlir][scf][bufferize][NFC] Simplify verifyAnalysis implementation
Differential Revision: https://reviews.llvm.org/D124928
2022-05-05 16:51:10 +09:00
Nikita Popov 47c559d6c1 [SCEV] Fold umin_seq to umin using implied poison reasoning
Similar to how we convert logical and/or to bitwise and/or, we should
also convert umin_seq to umin based on implied poison reasoning. In
%x umin_seq %y, if %y being poison implies %x being poison, then we
don't need the sequential evaluation: Having %y contribute towards
the result will never make the result more poisonous. An important
corollary of this is that if %y is never poison, we also don't need
the sequential evaluation.

This avoids some of the regressions in D124910.

Differential Revision: https://reviews.llvm.org/D124921
2022-05-05 09:43:49 +02:00
serge-sans-paille f416e57339 [lldb] Fix ppc64 detection in lldb
Currently, ppc64le and ppc64 (defaulting to big endian) have the same
descriptor, thus the linear scan always return ppc64le. Handle that through
subtype.

This is a recommit of f114f00948 with a new test
setup that doesn't involves (unsupported) corefiles.

Differential Revision: https://reviews.llvm.org/D124760
2022-05-05 09:22:02 +02:00
Chuanqi Xu 405bf90235 [NFC] [Pipelines] Hoist CoroCleanup as Module Pass
This is similar to previous patch https://reviews.llvm.org/D123925. It
could also reduce the time we call declaresCoroCleanupIntrinsics. And it
is helpful for further changes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124362
2022-05-05 15:15:09 +08:00
Chuanqi Xu 7d40f562e7 [Pipelines] Hoist CoroCleanup to avoid blocking optimizations
CoroCleanup is designed to lowering all the remaining coroutine
intrinsics. It is required to run after CoroSplit only. However, the
position of CoroCleanup now is far too late. The downside here is that
the unlowered coroutine instrincs might blocking other optimizations
too. So it should be a pure win to hoist the position of CoroCleanup.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D124360
2022-05-05 15:13:27 +08:00
Zakk Chen 6c10014f1d [RISCV][Clang] add more tests for clang driver. (NFC)
Test experimental arch, Zfh, Zfmin and Zve arch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124611
2022-05-04 23:55:52 -07:00
Serge Pavlov 83914ee96f [InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions executed in runtime. It happened because constrained
intrinsics usually have side effect, it is used to model the interaction
with floating-point environment. In some cases it is correct behavior
but often the side effect is actually absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it actually absents.

Differential Revision: https://reviews.llvm.org/D118426
2022-05-05 12:02:42 +07:00
Mariusz Sikora 2417de2758 [AMDGPU] Use d16 flag for image.sample instructions
Image.sample instruction can be forced to return half type instead of
float when d16 flag is enabled.

This patch adds new pattern in InstCombine to detect if output of
image.sample is used later only by fptrunc which converts the type
from float to half. If pattern is detected then fptrunc and image.sample
are combined to single image.sample which is returning half type.
Later in Lowering part d16 flag is added to image sample intrinsic.

Differential Revision: https://reviews.llvm.org/D124232
2022-05-05 06:29:19 +02:00
Wael Yehia 2407c13aa4 [AIX][PGO] Enable linux style PGO on AIX
This patch switches the PGO implementation on AIX from using the runtime
registration-based section tracking to the __start_SECNAME/__stop_SECNAME
based. In order to enable the recognition of __start_SECNAME/__stop_SECNAME
symbols in the AIX linker, the -bdbg:namedsects:ss needs to be used.

Reviewed By: jsji, MaskRay, davidxl

Differential Revision: https://reviews.llvm.org/D124857
2022-05-05 04:10:39 +00:00
Eric Li 58abe36ae7 [clang][dataflow] Add flowConditionIsTautology function
Provide a way for users to check if a flow condition is
unconditionally true.

Differential Revision: https://reviews.llvm.org/D124943
2022-05-05 03:57:43 +00:00
Patryk Wychowaniec 6641c57aeb [AVR] Always expand STDSPQRr & STDWSPQRr
Currently, STDSPQRr and STDWSPQRr are expanded only during
AVRFrameLowering - this means that if any of those instructions happen
to appear _outside_ of the typical FrameSetup / FrameDestroy
context, they wouldn't get substituted, eventually leading to a crash:

```
LLVM ERROR: Not supported instr: <MCInst XXX <MCOperand Reg:1>
<MCOperand Imm:15> <MCOperand Reg:53>>
```

This commit fixes this issue by moving expansion of those two opcodes
into AVRExpandPseudo.

This bug was originally discovered due to the Rust compiler_builtins
library. Its 0.1.37 release contained a 128-bit software
division/remainder routine that exercised this buggy branch in the code.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123528
2022-05-05 03:10:59 +00:00
Lian Wang 8bb10436ab [RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124660
2022-05-05 03:05:52 +00:00
Phoebe Wang aa25b55bde [X86] Add `void` to void function. NFC 2022-05-05 10:58:46 +08:00
Luo, Yuanke 373ce14760 [X86][AMX] Replace PXOR instruction with SET0 in AMX pre config.
To generate zero value, the PXOR instruction need 3 operands that is
tied to the same vreg. If is not good in SSA form and with undef value
two address instruction pass may convert
`%0:vr128 = PXORrr undef %0, undef %0`
to `%1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128`.
It is not expected.
It can be simplified to SET0 instruction which only take 1 destination
operand. It should be more friendly to two address instruction pass and
register allocation pass.
`%0:vr128 = V_SET0`
Also add AVX1 code path so that it is consistant to other code.

Differential Revision: https://reviews.llvm.org/D124903
2022-05-05 10:44:57 +08:00
Ben Shi dc66897d4c [Disassembler][AVR] Remove unused static functions
The unused static functions cause failures on some build machines.
2022-05-05 02:20:27 +00:00
Craig Topper 589517925b [X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.
Without this, the pass doesn't show up in print-before/after-all.

Differential Revision: https://reviews.llvm.org/D124973
2022-05-04 19:09:06 -07:00
Craig Topper 572dfef1db [SelectionDAG] Use llvm::any_of to simplify a loop. NFC 2022-05-04 19:09:06 -07:00
Ben Shi b1dcd6bafb [MC][AVR] Implement decoding ST/LD
Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D123476
2022-05-05 01:53:59 +00:00
Ben Shi cef2739d68 [MC][AVR] Implement decoding STD/LDD
Reviewed By: aykevl, dylanmckay

Differential Revision: https://reviews.llvm.org/D123442
2022-05-05 01:53:49 +00:00
Alexander Shaposhnikov ec7122f64b [InstCombine] Fold ((A&B)^C)|B
Fold ((A&B)^C)|B into C|B.

https://alive2.llvm.org/ce/z/zSGSor

This addresses the issue https://github.com/llvm/llvm-project/issues/55169

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D124710
2022-05-05 00:56:20 +00:00
Ayke van Laethem 514371c370
[compiler-rt][AVR] Fix avr_SOURCES CMake variable
D123200 did not include the generic sources, which means that only the
AVR-specific sources were compiled. With this change, generic sources
are included as expected.

Tested with the following commands:

    cmake -G Ninja -DCOMPILER_RT_DEFAULT_TARGET_TRIPLE=avr -DCOMPILER_RT_BAREMETAL_BUILD=1 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_C_FLAGS="--target=avr -mmcu=avr5 -nostdlibinc -mdouble=64" ../path/to/builtins

    ninja

Differential Revision: https://reviews.llvm.org/D124969
2022-05-05 02:29:04 +02:00
Craig Topper 60cb489685 [RISCV] Use movImm went multiplying by simm12 in getVLENFactoredAmount.
No reason to special case simm12, movImm handles all immediates.

This also fixe a bug that we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
2022-05-04 17:23:22 -07:00
Alexander Shaposhnikov 640f1e0829 [InstCombine][NFC] Update comment in and-xor-or.ll 2022-05-05 00:07:49 +00:00
Alexander Shaposhnikov 46bef4d713 [InstCombine][NFC] Add baseline tests for folds of ((A&B)^C)|B
Differential revision: https://reviews.llvm.org/D124709

Test plan: make check-all
2022-05-05 00:04:33 +00:00