Commit Graph

415911 Commits

Author SHA1 Message Date
Adrian Prantl efe9fd08e0 Disable test on big endian machines. Yaml2obj has problems there. 2022-02-22 12:48:19 -08:00
Adrian Prantl a23f7c0cb6 Remove dead code. 2022-02-22 12:48:19 -08:00
Dmitry Vassiliev 90a3b31091 [Transforms] Enhance CorrelatedValuePropagation to handle both values of select
The "Correlated Value Propagation" pass was missing a case when handling select instructions. It was only handling the "false" constant value, while in NVPTX the select may have the condition (and thus the branches) inverted, for example:
```
loop:
	%phi = phi i32* [ %sel, %loop ], [ %x, %entry ]
	%f = tail call i32* @f(i32* %phi)
	%cmp1 = icmp ne i32* %f, %y
	%sel = select i1 %cmp1, i32* %f, i32* null
	%cmp2 = icmp eq i32* %sel, null
	br i1 %cmp2, label %return, label %loop
```
But the select condition can be inverted:
```
	%cmp1 = icmp eq i32* %f, %y
	%sel = select i1 %cmp1, i32* null, i32* %f
```
The fix is to enhance "Correlated Value Propagation" to handle both branches of the select instruction.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D119643
2022-02-23 00:11:20 +04:00
Dmitry Vassiliev 0b302be023 [Transforms] Pre-commit test cases for CorrelatedValuePropagation to handle both values of select
This is a pre-commit of test cases relevant for D119643.
CorrelatedValuePropagation should handle inverted select condition, but it does not yet.
2022-02-23 00:10:05 +04:00
Kai Nacke 30053c1445 [SystemZ/z/OS] Add va intrinsics for XPLINK
Add support for va intrinsics for the XPLINK ABI.
Only the extended vararg variant, which uses a pointer to next
argument, is supported. The standard variant will build on this.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120148
2022-02-22 14:35:05 -05:00
Zarko Todorovski 7fb02d2752 [libc++][AIX] Add AIX error message as expected output
AIX's libc generates "Error -1 occurred" instead of the "Unknown Error"
expected by these test cases. Add this as expected output for AIX only.

Reviewed By: daltenty, #powerpc, #libc, zibi, Quuxplusone

Differential Revision: https://reviews.llvm.org/D119982
2022-02-22 14:34:36 -05:00
Rainer Orth cb8e9bea95 [sanitizer_common] Use GetStaticTlsBoundary on Solaris 11.4
This is a restricted alternative to D91605
<https://reviews.llvm.org/D91605> which only works on Solaris 11.4 SRU 10+,
but would break the build on Solaris 11.3 and Illumos which lack
`dlpi_tls_modid`.

Apart from that, the patch is trivial.  One caveat is that the
`sanitizer_common` and `asan` tests need to be linked explicitly with `ld
-z relax=transtls` on Solaris/amd64 since the archives with calls to
`__tls_get_addr` are linked in directly.

Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D120048
2022-02-22 20:18:22 +01:00
Rainer Orth b1fc966d2e [Driver] Support Solaris/amd64 GetTls
This is the driver part of D91605 <https://reviews.llvm.org/D91605>, a
workaround to allow direct calls to `__tls_get_addr` on Solaris/amd64.

Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D119829
2022-02-22 20:14:33 +01:00
Adrian Prantl a3bfb01d94 Add support for chained fixup load commands to MachOObjectFile
This is part of a series of patches to upstream support for Mach-O chained fixups.

This patch adds support for parsing the chained fixup load command and
parsing the chained fixups header. It also puts into place the
abstract interface that will be used to iterate over the fixups.

Differential Revision: https://reviews.llvm.org/D113630
2022-02-22 11:06:27 -08:00
Adrian Prantl 621e2de138 Add a (nonfunctional) -dyld_info flag to llvm-objdump.
Darwin otool implements this flag as a one-stop solution for
displaying bind and rebase info. As I am working on upstreaming
chained fixup support this command will be useful to write testcases.

Differential Revision: https://reviews.llvm.org/D113573
2022-02-22 11:06:27 -08:00
Reid Kleckner de2cc2a002 Reland "[mlir][pdl] NFC re-add NoSideEffect to Result and Results Op"
This reverts commit 9865c3f28a.

Looks like our commits raced and Jeff fixed the build issue.
2022-02-22 10:48:25 -08:00
Mogball ef7b9824cd [mlir][pdl] NFC fix missing include 2022-02-22 18:47:25 +00:00
Reid Kleckner 9865c3f28a Revert "[mlir][pdl] NFC re-add NoSideEffect to Result and Results Op"
This reverts commit 63eb963e58.

Breaks MLIR build.
2022-02-22 10:46:49 -08:00
Reid Kleckner ecb27004ec Revert "[AArch64] Alter mull shuffle(ext(..)) combine to work on buildvectors"
This reverts commit 9fc1a0dcb7.

We have bisected a compiler crash to this revision and will provide a
test case soon.
2022-02-22 10:31:09 -08:00
Mogball 63eb963e58 [mlir][pdl] NFC re-add NoSideEffect to Result and Results Op 2022-02-22 18:27:44 +00:00
Philip Reames 8612b11c86 [SLP] Use isInSchedulingRegion consistently [NFC] 2022-02-22 10:27:16 -08:00
Wouter van Oortmerssen d657c6893f [WebAssembly] Allow .data shorthand for .section .data,"",@ 2022-02-22 10:26:36 -08:00
Shubham Sandeep Rastogi c5256412b7 Updated reflection-dump.test for mpenum section
With 1c1e2cce9a a new swift5 reflection section for multi-payload enum mask information was added, which is called mpenum. This change simply adds a check to make sure dsymutil can dump out information in that section into the dSYM bundle.

Differential Revision: https://reviews.llvm.org/D120291
2022-02-22 10:21:22 -08:00
Mogball 1da213836b [pdl] Remove `NoSideEffect` from all PDL ops
This trait results in PDL ops being erroneously CSE'd. These ops are side-effect free in the rewriter but not in the matcher (where unused values aren't allowed anyways). These ops should have a more nuanced side-effect modeling, this is fixing a bug introduced by a previous change.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D120222
2022-02-22 18:20:59 +00:00
Philip Reames 0539a26d91 [SLP] Schedule only sub-graph of vectorizable instructions
SLP currently schedules all instructions within a scheduling window which stretches from the first instruction potentially vectorized to the last. This window can include a very large number of unrelated instructions which are not being considered for vectorization. This change switches the code to only schedule the sub-graph consisting of the instructions being vectorized and their transitive users.

This has the effect of greatly reducing the amount of work performed in large basic blocks, and thus greatly improves compile time on degenerate examples. To understand the effects, I added some statistics (not planned for upstream contribution). Here's an illustration from my motivating example:

Before this patch:

704357 SLP                          - Number of calcDeps actions
 699021 SLP                          - Number of schedule calls
   5598 SLP                          - Number of ReSchedule actions
     59 SLP                          - Number of ReScheduleOnFail actions
  10084 SLP                          - Number of schedule resets
   8523 SLP                          - Number of vector instructions generated

After this patch:

102895 SLP                          - Number of calcDeps actions
 161916 SLP                          - Number of schedule calls
   5637 SLP                          - Number of ReSchedule actions
     55 SLP                          - Number of ReScheduleOnFail actions
  10083 SLP                          - Number of schedule resets
   8403 SLP                          - Number of vector instructions generated

I do want to highlight that there is a small difference in number of generated vector instructions. This example is hitting the bailout due to maximum window size, and the change in scheduling is slightly perturbing when and how we hit it. This can be seen in the RescheduleOnFail counter change. Given that, I think we can safely ignore.

The downside of this change can be seen in the large test diff. We group all vectorizable instructions together at the bottom of the scheduling region. This means that vector instructions can move quite far from their original point in code. While maybe undesirable, I don't see this as being a major problem as this pass is not intended to be a general scheduling pass.

For context, it's worth noting that the pre-scheduling that SLP does while building the vector tree is exactly the sub-graph scheduling implemented by this patch.

Differential Revision: https://reviews.llvm.org/D118538
2022-02-22 10:15:55 -08:00
Valentin Clement 026a43f6cf
[flang] Update PFTBuilder
This patch update the PFTBuilder to be able to lower
the construct present in semantics.

This is a building block for other lowering patches that will be posted soon.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld, schweitz

Differential Revision: https://reviews.llvm.org/D120336

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
2022-02-22 19:09:28 +01:00
Fangrui Song 88d66f6ed1 [ELF] Move duplicate symbol check after input file parsing
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

To decouple symbol initialization and section initialization, `Defined::section`
assignment should be postponed after input file parsing. To avoid spurious
duplicate definition error due to two definitions in COMDAT groups of the same
signature, we should postpone the duplicate symbol check.

The function is called postScan instead of a more specific name like
checkDuplicateSymbols, because we may merge Symbol::mergeProperties into
postScan. It is placed after compileBitcodeFiles to apply to ET_REL files
produced by LTO. This causes minor diagnostic regression
for skipLinkedOutput configurations: ld.lld --thinlto-index-only a.bc b.o
(bitcode definition prevails) won't detect duplicate symbol error. I think this
is an acceptable compromise. The important cases where (a) both files are
bitcode or (b) --thinlto-index-only is unused are still detected.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D119908
2022-02-22 10:07:58 -08:00
Shilei Tian 104d9a6743 [Clang][OpenMP] Add the codegen support for `atomic compare`
This patch adds the codegen support for `atomic compare` in clang.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D118632
2022-02-22 13:01:39 -05:00
Aaron Ballman 16994a2cfa Fix the Sphinx build after f8cedc642d 2022-02-22 12:50:39 -05:00
Jay Foad b47e2dc91f [StableHashing] Hash machine basic blocks and functions
This adds very basic support for hashing MachineBasicBlock
and MachineFunction, for use in MachineFunctionPass to
detect passes that modify the MachineFunction wrongly.

Differential Revision: https://reviews.llvm.org/D120122
2022-02-22 17:38:47 +00:00
Jay Foad 0e74d75a29 [StructurizeCFG] Fix boolean not bug
D118623 added code to fold not-of-compare into a compare
with the inverted predicate, if the compare had no other
uses. This relies on accurate use lists in the IR but it
was run before setPhiValues, when some phi inputs are still
stored in a data structure on the side, instead of being
real uses in the IR. The effect was that a phi that should
be using the original compare result would now get an
inverted result instead.

Fix this by moving simplifyConditions after setPhiValues.

Differential Revision: https://reviews.llvm.org/D120312
2022-02-22 17:36:20 +00:00
Simon Atanasyan cedc23bc86 [MIPS] Add `-no-pie` option to the clang driver's tests depend on it 2022-02-22 20:24:21 +03:00
Stanislav Mekhanoshin 9e055c0fff [AMDGPU] Extend SILoadStoreOptimizer to handle global saddr loads
This adds handling of the _SADDR forms to the GLOBAL_LOAD combining.

TODO: merge global stores.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120285
2022-02-22 09:01:43 -08:00
Nikita Popov f4e9df22b5 [InstCombine] Add test for missed select fold due to one use limitation (NFC)
The eq sub zero fold currently has an artificial one-use limitation,
causing us to miss this fold.
2022-02-22 17:57:00 +01:00
Stanislav Mekhanoshin ba17bd2674 [AMDGPU] Extend SILoadStoreOptimizer to handle global loads
There can be situations where global and flat loads and stores are not
combined by the vectorizer, in particular if their address space
differ in the IR but they end up the same class instructions after
selection. For example a divergent load from constant address space
ends up being the same global_load as a load from global address space.

TODO: merge global stores.
TODO: handle SADDR forms.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.

Differential Revision: https://reviews.llvm.org/D120279
2022-02-22 08:42:36 -08:00
Nikita Popov b6eafba296 [Bitcode] Store type IDs for values
This is the next step towards supporting bitcode auto upgrade with
opaque pointers. The ValueList now stores the Value* together with
its associated type ID, which allows inspecting the original pointer
element type of arbitrary values.

This is a largely mechanical change threading the type ID through
various places. I've left TODOTypeID placeholders in a number of
places where determining the type ID is either non-trivial or
requires allocating a new type ID not present in the original
bitcode. For this reason, the new type IDs are also not used for
anything yet (apart from propagation). They will get used once the
TODOs are resolved.

Differential Revision: https://reviews.llvm.org/D119821
2022-02-22 17:27:06 +01:00
serge-sans-paille 79c9072dc0 Restore documentation for __builtin_assume
This got removed by 6cacd420a1, and that was a
mistake.

Differential Revision: https://reviews.llvm.org/D120205
2022-02-22 17:19:11 +01:00
tyb0807 8e10448cbb [AArch64] Remove unused feature flags from AArch64TargetInfo
This removes two feature flags from `AArch64TargetInfo` class:

- `HasHBC`: this feature does not involve generating any IR intrinsics,
so clang does not need to know about whether it is set

- `HasCrypto`: this feature is deprecated in favor of finer grained
features such as AES, SHA2, SHA3 and SM4. The associated ACLE macro
__ARM_FEATURE_CRYPTO is thus no longer used.

Differential Revision: https://reviews.llvm.org/D118757
2022-02-22 16:13:44 +00:00
Marek Kurdej 071f870e7f [clang-format] Avoid parsing "requires" as a keyword in non-C++-like languages.
Fixes the issue raised post-review in D113319 (cf. https://reviews.llvm.org/D113319#3337485).

Reviewed By: krasimir

Differential Revision: https://reviews.llvm.org/D120324
2022-02-22 16:55:38 +01:00
Nemanja Ivanovic 2aaba44b5c [PowerPC] Allow absolute expressions in relocations
The Linux kernel build uses absolute expressions suffixed with @lo/@ha
relocations. This currently doesn't work for DS/DQ form instructions and
there is no reason for it not to. It also works with GAS.
This patch allows this as long as the value is a multiple of 4/16
for DS/DQ form.

Differential revision: https://reviews.llvm.org/D115419
2022-02-22 09:53:08 -06:00
Marek Kurdej fee4a9712f [clang-format] Use FormatToken::is* functions without passing through `Tok`. NFC. 2022-02-22 16:41:15 +01:00
Timm Bäder 535a23053b Fix docs build after f8cedc642d
Looks like rst doesn't like '#' in link texts. Just remove it.
2022-02-22 16:35:35 +01:00
Nikita Popov e075bf6bdb [CodeGen] Add test for PR53990 (NFC) 2022-02-22 16:32:20 +01:00
Timm Bäder f8cedc642d [clang] Never wrap a nullptr in CXXNewExpr::getArraySize()
Otherwise callers of these functions have to check both the return value
for and the contents of the returned llvm::Optional.

Fixes #53742

Differential Revision: https://reviews.llvm.org/D119525
2022-02-22 16:27:32 +01:00
Matthias Springer 5c4f749429 [mlir][bufferize] Fix GCC build
Differential Revision: https://reviews.llvm.org/D120326
2022-02-23 00:03:33 +09:00
Pavel Labath 126a2607a8 [lldb] Remove HostProcess:GetMainModule
the function is unused, and the posix implementation is only really correct on linux.
2022-02-22 16:00:58 +01:00
Timm Bäder 02571f86bb [clang][www] Port make_cxx_dr_status script to Python3
And run it to re-generate the cxx_dr_status.html

Differential Revision: https://reviews.llvm.org/D120313
2022-02-22 15:47:43 +01:00
Krasimir Georgiev c9592ae49b [clang-format] Fix preprocessor nesting after commit 529aa4b011
In 529aa4b011
by setting the identifier info to nullptr, we started to subtly
interfere with the parts in the beginning of the function,
529aa4b011/clang/lib/Format/UnwrappedLineParser.cpp (L991)
causing the preprocessor nesting to change in some cases. E.g., for the
added regression test, clang-format started incorrectly guessing the
language as C++.

This tries to address this by introducing an internal identifier info
element to use instead.

Reviewed By: curdeius, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D120315
2022-02-22 15:43:18 +01:00
Sander de Smalen ffa4dfc8de [AArch64][SME] Remove term 'streaming-sve' from assembler diagnostics.
'streaming-sve' is not a feature that users should be able to set,
hence why it shouldn't show up in user-diagnostics. The only
flag that end-users should be able to set is '+sme'.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D120256
2022-02-22 13:48:22 +00:00
Egor Zhdan 3a1cb36237 Add DriverKit support
This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang.

This code was originally authored by JF Bastien.

Differential Revision: https://reviews.llvm.org/D118046
2022-02-22 13:42:53 +00:00
Simon Moll 4fd77129f2 [VE] Split unsupported v512.32 ops
Split v512.32 binary ops into two v256.32 ops using packing support
opcodes (vec_unpack_lo|hi, vec_pack).

Depends on D120053 for packing opcodes.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D120146
2022-02-22 14:29:41 +01:00
Paul Walker 25ed2ab341 [SVE] Add isel patterns for SABA/UABA.
Differential Revision: https://reviews.llvm.org/D119830
2022-02-22 13:09:57 +00:00
Sanjay Patel ad7214f23d [x86] add load folding restriction to pushAddIntoCmovOfConsts()
With only a load-fold the diffs look neutral. If there's a load and store (rmw)
fold opportunity as shown in the test based on #53862, then we end up with an
extra instruction.

Fixes #53862

Differential Revision: https://reviews.llvm.org/D120281
2022-02-22 08:02:11 -05:00
Kiran Chandramohan f57627f544 [Flang] Initial patch to lower a Fortran intrinsic
This patch brings in some initial changes for lowering Fortran
intrinsics. Intrinsics are generally lowered to a mix of FIR and
MLIR operations, runtime calls or LLVM intrinsics. This patch
particularly brings in the lowering of the Fortran `andi` intrinsic
to `arith.andi` in MLIR.

The significant changes are in ConvertExpr.cpp and IntrinsicCall.cpp.
Intrinsic functions occur as part of expressions. Lowering deals with this
in ConvertExpr.cpp in `genval(const Fortran::evaluate::FunctionRef<A> &funcRef)`.
The code in the above mentioned function kicks of a sequence of calls
that ultimately results in a call to the `genIand ` function in
IntrinsicCall.cpp which creates the MLIR `arith.andi` operation.

A few tests are also included.

Note: Generally intrinsics like `iand` can occur in array (elemental)
context, but since that part is not fully supported in lowering, tests
are only added for the scalar context.

This patch is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D119990

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: zacharyselk <zrselk@gmail.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
2022-02-22 12:46:35 +00:00
Thomas Symalla 380ff31d83
[AMDGPU] Fix typo in comment [NFC]
This replaces "V_MOB_B32" with "V_MOV_B32" in some comment.
2022-02-22 13:27:26 +01:00