Commit Graph

417939 Commits

Author SHA1 Message Date
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Joseph Huber 24ebdb6c25 [CUDA] Add CUDA fatbinary magic
Nvidia uses fatbinaries to bundle all of their device code. This patch
adds the magic number "0x50ed55ba" used in their propeitary format to
the list of magic identifies. This is technically undocumented and could
unlikely be changed by Nvidia in the future.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D120932
2022-03-14 20:08:31 -04:00
Joseph Huber 23d885b3a2 [OpenMP][NFC] Refactor new driver to be more general
This path refactors the new driver to be less dependent on OpenMP. This
is done in preparation for the new driver to be able to handle other
offloading kinds and compile them together.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120934
2022-03-14 20:08:29 -04:00
Joseph Huber 9f89769cd7 [Clang] Add offload kind to embedded offload object
This patch adds the offload kind to the embedded section name in
preparation for offloading to different kinda like CUDA or HIP.

Depends on D120288

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120271
2022-03-14 20:08:27 -04:00
Joseph Huber 06b336c4cd [OpenMP] Implement dense map info for device file
This patch implements a DenseMap info struct for the device file type.
This is used to help grouping device files that have the same triple and
architecture. Because of this the filename, which will always be unique
for each file, is not used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120288
2022-03-14 20:08:26 -04:00
Joseph Huber 806bbc49dc [OpenMP] Try to embed offloading objects after codegen
Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120270
2022-03-14 20:08:24 -04:00
Stanislav Mekhanoshin c4500de255 [AMDGPU] gfx940: disable OP_SEL on V_DOT instructions
Differential Revision: https://reviews.llvm.org/D121634
2022-03-14 17:02:00 -07:00
Vy Nguyen 0d5e27623a Reland "[lld-macho] Avoid using bump-alloc in TrieBuider""
This reverts commit ee7a286cd3.
2022-03-14 19:33:13 -04:00
Stanislav Mekhanoshin 8dd3d1cf1f [AMDGPU] Add symbolic names for gfx940 HWREGs
The namespaces of HWREGs is now overlapping with gfx10. Thus the
patch is longer than necessary to just support new names. It also
need to handle proper error messages, i.e. to issue a "specified
hardware register is not supported on this GPU" message.

This may need a major refactoring in the future.

Differential Revision: https://reviews.llvm.org/D121418
2022-03-14 16:13:33 -07:00
Andrew Browne dbf8c00b09 [DFSan] Remove trampolines to unblock opaque pointers. (Reland with fix)
https://github.com/llvm/llvm-project/issues/54172

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D121250
2022-03-14 16:03:25 -07:00
Stanislav Mekhanoshin 23499103f7 [AMDGPU] Support for gfx940 flat lds opcodes
Differential Revision: https://reviews.llvm.org/D121414
2022-03-14 15:46:19 -07:00
Stanislav Mekhanoshin 1f53f20fc1 [AMDGPU] Support gfx940 v_lshl_add_u64 instruction
Differential Revision: https://reviews.llvm.org/D121401
2022-03-14 15:45:42 -07:00
Ryan Prichard 659029302d [ARM] __cxa_end_cleanup: avoid clobbering r4
The fix for D111703 clobbered r4 both to:
 - Save/restore the original lr.
 - Load the address of _Unwind_Resume for LIBCXXABI_BAREMETAL.

This patch saves and restores lr without clobbering any extra
registers.

For LIBCXXABI_BAREMETAL, it is still necessary to clobber one extra
register to hold the address of _Unwind_Resume, but it seems better to
use ip/r12 (intended for linker veneers/trampolines) than r4 for this
purpose.

The function also clobbers r0 for the _Unwind_Resume function's
parameter, but that is unavoidable.

Reviewed By: danielkiss, logan, MaskRay

Differential Revision: https://reviews.llvm.org/D121432
2022-03-14 15:44:35 -07:00
Dávid Bolvanský 56e7d6bd44 [Clang] noinline stmt attribute - emit warnings rather than errors
Compatible behaviour with always_inline stmt attribute
2022-03-14 23:40:17 +01:00
Jim Ingham 33f9fc77d1 Don't report memory return values on MacOS_arm64 of SysV_arm64 ABI's.
They don't require that the memory return address be restored prior to
function exit, so there's no guarantee the value is correct.  It's better
to return nothing that something that's not accurate.

Differential Revision: https://reviews.llvm.org/D121348
2022-03-14 15:25:40 -07:00
Stanislav Mekhanoshin 36fe3f13a9 [AMDGPU] flat scratch SVS addressing mode for gfx940
Both VADDR and SADDR are used in SVS mode.

Differential Revision: https://reviews.llvm.org/D121254
2022-03-14 15:23:36 -07:00
Sterling Augustine ee7a286cd3 Revert "[lld-macho] Avoid using bump-alloc in TrieBuider"
This reverts commit e049a87f04.

That commit breaks the build with errors of the form:

/usr/local/google/home/saugustine/llvm/llvm-project/lld/MachO/ExportTrie.cpp:148:11: error: definition of implicitly declared destructor
TrieNode::~TrieNode() {
2022-03-14 15:23:04 -07:00
Peter Klausler 0d9b0f642b [flang] IEEE_ARITHMETIC must imply USE IEEE_EXCEPTIONS
The intrinsic module IEEE_ARITHMETIC must incorporate the public
names from the intrisic module IEEE_EXCEPTIONS.  Rename IEEE_EXCEPTIONS
to __Fortran_ieee_exceptions so that it won't clash with the
nonintrinsic namespace, establish a new intrinic IEEE_EXCEPTIONS
module that USEs it, and add a USE to IEEE_ARITHMETIC.

Updated to use STREQUAL rather than ambiguous MATCHES in
the CMakeLists.txt file.

Differential Revision: https://reviews.llvm.org/D121490
2022-03-14 15:13:43 -07:00
Stanislav Mekhanoshin 47bac63d3f [AMDGPU] gfx940 memory model
Differential Revision: https://reviews.llvm.org/D121242
2022-03-14 15:01:46 -07:00
Andrew Litteken 228cc2c38b [IROutliner] Ensure merged PHINodes respect order and incoming blocks, not just incoming values
When matching PHINodes when margining functions the IROutliner only checks that an incoming value exists in phi node in overall function. It doesn't check the length, the order, or that the incoming block also matches. In the given example, we see that both phi nodes have the same incoming values, but from different blocks.

The fix is to to enforce stricter a match of the incoming value, and the incoming block as well when matching the created phi nodes.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D121310
2022-03-14 16:48:21 -05:00
Craig Topper ce78e68261 [InstCombine] Fold select based logic of fcmps with same operands when FMF is present.
If we have a logical and/or in select form and the true/false operand
is an fcmp with poison generating FMF, we won't be able to fold it
to an and/or instruction. This prevents us from optimizing the case
where it is a logical operation of two fcmps with identical operands.

This patch adds explicit checks for this case that doesn't rely on
converting to and/or to do the optimization. It reuses the existing
foldLogicOfFCmps, but adds a new flag to disable the other combine
that is inside that function.

FMF flags from the two FCmps are intersected using the logic added in
D121243. The FIXME has been updated to indicate that we can only use
a union for the non-select form.

This allows us to optimize cases like this from compare-fp-3.c in the
gcc torture suite with fast math.

void
test1 (float x, float y)
{
  if ((x==y) && (x!=y))
    link_error0();
}

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D121323
2022-03-14 14:45:07 -07:00
Nick Desaulniers 236695e70c [IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC
2 of the 3 callsite of IRMover::move() pass empty lambda functions. Just
make this parameter llvm::unique_function.

Came about via discussion in D120781. Probably worth making this change
regardless of the resolution of D120781.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121630
2022-03-14 14:37:34 -07:00
Vy Nguyen e049a87f04 [lld-macho] Avoid using bump-alloc in TrieBuider
The code can be used in multi-threads and the allocator is not thread safe.

fixes PR/54378

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D121638
2022-03-14 17:22:53 -04:00
Simon Pilgrim edd3e705bb [X86] Fix avx512.mask.vpshld/vpshrd tests to correctly test maskz cases 2022-03-14 21:20:26 +00:00
Dávid Bolvanský a717e9d47e [AttrDocs] try to fix build 2022-03-14 22:19:55 +01:00
Fangrui Song c30e6447c0 [ELF] Move section assignment from initializeSymbols to postParse
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).

* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
  has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
  is not ideal but we report a diagnostic to inform that this is unsupported.
  (See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
  section error) makes more sense

Depends on D120640

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D120626
2022-03-14 14:13:41 -07:00
Stella Laurenzo 0c3156bd43 NFC: Remove unterminated string from Python pyi file. 2022-03-14 14:10:38 -07:00
Stanislav Mekhanoshin 72a9e5f891 [AMDGPU] Restrict machine copy propagation from creating unaligned classes
Fixes: SWDEV-326366

Differential Revision: https://reviews.llvm.org/D121491
2022-03-14 14:09:40 -07:00
Valentin Clement 40e82feef0
[flang] Lower any intrinsic
This patch lowers the `any` intrinsic function to FIR.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D121609

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>
2022-03-14 22:07:05 +01:00
Amara Emerson 8cbf18cb04 [GlobalISel] Fix store merging incorrectly merging volatile stores.
The existing volatile checks only handle aliasing hazards between stores,
but that isn't enough since by that point volatile stores may have already
been added to the current candidate group.
2022-03-14 13:48:51 -07:00
Andrew Browne edc33fa569 Revert "[DFSan] Remove trampolines to unblock opaque pointers."
This reverts commit 84af90336f.
2022-03-14 13:47:41 -07:00
Dávid Bolvanský 003c0b9307 [Clang] always_inline statement attribute
Motivation:

```
int test(int x, int y) {
    int r = 0;
    [[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here
    return r;
}
```

In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).

This patch solves this problem with call site attribute. "noinline" statement attribute already landed in D119061. Also, some LLVM Inliner fixes landed so call site attribute is stronger than function attribute.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D120717
2022-03-14 21:45:31 +01:00
Andrew Browne 84af90336f [DFSan] Remove trampolines to unblock opaque pointers.
https://github.com/llvm/llvm-project/issues/54172

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D121250
2022-03-14 13:39:49 -07:00
Florian Mayer 628c537b32 [MTE] Add test that stack tagging does not mess up stack coloring.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D121433
2022-03-14 13:36:21 -07:00
Shafik Yaghmour 28c878aeb2 [LLDB] Applying clang-tidy modernize-use-default-member-init over LLDB
Applied modernize-use-default-member-init clang-tidy check over LLDB.
It appears in many files we had already switched to in class member init but
never updated the constructors to reflect that. This check is already present in
the lldb/.clang-tidy config.

Differential Revision: https://reviews.llvm.org/D121481
2022-03-14 13:32:03 -07:00
Andrew Litteken c79ab1065e [IROutliner] Separate split PHI nodes from multiple exits by different outlinable regions.
The IR Outliner is supposed to extract the outputs contained in an external phi node and place them into a phi node contained within the outlined function. However, when the output values of two outlined functions with two different output sets are contained within the same phi node, they are counted as the same exit path when first analyzed. In reality, these create two different phi nodes, creating an inconsistency, resulting in a mismatch in the expected number of output paths and a crash.  This fixes that counting when analyzing the outputs by also analyzing the incoming blocks rather than just the incoming values.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D121313
2022-03-14 14:56:59 -05:00
Peter Collingbourne aaca634c94 gn build: Add support for building with libcurl.
Differential Revision: https://reviews.llvm.org/D121260
2022-03-14 12:52:19 -07:00
Florian Hahn 4a0481e981
[LV] Check for users of truncated IVs, add more detailed comment.
Add missing outside user check for truncated IVs. Also hoist the code in
the helper with additional explanations.

Fixes #54370.
2022-03-14 19:39:30 +00:00
Amir Ayupov 842fa38dbe [X86] Fix cosmetic issues in instruction mnemonics
- Remove spurious } in invlpgb mnemonic
- Add \t between mnemonic and operands for ud1 instructions

Reviewed By: skan, craig.topper

Differential Revision: https://reviews.llvm.org/D121570
2022-03-14 12:29:44 -07:00
Jonas Devlieghere dde487e547
[lldb] Plumb process host architecture through platform selection
To allow us to select a different platform based on where the process is
running, plumb the process host architecture through platform selection.

This patch is in preparation for D121444 which needs this functionality
to tell apart iOS binaries running on Apple Silicon vs on a remote iOS
device.

Differential revision: https://reviews.llvm.org/D121484
2022-03-14 12:17:01 -07:00
Fangrui Song c7cf960d85 [ELF] Set the priority of STB_GNU_UNIQUE the same as STB_WEAK
In GCC -fgnu-unique output, STB_GNU_UNIQUE symbols are always defined relative
to a section in a COMDAT group. Currently `other` cannot be STB_GNU_UNIQUE for
valid input, so this patch is NFC.

If we switch to the model that ignores COMDAT resolution when performing symbol
resolution (D120626), this will fix bogus `relocation refers to a symbol in a
discarded section` errors when mixing -fno-gnu-unique objects with -fgnu-unique
objects.

Differential Revision: https://reviews.llvm.org/D120640
2022-03-14 12:00:15 -07:00
Ben Barham 3466b8e23d [Support] Add const to `FileError::getFileName`
`getFileName` returns a `StringRef`, there's no reason it shouldn't be
const.

Differential Revision: https://reviews.llvm.org/D121495
2022-03-14 11:45:29 -07:00
Ben Barham cc63ae42d7 [VFS] Rename `RedirectingFileSystem::dump` to `print`
The rest of LLVM uses `print` for the method taking the `raw_ostream`
and `dump` only for the method with no parameters. Use the same for
`RedirectingFileSystem`.

Differential Revision: https://reviews.llvm.org/D121494
2022-03-14 11:44:07 -07:00
Kazu Hirata 9286786e87 [CodeGen] Remove an unused variable introduced in D121128 2022-03-14 11:41:04 -07:00
Fangrui Song 407c721ceb [Support] Change zlib::compress to return void
With a sufficiently large output buffer, the only failure is Z_MEM_ERROR.
Check it and call the noreturn report_bad_alloc_error if applicable.
resize_for_overwrite may call report_bad_alloc_error as well.

Now that there is no other error type, we can replace the return type with void
and simplify call sites.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D121512
2022-03-14 11:38:04 -07:00
Jonas Devlieghere b0a76b0162
[lldb] Fix the Windows build after D121536 2022-03-14 11:22:21 -07:00
Peter Klausler 3b61587c9e [flang] LBOUND() edge case: empty dimension
LBOUND must return 1 for an empty dimension, no matter what
explicit expression might appear in a declaration or arrive in
a descriptor.

Differential Revision: https://reviews.llvm.org/D121488
2022-03-14 11:16:09 -07:00
Eric Schweitz c2e7e75954 Write a pass to annotate constant operands on FIR ops. This works
around the feature in MLIR's canonicalizer, which considers the semantics
of constants differently based on how they are packaged and not their
values and use.  Add test.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D121625
2022-03-14 11:14:44 -07:00
Andrzej Warzynski 75d74d99c7 Revert "[flang] IEEE_ARITHMETIC must imply USE IEEE_EXCEPTIONS"
This reverts commit b6a7600491. It caused
the following build failure:
```
ninja: error: dependency cycle: include/flang/__fortran_ieee_exceptions.mod -> include/flang/__fortran_ieee_exceptions.mod
```

See e.g.:
* https://lab.llvm.org/buildbot/#/builders/172/builds/9595

To reproduce:
```
cmake -G Ninja \
  -DLLVM_TARGETS_TO_BUILD=host \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="clang;flang" \
  ../../llvm
ninja check-flang
```
2022-03-14 18:05:26 +00:00
Fraser Cormack a44aeab526 [RISCV] Add MIR tests exposing missed InstAliases
The InstAlias framework cannot match registers against zero_reg, which
RVV uses to encode unmasked operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D92228
2022-03-14 17:53:07 +00:00