Use it in the AArch64 post-legal combiner. These don't always get folded because when
the instructions are created the constants are obscured by artifacts.
Differential Revision: https://reviews.llvm.org/D106776
Dominator trees were previously used for an optimization related to
`wasm.lsda` but the optimization was removed in D97309. Currently
dominators are not doing anything in this pass. Also removes some
`include` lines that are not needed for compilation.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D106811
This patch adds the always inline attribute to the outlined functions generated
by OpenMP regions. Because there is only a single instance of this function and
it always has internal linkage it is safe to inline in every instance it is
created. This could potentially lead to performance degradation due to
inflated register counts in the parallel region.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106799
This patch expands the tree item that corresponds to the selected thread
by default in the Threads window. Additionally, the tree root item, which
is the process in the Threads window, is always expanded.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D100243
When Emscripten EH mixes with Emscripten SjLj, we are not currently
handling some cases correctly. There are three cases:
1. The current function calls `setjmp` and there is an `invoke` to a
function that can either throw or longjmp. In this case, we have to
check both for exception and longjmp. We are currently handling this
case correctly:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1058-L1090)
When inserting routines for functions that can longjmp, which we do
only for setjmp-calling functions, we check if the function was
previously an `invoke` and handle it correctly.
2. The current function does NOT call `setjmp` and there is an `invoke`
to a function that can either throw or longjmp. Because there is no
`setjmp` call, we haven't been doing any check for functions that can
longjmp. But in that case, for `invoke`, we only check for an
exception and if it is not an exception we reset `__THREW__` to 0,
which can silently swallow the longjmp:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L70-L80)
This CL fixes this.
3. The current function calls `setjmp` and there is no `invoke`. Because
it is not an `invoke`, we haven't been doing any check for functions
that can throw, and only insert longjmp-checking routines for
functions that can longjmp. But in that case, if a longjmpable
function throws, we only check for a longjmp so if it is not a
longjmp we reset `__THREW__` to 0, which can silently swallow the
exception:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L156-L169)
This CL fixes this.
To do that, this moves around some code, so we register necessary
functions for both EH and SjLj and precompute some data (the set of
functions that contains `setjmp`) before doing actual EH or SjLj
transformation.
This CL makes the 2nd and 3rd tests in
https://github.com/emscripten-core/emscripten/pull/14732 work.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D106525
See LWG reflector thread of 2021-07-23 titled
'Question on ranges::advance and "past-the-sentinel iterators"'.
Test case heavily based on one graciously provided by Casey Carter.
Differential Revision: https://reviews.llvm.org/D106735
This is an NFC commit to normalize how we set target properties on the
various runtime targets. A follow-up patch is going to add new properties,
and I wanted that follow-up patch to be cleaner.
This patch adds a driver flag `-fopenmp-target-new-runtime` to optionally enable the new device runtime
bitcode library. This allows users to enable the new experimental runtime
before it becomes the default in the future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106793
Fix the external-io unittest under Windows.
In particular, fixes the following issues:
1. When creating a temporary file, open it with read+write permissions
using the _O_RDWR flag. _S_IREAD and _S_IWRITE are for the file
permissions of the created file.
2. _chsize returns 0 on success (just like ftruncate).
3. To set a std::optional, use its assignment operator instead of
getting a reference to its value and overwriting that. The latter is
invalid if the std::optional has no value, and is caught by
MSVC's debug STL.
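As an illustration of item 3, here is a minimal sketch. The helper name `setValue` and the `int` payload are hypothetical and not taken from the unit test; it only shows the safe assignment pattern.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper, not taken from the unit test: stores a value into an
// optional that may currently be empty.
inline void setValue(std::optional<int> &slot, int value) {
  // Unsafe alternative when `slot` is empty: `*slot = value;` dereferences a
  // disengaged optional, which is undefined behavior and is what a checked
  // debug STL reports.
  slot = value; // Assignment engages the optional if it was empty.
}
```

Calling `setValue` on a default-constructed `std::optional<int>` leaves it holding the value, with no precondition on its prior state.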
The non-GTest unittest is currently not executed under Windows because
of the added .exe extension to the output file: external-io.text.exe.
llvm-lit skips the file because .exe is not in the list of test
suffixes (.test is). D105315 is going to change that by converting it
to a GTest-test.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D106726
Nowadays, the SimplifyCFG pass already tail-merges all the ret blocks together
before doing anything, and it should not increase the count of ret's,
so this is dead code.
This patch replaces the workaround for simpler implicit moves
implemented in D105518.
The Microsoft STL currently has some issues with P2266.
Previously, with -fms-compatibility, we disabled simpler
implicit moves globally. With this change, we disable them only
when the returned expression is in a context contained in the
std namespace and is located within a system header.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: aaron.ballman, mibintc
Differential Revision: https://reviews.llvm.org/D105951
This fixes an assert firing when compiling code which involves 128-bit
integers.
This would trigger runtime checks similar to this:
```
Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646
```
To get around this, we just saturate those big values.
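The saturation idea can be sketched as follows. This is not the code from the patch: the helper name `saturateToInt64` is hypothetical, and `__int128` is a GCC/Clang extension used only to stand in for a wide value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: clamps a 128-bit signed value into the int64_t range
// instead of asserting that it fits.
inline std::int64_t saturateToInt64(__int128 value) {
  constexpr __int128 Max = std::numeric_limits<std::int64_t>::max();
  constexpr __int128 Min = std::numeric_limits<std::int64_t>::min();
  if (value > Max)
    return std::numeric_limits<std::int64_t>::max();
  if (value < Min)
    return std::numeric_limits<std::int64_t>::min();
  return static_cast<std::int64_t>(value); // Fits, so the cast is lossless.
}
```

Values that already fit pass through unchanged; only out-of-range values are clamped to the nearest bound.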
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D105320
Add td definitions and asm/disasm tests for the addex instruction introduced in
ISA 3.0.
Reviewed By: nemanjai, amyk, NeHuang
Differential Revision: https://reviews.llvm.org/D106666
Invalid costs can be used to avoid vectorization with a given VF, which is
used for scalable vectors to avoid things that the code-generator cannot
handle. If we override the cost using the -force-target-instruction-cost
option of the LV, we would override this mechanism, rendering the flag useless.
This change ensures the cost is only overridden when the original cost that
was calculated is valid. That allows the flag to be used in combination
with the -scalable-vectorization option.
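A minimal model of that rule, assuming an empty `std::optional` stands in for an invalid cost. The `Cost` alias and `applyForcedCost` are hypothetical names, not the vectorizer's API.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model: an empty optional represents an invalid cost.
using Cost = std::optional<unsigned>;

// Returns the forced cost only when one was requested and the computed cost
// is valid; an invalid computed cost is preserved so it can still veto the VF.
inline Cost applyForcedCost(Cost computed, std::optional<unsigned> forced) {
  if (forced && computed)
    return *forced;
  return computed;
}
```

With this shape, forcing a cost never turns an invalid cost into a valid one.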
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D106677
This change moves most of `sve-inductions.ll` to non-AArch64 specific
LV tests using the `-target-supports-scalable-vectors` flag, because they're
not explicitly AArch64-specific. One test builds on AArch64-specific
knowledge regarding masked loads/stores, and remains in sve-inductions.ll.
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.
Split from D105320, thanks to Matheus Izvekov for the test case and
report.
Differential Revision: https://reviews.llvm.org/D106585
This is a followup patch for D105930 to add implicit-def of RM for
mtfsb[01] instructions as per review comments.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D106603
Summary:
There was an unnecessary variable assigned to the information cache when we
only need it in the constructor to extract the function declaration.
Even though the standalone build is deprecated, some people are still
relying on it (including libc++ itself for some configurations). Setting
the target triple will ensure that the build and the test suite behave
consistently in the standalone and normal builds.
Differential Revision: https://reviews.llvm.org/D106800
On Apple platforms the builtins may be built for both arm64 and arm64e.
With Makefile generators separate targets are built using Make sub-invocations.
This causes a race when creating the symlink which may sometimes fail.
Work around this by using a custom target that the builtin targets depend on.
This causes any sub-invocations to depend on the symlinks having been created before.
Mailing list thread: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151822.html
Reviewed By: thakis, steven_wu
Differential Revision: https://reviews.llvm.org/D106305
XL provides functions __vec_ldrmb/__vec_strmb for loading/storing a
sequence of 1 to 16 bytes in big endian order, right justified in the
vector register (regardless of target endianness).
This is equivalent to vec_xl_len_r/vec_xst_len_r which are only
available on Power9.
This patch simply uses the Power9 functions when compiled for Power9,
but provides a more general implementation for Power8.
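A scalar model of the right-justified load may help picture the layout. It is only a sketch, not the Power8 implementation from the patch: `loadRightJustified` is a hypothetical name, index 0 of the array models the most significant (leftmost) byte of the register, and zero-filling the unused leftmost bytes is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 16-byte register viewed in big-endian order.
// Copies `n` bytes (1 to 16) so the last source byte lands in the rightmost
// lane; the remaining leftmost lanes are assumed to be zero.
inline std::array<std::uint8_t, 16> loadRightJustified(const std::uint8_t *src,
                                                       std::size_t n) {
  assert(n >= 1 && n <= 16 && "length must be 1 to 16 bytes");
  std::array<std::uint8_t, 16> reg{}; // Value-initialized, so all zero.
  std::copy(src, src + n, reg.begin() + (16 - n));
  return reg;
}
```

For a 3-byte input, the bytes occupy lanes 13 to 15 in source order, independent of the host's endianness.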
Differential revision: https://reviews.llvm.org/D106757
Leave the name section in the output when using the --strip-debug
flag. This treats it more like ELF symbol tables, as the name
section has similar uses at runtime (e.g. wasm engines understand
it and it can be used for symbolization at runtime).
Fixes https://github.com/emscripten-core/emscripten/issues/14623
Differential Revision: https://reviews.llvm.org/D106728
This patch resolves the paths in the file/directory fields before
performing checks. Those checks are applied on the file system if
m_need_to_exist is true, so remote files can set this to false to avoid
performing host-side file system checks. Additionally, methods to get
the resolved and the direct file specs were added for use by client code.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D106553
Proposed alternative to D105338.
This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).
Differential Revision: https://reviews.llvm.org/D106309
The legalizer generates selects for some operations, which can have constant
condition values, resulting in lots of dead code if it's not folded away.
Differential Revision: https://reviews.llvm.org/D106762
The previous patch included the implementations for the scudo allocator,
but not the wrappers. This change fixes that.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D106718
Both `__THREW__` and `__threwValue` are global variables, and we have
been distinguishing the global variable `__THREW__` and the loaded value
`%__THREW__.val` in comments but not doing it for `__threwValue`. Made
the pseudocode comments consistent for both variables.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D106524
Tosa shape verification prevents shape propagation when coming from a dialect
of known shape. Relax this constraint to allow ingestion / shape propagation
from these other dialects.
Differential Revision: https://reviews.llvm.org/D106610
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.
RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.
For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D105481
During tail duplication, SSA values may be updated and have their uses
replaced with a virtual register, and any debug instructions that use
that value are deleted. This patch fixes the implementation of the debug
instruction deletion to work correctly for debug instructions that use
the SSA value multiple times, by batching deletions so that we don't
attempt to delete the same instruction twice.
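The batching approach can be sketched with plain containers. `DebugInstr` and `eraseDebugUsers` are hypothetical stand-ins, not LLVM types; the `erased` flag models deletion so that a double delete would be observable.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a debug instruction; not an LLVM type.
struct DebugInstr {
  int id;
  bool erased = false;
};

// `uses` holds one entry per operand that refers to the replaced value, so an
// instruction that uses the value twice appears twice. Collecting the users
// into a set first means each instruction is erased exactly once.
inline int eraseDebugUsers(const std::vector<DebugInstr *> &uses) {
  std::set<DebugInstr *> pending(uses.begin(), uses.end()); // Deduplicate.
  int erased = 0;
  for (DebugInstr *instr : pending) {
    assert(!instr->erased && "instruction erased twice");
    instr->erased = true;
    ++erased;
  }
  return erased;
}
```

Erasing while walking the use list directly would visit a multi-use instruction once per operand, which is the double delete the batching avoids.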
Differential Revision: https://reviews.llvm.org/D106557
This patch changes the index argument of lvxl?/lve[bhw]x and
stvxl?/stve[bhw]x builtins from int to long, because on 64-bit
subtargets an extra extsw would otherwise always be generated, which is
incorrect.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D106530
The getOrderedReductionCost implementation introduced in D105432 calls the CRTP base version of getArithmeticInstrCost instead of redirecting to the target version.
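The difference between the two call styles can be shown with a small CRTP sketch. All names here (`CostBase`, `TargetCost`, and the member functions) are hypothetical and do not mirror LLVM's actual interfaces or cost values.

```cpp
#include <cassert>

// Hypothetical CRTP cost model; names are illustrative only.
template <typename Derived> struct CostBase {
  int arithmeticCost() const { return 1; } // Generic default.

  // Bug pattern: the unqualified call binds to the base implementation, so a
  // target override is never consulted.
  int reductionCostDirect() const { return 4 * arithmeticCost(); }

  // Fix pattern: redirect through the derived type so the override is used.
  int reductionCostRedirected() const {
    return 4 * static_cast<const Derived *>(this)->arithmeticCost();
  }
};

struct TargetCost : CostBase<TargetCost> {
  int arithmeticCost() const { return 3; } // Target-specific version.
};
```

For a `TargetCost` object, the direct variant multiplies the generic default while the redirected variant picks up the target's value, which is the behavior the fix restores.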
Differential Revision: https://reviews.llvm.org/D106795