The change allows clang -mno-omit-leaf-frame-pointer to disable frame
pointer elimination. This behavior matches X86 and Mips, and also GCC
AArch64.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71168
This matches https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> -momit-leaf-frame-pointer
> -mno-omit-leaf-frame-pointer
>
> Omit or keep the frame pointer in leaf functions. The former behavior is the default.
-mno-omit-leaf-frame-pointer is currently a no-op because
TargetOptions::DisableFramePointerElim is only considered for non-leaf
functions.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71167
libcxx/test/std/containers/sequences/array/at.pass.cpp
Need to include <stdexcept> for std::out_of_range.
libcxx/test/std/localization/locale.categories/category.time/*
Need to include <ios> for std::ios.
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.
In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:
struct FunctionDescriptor {
void *EntryPoint;
void *TOCAnchor;
void *EnvironmentPointer;
};
An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:
1) Save the caller's TOC pointer into the TOC save slot in the linkage
area, and then load the callee's TOC pointer into the TOC register
(GPR 2 on AIX).
2) Load the function descriptor's entry point into the count register.
3) Load the environment pointer into the environment pointer register
(GPR 11 on AIX).
4) Perform the call by branching on count register.
5) Restore the caller's TOC pointer after returning from the indirect call.
A couple important caveats to the above:
- There is no way to directly load a value from memory into the count register.
Instead we populate the count register by loading the entry point address into
a gpr and then moving the gpr to the count register.
- The TOC restore has to come immediately after the branch on count register
instruction (i.e., the 1st instruction executed after we return from the
call). This is an implementation limitation. We could, in theory, schedule
the restore elsewhere as long as no uses of the TOC pointer fall in between
the call and the restore; however, to keep it simple, we insert a pseudo
instruction that represents both the indirect branch instruction and the
load instruction that restores the caller's TOC from the linkage area. As
they flow through the compiler as a single pseudo instruction, nothing can be
inserted between them and the caller's TOC is then valid at any use.
Differtential Revision: https://reviews.llvm.org/D70724
Legalization algorithm is complicated by two facts:
1) While regular instructions should be possible to legalize in
an isolated, per-instruction, context-free manner, legalization
artifacts can only be eliminated in pairs, which could be deeply, and
ultimately arbitrary nested: { [ () ] }, where which paranthesis kind
depicts an artifact kind, like extend, unmerge, etc. Such structure
can only be fully eliminated by simple local combines if they are
attempted in a particular order (inside out), or alternatively by
repeated scans each eliminating only one innermost pair, resulting in
O(n^2) complexity.
2) Some artifacts might in fact be regular instructions that could (and
sometimes should) be legalized by the target-specific rules. Which
means failure to eliminate all artifacts on the first iteration is
not a failure, they need to be tried as instructions, which may
produce more artifacts, including the ones that are in fact regular
instructions, resulting in a non-constant number of iterations
required to finish the process.
I trust the recently introduced termination condition (no new artifacts
were created during as-a-regular-instruction-retrial of artifacts not
eliminated on the previous iteration) to be efficient in providing
termination, but only performing the legalization in full if and only if
at each step such chains of artifacts are successfully eliminated in
full as well.
Which is currently not guaranteed, as the artifact combines are applied
only once and in an arbitrary order that has to do with the order of
creation or insertion of artifacts into their worklist, which is a no
particular order.
In this patch I make a small change to the artifact combiner, making it
to re-insert into the worklist immediate (modulo a look-through copies)
artifact users of each vreg that changes its definition due to an
artifact combine.
Here the first scan through the artifacts worklist, while not
being done in any guaranteed order, only needs to find the innermost
pair(s) of artifacts that could be immediately combined out. After that
the process follows def-use chains, making them shorter at each step, thus
combining everything that can be combined in O(n) time.
Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders
Reviewed By: aditya_nandakumar, paquette
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71448
and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp
relying on it to unit test the entire legalizer algorithm (including the
top-level main loop).
See also https://reviews.llvm.org/D71448
D39317 made clang use .init_array when no gcc installations is found.
This change changes all gcc installations to use .init_array .
GCC 4.7 by default stopped providing .ctors/.dtors compatible crt files,
and stopped emitting .ctors for __attribute__((constructor)).
.init_array should always work.
FreeBSD rules are moved to FreeBSD.cpp to make Generic_ELF rules clean.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71434
When parsing the code with OpenMP and the function's body must be
skipped, need to skip also OpenMP annotation tokens. Otherwise the
counters for braces/parens are unbalanced and parsing fails.
Summary:
When running the tests on a Ubuntu 18.04 machine this test is crashing for
me inside the runtime linker. My guess is that it is trying to save more
registers (possibly large vector ones) and the current stack space is not
sufficient.
Reviewers: samsonov, kcc, eugenis
Reviewed By: eugenis
Subscribers: eugenis, merge_guards_bot, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D71461
Summary:
To find potential opportunities to use getMemBasePlusOffset() I looked at
all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr
in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert
the files in the individual backends too.
The motivation for this change is our out-of-tree CHERI backend
(https://github.com/CTSRD-CHERI/llvm-project). We use a separate register
type to store pointers (128-bit capabilities, which are effectively
unforgeable and monotonic fat pointers). These capabilities permit a
reduced set of operations and therefore use a separate ValueType (iFATPTR).
to represent pointers implemented as capabilities.
Therefore, we need to avoid using ISD::ADD for our patterns that operate
on pointers and need to use a function that chooses ISD::ADD or a new
ISD::PTRADD opcode depending on the value type.
We originally added a new DAG.getPointerAdd() function, but after this
patch series we can modify the implementation of getMemBasePlusOffset()
instead. Avoiding direct uses of ISD::ADD for pointer types will
significantly reduce the amount of assertion/instruction selection
failures for us in future upstream merges.
Reviewers: spatel
Reviewed By: spatel
Subscribers: merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71207
Summary:
This change is preparatory work to use this helper functions in more places.
In order to make this change, getMemBasePlusOffset() has been extended to
also take a SDNodeFlags parameter.
The motivation for this change is our out-of-tree CHERI backend
(https://github.com/CTSRD-CHERI/llvm-project). We use a separate register
type to store pointers (128-bit capabilities, which are effectively
unforgeable and monotonic fat pointers). These capabilities permit a
reduced set of operations and therefore use a separate ValueType (iFATPTR).
to represent pointers implemented as capabilities.
Therefore, we need to avoid using ISD::ADD for our patterns that operate
on pointers and need to use a function that chooses ISD::ADD or a new
ISD::PTRADD opcode depending on the value type.
We originally added a new DAG.getPointerAdd() function, but after this
patch series we can modify the implementation of getMemBasePlusOffset()
instead. Avoiding direct uses of ISD::ADD for pointer types will
significantly reduce the amount of assertion/instruction selection
failures for us in future upstream merges.
Reviewers: spatel
Reviewed By: spatel
Subscribers: merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71206
Summary:
This change is preparatory work to use this helper functions in more places.
Currently the function only allows integer constants offsets, but there
are cases where we can use an existing SDValue parameter.
The motivation for this change is our out-of-tree CHERI backend
(https://github.com/CTSRD-CHERI/llvm-project). We use a separate register
type to store pointers (128-bit capabilities, which are effectively
unforgeable and monotonic fat pointers). These capabilities permit a
reduced set of operations and therefore use a separate ValueType (iFATPTR).
to represent pointers implemented as capabilities.
Therefore, we need to avoid using ISD::ADD for our patterns that operate
on pointers and need to use a function that chooses ISD::ADD or a new
ISD::PTRADD opcode depending on the value type.
We originally added a new DAG.getPointerAdd() function, but after this
patch series we can modify the implementation of getMemBasePlusOffset()
instead. Avoiding direct uses of ISD::ADD for pointer types will
significantly reduce the amount of assertion/instruction selection
failures for us in future upstream merges.
Reviewers: spatel, craig.topper
Reviewed By: spatel, craig.topper
Subscribers: craig.topper, merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71205
Summary:
This change is preparatory work to use this helper functions in more places.
Currently the function only allows positive offsets, but there are cases
where we want to subtract an offset from an existing pointer.
The motivation for this change is our out-of-tree CHERI backend
(https://github.com/CTSRD-CHERI/llvm-project). We use a separate register
type to store pointers (128-bit capabilities, which are effectively
unforgeable and monotonic fat pointers). These capabilities permit a
reduced set of operations and therefore use a separate ValueType (iFATPTR).
to represent pointers implemented as capabilities.
Therefore, we need to avoid using ISD::ADD for our patterns that operate
on pointers and need to use a function that chooses ISD::ADD or a new
ISD::PTRADD opcode depending on the value type.
We originally added a new DAG.getPointerAdd() function, but after this
patch series we can modify the implementation of getMemBasePlusOffset()
instead. Avoiding direct uses of ISD::ADD for pointer types will
significantly reduce the amount of assertion/instruction selection
failures for us in future upstream merges.
Reviewers: spatel
Reviewed By: spatel
Subscribers: merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71204
Copy the block to the heap before passing it to the callee in case the
block escapes in the callee.
rdar://problem/55683462
Differential Revision: https://reviews.llvm.org/D71431
Summary:
Right now, NSException::GetSummary() has the following output:
"name: $exception_name - reason: $exception_reason"
It would be better to simplify the output by removing the name and only
showing the exception's reason. This way, annotations would look nicer in
the editor, and would be a shorter summary in the Variables Inspector.
Accessing the exception's name can still be done by expanding the
NSException object in the Variables Inspector.
rdar://54770115
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71311
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
In looking into some other code, I came across this issue where a
float converted to a gcc integer vector via a splat causes it to miss
the float-to-integral cast, which causes some REALLY strange codegen
bugs.
The AST looked like:
`-ImplicitCastExpr <col:13>
'gcc_int_2':'__attribute__((__vector_size__(2 * sizeof(int)))) int' <VectorSplat>
`-ImplicitCastExpr <col:13> 'float' <LValueToRValue>
`-DeclRefExpr <col:13> 'float' lvalue ParmVar
0x556f16a5dc90 'f' 'float'
Despite the type of the VectorSplat cast as printed, it ended up
becoming a vector of float, which caused non-matching instructions. For
example, IntVector + a float constant resulted in:
add <2 x i32> %8, <2 x float> <float 3.000000e+00, float 3.000000e+00>
This patch corrects the conversion so that the float is first converted
to an integral, THEN splatted.
Summary:
This copy ensures that debug location information is kept for
compressed instructions. There are places where both compressInstruction and
uncompressInstruction are called that were not doing this copy, discarding some
debug info.
This change merely moves the copy into the generated file, so you cannot forget
to copy the location over when compressing or uncompressing.
Reviewers: asb, luismarques
Reviewed By: luismarques
Subscribers: sameer.abuasal, aprantl, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67493
This reverts commit 0be81968a2.
The VFDatabase needs some rework to be able to handle vectorization
and subsequent scalarization of intrinsics in out-of-tree versions of
the compiler. For more details, see the discussion in
https://reviews.llvm.org/D67572.
The initial attempt (rG89633320) botched the logic by reversing
the source/dest types. Added x86 tests for additional coverage.
The vector tests show a potential improvement (fold vector load
instead of broadcasting), but that's a known/existing problem.
This fold is done in IR by instcombine, and we have a special
form of it already here in DAGCombiner, but we want the more
general transform too:
https://rise4fun.com/Alive/3jZm
Name: general
Pre: (C1 + zext(C2) < 64)
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%a = and i64 %s2, zext((1 << (16 - C2)) - 1)
%r = trunc %a to i16
Name: special
Pre: C1 == 48
%s = lshr i64 %x, C1
%t = trunc i64 %s to i16
%r = lshr i16 %t, C2
=>
%s2 = lshr i64 %x, C1 + zext(C2)
%r = trunc %s2 to i16
...because D58017 exposes a regression without this fold.
The big switch in `ARMBaseInstrInfo::getNumMicroOps` is missing cases for
`VLLDM` and `VLSTM`, which are currently defined with itineraries having a
dynamic count of micro-ops.
Assuming an optimistic case in which these instruction do not actually perform
loads or stores, and with the idea that Armv8-m cores are supposed to use the
new style scheduling models, this patch just sets the itinerary for those two
instructions to `NoItinerary`.
Differential Revision: https://reviews.llvm.org/D71266
Summary:
[libomptarget] Build most of common/src for amdgcn
Excluding parallel.cu, which uses an integer min() from cuda,
Excluding support.cu, which calls malloc that is not yet available for amdgcn
Reviewers: jdoerfert, ABataev, grokos
Reviewed By: jdoerfert
Subscribers: gregrodgers, ronlieb, jvesely, mgorny, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D71446
Turns out that gtest in LLVM is only 1.8.0 (the newest version 1.10.0)
supports the GTEST_SKIP() macro, and apparently I didn't build w/o
GWP-ASan.
Should fix the GN bot, as well as any bots that may spuriously break on
platforms where the code wasn't correctly ifdef'd out as well.
This reverts commit 2bbd32f5e8, it was
causing UBSan failures like the following:
lld/ELF/Target.cpp:103:41: runtime error: applying non-zero offset 24 to null pointer
Fix PR44284. This is probably not valid assembly but we should not crash.
Reviewed By: luporl, #powerpc, steven.zhang
Differential Revision: https://reviews.llvm.org/D71443
When a common symbol is merged with a shared symbol, increase st_size if
the shared symbol has a larger st_size. At runtime, the executable's
symbol overrides the shared symbol. The shared symbol may be created
from common symbols in a previous link. This rule makes sure we pick
the largest size among all common symbols.
This behavior matches GNU ld. See
https://sourceware.org/bugzilla/show_bug.cgi?id=25236 for discussions.
A shared symbol does not hold alignment constraints. Ignore the
alignment update.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D71161
Summary:
Adds GWP-ASan to Scudo standalone. Default parameters are pulled across from the
GWP-ASan build. No backtrace support as of yet.
Reviewers: cryptoad, eugenis, pcc
Reviewed By: cryptoad
Subscribers: merge_guards_bot, mgorny, #sanitizers, llvm-commits, cferris, vlad.tsyrklevich, pcc
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D71229