Summary:
If a wrapper around one of the mem* stdlib functions bitcasts the returned
pointer value before returning it (e.g. to a wchar_t*), LLVM does not emit a
tail call.
Add a check for this scenario so that we emit a tail call.
Reviewers: wmi, mkuper, ramred01, dmgreen
Reviewed By: wmi, dmgreen
Subscribers: hiraditya, sanwou01, javed.absar, lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59078
For AMDGPU this depends on whether denormals are enabled in the
default FP mode for the function. Currently this is treated as a
subtarget feature, so FMAD is selectively legal based on that. I want
to move this out of the subtarget features so this can be controlled
with a denormal mode attribute. Additionally, this will allow folding
based on a future ftz fast math flag.
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D69622
This reverts commit f5e1b718a6.
PR43855 reports a performance regression with commit ee50590e. This commit
depends on the faulty one, so has to come out too.
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.
Differential Revision: https://reviews.llvm.org/D67216
Teach the combiner helper how to replace shuffle_vector of scalars
into build_vector.
I am not particularly happy about having to add this combine, but we
currently get those from <1 x iN> from the IR.
Bonus: This fixes an assert in the shuffle_vector combines since before
this patch, we were expecting vector types.
From SelectionDAGs point of view, debug variable locations specified with
dbg.declare and dbg.addr are indirect -- they specify the address of
something. But calling conventions might mean that a Value is placed on
the stack somewhere, and this too is indirection. Previously this was
mixed up in the "IsIndirect" field of DBG_VALUE insts; this patch
separates them by encoding the indirection in a DIExpression.
If we have a dbg.declare or dbg.addr, then the expression produces an
address that then becomes a DWARF memory location. We can represent
this by putting a DW_OP_deref on the _end_ of the expression. If a Value
has been placed on the stack, then we need to put a DW_OP_deref on the
_start_ of the expression, to load the Value from the stack and have
the rest of the expression operate on it.
Differential Revision: https://reviews.llvm.org/D69028
Summary:
This is used on AMDGPU for rounding from v3f64 (which is illegal) to
v3f32 (which is legal).
Subscribers: jvesely, nhaehnle, tpr, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69339
This is a follow-up to D67448.
Split live intervals with multiple dead defs during the initial
execution of the live interval analysis, but do it outside of the
function createAndComputeVirtRegInterval.
Differential Revision: https://reviews.llvm.org/D68666
Extend the describeLoadedValue() with support for target specific ARM and
AArch64 instructions interpretation. The patch provides specialization for
ADD and SUB operations that include a register and an immediate/offset
operand. Some of the instructions can operate with global string addresses
or constant pool indexes but such cases are omitted since we currently lack
flexible support for processing such operands at DWARF production stage.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D67556
Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp
expansions but do not change any default policy for now.
This also fixes a bug in the memcmp expansion itself when large
displacements are needed.
https://reviews.llvm.org/D69507
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.
Patch by Sourabh Singh Tomar!
Differential Revision: https://reviews.llvm.org/D69215
Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/a atomic marker on the MMO. The later parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other.
This has been fairly heavily fuzzed, and I examined diffs in a reasonable large corpus of assembly by hand, so I'm reasonable sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization.
Differential Revision: https://reviews.llvm.org/D69219
llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with
EXPENSIVE_CHECKS enabled, causing the patch to be reverted in
rG2c496bb5309c972d59b11f05aee4782ddc087e71.
This patch relands the patch with a proper fix to the
live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the
callee-saves correctly so that the CalleeSaved info is taken from MIR
directly, rather than letting it be recalculated by the PEI pass. I've
done this by running `llc -stop-before=prologepilog` on the LLVM
IR as captured in the test files, adding the extra MOV instructions
that were manually added in the original test file, then running `llc
-run-pass=prologepilog` and finally re-added the comments for the MOV
instructions.
Use the existing helper function in BranchFolding, "countsAsInstruction",
to skip over non-instructions. Otherwise debug instructions can be
identified as the last real instruction in a block, leading to different
codegen decisions when debug is enabled as demonstrated by the test case.
Patch by: yechunliang (Chris Ye)!
Differential Revision: https://reviews.llvm.org/D66467
Summary:
Fixes some things from original commit at https://reviews.llvm.org/D69136. The main
change is that the heap alloc marker is always stored as ExtraInfo in the machine
instruction instead of in the PointerSumType because it cannot hold more than
4 pointer types.
Add instruction marker to MachineInstr ExtraInfo. This does almost the
same thing as Pre/PostInstrSymbols, except that it doesn't create a label until
printing instructions. This allows for labels to be put around instructions that
are deleted/duplicated somewhere.
Use this marker to track heap alloc site call instructions.
Reviewers: rnk
Subscribers: MatzeB, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69536
Summary:
(Split of off D67120)
SizeOpts/MachineSizeOpts changes for profile guided size optimization.
(A second try after previously committed as r375254 and reverted as r375375.)
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69409
Emit a remarks section by default for the following formats:
* bitstream
* yaml-strtab
while still providing -remarks-section=<bool> to override the defaults.
I want to add the ability to rerun the outliner in certain cases, and I
thought this could be an NFC change that could make a subsequent change
that allows for rerunning the outliner a cleaner diff.
Differential Revision: https://reviews.llvm.org/D69482
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
In the Pre-RA machine sinker, previously we were relying on all DBG_VALUEs
being immediately after the instruction that defined their operands. This
isn't a valid assumption, as a variable location change doesn't
necessarily correspond to where the value is computed. In this patch, we
collect DBG_VALUEs that might need sinking as we walk through a block,
and sink all of them if their defining instruction is sunk.
This patch adds some copy propagation too, so that if we sink a copy inst,
the now non-dominated paths can use the copy source for the variable
location.
Differential Revision: https://reviews.llvm.org/D58386
This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53)
to handle the looser "any_extend" cast in addition to zext.
This is a prerequisite step for canonicalizing in the other direction
(narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
When we sink DBG_VALUEs between blocks, we simply move the DBG_VALUE
instruction to below the sunk instruction. However, we should also mark
the variable as being undef at the original location, to terminate any
earlier variable location. This patch does that -- plus, if the
instruction being sunk is a copy, it attempts to propagate the copy
through the DBG_VALUE, replacing the destination with the source.
Differential Revision: https://reviews.llvm.org/D58238
We would previously have no soft-float softening for cbrt, so could hit
a crash failing to select. This fills in what appears to be missing.
Differential Revision: https://reviews.llvm.org/D69345
Similar to:
rG4c47617627fb
This makes the DAG behavior consistent with IR's insertelement.
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for AArch64 and WebAssembly
by replacing undef index operands with something else.
If the target's preferred shift amount VT can't hold any shift
amount for the promoted VT, we should use i32. The specific shift
amount shouldn't matter. The type will be adjusted later when the
shift itself is type legalized. This avoids an assert in getNode.
Fixes PR43820.
This combine is only valid if the inner setcc produces a 0/1 result
or the inner type is MVT::i1.
I haven't seen this cause any issues, just happened to notice it
while reviewing combines in this function.
While there also fix another call to use the value type from the
SDValue for the operand instead of calling SDNode::getValueType(0).
Though its likely the use is result 0, its not guaranteed.
This makes the DAG behavior consistent with IR's extractelement after:
rGb32e4664a715
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for WebAssembly.
The AMDGPU test is trying to test for crashing or other bad behavior,
but I'm not sure if that's possible after this change.
zext (ctpop X) --> ctpop (zext X)
This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that.
The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this:
t5: i8 = ctpop t4
t6: i32 = zero_extend t5 <-- created based on IR, but unused node?
t7: i64 = zero_extend t5
Differential Revision: https://reviews.llvm.org/D69127
Summary:
Add instruction marker to MachineInstr ExtraInfo. This does almost the
same thing as Pre/PostInstrSymbols, except that it doesn't create a label until
printing instructions. This allows for labels to be put around instructions that
are deleted/duplicated somewhere.
Also undo the workaround in r375137.
Reviewers: rnk
Subscribers: MatzeB, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69136
Summary:
Ternary expression checks for ISD::ADD instead of ISD::UADDO inside DAGTypeLegalizer::ExpandIntRes_UADDSUBO.
This means the ternary expression will evaluate to ISD::SUBCARRY for both ISD::UADDO and ISD::USUBO nodes.
Targets are likely to implement both, so impact will be very limited in practice.
Reviewers: bogner, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68123