This is an alternate fix for the bug discussed in D70595.
This also includes minimal tests for other in-tree targets to show the problem more
generally.
We check the number of uses as a predicate for whether some value is free to negate,
but that use count can change as we rewrite the expression in getNegatedExpression().
So something that was marked free to negate during the cost evaluation phase becomes
not free to negate during the rewrite phase (or the inverse - something that was not
free becomes free). This can lead to a crash/assert because we expect that everything
in an expression that is negatible to be handled in the corresponding code within
getNegatedExpression().
This patch adds a hack to work-around the case where we probably no longer detect
that either multiply operand of an FMA isNegatibleForFree which is assumed to be
true when we started rewriting the expression.
Differential Revision: https://reviews.llvm.org/D70975
This is an alternate fix for the bug discussed in D70595.
This also includes minimal tests for other in-tree targets to show the problem more
generally.
We check the number of uses as a predicate for whether some value is free to negate,
but that use count can change as we rewrite the expression in getNegatedExpression().
So something that was marked free to negate during the cost evaluation phase becomes
not free to negate during the rewrite phase (or the inverse - something that was not
free becomes free). This can lead to a crash/assert because we expect that everything
in an expression that is negatible to be handled in the corresponding code within
getNegatedExpression().
This patch adds a hack to work-around the case where we probably no longer detect
that either multiply operand of an FMA isNegatibleForFree which is assumed to be
true when we started rewriting the expression.
Differential Revision: https://reviews.llvm.org/D70975
expected failed test (RV32IF-ILP32F) will be fixed in a subsequent patch.
Reviewers: efriedma, lenary, asb
Reviewed By: efriedma, lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70116
As of b1d8576 there is middle-end support for STRICT_[SU]INT_TO_FP,
so this patch adds SystemZ back-end support as well.
The patch is SystemZ target specific except for adding SD patterns
strict_[su]int_to_fp and any_[su]int_to_fp to TargetSelectionDAG.td
as usual.
Summary:
When we build the walk across these DAG's we need to be able to reach every node
from the roots. Flip and traversal edges (so that use->def becomes def->uses)
that make nodes unreachable. Note that early on we'll just error out on these
flipped edges as def->uses edges are more complicated to match due to their
one->many nature.
Depends on D69077
Reviewers: volkan, bogner
Subscribers: llvm-commits
Summary:
Add trimming of unused components of s_buffer_load.
Extend trimming of *buffer_load to also include
unused components at the beginning of vectors and update offset.
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70315
Summary:
This commit builds upon Derek Schuff's 2014 commit for attaching labels to
existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 )
When temporary labels appear ahead of a fragment, MCObjectStreamer will
track the temporary label symbol in a "Pending Labels" list. Labels are
associated with fragments when a real fragment arrives; otherwise, an empty
data fragment will be created if the streamer's section changes or if the
stream finishes.
This commit moves the "Pending Labels" list into each MCStream, so that
this label-fragment matching process is resilient to section changes. If
the streamer emits a label in a new section, switches to another section to
do other work, then switches back to the first section and emits a
fragment, that initial label will be associated with this new fragment.
Labels will only receive empty data fragments in the case where no other
fragment exists for that section.
The downstream effects of this can be seen in Mach-O relocations. The
previous approach could produce local section relocations and external
symbol relocations for the same data in an object file, and this mix of
relocation types resulted in problems in the ld64 Mach-O linker. This
commit ensures relocations triggered by temporary labels are consistent.
Reviewers: pete, ab, dschuff
Reviewed By: pete, dschuff
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71368
Summary:
The MatchDag structure is a representation of the checks that need to be
performed and the dependencies that limit when they can happen.
There are two kinds of node in the MatchDag:
* Instrs - Represent a MachineInstr
* Predicates - Represent a check that needs to be performed (i.e. opcode, is register, same machine operand, etc.)
and two kinds of edges:
* (Traversal) Edges - Represent a register that can be traversed to find one instr from another
* Predicate Dependency Edges - Indicate that a predicate requires a piece of information to be tested.
For example, the matcher:
(match (MOV $t, $s),
(MOV $d, $t))
with MOV declared as an instruction of the form:
%dst = MOV %src1
becomes the following MatchDag with the following instruction nodes:
__anon0_0 // $t=getOperand(0), $s=getOperand(1)
__anon0_1 // $d=getOperand(0), $t=getOperand(1)
traversal edges:
__anon0_1[src1] --[t]--> __anon0_0[dst]
predicate nodes:
<<$mi.getOpcode() == MOV>>:$__anonpred0_2
<<$mi.getOpcode() == MOV>>:$__anonpred0_3
and predicate dependencies:
__anon0_0 ==> __anonpred0_2[mi]
__anon0_0 ==> __anonpred0_3[mi]
The result of this parse is currently unused but can be tested
using -gicombiner-stop-after-parse as done in parse-match-pattern.td. The
dump for testing includes a graphviz format dump to allow the rule to be
viewed visually.
Later on, these MatchDag's will be used to generate code and to build an
efficient decision tree.
Reviewers: volkan, bogner
Reviewed By: volkan
Subscribers: arsenm, mgorny, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69077
of integers to floating point.
This includes some of Craig Topper's changes for promotion support from
D71130.
Differential Revision: https://reviews.llvm.org/D69275
Commit 84a9756 added an extra blank line at the end of any line table.
However, a blank line is also printed after the line table header, which
meant that two blank lines in a row were being printed after a header,
if there were no rows. This patch defers the post-header blank line
printing until it has been determined that there are rows to print.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D71540
This fixes an assertion failure that triggers inside
getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr
that is not a memory operation.
Different backends implement getMemOperandWithOffset differently: some
return false on non-memory MachineInstrs, others assert.
The Machine Sinking pass in at least SinkingPreventsImplicitNullCheck
relies on getMemOperandWithOffset to return false on non-memory
MachineInstrs, instead of asserting.
This patch updates the documentation on getMemOperandWithOffset that it
should return false on any MachineInstr it cannot handle, instead of
asserting. It also adapts the in-tree backends accordingly where
necessary.
Differential Revision: https://reviews.llvm.org/D71359
Summary:
With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information
as they are both now children of the of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children).
This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would
need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST
depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't
seem right) or we retroactively add the missing attribute to the imported AST in our expressions.
A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way
LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it.
Reviewers: aprantl, SouraVX
Reviewed By: aprantl
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D71201
This patch makes it so that cases where multiple instructions that
differ only in their ConstantInt or ConstantFP MachineOperand values no
longer collide. For instance:
%0:_(s1) = G_CONSTANT i1 true
%1:_(s1) = G_CONSTANT i1 false
%2:_(s32) = G_FCONSTANT float 1.0
%3:_(s32) = G_FCONSTANT float 0.0
Prior to this patch the first two instructions would collide together.
Also, the last two G_FCONSTANT instructions would also collide. Now they
will no longer collide.
Differential Revision: https://reviews.llvm.org/D71558
Currently -fuse-init-array option is not effective when target triple
does not specify os, on x86,x86_64.
i.e.
// -fuse-init-array is not honored.
$ clang -target i386 -fuse-init-array test.c -S
// -fuse-init-array is honored.
$ clang -target i386-linux -fuse-init-array test.c -S
This patch fixes first case.
And does cleanup.
Reviewers: rnk, craig.topper, fhahn, echristo
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71360
Summary:
This patch restricts loop fusion to only consider rotated loops as valid candidates.
This simplifies the analysis and transformation and aligns with other loop optimizations.
Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel
Reviewed By: Meinersbur
Subscribers: ormris, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71025
Summary:
The instructions were originally implemented via builtins and
intrinsics so users would have to explicitly opt-in to using
them. This was useful while were validating whether these instructions
should have been merged into the spec proposal. Now that they have
been, we can use normal codegen patterns, so the intrinsics and
builtins are no longer useful.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71500
The %update_cc_test_checks substitution only gets added if python3
is on path, so the test fails if it isn't. Don't run the test
when it would fail.
Also include the '%' in the arg to add_update_script_substition(),
to help greppability.
Now that the machine verifier will check for cases of register/immediate
MachineOperands and their correspondence to the MC instruction descriptor,
this patch adds the operand types to the descriptors where they were
previously missing. All MCOI::OPERAND_UNKNOWN operand types have been handled
to get a known type, except for G_... (global isel) instructions.
Review: Ulrich Weigand
https://reviews.llvm.org/D71494
Summary:
llvm-cxxfilt wasn't correctly demangle COFF import thunk in those two
cases before:
* demangle in split mode (multiple words from commandline)
* the import thunk prefix was added no matter the later part of the
string can be demangled or not
Now llvm-cxxfilt should handle both case correctly.
Reviewers: compnerd, erik.pilkington, jhenderson
Reviewed By: jhenderson
Subscribers: jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71425
If a function requires optnone to trigger a crash, it must also have noline,
otherwise it will fail a verifier check.
Differential revision: https://reviews.llvm.org/D69522
Summary:
Reland after fixing a bug that allowed outlining of SP modifying instructions
that invalidated return address signing.
During AArch64 frame lowering instructions to enable return address
signing are inserted into functions if needed. Functions generated during
machine outlining don't run through target frame lowering and hence are
missing such instructions.
This patch introduces the following changes:
1. If not all functions that potentially participate in function outlining agree
on their return address signing scope and their return address signing key,
outlining is disabled for these functions.
2. If not all functions that potentially participate in function outlining agree
on their support for v8.3A features, outlining is disabled for these
functions.
3. If an outlining candidate would outline instructions that modify sp in a way
that invalidates return address signing, outlining is disabled for that
particular candidate.
4. If all candidate functions agree on the signing scope, signing key and their
support for v8.3 features, the outlined function behaves as if it had the
same scope and key attributes and as if it would provide the same v8.3A
support as the original functions.
Reviewers: ostannard, paquette
Reviewed By: ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70635
The emission of stack maps in AArch64 binaries has been disabled for all
binary formats except Mach-O since rL206610, probably mistakenly, as far
as I can tell. This patch reverts this to its intended state.
Differential Revision: https://reviews.llvm.org/D70069
Patch by Loic Ottet.
Summary:
This commit adds basic tests for these update script to validate that
they still work as expected. In the future we could extend these tests
whenever new features are added to avoid introducing regressions.
Reviewers: xbolva00, MaskRay, jdoerfert
Reviewed By: jdoerfert
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70660
Summary:
In commit d60f34c20a (llvm-svn 317128,
PR35113) MergeBlockIntoPredecessor was changed into
discarding some dbg.value intrinsics referring to
PHI values, post-splice due to loop rotation.
That elimination of dbg.value intrinsics did not
consider which dbg.value to keep depending on the
context (e.g. if the variable is changing its value
several times inside the basic block).
In the past that hasn't been such a big problem since
CodeGenPrepare::placeDbgValues has moved the dbg.value
to be next to the PHI node anyway. But after commit
00e238896c CodeGenPrepare isn't doing that
any longer, so we need to be more careful when avoiding
duplicate dbg.value intrinsics in MergeBlockIntoPredecessor.
This patch replaces the code that tried to avoid duplicate
dbg.values by using the RemoveRedundantDbgInstrs helper.
Reviewers: aprantl, jmorse, vsk
Reviewed By: aprantl, vsk
Subscribers: jholewinski, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71480
Summary:
In commit d60f34c20a (llvm-svn 317128,
PR35113) MergeBlockIntoPredecessor was changed into
discarding some dbg.value intrinsics referring to
PHI values, post-splice due to loop rotation.
That elimination of dbg.value intrinsics does not
consider which dbg.value to keep based on the context.
Such as always keeping the one that comes first textually,
or the need to keep several of them in case the variable
is changing it's value several times inside the basic block.
In the past that hasn't been such a big problem since
CodeGenPrepare::placeDbgValues has moved the dbg.value
to be next to the PHI node anyway. But after commit
00e238896c CodeGenPrepare isn't doing that
any longer, so we need to be more careful when avoiding
duplicate dbg.value intrinsics in MergeBlockIntoPredecessor.
This patch is just a pre commit of the test case.
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71479
Summary:
Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the
goal to remove redundant dbg intrinsics from a basic block.
This can be useful after various transforms, as it might
be simpler to do a filtering of dbg intrinsics after the
transform than during the transform.
One primary use case would be to replace a too aggressive
removal done by MergeBlockIntoPredecessor, seen at loop
rotate (not done in this patch).
The elimination algorithm currently focuses on dbg.value
intrinsics and is doing two iterations over the BB.
First we iterate backward starting at the last instruction
in the BB. Whenever a consecutive sequence of dbg.value
instructions are found we keep the last dbg.value for
each variable found (variable fragments are identified
using the {DILocalVariable, FragmentInfo, inlinedAt}
triple as given by the DebugVariable helper class).
Next we iterate forward starting at the first instruction
in the BB. Whenever we find a dbg.value describing a
DebugVariable (identified by {DILocalVariable, inlinedAt})
we save the {DIValue, DIExpression} that describes that
variables value. But if the variable already was mapped
to the same {DIValue, DIExpression} pair we instead drop
the second dbg.value.
To ease the process of making lit tests for this utility a
new pass is introduced called RedundantDbgInstElimination.
It can be executed by opt using -redundant-dbg-inst-elim.
Reviewers: aprantl, jmorse, vsk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71478
Summary:
At present, the code calculating known bits of AMDGPU MUL_I24 confuses the concepts of "non-negative number" and "positive number".
In some situations, it results in incorrect code. I have a case where the optimizer replaces the result of calculating MUL_I24(-5, 0) with -8.
Reviewers: foad, arsenm
Reviewed By: arsenm
Subscribers: foad, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Patch by Eugene Kuznetsov.
Differential Revision: https://reviews.llvm.org/D70367
Summary:
Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628
If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`.
The subsequent code assumes that they differ, so we exit out early.
Reviewers: ributzka, spatel
Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66657
.text sh_address=0x1000 sh_offset=0x1000
.data sh_address=0x3000 sh_offset=0x2000
In an objcopy -O binary output, the distance between two sections equal
their LMA differences (0x3000-0x1000), instead of their sh_offset
differences (0x2000-0x1000). This patch changes our behavior to match
GNU.
This rule gets more complex when the containing PT_LOAD has
p_vaddr!=p_paddr. GNU objcopy essentially computes
sh_offset-p_offset+p_paddr for each candidate section, and removes the
gap before the first address.
Added tests to binary-paddr.test to catch the compatibility problem.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D71035
Summary:
r372285 changed LLVM to use a `TargetConstant` for parameters of intrinsics that are required to be immediates.
Since that commit, use of `%llvm.ppc.altivec.vc{fsx,fux,tsxs,tuxs}` intrinsics has not worked, and resulted in a `LLVM ERROR: Cannot select: intrinsic %llvm.ppc.altivec.vc*` error. The intrinsics' TableGen definitions matched on `imm` instead of `timm`.
This commit updates those definitions to use `timm`.
Fixes: https://llvm.org/PR44239
Reviewers: hfinkel, nemanjai, #powerpc, Jim
Reviewed By: Jim
Subscribers: qiucf, wuzish, Jim, hiraditya, kbarton, jsji, shchenz, llvm-commits
Tags: #llvm
Patched by vddvss (Colin Samples).
Differential Revision: https://reviews.llvm.org/D71138
shuf (inselt ?, C, IndexC), undef, <IndexC, IndexC...> --> <C, C...>
This is another missing shuffle fold pattern uncovered by the
shuffle correctness fix from D70246.
The problem was visible in the post-commit thread example, but
we managed to overcome the limitation for that particular case
with D71220.
This is something like the inverse of the previous fix - there
we didn't demand the inserted scalar, and here we are only
demanding an inserted scalar.
Differential Revision: https://reviews.llvm.org/D71488
The commit r369122 may keep LR and FP register (aka. frame record) in
the middle of a frame, thus we must add the offsets to ensure the FP
register always points to innermost frame record on the stack.
According to AAPCS64[1], a conforming code shall construct a linked list
of stack frames that can be traversed with frame records. This commit
is also essential to frame-pointer-based stack unwinder (e.g. the stack
unwinder in linx-perf-tools.)
[1] https://github.com/ARM-software/software-standards/blob/master/abi/aapcs64/aapcs64.rst#the-frame-pointer
Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64/framelayout-frame-record.ll
Test: llvm-lit ${LLVM_SRC}/test/CodeGen/AArch64
Differential Revision: https://reviews.llvm.org/D70800