Summary: This change fixes PR35342 by replacing only the current use with undef in unreachable blocks.
Reviewers: efriedma, mcrosier, igor-laevsky
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40184
llvm-svn: 318551
making it no longer even remotely simple.
The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.
The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop to
outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
unswitched branches as well as avoiding excessive cloning and
reliance on complex parts of simplify-cfg to clean up the CFG.
3) Update the analyses (where we can) rather than just blowing them away
or relying on something else updating them.
Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.
This approach also adds an important principle specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentially partition loops will work regardless of the total loop
size.
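As a rough illustration of the cost idea above (this is not the pass's code; `BlockInfo`, `Reach`, `duplicateCost`, and `rawSize` are hypothetical names, and the classification of blocks is assumed to be given), counting only the blocks that both loop copies would need looks something like this:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of a loop block relative to the branch being
// unswitched. These names are illustrative only.
enum class Reach { TrueSideOnly, FalseSideOnly, BothSides };

struct BlockInfo {
  std::size_t Size; // e.g. an instruction count
  Reach Side;
};

// Cost counted as duplicated code: only blocks needed on both sides of the
// unswitched condition end up present in both loop copies.
std::size_t duplicateCost(const std::vector<BlockInfo> &Blocks) {
  std::size_t Cost = 0;
  for (const BlockInfo &B : Blocks)
    if (B.Side == Reach::BothSides)
      Cost += B.Size;
  return Cost;
}

// Raw loop size, for comparison.
std::size_t rawSize(const std::vector<BlockInfo> &Blocks) {
  std::size_t Total = 0;
  for (const BlockInfo &B : Blocks)
    Total += B.Size;
  return Total;
}
```

For a loop that the condition almost cleanly partitions, the duplicate cost stays small even when the raw size is large.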
Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.
Some high-level, interesting limitations that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
deleted. In fact, we should be able to delete *nothing* and do
a minimal number of clones.
- There are many more interesting selection criteria for which branch to
unswitch that we might want to look at. One that I'm particularly
interested in is a set of conditions which all exit the loop and which
can be merged into a single unswitched test of them.
Differential revision: https://reviews.llvm.org/D34200
llvm-svn: 318549
The assertion was introduced in r317853 but there are cases when a call
isn't handled either as direct or indirect. In this case we add a
reference graph edge but not a call graph edge.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: mehdi_amini, inglorion, eraman, hiraditya, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D40056
llvm-svn: 318540
Fix the test, which assumed that cache pruning is always performed by
default. Explicitly set the prune interval to 0s to ensure pruning is
always performed.
llvm-svn: 318520
Enabling and using DWARF exceptions seems like an easier path
to take than making the COFF/ARM backend output EHABI directives.
Previously, no EH model was enabled at all on this target.
There's no point in setting UseIntegratedAssembler to false since
GNU binutils doesn't support Windows on ARM, and since we don't
need to support external assembler, we don't need to use register
numbers in cfi directives.
Differential Revision: https://reviews.llvm.org/D39532
llvm-svn: 318510
The logic that replaces a pair of `RANGE_CHECK_LOWER + RANGE_CHECK_UPPER` checks
with `RANGE_CHECK_BOTH` in fact duplicates the range intersection logic that
runs when we calculate the safe iteration space. Effectively, the intersection of
these ranges doesn't differ from the range of the merged range check.
We chose to remove the duplicated logic in favor of code simplicity.
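A small sketch of why the merge is redundant, using a hypothetical half-open `Range` type and helper names that are not taken from the pass: intersecting the range implied by a lower check with the range implied by an upper check gives the same range a combined check would.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical half-open range [Begin, End). Not the pass's own type.
struct Range {
  int64_t Begin;
  int64_t End;
};

Range intersect(Range A, Range B) {
  return {std::max(A.Begin, B.Begin), std::min(A.End, B.End)};
}

// Safe ranges implied by each kind of check on an index `i`,
// written as `Low <= i` and/or `i < High`.
Range lowerOnly(int64_t Low) {
  return {Low, std::numeric_limits<int64_t>::max()};
}
Range upperOnly(int64_t High) {
  return {std::numeric_limits<int64_t>::min(), High};
}
Range both(int64_t Low, int64_t High) { return {Low, High}; }
```

With these definitions, `intersect(lowerOnly(L), upperOnly(H))` and `both(L, H)` describe the same range, so a separate merging step adds nothing.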
Differential Revision: https://reviews.llvm.org/D39589
llvm-svn: 318508
We might have instructions such as ext(copy(trunc)), and while cleaning
up legalization artifacts, we can also dce the copies that are in
between legalization artifacts.
llvm-svn: 318501
Add additional RUN clauses to test for -asan-mapping-scale=5 in
selective tests, with special CHECK statements where needed.
Differential Revision: https://reviews.llvm.org/D39775
llvm-svn: 318493
iterator to walk the list which keeps changing inside the loop. When the
UseList contains several uses with the same user, we end up processing the same
user more than once, which leads to an assert.
With this fix, unique users are saved and processed later to avoid
processing duplicates.
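The general pattern can be sketched in plain C++ as follows. This is an illustration with standard containers, not the actual fix: `Use` and `collectUniqueUsers` are hypothetical stand-ins, and a use is modelled as a (user id, operand index) pair.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in: a use is a (user id, operand index) pair.
using Use = std::pair<int, int>;

// Snapshot the distinct users, in first-seen order, before doing any work
// that might modify the use list.
std::vector<int> collectUniqueUsers(const std::vector<Use> &Uses) {
  std::vector<int> Users;
  std::set<int> Seen;
  for (const Use &U : Uses)
    if (Seen.insert(U.first).second) // true only the first time we see a user
      Users.push_back(U.first);
  return Users;
}
```

Processing then walks the returned vector, so each user is visited once and changes to the original list cannot invalidate the walk.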
Differential Revision: https://reviews.llvm.org/D39864
llvm-svn: 318477
't' constraint normally only accepts f32 operands, but for VCVT the
operands can be i32. LLVM is overly restrictive and rejects asm like:
  float foo() {
    float result;
    __asm__ __volatile__(
      "vcvt.f32.s32 %[result], %[arg1]\n"
      : [result]"=t"(result)
      : [arg1]"t"(0x01020304) );
    return result;
  }
Relax the value type for 't' constraint to either f32 or i32.
Differential Revision: https://reviews.llvm.org/D40137
llvm-svn: 318472
Only do this pre-legalize in case we're using the sign extend to legalize for KNL.
This recovers all of the tests that changed when I stopped SelectionDAGBuilder from deleting sign extends.
There's more work that could be done here, particularly to fix the i8->i64 test case that experienced a split.
llvm-svn: 318468
Previously SelectionDAGBuilder would remove this sign extend leading to a failure during isel.
The codegen here isn't very nice as we ended up triggering a split.
llvm-svn: 318467
The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal.
It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64.
I'll work on trying to recover some of the test cases by removing sign extends in the backend when it's safe to do so with an understanding of the current legalizer capabilities.
This should fix PR30690.
llvm-svn: 318466
The wider element type will normally cause legalize to try to split and scalarize the gather/scatter, but we can't handle that. Instead, truncate the index early so the gather/scatter node is insulated from the legalization.
This really shouldn't happen in practice since InstCombine will normalize index types to the same size as pointers.
llvm-svn: 318452
This patch changes all i32 constants in store instructions to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constants.
Differential Revision: https://reviews.llvm.org/D39352
llvm-svn: 318436
Summary:
This change introduces a `DynamicSymbols` field to the ELF specific YAML
supported by `yaml2obj` and `obj2yaml`. This grouping of symbols provides a way
to represent ELF dynamic symbols. The `DynamicSymbols` structure is identical to
the existing `Symbols`.
Reviewers: compnerd, jakehehrlich, silvas
Reviewed By: silvas
Subscribers: silvas, jakehehrlich, llvm-commits
Differential Revision: https://reviews.llvm.org/D39582
llvm-svn: 318433
Summary:
This change fixes a bug where `obj2yaml` can in some cases produce YAML that
causes `yaml2obj` to error.
The ELF YAML document structure has a `Symbols` mapping, which contains three
mappings, all of which are optional: `Local`, `Global`, and `Weak`. Any one of
these can be missing, but if all three are missing, then `yaml2obj` errors. This
change allows YAML input for cases like this one.
I have tested this with check-llvm and check-lld, and all tests passed.
This change is the result of test failures while working on D39582, which
introduces a `DynamicSymbols` mapping, which will be empty at times.
Reviewers: compnerd, jakehehrlich, silvas, kledzik, mehdi_amini, pcc
Reviewed By: compnerd
Subscribers: silvas, llvm-commits
Differential Revision: https://reviews.llvm.org/D39908
llvm-svn: 318428
llvm.invariant.group.barrier may accept pointers to an arbitrary address space.
This patch lets it accept pointers to i8 in any address space and return a
pointer to i8 in the same address space.
Differential Revision: https://reviews.llvm.org/D39973
llvm-svn: 318413
// trunc (binop X, C) --> binop (trunc X, C')
// trunc (binop (ext X), Y) --> binop X, (trunc Y)
I'm grouping sub with the other binops because that makes the code simpler
and the transforms are valid:
https://rise4fun.com/Alive/UeF
...so even though we don't expect a sub with constant Op1 or any of the
other opcodes with constant Op0 due to canonicalization rules, we might as
well handle those situations if non-canonical code somehow reaches this
point (it should just make instcombine more efficient in reaching its
end goal).
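For intuition only, here is the underlying arithmetic fact for add and sub, modelled with fixed-width unsigned integers in C++ rather than IR (the function names are made up for this sketch): the low bits of the result depend only on the low bits of the operands, so narrowing before or after the operation gives the same value.

```cpp
#include <cstdint>

// Operate wide, then truncate to 8 bits.
uint8_t truncOfAdd(uint32_t X, uint32_t C) {
  return static_cast<uint8_t>(X + C);
}
uint8_t truncOfSub(uint32_t X, uint32_t C) {
  return static_cast<uint8_t>(X - C);
}

// Truncate both operands first, then operate in the narrow type.
uint8_t addOfTrunc(uint32_t X, uint32_t C) {
  return static_cast<uint8_t>(static_cast<uint8_t>(X) + static_cast<uint8_t>(C));
}
uint8_t subOfTrunc(uint32_t X, uint32_t C) {
  return static_cast<uint8_t>(static_cast<uint8_t>(X) - static_cast<uint8_t>(C));
}
```

The same reasoning holds whichever operand is the constant, which is why handling the non-canonical operand orders costs nothing extra.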
This should solve the problem that later manifests in the vectorizers in
PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295
llvm-svn: 318404
Fix a couple places where the minimum alignment/size should be a
function of the shadow granularity:
- alignment of AllGlobals
- the minimum left redzone size on the stack
Added a test to verify that the metadata_array is properly aligned
for shadow scale of 5, to be enabled when we add build support
for testing shadow scale of 5.
Differential Revision: https://reviews.llvm.org/D39470
llvm-svn: 318395
SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is
incorrect for triple amdgcn---amdgiz and causes isel failure.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40095
llvm-svn: 318392
Change the calculation for the desired ValueType for non-sign
extending loads, as in those cases we don't care about the
higher bits. This creates a smaller ExtVT and allows for such
combinations as:
(srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1])
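A quick way to convince yourself of this equivalence, assuming a little-endian layout (which is what the `addr + 1` offset suggests) and using made-up helper names, is to model the two loads byte by byte in C++:

```cpp
#include <cstdint>

// Zero-extending 16-bit little-endian load, assembled from bytes explicitly
// so the result does not depend on the host's endianness.
uint32_t zextLoad16LE(const uint8_t *Addr) {
  return static_cast<uint32_t>(Addr[0]) |
         (static_cast<uint32_t>(Addr[1]) << 8);
}

// Zero-extending 8-bit load.
uint32_t zextLoad8(const uint8_t *Addr) { return Addr[0]; }

// (srl (zextload i16, [addr]), 8)
uint32_t wideLoadThenShift(const uint8_t *Addr) {
  return zextLoad16LE(Addr) >> 8;
}

// (zextload i8, [addr + 1])
uint32_t narrowLoad(const uint8_t *Addr) { return zextLoad8(Addr + 1); }
```

Because the shift discards the low byte and zero extension fills everything above, only the byte at `addr + 1` contributes to the result.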
Differential Revision: https://reviews.llvm.org/D40034
llvm-svn: 318390
This patch provides a more accurate cost for interleaved load/store of stride 2 for the types int64/double on AVX2.
Reviewers: delena, RKSimon, craig.topper, dorit
Reviewed By: dorit
Differential Revision: https://reviews.llvm.org/D40008
llvm-svn: 318385
When expanding exit conditions for pre- and postloops, we may end up expanding a
recurrence from the loop in its loop's preheader. This produces incorrect IR.
This patch ensures that IRCE uses SCEVExpander correctly and only expands code which
is safe to expand in this particular location.
Differential Revision: https://reviews.llvm.org/D39234
llvm-svn: 318381
The type legalizer will try to scalarize these operations if it sees them, but there is no handling for scalarizing them. This leads to a fatal error. With this change they will now be scalarized by the mem intrinsic scalarizing pass before SelectionDAG.
llvm-svn: 318380
processDbgDeclares assumes pointer size is the same for different addr spaces.
It uses pointer size for addr space 0 for all pointers, which causes an assertion
in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since
a pointer in addr space 5 has a different size than in addr space 0.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40085
llvm-svn: 318370
Add hook in BPF backend so that llvm-objdump can print out
the jmp target with label names, e.g.,
...
if r1 != 2 goto 6 <LBB0_2>
...
goto 7 <LBB0_4>
...
LBB0_2:
...
LBB0_4:
...
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 318358
Summary:
This patch adds an LLVM_ENABLE_GISEL_COV option which, like LLVM_ENABLE_DAGISEL_COV,
causes TableGen to instrument the generated table to collect rule coverage
information. However, LLVM_ENABLE_GISEL_COV goes a bit further than
LLVM_ENABLE_DAGISEL_COV. The information is written to files
(${CMAKE_BINARY_DIR}/gisel-coverage-* by default). These files can then be
concatenated into ${LLVM_GISEL_COV_PREFIX}-all after which TableGen will
read this information and use it to emit warnings about untested rules.
This technique could also be used by SelectionDAG and can be further
extended to detect hot rules and give them priority over colder rules.
Usage:
* Enable LLVM_ENABLE_GISEL_COV in CMake
* Build the compiler and run some tests
* cat gisel-coverage-[0-9]* > gisel-coverage-all
* Delete lib/Target/*/*GenGlobalISel.inc*
* Build the compiler
Known issues:
* ${LLVM_GISEL_COV_PREFIX}-all must be generated as a manual
step due to a lack of a portable 'cat' command. It should be the
concatenation of all ${LLVM_GISEL_COV_PREFIX}-[0-9]* files.
* There's no mechanism to discard coverage information when the ruleset
changes
Depends on D39742
Reviewers: ab, qcolombet, t.p.northover, aditya_nandakumar, rovka
Reviewed By: rovka
Subscribers: vsk, arsenm, nhaehnle, mgorny, kristof.beyls, javed.absar, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D39747
llvm-svn: 318356