This change does two things. One, it ensures compilation will abort
instead of miscompiling if ARMFrameLowering::determineCalleeSaves
chooses not to save LR in a case where it's necessary. Two, it changes
the way we estimate the size of a function to be more conservative in
the presence of constant pool entries and jump tables.
EstimateFunctionSizeInBytes probably still isn't really conservative
enough, but I'm not sure how we can come up with a reliable estimate
before constant islands runs.
Differential Revision: https://reviews.llvm.org/D59439
llvm-svn: 356527
This adds pattern matching for the insert+shufflevector sequence so we can
generate dup instructions instead of the current TBL sequence.
Differential Revision: https://reviews.llvm.org/D59558
llvm-svn: 356526
Nothing prevents entries from being bigger than the 16 bit size field in
Dwarf < 5. For entries that are too big, just emit an empty entry
instead of crashing.
This fixes PR41038.
Reviewers: probinson, aprantl, davide
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D59518
llvm-svn: 356514
LTO provides additional opportunities for tailcall elimination due to
link-time inlining and visibility of nocapture attribute. Testing showed
negligible impact on compilation times.
Differential Revision: https://reviews.llvm.org/D58391
llvm-svn: 356511
Teach instcombine to propagate demanded elements through a masked load or masked gather instruction. This is in the broader context of improving vector pointer instcombine under https://reviews.llvm.org/D57140.
Differential Revision: https://reviews.llvm.org/D57372
llvm-svn: 356510
The 2nd loop calculates spill costs but reports free registers as cost
0 anyway, so there is little benefit from having a separate early
loop.
Surprisingly this is not NFC, as many register are marked regDisabled
so the first loop often picks up later registers unnecessarily instead
of the first one available in the allocation order...
Patch by Matthias Braun
llvm-svn: 356499
The actual code change is fairly straight forward, but exercising it isn't. First, it turned out we weren't adding the appropriate flags in SelectionDAG. Second, it turned out that we've got some optimization gaps, so obvious test cases don't work.
My first attempt (in atomic-unordered.ll) points out a deficiency in our peephole-opt folding logic which I plan to fix separately. Instead, I'm exercising this through MachineLICM.
Differential Revision: https://reviews.llvm.org/D59375
llvm-svn: 356494
Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by
intersecting the computeConstantRange() result into the ConstantRange
created from computeKnownBits(). This allows us to detect some
additional never/always overflows conditions that can't be determined
from known bits.
This revision also adds basic handling for constants to
computeConstantRange(). Non-splat vectors will be handled in a followup.
The signed case will also be handled in a followup, as it needs some
more groundwork.
Differential Revision: https://reviews.llvm.org/D59386
llvm-svn: 356489
Add tests for wider atomic loads and stores. In the process, fix a crasher where we appearently handled unorder stores, but not loads, when lowering to cmpxchg idioms.
llvm-svn: 356482
Dynamic stack realignment was disabled on micromips by checking if
target has standard encoding. We simply change the condition to skip
Mips16 only.
Patch by Mirko Brkusanin.
Differential Revision: http://reviews.llvm.org/D59499
llvm-svn: 356478
In r311255 we added a case where we split vectors whose elements are
all derived from the same input vector so that we could shuffle it
more efficiently. In doing so, createBuildVecShuffle was taught to
adjust for the fact that all indices would be based off of the first
vector when this happens, but it's possible for the code that checked
that to fire incorrectly if we happen to have a BUILD_VECTOR of
extracts from subvectors and don't hit this new optimization.
Instead of trying to detect if we've split the vector by checking if
we have extracts from the same base vector, we can just pass that
information into createBuildVecShuffle, avoiding the miscompile.
Differential Revision: https://reviews.llvm.org/D59507
llvm-svn: 356476
There are some issues w/missed opts on older platforms, but that's not the purpose of this test. Using a newer API points out that some TODOs are already handled, and allows addition of tests to exercise other issues (future patch.)
llvm-svn: 356473
Combine 2 fcmps that are checking for nan-ness:
and (fcmp ord X, 0), (and (fcmp ord Y, 0), Z) --> and (fcmp ord X, Y), Z
or (fcmp uno X, 0), (or (fcmp uno Y, 0), Z) --> or (fcmp uno X, Y), Z
This is an exact match for a minimal reassociation pattern.
If we want to handle this more generally that should go in
the reassociate pass and allow removing this code.
This should fix:
https://bugs.llvm.org/show_bug.cgi?id=41069
llvm-svn: 356471
These changes are related to PR37743 and include:
SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.
Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.
Add promoting the integer ABS node in the LegalizeIntegerType.
Expand-based legalization of integer result for the ABS nodes.
Expand-based legalization of ABS vector operations.
Add some integer abs testcases for different typesizes for Thumb arch
Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
tmp = (SRA, Hi, 31)
Lo = (UADDO tmp, Lo)
Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
Lo = (XOR tmp, Lo)
The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
(ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).
Change integer abs testcases for codegen with the ABS node support for AArch64.
Indicate that the ABS is legal for the i64 type when the NEON is supported.
Change the integer abs testcases to show changing of codegen.
Add combine and legalization of ABS nodes for Thumb arch.
Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.
For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743
Patch by: @ikulagin (Ivan Kulagin)
Differential Revision: https://reviews.llvm.org/D49837
llvm-svn: 356468
Summary:
GNU ar supports the 'N' count modifier for the extract (x) and delete (d) operations. When an archive contains multiple members with the same name, this can be used to extract (or delete) them individually. For example:
```
$ llvm-ar t archive.a
foo
foo
$ llvm-ar x archive.a
-> Writes foo twice, overwriting it the second time :( :(
$ llvm-ar xN 1 archive.a foo && mv foo foo.1
$ llvm-ar xN 2 archive.a foo && mv foo foo.2
-> Write foo twice, renaming it in between invocations to preserve all versions
```
Reviewers: ruiu, MaskRay
Reviewed By: ruiu, MaskRay
Subscribers: jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59503
llvm-svn: 356466
I found this really weird WWM-related case whereby through the WWM
transformations our isel lowering was trying to promote 2 min's into a
min3 for the i8 type, which our hardware doesn't support.
The new min3_i8.ll test case would previously spew the error:
PromoteIntegerResult #0: t69: i8 = SMIN3 t70, Constant:i8<0>, t68
Before the simple fix to our isel lowering to not do it for i8 MVT's.
Differential Revision: https://reviews.llvm.org/D59543
llvm-svn: 356464
Switch to the `MCParserUtils::parseAssignmentExpression` for parsing
assignment expressions in the `.set` directive reduces code and allows
to print an error message instead of crashing in case of incorrect
recursive using of the `.set`.
Fix for the bug https://bugs.llvm.org/show_bug.cgi?id=41053.
Differential Revision: http://reviews.llvm.org/D59452
llvm-svn: 356461
As discussed on PR41125 and D59363, we have a mismatch between icmp eq/ne cases with an undef operand:
When the other operand is constant we fold to undef (handled in ConstantFoldCompareInstruction)
When the other operand is non-constant we fold to a bool constant based on isTrueWhenEqual (handled in SimplifyICmpInst).
Neither is really wrong, but this patch changes the logic in SimplifyICmpInst to consistently fold to undef.
The NewGVN test change is annoying (as with most heavily reduced tests) but AFAICT I have kept the purpose of the test based on rL291968.
Differential Revision: https://reviews.llvm.org/D59541
llvm-svn: 356456
Moving subprogram specific flags into DISPFlags makes IR code more readable.
In addition, we provide free space in DIFlags for other
'non-subprogram-specific' debug info flags.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D59288
llvm-svn: 356454
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.
The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.
For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.
This is a recommit of r356442 with trivial fixes for the failing tests.
Differential Revision: https://reviews.llvm.org/D56587
llvm-svn: 356451
Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.
The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.
For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.
Differential Revision: https://reviews.llvm.org/D56587
llvm-svn: 356442
Summary:
This adds `preds` comment lines to BB names for readability, while also
fixes some of existing incorrect comment lines. Also deletes a few
unnecessary attributes. Autogenerated by `opt`.
Reviewers: kripken
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59456
llvm-svn: 356439
Summary:
After r355981, intrinsic arguments that are immediate values should be
marked as `ImmArg`.
Reviewers: dschuff, tlively
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59447
llvm-svn: 356437
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.
Differential Revision: https://reviews.llvm.org/D59506
llvm-svn: 356415
Baseline tests for D59471 (InstCombine of `add nuw` and `uaddo` with
constants).
Patch by Dan Robertson.
Differential Revision: https://reviews.llvm.org/D59472
llvm-svn: 356414
- Do not use unnamed values in saddo tests
- Add tests for canonicalization of a constant arg0
Patch by Dan Robertson.
Differential Revision: https://reviews.llvm.org/D59476
llvm-svn: 356403
Allow the clamp modifier on vop3 int arithmetic instructions in assembly
and disassembly.
This involved adding a clamp operand to the affected instructions in MIR
and MC, and thus having to fix up several places in codegen and MIR
tests.
Differential Revision: https://reviews.llvm.org/D59267
Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e
llvm-svn: 356399
This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.
This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.
To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.
Differential Revision: https://reviews.llvm.org/D59191
Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
llvm-svn: 356398
This reinstates r347934, along with a tweak to address a problem with
PHI node ordering that that commit created (or exposed). (That commit
was reverted at r348426, due to the PHI node issue.)
Original commit message:
r320789 suppressed moving the insertion point of SCEV expressions with
dev/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
This fixes PR30806.
Differential Revision: https://reviews.llvm.org/D57428
llvm-svn: 356392