Commit Graph

62969 Commits

Author SHA1 Message Date
Craig Topper 5e7815b695 [X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patterns
These instructions only read 64-bits of memory so we shouldn't
allow a full vector width load to be pattern matched in case it
is marked volatile.

Instead allow vzload or scalar_to_vector+load.

Also add a DAG combine to turn full vector loads into vzload when
used by one of these instructions if the load isn't volatile.

This fixes another case for PR42079

llvm-svn: 364838
2019-07-01 19:01:37 +00:00
Matt Arsenault bae3636f96 AMDGPU/GlobalISel: Handle more input argument intrinsics
llvm-svn: 364836
2019-07-01 18:50:50 +00:00
Matt Arsenault 9e8e8c60fa AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics
llvm-svn: 364835
2019-07-01 18:49:01 +00:00
Matt Arsenault 756d81905f AMDGPU/GlobalISel: Legalize workgroup ID intrinsics
llvm-svn: 364834
2019-07-01 18:47:22 +00:00
Matt Arsenault e2c86cce3a AMDGPU/GlobalISel: Legalize workitem ID intrinsics
Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.

Test is copied directly from the existing test, with 2 additions.

llvm-svn: 364833
2019-07-01 18:45:36 +00:00
Matt Arsenault e15770aec4 AMDGPU/GlobalISel: Custom lower control flow intrinsics
Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.

llvm-svn: 364832
2019-07-01 18:40:23 +00:00
Matt Arsenault 4073b33786 AMDGPU/GlobalISel: Handle 16-bit SALU min/max
This needs to be extended to s32, and expanded into cmp+select.  This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.

llvm-svn: 364831
2019-07-01 18:33:37 +00:00
Matt Arsenault 5a7d5111e5 AMDGPU/GlobalISel: Lower SALU min/max to cmp+select
Use a change observer to apply a register bank to the newly created
intermediate result register.

llvm-svn: 364830
2019-07-01 18:30:45 +00:00
Robert Lougher e20030f612 [X86] Avoid SFB - Fix inconsistent codegen with/without debug info(2)
The function findPotentialBlockers may consider debug info instructions as
potential blockers and may stop searching for a store-load pair prematurely.

This patch corrects this and tests the cases where the store is separated
from the load by more than InspectionLimit debug instructions.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D62408

llvm-svn: 364829
2019-07-01 18:28:21 +00:00
Matt Arsenault 7f8c720939 AMDGPU/GlobalISel: Add tests for add legalization
llvm-svn: 364828
2019-07-01 18:26:47 +00:00
Matt Arsenault ef59cb6982 AMDGPU/GlobalISel: Legalize s16 add/sub/mul
If this is scalar, promote to s32. Use a new observer class to assign
the register bank of newly created registers.

llvm-svn: 364827
2019-07-01 18:18:55 +00:00
Matt Arsenault 9470bb262b AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT
The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.

Currently greedy unnecessarily forces using a VCC select.

llvm-svn: 364825
2019-07-01 18:13:12 +00:00
Roman Lebedev e62857786f [NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)
https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

llvm-svn: 364824
2019-07-01 18:11:32 +00:00
Matt Arsenault 03ca176ab3 GlobalISel: Verify G_MERGE_VALUES operand sizes
llvm-svn: 364822
2019-07-01 18:01:35 +00:00
Matt Arsenault b2ea20eedd AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt
llvm-svn: 364819
2019-07-01 17:40:18 +00:00
Matt Arsenault 40d1faf38f AMDGPU/GlobalISel: Legalize s16 fcmp
llvm-svn: 364817
2019-07-01 17:35:53 +00:00
Nicolai Haehnle 10c911db63 AMDGPU/GFX10: implement ds_ordered_count changes
Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.

Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63716

llvm-svn: 364815
2019-07-01 17:17:52 +00:00
Nicolai Haehnle 4dc3b2bf95 AMDGPU: Support GDS atomics
Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63452

llvm-svn: 364814
2019-07-01 17:17:45 +00:00
Matt Arsenault 1094e6a814 AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap
llvm-svn: 364811
2019-07-01 17:04:57 +00:00
Matt Arsenault 265059eaf6 AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane
llvm-svn: 364808
2019-07-01 16:41:36 +00:00
Matt Arsenault 0a52e9d026 AMDGPU/GlobalISel: Complete implementation of G_GEP
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

llvm-svn: 364806
2019-07-01 16:34:48 +00:00
Matt Arsenault e1006259d8 AMDGPU/GlobalISel: Select G_PHI
llvm-svn: 364805
2019-07-01 16:32:47 +00:00
Matt Arsenault d810ff2588 AMDGPU/GlobalISel: Try to select VOP3 form of add
There are several things broken, but at least emit the right thing for
gfx9.

The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.

llvm-svn: 364804
2019-07-01 16:27:32 +00:00
Matt Arsenault 62d64b0c30 AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane
llvm-svn: 364801
2019-07-01 16:19:39 +00:00
Tom Stellard 9e9dd30de3 AMDGPU/GlobalISel: Implement select for 32-bit G_ADD
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

llvm-svn: 364797
2019-07-01 16:09:33 +00:00
Matt Arsenault 2ab25f9ceb AMDGPU/GlobalISel: Select G_BRCOND for vcc
llvm-svn: 364795
2019-07-01 16:06:02 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Roman Lebedev 72b8d41ce8 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

llvm-svn: 364791
2019-07-01 15:55:15 +00:00
Krzysztof Parzyszek 5abf80cdfa [Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)
llvm-svn: 364790
2019-07-01 15:50:09 +00:00
Matt Arsenault cda82f0bb6 AMDGPU/GlobalISel: Select G_FRAME_INDEX
llvm-svn: 364789
2019-07-01 15:48:18 +00:00
Nicolai Haehnle 7cfd99ab15 AMDGPU/GFX10: fix scratch resource descriptor
Summary:
The stride should depend on the wave size, not the hardware generation.

Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.

Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6

Reviewers: arsenm, rampitec, mareko

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63808

llvm-svn: 364788
2019-07-01 15:43:00 +00:00
Matt Arsenault fdf36729c7 AMDGPU/GlobalISel: Make s16 select legal
This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

llvm-svn: 364787
2019-07-01 15:42:47 +00:00
Matt Arsenault 6464280eb0 AMDGPU/GlobalISel: Select G_BRCOND for scc conditions
llvm-svn: 364786
2019-07-01 15:39:27 +00:00
Matt Arsenault 1daad91af6 AMDGPU/GlobalISel: Tolerate copies with no type set
isVCC has the same bug, but isn't used in a context where it can cause
a problem.

llvm-svn: 364784
2019-07-01 15:23:04 +00:00
Matt Arsenault fb99fc7a68 AMDGPU: Fix tests using the default alloca address space
llvm-svn: 364783
2019-07-01 15:23:03 +00:00
Matt Arsenault 4f64ade04c AMDGPU/GlobalISel: Select src modifiers
llvm-svn: 364782
2019-07-01 15:18:56 +00:00
Jinsong Ji ee6539341b [UpdateTestChecks][PowerPC] Avoid empty string when scrubbing loop comments
Summary:
SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285
This works for some loops.

However, we may generate lines with loop comments only.
And since we don't scrub leading white spaces, this will leave an empty
line there, and FileCheck will complain it.

eg: llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15:
error: found empty check string with prefix 'CHECK:'
; CHECK-NEXT:

This prevented us from using the `update_llc_test_checks.py` for quite some cases.

We should still keep the comment token there, so that we can safely
scrub the loop comment without breaking FileCheck.

Reviewers: timshen, hfinkel, lebedev.ri, RKSimon

Subscribers: nemanjai, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63957

llvm-svn: 364775
2019-07-01 14:37:48 +00:00
Roman Lebedev 34a0b16e29 [NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.
As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

llvm-svn: 364772
2019-07-01 14:28:24 +00:00
Krzysztof Parzyszek 511ad50db4 [Hexagon] Rework VLCR algorithm
Add code to catch pattern for commutative instructions for VLCR.

Patch by Suyog Sarda.

llvm-svn: 364770
2019-07-01 13:50:47 +00:00
Matt Arsenault 5bf850d52e AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE
llvm-svn: 364768
2019-07-01 13:40:18 +00:00
Matt Arsenault b5fc94f3e7 AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR
llvm-svn: 364767
2019-07-01 13:40:17 +00:00
Matt Arsenault 89fc8bcdd6 AMDGPU/GlobalISel: Fail on store to 32-bit address space
llvm-svn: 364766
2019-07-01 13:37:39 +00:00
Matt Arsenault 3b7668ae4b AMDGPU/GlobalISel: Improve icmp selection coverage.
Select s64 eq/ne scalar icmp.

llvm-svn: 364765
2019-07-01 13:34:26 +00:00
Roman Lebedev 9f3645869c [NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold fold (PR42459)
So we indeed to have this fold, but only if +1 is not the last operation..

llvm-svn: 364764
2019-07-01 13:31:06 +00:00
Matt Arsenault c23149f612 AMDGPU/GlobalISel: RegBankSelect for WWM/WQM
llvm-svn: 364763
2019-07-01 13:30:12 +00:00
Matt Arsenault facf69e844 AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote
llvm-svn: 364762
2019-07-01 13:30:09 +00:00
Matt Arsenault 9f992c238a AMDGPU/GlobalISel: Fix scc->vcc copy handling
This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.

Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.

llvm-svn: 364761
2019-07-01 13:22:07 +00:00
Matt Arsenault 5dafcb9b11 AMDGPU/GlobalISel: Use and instead of BFE with inline immediate
Zext from s1 is the only case where this should do anything with the
current legal extensions.

llvm-svn: 364760
2019-07-01 13:22:06 +00:00
Matt Arsenault 01bb075c1f GlobalISel: Add GINodeEquiv for min/max
llvm-svn: 364759
2019-07-01 13:22:04 +00:00
Matt Arsenault fbf67d88de GlobalISel: Add DAG compat for G_FCANONICALIZE
llvm-svn: 364758
2019-07-01 13:22:00 +00:00
Roman Lebedev d5c3e34cb7 [NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold fold (PR42459)
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

https://bugs.llvm.org/show_bug.cgi?id=42459
https://rise4fun.com/Alive/EPO0

llvm-svn: 364749
2019-07-01 12:22:06 +00:00
Benjamin Kramer ed13fef477 [SelectionDAG] Do minnum->minimum at legalization time instead of building time
The SDAGBuilder behavior stems from the days when we didn't have fast
math flags available in SDAG. We do now and doing the transformation in
the legalizer has the advantage that it also works for vector types.

llvm-svn: 364743
2019-07-01 11:00:23 +00:00
Roman Lebedev 4f878fe3a7 [NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457)
https://bugs.llvm.org/show_bug.cgi?id=42457
https://rise4fun.com/Alive/iFhE

llvm-svn: 364739
2019-07-01 09:57:53 +00:00
Roman Lebedev f55818e3a7 [InstCombine] Omit 'urem' where possible
This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364738
2019-07-01 09:41:43 +00:00
Roman Lebedev 0f82f64c83 [NFC][InstCombine] Copy test for omit urem when possible from TargetLowering
Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364737
2019-07-01 09:41:27 +00:00
Jeremy Morse d2b6665e33 [DebugInfo] Avoid adding too much indirection to pointer-valued variables
This patch addresses PR41675, where a stack-pointer variable is dereferenced
too many times by its location expression, presenting a value on the stack as
the pointer to the stack.

The difference between a stack *pointer* DBG_VALUE and one that refers to a
value on the stack, is currently the indirect flag. However the DWARF backend
will also try to guess whether something is a memory location or not, based
on whether there is any computation in the location expression. By simply
prepending the stack offset to existing expressions, we can accidentally
convert a register location into a memory location, which introduces a
suprise (and unintended) dereference.

The solution is to add DW_OP_stack_value whenever we add a DIExpression
computation to a stack *pointer*. It's an implicit location computed on the
expression stack, thus needs to be flagged as a stack_value.

For the edge case where the offset is zero and the location could be a register
location, DIExpression::prepend will still generate opcodes, and thus
DW_OP_stack_value must still be added.

Differential Revision: https://reviews.llvm.org/D63429

llvm-svn: 364736
2019-07-01 09:38:23 +00:00
Yevgeny Rouban d4097b4a93 [SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst
Differential Revision: https://reviews.llvm.org/D60606

llvm-svn: 364734
2019-07-01 08:43:53 +00:00
Sam Parker 98722691b0 [ARM] WLS/LE Code Generation
Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 364733
2019-07-01 08:21:28 +00:00
Craig Topper fcda45a9eb [X86] Add more load folding tests for vcvt(t)ps2(u)qq showing missed foldings. NFC
llvm-svn: 364730
2019-07-01 07:59:42 +00:00
Craig Topper 29fff0797b [X86] Improve the type checking fast-isel handling of vector bitcasts.
We had a bunch of vector size legality checks for the source type
based on feature flags, but we didn't check the destination type at
all beyond ensuring that it was a "simple" type. But this allowed
the destination to be i128 which isn't legal.

This commit changes the code to use TLI's isTypeLegal logic in
place of the all the subtarget checks. Then additionally checks
that the source and dest are vectors.

Fixes 42452

llvm-svn: 364729
2019-07-01 07:09:34 +00:00
Craig Topper 4ca81a9b99 [X86] Add a DAG combine to replace vector loads feeding a v4i32->v2f64 CVTSI2FP/CVTUI2FP node with a vzload.
But only when the load isn't volatile.

This improves load folding during isel where we only have vzload
and scalar_to_vector+load patterns. We can't have full vector load
isel patterns for the same volatile load issue.

Also add some missing masked cvtsi2fp/cvtui2fp with vzload patterns.

llvm-svn: 364728
2019-07-01 07:09:31 +00:00
Craig Topper fc233c9108 [X86] Add some additional load folding tests to vec_int_to_fp.ll/vec_int_to_fp-widen.ll and disable the peephole pass.
Also copy some missing test cases from vec_int_to_fp.ll to vec_int_to_fp-widen.ll

llvm-svn: 364727
2019-07-01 07:09:26 +00:00
Craig Topper d1728f8987 [X86] Add MOVHPDrm/MOVLPDrm patterns that use VZEXT_LOAD.
We already had patterns that used scalar_to_vector+load. But we can
also have a vzload.

Found while investigating combining scalar_to_vector+load to vzload.

llvm-svn: 364726
2019-07-01 07:09:23 +00:00
Sanjay Patel 706b48251f [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

llvm-svn: 364721
2019-06-30 13:40:31 +00:00
Sanjay Patel 77dc1e8568 [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum
This transform came up in D62414, but we should deal with it first.
We have LLVM intrinsics that correspond exactly to libm calls (unlike
most libm calls, these libm calls never set errno).
This holds without any fast-math-flags, so we should always canonicalize
to those intrinsics directly for better optimization.
Currently, we convert to fcmp+select only when we have FMF (nnan) because
fcmp+select does not preserve the semantics of the call in the general case.

Differential Revision: https://reviews.llvm.org/D63214

llvm-svn: 364714
2019-06-29 14:28:54 +00:00
Roman Lebedev e3a94ba4a9 [InstCombine] Shift amount reassociation (PR42391)
Summary:
Given pattern:
`(x shiftopcode Q) shiftopcode K`
we should rewrite it as
`x shiftopcode (Q+K)`  iff `(Q+K) u< bitwidth(x)`
This is valid for any shift, but they must be identical.

* https://rise4fun.com/Alive/9E2
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 | PR42391]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63812

llvm-svn: 364712
2019-06-29 11:51:50 +00:00
Nikita Popov 2d756c4feb [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

llvm-svn: 364709
2019-06-29 09:24:12 +00:00
Sam Clegg b72664fd21 Partial revert of "[llvm-ar] Document response file support in --help"
This is partial revert of 70a8027c60.

The test apparently failed on win32 bots due to the way slashes in
pathnames are handled.

llvm-svn: 364705
2019-06-29 01:53:26 +00:00
Matt Arsenault 7889d4ce66 AMDGPU/GlobalISel: Add some more tests for icmp select
llvm-svn: 364703
2019-06-29 00:55:16 +00:00
Matt Arsenault 0d45209757 AMDGPU/GlobalISel: RegBankSelect for update.dpp
llvm-svn: 364701
2019-06-29 00:44:36 +00:00
Matt Arsenault fd82cf4f4d AMDGPU/GlobalISel: RegBankSelect for atomic.inc/atomic.dec
llvm-svn: 364699
2019-06-29 00:39:20 +00:00
Matt Arsenault adb1f21e52 AMDGPU/GlobalISel: RegBankSelect for some DS intrinsics
llvm-svn: 364698
2019-06-29 00:33:13 +00:00
Matt Arsenault 5ea3c9adb2 AMDGPU/GlobalISel: RegBankSelect for icmp/fcmp intrinsics
llvm-svn: 364696
2019-06-29 00:28:52 +00:00
Matt Arsenault 6aafb3068f AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.fmas
llvm-svn: 364695
2019-06-29 00:25:53 +00:00
Matt Arsenault ade5162432 AMDGPU/GlobalISel: RegBankSelect for some simple leaf intrinsics
llvm-svn: 364694
2019-06-29 00:22:28 +00:00
Matt Arsenault 69d9c31433 AMDGPU: Add baseline test for packed shufflevector
llvm-svn: 364691
2019-06-28 23:43:40 +00:00
Wouter van Oortmerssen 319c87d94f [WebAssembly] Assembler: support .int16/32/64 directives.
Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63959

llvm-svn: 364689
2019-06-28 22:20:33 +00:00
Wouter van Oortmerssen 35bcba4fae [WebAssembly] Allow @object in .type directives.
Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63955

llvm-svn: 364688
2019-06-28 21:53:11 +00:00
Cameron McInally b671535983 [NFC][NewGVN] Explicitly check fpmath metadata in fpmath.ll
Suggested in D63933.

llvm-svn: 364685
2019-06-28 21:39:08 +00:00
Sanjay Patel 573b241c68 [Lanai] auto-generate complete test checks; NFC
This file will fail with a common codegen transform that
I'm looking at, and I can't tell if that's an improvement
or regression based on the sparse checking.

llvm-svn: 364684
2019-06-28 20:45:32 +00:00
Wouter van Oortmerssen fc222e23ca [WebAssembly] Assembler: Allow offsets and p2align in symbol load.
Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63951

llvm-svn: 364682
2019-06-28 20:31:13 +00:00
Cameron McInally 30e5cf1d8f [NewGVN] Add unary FNeg support to NewGVN pass
Differential Revision: https://reviews.llvm.org/D63933

llvm-svn: 364680
2019-06-28 20:09:32 +00:00
Cameron McInally ab4b2364e5 [GVNSink] Add unary FNeg support to GVNSink pass
Differential Revision: https://reviews.llvm.org/D63900

llvm-svn: 364678
2019-06-28 19:57:31 +00:00
Brad Smith 4b733ca617 Default to Secure PLT on PPC for musl libc.
This matches the default settings of clang.

llvm-svn: 364675
2019-06-28 19:48:31 +00:00
Sam Clegg 70a8027c60 [llvm-ar] Document response file support in --help
Also a test for this.

Differential Revision: https://reviews.llvm.org/D63836

llvm-svn: 364673
2019-06-28 18:48:05 +00:00
Lang Hames 62a627ae78 Re-apply r364600 with fixes.
Fix: MachO/X86_64_RELOC_GOT is a 32-bit reloc, so only compare 32 bits.
llvm-svn: 364672
2019-06-28 18:36:59 +00:00
Jinsong Ji 7d78e5cc81 [UpdateChecks] Add support for armv7-apple-darwin
armv7-apple-darwin was not supported well, the script can't generate
checks.

https://reviews.llvm.org/D60601/new/#inline-568671

Differential Revision: https://reviews.llvm.org/D63939

llvm-svn: 364668
2019-06-28 18:07:19 +00:00
Simon Pilgrim 978a08c885 [X86] CombineShuffleWithExtract - recurse through EXTRACT_SUBVECTOR chain
llvm-svn: 364667
2019-06-28 17:57:32 +00:00
Peter Collingbourne 7108df964a hwasan: Remove the old frame descriptor mechanism.
Differential Revision: https://reviews.llvm.org/D63470

llvm-svn: 364665
2019-06-28 17:53:26 +00:00
Roman Lebedev 0b8b419537 [NFC][Codegen] Revisit test coverage for X % C == 0 fold once more (add tests with '1' divisor)
llvm-svn: 364661
2019-06-28 17:26:28 +00:00
Wouter van Oortmerssen 633d222d30 [WebAssembly] Added visibility and ident directives to WasmAsmParser.
Summary:
These are output by clang -S, so can now be roundtripped thru clang.

(partially) fixes: https://bugs.llvm.org/show_bug.cgi?id=34544

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63901

llvm-svn: 364658
2019-06-28 16:51:06 +00:00
Roman Lebedev 3b4f086df4 [NFC][InstCombine] Shift amount reassociation: revisit flag preservation tests
llvm-svn: 364657
2019-06-28 16:36:53 +00:00
Sam Tebbs e39e958da3 [ARM] Add support for the MVE long shift instructions
MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers.

The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl.

test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions.

Differential Revision: https://reviews.llvm.org/D63430

llvm-svn: 364654
2019-06-28 15:43:31 +00:00
Max Moroz 176b9f6516 [llvm-cov[ Fix lcov coverage report contains functions from other compilation units.
Summary: Patch by Chuan Qiu (@eagleonhill).

Reviewers: Dor1s

Reviewed By: Dor1s

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63571

llvm-svn: 364653
2019-06-28 15:38:25 +00:00
Roman Lebedev 9f1dffdb02 [NFC][InstCombine] Shift amount reassociation: add flag preservation test
As discussed in https://reviews.llvm.org/D63812#inline-569870
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

So basically if the same flag is set on both original shifts -> set it on new shift.
Don't think we can do anything with non-matching flags on shl.

llvm-svn: 364652
2019-06-28 15:32:52 +00:00
Cameron McInally 9fab46ca0b [NFC][Float2Int] Pre-commit unary FNeg test to basic.ll
llvm-svn: 364649
2019-06-28 15:12:15 +00:00
Cameron McInally 13d9c723c8 [NFC][NewGVN] Pre-commit unary FNeg test to fpmath.ll
llvm-svn: 364646
2019-06-28 14:39:58 +00:00
Dmitry Preobrazhensky 1d572ce395 [AMDGPU][MC] Enabled constant expressions as operands of sendmsg
See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D62735

llvm-svn: 364645
2019-06-28 14:14:02 +00:00
Simon Pilgrim a54e1a0f01 [X86] CombineShuffleWithExtract - only require 1 source to be EXTRACT_SUBVECTOR
We were requiring that both shuffle operands were EXTRACT_SUBVECTORs, but we can relax this to only require one of them to be.

Also, we shouldn't bother attempting this if both operands are from the lowest subvector (or not EXTRACT_SUBVECTOR at all).

llvm-svn: 364644
2019-06-28 12:24:49 +00:00
David Green 9dbdfe6b78 [ARM] Add MVE mul patterns
This simply adds integer and floating point VMUL patterns for MVE, same as we
have add and sub.

Differential Revision: https://reviews.llvm.org/D63866

llvm-svn: 364643
2019-06-28 11:44:03 +00:00
Roman Lebedev 9af4474253 [NFC][Codegen] Revisit test coverage for X % C == 0 fold
llvm-svn: 364642
2019-06-28 11:36:34 +00:00
David Green 2883944035 [ARM] Mark math routines as non-legal for MVE
This adds handling and tests for a number of floating point math routines,
which have no MVE instructions.

Differential Revision: https://reviews.llvm.org/D63725

llvm-svn: 364641
2019-06-28 11:17:38 +00:00
David Green ff70cbc895 [ARM] MVE patterns for VABS and VNEG
This simply adds the required patterns for fp neg and abs.

Differential Revision: https://reviews.llvm.org/D63861

llvm-svn: 364640
2019-06-28 10:25:35 +00:00
David Green eb7080ac6e [ARM] Widening loads and narrowing stores
MVE has instructions to widen as it loads, and narrow as it stores. This adds
the required patterns and legalisation to make them work including specifying
that they are legal, patterns to select them and test changes.

Patch by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63839

llvm-svn: 364636
2019-06-28 09:47:55 +00:00
David Green 07e53fee14 [ARM] MVE loads and stores
This fills in the gaps for basic MVE loads and stores, allowing unaligned
access and adding far too many tests. These will become important as
narrowing/expanding and pre/post inc are added. Big endian might still not be
handled very well, because we have not yet added bitcasts (and I'm not sure how
we want it to work yet). I've included the alignment code anyway which maps
with our current patterns. We plan to return to that later.

Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev.

Differential Revision: https://reviews.llvm.org/D63838

llvm-svn: 364633
2019-06-28 08:41:40 +00:00
David Green fc4102417b [ARM] Mark div and rem as expand for MVE
We don't have vector operations for these, so they need to be expanded for both
integer and float.

Differential Revision: https://reviews.llvm.org/D63595

llvm-svn: 364631
2019-06-28 08:18:55 +00:00
David Green 62889b0ea5 [ARM] Select MVE fp add and sub
The same as integer arithmetic, we can add simple floating point MVE addition and
subtraction patterns.

Initial code by David Sherwood

Differential Revision: https://reviews.llvm.org/D63257

llvm-svn: 364629
2019-06-28 07:41:09 +00:00
Sam Parker 9a92be1b35 [HardwareLoops] Loop counter guard intrinsic
Introduce llvm.test.set.loop.iterations which sets the loop counter
and also produces an i1 after testing that the count is not zero.

Differential Revision: https://reviews.llvm.org/D63809

llvm-svn: 364628
2019-06-28 07:38:16 +00:00
David Green be05b85db9 [ARM] Select MVE add and sub
This adds the first few patterns for MVE code generation, adding simple integer
add and sub patterns.

Initial code by David Sherwood

Differential Revision: https://reviews.llvm.org/D63255

llvm-svn: 364627
2019-06-28 07:21:11 +00:00
David Green 8be372b190 [ARM] MVE vector shuffles
This patch adds necessary shuffle vector and buildvector support for ARM MVE.
It essentially adds support for VDUP, VREVs and some VMOVs, which are often
required by other code (like upcoming patches).

This mostly uses the same code from Neon that already generated
NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and
moved to ARMInstrInfo as they are common to both architectures. Most of the
selection code seems to be applicable to both, but NEON does have some more
instructions making some parts specific.

Most code originally by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63567

llvm-svn: 364626
2019-06-28 07:08:42 +00:00
Stanislav Mekhanoshin 07fd88d735 [AMDGPU] Packed thread ids in function call ABI
Differential Revision: https://reviews.llvm.org/D63851

llvm-svn: 364619
2019-06-28 01:52:13 +00:00
Amara Emerson ecb7ac35f9 [GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used.
The new switch lowering code that tries to generate jump tables and range checks
were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to
town on trying to generate optimized lowerings, e.g. multiple jump tables, range
checks etc. This exposed bugs in the way PHI nodes are handled because the CFG
looks even stranger after all of this is done.

llvm-svn: 364613
2019-06-27 23:56:34 +00:00
Peter Collingbourne 5378afc02a hwasan: Use llvm.read_register intrinsic to read the PC on aarch64 instead of taking the function's address.
This shaves an instruction (and a GOT entry in PIC code) off prologues of
functions with stack variables.

Differential Revision: https://reviews.llvm.org/D63472

llvm-svn: 364608
2019-06-27 23:24:07 +00:00
Lang Hames c29abb50f2 Revert "[JITLink][MachO/x86-64] Add a testcase for X86_64_RELOC_GOT."
Reverts commit r364600 while I investigate bot failures.

llvm-svn: 364606
2019-06-27 23:00:30 +00:00
Roman Lebedev 29d05c005f [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...

This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.

This is a recommit, the original commit rL364563 was reverted in rL364568
because test-suite detected miscompile - the new comparison constant 'Q'
was being computed incorrectly (we divided by `D0` instead of `D`).

Original patch D50222 by @hermord (Dmytro Shynkevych)

Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
  This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
  the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
  I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.

Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00

Reviewed By: RKSimon, xbolva00

Subscribers: dexonsmith, kristina, xbolva00, javed.absar, llvm-commits, hermord

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63391

llvm-svn: 364600
2019-06-27 21:52:10 +00:00
Lang Hames 4a8dc61534 [JITLink][MachO/x86-64] Add a testcase for X86_64_RELOC_GOT.
This is the data-section counterpart to X86_64_RELOC_GOTPCREL.

llvm-svn: 364598
2019-06-27 21:50:29 +00:00
Cameron McInally 30cab5d6ee [NFC][GVNSink] Pre-commit unary FNeg test to fpmath.ll
llvm-svn: 364597
2019-06-27 21:23:07 +00:00
Heejin Ahn 24dba1fe97 [WebAssembly] Enable an atomic.notify MC test
Summary:
Assembly of atomic.notify has been fixed in r364576, so we can enable
it.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63898

llvm-svn: 364596
2019-06-27 21:22:04 +00:00
Cameron McInally 6e62a796d5 [GVN] Add support for unary FNeg to GVN pass
Differential Revision: https://reviews.llvm.org/D63896

llvm-svn: 364592
2019-06-27 21:05:02 +00:00
Sanjay Patel 7ecf1ec49a [x86] remove whitespace; NFC
llvm-svn: 364588
2019-06-27 20:37:12 +00:00
Cameron McInally 22afca2ce0 [NFC][GVN] Pre-commit unary FNeg tests to fpmath.ll
llvm-svn: 364587
2019-06-27 20:33:44 +00:00
Sanjay Patel a95ca2b5ff [x86] prevent crashing from select narrowing with AVX512
llvm-svn: 364585
2019-06-27 20:16:58 +00:00
Philip Reames 1cf9e72cbc Update -analyze -scalar-evolution output for multiple exit loops w/computable exit values
The previous output was next to useless if *any* exit was not computable.  If we have more than one exit, show the exit count for each so that it's easier to see what's going from with SCEV analysis when debugging.

llvm-svn: 364579
2019-06-27 19:22:43 +00:00
Roman Lebedev bd34e50cf0 [NFC][CodeGen] Add negative test for X u% C == 0 fold (D63391)
The fold (D63391) uses multiplicativeInverse(),
but it is not guaranteed to always succeed,
and '100' appears to be one of the problematic values.

llvm-svn: 364578
2019-06-27 19:09:51 +00:00
Wouter van Oortmerssen bfd3f69480 [WebAssembly] AsmParser: better atomic inst detection
Summary:
Previously missed atomic.notify.

Fixes https://bugs.llvm.org/show_bug.cgi?id=40728

Reviewers: aheejin

Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits, dschuff

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63747

llvm-svn: 364576
2019-06-27 18:58:26 +00:00
Djordje Todorovic 774eabd097 Revert "[LiveDebugValues] Emit the debug entry values"
Appears that the 'test/DebugInfo/MIR/X86/dbginfo-entryvals.mir'
does not pass on Windows.

This reverts commit rL364553.

llvm-svn: 364571
2019-06-27 18:12:04 +00:00
Wouter van Oortmerssen 6b3f56b65f [WebAssembly] Fix p2align in assembler.
Summary:
- Match the syntax output by InstPrinter.
- Fix it always emitting 0 for align. Had to work around fact that
  opcode is not available for GetDefaultP2Align while parsing.
- Updated tests that were erroneously happy with a p2align=0

Fixes https://bugs.llvm.org/show_bug.cgi?id=40752

Reviewers: aheejin, sbc100

Subscribers: jgravelle-google, sunfish, jfb, llvm-commits, dschuff

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63633

llvm-svn: 364570
2019-06-27 18:11:15 +00:00
Roman Lebedev 0a2b7b79fa Revert "[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)"
*Appears* to break test-suite on
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/23790

FAIL: burg.execution_time
FAIL: spiff.execution_time
FAIL: employ.execution_time
FAIL: llu.execution_time
FAIL: gramschmidt.execution_time
FAIL: fdtd-apml.execution_time

This reverts commit r364563.

llvm-svn: 364568
2019-06-27 17:22:31 +00:00
Nicolai Haehnle 32ef9292be AMDGPU: Make fixing i1 copies robust against re-ordering
Summary:
The new test case led to incorrect code.

Change-Id: Ief48b227e97aa662dd3535c9bafb27d4a184efca

Reviewers: arsenm, david-salinas

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63871

llvm-svn: 364566
2019-06-27 16:56:44 +00:00
David Green 152dd3b854 [ARM] Move low overhead loop codegen tests into a separate file. NFC
llvm-svn: 364565
2019-06-27 16:56:41 +00:00
Roman Lebedev 0627b09863 [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...
Original patch D50222 by @hermord (Dmytro Shynkevych)

This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.

Original patch author: @hermord (Dmytro Shynkevych)!

Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
  This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
  the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
  I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.

Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00

Reviewed By: RKSimon, xbolva00

Subscribers: xbolva00, javed.absar, llvm-commits, hermord

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63391

llvm-svn: 364563
2019-06-27 16:45:42 +00:00
Chris Jackson 41e20d2101 [llvm-nm] Fix for BZ41711 - Class character for a symbol with undefined
binding does not match class assigned by GNU nm

 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41711

 Differential Revision: https://reviews.llvm.org/D63340

llvm-svn: 364559
2019-06-27 16:27:53 +00:00
Roland Froese 9f7f5858fe Recommit [PowerPC] Update P9 vector costs for insert/extract element
Recommit patch D60160 after regression fix patch D63463.

llvm-svn: 364557
2019-06-27 16:20:24 +00:00
Paul Robinson 1339f74b8a [debug-info] Make a couple of tests more robust.
llvm-svn: 364556
2019-06-27 15:53:07 +00:00
Johannes Doerfert 3b77583e95 [Attr] Add "willreturn" function attribute
This patch introduces a new function attribute, willreturn, to indicate
that a call of this function will either exhibit undefined behavior or
comes back and continues execution at a point in the existing call stack
that includes the current invocation.

This attribute guarantees that the function does not have any endless
loops, endless recursion, or terminating functions like abort or exit.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62801

llvm-svn: 364555
2019-06-27 15:51:40 +00:00
Djordje Todorovic d6a46aff59 [LiveDebugValues] Emit the debug entry values
Emit replacements for clobbered parameters location if the parameter
has unmodified value throughout the funciton. This is basic scenario
where we can use the debug entry values.

([12/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D58042

llvm-svn: 364553
2019-06-27 15:35:48 +00:00
Tim Northover a4771e9dfd Bitcode: derive all types used from records instead of Values.
There is existing bitcode that we need to support where the structured nature
of pointer types is used to derive the result type of some operation. For
example a GEP's operation and result will be based on its input Type.

When pointers become opaque, the BitcodeReader will still have access to this
information because it's explicitly told how to construct the more complex
types used, but this information will not be attached to any Value that gets
looked up. This changes BitcodeReader so that in all places which use type
information in this manner, it's derived from a side-table rather than from the
Value in question.

llvm-svn: 364550
2019-06-27 14:46:51 +00:00
Simon Pilgrim 83e1a1e79b [TargetLowering] SimplifyDemandedVectorElts - add shift/rotate support.
llvm-svn: 364548
2019-06-27 14:25:54 +00:00
Sanjay Patel d0e098696f [InstCombine] remove 'tmp' names and regenerate checks; NFC
llvm-svn: 364546
2019-06-27 14:20:10 +00:00
Jinsong Ji 157b073fa5 [PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.

This patch try to clean up TM related DAGs.

* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0

Differential Revision: https://reviews.llvm.org/D61935

llvm-svn: 364544
2019-06-27 14:11:31 +00:00
Hans Wennborg 408fc0849e Revert r363658 "[SVE][IR] Scalable Vector IR Type with pr42210 fix"
We saw a 70% ThinLTO link time increase in Chromium for Android, see
crbug.com/978817. Sounds like more of PR42210.

> Recommit of D32530 with a few small changes:
>   - Stopped recursively walking through aggregates in
>     the verifier, so that we don't impose too much
>     overhead on large modules under LTO (see PR42210).
>   - Changed tests to match; the errors are slightly
>     different since they only report the array or
>     struct that actually contains a scalable vector,
>     rather than all aggregates which contain one in
>     a nested member.
>   - Corrected an older comment
>
> Reviewers: thakis, rengolin, sdesmalen
>
> Reviewed By: sdesmalen
>
> Differential Revision: https://reviews.llvm.org/D63321

llvm-svn: 364543
2019-06-27 13:55:02 +00:00
Djordje Todorovic a0d45058eb [DWARF] Handle the DW_OP_entry_value operand
Add the IR and the AsmPrinter parts for handling of the DW_OP_entry_values
DWARF operation.

([11/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60866

llvm-svn: 364542
2019-06-27 13:52:34 +00:00
Simon Pilgrim c692a8dc51 [TargetLowering] SimplifyDemandedBits - use DemandedElts to better identify partial splat shift amounts
llvm-svn: 364541
2019-06-27 13:48:43 +00:00
Simon Tatham ffb2b347ff [ARM] Fix handling of zero offsets in LOB instructions.
The BF and WLS/WLSTP instructions have various branch-offset fields
occupying different positions and lengths in the instruction encoding,
and all of them were decoded at disassembly time by the function
DecodeBFLabelOffset() which returned SoftFail if the offset was zero.

In fact, it's perfectly fine and not even a SoftFail for most of those
offset fields to be zero. The only one that can't be zero is the 4-bit
field labelled `boff` in the architecture spec, occupying bits {26-23}
of the BF instruction family. If that one is zero, the encoding
overlaps other instructions (WLS, DLS, LETP, VCTP), so it ought to be
a full Fail.

Fixed by adding an extra template parameter to DecodeBFLabelOffset
which controls whether a zero offset is accepted or rejected. Adjusted
existing tests (only in error messages for bad disassemblies); added
extra tests to demonstrate zero offsets being accepted in all the
right places, and a few demonstrating rejection of zero `boff`.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63864

llvm-svn: 364533
2019-06-27 12:41:07 +00:00
Simon Tatham e5ce56fb95 [ARM] Make coprocessor number restrictions consistent.
Different versions of the Arm architecture disallow the use of generic
coprocessor instructions like MCR and CDP on different sets of
coprocessors. This commit centralises the check of the coprocessor
number so that it's consistent between assembly and disassembly, and
also updates it for the new restrictions in Arm v8.1-M.

New tests added that check all the coprocessor numbers; old tests
updated, where they used a number that's now become illegal in the
context in question.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63863

llvm-svn: 364532
2019-06-27 12:40:55 +00:00
Simon Tatham 02449f9c3c [ARM] Tighten restrictions on use of SP in v8.1-M CSEL.
In the `CSEL Rd,Rm,Rn` instruction family (also including CSINC, CSINV
and CSNEG), the architecture lists it as CONSTRAINED UNPREDICTABLE
(i.e. SoftFail) to use SP in the Rd or Rm slot, but outright illegal
to use it in the Rn slot, not least because some encodings of that
form are used by MVE instructions such as UQRSHLL.

MC was treating all three slots the same, as SoftFail. So the only
reason UQRSHLL was disassembled correctly at all was because the MVE
decode table is separate from the Thumb2 one and takes priority; if
you turned off MVE, then encodings such as `[0x5f,0xea,0x0d,0x83]`
would disassemble as spurious CSELs.

Fixed by inventing another version of the `GPRwithZR` register class,
which disallows SP completely instead of just SoftFailing it.

Reviewers: DavidSpickett, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63862

llvm-svn: 364531
2019-06-27 12:40:40 +00:00
Jeremy Morse 3ca8f2b007 Add triple to a test I just added.
llvm-svn: 364524
2019-06-27 11:52:03 +00:00
Tim Northover 22c96a966b IR: compare type attributes deeply when looking into functions.
FunctionComparator attempts to produce a stable comparison of two Function
instances by looking at all available properties. Since ByVal attributes now
contain a Type pointer, they are not trivially ordered and FunctionComparator
should use its own Type comparison logic to sort them.

llvm-svn: 364523
2019-06-27 11:44:45 +00:00
George Rimar cfe9d0fb2b [Object/invalid.test] - Convert most of the sub tests to YAML.
Object/invalid.test is a test case that is used to check the behavior of tools
when broken inputs are used.

The most often tool tested there is llvm-readobj. I think we might want to move
such tests to test\tools\llvm-readobj. For now this patch converts
many sub-tests to use YAML and removes 12 binaries from the inputs.

Differential revision: https://reviews.llvm.org/D63762

llvm-svn: 364522
2019-06-27 11:31:43 +00:00
Stefan Stipanovic 5360589b7d [Attributor] Deducing existing nounwind attribute.
Adding nounwind deduction in new attributor framework.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D63379

llvm-svn: 364521
2019-06-27 11:27:54 +00:00