Commit Graph

62969 Commits

Author SHA1 Message Date
Nilanjana Basu 2082bf28eb Changing CodeView debug info type record representation in assembly files to make it more human-readable & editable
llvm-svn: 364982
2019-07-03 00:26:23 +00:00
Alex Lorenz da1dfecd32 Add support for the 'macCatalyst' MachO platform
Mac Catalyst is a new MachO platform in macOS Catalina.
It always uses the build_version MachO load command.

Differential Revision: https://reviews.llvm.org/D64107

llvm-svn: 364981
2019-07-02 23:47:11 +00:00
Teresa Johnson 45fa289eb1 [ThinLTO] Work around existing failure exposed by new test
When adding summary entries for index-based WPD (r364960), an added
test also included some additional testing of the existing hybrid
Thin/Regular LTO WPD (test/ThinLTO/X86/devirt.ll). That part of the
test is producing a failure on the llvm-clang-x86_64-expensive-checks-win
bot:

*** Bad machine code: Explicit definition marked as use ***
- function:    __typeid__ZTS1A_0_branch_funnel
- basic block: %bb.0  (0x81d4c58)
- instruction: ICALL_BRANCH_FUNNEL %0:gr64, @0, 16, @_ZN1B1fEi, 48, @_ZN1C1fEi
- operand 0:   %0:gr64
LLVM ERROR: Found 1 machine code errors.

This is functionality unrelated to the summary entries added with my
patch, so I am disabling this part of the new test until it is
addressed. I'll continue to investigate the failure.

llvm-svn: 364978
2019-07-02 23:28:28 +00:00
Teresa Johnson 54c7907f52 [ThinLTO] Dump input on failure in devirt test
To help track down bug exposed by llvm-clang-x86_64-expensive-checks-win
bot.

llvm-svn: 364973
2019-07-02 22:06:02 +00:00
Eli Friedman e97aa961d3 [ARM] Fix unwind info for Thumb1 functions that save high registers.
There were two issues here: one, some of the relevant instructions were
missing the expected "FrameSetup" flag, and two,
ARMAsmPrinter::EmitUnwindingInstruction wasn't expecting "mov"
instructions in the prologue.

I'm sticking the additional state into ARMFunctionInfo so it's obvious
it only applies to the current function.

I considered a few alternative approaches where we would compute the
correct unwind information as part of the prologue/epilogue lowering,
but it seems like a lot of work to introduce pseudo-instructions, and
the current code seems to be reliable enough.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42408.

Differential Revision: https://reviews.llvm.org/D63964

llvm-svn: 364970
2019-07-02 21:35:15 +00:00
Teresa Johnson f2055c5eb8 [gold] Fix test after BitStream reader error changes
The recent change to the BitStream reader error handling in r364464
changed the error message format (from "LLVM ERROR:" to just "error"),
leading to a failure in this test which is only executed for very recent
versions of gold. Fix this by removing that part of the error message
check, leaving only the interesting part of the message to be checked.

llvm-svn: 364965
2019-07-02 20:24:00 +00:00
Vasileios Porpodas cf47ff5ffb [SLP] Recommit: Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364964
2019-07-02 20:20:28 +00:00
Jessica Paquette 99316043bb [AArch64][GlobalISel] Teach tryOptSelect to handle G_ICMP
This teaches `tryOptSelect` to handle folding G_ICMP, and removes the
requirement that the G_SELECT we're dealing with is floating point.

Some refactoring to make this work nicely as well:

- Factor out the scalar case from the selection code for G_ICMP into
  `emitIntegerCompare`.
- Make `tryOptCMN` return a MachineInstr* instead of a bool.
- Make `tryOptCMN` not modify the instruction being selected.
- Factor out the CMN emission into `emitCMN` for readability.

By doing this this way, we can get all of the compare selection optimizations
in select emission.

Differential Revision: https://reviews.llvm.org/D64084

llvm-svn: 364961
2019-07-02 19:44:16 +00:00
Teresa Johnson a700436323 [ThinLTO] Add summary entries for index-based WPD
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).

Also added are the necessary bitcode records, and the corresponding
assembly support.

The follow-on index-based WPD patch is D55153.

Depends on D53890.

Reviewers: pcc

Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54815

llvm-svn: 364960
2019-07-02 19:38:02 +00:00
Matt Arsenault 5fe851b6cd AMDGPU: Custom lower vector_shuffle for v4i16/v4f16
Ordinarily it is lowered as a build_vector of each extract_vector_elt,
which in turn get lowered to bitcasts and bit shifts. Very little
understand the lowered extract pattern, resulting in much worse
code. We treat concat_vectors of v2i16 as legal, so prefer that.

llvm-svn: 364959
2019-07-02 19:15:45 +00:00
Craig Topper fa4e825a3b [X86] Copy test cases from vector-zext.ll to vector-zext-widen.ll. Same for vector-sext.ll. NFC
llvm-svn: 364957
2019-07-02 18:39:59 +00:00
Yuanfang Chen d16c162c94 [llvm-objdump] Warn if no user specified sections (-j) are not found.
Match GNU objdump.

https://bugs.llvm.org/show_bug.cgi?id=41898

Reviewers: jhenderson, grimar, MaskRay, rupprecht

Reviewed by: jhenderson, grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D63779

llvm-svn: 364955
2019-07-02 18:38:17 +00:00
Simon Pilgrim 5613874947 [X86] getTargetConstantBitsFromNode - remove unnecessary getZExtValue() (PR42486)
Don't use APInt::getZExtValue() if you can avoid it - eventually someone will call it with i128 or something that doesn't fit into 64-bits.

In this case it was completely superfluous as we'd moved the rest of the code to always use APInt.

Fixes the <1 x i128> addition bug in PR42486

llvm-svn: 364953
2019-07-02 18:20:38 +00:00
Alexander Timofeev 2ce560f029 [AMDGPU] LCSSA pass added in preISel. Uniform values defined in the divergent loop and used outside
Differential Revision: https://reviews.llvm.org/D63953

Reviewers: rampitec, nhaehnle, arsenm
llvm-svn: 364950
2019-07-02 17:59:44 +00:00
Craig Topper cffbaa93b7 [X86] Add patterns to select (scalar_to_vector (loadf32)) as (V)MOVSSrm instead of COPY_TO_REGCLASS + (V)MOVSSrm_alt.
Similar for (V)MOVSD. Ultimately, I'd like to see about folding
scalar_to_vector+load to vzload. Which would select as (V)MOVSSrm
so this is closer to that.

llvm-svn: 364948
2019-07-02 17:51:02 +00:00
Roman Lebedev 059f495831 [NFC][Codegen][X86][AArch64][ARM][PowerPC] Recommit: Add test coverage for "add-of-inc" vs "sub-of-not"
I initially committed it with --check-prefix instead of --check-prefixes
(again, shame on me, and utils/update_*.py not complaining!)
and did not have a moment to understand the failure,
so i reverted it initially in rL64939.

llvm-svn: 364945
2019-07-02 16:48:49 +00:00
David Bolvansky cb1a5a705c [SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n)
Summary:
Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190



Reviewers: spatel, nikic, efriedma

Reviewed By: efriedma

Subscribers: efriedma, nikic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63038

llvm-svn: 364940
2019-07-02 15:58:45 +00:00
Roman Lebedev 893bbc9001 Revert "[NFC][Codegen][X86][AArch64][ARM][PowerPC] Add test coverage for "add-of-inc" vs "sub-of-not""
Some test failures i don't have a moment to investigate.

This reverts commit r364930.

llvm-svn: 364939
2019-07-02 15:54:24 +00:00
Matt Arsenault c3d5bbee23 AMDGPU: Fix broken test
llvm-svn: 364935
2019-07-02 15:34:40 +00:00
Matt Arsenault 50be3481d4 AMDGPU/GlobalISel: Try generated matcher with intrinsics
llvm-svn: 364933
2019-07-02 14:52:16 +00:00
Matt Arsenault a8bff4b963 AMDGPU/GlobalISel: Select mul
llvm-svn: 364932
2019-07-02 14:52:14 +00:00
Matt Arsenault dd7ca4faa5 GlobalISel: Define GINodeEquiv for G_UMULH/G_SMULH
llvm-svn: 364931
2019-07-02 14:49:29 +00:00
Roman Lebedev 39639261cc [NFC][Codegen][X86][AArch64][ARM][PowerPC] Add test coverage for "add-of-inc" vs "sub-of-not"
As it is pointed out in https://reviews.llvm.org/D63992,
before we get to pick canonical variant in middle-end
we should ensure best codegen in backend.

llvm-svn: 364930
2019-07-02 14:48:52 +00:00
Paul Robinson a5f3e278c8 Use --defsym instead of sed in a test. NFC
llvm-svn: 364929
2019-07-02 14:47:49 +00:00
Matt Arsenault 70a4d3f67c AMDGPU/GlobalISel: Fix G_GEP with mixed SGPR/VGPR operands
The register bank for the destination of the sample argument copy was
wrong. We shouldn't be constraining each source to the result register
bank. Allow constraining the original register to the right size.

llvm-svn: 364928
2019-07-02 14:40:22 +00:00
Matt Arsenault ed63399244 AMDGPU/GlobalISel: Select G_FENCE
Manually select to workaround tablegen emitter emitting checks for
G_CONSTANT.

llvm-svn: 364927
2019-07-02 14:17:38 +00:00
Matt Arsenault ce690544a6 GlobalISel: Add G_FENCE
The pattern importer is for some reason emitting checks for G_CONSTANT
for the immediate operands.

llvm-svn: 364926
2019-07-02 14:16:39 +00:00
Paul Robinson ca4e80182e Fix line endings (NFC)
llvm-svn: 364919
2019-07-02 13:13:36 +00:00
George Rimar 234f5f675e [Object/invalid.test] - Convert Object/corrupt.test to YAML and merge the result into invalid.test
Object/corrupt.test has the same purpose as Object/invalid.test:
it tests the behavior on invalid inputs.

In this patch I converted it to YAML, merged into invalid.test, 
added comments and removed a few precompiled binaries.

Differential revision: https://reviews.llvm.org/D63927

llvm-svn: 364916
2019-07-02 12:58:37 +00:00
Roman Lebedev 0bde7c6527 [InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484)
I was actually wondering if there was some nicer way than m_Value()+cast,
but apparently what i was really "subconsciously" thinking about
was correctness issue.

hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction,
not for BinaryOperator, so let's just use m_Instruction(),
thus both avoiding a cast, and a crash.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42484,
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587

llvm-svn: 364915
2019-07-02 12:54:48 +00:00
Simon Tatham bffd099d15 [ARM] MVE: allow soft-float ABI to pass vector types.
Passing a vector type over the soft-float ABI involves it being split
into four GPRs, so the first thing that has to happen at the start of
the function is to recombine those into a vector register. The ABI
types all vectors as v2f64, so we need to support BUILD_VECTOR for
that type, which I do in this patch by allowing it to be expanded in
terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in
turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a
returned vector can be marshalled back into GPRs.

While I'm here, I've also added ISD::UNDEF to the list of operations
we turn back on in `setAllExpand`, because I noticed that otherwise it
gets expanded into a BUILD_VECTOR with explicit zero inputs, leading
to pointless machine instructions to zero out a vector register that's
about to have every lane overwritten of in any case.

Reviewers: dmgreen, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63937

llvm-svn: 364910
2019-07-02 11:26:11 +00:00
Simon Tatham 7b63a9533c [ARM] Stop using scalar FP instructions in integer-only MVE mode.
If you compile with `-mattr=+mve` (enabling integer MVE instructions
but not floating-point ones), then the scalar FP //registers// exist
and it's legal to move things in and out of them, load and store them,
but it's not legal to do arithmetic on them.

In D60708, the calls to `addRegisterClass` in ARMISelLowering that
enable use of the scalar FP registers became conditionalised on
`Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so
that loads, stores and moves of those registers would work. But I
didn't realise that that would also enable all the operations on those
types by default.

Now, if the target doesn't have basic VFP, we follow up those
`addRegisterClass` calls by turning back off all the nontrivial
operations you can perform on f32 and f64. That causes several
knock-on failures, which are fixed by allowing the `VMOVDcc` and
`VMOVScc` instructions to be selected even if all you have is
`HasFPRegs`, and adjusting several checks for 'is this a double in a
single-precision-only world?' to the more general 'is this any FP type
we can't do arithmetic on?'. Between those, the whole of the
`float-ops.ll` and `fp16-instructions.ll` tests can now run in
MVE-without-FP mode and generate correct-looking code.

One odd side effect is that I had to relax the check lines in that
test so that they permit test functions like `add_f` to be generated
as tailcalls to software FP library functions, instead of ordinary
calls. Doing that is entirely legal, but the mystery is why this is
the first RUN line that's needed the relaxation: on the usual kind of
non-FP target, no tailcalls ever seem to be generated. Going by the
llc messages, I think `SoftenFloatResult` must be perturbing the code
generation in some way, but that's as much as I can guess.

Reviewers: dmgreen, ostannard

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63938

llvm-svn: 364909
2019-07-02 11:26:00 +00:00
George Rimar e400186b52 [yaml2obj] - An attempt to fix a ppc64be build bot after r364898
I guess the problem is because of endianess of
the bytes tested by "od" tool. I changed the Content
sequence as it does not actually matter.

llvm-svn: 364907
2019-07-02 11:02:09 +00:00
George Rimar eb279769d9 [test/Object] - Fix build bot.
Fixed mistype in the test case.

BB: http://lab.llvm.org:8011/builders/lld-x86_64-ubuntu-fast/builds/2720/steps/test-check-all/logs/stdio
llvm-svn: 364905
2019-07-02 10:47:13 +00:00
George Rimar 2915b3988f [Object/invalid.test] - Convert 3 more sub-tests to YAML
This allows to remove 3 more precompiled binaries from the inputs.

Differential revision: https://reviews.llvm.org/D63880

llvm-svn: 364903
2019-07-02 10:30:06 +00:00
George Rimar 9df825f429 [yaml2obj] - Allow overriding sh_offset field from the YAML.
Some of our test cases are using objects which
has sections with a broken sh_offset field.

There was no way to set it from YAML until this patch.

Differential revision: https://reviews.llvm.org/D63879

llvm-svn: 364898
2019-07-02 10:20:12 +00:00
Roman Lebedev 7928fea4a7 [NFC][InstCombine] Revisit tests for "redundant shift input masking" (PR42456)
llvm-svn: 364897
2019-07-02 10:02:25 +00:00
Roman Lebedev 377dfb0226 [NFC][InstCombine] Add tests for "redundant shift input masking" (PR42456)
https://bugs.llvm.org/show_bug.cgi?id=42456
https://rise4fun.com/Alive/Vf1p

llvm-svn: 364894
2019-07-02 09:27:34 +00:00
Amara Emerson 000ef2c2ae [TailDuplicator] Fix copy instruction emitting into the wrong block.
The code for duplicating instructions could sometimes try to emit copies
intended to deal with unconstrainable register classes to the tail block of the
original instruction, rather than before the newly cloned instruction in the
predecessor block.

This was exposed by GlobalISel on arm64.

Differential Revision: https://reviews.llvm.org/D64049

llvm-svn: 364888
2019-07-02 06:04:46 +00:00
QingShan Zhang 7fdb3a293b [PowerPC] Implement the areMemAccessesTriviallyDisjoint hook
After implemented this hook, we will model the memory dependency in the scheduling dependency graph more precise,
and will have more opportunity to reorder the load/stores, as they didn't have the dependency at some condition

Differential Revision: https://reviews.llvm.org/D63804

llvm-svn: 364886
2019-07-02 03:28:52 +00:00
Zi Xuan Wu 7ae536a1ce [DAGCombiner] Exploiting more about the transformation of TransformFPLoadStorePair function
For a given floating point load / store pair, if the load value isn't used by any other operations, 
then consider transforming the pair to integer load / store operations if the target deems the transformation profitable.

And we can exploiting much more when there are other operation nodes with chain operand between the load/store pair 
so long as we keep the chain ordering original. We only replace the register used to load/store from float to integer.

I only add testcase in ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled in ARM target.

Differential Revision: https://reviews.llvm.org/D60601

llvm-svn: 364883
2019-07-02 02:54:52 +00:00
Jordan Rupprecht 351b7e7b24 Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element
This reverts r364557 (git commit 9f7f5858fe)

This crashes as reported on the commit thread. Repro instructions TBD.

llvm-svn: 364876
2019-07-01 23:29:46 +00:00
Reid Kleckner d72163947a [PGO] Update ICP pass for recent byval type changes
Fixes verifier errors encountered in PR42413.

Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D63842

llvm-svn: 364861
2019-07-01 22:43:39 +00:00
Matt Arsenault 40c08052a5 AMDGPU: Correct properties for adjcallstack* pseudos
These should be SALU writes, and these are lowered to instructions
that def SCC.

llvm-svn: 364859
2019-07-01 22:01:05 +00:00
Huihui Zhang 8e1051b3a0 [InstCombine][NFCI] Update test cases in onehot_merge.ll
Use both one bit and signbit shifting to check for one bit merge.

Reviewers: lebedev.ri, spatel, efriedma, craig.topper

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63903

llvm-svn: 364857
2019-07-01 22:00:32 +00:00
Sanjay Patel ddc1b40f26 [InstCombine] reduce more checks for power-of-2-or-zero using ctpop
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).

llvm-svn: 364856
2019-07-01 22:00:00 +00:00
Jordan Rupprecht a7972dc04a Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364478 (git commit 574cb0eb3a)

The patch is causing compilation timeouts.

llvm-svn: 364846
2019-07-01 21:10:43 +00:00
Roman Lebedev 975120a21b [NFC][InstCombine] More commutative tests for "shift direction in bittest" (PR42466)
'and' is commutative, if we don't want to touch shift-of-const,
we still need to check the other hand of 'and'.

llvm-svn: 364844
2019-07-01 20:33:56 +00:00
Matt Arsenault c9f14f29f5 GlobalISel: Try to widen merges with other merges
If the requested source type an be used as a merge source type, create
a merge of merges. This avoids creating large, illegal extensions and
bit-ops directly to the result type.

llvm-svn: 364841
2019-07-01 19:36:10 +00:00
Matt Arsenault 73dec22c3e AMDGPU: Revert accidental change to test
llvm-svn: 364839
2019-07-01 19:09:57 +00:00
Craig Topper 5e7815b695 [X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patterns
These instructions only read 64-bits of memory so we shouldn't
allow a full vector width load to be pattern matched in case it
is marked volatile.

Instead allow vzload or scalar_to_vector+load.

Also add a DAG combine to turn full vector loads into vzload when
used by one of these instructions if the load isn't volatile.

This fixes another case for PR42079

llvm-svn: 364838
2019-07-01 19:01:37 +00:00
Matt Arsenault bae3636f96 AMDGPU/GlobalISel: Handle more input argument intrinsics
llvm-svn: 364836
2019-07-01 18:50:50 +00:00
Matt Arsenault 9e8e8c60fa AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics
llvm-svn: 364835
2019-07-01 18:49:01 +00:00
Matt Arsenault 756d81905f AMDGPU/GlobalISel: Legalize workgroup ID intrinsics
llvm-svn: 364834
2019-07-01 18:47:22 +00:00
Matt Arsenault e2c86cce3a AMDGPU/GlobalISel: Legalize workitem ID intrinsics
Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.

Test is copied directly from the existing test, with 2 additions.

llvm-svn: 364833
2019-07-01 18:45:36 +00:00
Matt Arsenault e15770aec4 AMDGPU/GlobalISel: Custom lower control flow intrinsics
Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.

llvm-svn: 364832
2019-07-01 18:40:23 +00:00
Matt Arsenault 4073b33786 AMDGPU/GlobalISel: Handle 16-bit SALU min/max
This needs to be extended to s32, and expanded into cmp+select.  This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.

llvm-svn: 364831
2019-07-01 18:33:37 +00:00
Matt Arsenault 5a7d5111e5 AMDGPU/GlobalISel: Lower SALU min/max to cmp+select
Use a change observer to apply a register bank to the newly created
intermediate result register.

llvm-svn: 364830
2019-07-01 18:30:45 +00:00
Robert Lougher e20030f612 [X86] Avoid SFB - Fix inconsistent codegen with/without debug info(2)
The function findPotentialBlockers may consider debug info instructions as
potential blockers and may stop searching for a store-load pair prematurely.

This patch corrects this and tests the cases where the store is separated
from the load by more than InspectionLimit debug instructions.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D62408

llvm-svn: 364829
2019-07-01 18:28:21 +00:00
Matt Arsenault 7f8c720939 AMDGPU/GlobalISel: Add tests for add legalization
llvm-svn: 364828
2019-07-01 18:26:47 +00:00
Matt Arsenault ef59cb6982 AMDGPU/GlobalISel: Legalize s16 add/sub/mul
If this is scalar, promote to s32. Use a new observer class to assign
the register bank of newly created registers.

llvm-svn: 364827
2019-07-01 18:18:55 +00:00
Matt Arsenault 9470bb262b AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT
The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.

Currently greedy unnecessarily forces using a VCC select.

llvm-svn: 364825
2019-07-01 18:13:12 +00:00
Roman Lebedev e62857786f [NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)
https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

llvm-svn: 364824
2019-07-01 18:11:32 +00:00
Matt Arsenault 03ca176ab3 GlobalISel: Verify G_MERGE_VALUES operand sizes
llvm-svn: 364822
2019-07-01 18:01:35 +00:00
Matt Arsenault b2ea20eedd AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt
llvm-svn: 364819
2019-07-01 17:40:18 +00:00
Matt Arsenault 40d1faf38f AMDGPU/GlobalISel: Legalize s16 fcmp
llvm-svn: 364817
2019-07-01 17:35:53 +00:00
Nicolai Haehnle 10c911db63 AMDGPU/GFX10: implement ds_ordered_count changes
Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.

Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63716

llvm-svn: 364815
2019-07-01 17:17:52 +00:00
Nicolai Haehnle 4dc3b2bf95 AMDGPU: Support GDS atomics
Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63452

llvm-svn: 364814
2019-07-01 17:17:45 +00:00
Matt Arsenault 1094e6a814 AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap
llvm-svn: 364811
2019-07-01 17:04:57 +00:00
Matt Arsenault 265059eaf6 AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane
llvm-svn: 364808
2019-07-01 16:41:36 +00:00
Matt Arsenault 0a52e9d026 AMDGPU/GlobalISel: Complete implementation of G_GEP
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

llvm-svn: 364806
2019-07-01 16:34:48 +00:00
Matt Arsenault e1006259d8 AMDGPU/GlobalISel: Select G_PHI
llvm-svn: 364805
2019-07-01 16:32:47 +00:00
Matt Arsenault d810ff2588 AMDGPU/GlobalISel: Try to select VOP3 form of add
There are several things broken, but at least emit the right thing for
gfx9.

The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.

llvm-svn: 364804
2019-07-01 16:27:32 +00:00
Matt Arsenault 62d64b0c30 AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane
llvm-svn: 364801
2019-07-01 16:19:39 +00:00
Tom Stellard 9e9dd30de3 AMDGPU/GlobalISel: Implement select for 32-bit G_ADD
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

llvm-svn: 364797
2019-07-01 16:09:33 +00:00
Matt Arsenault 2ab25f9ceb AMDGPU/GlobalISel: Select G_BRCOND for vcc
llvm-svn: 364795
2019-07-01 16:06:02 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Roman Lebedev 72b8d41ce8 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

llvm-svn: 364791
2019-07-01 15:55:15 +00:00
Krzysztof Parzyszek 5abf80cdfa [Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)
llvm-svn: 364790
2019-07-01 15:50:09 +00:00
Matt Arsenault cda82f0bb6 AMDGPU/GlobalISel: Select G_FRAME_INDEX
llvm-svn: 364789
2019-07-01 15:48:18 +00:00
Nicolai Haehnle 7cfd99ab15 AMDGPU/GFX10: fix scratch resource descriptor
Summary:
The stride should depend on the wave size, not the hardware generation.

Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.

Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6

Reviewers: arsenm, rampitec, mareko

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63808

llvm-svn: 364788
2019-07-01 15:43:00 +00:00
Matt Arsenault fdf36729c7 AMDGPU/GlobalISel: Make s16 select legal
This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

llvm-svn: 364787
2019-07-01 15:42:47 +00:00
Matt Arsenault 6464280eb0 AMDGPU/GlobalISel: Select G_BRCOND for scc conditions
llvm-svn: 364786
2019-07-01 15:39:27 +00:00
Matt Arsenault 1daad91af6 AMDGPU/GlobalISel: Tolerate copies with no type set
isVCC has the same bug, but isn't used in a context where it can cause
a problem.

llvm-svn: 364784
2019-07-01 15:23:04 +00:00
Matt Arsenault fb99fc7a68 AMDGPU: Fix tests using the default alloca address space
llvm-svn: 364783
2019-07-01 15:23:03 +00:00
Matt Arsenault 4f64ade04c AMDGPU/GlobalISel: Select src modifiers
llvm-svn: 364782
2019-07-01 15:18:56 +00:00
Jinsong Ji ee6539341b [UpdateTestChecks][PowerPC] Avoid empty string when scrubbing loop comments
Summary:
SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285
This works for some loops.

However, we may generate lines with loop comments only.
And since we don't scrub leading white spaces, this will leave an empty
line there, and FileCheck will complain it.

eg: llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15:
error: found empty check string with prefix 'CHECK:'
; CHECK-NEXT:

This prevented us from using the `update_llc_test_checks.py` for quite some cases.

We should still keep the comment token there, so that we can safely
scrub the loop comment without breaking FileCheck.

Reviewers: timshen, hfinkel, lebedev.ri, RKSimon

Subscribers: nemanjai, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63957

llvm-svn: 364775
2019-07-01 14:37:48 +00:00
Roman Lebedev 34a0b16e29 [NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.
As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

llvm-svn: 364772
2019-07-01 14:28:24 +00:00
Krzysztof Parzyszek 511ad50db4 [Hexagon] Rework VLCR algorithm
Add code to catch pattern for commutative instructions for VLCR.

Patch by Suyog Sarda.

llvm-svn: 364770
2019-07-01 13:50:47 +00:00
Matt Arsenault 5bf850d52e AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE
llvm-svn: 364768
2019-07-01 13:40:18 +00:00
Matt Arsenault b5fc94f3e7 AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR
llvm-svn: 364767
2019-07-01 13:40:17 +00:00
Matt Arsenault 89fc8bcdd6 AMDGPU/GlobalISel: Fail on store to 32-bit address space
llvm-svn: 364766
2019-07-01 13:37:39 +00:00
Matt Arsenault 3b7668ae4b AMDGPU/GlobalISel: Improve icmp selection coverage.
Select s64 eq/ne scalar icmp.

llvm-svn: 364765
2019-07-01 13:34:26 +00:00
Roman Lebedev 9f3645869c [NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold fold (PR42459)
So we indeed to have this fold, but only if +1 is not the last operation..

llvm-svn: 364764
2019-07-01 13:31:06 +00:00
Matt Arsenault c23149f612 AMDGPU/GlobalISel: RegBankSelect for WWM/WQM
llvm-svn: 364763
2019-07-01 13:30:12 +00:00
Matt Arsenault facf69e844 AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote
llvm-svn: 364762
2019-07-01 13:30:09 +00:00
Matt Arsenault 9f992c238a AMDGPU/GlobalISel: Fix scc->vcc copy handling
This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.

Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.

llvm-svn: 364761
2019-07-01 13:22:07 +00:00
Matt Arsenault 5dafcb9b11 AMDGPU/GlobalISel: Use and instead of BFE with inline immediate
Zext from s1 is the only case where this should do anything with the
current legal extensions.

llvm-svn: 364760
2019-07-01 13:22:06 +00:00
Matt Arsenault 01bb075c1f GlobalISel: Add GINodeEquiv for min/max
llvm-svn: 364759
2019-07-01 13:22:04 +00:00
Matt Arsenault fbf67d88de GlobalISel: Add DAG compat for G_FCANONICALIZE
llvm-svn: 364758
2019-07-01 13:22:00 +00:00