Mac Catalyst is a new MachO platform in macOS Catalina.
It always uses the build_version MachO load command.
Differential Revision: https://reviews.llvm.org/D64107
llvm-svn: 364981
When adding summary entries for index-based WPD (r364960), an added
test also included some additional testing of the existing hybrid
Thin/Regular LTO WPD (test/ThinLTO/X86/devirt.ll). That part of the
test is producing a failure on the llvm-clang-x86_64-expensive-checks-win
bot:
*** Bad machine code: Explicit definition marked as use ***
- function: __typeid__ZTS1A_0_branch_funnel
- basic block: %bb.0 (0x81d4c58)
- instruction: ICALL_BRANCH_FUNNEL %0:gr64, @0, 16, @_ZN1B1fEi, 48, @_ZN1C1fEi
- operand 0: %0:gr64
LLVM ERROR: Found 1 machine code errors.
This is functionality unrelated to the summary entries added with my
patch, so I am disabling this part of the new test until it is
addressed. I'll continue to investigate the failure.
llvm-svn: 364978
There were two issues here: one, some of the relevant instructions were
missing the expected "FrameSetup" flag, and two,
ARMAsmPrinter::EmitUnwindingInstruction wasn't expecting "mov"
instructions in the prologue.
I'm sticking the additional state into ARMFunctionInfo so it's obvious
it only applies to the current function.
I considered a few alternative approaches where we would compute the
correct unwind information as part of the prologue/epilogue lowering,
but it seems like a lot of work to introduce pseudo-instructions, and
the current code seems to be reliable enough.
Fixes https://bugs.llvm.org/show_bug.cgi?id=42408.
Differential Revision: https://reviews.llvm.org/D63964
llvm-svn: 364970
The recent change to the BitStream reader error handling in r364464
changed the error message format (from "LLVM ERROR:" to just "error"),
leading to a failure in this test which is only executed for very recent
versions of gold. Fix this by removing that part of the error message
check, leaving only the interesting part of the message to be checked.
llvm-svn: 364965
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).
Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk
Reviewed By: RKSimon, dtemirbulatov
Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60897
llvm-svn: 364964
This teaches `tryOptSelect` to handle folding G_ICMP, and removes the
requirement that the G_SELECT we're dealing with is floating point.
Some refactoring to make this work nicely as well:
- Factor out the scalar case from the selection code for G_ICMP into
`emitIntegerCompare`.
- Make `tryOptCMN` return a MachineInstr* instead of a bool.
- Make `tryOptCMN` not modify the instruction being selected.
- Factor out the CMN emission into `emitCMN` for readability.
By doing this this way, we can get all of the compare selection optimizations
in select emission.
Differential Revision: https://reviews.llvm.org/D64084
llvm-svn: 364961
Summary:
If LTOUnit splitting is disabled, the module summary analysis computes
the summary information necessary to perform single implementation
devirtualization during the thin link with the index and no IR. The
information collected from the regular LTO IR in the current hybrid WPD
algorithm is summarized, including:
1) For vtable definitions, record the function pointers and their offset
within the vtable initializer (subsumes the information collected from
IR by tryFindVirtualCallTargets).
2) A record for each type metadata summarizing the vtable definitions
decorated with that metadata (subsumes the TypeIdentiferMap collected
from IR).
Also added are the necessary bitcode records, and the corresponding
assembly support.
The follow-on index-based WPD patch is D55153.
Depends on D53890.
Reviewers: pcc
Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D54815
llvm-svn: 364960
Ordinarily it is lowered as a build_vector of each extract_vector_elt,
which in turn get lowered to bitcasts and bit shifts. Very little
understand the lowered extract pattern, resulting in much worse
code. We treat concat_vectors of v2i16 as legal, so prefer that.
llvm-svn: 364959
Don't use APInt::getZExtValue() if you can avoid it - eventually someone will call it with i128 or something that doesn't fit into 64-bits.
In this case it was completely superfluous as we'd moved the rest of the code to always use APInt.
Fixes the <1 x i128> addition bug in PR42486
llvm-svn: 364953
Similar for (V)MOVSD. Ultimately, I'd like to see about folding
scalar_to_vector+load to vzload. Which would select as (V)MOVSSrm
so this is closer to that.
llvm-svn: 364948
I initially committed it with --check-prefix instead of --check-prefixes
(again, shame on me, and utils/update_*.py not complaining!)
and did not have a moment to understand the failure,
so i reverted it initially in rL64939.
llvm-svn: 364945
As it is pointed out in https://reviews.llvm.org/D63992,
before we get to pick canonical variant in middle-end
we should ensure best codegen in backend.
llvm-svn: 364930
The register bank for the destination of the sample argument copy was
wrong. We shouldn't be constraining each source to the result register
bank. Allow constraining the original register to the right size.
llvm-svn: 364928
Object/corrupt.test has the same purpose as Object/invalid.test:
it tests the behavior on invalid inputs.
In this patch I converted it to YAML, merged into invalid.test,
added comments and removed a few precompiled binaries.
Differential revision: https://reviews.llvm.org/D63927
llvm-svn: 364916
I was actually wondering if there was some nicer way than m_Value()+cast,
but apparently what i was really "subconsciously" thinking about
was correctness issue.
hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction,
not for BinaryOperator, so let's just use m_Instruction(),
thus both avoiding a cast, and a crash.
Fixes https://bugs.llvm.org/show_bug.cgi?id=42484,
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587
llvm-svn: 364915
Passing a vector type over the soft-float ABI involves it being split
into four GPRs, so the first thing that has to happen at the start of
the function is to recombine those into a vector register. The ABI
types all vectors as v2f64, so we need to support BUILD_VECTOR for
that type, which I do in this patch by allowing it to be expanded in
terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in
turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a
returned vector can be marshalled back into GPRs.
While I'm here, I've also added ISD::UNDEF to the list of operations
we turn back on in `setAllExpand`, because I noticed that otherwise it
gets expanded into a BUILD_VECTOR with explicit zero inputs, leading
to pointless machine instructions to zero out a vector register that's
about to have every lane overwritten of in any case.
Reviewers: dmgreen, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63937
llvm-svn: 364910
If you compile with `-mattr=+mve` (enabling integer MVE instructions
but not floating-point ones), then the scalar FP //registers// exist
and it's legal to move things in and out of them, load and store them,
but it's not legal to do arithmetic on them.
In D60708, the calls to `addRegisterClass` in ARMISelLowering that
enable use of the scalar FP registers became conditionalised on
`Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so
that loads, stores and moves of those registers would work. But I
didn't realise that that would also enable all the operations on those
types by default.
Now, if the target doesn't have basic VFP, we follow up those
`addRegisterClass` calls by turning back off all the nontrivial
operations you can perform on f32 and f64. That causes several
knock-on failures, which are fixed by allowing the `VMOVDcc` and
`VMOVScc` instructions to be selected even if all you have is
`HasFPRegs`, and adjusting several checks for 'is this a double in a
single-precision-only world?' to the more general 'is this any FP type
we can't do arithmetic on?'. Between those, the whole of the
`float-ops.ll` and `fp16-instructions.ll` tests can now run in
MVE-without-FP mode and generate correct-looking code.
One odd side effect is that I had to relax the check lines in that
test so that they permit test functions like `add_f` to be generated
as tailcalls to software FP library functions, instead of ordinary
calls. Doing that is entirely legal, but the mystery is why this is
the first RUN line that's needed the relaxation: on the usual kind of
non-FP target, no tailcalls ever seem to be generated. Going by the
llc messages, I think `SoftenFloatResult` must be perturbing the code
generation in some way, but that's as much as I can guess.
Reviewers: dmgreen, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63938
llvm-svn: 364909
I guess the problem is because of endianess of
the bytes tested by "od" tool. I changed the Content
sequence as it does not actually matter.
llvm-svn: 364907
Some of our test cases are using objects which
has sections with a broken sh_offset field.
There was no way to set it from YAML until this patch.
Differential revision: https://reviews.llvm.org/D63879
llvm-svn: 364898
The code for duplicating instructions could sometimes try to emit copies
intended to deal with unconstrainable register classes to the tail block of the
original instruction, rather than before the newly cloned instruction in the
predecessor block.
This was exposed by GlobalISel on arm64.
Differential Revision: https://reviews.llvm.org/D64049
llvm-svn: 364888
After implemented this hook, we will model the memory dependency in the scheduling dependency graph more precise,
and will have more opportunity to reorder the load/stores, as they didn't have the dependency at some condition
Differential Revision: https://reviews.llvm.org/D63804
llvm-svn: 364886
For a given floating point load / store pair, if the load value isn't used by any other operations,
then consider transforming the pair to integer load / store operations if the target deems the transformation profitable.
And we can exploiting much more when there are other operation nodes with chain operand between the load/store pair
so long as we keep the chain ordering original. We only replace the register used to load/store from float to integer.
I only add testcase in ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled in ARM target.
Differential Revision: https://reviews.llvm.org/D60601
llvm-svn: 364883
Use both one bit and signbit shifting to check for one bit merge.
Reviewers: lebedev.ri, spatel, efriedma, craig.topper
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63903
llvm-svn: 364857
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).
llvm-svn: 364856
If the requested source type an be used as a merge source type, create
a merge of merges. This avoids creating large, illegal extensions and
bit-ops directly to the result type.
llvm-svn: 364841
These instructions only read 64-bits of memory so we shouldn't
allow a full vector width load to be pattern matched in case it
is marked volatile.
Instead allow vzload or scalar_to_vector+load.
Also add a DAG combine to turn full vector loads into vzload when
used by one of these instructions if the load isn't volatile.
This fixes another case for PR42079
llvm-svn: 364838
Tests don't cover the masked input path since non-kernel arguments
aren't lowered yet.
Test is copied directly from the existing test, with 2 additions.
llvm-svn: 364833
Replace the brcond for the 2 cases that act as branches. For now
follow how the current system works, although I think we can
eventually get rid of the pseudos.
llvm-svn: 364832
This needs to be extended to s32, and expanded into cmp+select. This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.
llvm-svn: 364831
The function findPotentialBlockers may consider debug info instructions as
potential blockers and may stop searching for a store-load pair prematurely.
This patch corrects this and tests the cases where the store is separated
from the load by more than InspectionLimit debug instructions.
Patch by Chris Dawson.
Differential Revision: https://reviews.llvm.org/D62408
llvm-svn: 364829
The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.
Currently greedy unnecessarily forces using a VCC select.
llvm-svn: 364825
Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.
Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276
Reviewers: mareko, arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63716
llvm-svn: 364815
Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.
llvm-svn: 364806
There are several things broken, but at least emit the right thing for
gfx9.
The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.
llvm-svn: 364804
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.
This does address the regression being exposed in D63992.
https://godbolt.org/z/7DGltUhttps://rise4fun.com/Alive/EPO0
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]
Reviewers: spatel, nikic, huihuiz
Reviewed By: spatel
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63993
llvm-svn: 364792
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)`
It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.
We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v
Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].
Reviewers: spatel, nikic, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63829
llvm-svn: 364791
Summary:
The stride should depend on the wave size, not the hardware generation.
Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.
Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6
Reviewers: arsenm, rampitec, mareko
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63808
llvm-svn: 364788
Summary:
SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285
This works for some loops.
However, we may generate lines with loop comments only.
And since we don't scrub leading white spaces, this will leave an empty
line there, and FileCheck will complain it.
eg: llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15:
error: found empty check string with prefix 'CHECK:'
; CHECK-NEXT:
This prevented us from using the `update_llc_test_checks.py` for quite some cases.
We should still keep the comment token there, so that we can safely
scrub the loop comment without breaking FileCheck.
Reviewers: timshen, hfinkel, lebedev.ri, RKSimon
Subscribers: nemanjai, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63957
llvm-svn: 364775
As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.
Also, there should likely be a *separate* fold to do this reordering.
llvm-svn: 364772
This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.
Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.
llvm-svn: 364761