Commit Graph

964 Commits

Author SHA1 Message Date
Jessica Paquette 727328ab63 [AArch64][GlobalISel] Tail call memory intrinsics
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.

This adds simple tail calling support for when the intrinsic is followed by a
return.

It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.

Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.

Update legalize-memcpy-et-al.mir to have a case where we don't tail call.

Differential Revision: https://reviews.llvm.org/D67566

llvm-svn: 371893
2019-09-13 20:25:58 +00:00
Matt Arsenault 4d33918034 AMDGPU/GlobalISel: Legalize G_FMAD
Unlike SelectionDAG, treat this as a normally legalizable operation.
In SelectionDAG this is supposed to only ever formed if it's legal,
but I've found that to be restricting. For AMDGPU this is contextually
legal depending on whether denormal flushing is allowed in the use
function.

Technically we currently treat the denormal mode as a subtarget
feature, so custom lowering could be avoided. However I consider this
to be a defect, and this should be contextually dependent on the
controllable rounding mode of the parent function.

llvm-svn: 371800
2019-09-13 00:44:35 +00:00
Jessica Paquette a42070a6aa [AArch64][GlobalISel] Support sibling calls with outgoing arguments
This adds support for lowering sibling calls with outgoing arguments.

e.g

```
define void @foo(i32 %a)
```

Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`.
The only thing that is missing is a full port of
`TargetLowering::parametersInCSRMatch`. So, if we're using swiftself,
we'll never tail call.

- Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used
  for both outgoing and incoming arguments
- Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for
  stack arguments.
- Teach `lowerFormalArguments` to set the bytes in the caller's stack argument
  area. This is used later to check if the tail call's parameters will fit on
  the caller's stack.
- Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on
  the callee's outgoing arguments.

For testing:

- Update call-translator-tail-call to verify that we can now tail call with
  outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the
  size of the caller's stack
- Remove GISel-specific check lines from speculation-hardening.ll, since GISel
  now tail calls like the other selectors
- Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that
  test now
- Add a GISel test line to tailcall_misched_graph.ll since we tail call there
  now. Add specific check lines for GISel, since the debug output from the
  machine-scheduler differs with GlobalISel. The dependency still holds, but
  the output comes out in a different order.

Differential Revision: https://reviews.llvm.org/D67471

llvm-svn: 371780
2019-09-12 22:10:36 +00:00
Amara Emerson 55d86f04c7 [AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack.
First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually
make a difference for us in CallLowering but fix that anyway to silence the assert.

The bigger issue was that after fixing the assert we were generating invalid MIR
because the merging/unmerging of values split across multiple registers wasn't
also implemented for memory locs. This happens when we run out of registers and
have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but
for now just fall back.

llvm-svn: 371693
2019-09-11 23:53:23 +00:00
Jessica Paquette 469d42fcf6 [GlobalISel] When a tail call is emitted in a block, stop translating it
This fixes a crash in tail call translation caused by assume and lifetime_end
intrinsics.

It's possible to have instructions other than a return after a tail call which
will still have `Analysis::isInTailCallPosition` return true. (Namely,
lifetime_end and assume intrinsics.)

If we emit a tail call, we should stop translating instructions in the block.
Otherwise, we can end up emitting an extra return, or dead instructions in
general. This makes the verifier unhappy, and is generally unfortunate for
codegen.

This also removes the code from AArch64CallLowering that checks if we have a
tail call when lowering a return. This is covered by the new code now.

Also update call-translator-tail-call.ll to show that we now properly tail call
in the presence of lifetime_end and assume.

Differential Revision: https://reviews.llvm.org/D67415

llvm-svn: 371572
2019-09-10 23:34:45 +00:00
Jessica Paquette 2af5b193d5 [AArch64][GlobalISel] Support sibling calls with mismatched calling conventions
Add support for sibcalling calls whose calling convention differs from the
caller's.

- Port over `CCState::resultsCombatible` from CallingConvLower.cpp into
  CallLowering. This is used to verify that the way the caller and callee CC
  handle incoming arguments matches up.

- Add `CallLowering::analyzeCallResult`. This is basically a port of
  `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`.

- Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks
  that the calling conventions are compatible, and that the caller and callee
  preserve the same registers.

For testing:

- Update call-translator-tail-call.ll to show that we can now handle this.

- Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call
  when the regmasks don't line up.

Differential Revision: https://reviews.llvm.org/D67361

llvm-svn: 371570
2019-09-10 23:25:12 +00:00
Aditya Nandakumar 5112b71126 [GlobalISel]: Fix a bug where we could dereference None
getConstantVRegVal returns None when dealing with constants > 64 bits.
Don't assume we always have a value in GISelKnownBits.

llvm-svn: 371465
2019-09-09 22:51:41 +00:00
Tim Northover 06d93e0a25 GlobalISel: fix unused warnings in release builds.
llvm-svn: 371385
2019-09-09 10:36:58 +00:00
Tim Northover 36147adc0b GlobalISel: add combiner to form indexed loads.
Loosely based on DAGCombiner version, but this part is slightly simpler in
GlobalIsel because all address calculation is performed by G_GEP. That makes
the inc/dec distinction moot so there's just pre/post to think about.

No targets can handle it yet so testing is via a special flag that overrides
target hooks.

llvm-svn: 371384
2019-09-09 10:04:23 +00:00
Matt Arsenault cf10372119 GlobalISel: Add G_FMAD instruction
llvm-svn: 371254
2019-09-06 20:49:10 +00:00
Daniel Sanders f803237926 [globalisel][knownbits] Account for missing type constraints
Now that we look through copies, it's possible to visit registers that
have a register class constraint but not a type constraint. Avoid looking
through copies when this occurs as the SrcReg won't be able to determine
it's bit width or any known bits.

Along the same lines, if the initial query is on a register that doesn't
have a type constraint then the result is a default-constructed KnownBits,
that is, a 1-bit fully-unknown value.

llvm-svn: 371116
2019-09-05 20:26:02 +00:00
Jessica Paquette 20e8667098 Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls"
Recommit basic sibling call lowering (https://reviews.llvm.org/D67189)

The issue was that if you have a return type other than void, call lowering
will emit COPYs to get the return value after the call.

Disallow sibling calls other than ones that return void for now. Also
proactively disable swifterror tail calls for now, since there's a similar issue
with COPYs there.

Update call-translator-tail-call.ll to include test cases for each of these
things.

llvm-svn: 371114
2019-09-05 20:18:34 +00:00
Simon Pilgrim 071287c5a9 Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189
........
This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll

llvm-svn: 371051
2019-09-05 10:38:39 +00:00
Jessica Paquette b78324fc40 [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189

llvm-svn: 370996
2019-09-04 22:54:52 +00:00
Matt Arsenault 5ff310e298 GlobalISel: Add basic legalization for G_BITREVERSE
llvm-svn: 370979
2019-09-04 20:46:15 +00:00
Daniel Sanders b276a9a51e [globalisel] Support trivial COPY in GISelKnownBits
Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY

Reviewers: aditya_nandakumar

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67131

llvm-svn: 370955
2019-09-04 18:59:43 +00:00
Matt Arsenault 70becc20fa GlobalISel: Add G_BITREVERSE
This is the first failing pattern for AMDGPU and is trivial to handle.

llvm-svn: 370927
2019-09-04 17:06:53 +00:00
Amara Emerson 5d5150f0b4 [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination.
Similar to the issue with G_ZEXT that was fixed earlier, this is a quick
to fall back if the source type is not exactly half of the dest type.

Fixes the clang-cmake-aarch64-lld bot build.

llvm-svn: 370847
2019-09-04 07:58:45 +00:00
Amara Emerson 2a2c25ba48 [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.

Differential Revision: https://reviews.llvm.org/D66185

llvm-svn: 370823
2019-09-03 21:42:32 +00:00
Amara Emerson fbaf425b79 [GlobalISel][CallLowering] Add support for splitting types according to calling conventions.
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.

Support for splitting for return types not yet implemented.

Differential Revision: https://reviews.llvm.org/D66180

llvm-svn: 370822
2019-09-03 21:42:28 +00:00
Amara Emerson 453ef4e376 [AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating
the merges.

Fixes PR43171.

llvm-svn: 370627
2019-09-02 08:18:55 +00:00
Matt Arsenault 466ec2d552 GlobalISel: Fix missing pass dependency
llvm-svn: 370496
2019-08-30 17:41:58 +00:00
Petar Avramovic 6412b56513 [MIPS GlobalISel] Lower fptoui
Add lower for G_FPTOUI. Algorithm is similar to the SDAG version
in TargetLowering::expandFP_TO_UINT.
Lower G_FPTOUI for MIPS32.

Differential Revision: https://reviews.llvm.org/D66929

llvm-svn: 370431
2019-08-30 05:44:02 +00:00
Matt Arsenault 093ebf9275 GlobalISel: Don't compute known bits for non-integral GEP
llvm-svn: 370392
2019-08-29 17:55:05 +00:00
Matt Arsenault b2b9a23758 GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits
I dropped the DemandedElts since it seems to be missing from some of
the new interfaces, but not others.

llvm-svn: 370389
2019-08-29 17:24:36 +00:00
Matt Arsenault caff0a88dd GlobalISel: Add known bits to InstructionSelector
AMDGPU uses this for some addressing mode selection patterns. The
analysis run itself doesn't do anything so it seems easier to just
always require this than adding a way to opt in.

llvm-svn: 370388
2019-08-29 17:24:32 +00:00
Jessica Paquette af0bd41e06 [AArch64][GlobalISel] Fall back when translating musttail calls
These are currently translated as normal functions calls in AArch64.

Until we have proper tail call lowering, we shouldn't translate these.

Differential Revision: https://reviews.llvm.org/D66842

llvm-svn: 370225
2019-08-28 16:19:01 +00:00
Amara Emerson e20b91c265 [GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.
This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.

Differential Revision: https://reviews.llvm.org/D66678

llvm-svn: 370104
2019-08-27 19:54:27 +00:00
Petar Avramovic a393238422 [GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFC
Main difference is in the way Hi for Long shift (HiL) is made.
G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value.

Differential Revision: https://reviews.llvm.org/D66589

llvm-svn: 370064
2019-08-27 14:33:05 +00:00
Petar Avramovic d568ed40e0 [GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG
Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS
to match names of surrounding variables.

Differential Revision: https://reviews.llvm.org/D66587

llvm-svn: 370062
2019-08-27 14:22:32 +00:00
Volkan Keles 277631e3b8 [GlobalISel] Legalizer: Retry combining illegal artifacts as long as there new artifacts
Summary:
Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s
possible to combine them after processing the rest of the instruction because
the legalization is likely to generate more artifacts that allow ArtifactCombiner
to combine away them.

Instead, move illegal artifacts to another list called RetryList and wait until all of the
instruction in InstList are legalized. After that, check if there is any new artifacts and
try to combine them again if that’s the case. If not, abort. The idea is similar to D59339,
but the approach is a bit different.

This patch fixes the issue described above, but the legalizer still may be unable to handle
some cases depending on when to legalize artifacts. So, in the long run, we probably need
a different legalization strategy that handles this dependency in a better way.

Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette

Reviewed By: dsanders

Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65894

llvm-svn: 369805
2019-08-23 20:30:35 +00:00
Matt Arsenault fba82858f2 GlobalISel: Don't create G_UADDE with constant false carry in
The x86 tests are now broken (in paticular add-scalar.ll now hits the
DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a
custom lowering for this, so that will need to be implemented.

llvm-svn: 369673
2019-08-22 17:29:17 +00:00
Matt Arsenault 954a012b4c GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources
This is necessary for handling <3 x s16> on AMDGPU, assuming this
should be handled as 2 separate legalization actions. The alternative
would be for fewerElementsVector to handle 3->2.

llvm-svn: 369547
2019-08-21 16:59:10 +00:00
Petar Avramovic 5b4c5c2c54 [MIPS GlobalISel] NarrowScalar G_TRUNC
Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source.
NarrowScalar G_TRUNC to s32 for MIPS32.

Differential Revision: https://reviews.llvm.org/D66202

llvm-svn: 369509
2019-08-21 09:26:39 +00:00
Amara Emerson 56606a4db3 [AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT
We do this by merging the source with the high bits set to 0.

Differential Revision: https://reviews.llvm.org/D66181

llvm-svn: 369480
2019-08-21 00:12:37 +00:00
Aditya Nandakumar 08bd080872 [GlobalISel] Handle multiple registers in dbg.value intrinsic
https://reviews.llvm.org/D66077

The value passed into dbg.value may relate to multiple registers,
each of which need a DBG_VALUE.

This fix calls MIRBuilder.buildDirectDbgValue for each register.

Without this, IR passed in from flang-compiler/flang may fail an
assertion in getOrCreateVReg.

Patch by : peterwaller-arm.

llvm-svn: 369403
2019-08-20 16:28:37 +00:00
Amara Emerson c809230a69 [AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask.
Again, it's weird that these are allowed. Since lowering support was added in
r368709 we started crashing on compiling the neon intrinsics test in the test
suite. This fixes the lowering to fold the 1 elt src/mask case into copies.

llvm-svn: 369135
2019-08-16 18:06:53 +00:00
Volkan Keles 0ae6006bee [GlobalISel] CSEMIRBuilder: Add support for G_GEP
Summary:
This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors
`translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types.

Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson

Reviewed By: aditya_nandakumar

Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66316

llvm-svn: 369070
2019-08-15 23:45:45 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Aditya Nandakumar c65ac865c3 [GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong source index
https://reviews.llvm.org/D66182

llvm-svn: 368781
2019-08-14 01:23:33 +00:00
Aditya Nandakumar 615eee6402 [GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources
https://reviews.llvm.org/D66171

llvm-svn: 368753
2019-08-13 21:49:11 +00:00
Matt Arsenault 28215caa60 GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES
Odd sized vectors aren't handled yet.

llvm-svn: 368713
2019-08-13 16:26:28 +00:00
Matt Arsenault 690645bda0 GlobalISel: Implement lower for G_SHUFFLE_VECTOR
llvm-svn: 368709
2019-08-13 16:09:07 +00:00
Matt Arsenault 5af9cf042f GlobalISel: Change representation of shuffle masks
Currently shufflemasks get emitted as any other constant, and you end
up with a bunch of virtual registers of G_CONSTANT with a
G_BUILD_VECTOR. The AArch64 selector then asserts on anything that
doesn't fit this pattern. This isn't an ideal representation, and
should avoid legalization and have fewer opportunities for a
representational error.

Rather than invent a new shuffle mask operand type, similar to what
ShuffleVectorSDNode does, just track the original IR Constant mask
operand. I don't completely like the idea of adding another link to
the IR, but MIR is already quite dependent on IR constants already,
and this will allow sharing the shuffle mask utility functions with
the IR.

llvm-svn: 368704
2019-08-13 15:34:38 +00:00
Amara Emerson e14c91b71a [GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained.
Currently we can't keep any state in the selector object that we get from
subtarget. As a result we have to plumb through all our variables through
multiple functions. This change makes it non-const and adds a virtual init()
method to allow further state to be captured for each target.

AArch64 makes use of this in this patch to cache a call to hasFnAttribute()
which is expensive to call, and is used on each selection of G_BRCOND.

Differential Revision: https://reviews.llvm.org/D65984

llvm-svn: 368652
2019-08-13 06:26:59 +00:00
Aditya Nandakumar 70fdfed45f [GlobalISel]: Add KnownBits for G_XOR
https://reviews.llvm.org/D66119

llvm-svn: 368648
2019-08-13 04:32:33 +00:00
Aditya Nandakumar 55371e697c [GISel]: Fix a bug in KnownBits where we should have been using SizeInBits
https://reviews.llvm.org/D66039

We were using getIndexSize instead of getIndexSizeInBits().
Added test case for G_PTRTOINT and G_INTTOPTR.

llvm-svn: 368618
2019-08-12 21:28:12 +00:00
Daniel Sanders e9a57c2b23 [globalisel] Add G_SEXT_INREG
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 23

All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.

To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
  %1 = G_CONSTANT 16
  %2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
  %2 = G_SEXT_INREG %0, 16
or:
  %1 = G_CONSTANT 16
  %2:_(s32) = G_SHL %0, %1
  %3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.

Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61289

llvm-svn: 368487
2019-08-09 21:11:20 +00:00
Tim Northover e1a5f668b3 GlobalISel: pack various parameters for lowerCall into a struct.
I've now needed to add an extra parameter to this call twice recently. Not only
is the signature getting extremely unwieldy, but just updating all of the
callsites and implementations is a pain. Putting the parameters in a struct
sidesteps both issues.

llvm-svn: 368408
2019-08-09 08:26:38 +00:00