llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Morehouse	671f0930fe	[X86] Selective relocation relaxation for +tagged-globals For tagged-globals, we only need to disable relaxation for globals that we actually tag. With this patch function pointer relocations, which we do not instrument, can be relaxed. This patch also makes tagged-globals work properly with LTO, as -Wa,-mrelax-relocations=no doesn't work with LTO. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D113220	2021-11-19 07:18:27 -08:00
Simon Pilgrim	d391e4fe84	[X86] Update RET/LRET instruction to use the same naming convention as IRET (PR36876). NFC Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidth. Helps prevent future scheduler model mismatches like those that were only addressed in D44687. Differential Revision: https://reviews.llvm.org/D113302	2021-11-07 15:06:54 +00:00
Jack Andersen	bd4dad87f4	[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand Based on the reasoning of D53903, register operands of DBG_VALUE are invariably treated as RegState::Debug operands. This change enforces this invariant as part of MachineInstr::addOperand so that all passes emit this flag consistently. RegState::Debug is inconsistently set on DBG_VALUE registers throughout LLVM. This runs the risk of a filtering iterator like MachineRegisterInfo::reg_nodbg_iterator to process these operands erroneously when not parsed from MIR sources. This issue was observed in the development of the llvm-mos fork which adds a backend that relies on physical register operands much more than existing targets. Physical RegUnit 0 has the same numeric encoding as $noreg (indicating an undef for DBG_VALUE). Allowing debug operands into the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision of register numbers with different zero semantics). Eventually, this causes an assert where DBG_VALUE instructions are prohibited from participating in live register ranges. Reviewed By: MatzeB, StephenTozer Differential Revision: https://reviews.llvm.org/D110105	2021-10-07 16:08:52 +01:00
Kazu Hirata	c1e32b3fc0	[Target] Migrate from getNumArgOperands to arg_size (NFC) Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-02 12:06:29 -07:00
Simon Pilgrim	4c72b10f0a	[X86] X86FastISel::fastMaterializeConstant - break if-else chain to fix llvm-else-after-return warning. NFCI All previous if-else cases return	2021-09-25 14:31:14 +01:00
Wang, Pengfei	6f7f5b54c8	[X86] AVX512FP16 instructions enabling 1/6 1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105263	2021-08-10 12:46:01 +08:00
Luo, Yuanke	36003c20ad	[X86] Selecting fld0 for undefined value in fast ISEL. When set opt-bisect-limit to some value that is less than ISel pass in command line and CurBisectNum expired, "DAG to DAG" pass lower its opt level to O0. However "processimpdefs" and "X86 FP Stackifier" is not stopped due to the CurBisectNum expiration. So undefined fp0 is generated. This cause crash in the "X86 FP Stackifier" pass, because Stackifier doesn't expect any undefined fp value. Here is the scenario that cause compiler crash. successors: %bb.26 liveins: $r14 ST_FPrr $st0, implicit-def $fpsw, implicit $fpcw renamable $rdi = MOV64ri @.str.3.16422 renamable $rdx = LEA64r %stack.6, 1, $noreg, 0, $noreg ADJCALLSTACKDOWN64 0, 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp dead $esi = MOV32r0 implicit-def dead $eflags, implicit-def $rsi CALL64pcrel32 @foo, implicit $rsp, implicit $ssp, implicit $rdi, implicit $rsi, implicit $rdx, implicit-def dead $fp0 renamable $xmm0 = MOVSDrm_alt %stack.10, 1, $noreg, 0, $noreg :: (load 8 from %stack.10) ADJCALLSTACKUP64 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp renamable $fp2 = CHS_Fp80 killed undef renamable $fp0, implicit-def $fpsw JMP_1 %bb.26 The CALL64pcrel32 mark fp0 dead, so llvm free the stack slot for fp0 and the stack become empty. In the late instruction CHS_Fp80, it use undefined register fp0, the original code assume there must be a stack slot for the src register (fp0) without respecting it is undefined, so llvm report error. We have some discussion in https://reviews.llvm.org/D104440 and we decide to fix it in fast ISel. The fix is to lower undefined fp value to zero value, so that it release the burden of "X86 FP Stackifier" pass. Thank Craig for the suggestion and the initial patch to fix it. Differential Revision: https://reviews.llvm.org/D104678	2021-06-26 08:43:09 +08:00
Fangrui Song	06e7de795b	Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build	2021-06-04 23:34:43 -07:00
Tim Northover	ba1509da7b	Recommit X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86). Recommited with a fix for unwind info when i386 pc-rel thunks are adjacent to a prologue.	2021-05-18 15:19:05 +01:00
Mitch Phillips	6791a6b309	Revert "X86: support Swift Async context" This reverts commit `747e5cfb9f`. Reason: New frame layout broke the sanitizer unwinder. Not clear why, but seems like some of the changes aren't always guarded by Swyft checks. See https://reviews.llvm.org/rG747e5cfb9f5d944b47fe014925b0d5dc2fda74d7 for more information.	2021-05-17 12:44:57 -07:00
Tim Northover	747e5cfb9f	X86: support Swift Async context This adds support to the X86 backend for the newly committed swiftasync function parameter. If such a (pointer) parameter is present it gets stored into an augmented frame record (populated in IR, but generally containing enhanced backtrace for coroutines using lots of tail calls back and forth). The context frame is identical to AArch64 (primarily so that unwinders etc don't get extra complexity). Specfically, the new frame record is [AsyncCtx, %rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp being set to 1. %rbp still points to the frame pointer in memory for backwards compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest of the record is because of x86).	2021-05-17 11:56:16 +01:00
Tim Northover	82a0e808bb	IR/AArch64/X86: add "swifttailcc" calling convention. Swift's new concurrency features are going to require guaranteed tail calls so that they don't consume excessive amounts of stack space. This would normally mean "tailcc", but there are also Swift-specific ABI desires that don't naturally go along with "tailcc" so this adds another calling convention that's the combination of "swiftcc" and "tailcc". Support is added for AArch64 and X86 for now.	2021-05-17 10:48:34 +01:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst a IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Nikita Popov	7669455df4	[X86][FastISel] Fix with.overflow eflags clobber (PR49587) If the successor block has a phi node, then additional moves may be inserted into predecessors, which may clobber eflags. Don't try to fold the with.overflow result into the branch in that case. This is done by explicitly checking for any phis in successor blocks, not sure if there's some more principled way to address this. Other fused compare and branch patterns avoid the issue by emitting the comparison when handling the branch, so that no instructions may be inserted in between. In this case, the with.overflow call is emitted separately (and I don't think this is avoidable, as it will generally have at least two users). Fixes https://bugs.llvm.org/show_bug.cgi?id=49587. Differential Revision: https://reviews.llvm.org/D98600	2021-03-29 23:08:47 +02:00
Craig Topper	6204ac4536	[X86] Bale out of X86FastISel::X86SelectCmp for vectors. None of the code in this function was written to handle vectors. Most of the cases already fail for vectors for one reason or another. The exception is an optimization that detects identical operands. This can be triggered by vectors, but the code always creates a 0 or 1 constants in a scalar register which is incorrect for vectors. Fixes PR49706.	2021-03-23 20:16:04 -07:00
Philip Reames	46e764a628	[x86] introduce no_callee_saved_registers attribute This is directly analogous to the existing no_caller_saved_registers, but with the opposite intention. A function or call so marked shifts the responsibility of spilling the usual CSRs to it's caller. An indirect call site and callee which don't agree on the attribute is ill defined. The motivation for this change is that being able to prune callee saves (without modifying other details of the calling convention) is sometimes useful when generating stubs and adapters. There's no intention to expose this as a source language feature; this is expected to be used by frontends to implement adapters where warranted. Some specific examples of use cases: * GC compatible compiled code wants to call an externally defined library function without needing to track pointer values through CSRs. * debug enabled code wants to call precompiled library which doesn't provide enough information to track CSRs while preserving debug quality in caller. * adapter stub entering hand written assembler which doesn't follow normal calling conventions.	2021-02-01 16:19:14 -08:00
Philip Reames	9d09db941f	[NFC][X86] Use CallBase interface to simplify code	2021-02-01 15:24:41 -08:00
Philip Reames	bb6c23b1f5	[NFC][X86] Avoid redundant work inspecting callee	2021-02-01 15:24:41 -08:00
Fangrui Song	687b83ceab	[X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model This fixes the bug referenced by `5582a79876` which was exposed by `961f31d8ad`. With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`	2020-12-05 23:13:28 -08:00
Harald van Dijk	47c902ba84	[X86] Have indirect calls take 64-bit operands in 64-bit modes The build bots caught two additional pre-existing problems exposed by the test change part of my change https://reviews.llvm.org/D91339, when expensive checks are enabled. This fixes one of them. X86 has CALL64r and CALL32r opcodes, where CALL64r takes a 64-bit register, and CALL32r takes a 32-bit register. CALL64r can only be used in 64-bit mode, CALL32r can only be used in 32-bit mode. LLVM would assume that after picking the appropriate CALLr opcode, a pointer-sized register would be a valid operand, but in x32 mode, a 64-bit mode, pointers are 32 bits. In this mode, it is invalid to directly pass a pointer to CALL64r, it needs to be extended to 64 bits first. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91924	2020-11-28 16:46:30 +00:00
Harald van Dijk	4629afa101	[X86] Include %rip for 32-bit RIP-relative relocs for x32 %rip was only included for 64-bit RIP-relative relocations, but needs to be included for 32-bit as well. Reviewed By: MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D91339	2020-11-21 09:20:20 -08:00
Simon Pilgrim	30a1d91127	[X86] Reduce scope of DestReg and use specific Register type not unsigned. NFCI.	2020-10-31 11:46:07 +00:00
Fangrui Song	e36a41b3cf	[X86] Fix some clang-tidy bugprone-argument-comment issues	2020-10-08 15:26:50 -07:00
Sanjay Patel	2d3e12818e	[FastISel] update to use intrinsic's isCommutative(); NFC This requires adding a missing 'const' to the definition because the callers are using const args, but there should be no change in behavior. The intrinsic method was added with D86798 / rG096527214033	2020-08-30 11:36:41 -04:00
Fangrui Song	bef684154d	[X86][FastISel] Support materializing floating-point constants for large code model & PIC The following program miscompiles because rL216012 added static relocation model support but not for PIC. ``` // clang -fpic -mcmodel=large -O0 a.cc double foo() { return 42.0; } ``` This patch adds PIC support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86024	2020-08-23 08:36:18 -07:00
Guillaume Chatelet	8dbafd24d6	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82977	2020-07-02 11:28:02 +00:00
Guillaume Chatelet	28de229bc6	[Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82894	2020-07-01 07:28:11 +00:00
Craig Topper	a5041987ed	[X86] Emit a reg-reg copy for fast isel of vector bitcasts. Previously we just updated a map and moved on. But it possible we cached known bits information with the vreg that can be used by another basic block. If the other basic block has a different view of the VT these known bits won't make sense. By emitting a copy we ensure we have different vregs before and after the bitcast. This prevents the known bits from being used with the wrong type. Differential Revision: https://reviews.llvm.org/D82517	2020-06-24 20:15:21 -07:00
Eric Christopher	3ff8f61930	Tidy up unsigned -> Register fixups.	2020-06-11 16:50:58 -07:00
Eric Christopher	cb21b16822	Add a diagnostic string to an assert.	2020-06-11 16:34:55 -07:00
Guillaume Chatelet	1778564f91	[Alignment][NFC] Migrate the rest of backends Summary: This is a followup on D81196 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81278	2020-06-08 07:17:20 +00:00
Craig Topper	5c7aca6a4c	[X86] Ignore large code model in X86FastISel::X86MaterializeFP in 32-bit mode Large code model doesn't mean anything to 32-bit mode. But nothing prevents it from being set. Ignore to avoid generating 64-bit mode only instructions. Differential Revision: https://reviews.llvm.org/D80768	2020-05-29 10:39:08 -07:00
Arthur Eubanks	8a88755610	Reland [X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 11:25:44 -07:00
Arthur Eubanks	b8cbff51d3	Revert "[X86] Codegen for preallocated" This reverts commit `810567dc69`. Some tests are unexpectedly passing	2020-05-20 10:04:55 -07:00
Arthur Eubanks	810567dc69	[X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 09:20:38 -07:00
Nikita Popov	52e98f620c	[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC) Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.	2020-05-17 22:19:15 +02:00
Craig Topper	8c72b0271b	[CodeGen] Use Align in MachineConstantPool.	2020-05-12 10:06:40 -07:00
Craig Topper	dbb272b0a3	[CallSite removal][FastISel] Use CallBase instead of CallSite in fastLowerCall.	2020-04-12 18:02:24 -07:00
Craig Topper	9c1842d8af	Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This is the same as what was done to the CallLoweringInfo in TargetLowering.h in r309159. This is just a step on the way to replacing this with CallBase.	2020-04-10 23:45:36 -07:00
Matt Arsenault	dcce3ef1d2	FastISel: Partially use Register Doesn't try to convert the cases that depend on generated code.	2020-04-08 12:10:58 -04:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Guillaume Chatelet	bdf77209b9	[Alignment][NFC] Use Align version of getMachineMemOperand Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77059	2020-03-30 15:46:27 +00:00
Guillaume Chatelet	3ba550a05a	[Alignment][NFC] Use TFL::getStackAlign() Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: dylanmckay, sdardis, nemanjai, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76551	2020-03-23 13:48:29 +01:00
Craig Topper	85726bbcba	[X86] Disable fast-isel call lowering for functions with vXi1 arguments on avx512. This fails an assert because the type is marked in the calling convention td file as needing promotion, but the code doesn't know how to do it. It also much more complicated because we try to pass these in xmm/ymm/zmm registers. As of a few weeks ago we do this promotion from getRegisterTypeForCallingConv before the td file generated code gets involved.	2020-03-16 18:20:42 -07:00
Craig Topper	391cc4dd41	[X86] Use ZERO_EXTEND instead of SIGN_EXTEND in the fast isel handling of convert_from_fp16.	2020-02-14 10:57:12 -08:00
Craig Topper	fc0c72b2df	[X86] Add AVX512 support to the fast isel code for Intrinsic::convert_from_fp16/convert_to_fp16.	2020-02-14 10:57:11 -08:00
Craig Topper	a57dd66d5e	[X86] In X86FastEmitSSESelect, fall back to SelectionDAG if the inputs to the compare can't be found in registers. We were checking that the original Value * for the compare operands were null. But that can never happen. I believe we intended to check for 0 registers here instead. Fixes PR44749.	2020-02-01 12:24:55 -08:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00

1 2 3 4 5 ...

682 Commits