llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	3ecab8e455	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Hans Wennborg	13bdae8541	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00
Matt Arsenault	d8399d12cd	GlobalISel: Don't materialize immarg arguments to intrinsics Encode them directly as an imm argument to G_INTRINSIC. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285	2019-09-19 01:33:14 +00:00
Ulrich Weigand	b21e245711	[SystemZ] Support constrained fpto[su]i intrinsics Now that constrained fpto[su]i intrinsic are available, add codegen support to the SystemZ backend. In addition to pure back-end changes, I've also needed to add the strict_fp_to_[su]int and any_fp_to_[su]int pattern fragments in the obvious way. llvm-svn: 370674	2019-09-02 16:49:29 +00:00
Ulrich Weigand	0f0a8b7784	[SystemZ] Add support for new cpu architecture - arch13 This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Assembler/disassembler support for new instructions. - CodeGen for new instructions, including new LLVM intrinsics. - Scheduler description for the new processor. - Detection of arch13 as host processor. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365932	2019-07-12 18:13:16 +00:00
Ulrich Weigand	3641b10f3d	[SystemZ] Support vector load/store alignment hints Vector load/store instructions support an optional alignment field that the compiler can use to provide known alignment info to the hardware. If the field is used (and the information is correct), the hardware may be able (on some models) to perform faster memory accesses than otherwise. This patch adds support for alignment hints in the assembler and disassembler, and fills in known alignment during codegen. llvm-svn: 363806	2019-06-19 14:20:00 +00:00
Ulrich Weigand	6c5d5ce551	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663	2019-06-05 22:33:10 +00:00
Ulrich Weigand	8e42f6ddc8	[SystemZ] Model floating-point control register This adds the FPC (floating-point control register) as a reserved physical register and models its use by SystemZ instructions. Note that only the current rounding modes and the IEEE exception masks are modeled. Changes of the FPC due to exceptions (in particular the IEEE exception flags and the DXC) are not modeled. At this point, this patch is mostly NFC, but it will prevent scheduling of floating-point instructions across SPFC/LFPC etc. llvm-svn: 360570	2019-05-13 09:47:26 +00:00
Jonas Paulsson	c110b5b69f	[SystemZ] Wait with selection of legal vector/FP constants until Select(). This patch aims to make sure that any such constant that can be generated with a vector instruction (for example VGBM) is recognized as such during legalization and kept as a target independent node through post-legalize DAGCombining. Two new functions named isVectorConstantLegal() and loadVectorConstant() replace old ways of handling vector/FP constants. A new struct named SystemZVectorConstantInfo is used to cache the results of isVectorConstantLegal() and pass them onto loadVectorConstant(). Support for fp128 constants in the presence of FeatureVectorEnhancements1 (z14) has been added. Review: Ulrich Weigand https://reviews.llvm.org/D58270 llvm-svn: 354896	2019-02-26 16:47:59 +00:00
Jonas Paulsson	8cda83a5db	[SystemZ] Wait with VGBM selection until after DAGCombine2. Don't lower BUILD_VECTORs to BYTE_MASK, but instead expose the BUILD_VECTORs to the DAGCombiner and select them to VGBM in Select(). This allows the DAGCombiner to understand the constant vector values. For floating point, only all-zeros vectors are now generated with VGBM, as it turned out to be somewhat complicated to handle any arbitrary constants, while in practice this is very rare and hardly needed. The SystemZ ISD opcodes z_byte_mask, z_vzero and z_vones have been removed. Review: Ulrich Weigand https://reviews.llvm.org/D57152 llvm-svn: 353325	2019-02-06 18:59:19 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Ulrich Weigand	44d37ae38c	[SystemZ] Make better use of VLLEZ This patch fixes two deficiencies in current code that recognizes the VLLEZ idiom: - For the floating-point versions, we have ISel patterns that match on a bitconvert as the top node. In more complex cases, that bitconvert may already have been merged into something else. Fix the patterns to match the inner nodes instead. - For the 64-bit integer versions, depending on the surrounding code, we may get either a DAG tree based on JOIN_DWORDS or one based on INSERT_VECTOR_ELT. Use a PatFrags to simply match both variants. llvm-svn: 349749	2018-12-20 13:05:03 +00:00
Thomas Lively	30f1d69115	[NFC] Rename minnan and maxnan to minimum and maximum Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218	2018-10-24 22:49:55 +00:00
Krzysztof Parzyszek	2a119b9a98	[SystemZ] Replace subreg_r with subreg_h Change subreg_r32 -> subreg_h32 subreg_r64 -> subreg_h64 subreg_hr32 -> subreg_hh32 The subregisters subreg_r32 and subreg_r64 were added to emphasize the fact that modifying these subregisters may clobber the entire register. This is not necessarily the case for subreg_h32, et al. However, the ability to compose subreg_h64 with subreg_r32, and with subreg_h32 and subreg_l32 at the same time makes the compositions be treated as non-overlapping (leading to problems when tracking subreg liveness). See D50468 for more details. Differential Revision: https://reviews.llvm.org/D50725 llvm-svn: 339778	2018-08-15 15:21:23 +00:00
Jonas Paulsson	b5b91cd402	[SystemZ] set 'guessInstructionProperties = 0' and set flags as needed. This has proven a healthy exercise, as many cases of incorrect instruction flags were corrected in the process. As part of this, IntrWriteMem was added to several SystemZ instrinsics. Furthermore, a bug was exposed in TwoAddress with this change (as incorrect hasSideEffects flags were removed and instructions could now be sunk), and the test case for that bugfix (r319646) is included here as test/CodeGen/SystemZ/twoaddr-sink.ll. One temporary test regression (one extra copy) which will hopefully go away in upcoming patches for similar cases: test/CodeGen/SystemZ/vec-trunc-to-i1.ll Review: Ulrich Weigand. https://reviews.llvm.org/D40437 llvm-svn: 319756	2017-12-05 11:24:39 +00:00
Ulrich Weigand	f2968d58cb	[SystemZ] Add support for IBM z14 processor (3/3) This adds support for the new 128-bit vector float instructions of z14. Note that these instructions actually only operate on the f128 type, since only each 128-bit vector register can hold only one 128-bit float value. However, this is still preferable to the legacy 128-bit float instructions, since those operate on pairs of floating-point registers (so we can hold at most 8 values in registers), while the new instructions use single vector registers (so we hold up to 32 value in registers). Adding support includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions. This includes allocating the f128 type now to the VR128BitRegClass instead of FP128BitRegClass. - Scheduler description support for the instructions. Note that for a small number of operations, we have no new vector instructions (like integer <-> 128-bit float conversions), and so we use the legacy instruction and then reformat the operand (i.e. copy between a pair of floating-point registers and a vector register). llvm-svn: 308196	2017-07-17 17:44:20 +00:00
Ulrich Weigand	33435c4c9c	[SystemZ] Add support for IBM z14 processor (2/3) This adds support for the new 32-bit vector float instructions of z14. This includes: - Enabling the instructions for the assembler/disassembler. - CodeGen for the instructions, including new LLVM intrinsics. - Scheduler description support for the instructions. - Update to the vector cost function calculations. In general, CodeGen support for the new v4f32 instructions closely matches support for the existing v2f64 instructions. llvm-svn: 308195	2017-07-17 17:42:48 +00:00
Ulrich Weigand	2b3482fe85	[SystemZ] Add support for IBM z14 processor (1/3) This patch series adds support for the IBM z14 processor. This part includes: - Basic support for the new processor and its features. - Support for new instructions (except vector 32-bit float and 128-bit float). - CodeGen for new instructions, including new LLVM intrinsics. - Scheduler description for the new processor. - Detection of z14 as host processor. Support for the new 32-bit vector float and 128-bit vector float instructions is provided by separate patches. llvm-svn: 308194	2017-07-17 17:41:11 +00:00
Jonas Paulsson	d034e7ddc8	[SystemZ] Mark vector immediate load instructions with useful flags. Vector immediate load instructions should have the isAsCheapAsAMove, isMoveImm and isReMaterializable flags set. With them, these instruction will get hoisted out of loops. Review: Ulrich Weigand llvm-svn: 292790	2017-01-23 14:09:58 +00:00
Ulrich Weigand	d28be373d4	[SystemZ] Guard LEFR/LFER with FeatureVector The LEFR/LFER pseudos are aliases for vector instructions and should therefore be guared by FeatureVector. If they aren't, the TableGen scheduler definition checking might complain that there is no data for those pseudos for pre-z13 machines. No functional change intended. llvm-svn: 285576	2016-10-31 14:28:43 +00:00
Jonas Paulsson	8010b631d5	[SystemZ] Post-RA scheduler implementation Post-RA sched strategy and scheduling instruction annotations for z196, zEC12 and z13. This scheduler optimizes decoder grouping and balances processor resources (including side steering the FPd unit instructions). The SystemZHazardRecognizer keeps track of the scheduling state, which can be dumped with -debug-only=misched. Reviers: Ulrich Weigand, Andrew Trick. https://reviews.llvm.org/D17260 llvm-svn: 284704	2016-10-20 08:27:16 +00:00
Ulrich Weigand	6e31ab388a	[SystemZ] Add missing vector instructions for the assembler Most z13 vector instructions have a base form where the data type of the operation (whether to consider the vector to be 16 bytes, 8 halfwords, 4 words, or 2 doublewords) is encoded into a mask field, and then a set of extended mnemonics where the mask field is not present but the data type is encoded into the mnemonic name. Currently, LLVM only supports the type-specific forms (since those are really the ones needed for code generation), but not the base type-generic forms. To complete the assembler support and make it fully compatible with the GNU assembler, this commit adds assembler aliases for all the base forms of the various vector instructions. It also adds two more alias forms that are documented in the PoP: VFPSO/VFPSODB/WFPSODB -- generic form of VFLCDB etc. VNOT -- special variant of VNO llvm-svn: 284586	2016-10-19 13:03:18 +00:00
Ulrich Weigand	556a90c00c	[SystemZ] Add optional argument to some vector string instructions The vfee[bhf], vfene[bhf], and vistr[bhf] assembler mnemonics are documented in the Principles of Operation to have an optional last operand to encode arbitrary values in a mask field. This commit adds support for those optional operands, and cleans up the patterns to generate vector string instruction as bit. No change to code generation intended. llvm-svn: 284585	2016-10-19 12:57:46 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Ulrich Weigand	c1708b2618	[SystemZ] Add vector intrinsics This adds intrinsics to allow access to all of the z13 vector instructions. Note that instructions whose semantics can be described by standard LLVM IR do not get any intrinsics. For each instructions whose semantics cannot (fully) be described, we define an LLVM IR target-specific intrinsic that directly maps to this instruction. For instructions that also set the condition code, the LLVM IR intrinsic returns the post-instruction CC value as a second result. Instruction selection will attempt to detect code that compares that CC value against constants and use the condition code directly instead. Based on a patch by Richard Sandiford. llvm-svn: 236527	2015-05-05 19:31:09 +00:00
Ulrich Weigand	cd2a1b5341	[SystemZ] Handle sub-128 vectors The ABI allows sub-128 vectors to be passed and returned in registers, with the vector occupying the upper part of a register. We therefore want to legalize those types by widening the vector rather than promoting the elements. The patch includes some simple tests for sub-128 vectors and also tests that we can recognize various pack sequences, some of which use sub-128 vectors as temporary results. One of these forms is based on the pack sequences generated by llvmpipe when no intrinsics are used. Signed unpacks are recognized as BUILD_VECTORs whose elements are individually sign-extended. Unsigned unpacks can have the equivalent form with zero extension, but they also occur as shuffles in which some elements are zero. Based on a patch by Richard Sandiford. llvm-svn: 236525	2015-05-05 19:29:21 +00:00
Ulrich Weigand	49506d78e7	[SystemZ] Add CodeGen support for scalar f64 ops in vector registers The z13 vector facility includes some instructions that operate only on the high f64 in a v2f64, effectively extending the FP register set from 16 to 32 registers. It's still better to use the old instructions if the operands happen to fit though, since the older instructions have a shorter encoding. Based on a patch by Richard Sandiford. llvm-svn: 236524	2015-05-05 19:28:34 +00:00
Ulrich Weigand	80b3af7ab3	[SystemZ] Add CodeGen support for v4f32 The architecture doesn't really have any native v4f32 operations except v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32 elements being used. Even so, using vector registers for <4 x float> and scalarising individual operations is much better than generating completely scalar code, since there's much less register pressure. It's also more efficient to do v4f32 comparisons by extending to 2 v2f64s, comparing those, then packing the result. This particularly helps with llvmpipe. Based on a patch by Richard Sandiford. llvm-svn: 236523	2015-05-05 19:27:45 +00:00
Ulrich Weigand	cd808237b2	[SystemZ] Add CodeGen support for v2f64 This adds ABI and CodeGen support for the v2f64 type, which is natively supported by z13 instructions. Based on a patch by Richard Sandiford. llvm-svn: 236522	2015-05-05 19:26:48 +00:00
Ulrich Weigand	ce4c109585	[SystemZ] Add CodeGen support for integer vector types This the first of a series of patches to add CodeGen support exploiting the instructions of the z13 vector facility. This patch adds support for the native integer vector types (v16i8, v8i16, v4i32, v2i64). When the vector facility is present, we default to the new vector ABI. This is characterized by two major differences: - Vector types are passed/returned in vector registers (except for unnamed arguments of a variable-argument list function). - Vector types are at most 8-byte aligned. The reason for the choice of 8-byte vector alignment is that the hardware is able to efficiently load vectors at 8-byte alignment, and the ABI only guarantees 8-byte alignment of the stack pointer, so requiring any higher alignment for vectors would require dynamic stack re-alignment code. However, for compatibility with old code that may use vector types, when not using the vector facility, the old alignment rules (vector types are naturally aligned) remain in use. These alignment rules are not only implemented at the C language level (implemented in clang), but also at the LLVM IR level. This is done by selecting a different DataLayout string depending on whether the vector ABI is in effect or not. Based on a patch by Richard Sandiford. llvm-svn: 236521	2015-05-05 19:25:42 +00:00
Ulrich Weigand	a8b04e1cbc	[SystemZ] Add z13 vector facility and MC support This patch adds support for the z13 processor type and its vector facility, and adds MC support for all new instructions provided by that facilily. Apart from defining the new instructions, the main changes are: - Adding VR128, VR64 and VR32 register classes. - Making FP64 a subclass of VR64 and FP32 a subclass of VR32. - Adding a D(V,B) addressing mode for scatter/gather operations - Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields. Until now all immediate operands have been the same width as the underlying field (hence the assert->return change in decode[SU]ImmOperand). In addition, sys::getHostCPUName is extended to detect running natively on a z13 machine. Based on a patch by Richard Sandiford. llvm-svn: 236520	2015-05-05 19:23:40 +00:00

31 Commits