llvm-project

Commit Graph

Author	SHA1	Message	Date
Davide Italiano	51cbe13a3f	[Hexagon] Garbage collect dead code. llvm-svn: 285654	2016-10-31 22:56:56 +00:00
Saleem Abdulrasool	e1aa782bd0	CodeGen: further loosen -O0 CG for WoA division Generate the slowest possible codepath for noopt CodeGen. Even trying to be clever with the negated jump can cause out-of-range jumps. Use a wide branch instead. Although the code is modelled simplistically, the later optimizations would recombine the branching into `cbz` if possible. This re-enables the previous optimization as well as hopefully gives us working code in all cases. Addresses PR30356! llvm-svn: 285649	2016-10-31 22:12:37 +00:00
Justin Lebar	ed1e312f05	[NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. Summary: This has been replaced by the NVPTXInferAddressSpaces pass. We've had the new one as the default with the old one accessible via a flag for some months now, and we've had no problems. Reviewers: tra Subscribers: llvm-commits, jholewinski, jingyue, mgorny Differential Revision: https://reviews.llvm.org/D26165 llvm-svn: 285642	2016-10-31 21:51:42 +00:00
Nemanja Ivanovic	60bdfe5a7c	[PPC] add absolute difference altivec instructions and matching intrinsics This patch corresponds to review https://reviews.llvm.org/D26072. Committing on behalf of Sean Fertile. llvm-svn: 285627	2016-10-31 19:47:52 +00:00
Tim Northover	037af52c8b	GlobalISel: allow truncating pointer casts on AArch64. llvm-svn: 285615	2016-10-31 18:31:09 +00:00
Tim Northover	cdf23f1d93	GlobalISel: translate stack protector intrinsics llvm-svn: 285614	2016-10-31 18:30:59 +00:00
Michael Zuckerman	68a5c53616	[x86][inline-asm][AVX512][llvm][PART-2] Introducing "k" and "Yk" constraints for extended inline assembly, enabling use of AVX512 masked vectorized instructions. Commit on behalf of mharoush Extending inline assembly support, compatible with GCC as follows: "k" constraint hints the compiler to select any of AVX512 k0-k7 registers. "Yk" constraint is a subset of "k" excluding k0 which is not allowed to be used as a mask. Reviewer: 1. rnk Differential Revision: https://reviews.llvm.org/D25062 llvm-svn: 285591	2016-10-31 16:19:58 +00:00
Artem Tamazov	54bfd548aa	[AMDGPU][MC][gfx8] Support 20-bit immediate offset in SMEM instructions. Fixes Bug 30808. Note that passing subtarget information to predicates seems too complicated, so gfx8-specific def smrd_offset_20 introduced. Old gfx6/7-specific def renamed to smrd_offset_8 for clarity. Lit tests updated. Differential Revision: https://reviews.llvm.org/D26085 llvm-svn: 285590	2016-10-31 16:07:39 +00:00
Krzysztof Parzyszek	22586dcb2a	[Hexagon] Don't expand mux instructions with both sources identical llvm-svn: 285588	2016-10-31 15:45:09 +00:00
Ulrich Weigand	2e5e51b3f3	[SystemZ] Rework processor feature definitions and add -mcpu=archX support This patch implements two changes: - Move processor feature definition into a new file SystemZFeatures.td, and provide explicit lists of supported and unsupported features for each level of the z/Architecture. This allows specifying unsupported features in the scheduler definition files for each processor. - Add optional aliases for the -mcpu processor names according to the level of the z/Architecture, for compatibility with other compilers on the platform. The supported aliases are: -mcpu=arch8 equals -mcpu=z10 -mcpu=arch9 equals -mcpu=z196 -mcpu=arch10 equals -mcpu=zEC12 -mcpu=arch11 equals -mcpu=z13 llvm-svn: 285577	2016-10-31 14:33:29 +00:00
Ulrich Weigand	d28be373d4	[SystemZ] Guard LEFR/LFER with FeatureVector The LEFR/LFER pseudos are aliases for vector instructions and should therefore be guarded by FeatureVector. If they aren't, the TableGen scheduler definition checking might complain that there is no data for those pseudos for pre-z13 machines. No functional change intended. llvm-svn: 285576	2016-10-31 14:28:43 +00:00
Ulrich Weigand	d9001301d9	[SystemZ] Correctly diagnose missing features in AsmParser Currently, when using an instruction that is not supported on the currently selected architecture, the LLVM assembler is likely to diagnose an "invalid operand" instead of a "missing feature". This is because many operands require a custom parser in order to be processed correctly, and if an instruction is not available according to the current feature set, the generated parser code will also not detect the associated custom operand parsers. Fixed by temporarily enabling all features while parsing operands. The missing features will then be correctly detected when actually parsing the instruction itself. llvm-svn: 285575	2016-10-31 14:25:05 +00:00
Ulrich Weigand	ec5d779eb8	[SystemZ] Fix encoding of MVCK and .insn ss LLVM currently treats the first operand of MVCK as if it were a regular base+index+displacement address. However, it is in fact a base+displacement combined with a length register field. While the two might look syntactically similar, there are two semantic differences: - %r0 is a valid length register, even though it cannot be used as an index register. - In an expression with just a single register like 0(%rX), the register is treated as base with normal addresses, while it is treated as the length register (with an empty base) for MVCK. Fixed by adding a new operand parser class BDRAddr and reworking the assembler parser to distinguish between address + length register operands and regular addresses. llvm-svn: 285574	2016-10-31 14:21:36 +00:00
Jonas Paulsson	6788ddeac9	[SystemZ] Model 2 VBU units (not 1) in SystemZScheduleZ13.td. NFC. Review: Ulrich Weigand. llvm-svn: 285566	2016-10-31 13:05:48 +00:00
Alexey Bataev	d07c731d86	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 llvm-svn: 285564	2016-10-31 12:10:53 +00:00
Craig Topper	d4e580705d	[AVX-512] Add missing patterns for selecting masked vector extracts that started from shuffles. llvm-svn: 285546	2016-10-31 05:55:57 +00:00
Craig Topper	b7781a95fd	[X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available. This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type. llvm-svn: 285515	2016-10-30 06:56:16 +00:00
Craig Topper	bf9e5a16a4	[X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead. This bug was introduced in r285501. llvm-svn: 285510	2016-10-30 00:02:55 +00:00
Craig Topper	defe9ffbb5	[X86] Use intrinsics table for VPMULHRSW intrinsics so that the legacy intrinsics can select EVEX encoded instructions when available. This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated. llvm-svn: 285501	2016-10-29 18:41:45 +00:00
Elena Demikhovsky	519b4ccd70	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 llvm-svn: 285492	2016-10-29 08:44:46 +00:00
Matt Arsenault	c88ba36eab	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed llvm-svn: 285490	2016-10-29 04:05:06 +00:00
Matthias Braun	7d78614ae9	AArch64DeadRegisterDefinitionsPass: Cleanup; NFC - Fix doxygen file comment - reduce indentation in loop - Factor out some common subexpressions - Move independent helper function out of class - Fix Changed flag (this is not strictly NFC but a bugfix, but the flag seems ignored anyway) llvm-svn: 285488	2016-10-29 01:03:41 +00:00
Tom Stellard	6695ba0440	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions Summary: Flat instructions can return out of order, so we always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 llvm-svn: 285479	2016-10-28 23:53:48 +00:00
Matt Arsenault	4e9c1e3a79	AMDGPU: Fix instruction flags for s_endpgm Set isReturn, remove hasSideEffects. Also remove hasCtrlDep, I'm not really sure what that does. llvm-svn: 285476	2016-10-28 23:00:38 +00:00
Matt Arsenault	7b6475568d	AMDGPU: Add definitions for scalar store instructions Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463	2016-10-28 21:55:15 +00:00
Matt Arsenault	4b6a6cc8e9	AMDGPU: Rename glc operand type While trying to add the glc bit to SMEM instructions on VI with the new refactoring I ran into some kind of shadowing problem for the glc operand when using the pseudoinstruction as a multiclass parameter. Everywhere that currently uses it defines the operand to have the same name as its type, i.e. glc:$glc which works. For some reason now it conflicts, and it ends up evaluating to the wrong thing. For the real encoding classes, let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated and still visible in the Inst initializer in the expanded td file. In other cases I got a different error about an illegal operand where this was using { 0 } initializer from the bits<1> glc initializer instead of evaluating it as false in the if. For consistency all of the operand types should probably be capitalized to avoid conflicting with the variable names unless somebody has a better idea of how to fix this. llvm-svn: 285462	2016-10-28 21:55:08 +00:00
Justin Lebar	f0a80ba385	[NVPTX] Compute 'rem' using the result of 'div', if possible. Summary: In isel, transform Num % Den into Num - (Num / Den) * Den if the result of Num / Den is already available. Reviewers: tra Subscribers: hfinkel, llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26090 llvm-svn: 285461	2016-10-28 21:44:00 +00:00
Matt Arsenault	4eae301995	AMDGPU: Diagnose using too many SGPRs This is possible when using inline asm. llvm-svn: 285447	2016-10-28 20:31:47 +00:00
Matt Arsenault	08906a3c62	AMDGPU: Fix using incorrect private resource with no allocation It's possible to have a use of the private resource descriptor or scratch wave offset registers even though there are no allocated stack objects. This would result in continuing to use the maximum number of reserved registers. This could go over the number of SGPRs available on VI, or violate the SGPR limit requested by the function attributes. llvm-svn: 285435	2016-10-28 19:43:31 +00:00
Nemanja Ivanovic	e28a0fc72a	Implement vector count leading/trailing bytes with zero lsb and vector parity builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26003. Committing on behalf of Zaara Syeda. llvm-svn: 285434	2016-10-28 19:38:24 +00:00
Krzysztof Parzyszek	87a47be039	[Hexagon] Maintain kill flags through splitting in expand-condsets Do not use LiveIntervals to recalculate kills, because that cannot be done accurately without implicit uses on predicated instructions. llvm-svn: 285409	2016-10-28 15:50:22 +00:00
Tom Stellard	aea899e2a0	AMDGPU/SI: Handle hazard with s_rfe_b64 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25638 llvm-svn: 285368	2016-10-27 23:50:21 +00:00
Tom Stellard	04051b5fad	AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25637 llvm-svn: 285367	2016-10-27 23:42:29 +00:00
Tom Stellard	6b9c1be4ea	AMDGPU/SI: Fix unused variable warning on non-debug builds llvm-svn: 285363	2016-10-27 23:28:03 +00:00
Tom Stellard	b133fbb9a4	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 llvm-svn: 285359	2016-10-27 23:05:31 +00:00
Tom Stellard	30d30824b4	AMDGPU/SI: Handle s_setreg hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25528 llvm-svn: 285338	2016-10-27 20:39:09 +00:00
Simon Pilgrim	d23219b9ee	[X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets llvm-svn: 285329	2016-10-27 18:32:06 +00:00
Simon Pilgrim	47c1ff7a43	[X86][AVX512DQ] Move v2i64 and v4i64 MUL lowering to tablegen As suggested by @igorb on D26011 llvm-svn: 285313	2016-10-27 17:07:40 +00:00
Saleem Abdulrasool	075d2e3c59	ARM: ensure that the Windows DBZ check is in range The Windows ARM target expects the compiler to emit a division-by-zero check. The check would use the form of: cmp r?, #0 cbz .Ltrap b .Lbody .Lbody: ... .Ltrap: udf #249 @ __brkdiv0 This works great most of the time. However, if the body of the function is greater than 127 bytes, the branch target limitation of cbz becomes an issue. This occurs in the unoptimized code generation cases sometimes (like in compiler-rt). Since this is a matter of correctness, possibly pay a small penalty instead. We now form this slightly differently: cbnz .Lbody udf #249 @ __brkdiv0 .Lbody: ... The positive case is through the branch instead of being the next instruction. However, because of the basic block layout, the negated branch is going to be a short distance always (2 bytes away, after the inserted __brkdiv0). The new t__brkdiv0 instruction is required to explicitly mark the instruction as a terminator as the generic UDF instruction is not a terminator. Addresses PR30532! llvm-svn: 285312	2016-10-27 16:59:22 +00:00
Vasileios Kalintiris	cfb005a0ee	[mips] Do not allow -opt-bisect-limit to skip the PIC call optimization pass. r282428 added the MipsOptimizePICCall as an opt-in pass that can be skipped when using the -opt-bisect-limit option. However, this pass is needed because it generates code that conforms to the o32 ABI specification by using the $t9 register for PIC calls with JALR instructions. This bug was exposed by the fact that skipFunction() also checks for the "optnone" attribute. This caused functions with that attribute to break the requirements of the o32 ABI. llvm-svn: 285305	2016-10-27 15:50:36 +00:00
Simon Pilgrim	820e1326d7	[X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64 With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq). Updated cost table accordingly. Differential Revision: https://reviews.llvm.org/D26011 llvm-svn: 285304	2016-10-27 15:27:00 +00:00
Krzysztof Parzyszek	046da74699	[Hexagon] Do not expand ISD::SELECT for HVX vectors llvm-svn: 285297	2016-10-27 14:30:16 +00:00
Sam Parker	e7d9505c08	[ARM] Predicate UMAAL selection on hasDSP. UMAAL is a DSP instruction and it is not available on thumbv7m (Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong CHECK prefix in longMAC.ll test. Patch by Vadzim Dambrouski. Differential Revision: https://reviews.llvm.org/D25890 llvm-svn: 285278	2016-10-27 09:47:10 +00:00
Dylan McKay	dd680cc753	[AVR] Generate all of the TableGen files we need This enables generation of all of the TableGen files that are used downstream. llvm-svn: 285274	2016-10-27 08:20:47 +00:00
Nicolai Haehnle	7b0e25b7ad	AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due to register dependencies Summary: When finding a match for a merge and collecting the instructions that must be moved, keep in mind that the instruction we merge might actually use one of the defs that are being moved. Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect]. The fact that the ds_read in the test case is not eliminated suggests that there might be another problem related to alias analysis, but that's a separate problem: this pass should still work correctly even when earlier optimization passes missed something or were disabled. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25829 llvm-svn: 285273	2016-10-27 08:15:07 +00:00
Dylan McKay	00009d4824	[AVR] Compile the disassembler This also updates references of 'TheAVRTarget' to the new 'getTheAVRTarget()' method. llvm-svn: 285272	2016-10-27 08:09:15 +00:00
Dylan McKay	ec47065795	[AVR] Add AVRISelDAGToDAG.cpp Summary: This pulls the AVR instruction selector in-tree. Reviewers: arsenm, kparzysz Subscribers: llvm-commits, wdng, beanz, japaric, mgorny Differential Revision: https://reviews.llvm.org/D25278 llvm-svn: 285270	2016-10-27 07:03:47 +00:00
Dylan McKay	6eaa4e4bcc	[AVR] Add the machine code emitter Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, japaric, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25388 llvm-svn: 285269	2016-10-27 06:56:46 +00:00
Nemanja Ivanovic	32b5fed639	[PowerPC] - No SExt/ZExt needed for count trailing zeros This patch corresponds to review: https://reviews.llvm.org/D25896 It just eliminates the redundant ZExt after a count trailing zeros instruction. llvm-svn: 285267	2016-10-27 05:17:58 +00:00
Evandro Menezes	ca8370396a	[AArch64] Create feature set for Samsung Exynos-M2 Since Exynos-M2 improved the FP square root unit a bit over the one in Exynos-M1, it does not benefit from using the Newton series for such operations. llvm-svn: 285246	2016-10-26 22:06:20 +00:00
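
Commit f0a80ba385 above describes rewriting Num % Den as Num - (Num / Den) * Den when the quotient is already available. The sketch below only illustrates that arithmetic identity in Python; it is not the NVPTX isel code. The helper names trunc_div and rem_from_div are hypothetical, and truncating (round toward zero) division is assumed so that the result matches C-style signed remainder.

```python
def trunc_div(num: int, den: int) -> int:
    """Integer division rounding toward zero (assumed C-like semantics)."""
    q = abs(num) // abs(den)
    # The quotient is negative exactly when the operand signs differ.
    return q if (num >= 0) == (den > 0) else -q


def rem_from_div(num: int, den: int, quotient: int) -> int:
    """Recover the remainder from an already-computed quotient.

    Costs one multiply and one subtract instead of a second division.
    """
    return num - quotient * den


if __name__ == "__main__":
    for num, den in [(7, 2), (-7, 2), (7, -2), (-7, -2), (6, 3)]:
        q = trunc_div(num, den)
        print(num, den, q, rem_from_div(num, den, q))
```

The remainder takes the sign of the dividend here, which is what truncating division implies; Python's built-in % follows floor semantics and would differ for mixed-sign operands.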


39915 Commits