llvm-project

Commit Graph

Author	SHA1	Message	Date
Ahmed Bougacha	89bba61c84	[AArch64] Don't assert when combining (v3f32 select (setcc f64)). When the setcc has f64 operands, we can't build a vector setcc mask to feed a vselect, because f64 doesn't divide v3f32 evenly. Just bail out when that happens. llvm-svn: 235917	2015-04-27 21:01:20 +00:00
Bill Schmidt	e71db85bed	Silence unused variable errors for no-asserts builds llvm-svn: 235913	2015-04-27 20:22:35 +00:00
Bill Schmidt	fe723b9a6d	[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations This patch adds a new SSA MI pass that runs on little-endian PPC64 code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors without alignment constraints are accomplished for little-endian using lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional xxswapd instructions hurts performance in comparison with big-endian code, but they are necessary in the general case to support correct semantics. However, the general case does not apply to most vector code. Many vector instructions are lane-insensitive; they do not "care" which lanes the parallel computations are performed within, provided that the resulting data is stored into the correct locations. Thus this pass looks for computations that perform only lane-insensitive operations, and remove the unnecessary swaps from loads and stores in such computations. Future improvements will allow computations using certain lane-sensitive operations to also be optimized in this manner, by modifying the lane-sensitive operations to account for the permuted order of the lanes. However, this patch only adds the infrastructure to permit this; no lane-sensitive operations are optimized at this time. This code is heavily exercised by the various vectorizing applications in the projects/test-suite tree. For the time being, I have only added one simple test case to demonstrate what the pass is doing. Although it is quite simple, it provides coverage for much of the code, including the special case handling of copies and subreg-to-reg operations feeding the swaps. I plan to add additional tests in the future as I fill in more of the "special handling" code. Two existing tests were affected, because they expected the swaps to be present, but they are now removed. llvm-svn: 235910	2015-04-27 19:57:34 +00:00
Sanjay Patel	8fd573e87f	fix 80-cols; NFC llvm-svn: 235902	2015-04-27 17:45:44 +00:00
Sanjay Patel	912315811e	fix typos; NFC llvm-svn: 235896	2015-04-27 17:03:31 +00:00
Toma Tabacu	bda745f532	[mips] Correct bytes to bits in 2 comments. NFC. llvm-svn: 235891	2015-04-27 15:21:38 +00:00
Elena Demikhovsky	a480ef5494	AVX-512: added calling conventions for i1 vectors. Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724 llvm-svn: 235889	2015-04-27 15:11:19 +00:00
Brendon Cahoon	55bdeb7bc7	[Hexagon] Use constant extenders to fix up hardware loops Use a loop instruction with a constant extender for a hardware loop instruction that is too far away from the start of the loop. This is cheaper than changing the SA register value. Differential Revision: http://reviews.llvm.org/D9262 llvm-svn: 235882	2015-04-27 14:16:43 +00:00
Toma Tabacu	d9d344b485	[mips] [IAS] Improve warning for using AT with .set noat. Summary: Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name. I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8479 llvm-svn: 235881	2015-04-27 14:05:04 +00:00
Vasileios Kalintiris	7a6b18783f	Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel."" This reapplies r235194, which was reverted in r235495 because it was causing a failure in our out-of-tree buildbots for MIPS. With the sign-extension patch in r235718, this patch doesn't cause any problem any more. llvm-svn: 235878	2015-04-27 13:28:05 +00:00
Toma Tabacu	b19cf2082f	[mips] [IAS] Rename getATRegNum and setATReg to {g,s}etATRegIndex. NFC. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8480 llvm-svn: 235877	2015-04-27 13:12:59 +00:00
Elena Demikhovsky	d1084c5b3f	AVX-512: Extend/Truncate operations for SKX, SETCC for bit-vectors llvm-svn: 235875	2015-04-27 12:57:59 +00:00
Simon Pilgrim	4f683c264a	[X86][SSE] Add v16i8/v32i8 multiplication support Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized. The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result. Differential Revision: http://reviews.llvm.org/D9115 llvm-svn: 235837	2015-04-27 07:55:46 +00:00
Alexei Starovoitov	f26c748b1b	[bpf] fix build and remove a compiler warning in Release mode Patch by Brenden Blanco. llvm-svn: 235814	2015-04-26 01:58:08 +00:00
Benjamin Kramer	a44b37e676	[ARM] Simplify code. NFC. llvm-svn: 235803	2015-04-25 17:25:13 +00:00
Benjamin Kramer	6246069c89	[hexagon] Use range-based for loops. No functionality change intended. llvm-svn: 235802	2015-04-25 14:46:53 +00:00
Benjamin Kramer	a37c809ce5	[hexagon] Remove setHexLibcallName, it leaks memory. Just spell out the full names, it's not that much more code. No functional change intended. llvm-svn: 235801	2015-04-25 14:46:46 +00:00
Lang Hames	9ff69c8f4d	[AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr. AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty. llvm-svn: 235752	2015-04-24 19:11:51 +00:00
Vasileios Kalintiris	1202f36b10	[mips][FastISel] Specify which types we handle for integer extension. Summary: Perform integer extension only when the destination type is one of i8, i16 & i32 and when the source type is i1, i8 or i16. For other combinations we fall back to SelectionDAG. This fixes the test MultiSource/Benchmarks/7zip that was failing in our out-of-tree MIPS buildbots. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9243 llvm-svn: 235718	2015-04-24 13:48:19 +00:00
Jingyue Wu	72fca6c89b	Resurrect r235688 We should skip vector types which are not SCEVable. test/CodeGen/NVPTX/sched2.ll passes llvm-svn: 235695	2015-04-24 04:22:39 +00:00
Jingyue Wu	62af99b0db	Revert r235688 Seems breaking builds llvm-svn: 235690	2015-04-24 03:26:11 +00:00
Jingyue Wu	312fd0242d	[NVPTX] Emits "generic()" depending on the original address space Summary: Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()" on aggregates of addrspacecasts. Test Plan: addrspacecast-gvar.ll Reviewers: eliben, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9130 llvm-svn: 235689	2015-04-24 02:57:30 +00:00
Jingyue Wu	3daace5295	[NVPTX] enable NaryReassociate in NVPTX Summary: We run NaryReassociate right after SLSR because SLSR enables many opportunities for NaryReassociate. For example, in nary-slsr.ll foo((a + b) + c); foo((a + b * 2) + c); foo((a + b * 3) + c); // 2 muls and 6 adds after SLSR: ab = a + b; foo(ab + c); ab2 = ab + b; foo(ab2 + c); ab3 = ab2 + b; foo(ab3 + c); // 6 adds after NaryReassociate: abc = (a + b) + c; foo(abc); ab2c = abc + b; foo(ab2c); ab3c = ab2c + b; foo(ab3c); // 4 adds Test Plan: nary-slsr.ll Reviewers: jholewinski, eliben Reviewed By: eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9066 llvm-svn: 235688	2015-04-24 02:54:06 +00:00
Matt Arsenault	5e10016f03	R600/SI: Fix verifier error when producing v_madmk_f32 Copy the kill flags when swapping the operands. llvm-svn: 235687	2015-04-24 01:57:58 +00:00
Matthias Braun	e1a67412cf	R600/RegisterCoalescer: Enable more rematerialization/add missing testcase This enables the rematerialization of some R600 MOV instructions in the RegisterCoalescer and adds a testcase for r235668. llvm-svn: 235675	2015-04-24 00:25:50 +00:00
Matt Arsenault	a48b866068	R600/SI: Special case v_mov_b32 as really rematerializable This should be fixed to properly understand all rematerializable instructions while ignoring implicit reads of exec. llvm-svn: 235671	2015-04-23 23:34:48 +00:00
Hal Finkel	4dc8fcc224	[PowerPC] Support register name prefixes for vector registers Match binutils by supporting the optional register name prefix for new vector registers ("vs" for VSX registers and "q" for QPX registers). llvm-svn: 235665	2015-04-23 23:16:22 +00:00
Hal Finkel	d86e90abdd	[PowerPC] Use sync inst alias when printing So long as the choice between printing msync and sync is not ambiguous, we can print 'sync 0' and just 'sync'. llvm-svn: 235663	2015-04-23 23:05:08 +00:00
Tom Stellard	ff5cf0e1fd	R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands llvm-svn: 235662	2015-04-23 22:59:24 +00:00
Hal Finkel	fefcfffe68	[PowerPC] Add asm/disasm support for dcbt with hint Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint field specified (non-zero). Unfortunately, the syntax for this instruction is special in that it differs for server vs. embedded cores: dcbt ra, rb, th [server] dcbt th, ra, rb [embedded] where th can be omitted when it is 0. dcbtst is the same. Thus we need to play games in the parser and the printer to flip the operands around on the embedded cores. We'll use the server syntax as the default (binutils currently uses the embedded form by default, but IBM is changing that). We also stop marking dcbtst as having unmodeled side effects (this is not necessary, it is just a hint like dcbt -- noticed by inspection, so no separate test case). llvm-svn: 235657	2015-04-23 22:47:57 +00:00
Krzysztof Parzyszek	ed75e7aece	Unbreak build llvm-svn: 235646	2015-04-23 20:57:39 +00:00
Krzysztof Parzyszek	27ba19a177	[Hexagon] Minor cleanup in HexagonFrameLowering llvm-svn: 235645	2015-04-23 20:42:20 +00:00
Tom Stellard	8b0182af2f	R600/SI: Fix indirect addressing with a negative constant offset When the base register index of the vector plus the constant offset was less than zero, we were passing the wrong base register to the indirect addressing instruction. In this case, we need to set the base register to v0 and then add the computed (negative) index to m0. llvm-svn: 235641	2015-04-23 20:32:01 +00:00
Peter Collingbourne	167668f8c8	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	cfee5b04bc	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	6529523151	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	78f1ecc59c	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
Peter Collingbourne	1213918bf4	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Krzysztof Parzyszek	e568967986	[Hexagon] Fix compiler warnings in release build Patch by Aditya Nandakumar. llvm-svn: 235635	2015-04-23 20:26:21 +00:00
Jingyue Wu	3286ec1484	[NVPTX] run SeparateConstOffsetFromGEP before SLSR Summary: We pick this order because SeparateConstOffsetFromGEP may create more opportunities for SLSR. Test Plan: reassociate-geps-and-slsr.ll no performance regression on internal benchmarks Reviewers: meheff Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D9230 llvm-svn: 235632	2015-04-23 20:00:04 +00:00
Tom Stellard	d1f0f0268c	R600/SI: Add assembler support for all CI and VI VOP1 instructions llvm-svn: 235629	2015-04-23 19:33:54 +00:00
Tom Stellard	4b3e755480	R600/SI: v_mov_fed_b32 does not exist on VI llvm-svn: 235628	2015-04-23 19:33:52 +00:00
Tom Stellard	21cce29041	R600/SI: Use a better error message for unsupported instructions in the assembler llvm-svn: 235627	2015-04-23 19:33:51 +00:00
Tom Stellard	7130ef49cb	R600/SI: Improve AsmParser support for forced e64 encoding We can now force e64 encoding even when the operands would be legal for e32 encoding. llvm-svn: 235626	2015-04-23 19:33:48 +00:00
Hal Finkel	7c5cb066d0	[PowerPC] Enable printing instructions using aliases TableGen had been nicely generating code to print a number of instructions using shorter aliases (and PowerPC has plenty of short mnemonics), but we were not calling it. For some of the aliases we support in the parser, TableGen can't infer the "inverse" alias relationship, so there is still more to do. Thus, after some hours of updating test cases... llvm-svn: 235616	2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar	745615ca00	[AArch64] Add nvcast patterns for v4f16 and v8f16 Summary: Constant stores of f16 vectors can create NvCast nodes from various operand types to v4f16 or v8f16 depending on patterns in the stored constants. This patch adds nvcast rules with v4f16 and v8f16 values. AArchISelLowering::LowerBUILD_VECTOR has the details on which constant patterns generate the nvcast nodes. Reviewers: jmolloy, srhines, ab Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9201 llvm-svn: 235610	2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar	b18815354d	[AArch64] Handle vec4, vec8, vec16 *itofp for half Summary: Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32, v8i8, v8i16 inputs to allow promotion of v4f16 results. Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32, and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift operations are too complicated to write a proper check sequence. The conversions from v4i64 to v4f16 do not depend on this patch - v4i64 is split and the conversion gets handled while lowering v2i64. I am adding a test here for completeness. Reviewers: aemerson, rengolin, ab, jmolloy, srhines Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9166 llvm-svn: 235609	2015-04-23 17:16:27 +00:00
Hans Wennborg	0867b151c9	Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608	2015-04-23 16:45:24 +00:00
Krzysztof Parzyszek	876a19d855	[Hexagon] Shrink-wrap stack frame (Hexagon-specific) llvm-svn: 235603	2015-04-23 16:05:39 +00:00
Toma Tabacu	7fc89d2141	[mips] [IAS] Move NOP emission after pseudo-instruction expansion. NFC. As suggested in the review for http://reviews.llvm.org/D8537. llvm-svn: 235601	2015-04-23 14:48:38 +00:00
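
Commit 4f683c264a above describes multiplying i8 vector lanes by sign-extending each lane to i16, multiplying with pmullw, and packing the lower byte of each result. The following is a minimal scalar sketch of why that sequence yields the same bytes as a wrapping 8-bit multiply. It is not the LLVM lowering code; the function names are hypothetical, and lanes are modeled as plain Python integers holding two's-complement bit patterns.

```python
def sign_extend_i8_to_i16(bits8):
    """Return the 16-bit two's-complement pattern of an 8-bit pattern."""
    bits8 &= 0xFF
    # If the sign bit is set, fill the upper byte with ones.
    return bits8 | 0xFF00 if bits8 & 0x80 else bits8


def mul_i8_via_i16(a_bits, b_bits):
    """Multiply two i8 lanes through a 16-bit multiply (hypothetical helper)."""
    wide_a = sign_extend_i8_to_i16(a_bits)
    wide_b = sign_extend_i8_to_i16(b_bits)
    # A 16-bit multiply such as pmullw keeps only the low 16 bits of the product.
    product16 = (wide_a * wide_b) & 0xFFFF
    # Packing keeps the lower byte of each 16-bit result.
    return product16 & 0xFF


def mul_i8_reference(a_bits, b_bits):
    """Wrapping 8-bit multiply, used here as the reference result."""
    return (a_bits * b_bits) & 0xFF


def mul_i8_lanes(a_lanes, b_lanes):
    """Apply the widened multiply lane by lane, as a vector operation would."""
    return [mul_i8_via_i16(a, b) for a, b in zip(a_lanes, b_lanes)]


if __name__ == "__main__":
    # Exhaustive check over all 256 x 256 lane values.
    assert all(
        mul_i8_via_i16(a, b) == mul_i8_reference(a, b)
        for a in range(256)
        for b in range(256)
    )
    print(mul_i8_lanes([0x7F, 0xFF, 0x05], [0x02, 0xFF, 0x07]))
```

The widened product agrees with the narrow one because the extended operands are congruent to the original bytes modulo 256, so the low byte of the product is unaffected by the extension.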


32792 Commits