llvm-project

Commit Graph

Author	SHA1	Message	Date
Mark Searles	e4f067ebe2	[AMDGPU] Turn off MergeConsecutiveStores() before Instruction Selection for AMDGPU. Commit dbbb6c5fc3642987430866dffdf710df4f616ac7 turned on MergeConsecutiveStores() before Instruction Selection for all targets. Enough AMDGPU compiles go into an infinite loop ( MergeConsecutiveStores() merges two stores; LegalizeStoreOps() un-merges; MergeConsecutiveStores() re-merges, etc. ) to warrant turning it off until the issues can be addressed. Differential Revision: https://reviews.llvm.org/D41377 llvm-svn: 321100	2017-12-19 19:26:23 +00:00
Simon Pilgrim	7cabb4c384	[X86] Regenerate popcnt tests llvm-svn: 321093	2017-12-19 18:05:13 +00:00
Amara Emerson	b6ddbef673	[GlobalISel][Legalizer] Fix crash when trying to lower G_FNEG of fp128 types. This doesn't add legalizer support, just prevents crashing so that we can gracefully fall back to SDAG. Fixes PR35690. llvm-svn: 321091	2017-12-19 17:21:35 +00:00
Nirav Dave	51425fa5ba	[DAG] Elide overlapping store Summary: Extend overlapping store elision to handle overwrites of stores by larger stores. Nontemporal tests have been modified to add memory dependencies to prevent store elision. Reviewers: craig.topper, rnk, t.p.northover Subscribers: javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40969 llvm-svn: 321089	2017-12-19 17:10:56 +00:00
Simon Pilgrim	d873b6f6ba	[X86][AVX512] Attempt target shuffle combining to different types instead of early-out We try to prevent shuffle combining to value types that would stop the folding of masked operations, but by just returning early, we were failing to try different shuffle types. The TODOs are all still relevant here to improve codegen but we're lacking test examples. llvm-svn: 321085	2017-12-19 16:54:07 +00:00
Simon Pilgrim	fd5df639a3	[X86][SSE] Add cpu feature for aggressive combining to variable shuffles As mentioned in D38318 and D40865, modern Intel processors prefer to combine multiple shuffles to a variable shuffle mask (PSHUFB/VPERMPS etc.) instead of having multiple stage 'fixed' shuffles which put more pressure on Port 5 (at the expense of extra shuffle mask loads). This patch provides a FeatureFastVariableShuffle target flag for Haswell+ CPUs that prefers combining 2 or more fixed shuffles to a single variable shuffle (default is 3 shuffles). The long term aim is to drive more of this from schedule data (probably via the MC) but we're not close to being ready for that yet. Differential Revision: https://reviews.llvm.org/D41323 llvm-svn: 321074	2017-12-19 13:16:43 +00:00
David Green	110844d21c	[ARM] Register the Thumb2SizeReducePass. NFC Also adds a simple test case. llvm-svn: 321072	2017-12-19 12:19:08 +00:00
Simon Pilgrim	f6d4ab6daf	[X86][SSE] Use (V)PHMINPOSUW for vXi8 SMAX/SMIN/UMAX/UMIN horizontal reductions (PR32841) Extension to D39729 which performed this for vXi16, with the same bit flipping to handle SMAX/SMIN/UMAX cases, vXi8 UMIN horizontal reductions can be performed. This makes use of the fact that by performing a pair-wise i8 SHUFFLE/UMIN before PHMINPOSUW, we both get the UMIN of each pair but also zero-extend the upper bits ready for v8i16. Differential Revision: https://reviews.llvm.org/D41294 llvm-svn: 321070	2017-12-19 12:02:40 +00:00
Simon Dardis	1ade566c45	[mips] Handle the emission of microMIPSr6 sll instruction when used as a nop. This instruction is encoded as zero, so we have to handle that case when checking for unimplemented opcodes when producing the encoding for an instruction. llvm-svn: 321066	2017-12-19 11:16:22 +00:00
Craig Topper	13142b10d5	[X86] Don't extend v16i8 non-uniform shifts to v16i32 if we have BWI. Use v16i16 instead. BWI supports shifting by word amounts. Even if VLX isn't supported we can still widen to v32i16 and extract the lower half. For SKX it's preferable not to use 512-bit vectors if we can. llvm-svn: 321059	2017-12-19 06:59:10 +00:00
Justin Bogner	4314f3adc2	update_mir_test_checks: Accept IR as input as well as MIR We need to handle IR for tests that want to do lowering (or just -stop-after with IR as input). I've run this on one AArch64 test to demonstrate what it looks like. llvm-svn: 321048	2017-12-19 00:49:04 +00:00
Matthias Braun	e29c0b8862	TargetLoweringBase: Followup to r321035 I missed some prefixes and the fact that on AArch64 we use "bzero" instead of "__bzero" as on X86 when doing my refactoring in r321035. Improve tests for bzero. llvm-svn: 321046	2017-12-19 00:43:00 +00:00
Krzysztof Parzyszek	e704583f23	[Hexagon] Cache loads to select to avoid traversing mutating DAG llvm-svn: 321034	2017-12-18 23:13:27 +00:00
Evandro Menezes	687df6380e	[AArch64] Expand test coverage of vector element shuffling to Exynos Make sure that all test cases are run for Exynos as well. Otherwise, NFC. llvm-svn: 321032	2017-12-18 22:17:39 +00:00
Dimitry Andric	e4f5d01033	Fix more inconsistent line endings. NFC. llvm-svn: 321016	2017-12-18 19:46:56 +00:00
Jessica Paquette	02c124d644	[MachineOutliner] Recommit r320229 LR was undefined entering outlined functions that contain calls. This made the machine verifier unhappy when expensive checks were enabled. This fixes that. llvm-svn: 321014	2017-12-18 19:33:21 +00:00
Benjamin Kramer	efc7c88ea8	[PPC] Also disable the pre-emit version of reg+reg to reg+imm transformation. This has the same issue as the early pass disabled in r321010. llvm-svn: 321013	2017-12-18 19:21:56 +00:00
Krzysztof Parzyszek	6b589e593d	[Hexagon] Generate HVX code for vector sign-, zero- and any-extends Implement any-extend as zero-extend. llvm-svn: 321004	2017-12-18 18:32:27 +00:00
Simon Pilgrim	f947137ed0	[X86] Regenerate test to improve codegen testing for D41350 llvm-svn: 321003	2017-12-18 18:31:02 +00:00
Francis Visoiu Mistrih	b213b27ee3	[YAML] Add support for non-printable characters LLVM IR function names which disable mangling start with '\01' (https://www.llvm.org/docs/LangRef.html#identifiers). When an identifier like "\01@abc@" gets dumped to MIR, it is quoted, but only with single quotes. http://www.yaml.org/spec/1.2/spec.html#id2770814: "The allowed character range explicitly excludes the C0 control block #x0-#x1F (except for TAB #x9, LF #xA, and CR #xD which are allowed), DEL #x7F, the C1 control block #x80-#x9F (except for NEL #x85 which is allowed), the surrogate block #xD800-#xDFFF, #xFFFE, and #xFFFF." http://www.yaml.org/spec/1.2/spec.html#id2776092: "All non-printable characters must be escaped. [...] Note that escape sequences are only interpreted in double-quoted scalars." This patch adds support for printing escaped non-printable characters between double quotes if needed. Should also fix PR31743. Differential Revision: https://reviews.llvm.org/D41290 llvm-svn: 320996	2017-12-18 17:38:03 +00:00
Simon Dardis	fd8c65e868	Reland "[mips] Fix the target specific instruction verifier" Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. This version has the correct commit message. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D41183 llvm-svn: 320991	2017-12-18 15:56:40 +00:00
Sean Fertile	5fb624a3b8	[Memcpy Loop Lowering] Remove the fixed int8 lowering. Switch over to the lowering that uses target supplied operand types. Differential Revision: https://reviews.llvm.org/D41201 llvm-svn: 320989	2017-12-18 15:31:14 +00:00
Simon Dardis	f70af977af	Revert "[mips] Fix the target specific instruction verifier" This reverts commit r320974. The commit message lacked the Differential Revision: line. llvm-svn: 320975	2017-12-18 12:30:34 +00:00
Simon Dardis	c3c0d4590b	[mips] Fix the target specific instruction verifier Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. Reviewers: atanasyan https://reviews.llvm.org/D41183 llvm-svn: 320974	2017-12-18 12:24:17 +00:00
Tim Northover	9097a07e4e	AArch64: work around how Cyclone handles "movi.2d vD, #0". For Cyclone, the instruction "movi.2d vD, #0" is executed incorrectly in some rare circumstances. Work around the issue conservatively by avoiding the instruction entirely. This patch changes CodeGen so that problematic instructions are never generated, and the AsmParser so that an equivalent instruction is used (with a warning). llvm-svn: 320965	2017-12-18 10:36:00 +00:00
Sam Parker	fd967f2f7a	[ARM] Adjust test checks Correct the CHECK-LABELS of a couple of dag combine tests. llvm-svn: 320963	2017-12-18 10:08:03 +00:00
Sam Parker	00804efd72	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off, meaning that the 'and' isn't removed, but the load(s) are still narrowed. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320962	2017-12-18 10:04:27 +00:00
Sanjay Patel	9da049fa8a	[x86] add tests for finite libcall lowering (PR35672); NFC llvm-svn: 320955	2017-12-18 00:38:45 +00:00
Craig Topper	255a76d6d1	[X86] Add test cases that show cases where buildvector of extract and inserts should be turned into fmsubadd. This is a follow up to the fmaddsub support added in r320950. Hopefully in the future we can fix lowering to handle this fmsubadd too. llvm-svn: 320951	2017-12-17 18:31:36 +00:00
Craig Topper	fd8d040820	[X86] Make the code that creates fmaddsub from build_vector of extracts and inserts functional and add tests. Summary: We had no tests for this and we couldn't do the optimization because of a bad use count check. We need to know how many non-undef pieces of the build vector were filled in and ensure our use count is equal to that. But on the shuffle combine version we need the use count to be 2. The missing coverage was noticed during the review of D40335. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41133 llvm-svn: 320950	2017-12-17 18:23:45 +00:00
Simon Pilgrim	406d04a916	[X86] Regenerate truncated rotation tests + add missing 32-bit checks llvm-svn: 320949	2017-12-17 18:20:42 +00:00
Craig Topper	ee1e71e576	[X86] Use extract_vector_elt instead of X86ISD::VEXTRACT for isel of vXi1 extractions. llvm-svn: 320937	2017-12-17 01:35:48 +00:00
Craig Topper	c0c2d19e08	[X86] Canonicalize extract_vector_elt from vXi1 to always return MVT::i32. This allows us to remove some isel patterns that allowed MVT::i8 result type. llvm-svn: 320936	2017-12-17 01:35:47 +00:00
Simon Pilgrim	4c9e8215e9	[X86][AVX] lowerVectorShuffleAsBroadcast - aggressively peek through BITCASTs Assuming we can safely adjust the broadcast index for the new type to keep it suitably aligned, then peek through BITCASTs when looking for the broadcast source. Fixes PR32007 llvm-svn: 320933	2017-12-16 23:32:18 +00:00
Simon Pilgrim	f3b6da00f5	[X86][AVX] Fix failed broadcast fold Strip excess BITCASTs from EXTRACT_SUBVECTOR input llvm-svn: 320930	2017-12-16 22:57:17 +00:00
Sean Fertile	68d7f9da76	[Memcpy Loop Lowering] Only calculate residual size/bytes copied when needed. If the loop operand type is int8 then there will be no residual loop for the unknown size expansion. Don't create the residual-size and bytes-copied values when they are not needed. llvm-svn: 320929	2017-12-16 22:41:39 +00:00
Craig Topper	1260a4e826	[X86] When using vpopcntdq for ctpop of v8i16 vectors, only promote to v8i32. Previously we promoted to v8i64, but we don't need to go all the way to 512-bits. If we have VLX we can use the 256-bit instruction. And even if we don't have VLX we can widen v8i32 to v16i32 and drop the upper half. llvm-svn: 320926	2017-12-16 19:31:36 +00:00
Craig Topper	c08960597c	[X86] Add 128 and 256-bit VPOPCNTDQ instructions. Adjust some tablegen classes LZCNT/POPCNT. I think when this instruction was first published it was only for a Knights CPU and thus VLX version was missing. llvm-svn: 320910	2017-12-16 02:40:28 +00:00
Krzysztof Parzyszek	266d6f03a1	[Hexagon] Handle concat_vectors of all allowed HVX types llvm-svn: 320865	2017-12-15 21:23:12 +00:00
Krzysztof Parzyszek	29832a6c8b	[Hexagon] Fix operand-swapping PatFrag for atomic stores PatFrag now has the atomicity information stored as bit fields. They need to be copied to the new PatFrag. llvm-svn: 320855	2017-12-15 20:13:57 +00:00
Craig Topper	3fb8386685	[SelectionDAG][X86] Fix insert_vector_elt lowering for v32i1/v64i1 with non-constant index Summary: Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1. We also can't type legalize v64i1 insert_vector_elt correctly on KNL because the type is not byte addressable, as the legalizing-through-memory-accesses path requires. For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that. For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them. Reviewers: RKSimon, delena, spatel, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40942 llvm-svn: 320849	2017-12-15 19:35:22 +00:00
Sean Fertile	42b13343fd	[Memcpy Loop Lowering] Insert loop BB in between the split BB. The original memcpy expansion inserted the loop basic block in between the 2 new basic blocks created by splitting the original block the memcpy call was in. This commit makes the new memcpy expansion do the same to keep the layout of the IR matching between the old and new implementations. Differential Review: https://reviews.llvm.org/D41197 llvm-svn: 320848	2017-12-15 19:29:12 +00:00
Andrew V. Tischenko	22f0742dda	Fix for bug PR35549 - Repeated schedule comments. Differential Revision: https://reviews.llvm.org/D40960 llvm-svn: 320837	2017-12-15 18:13:05 +00:00
Francis Visoiu Mistrih	0b5bdceabf	[CodeGen] Print stack object references as %(fixed-)stack.0 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of `<fi#-4>` (supposing there are 4 fixed stack objects). Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41027 llvm-svn: 320827	2017-12-15 16:33:45 +00:00
Sam Parker	18b0d1e5b9	[ARM] Some DAG combine tests Add some more and and shift load combine tests. llvm-svn: 320822	2017-12-15 15:30:39 +00:00
Francis Visoiu Mistrih	5de20e039e	[MIR] Add support for missing CFI directives The following CFI directives are supported by MC but not by MIR: * .cfi_rel_offset * .cfi_adjust_cfa_offset * .cfi_escape * .cfi_remember_state * .cfi_restore_state * .cfi_undefined * .cfi_register * .cfi_window_save Add support for printing and parsing, and update tests. Differential Revision: https://reviews.llvm.org/D41230 llvm-svn: 320819	2017-12-15 15:17:18 +00:00
Simon Pilgrim	5009a1c738	[X86] Add RTM schedule tests llvm-svn: 320815	2017-12-15 14:37:28 +00:00
Simon Pilgrim	786431231f	[X86] Add MWAITX/MONITORX schedule tests llvm-svn: 320812	2017-12-15 14:22:15 +00:00
Simon Pilgrim	e662fa3752	[X86] Add XOP schedule tests llvm-svn: 320810	2017-12-15 14:02:35 +00:00
Simon Pilgrim	0c1e0dbb96	[X86] Add AVX512 VPOPCNTDQ schedule tests Demonstrates how to perform full coverage avx512 schedule tests llvm-svn: 320805	2017-12-15 11:32:31 +00:00
Alex Bradbury	59136ffab1	[RISCV] Enable emission of alias instructions by default This patch switches the default for -riscv-no-aliases to false and updates all affected MC and CodeGen tests. As recommended in D41071, MC tests use the canonical instructions and the CodeGen tests use the aliases. Additionally, for the f and d instructions with rounding mode, the tests for the aliased versions are moved and tightened such that they can actually detect if alias emission is enabled. (see D40902 for context) Differential Revision: https://reviews.llvm.org/D41225 Patch by Mario Werner. llvm-svn: 320797	2017-12-15 09:47:01 +00:00
Roger Ferrer Ibanez	9fcc4727ac	[ARM] Add tests for D34515 This is NFC and a preparatory step for D34515. Differential Revision: https://reviews.llvm.org/D41122 llvm-svn: 320795	2017-12-15 09:24:46 +00:00
Nemanja Ivanovic	6995e5dae7	[PowerPC] Convert r+r instructions to r+i (pre and post RA) This patch adds the necessary infrastructure to convert instructions that take two register operands to those that take a register and immediate if the necessary operand is produced by a load-immediate. Furthermore, it uses this infrastructure to perform such conversions twice - first at MachineSSA and then pre-emit. There are a number of reasons we may end up with opportunities for this transformation, including but not limited to: - X-Form instructions chosen since the exact offset isn't available at ISEL time - Atomic instructions with constant operands (we will add patterns for this in the future) - Tail duplication may duplicate code where one block contains this redundancy - When emitting compare-free code in PPCDAGToDAGISel, we don't handle constant comparands specially Furthermore, this patch moves the initialization of PPCMIPeepholePass so that it can be used for MIR tests. llvm-svn: 320791	2017-12-15 07:27:53 +00:00
Craig Topper	7cfacbf6ea	[X86] Fix a couple bugs in my recent changes to vXi1 insert_subvector lowering. A couple places didn't use the same SDValue variables to connect everything all the way through. I don't have a test case for a bug in insert into the lower bits of a non-zero, non-undef vector. Not sure the best way to create that. We don't create the case when lowering concat_vectors which is the main way to get insert_subvectors. llvm-svn: 320790	2017-12-15 07:16:41 +00:00
Yaxun Liu	c41e2f6e7b	Recommit CodeGen: Fix assertion in machine inst scheduler due to llvm.dbg.value The regression on ppc64 was not due to this commit. llvm-svn: 320788	2017-12-15 03:56:57 +00:00
Nemanja Ivanovic	0d47d32caa	Disabling r312514 as it causes miscompiles that show up on bootstrap The compare elimination peephole introduced in https://reviews.llvm.org/rL312514 causes a miscompile in AMDGPUInstrInfo.cpp which in turn causes some AMDGPU test case failures in stage2 bootstrap testing. This miscompile didn't cause any test case failures until https://reviews.llvm.org/rL320614, so it appeared as if that patch caused these failures. Disabling this transformation for now to bring the build bots back to green and the author of the patch will investigate the miscompile. llvm-svn: 320786	2017-12-15 01:38:03 +00:00
Saleem Abdulrasool	05e285bcc5	FastISel: support no-PLT PIC calls on ELF x86_64 Add support for properly handling PIC code with no-PLT. This equates to `-fpic -fno-plt -O0` with the clang frontend. External functions are marked with nonlazybind, which must then be indirected through the GOT. This allows code to be built without optimizations in PIC mode without going through the PLT. Addresses PR35653! llvm-svn: 320776	2017-12-15 00:32:09 +00:00
Sam Clegg	bafe69026d	[WebAssembly] Implement @llvm.global_ctors and @llvm.global_dtors Summary: - lowers @llvm.global_dtors by adding @llvm.global_ctors functions which register the destructors with `__cxa_atexit`. - implements @llvm.global_ctors with wasm start functions and linker metadata See [here](https://github.com/WebAssembly/tool-conventions/issues/25) for more background. Subscribers: jfb, dschuff, mgorny, jgravelle-google, aheejin, sunfish Differential Revision: https://reviews.llvm.org/D41211 llvm-svn: 320774	2017-12-15 00:17:10 +00:00
Krzysztof Parzyszek	470760533a	[Hexagon] Generate HVX code for comparisons and selects llvm-svn: 320744	2017-12-14 21:28:48 +00:00
Craig Topper	600f1ba333	[X86] Don't zero the upper bits of the k-register before extracting a single bit from a vXi1. This doesn't match the semantics of the extract_vector_elt operation. Nothing downstream knows the bits were zeroed so they still get masked or sign extended after the extract anyway. llvm-svn: 320723	2017-12-14 18:35:25 +00:00
Geoff Berry	dcc646e40b	[ARM] Fix isRenamable flag setting on expanded VSTMDIA opcode. Fixes expensive-check ARM buildbot failure. llvm-svn: 320718	2017-12-14 18:06:25 +00:00
Simon Dardis	12645285ed	[mips] Update some tests before posting a patch, NFC. llvm-svn: 320715	2017-12-14 16:42:04 +00:00
Yaxun Liu	f902ef0a5d	Revert CodeGen: Fix assertion in machine inst scheduler due to llvm.dbg.value This commit might have caused a regression on ppc64. Revert it to verify that. llvm-svn: 320712	2017-12-14 16:12:04 +00:00
Simon Dardis	e94fdd125f	[mips] Add partial support for R6 in the long branch pass MIPSR6 introduced several new jump instructions and deprecated the use of the 'j' instruction. For microMIPS32R6, 'j' was removed entirely and it only has non delay slot jumps. This patch adds support for MIPSR6 by using some R6 instructions-- 'bc' instead of 'j', 'jic $reg, 0' instead of 'jalr $zero, $reg'-- and modifies the sequences not to use delay slots for R6. Reviewers: atanasyan Reviewed By: atanasyan Subscribers: dschuff, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D40786 llvm-svn: 320703	2017-12-14 14:55:25 +00:00
Benjamin Kramer	a85822cb1e	Revert "[DAGCombine] Move AND nodes to multiple load leaves" This reverts commit r320679. Causes miscompiles. llvm-svn: 320698	2017-12-14 14:03:07 +00:00
Michael Zuckerman	19fd217eaa	[AVX512] Adding support for load truncate store of I1 A store operation on a truncated memory (load) of vXi1 is poorly supported by LLVM and most of the time ends with an assertion. This patch fixes this issue. Differential Revision: https://reviews.llvm.org/D39547 Change-Id: Ida5523dd09c1ad384acc0a27e9e59273d28cbdc9 llvm-svn: 320691	2017-12-14 11:55:50 +00:00
Simon Pilgrim	9f19fe51d2	[X86] Add FMA4 schedule tests llvm-svn: 320690	2017-12-14 11:40:54 +00:00
Simon Pilgrim	7424de2036	[X86] Add FMA3 schedule tests Rewrote to use inline asm for full coverage llvm-svn: 320689	2017-12-14 11:30:01 +00:00
Francis Visoiu Mistrih	5df3bbf3e6	[CodeGen] Print global addresses as @foo in both MIR and debug output Work towards the unification of MIR and debug output by printing `@foo` instead of `<ga:@foo>`. Also print target flags in the MIR format since most of them are used on global address operands. Only debug syntax is affected. llvm-svn: 320682	2017-12-14 10:03:09 +00:00
Francis Visoiu Mistrih	e76c5fcd70	[CodeGen] Print external symbols as $symbol in both MIR and debug output Work towards the unification of MIR and debug output by printing `$symbol` instead of `<es:symbol>`. Only debug syntax is affected. llvm-svn: 320681	2017-12-14 10:02:58 +00:00
Sam Parker	ef12b41ef7	[DAGCombine] Move AND nodes to multiple load leaves Recommitting rL319773, which was reverted due to a recursive issue causing timeouts. This happened because I failed to check whether the discovered loads could be narrowed further. In the case of a tree with one or more narrow loads that could not be further narrowed, as well as a node that would need masking, an AND could be introduced which could then be visited and recombined again with the same load. This could again create the masking load, which would be combined again... We now check that the load can be narrowed so that this process stops. Original commit message: Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off, meaning that the 'and' isn't removed, but the load(s) are still narrowed. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320679	2017-12-14 09:31:01 +00:00
Craig Topper	8cdf7c0e68	[X86] Make ANY_EXTEND from vXi1 Custom for more types. We should be able to support ANY_EXTEND for any types we support ZERO_EXTEND for. llvm-svn: 320675	2017-12-14 08:26:00 +00:00
Craig Topper	eab2d4665f	[SelectionDAG][X86] Improve legalization of v32i1 CONCAT_VECTORS of v16i1 for AVX512F. A v32i1 CONCAT_VECTORS of v16i1 uses promotion to v32i8 to legalize the v32i1. This results in a bunch of extract_vector_elts and a build_vector that ultimately gets scalarized. This patch checks to see if v16i8 is legal and inserts a any_extend to that so that we can concat v16i8 to v32i8 and avoid creating the extracts. llvm-svn: 320674	2017-12-14 08:25:58 +00:00
Craig Topper	cf77203ff6	[SelectionDAG] When legalizing the result type of CONCAT_VECTORS, take into account whether the input type also needs to be promoted. If so go ahead and get the promoted input vector to extract from. Previously, we would create a bunch of any_extends of extract_vector_elts with illegal input type that needs to be promoted. The legalization of those extract_vector_elts would then potentially introduce a truncate. So now we have a bunch of any_extends of truncates. By legalizing both parts together we avoid creating these extra nodes. The test changes seem to be because we were previously combining the build_vector with the any_extend before the any_extend got combined with the truncate. llvm-svn: 320669	2017-12-14 06:49:07 +00:00
Simon Pilgrim	5af7a6ddf2	[X86] Add missing MULX32 schedule test llvm-svn: 320651	2017-12-13 22:43:55 +00:00
Yaxun Liu	a5315a040d	CodeGen: Fix assertion in machine inst scheduler due to llvm.dbg.value Two issues were found in the machine inst scheduler when compiling ProRender with -g for the amdgcn target: GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it should not since DBG_VALUE is not mapped in LiveIntervals. When DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and ScheduleDAGMILive::scheduleMI do not move RPTracker properly, which causes an assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D41132 llvm-svn: 320650	2017-12-13 22:38:09 +00:00
Simon Pilgrim	49dbfe7de9	[X86] Add CLWB schedule test llvm-svn: 320644	2017-12-13 22:09:09 +00:00
Simon Pilgrim	14318c5b31	[X86] Move ADX schedule tests out of schedule-x86_64.ll llvm-svn: 320637	2017-12-13 21:49:09 +00:00
Adrian Prantl	46af7316ea	Ignore metainstructions during the shrink wrap analysis Shrink wrapping should ignore DBG_VALUEs referring to frame indices, since the presence of debug information must not affect code generation. Differential Revision: https://reviews.llvm.org/D41187 llvm-svn: 320606	2017-12-13 19:10:54 +00:00
Simon Pilgrim	f02a39c371	[X86] Add JCC/JECXZ/JECXZ/JRCXZ/LOOP schedule tests llvm-svn: 320603	2017-12-13 18:09:45 +00:00
Amaury Sechet	a402e51428	Regenerate test-shrink.ll test results. NFC llvm-svn: 320602	2017-12-13 18:04:57 +00:00
Simon Pilgrim	542a711806	[X86] Add RET/RETF schedule tests llvm-svn: 320600	2017-12-13 17:50:40 +00:00
Simon Pilgrim	c1bd968c8c	[X86] Add POP/PUSH schedule tests llvm-svn: 320598	2017-12-13 17:42:25 +00:00
Galina Kistanova	9dee3f0a97	Reverted r320229. It broke tests on builder llvm-clang-x86_64-expensive-checks-win. llvm-svn: 320588	2017-12-13 15:26:27 +00:00
Simon Pilgrim	0bd31a8360	[X86] Add PREFETCH schedule tests llvm-svn: 320587	2017-12-13 15:12:02 +00:00
Simon Pilgrim	1df18ee3fc	[X86] Add XCHG schedule tests llvm-svn: 320586	2017-12-13 15:02:10 +00:00
Simon Pilgrim	9d9f170172	[X86] Add MOVNTI schedule tests llvm-svn: 320585	2017-12-13 14:51:06 +00:00
Nemanja Ivanovic	6f590bf8bb	[PowerPC] MachineSSA pass to reduce the number of CR-logical operations The initial implementation of an MI SSA pass to reduce cr-logical operations. Currently, the only operations handled by the pass are binary operations where both CR-inputs come from the same block and the single use is a conditional branch (also in the same block). Committing this off by default to allow for a period of field testing. Will enable it by default in a follow-up patch soon. Differential Revision: https://reviews.llvm.org/D30431 llvm-svn: 320584	2017-12-13 14:47:35 +00:00
Simon Pilgrim	88e6f83f9e	[X86] Add ENTER/LEAVE schedule tests llvm-svn: 320583	2017-12-13 14:46:33 +00:00
Simon Pilgrim	cef5b64fdb	[X86] Add IMUL schedule tests llvm-svn: 320582	2017-12-13 14:24:04 +00:00
Simon Pilgrim	f00ea1b4cd	[X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests Add missing RDTSCP itinerary llvm-svn: 320581	2017-12-13 14:22:04 +00:00
Simon Pilgrim	46ec195d19	[X86] Add ARPL/BOUND schedule tests llvm-svn: 320580	2017-12-13 13:54:45 +00:00
Simon Pilgrim	f51f4d3623	[X86][SSE] MOVMSK only uses the sign bit from each vector element Pass the input vector through SimplifyDemandedBits as we only need the sign bit from each vector element of MOVMSK We'd probably get more hits if SimplifyDemandedBits was better at handling vectors... Differential Revision: https://reviews.llvm.org/D41119 llvm-svn: 320570	2017-12-13 11:43:14 +00:00
Roger Ferrer Ibanez	e8d4e88bab	[DAG] Promote ADDCARRY / SUBCARRY Add missing case that was not implemented yet. Differential Revision: https://reviews.llvm.org/D38942 llvm-svn: 320567	2017-12-13 10:45:21 +00:00
Francis Visoiu Mistrih	b41dbbe325	[CodeGen] Print jump-table index operands as %jump-table.0 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%jump-table.0` instead of `<jt#0>`. Only debug syntax is affected. llvm-svn: 320566	2017-12-13 10:30:59 +00:00
Francis Visoiu Mistrih	26ae8a6582	[CodeGen] Print constant pool index operands as %const.0 + 8 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%const.0 + 8` instead of `<cp#0+8>` and `%const.0 - 8` instead of `<cp#0-8>`. Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41116 llvm-svn: 320564	2017-12-13 10:30:45 +00:00
Stefan Maksimovic	0a075d68ec	[mips] Provide additional DSP bitconvert patterns Previously, v2i16 -> f32 bitcast could not be matched. Add patterns to support matching this and similar types of bitcasts. Differential revision: https://reviews.llvm.org/D40959 llvm-svn: 320562	2017-12-13 10:13:35 +00:00
Serguei Katkov	ac4a8fb1cd	Revert "[CGP] Enable select in complex addr mode" Causes: Assertion `ScaledReg == nullptr' failed. This is actually a revert of rL320551. llvm-svn: 320553	2017-12-13 07:39:35 +00:00
Serguei Katkov	b8cb5da28d	[CGP] Enable select in complex addr mode Enable select instruction handling in complex addr modes. Reviewers: john.brawn, reames, aaboud Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40634 llvm-svn: 320551	2017-12-13 06:57:59 +00:00
Krzysztof Parzyszek	2eda05db87	[Hexagon] Relax some checks in testcases, NFC llvm-svn: 320529	2017-12-12 21:44:04 +00:00
Krzysztof Parzyszek	edcd9dcbc4	[Hexagon] Better detection of identity and undef masks in shuffles llvm-svn: 320523	2017-12-12 20:23:12 +00:00
Krzysztof Parzyszek	40a605f1be	[Hexagon] Fix wrong order of operands for vmux Shuffle generation uses vmux to collapse vectors resulting from two individual shuffles into one. The indexes of the elements selected from the first operand were indicated by 0xFF in the constant vector used in the compare instruction, but the compare (veqb) set the bits corresponding to the 0x00 elements, thus inverting the selection. Reverse the order of operands to vmux to get the correct output. llvm-svn: 320516	2017-12-12 19:32:41 +00:00
Sanjoy Das	1074eb225b	Reapply "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320508, in effect re-applying r320308. Simon has already reverted the parts that caused the crash that motivated the revert in r320492. llvm-svn: 320512	2017-12-12 19:11:31 +00:00
Sanjoy Das	81a4a02cbc	Revert "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320308. r320308 crashes LLC, please see the llvm-commits thread for a reproducer. llvm-svn: 320508	2017-12-12 18:40:58 +00:00
Nirav Dave	674d053d18	[X86] Cleanup type conversion of 64-bit load-store pairs. Summary: Simplify and generalize chain handling and search for 64-bit load-store pairs. Nontemporal test now converts 64-bit integer load-store into f64 which it realizes directly instead of splitting into two i32 pairs. Reviewers: craig.topper, spatel Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40918 llvm-svn: 320505	2017-12-12 18:25:48 +00:00
Geoff Berry	60c431022e	[MachineOperand][MIR] Add isRenamable to MachineOperand. Summary: Add isRenamable() predicate to MachineOperand. This predicate can be used by machine passes after register allocation to determine whether it is safe to rename a given register operand. Register operands that aren't marked as renamable may be required to be assigned their current register to satisfy constraints that are not captured by the machine IR (e.g. ABI or ISA constraints). Reviewers: qcolombet, MatzeB, hfinkel Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39400 llvm-svn: 320503	2017-12-12 17:53:59 +00:00
Simon Pilgrim	68f9accf51	[X86] Remove CompleteModel tags from CPU targets until we have better error checking (PR35636) The checks we have for complete models are not great and miss many cases - e.g. in PR35636 it failed to recognise that only the first output (of 2) was actually tagged by the InstRW Raised PR35639 and PR35643 as examples llvm-svn: 320492	2017-12-12 16:12:53 +00:00
Ayman Musa	c2eed926b0	[X86] Recognize constant arrays with special values and replace loads from it with subtract and shift instructions, which then will be replaced by X86 BZHI machine instruction. Recognize constant arrays with the following values: 0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, .... , 2^(size - 1) -1 where size is the size of the array. The result of a load with index idx from this array is equivalent to the result of the following: (0xFFFFFFFF >> (sub 32, idx)) (assuming the array is of type 32-bit integer). And the result of an 'AND' operation on the returned value of such a load and another input is exactly equivalent to the X86 BZHI instruction behavior. See test cases in the LIT test for better understanding. Differential Revision: https://reviews.llvm.org/D34141 llvm-svn: 320481	2017-12-12 14:13:51 +00:00
Nemanja Ivanovic	b0783cccb7	[PowerPC] Follow-up to r318436 to get the missed CSE opportunities The last of the three patches that https://reviews.llvm.org/D40348 was broken up into. Canonicalize the materialization of constants so that they are more likely to be CSE'd regardless of the bit-width of the use. If a constant can be materialized using PPC::LI, materialize it the same way always. For example: li 4, -1 li 4, 255 li 4, 65535 are equivalent if the uses only use the low byte. Canonicalize it to the first form. Differential Revision: https://reviews.llvm.org/D40348 llvm-svn: 320473	2017-12-12 12:09:34 +00:00
Craig Topper	468a813315	[X86] Use Ld scheduler classes for instructions with folded loads. llvm-svn: 320459	2017-12-12 07:06:35 +00:00
Craig Topper	c1e72c019d	[X86] Correct the FMA3 regular expressions in the znver1 scheduler model. llvm-svn: 320458	2017-12-12 07:06:32 +00:00
Richard Trieu	efef032f02	Revert r318704 - [Sparc] efficient pattern for UINT_TO_FP conversion See bug https://bugs.llvm.org/show_bug.cgi?id=35631 r318704 is giving a fatal error on some code with unsigned to floating point conversions. llvm-svn: 320429	2017-12-11 22:25:04 +00:00
Matt Arsenault	3e268cc0dd	LSR: Check more intrinsic pointer operands llvm-svn: 320424	2017-12-11 21:38:43 +00:00
Tony Jiang	3b49dc548f	[PowerPC] Partially enable the ISEL expansion pass. The pass to expand ISEL instructions into if-then-else sequences in patch D23630 is currently disabled. This patch partially enables it by always removing the unnecessary ISELs (all registers used by the ISELs are the same one) and folding the ISELs which have the same input registers into unconditional copies. Differential Revision: https://reviews.llvm.org/D40497 llvm-svn: 320414	2017-12-11 20:42:37 +00:00
Krzysztof Parzyszek	a8ab1b75cb	[Hexagon] Add support for Hexagon V65 llvm-svn: 320404	2017-12-11 18:57:54 +00:00
Simon Pilgrim	e83876e31d	[X86] Add LODS schedule tests llvm-svn: 320403	2017-12-11 18:39:42 +00:00
Simon Pilgrim	e8715025f5	[X86] Add CMP/TEST schedule tests llvm-svn: 320402	2017-12-11 18:32:59 +00:00
Simon Pilgrim	5512525c5d	[X86] Add AND/OR/XOR schedule tests llvm-svn: 320400	2017-12-11 18:23:24 +00:00
Simon Pilgrim	9b2a5e1e0b	[X86] Add ADD/SUB schedule tests llvm-svn: 320397	2017-12-11 18:13:40 +00:00
Simon Pilgrim	dbe6c45fcd	[X86] Add ADC/SBB schedule tests llvm-svn: 320395	2017-12-11 17:59:05 +00:00
Simon Pilgrim	8c2d90a2f4	[X86] Add MOVSLQ schedule tests llvm-svn: 320392	2017-12-11 17:37:08 +00:00
Amara Emerson	df9b529d42	[GlobalISel] Disable GISel for big endian. This is due to PR26161 needing to be resolved before we can fix big endian bugs like PR35359. The work to split aggregates into smaller LLTs instead of using one large scalar will take some time, so in the meantime we'll fall back to SDAG. Some ARM BE tests xfailed for now as a result. Differential Revision: https://reviews.llvm.org/D40789 llvm-svn: 320388	2017-12-11 16:58:29 +00:00
Simon Pilgrim	fabe354b42	[X86] Add LWP schedule tests Tag LWP instructions as WriteSystem llvm-svn: 320387	2017-12-11 16:47:21 +00:00
Simon Pilgrim	67644be692	[X86] Add INT/INTO schedule tests llvm-svn: 320386	2017-12-11 16:32:58 +00:00
Simon Pilgrim	1fe82016a2	[X86] Add IN/OUT schedule tests llvm-svn: 320385	2017-12-11 16:16:40 +00:00
Simon Pilgrim	d0ce975528	[X86] Add IDIV schedule tests llvm-svn: 320384	2017-12-11 16:08:21 +00:00
Simon Pilgrim	6c29962f2e	[X86] Add CMPXCHG schedule tests llvm-svn: 320383	2017-12-11 16:04:08 +00:00
Simon Pilgrim	1c83cd18ae	[X86] Add CLZERO schedule test llvm-svn: 320382	2017-12-11 15:53:12 +00:00
Simon Pilgrim	d9d37f8c3c	[X86] Add ADCX/ADOX/XADD/XLAT schedule tests llvm-svn: 320380	2017-12-11 15:41:52 +00:00
Nirav Dave	e830b758b8	[X86] Modify Nontemporal tests to avoid deadstore optimization. llvm-svn: 320379	2017-12-11 15:35:40 +00:00
Simon Pilgrim	4f2c415a13	[X86] Add SETCC/STC/STD/UD2 schedule tests llvm-svn: 320376	2017-12-11 15:25:31 +00:00
Sanjay Patel	f3436d7dab	[DAGCombiner] protect against an infinite loop between shl <--> mul (PR35579) At first, I tried to thread the x86 needle and use a target hook (isVectorShiftByScalarCheap()) to disable the transform only for non-splat pow-of-2 constants, but not AVX2, but only some element types, but...it's difficult. Here we just avoid the loop with the x86 vector transform that conflicts with the general DAG combine and preserve all of the existing behavior AFAICT otherwise. Some tests that will probably fail if someone does try to restrict this in a more targeted way for x86-only may be found in: test/CodeGen/X86/combine-mul.ll test/CodeGen/X86/vector-mul.ll test/CodeGen/X86/widen_arith-5.ll This should prevent the infinite looping seen with: https://bugs.llvm.org/show_bug.cgi?id=35579 Differential Revision: https://reviews.llvm.org/D41040 llvm-svn: 320374	2017-12-11 15:19:31 +00:00
Simon Pilgrim	5154d249a8	[X86] Add SAR/SHL/SHR schedule tests llvm-svn: 320371	2017-12-11 14:56:44 +00:00
Simon Pilgrim	426add6915	[X86] Add RCL/RCR schedule tests llvm-svn: 320370	2017-12-11 14:46:42 +00:00
Krzysztof Parzyszek	152414595b	[Hexagon] Crash in instruction selection for insert_vector_elt for HVX A wrong type was passed to insertVector, causing an out-of-bounds value to be added as an operand to HexagonISD::INSERT. This later failed in instruction selection. llvm-svn: 320369	2017-12-11 14:46:06 +00:00
Nemanja Ivanovic	50d37a1129	[PowerPC] Sign-extend negative constant stores Second part of https://reviews.llvm.org/D40348. Revision r318436 has extended all constants feeding a store to 64 bits to allow for CSE on the SDAG. However, negative constants were zero extended which made the constant being loaded appear to be a positive value larger than 16 bits. This resulted in long sequences to materialize such constants rather than simply a "load immediate". This patch just sign-extends those updated constants so that they remain 16-bit signed immediates if they started out that way. llvm-svn: 320368	2017-12-11 14:35:48 +00:00
Diana Picus	291e8d924f	[ARM GlobalISel] Add test for a MOVTi16 pattern. NFC Add test for matching an OR with 0xFFFF0000 to a MOVTi16. llvm-svn: 320362	2017-12-11 13:28:45 +00:00
Simon Pilgrim	969850f514	[X86] Add fsgsbase schedule tests. llvm-svn: 320361	2017-12-11 13:25:02 +00:00
Alex Bradbury	dc31c61b18	[RISCV] Add custom CC_RISCV calling convention and improved call support The TableGen-based calling convention definitions are inflexible, while writing a function to implement the calling convention is very straight-forward, and allows difficult cases to be handled more easily. This patch adds support for: * Passing large scalars according to the RV32I calling convention * Byval arguments * Passing values on the stack when the argument registers are exhausted The custom CC_RISCV calling convention is also used for returns. This patch also documents the ABI lowering that a language frontend is expected to perform. I would like to work to simplify these requirements over time, but this will require further discussion within the LLVM community. We add PendingArgFlags CCState, as a companion to PendingLocs. The PendingLocs vector is used by a number of backends to handle arguments that are split during legalisation. However CCValAssign doesn't keep track of the original argument alignment. Therefore, add a PendingArgFlags vector which can be used to keep track of the ISD::ArgFlagsTy for every value added to PendingLocs. Differential Revision: https://reviews.llvm.org/D39898 llvm-svn: 320359	2017-12-11 12:49:02 +00:00
Alex Bradbury	bfb00d4c1c	[RISCV] Allow lowering of dynamic_stackalloc, stacksave, stackrestore llvm-svn: 320358	2017-12-11 12:38:17 +00:00
Alex Bradbury	b014e3de52	[RISCV] Implement prolog and epilog insertion As frame pointer elimination isn't implemented until a later patch and we make extensive use of update_llc_test_checks.py, this change touches a lot of the RISC-V tests. Differential Revision: https://reviews.llvm.org/D39849 llvm-svn: 320357	2017-12-11 12:34:11 +00:00
Simon Pilgrim	220b1c13bf	[X86] Regenerate fsgsbase intrinsic tests. NFCI. llvm-svn: 320356	2017-12-11 12:22:15 +00:00
Roger Ferrer Ibanez	5ea0f2501f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 - fixes PR35103 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 320355	2017-12-11 12:13:45 +00:00
Alex Bradbury	660bcceccf	[RISCV] Support lowering FrameIndex Introduces the AddrFI "addressing mode", which is necessary simply because it's not possible to write a pattern that directly matches a frameindex. Ensure callee-saved registers are accessed relative to the stackpointer. This is necessary as callee-saved register spills are performed before the frame pointer is set. Move HexagonDAGToDAGISel::isOrEquivalentToAdd to SelectionDAGISel, so we can make use of it in the RISC-V backend. Differential Revision: https://reviews.llvm.org/D39848 llvm-svn: 320353	2017-12-11 11:53:54 +00:00
Diana Picus	775bb74379	[ARM GlobalISel] Add tests for PKHBT and PKHTB Test (some of) the patterns for selecting PKHBT and PKHTB. The others are just very similar to the ones we're testing and there would be little value in covering them as well. llvm-svn: 320352	2017-12-11 11:44:23 +00:00
Aleksandar Beserminji	d6dada17ff	[mips] Removal of microMIPS64R6 All files and parts of files related to microMIPS64R6 are removed. When target is microMIPS64R6, errors are printed. This is LLVM part of patch. Differential Revision: https://reviews.llvm.org/D35625 llvm-svn: 320350	2017-12-11 11:21:40 +00:00
Craig Topper	ad45bf5895	[DAGCombiner] Support folding (mulhs/u X, 0)->0 for vectors. We should probably also fold (mulhs/u X, 1) for vectors, but that's harder. llvm-svn: 320344	2017-12-11 08:33:20 +00:00
Craig Topper	0bea09b737	[X86] Regenerate test with update_llc_test_checks.py llvm-svn: 320342	2017-12-11 06:16:26 +00:00
Craig Topper	1e83485613	[X86] Add a test case for masked scatter where the index needs to be legalized from v2i32 while other types are legal. llvm-svn: 320340	2017-12-11 01:48:10 +00:00
Simon Pilgrim	6b1f532ccf	[X86] Add ROL/ROR schedule tests llvm-svn: 320334	2017-12-10 22:11:56 +00:00
Simon Pilgrim	a6564e2358	[X86] Add DIV/MUL/NEG/NOP/NOT/PAUSE schedule tests llvm-svn: 320333	2017-12-10 21:56:24 +00:00
Simon Pilgrim	8e6d0fcbac	[X86] Add DEC/INC schedule tests Include i686 (non-REX) variant tests as well llvm-svn: 320332	2017-12-10 21:28:00 +00:00
Simon Pilgrim	f1c51d187a	[X86] Add INS/OUTS schedule tests llvm-svn: 320331	2017-12-10 21:10:28 +00:00
Simon Pilgrim	07ebbd53f0	[X86] Add CMPS/MOVS/SCAS/STOS schedule tests llvm-svn: 320330	2017-12-10 20:58:22 +00:00
Simon Pilgrim	f65831d731	[X86] Add CMOV schedule tests llvm-svn: 320329	2017-12-10 20:46:57 +00:00
Simon Pilgrim	4a431edddc	[X86] Add BT/BTC/BTR/BTS schedule tests llvm-svn: 320328	2017-12-10 20:22:47 +00:00
Craig Topper	a0be5a06c1	[X86] Rename some instructions that start with Int_ to have the _Int at the end. This matches AVX512 version and is more consistent overall. And improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325	2017-12-10 19:47:56 +00:00
Simon Pilgrim	c493d4f5b9	[X86][X87] Fix typo in znver1 FIST/FISTT schedule patterns llvm-svn: 320322	2017-12-10 19:19:22 +00:00
Simon Pilgrim	930e435937	[X86][X87] Add missing x87 scheduler tests Split off some 'n' instruction versions to make it clearer when WAIT is being inserted llvm-svn: 320321	2017-12-10 18:53:15 +00:00
Craig Topper	1de942b2d1	[X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them. Some of the scheduling information was only present for the 'rb' version and not the 'rr' version. Now we match 'rr(b?)' llvm-svn: 320320	2017-12-10 17:42:44 +00:00
Craig Topper	c7445f2cdc	[X86] Add VCVTQQ2PS to the skylake server scheduler models. llvm-svn: 320319	2017-12-10 17:42:43 +00:00
Simon Pilgrim	1f8cfba0bb	[X86] Flag BroadWell scheduler model as complete Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder. llvm-svn: 320308	2017-12-10 13:49:51 +00:00
Simon Pilgrim	4ff43d8120	Regenerate some AVX2+ scheduling tests that got missed llvm-svn: 320307	2017-12-10 13:41:29 +00:00
Simon Pilgrim	af35b76bda	Regenerate some scheduling tests that got missed llvm-svn: 320305	2017-12-10 12:59:55 +00:00
Craig Topper	253562eb81	[X86] Fix duplicate entries in skylake server scheduler model by changing Z128 to Z256 Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value. llvm-svn: 320295	2017-12-10 09:14:45 +00:00
Craig Topper	90c9c15936	[X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294	2017-12-10 09:14:44 +00:00
Craig Topper	4e57776fb2	[X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int llvm-svn: 320282	2017-12-10 03:16:38 +00:00
Craig Topper	29868dcbaa	[X86] Fix test case I failed to update in r320279. llvm-svn: 320280	2017-12-10 01:27:54 +00:00
Craig Topper	391c6f9507	[X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions. If the question mark is inside the parentheses it only applies to the single character preceding it. I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this. llvm-svn: 320279	2017-12-10 01:24:08 +00:00
Joel Jones	5cc21e83ce	[AArch64] Improve loop unrolling performance on Cavium T99 This patch improves performance on Cavium T99 as shown here (libquantum 0.2.4): https://docs.google.com/spreadsheets/d/1Lo1o2E1NjrpkwS7DvYYWsiVvPdd93h7KBaqeptMrZPY/edit?usp=sharing By increasing the LoopMicroOpsBufferSize in the Cavium T99 Scheduler file, loop unrolling becomes more aggressive. This helps performance on T99. Test case included. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D40695 llvm-svn: 320272	2017-12-09 23:59:55 +00:00
Craig Topper	f4e3044db9	[X86] Use KMOV instructions to zero upper bits of vectors when possible. llvm-svn: 320268	2017-12-09 23:10:59 +00:00
Craig Topper	5ac75d5628	[X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits. This can be better recognized during isel when the producer already zeroed the upper bits. llvm-svn: 320267	2017-12-09 22:44:42 +00:00
Craig Topper	504534514c	[X86] Don't use getTargetConstant for all 0s and all 1s mask vector. llvm-svn: 320260	2017-12-09 19:18:30 +00:00
Craig Topper	6504a8f888	[X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector. We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros. The test cases affected actually show cases where we should move the shifts all together, but that's another problem. llvm-svn: 320248	2017-12-09 08:19:07 +00:00
Dylan McKay	f7e8ec1348	[AVR] Fix two CodeGen tests These were broken because of various printing format changes. llvm-svn: 320246	2017-12-09 07:51:43 +00:00
Craig Topper	b3e14ce90c	[X86] Improve lowering of concats of mask vectors to better optimize zero vector inputs. We were previously using kunpck with zero inputs unnecessarily. And we had cases where we would insert into a zero vector and then insert into larger zero vector incurring two sets of shifts. llvm-svn: 320244	2017-12-09 07:02:19 +00:00
Dylan McKay	80463fe64d	Relax unaligned access assertion when type is byte aligned Summary: This relaxes an assertion inside SelectionDAGBuilder which is overly restrictive on targets which have no concept of alignment (such as AVR). In these architectures, all types are aligned to 8-bits. After this, LLVM will only assert that accesses are aligned on targets which actually require alignment. This patch follows from a discussion on llvm-dev a few months ago http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html Reviewers: bogner, nemanjai, joerg, efriedma Reviewed By: efriedma Subscribers: efriedma, cactus, llvm-commits Differential Revision: https://reviews.llvm.org/D39946 llvm-svn: 320243	2017-12-09 06:45:36 +00:00
Jessica Paquette	a249c4f513	[MachineOutliner] Outline calls The outliner previously would never outline calls. Calls are pretty common in files, so it makes sense to outline them. In fact, in the LLVM test suite, if you count the number of instructions that the outliner misses when you outline calls vs when you don't, it turns out that, on average, around 6% of the instructions encountered are calls. So, if we outline calls, we can find more candidates, and thus save some more space. This commit adds that functionality and updates the mir test to reflect that. llvm-svn: 320229	2017-12-09 00:43:49 +00:00
Paul Robinson	8bd9d6ad83	Fix out-of-order stepping behavior in programs with sunk instructions. MachineSink attempts to place instructions near the basic blocks where they are needed. Once an instruction has been sunk, its location relative to other instructions no longer is consistent with the original source code. In order to ensure correct stepping in the debugger, the debug location for sunk instructions is either merged with the insertion point or erased if the target successor block is empty. Originally submitted as r318679, revised to fix sanitizer failure and improve testing. Patch by Matthew Voss! Differential Revision: https://reviews.llvm.org/D39933 llvm-svn: 320216	2017-12-09 00:17:01 +00:00
Dan Gohman	3a762bf9df	[WebAssembly] Reapply r319186: "Support bitcasted function addresses with varargs." This puts the functionality under control of a command-line option which is off by default to avoid breaking existing setups. llvm-svn: 320197	2017-12-08 21:27:00 +00:00
Dan Gohman	6736f59078	[WebAssemby] Re-apply r320041: "Support main functions with alternate signatures." This includes a fix so that it doesn't transform declarations, and it puts the functionality under control of a command-line option which is off by default to avoid breaking existing setups. llvm-svn: 320196	2017-12-08 21:18:21 +00:00
Konstantin Zhuravlyov	c40d9f2e5d	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194	2017-12-08 20:52:28 +00:00
Craig Topper	7f0d456ef8	[X86] Teach lowering to only let through (insert_subvector (vXi1 zeros), subvec, 0) for vector sizes that have native KSHIFT support. For narrow sizes we'll widen the zero vector and widen the insert. Then do an extract_subvector to get back down to correct size. This allows us to remove some patterns from the isel table that had to COPY_TO_REGCLASS to an oversized register, do the shift and then COPY_TO_REGCLASS back to the narrow register. Now this is represented explicitly in the DAG. This seems to have perturbed the register allocation in one of the tests, but the number of instructions didn't change. llvm-svn: 320190	2017-12-08 20:10:33 +00:00
Simon Pilgrim	6415f56c79	[X86][X87] Tag x87 float compare instructions scheduler classes llvm-svn: 320189	2017-12-08 20:10:31 +00:00
Matt Arsenault	856777d8c9	AMDGPU: image_getlod and image_getresinfo do not read memory llvm-svn: 320187	2017-12-08 20:00:57 +00:00
Konstantin Zhuravlyov	e30f88f3a9	AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is not available Differential Revision: https://reviews.llvm.org/D40924 llvm-svn: 320176	2017-12-08 19:22:12 +00:00
Simon Pilgrim	19d460b066	[X86][SHA] Tag SHA instructions scheduler classes Put these under VecIMul itinerary classes for now - seems to be a good average value llvm-svn: 320161	2017-12-08 16:38:41 +00:00
Gadi Haber	2cf601f28f	[X86][Haswell]: Updating the scheduling information for the Haswell subtarget. Updated the scheduling information for the Haswell subtarget with the following changes: Regrouped the instructions after adding appropriate load + store latencies. Added scheduling for missing instructions such as the GATHER instrs. The changes were made after revisiting the latencies impact of all memory uOps. Reviewers: RKSimon, zvi, craig.topper, apilipenko Differential Revision: https://reviews.llvm.org/D40021 Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7 llvm-svn: 320137	2017-12-08 09:48:44 +00:00
Abderrazek Zaafrani	2c80e4c7c3	[AArch64] Avoid SIMD interleaved store instruction for Exynos. Replace interleaved store instructions by equivalent and more efficient instructions based on latency cost model. Https://reviews.llvm.org/D38196 llvm-svn: 320123	2017-12-08 00:58:49 +00:00
Derek Schuff	9e1baeda74	Revert "[WebAssemby] Support main functions with alternate signatures." This reverts commit 959e37e669b0c3cfad4cb9f1f7c9261ce9f5e9ae. That commit doesn't handle the case where main is declared rather than defined, in particular the even-more special case where main is a prototypeless declaration (which is of course the one actually used by musl currently). llvm-svn: 320121	2017-12-08 00:39:54 +00:00
Craig Topper	323ba39f10	[X86] Handle all versions of vXi1 insert_vector_elt with a constant index without falling back to shuffles. We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert. This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor leaving the new value in the position. The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift. llvm-svn: 320120	2017-12-08 00:16:09 +00:00
Eric Christopher	a469acac03	Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up. This reverts commit r319218. llvm-svn: 320106	2017-12-07 22:26:19 +00:00
Jessica Paquette	59948666fb	[MachineOutliner] Fix offset overflow check The offset overflow check before was incorrect. It would always give the correct result, but it was comparing the SCALED potential fixed-up offset against an UNSCALED minimum/maximum. As a result, the outliner was missing a bunch of frame setup/destroy instructions that ought to have been safe to outline. This fixes that, and adds an instruction to the .mir test that failed the old test. llvm-svn: 320090	2017-12-07 21:51:43 +00:00
Mark Searles	9ebdbb433a	[AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output." Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio : lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field] int32_t InstCnt = 0; ^ 1 error generated. " This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869. llvm-svn: 320086	2017-12-07 21:14:41 +00:00
Mark Searles	a84d23489a	[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output. -amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs Differential Revision: https://reviews.llvm.org/D40091 llvm-svn: 320084	2017-12-07 20:36:39 +00:00
Mark Searles	d29f24acfb	[AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards(). Differential Revision: https://reviews.llvm.org/D40098 llvm-svn: 320083	2017-12-07 20:34:25 +00:00
Craig Topper	dfc79c7c33	[X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed. There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVXDQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right. This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift. I've constructed this in a way that doesn't do the promotion if we're going to fall back to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fall back entirely. In which case this can be simplified, but I wanted to fix the correctness issue first. llvm-svn: 320081	2017-12-07 20:10:04 +00:00
Simon Pilgrim	386b23f1fa	[X86] Tag BMI/BMI2/TBM instructions scheduler classes Put these under UNARY/BINOP ALU itinerary classes for now - seems to be a good average value llvm-svn: 320064	2017-12-07 17:37:39 +00:00
Krzysztof Parzyszek	039d4d9286	[Hexagon] Generate HVX code for basic arithmetic operations Handle and, or, xor, add, sub, mul for vectors of i8, i16, and i32. llvm-svn: 320063	2017-12-07 17:37:28 +00:00
Simon Pilgrim	d2e93e76b8	[X86][TBM] Add TBM scheduling tests llvm-svn: 320062	2017-12-07 17:23:00 +00:00
Craig Topper	5db260fca4	[X86] Rename function in recently added test case to not be 'main' returning 'void'. NFC llvm-svn: 320059	2017-12-07 17:02:49 +00:00
Simon Pilgrim	2983b46973	[X86] Tag SALC instructions scheduler class Treat these the same as LAHF/SAHF (although it's not an x86_64 instruction) llvm-svn: 320055	2017-12-07 16:07:06 +00:00
Simon Pilgrim	ffce0d8fbc	[X86] Add LAHF/SAHF scheduling test llvm-svn: 320054	2017-12-07 16:04:20 +00:00
Simon Pilgrim	a383f84233	[X86] Add SALC scheduling test llvm-svn: 320052	2017-12-07 15:46:58 +00:00
Simon Pilgrim	f1d599adb2	[X86] Tag LZCNT/TZCNT instructions scheduler classes Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well llvm-svn: 320051	2017-12-07 15:24:14 +00:00
Sanjay Patel	9012391af1	[DAGCombiner] eliminate shuffle of insert element I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either repeating a scalar insertion at the same position in a vector or translated to a different element index. Like the earlier patch, this could be an instcombine too, but since we opted to make this a DAG transform earlier, I've made this one a DAG patch too. We do not need any legality checking because the new insert is identical to the existing insert except that it may have a different constant insertion operand. The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the motivation for D38756. Differential Revision: https://reviews.llvm.org/D40209 llvm-svn: 320050	2017-12-07 15:17:58 +00:00
Simon Pilgrim	ff5212091a	[X86][FMA] Regenerate fma schedule tests llvm-svn: 320048	2017-12-07 14:51:47 +00:00
Simon Pilgrim	60411d9a8c	[X86] Tag RDRAND/RDSEED instruction scheduler classes llvm-svn: 320045	2017-12-07 14:18:48 +00:00
Simon Pilgrim	9a2898ed22	[X86] Regenerate RDTSC codegen tests llvm-svn: 320042	2017-12-07 13:50:29 +00:00
Dan Gohman	cdaa87dd2e	[WebAssemby] Support main functions with alternate signatures. WebAssembly requires caller and callee signatures to match, so the usual C runtime trick of calling main and having it just work regardless of whether main is defined as '()' or '(int argc, char *argv[])' doesn't work. Extend the FixFunctionBitcasts pass to rewrite main to use the latter form. llvm-svn: 320041	2017-12-07 13:49:27 +00:00
Simon Pilgrim	439679c085	[X86][RDSEED] Add rdseed scheduling tests llvm-svn: 320040	2017-12-07 13:47:17 +00:00
Simon Pilgrim	eb87fe62ec	[X86][RDRAND] Add rdrand scheduling tests llvm-svn: 320039	2017-12-07 13:46:47 +00:00
Nikolai Bozhenov	1cf9c54e5c	[Nios2] final infrastructure to provide compilation of a return from a function This patch includes all missing functionality needed to provide first compilation of a simple program that just returns from a function. I've added a test case that checks for "ret" instruction printed in assembly output. Patch by Andrei Grischenko (andrei.l.grischenko@intel.com) Differential revision: https://reviews.llvm.org/D39688 llvm-svn: 320035	2017-12-07 12:35:02 +00:00
Andrew V. Tischenko	44cfc51415	Add proper BTVER2 sched support for MOV instr. Differential Revision: https://reviews.llvm.org/D40345 llvm-svn: 320034	2017-12-07 11:19:49 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Mikael Holmen	b5deac444d	Skip DBG instr in OptimizePHIs when looking for dead PHI cycles Summary: Changed use_instructions() to use_nodbg_instructions() when building an instruction set. We don't want the presence of debug info to affect the code we generate. Reviewers: dblaikie, Eugene.Zelenko, chandlerc, aprantl Reviewed By: aprantl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D40882 llvm-svn: 320010	2017-12-07 07:01:21 +00:00
Dan Gohman	5cf6473903	[WebAssembly] Don't try to emit size information for unsized types Patch by John Sully! Fixes PR35164. Differential Revision: https://reviews.llvm.org/D39519 llvm-svn: 319991	2017-12-07 00:14:30 +00:00
Florian Hahn	5d6a4e43ba	[AArch64] Add patterns to replace fsub fmul with fma fneg. Summary: This patch adds MachineCombiner patterns for transforming (fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower latency on micro architectures where fneg is cheap. Patch based on work by George Steed. Reviewers: rengolin, joelkevinjones, joel_k_jones, evandro, efriedma Reviewed By: evandro Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40306 llvm-svn: 319980	2017-12-06 22:48:36 +00:00
Krzysztof Parzyszek	d2967868be	[Hexagon] Recognize vdealb, vdealh, vshuffb and vshuffh specifically llvm-svn: 319978	2017-12-06 22:41:49 +00:00
Krzysztof Parzyszek	64533cf630	[Hexagon] Handle perfect shuffles on single vectors llvm-svn: 319965	2017-12-06 21:25:03 +00:00
Dan Gohman	ad19047d83	[WebAssembly] Remove WASM_STACK_POINTER. WASM_STACK_POINTER and the .stack_pointer directive are no longer needed now that the stack pointer global is an import. llvm-svn: 319956	2017-12-06 20:56:40 +00:00
Simon Pilgrim	9afbe77a91	[X86][AVX512] Tag mask reg op instruction scheduler classes llvm-svn: 319945	2017-12-06 19:36:00 +00:00
Simon Pilgrim	7724b03cde	[X86][SSE] Regenerate vpmovm2/vpmov2m avx512 schedule tests llvm-svn: 319921	2017-12-06 18:47:37 +00:00
Simon Pilgrim	809c024b3d	[X86][AVX2] Tag MASKMOV instruction scheduler classes llvm-svn: 319915	2017-12-06 18:24:48 +00:00
Craig Topper	fa172a5251	[X86] Regenerate test for r319778 llvm-svn: 319914	2017-12-06 18:04:39 +00:00
Simon Pilgrim	df05251921	[X86][AVX512] Tag aligned/unaligned move instruction scheduler classes llvm-svn: 319913	2017-12-06 17:59:26 +00:00
Simon Pilgrim	3ee91ade9b	[X86][AVX] Regenerate vpmovm2/vpmov2m avx512 schedule tests llvm-svn: 319912	2017-12-06 17:57:18 +00:00
Artem Belevich	a659d2590e	[NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. Differential Revision: https://reviews.llvm.org/D40872 llvm-svn: 319909	2017-12-06 17:50:05 +00:00
Zvi Rackover	ffaed72089	AMDGPU Tests: Change a case to be run with -O0 D40231 requires running this case with -O0 to prevent InstructionSimplify from transforming an extractelement with undef index. llvm-svn: 319907	2017-12-06 17:40:09 +00:00
Krzysztof Parzyszek	7d37dd8902	[Hexagon] Generate HVX code for vector construction and access Support for: - build vector, - extract vector element, subvector, - insert vector element, subvector, - shuffle. llvm-svn: 319901	2017-12-06 16:40:37 +00:00
Simon Pilgrim	aa902be158	[X86][AVX512] Tag BROADCAST instruction scheduler classes llvm-svn: 319900	2017-12-06 15:48:40 +00:00
Nirav Dave	7d8f3e0c93	[ARM][AArch64][DAG] Reenable post-legalize store merge Reenable post-legalize stores with constant merging computation and corresponding test case. * Properly truncate store merge constants * Disable merging of truncated stores floating points * Ensure merges of constant stores into a single vector are constructed from legal elements. Reviewers: eastig, efriedma Reviewed By: eastig Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40701 llvm-svn: 319899	2017-12-06 15:30:13 +00:00
Simon Pilgrim	a9282309e5	[X86][AVX512] Regenerate vpmovm2/vpmov2m avx512 schedule tests llvm-svn: 319895	2017-12-06 14:07:38 +00:00
Jonas Paulsson	19380bae05	[SystemZ] Bugfix in expandRxSBG() Csmith discovered a program that caused wrong code generation with -O0: When handling a SIGN_EXTEND in expandRxSBG(), RxSBG.BitSize may be less than the Input width (if a truncate was previously traversed), so maskMatters() should be called with a mask based on the width of the sign extend result instead. Review: Ulrich Weigand llvm-svn: 319892	2017-12-06 13:53:24 +00:00
Vlad Tsyrklevich	0b40f21134	Revert "[DAGCombine] Move AND nodes to multiple load leaves" This reverts commit r319773. It was causing some buildbots to hang, e.g. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589 llvm-svn: 319867	2017-12-06 01:16:08 +00:00
Simon Pilgrim	d495301414	[X86][AVX512] Tag BLENDM instruction scheduler classes llvm-svn: 319833	2017-12-05 21:05:25 +00:00
Simon Pilgrim	b69dae42e3	[X86][AVX512] Tag GATHER/SCATTER instruction scheduler classes NOTE: At the moment these use the WriteLoad/WriteStore classes, which severely underestimates the costs. This needs to be reviewed. llvm-svn: 319829	2017-12-05 20:47:11 +00:00
Matt Arsenault	8ae38bc0bd	AMDGPU: Fix SDWA crash on inline asm This was only searching for explicit defs, and asserting for any implicit or variadic instruction defs, like inline asm. llvm-svn: 319826	2017-12-05 20:32:01 +00:00
Hans Wennborg	5df9f0878b	Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack" The patch originally broke Chromium (crbug.com/791714) due to its failing to specify that the new pseudo instructions clobber EFLAGS. This commit fixes that. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 llvm-svn: 319824	2017-12-05 20:22:20 +00:00
Ulrich Weigand	5bfed6cb7c	[SystemZ] Validate shifted compare value in adjustForTestUnderMask When folding a shift into a test-under-mask comparison, make sure that there is no loss of precision when creating the shifted comparison value. This usually never happens, except for certain always-true comparisons in unoptimized code. Fixes PR35529. llvm-svn: 319818	2017-12-05 19:42:07 +00:00
Simon Pilgrim	833c260a4b	[X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classes llvm-svn: 319815	2017-12-05 19:21:28 +00:00
Matt Arsenault	7f0a527300	AMDGPU: Fix infinite loop with dbg_value Surprisingly SIOptimizeExecMaskingPreRA can infinite loop in some cases with DBG_VALUE. Most tests using dbg_value are run at -O0, so don't run this pass. This seems to only happen when the value argument is undef. llvm-svn: 319808	2017-12-05 18:23:17 +00:00
Simon Pilgrim	65f805fe30	[X86][X87] Tag FCMOV instruction scheduler classes llvm-svn: 319804	2017-12-05 18:01:26 +00:00
Dan Gohman	c2c997718d	[WebAssembly] Implement WASM_STACK_POINTER. Use the .stack_pointer directive to implement WASM_STACK_POINTER for specifying a global variable to be the stack pointer. llvm-svn: 319797	2017-12-05 17:23:43 +00:00
Dan Gohman	f7172f4ab0	[WebAssembly] Don't emit .import_global for the wasm target. .import_global is used by the ELF-based target and not needed by the wasm target. llvm-svn: 319796	2017-12-05 17:21:57 +00:00
Jina Nahias	51c1a627c2	[x86][AVX512] Lowering kunpack intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39720 Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1 llvm-svn: 319778	2017-12-05 15:42:56 +00:00
Bjorn Pettersson	5abbad7999	Add REQUIRES asserts in combine_loads_from_build_pair.ll A fixup of r319771, that was causing buildbot failures. llvm-svn: 319775	2017-12-05 15:26:01 +00:00
Sam Parker	0a436a9d62	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the load(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D39604 llvm-svn: 319773	2017-12-05 15:13:47 +00:00
Bjorn Pettersson	823b299fbc	[DAGCombine] Handle big endian correctly in CombineConsecutiveLoads Summary: Found out, at code inspection, that there was a fault in DAGCombiner::CombineConsecutiveLoads for big-endian targets. A BUILD_PAIR always has the least significant bits of the composite value in element 0. So when we are doing the checks for consecutive loads, for big endian targets, we should check if the load to elt 1 is at the lower address and the load to elt 0 is at the higher address. Normally this bug only resulted in missed opportunities for doing the load combine. I guess that in some rare situation it could lead to faulty combines, but I've not seen that happen. Note that this patch actually will trigger load combine for some big endian regression tests. One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get t76: i64,ch = load<LD8[FixedStack-9]> instead of t37: i32,ch = load<LD4[FixedStack-10]> t35: i32,ch = load<LD4[FixedStack-9]> t41: i64 = build_pair t37, t35 before legalization. Then the legalization will split the LD8 into two loads, so the end result is the same. That should verify that the transformation is correct now. Reviewers: niravd, hfinkel Reviewed By: niravd Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D40444 llvm-svn: 319771	2017-12-05 14:50:05 +00:00
Simon Pilgrim	71660c61e6	[X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes llvm-svn: 319770	2017-12-05 14:34:42 +00:00

22910 Commits