llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	bb10c0f1ec	[X86] FileCheckize one of the rotate tests. llvm-svn: 295676	2017-02-20 19:44:10 +00:00
Steven Wu	abfea28867	Fix use-after-free found by ASAN DenseMap::lookup returns copy of the value in the map. Returning the address of the temporary return value will cause use-after-free. llvm-svn: 295675	2017-02-20 18:33:40 +00:00
Craig Topper	2012dda9a0	[AVX-512] Add a few more patterns for selecting masked vpternlog with broadcast loads where the passthru operand is not operand 0. llvm-svn: 295673	2017-02-20 17:44:09 +00:00
Simon Pilgrim	2967ed1c7e	[X86] Tidyup combineExtractVectorElt. NFCI. Pull out repeated code for extraction index operand and source vector value type. Use isNullConstant helper to check for zero extraction index. llvm-svn: 295670	2017-02-20 16:09:45 +00:00
Simon Pilgrim	e9a8145adb	[X86][SSE] Regenerate extracted bitcasted constant tests and add 32-bit test target llvm-svn: 295669	2017-02-20 15:57:14 +00:00
Daniel Sanders	e604ef5f55	[globalisel] OperandPredicateMatchers shouldn't need to generate the MachineOperand expr. NFC Summary: Each OperandPredicateMatcher shouldn't need to know how to generate the expression to reference a MachineOperand. The OperandMatcher should provide it. In addition to separating responsibilities, this also lays some groundwork for decoupling source patterns from destination patterns to allow invented operands or operands provided by GlobalISel's equivalent to the ComplexPattern<> class. Depends on D29709 Reviewers: t.p.northover, ab, rovka, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D29710 llvm-svn: 295668	2017-02-20 15:30:43 +00:00
Simon Pilgrim	72d666e443	[X86][SSE] Regenerate re-materialized store tests and add 64-bit test target llvm-svn: 295666	2017-02-20 15:20:37 +00:00
Simon Pilgrim	5a33d1c266	[X86][SSE] Regenerate vselect widening tests and add 32-bit test target llvm-svn: 295665	2017-02-20 15:16:43 +00:00
Diana Picus	1c33c9f0b0	[ARM] GlobalISel: Don't select atomic loads There used to be a check in the IRTranslator that prevented us from having to deal with atomic loads/stores. That check has been removed in r294993 and the AArch64 backend was updated accordingly. This commit does the same thing for the ARM backend. In general, in the ARM backend we introduce fences during the atomic expand pass, so we don't have to worry about atomics, except for the 32-bit ARMv8 target, which handles atomics more like AArch64. Since we don't want to worry about that yet, just bail out of instruction selection if we find any atomic loads. llvm-svn: 295662	2017-02-20 14:45:58 +00:00
Daniel Sanders	b41ce2b392	[globalisel] Separate the SelectionDAG importer from the emitter. NFC Summary: In the near future the rules will be sorted between these two steps to ensure that more important rules are not prevented by less important ones. Reviewers: t.p.northover, ab, rovka, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29709 llvm-svn: 295661	2017-02-20 14:31:27 +00:00
Igor Breger	fda32d266a	[X86] Fix EXTRACT_VECTOR_ELT with variable index from v32i16 and v64i8 vector. It's more profitable to go through memory (1 cycle throughput) than using VMOVD + VPERMV/PSHUFB sequence (2/3 cycles throughput) to implement EXTRACT_VECTOR_ELT with variable index. IACA tool was used to get performance estimation (https://software.intel.com/en-us/articles/intel-architecture-code-analyzer). For example, for the var_shuffle_v16i8_v16i8_xxxxxxxxxxxxxxxx_i8 test from vector-shuffle-variable-128.ll I get 26 cycles vs 79 cycles. Removing the VINSERT node, we don't need it any more. Differential Revision: https://reviews.llvm.org/D29690 llvm-svn: 295660	2017-02-20 14:16:29 +00:00
Alexey Bataev	19b35bf7f4	[SLP] Additional test for vectorization of call/invoke args llvm-svn: 295657	2017-02-20 12:41:16 +00:00
Simon Pilgrim	5910ebe720	[X86][AVX512] Add support for ASHR v2i64/v4i64 support without VLX Use v8i64 ASHR instructions if we don't have VLX. Differential Revision: https://reviews.llvm.org/D28537 llvm-svn: 295656	2017-02-20 12:16:38 +00:00
Sanne Wouda	47eb9723de	[ARM] Add a div regression test for Cortex-M23 Summary: This file was missed in the commit for Cortex-M23 and Cortex-M33 support. See https://reviews.llvm.org/D29073?id=85814 . Reviewers: rengolin, javed.absar, samparker Reviewed By: samparker Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D30162 llvm-svn: 295655	2017-02-20 12:05:07 +00:00
Simon Pilgrim	c0dc9a4913	Strip trailing whitespace. llvm-svn: 295653	2017-02-20 11:56:43 +00:00
Simon Pilgrim	50b958c07a	[SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes. Thanks to Mikael Holmén for the initial test case llvm-svn: 295652	2017-02-20 11:55:58 +00:00
Sjoerd Meijer	e22a79e898	AArch64AsmParser: tablegen the isBranchTarget helper functions Use tablegen to autogenerate isBranchtarget helper functions. This is a cleanup that removes almost identical functions that differ only in a few constants. Differential Revision: https://reviews.llvm.org/D30160 llvm-svn: 295649	2017-02-20 10:57:54 +00:00
Simon Dardis	df943b02a9	[mips] Add test for mul macro variants llvm-svn: 295648	2017-02-20 10:53:03 +00:00
NAKAMURA Takumi	5539d8d1c9	llvm/examples/Kaleidoscope/BuildingAJIT: More fixup corresponding to r295636. I missed updating them since I just ran check-llvm (with examples) in r295645. llvm-svn: 295646	2017-02-20 10:07:41 +00:00
NAKAMURA Takumi	d4c7a12177	llvm/examples/Kaleidoscope/include/KaleidoscopeJIT.h: Fixup corresponding to r295636. llvm-svn: 295645	2017-02-20 09:56:24 +00:00
Ayman Musa	51ffeab8c8	[X86][AVX] Extend hasVEX_WPrefix bit to accept WIG value (W Ignore) + update all AVX instructions with the new value. Add WIG value to all of AVX instructions which ignore the W-bit in their encoding, instead of giving them the default value of 0. This patch is needed for a follow up work on EVEX2VEX pass (replacing EVEX encoded instructions with their corresponding VEX version when possible). Differential Revision: https://reviews.llvm.org/D29876 llvm-svn: 295643	2017-02-20 08:27:54 +00:00
Alexey Bataev	f96465b9b8	[SLP] nullptr'ize initial value in `findBuildAggregate()`, NFC. Initial value of V is set to nullptr, as it is not used. llvm-svn: 295642	2017-02-20 08:04:11 +00:00
Alexey Bataev	2f6b124e01	[SLP] Rework `findBuildAggregate()` from recursive form to iterative, NFC. Reviewers: mkuper Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30103 llvm-svn: 295641	2017-02-20 07:49:39 +00:00
Craig Topper	c6c68f5958	[AVX-512] Add more patterns to fold masked VPTERNLOG with load when the passthru isn't operand 0. llvm-svn: 295640	2017-02-20 07:00:40 +00:00
Craig Topper	5aef828ba7	[AVX-512] Add tests for missed opportunities to fold masked VPTERNLOG with load when the passthru op isn't operand 0. llvm-svn: 295639	2017-02-20 07:00:37 +00:00
Craig Topper	a5fa2e40f9	[AVX-512] Fix mistake in the immediate swizzle for some of the VPTERNLOG patterns. llvm-svn: 295638	2017-02-20 07:00:34 +00:00
Craig Topper	cb5b45cc36	[AVX-512] Use a better immediate in the VPTERNLOG commuting tests so it's easier to spot bad swizzling. llvm-svn: 295637	2017-02-20 07:00:31 +00:00
Lang Hames	67de5d24a9	[Orc] Rename ObjectLinkingLayer -> RTDyldObjectLinkingLayer. The current ObjectLinkingLayer (now RTDyldObjectLinkingLayer) links objects in-process using MCJIT's RuntimeDyld class. In the near future I hope to add new object linking layers (e.g. a remote linking layer that links objects in the JIT target process, rather than the client), so I'm renaming this class to be more descriptive. llvm-svn: 295636	2017-02-20 05:45:14 +00:00
Craig Topper	5b4e36aafa	[AVX-512] Add more VPTERNLOG patterns to enable folding of broadcast loads that aren't in operand 2. llvm-svn: 295634	2017-02-20 02:47:42 +00:00
Craig Topper	c184b671d9	[X86] Use memory form of shift right by 1 when the rotl immediate is one less than the operation size. An earlier commit already did this for the register form. llvm-svn: 295626	2017-02-20 00:37:23 +00:00
Craig Topper	0f14411b57	[X86] Add test cases showing missed opportunities to use rotate right by 1 instructions when operation reads/writes memory. llvm-svn: 295625	2017-02-20 00:37:20 +00:00
Daniel Jasper	5a51f8cae4	s/REQUIRES: Asserts/REQUIRES: asserts/ Other than this, we consistently use lower case. llvm-svn: 295623	2017-02-19 23:26:00 +00:00
Craig Topper	63801df251	[AVX-512] Remove AddedComplexity from masked operations. The size of the patterns already increases their priority. llvm-svn: 295619	2017-02-19 21:44:35 +00:00
Simon Pilgrim	14a7eee0b4	[X86] Use peekThroughOneUseBitcasts helper. NFCI. llvm-svn: 295618	2017-02-19 21:40:51 +00:00
Davide Italiano	16b476ffcc	[X86] Prefer static_cast<> to C-style cast. NFCI. llvm-svn: 295617	2017-02-19 21:35:41 +00:00
Craig Topper	489057715e	[AVX-512] Disable peephole optimizations on the VPTERNLOG commute test. Add new patterns to enable isel to fold the loads on its own. llvm-svn: 295616	2017-02-19 21:32:15 +00:00
Davide Italiano	1aef59eb44	[AArch64] Prefer static_cast<> to C-style cast. NFCI. llvm-svn: 295615	2017-02-19 21:31:14 +00:00
Simon Pilgrim	d590de2998	[X86][SSE] Use getTargetConstantBitsFromNode to find zeroable shuffle elements. Replaces existing approach that could only search BUILD_VECTOR nodes. Requires getTargetConstantBitsFromNode to discriminate cases with all/partial UNDEF bits in each element - this should also be useful when we get around to supporting getTargetShuffleMaskIndices with UNDEF elements. llvm-svn: 295613	2017-02-19 19:40:31 +00:00
Craig Topper	4e794c71a6	[AVX-512] Add patterns to recognize masked vpternlog when the passthrough operand is not operand 0. This uses a SDNodeXForm to swizzle the appropriate immediate bits to allow this to be matched. llvm-svn: 295612	2017-02-19 19:36:58 +00:00
Craig Topper	ab1afa85ba	[AVX-512] Add test cases that show failure to select masked VPTERNLOG when a select is used to force the passthru operand to be not operand 0. llvm-svn: 295611	2017-02-19 19:36:54 +00:00
Simon Pilgrim	4271186f9c	[X86][SSE] Enable initial support for domain crossing at high shuffle combine depths. As discussed on D27692, this permits another domain to be used to combine a shuffle at high depths. We currently set the required depth at 4 or more combined shuffles, this is probably too high for most targets but is a good starting point and already helps avoid a number of costly variable shuffles. llvm-svn: 295608	2017-02-19 17:19:38 +00:00
Artyom Skrobov	be31754094	Remove redundant call to GluedNodes.back() [NFC] llvm-svn: 295607	2017-02-19 16:56:18 +00:00
Simon Pilgrim	6d07d514de	[X86][SSE] Generalize INSERTPS/SHUFPS/SHUFPD combines across domains. Relax the INSERTPS/SHUFPS/SHUFPD combines to support integer inputs if permitted. llvm-svn: 295606	2017-02-19 15:15:40 +00:00
Igor Kudrin	9e015dafbf	[llvm-cov] Respect Windows line endings when parsing demangled symbols. Differential Revision: https://reviews.llvm.org/D30096 llvm-svn: 295605	2017-02-19 14:26:52 +00:00
Simon Pilgrim	b4460cf5a9	[X86][SSE] Add domain crossing support for target shuffle combines. Add the infrastructure to flag whether float and/or int domains are permitable. A future patch will enable domain crossing based off shuffle depth and the value types of the source vectors. llvm-svn: 295604	2017-02-19 14:12:25 +00:00
Simon Pilgrim	5798180947	Removed extra ';' llvm-svn: 295603	2017-02-19 12:32:44 +00:00
Craig Topper	218d1a020e	[AVX-512] Add broadcast VPTERNLOG instructions to special case commuting switch. The instructions are marked commutable, but without special handling we don't get the immediate correct. While here also remove the masked memory forms that aren't commutable. llvm-svn: 295602	2017-02-19 08:03:26 +00:00
Craig Topper	94de4b9330	[AVX-512] Add patterns to show missed opportunities for folding vpternlog with broadcast loads. Also demonstrates a bug in the commuting of broadcast vpternlog instructions when we are able to select them. llvm-svn: 295601	2017-02-19 08:03:23 +00:00
NAKAMURA Takumi	7802984126	Untabify. llvm-svn: 295599	2017-02-19 06:51:46 +00:00
Daniel Berlin	17b1375299	Re-add debugcounter.ll with Requires: Asserts so that it only triggers when asserts are on llvm-svn: 295598	2017-02-19 06:45:02 +00:00
Daniel Berlin	d46dfb3d0e	Which, in turn, causes build bots to fail that have it unexpectedly passing. So remove debugcounter.ll for now llvm-svn: 295597	2017-02-19 04:56:07 +00:00
Daniel Berlin	1ad7f5cab0	XFAIL this test until we figure out what to do here, since it will fail if NDEBUG defined llvm-svn: 295596	2017-02-19 04:55:02 +00:00
Daniel Berlin	ce3d348a5c	Add two files lost in rebase, causing build break llvm-svn: 295595	2017-02-19 04:29:50 +00:00
Daniel Berlin	a4b5c01dd2	Add a DebugCounter for PredicateInfo renaming, and an associated test llvm-svn: 295594	2017-02-19 04:29:01 +00:00
Daniel Berlin	25f1db1111	Add initial support for debug counting Summary: We have support for bisection, and bugpoint can reduce testcases often to a single pass. But that doesn't help reduce it to a single transform by a single pass. Which debug counting lets us do. Debug counting lets you instrument a pass so that it only executes a certain thing (whatever you want) after skipping it a certain number of times, and then only does a certain number of executions before saying "skip" again. To make it concrete, for predicateinfo, if I instrument use renaming, I can make it so it skips renaming the first N uses, renames the next N, and then skips the rest. This lets you narrow down a miscompilation to, often, a single transformation, and then also debug it (by using the same command line parameters). Reviewers: chandlerc, davide, mehdi_amini Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29998 llvm-svn: 295593	2017-02-19 04:28:56 +00:00
NAKAMURA Takumi	486dfe11af	llvm/test/CodeGen/AMDGPU/r600.alu-limits.ll should require +Asserts. This would run into infinite loop anyways with -Asserts. llvm-svn: 295591	2017-02-19 02:31:06 +00:00
Craig Topper	007c93b2b9	[X86] Remove patterns for MOVSD with v4i32 types. We don't appear to really need them and if we do we should just use a bitcast to a 64-bit element type. llvm-svn: 295589	2017-02-19 02:08:48 +00:00
Craig Topper	06ae5e821c	[X86] Tighten up some of the SDNode type constraints. llvm-svn: 295588	2017-02-19 01:54:47 +00:00
Simon Pilgrim	dba9011942	Fix unused variable warning when assertions are disabled. llvm-svn: 295587	2017-02-19 00:33:37 +00:00
Simon Pilgrim	599b872ca2	[X86] Fix enumeral/non-enumeral conditional expression warning. gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295586	2017-02-19 00:04:30 +00:00
Simon Pilgrim	ee6ef4d6dd	Fix 'variable set but not used' warning when assertions are disabled. llvm-svn: 295585	2017-02-19 00:03:46 +00:00
Daniel Berlin	f7d9580a08	NewGVN: Start making use of predicateinfo pass. Summary: This begins using the predicateinfo pass in NewGVN. Reviewers: davide Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29682 llvm-svn: 295583	2017-02-18 23:06:50 +00:00
Daniel Berlin	b355c4ff5f	NewGVN: Make ranking prefer undef to constants. Fix direction of shouldSwapOperands to be correct. llvm-svn: 295582	2017-02-18 23:06:47 +00:00
Daniel Berlin	588e0be39d	PredicateInfo: Clean up predicate info a little, using insertion helpers, and fixing support for the renaming the comparison. llvm-svn: 295581	2017-02-18 23:06:38 +00:00
Simon Pilgrim	2f2d8dc630	Fix signed/unsigned comparison warning. llvm-svn: 295580	2017-02-18 22:56:17 +00:00
Craig Topper	811756b4dc	[X86][XOP] Reduce the size of a multiclass by moving more stuff to parameters instead of doing 128-bit and 256-bit simultaneously. This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions. llvm-svn: 295579	2017-02-18 22:53:43 +00:00
Craig Topper	f6564c991b	[TableGen] Make sure EnforceSameSize populates the type sets if necessary. This was found by another commit I'm working on. llvm-svn: 295578	2017-02-18 22:53:38 +00:00
Simon Pilgrim	b092166a76	[AArch64] Fix enumeral/non-enumeral conditional expression warning. gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295577	2017-02-18 22:50:28 +00:00
Simon Pilgrim	7a87eebcad	[X86] Fix enumeral/non-enumeral comparison warning. gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295576	2017-02-18 22:40:58 +00:00
Simon Pilgrim	2e78c94ea5	[X86][SSE] Avoid repeated calls to SDValue::getValueType. Added assertion to check input type of X86ISD::VZEXT during target known bits calculation. llvm-svn: 295575	2017-02-18 22:25:27 +00:00
Sanjay Patel	53c5c3d65d	[InstCombine] add nsw/nuw X, signbit --> or X, signbit Changing to 'or' (rather than 'xor' when no wrapping flags are set) allows icmp simplifies to happen as expected. Differential Revision: https://reviews.llvm.org/D29729 llvm-svn: 295574	2017-02-18 22:20:09 +00:00
Sanjay Patel	fe67255961	[InstSimplify] add nsw/nuw (xor X, signbit), signbit --> X The change to InstCombine in: https://reviews.llvm.org/D29729 ...exposes this missing fold in InstSimplify, so adding this first to avoid a regression. llvm-svn: 295573	2017-02-18 21:59:09 +00:00
Sanjay Patel	308eb22118	[InstSimplify] add tests for add nsw/nuw (xor X, signbit), signbit --> X; NFC llvm-svn: 295572	2017-02-18 21:51:14 +00:00
Craig Topper	de10312bea	Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR." Clang has now been fixed to not use these intrinsics. llvm-svn: 295571	2017-02-18 21:50:58 +00:00
Sanjay Patel	dc8a24ea4c	[x86] remove stale comments from tests; NFC llvm-svn: 295569	2017-02-18 21:07:37 +00:00
Sanjay Patel	12c2093e1e	[x86] fold sext (xor Bool, -1) --> sub (zext Bool), 1 This is the same transform that is currently used for: select Bool, 0, -1 llvm-svn: 295568	2017-02-18 21:03:28 +00:00
Piotr Padlewski	cc5868c186	[MemorySSA] NFC small fixes Summary: 2 small fixes extracted from https://reviews.llvm.org/D29064 Reviewers: kuhar, davide, dberlin, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30109 llvm-svn: 295566	2017-02-18 20:34:36 +00:00
Craig Topper	ba2a726cc6	Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR." This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support. llvm-svn: 295565	2017-02-18 20:14:20 +00:00
Craig Topper	884db3f85d	[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR. It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses. llvm-svn: 295564	2017-02-18 19:51:25 +00:00
Craig Topper	03a9adc2ba	[X86][IR] Simplify the XOP vpcmov autoupgrade code. NFC llvm-svn: 295563	2017-02-18 19:51:19 +00:00
Craig Topper	aa49f14496	[X86][IR] Merge together some very similar AutoUpgrade handling. NFC llvm-svn: 295562	2017-02-18 19:51:14 +00:00
Matt Arsenault	2021f08080	AMDGPU: Fix assembler subtarget predicate for gfx9 This was accepting GFX9 instructions on VI. llvm-svn: 295557	2017-02-18 19:12:26 +00:00
Matt Arsenault	a3b3b489fb	AMDGPU: Fix disassembly of aperture registers llvm-svn: 295555	2017-02-18 18:41:41 +00:00
Matt Arsenault	e823d92f7f	AMDGPU: Merge initial gfx9 support llvm-svn: 295554	2017-02-18 18:29:53 +00:00
Sanjay Patel	6d5dddb85f	[InstCombine] add tests for trunc(insertelement); NFC llvm-svn: 295553	2017-02-18 18:27:04 +00:00
Easwaran Raman	617f63640b	Refactor instruction simplification code in visitors. NFC. Several visitors check if operands to the instruction are constants, either as it is or after looking up SimplifiedValues, check if the result is a constant and update the SimplifiedValues map. This refactoring splits it into a common function that does the checking of whether the operands are constants and updating of the SimplifiedValues table, and an instruction specific part that is implemented by each instruction visitor as a lambda and passed to the common function. Differential revision: https://reviews.llvm.org/D30104 llvm-svn: 295552	2017-02-18 17:22:52 +00:00
Sanjay Patel	86554de2bd	[InstCombine] update trunc(shuffle) tests to reflect IR reality; NFC We're ok shrinking splats, but not shuffles in general. See https://reviews.llvm.org/D30123 for discussion. llvm-svn: 295547	2017-02-18 15:24:31 +00:00
Brian Cain	cc24826e1e	opt-viewer: Fix syntax highlighting Syntax highlighting has been done line-at-a-time. Done this way, the lexer resets the context at each line, distorting the formatting. This change will render the whole file at once and feed the highlighted text line-at-a-time to be wrapped by the SourceFileRenderer. Leading/trailing newlines were being ignored by Pygments but since each line was rendered in its own row, it didn't matter. This bug was masked by the line-at-a-time algorithm. So now we need to add "stripnl=False" to the CppLexer to change its behavior to match the expectation. llvm-svn: 295546	2017-02-18 15:13:58 +00:00
Craig Topper	a505169ca5	[AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions. llvm-svn: 295543	2017-02-18 07:07:50 +00:00
Dehao Chen	b70afdb5e8	Add default OptLevel value for createSimpleLoopUnrollPass to fix the build break introduced by r295538. (NFC) llvm-svn: 295542	2017-02-18 06:42:16 +00:00
Jan Vesely	4b1243facb	AMDGPU/R600: Assert on infinite loop in EmitClauseMarkers Differential Revision: https://reviews.llvm.org/D29792 llvm-svn: 295539	2017-02-18 04:24:10 +00:00
Dehao Chen	7d230325ef	Increases full-unroll threshold. Summary: The default threshold for full unrolling is too conservative. This patch doubles the full-unroll threshold. This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge): Performance: 403 0.11% 433 0.51% 445 0.48% 447 3.50% 453 1.49% 464 0.75% Code size: 403 0.56% 433 0.96% 445 2.16% 447 2.96% 453 0.94% 464 8.02% The compile-time overhead is similar to the code size increase. Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc Reviewed By: hfinkel, chandlerc Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D28368 llvm-svn: 295538	2017-02-18 03:46:51 +00:00
Davide Italiano	982bf827b5	[IR/Verifier] Don't visit DISubprograms more than needed. Before this patch we happened to visit twice, once when scanning MDNodes and once while visiting the function. Remove the explicit call to visitDISubprogram there, so we don't emit the same error twice in case the verifier fails, and we save some time when running it. Thanks to Justin Bogner for the report and Adrian for the quick review! PR: 31995 llvm-svn: 295537	2017-02-18 03:02:44 +00:00
Dylan McKay	83788ff349	[AVR] Set UseIntegratedAssembler llvm-svn: 295535	2017-02-18 02:26:11 +00:00
Justin Bogner	7bc978b543	OptDiag: Allow constructing DiagnosticLocation from DISubprograms This avoids creating a DILocation just to represent a line number, since creating Metadata is expensive. Creating a DiagnosticLocation directly is much cheaper. llvm-svn: 295531	2017-02-18 02:00:27 +00:00
Zachary Turner	607418771e	Remove the is_trivially_copyable check entirely. This is still breaking builds because some compilers think this type is not trivially copyable even when it should be. Reverting this static_assert until I have time to investigate. llvm-svn: 295529	2017-02-18 01:51:00 +00:00
Zachary Turner	3720124545	Use llvm workaround for missing is_trivially_copyable. some versions of GCC don't have this, so LLVM provides a workaround. llvm-svn: 295526	2017-02-18 01:46:01 +00:00
Zachary Turner	181fe17b6f	Don't assume little endian in StreamReader / StreamWriter. In an effort to generalize this so it can be used by more than just PDB code, we shouldn't assume little endian. llvm-svn: 295525	2017-02-18 01:35:33 +00:00
Matthias Braun	2a707a3d3d	machine-region-info.mir: Slightly simplify test, -mtriple llvm-svn: 295520	2017-02-18 00:48:43 +00:00
Justin Bogner	d890f95bf6	OptDiag: Decouple backend diagnostics from debug info metadata This creates and uses a DiagnosticLocation type rather than using DebugLoc for this purpose in the backend diagnostics. This is NFC for now, but will allow us to create locations for diagnostics without having to create new metadata nodes when we don't have a DILocation. llvm-svn: 295519	2017-02-18 00:42:23 +00:00
Matthias Braun	431305927f	MachineRegionInfo: Fix pass initialization - Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in the CodeGen library. - MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region"). - MachineRegionInfo should depend on MachineDominatorTree, MachinePostDominatorTree and MachineDominanceFrontier instead of their respective IR versions. - Since there were no tests for this, add a X86 MIR test. Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com> llvm-svn: 295518	2017-02-18 00:41:16 +00:00
Justin Bogner	efc3fbf6a2	Verifier: Disallow a line number without a file in DISubprogram A line number doesn't make much sense if you don't say where it's from. Add a verifier check for this and update some tests that had bogus debug info. llvm-svn: 295516	2017-02-17 23:57:42 +00:00
Sanjay Patel	f8346550bf	[InstCombine] add tests for trunc(shuffle X, C, M); NFC llvm-svn: 295513	2017-02-17 23:16:54 +00:00
Matthias Braun	d9a59a8df8	AArch64LoadStoreOptimizer: Correctly clear kill flags When promoting the Load of a Store-Load pair to a COPY all kill flags between the store and the load need to be cleared. rdar://30402435 Differential Revision: https://reviews.llvm.org/D30110 llvm-svn: 295512	2017-02-17 23:15:03 +00:00
Simon Pilgrim	8670993dc1	[X86] Add MOVBE targets to load combine tests Test folded endian swap tests with MOVBE instructions. llvm-svn: 295508	2017-02-17 23:00:21 +00:00
Guozhi Wei	7ec2c72095	[PPC] Give unaligned memory access lower cost on processor that supports it Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost. This patch fixes pr31492. Differential Revision: https://reviews.llvm.org/D28630 This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug. llvm-svn: 295506	2017-02-17 22:29:39 +00:00
Eugene Zelenko	be37db1882	[CodeGen] Revert changes in LowLevelType to pre-r295499 to fix broken buildbots. llvm-svn: 295505	2017-02-17 22:23:34 +00:00
Krzysztof Parzyszek	1aaf41af54	[Hexagon] Start using regmasks on calls Reapply r295371 with a fix for the Windows bot failures. llvm-svn: 295504	2017-02-17 22:14:51 +00:00
Davide Italiano	bca05df38b	[NewGVN] isOnlyReachableViaThisEdge() is dead now. NFCI. llvm-svn: 295503	2017-02-17 22:12:30 +00:00
Simon Pilgrim	7db8f42fe3	[X86] Simplify by pulling out valuetype. NFCI. llvm-svn: 295502	2017-02-17 22:10:10 +00:00
Eugene Zelenko	9676ebec15	[CodeGen] Attempt to fix buildbots broken in r295499. llvm-svn: 295501	2017-02-17 22:07:26 +00:00
Davide Italiano	6db0ecafd7	[NewGVN] createVariableOrConstant is not required anymore. NFCI. llvm-svn: 295500	2017-02-17 21:55:47 +00:00
Eugene Zelenko	5db84df728	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295499	2017-02-17 21:43:25 +00:00
Simon Pilgrim	996f9b4cad	[X86] Add subborrow stack folding tests llvm-svn: 295496	2017-02-17 21:16:24 +00:00
Sanjay Patel	00872c3dfe	[x86] add tests for sext (not bool); NFC llvm-svn: 295495	2017-02-17 21:10:40 +00:00
Matthew Simpson	a899f86054	[LAA] Remove unused code (NFC) llvm-svn: 295493	2017-02-17 20:46:52 +00:00
Simon Pilgrim	a4c350ff17	[X86][SSE] Add (V)MOVD folding pattern with zextloadi64i32 load node. Fixes PR31309 llvm-svn: 295492	2017-02-17 20:43:32 +00:00
Adrian Prantl	e6c6a945ca	Fix windows bots by locking down the target triple on this testcase. llvm-svn: 295490	2017-02-17 20:02:26 +00:00
Matt Arsenault	f6cf1032fd	AMDGPU: Fix crashes on invalid icmp/fcmp intrinsics llvm-svn: 295489	2017-02-17 19:49:10 +00:00
Peter Collingbourne	184773d81f	WholeProgramDevirt: For VCP use a 32-bit ConstantInt for the byte offset. A future change will cause this byte offset to be inttoptr'd and then exported via an absolute symbol. On the importing end we will expect the symbol to be in range [0,2^32) so that it will fit into a 32-bit relocation. The problem is that on 64-bit architectures if the offset is negative it will not be in the correct range once we inttoptr it. This change causes us to use a 32-bit integer so that it can be inttoptr'd (which zero extends) into the correct range. Differential Revision: https://reviews.llvm.org/D30016 llvm-svn: 295487	2017-02-17 19:43:45 +00:00
Adrian Prantl	67c2442210	Debug Info: Sort frame index expressions before emitting them. This fixes PR31381, which caused an assertion and/or invalid debug info. This affects debug variables that have multiple fragments in the MMI side (i.e.: in the stack frame) table. rdar://problem/30571676 llvm-svn: 295486	2017-02-17 19:42:32 +00:00
Petr Hosek	7acbeaf1bc	[CMake] Support externalizing debug info on non-Darwin platforms On other platforms, we use objcopy to export the debug info. Differential Revision: https://reviews.llvm.org/D28575 llvm-svn: 295481	2017-02-17 19:29:12 +00:00
Simon Pilgrim	a3f2803905	[X86][SHA] Add SHA stack folding tests llvm-svn: 295479	2017-02-17 19:24:55 +00:00
Artyom Skrobov	4592f6206c	In Thumb1 mode, the custom lowering for ARMISD::CMPZ could never emit tADDi3 Reviewers: jmolloy, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D30097 llvm-svn: 295478	2017-02-17 18:59:16 +00:00
Simon Pilgrim	f4f5cd5d19	[X86][TBM] Add TBM stack folding tests llvm-svn: 295477	2017-02-17 18:51:53 +00:00
Tim Northover	88634996c7	GlobalISel: verify that generic loads & stores have a mem operand. The mem operand is used by GlobalISel to convey atomic constraints so dropping it is invalid. llvm-svn: 295476	2017-02-17 18:50:15 +00:00
Joel Jones	ab0f3b43e3	[AArch64] Add Cavium ThunderX support This set of patches adds support for Cavium ThunderX ARM64 processors: * ThunderX * ThunderX T81 * ThunderX T83 * ThunderX T88 Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D28891 llvm-svn: 295475	2017-02-17 18:34:24 +00:00
Peter Collingbourne	37317f1207	WholeProgramDevirt: Examine the function body when deciding whether functions are readnone. The goal is to get an analysis result even for de-refineable functions. Differential Revision: https://reviews.llvm.org/D29803 llvm-svn: 295472	2017-02-17 18:17:04 +00:00
Simon Pilgrim	99193de8ab	[X86][BMI] Add BMI2 stack folding tests llvm-svn: 295470	2017-02-17 18:00:43 +00:00
Peter Collingbourne	10c500ddc0	opt: Rename -default-data-layout flag to -data-layout and make it always override the layout. There isn't much point in a flag that only works if the data layout is empty. Differential Revision: https://reviews.llvm.org/D30014 llvm-svn: 295468	2017-02-17 17:36:52 +00:00
Justin Bogner	073f56dc1a	OptDiag: Rename DiagnosticInfoWithDebugLoc to WithLocation. NFC This generalizes the name in preparation for decoupling the concept from DebugLoc. llvm-svn: 295465	2017-02-17 17:34:37 +00:00
Rui Ueyama	ac20c17962	MC/COFF: Do not emit forward associative section references. MSVC link.exe cannot handle associative sections that refer to later sections in the section header. Technically, such a COFF object doesn't violate the Microsoft COFF spec, as the spec doesn't say anything about that, but still we should avoid doing that to make it compatible with MS tools. This patch assigns smaller section numbers to non-associative sections and larger numbers to associative sections. This should resolve the compatibility issue. Differential Revision: https://reviews.llvm.org/D30080 llvm-svn: 295464	2017-02-17 17:32:54 +00:00
Sanjay Patel	7f2e58972c	[DAGCombiner] split i1 select-of-constants from non-i1 case; NFCI I can't find any tests of the non-i1 code path, so it may be unnecessary at this point. llvm-svn: 295463	2017-02-17 17:13:27 +00:00
Simon Pilgrim	09dde435ab	[X86][BMI] Add BMI stack folding tests llvm-svn: 295462	2017-02-17 17:11:00 +00:00
Sanjay Patel	f2a345c8ee	[PowerPC] add tests for select-of-constants; NFC llvm-svn: 295460	2017-02-17 16:43:43 +00:00
Sanjay Patel	9b6cfaa7b1	[ARM] add tests for select-of-constants; NFC llvm-svn: 295459	2017-02-17 16:34:13 +00:00
Matthew Simpson	f68e183f91	[LV] Remove constant restriction for vector phi creation We previously only created a vector phi node for an induction variable if its step had a constant integer type. However, the step actually only needs to be loop-invariant. We only handle inductions having loop-invariant steps, so this patch should enable vector phi node creation for all integer induction variables that will be vectorized. Differential Revision: https://reviews.llvm.org/D29956 llvm-svn: 295456	2017-02-17 16:09:07 +00:00
Simon Pilgrim	0429c0cf8b	Fix signed/unsigned comparison warning. llvm-svn: 295453	2017-02-17 16:01:16 +00:00
Sam Parker	58af0c55d2	[ARM] Replace HasT2ExtractPack with HasDSP Removed the HasT2ExtractPack feature and replaced its references with HasDSP. This then allows the Thumb2 extend instructions to be selected for ARMv8M +dsp. These instruction descriptions have also been refactored and more target tests have been added for their isel. Differential Revision: https://reviews.llvm.org/D29623 llvm-svn: 295452	2017-02-17 15:42:44 +00:00
Simon Pilgrim	511d788a95	[DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle masks During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing). This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types. The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712). Differential Revision: https://reviews.llvm.org/D29454 llvm-svn: 295451	2017-02-17 15:14:48 +00:00
Sanjay Patel	5573042035	[DAGCombiner] improve readability; NFCI llvm-svn: 295447	2017-02-17 14:21:59 +00:00
Diana Picus	e836878bf1	[ARM] GlobalISel: Clean up some helpers Return invalid opcodes when some of the helpers in the instruction selection pass can't handle a given combination. llvm-svn: 295446	2017-02-17 13:44:19 +00:00
Diana Picus	38699dbac5	[ARM] GlobalISel: Check mappings used by reg bank select Add some asserts to make sure we're using the mappings that we think we're using. This is to keep us from accidentally breaking functionality while moving to TableGen'erated mappings. llvm-svn: 295441	2017-02-17 13:14:25 +00:00
Diana Picus	7cab0786bd	[ARM] GlobalISel: Use Subtarget in Legalizer Start using the Subtarget to make decisions about what's legal. In particular, we only mark floating point operations as legal if we have VFP2, which is something we should've done from the very start. llvm-svn: 295439	2017-02-17 11:25:17 +00:00
Diana Picus	d2f3ba71c9	[ARM] GlobalISel: Add end-to-end tests for double Test some really basic functionality through the whole GlobalISel pipeline. llvm-svn: 295438	2017-02-17 11:25:11 +00:00
Ismail Donmez	c7ff81435d	Update Bugzilla URLs in docs llvm-svn: 295432	2017-02-17 08:26:11 +00:00
Eugene Leviant	958fcd7502	InstCombine: fix extraction when performing vector/array punning Differential revision: https://reviews.llvm.org/D29491 llvm-svn: 295429	2017-02-17 07:36:03 +00:00
Craig Topper	cbd1b60e42	[IR][X86] Simplify some AutoUpgrade code slightly. NFC llvm-svn: 295426	2017-02-17 07:07:24 +00:00
Craig Topper	905cc75f97	[IR][X86] Rename an AutoUpgrade helper function to more accurately match what intrinsics it handles. NFC llvm-svn: 295425	2017-02-17 07:07:21 +00:00
Craig Topper	b9b9cb0ce6	[IR][X86] Move X86 specific portions of UpgradeIntrinsicFunction1 to a couple helper functions. NFC This enables some early outs to avoid repeatedly using IsX86 check to qualify. I hope to continue to improve this to shorten the lengths of some of the string comparisons. llvm-svn: 295424	2017-02-17 07:07:19 +00:00

145226 Commits