llvm-project

Commit Graph

Author	SHA1	Message	Date
Etienne Bergeron	752f8839a4	[compiler-rt] Avoid instrumenting sanitizer functions Summary: Function __asan_default_options is called by __asan_init before the shadow memory got initialized. Instrumenting that function may lead to flaky execution. As the __asan_default_options is provided by users, we cannot expect them to add the appropriate function attributes to avoid instrumentation. Reviewers: kcc, rnk Subscribers: dberris, chrisha, llvm-commits Differential Revision: https://reviews.llvm.org/D24566 llvm-svn: 281503	2016-09-14 17:18:37 +00:00
Simon Pilgrim	a369219ce6	[X86][SSE] Improve recognition of i64 sitofp conversions that can be performed as i32 (PR29078) Until AVX512DQ we only support i64/vXi64 sitofp conversion as scalars. This patch sees if the sign bit extends far enough that we can truncate to a i32 type and then perform sitofp without loss of precision. Differential Revision: https://reviews.llvm.org/D24345 llvm-svn: 281502	2016-09-14 17:15:26 +00:00
Chad Rosier	e6b3a63a3d	[LoopInterchange] Typo. NFC. llvm-svn: 281501	2016-09-14 17:12:30 +00:00
Chad Rosier	72431890b1	[LoopInterchange] Add CL option to override cost threshold. Mostly useful for getting consistent lit testing. llvm-svn: 281500	2016-09-14 17:07:13 +00:00
Simon Pilgrim	fbbb28ebb3	[X86][SSE] Don't use PSHUFD directly - lower with generic shuffle Remove the last user of the old getTargetShuffleNode helpers llvm-svn: 281499	2016-09-14 17:04:22 +00:00
Sanjay Patel	284582b6d4	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(), round 2 ; NFCI llvm-svn: 281498	2016-09-14 16:54:10 +00:00
Chad Rosier	58ede270a7	[LoopInterchange] Cleanup debug whitespace. NFC. llvm-svn: 281497	2016-09-14 16:43:19 +00:00
Sanjay Patel	1ed771f5d7	getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281495	2016-09-14 16:37:15 +00:00
Sanjay Patel	b1f0a0f4a8	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI llvm-svn: 281493	2016-09-14 16:05:51 +00:00
Etienne Bergeron	9bd4281006	Fix typo in comment [NFC] llvm-svn: 281492	2016-09-14 15:59:32 +00:00
Matt Arsenault	2bc198a333	AMDGPU: Support folding FrameIndex operands This avoids test regressions in a future commit. llvm-svn: 281491	2016-09-14 15:51:33 +00:00
Sanjay Patel	5f6bb6cd24	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI llvm-svn: 281490	2016-09-14 15:43:44 +00:00
Sanjay Patel	bd6fca1419	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281489	2016-09-14 15:21:00 +00:00
Matt Arsenault	fa5f767a38	AMDGPU: Improve splitting 64-bit bit ops by constants This addresses a TODO to handle operations besides and. This also starts eliminating no-op operations with a constant that can emerge later. llvm-svn: 281488	2016-09-14 15:19:03 +00:00
Matthew Simpson	b25e87fca5	[LV] Process pointer IVs with PHINodes in collectLoopUniforms This patch moves the processing of pointer induction variables in collectLoopUniforms from the consecutive pointer phase of the analysis to the phi node phase. Previously, if a pointer induction variable was used by both a scalarized non-memory instruction as well as a vectorized memory instruction, we would incorrectly identify the pointer as uniform. Pointer induction variables should be treated the same as other phi nodes. That is, they are uniform if all users of the induction variable and induction variable update are uniform. Differential Revision: https://reviews.llvm.org/D24511 llvm-svn: 281485	2016-09-14 14:47:40 +00:00
James Molloy	13065b00ba	[ARM] Promote small global constants to constant pools If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281484	2016-09-14 14:47:27 +00:00
Simon Pilgrim	ec2d206669	[X86][SSE] Removed unused getTargetShuffleNode function llvm-svn: 281481	2016-09-14 14:30:00 +00:00
Nemanja Ivanovic	d5deb4896c	Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189) This patch corresponds to review: https://reviews.llvm.org/D24021 In the initial implementation of this instruction, I forgot to account for variable indices. This patch fixes PR30189 and should probably be merged into 3.9.1 (I'll open a bug according to the new instructions). llvm-svn: 281479	2016-09-14 14:19:09 +00:00
Silviu Baranga	0a020f0fb0	[StackProtector] Use INITIALIZE_TM_PASS instead of INITIALIZE_PASS in order to make sure that its TargetMachine constructor is registered. This allows us to run the PEI machine pass with MIR input (see PR30324). llvm-svn: 281474	2016-09-14 14:09:43 +00:00
Nemanja Ivanovic	a103d104e1	Adding missing directive for Power9. There is currently no codegen for Power9 that depends on the directive so this is NFC for now but will be important in the future. This was missed in r268950 so I'm adding it now. llvm-svn: 281473	2016-09-14 14:09:39 +00:00
Simon Pilgrim	ba325e3a73	[X86][SSE] Don't blend vector shifts with MOVSS/MOVSD directly, lower from generic shuffle Shuffle lowering will correctly lower to MOVSS/MOVSD/PBLEND, improving commutation opportunities llvm-svn: 281471	2016-09-14 14:08:18 +00:00
Kuba Brecka	a1ea64a044	[asan] Enable -asan-use-private-alias on Darwin/Mach-O, add test for ODR false positive with LTO (llvm part) The '-asan-use-private-alias' option (disabled by default) is currently only enabled for Linux and ELF, but it also works on Darwin and Mach-O. This option also fixes a known problem with LTO on Darwin (https://github.com/google/sanitizers/issues/647). This patch enables the support for Darwin (but still keeps it off by default) and adds the LTO test case. Differential Revision: https://reviews.llvm.org/D24292 llvm-svn: 281470	2016-09-14 14:06:33 +00:00
James Molloy	9790d8f81d	Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently" This reverts commit r281323. It caused chromium test failures and a selfhost failure. llvm-svn: 281451	2016-09-14 09:45:28 +00:00
Vassil Vassilev	2ec8b1506a	Missing includes. llvm-svn: 281450	2016-09-14 08:55:18 +00:00
Tim Northover	1c7825fd79	GlobalISel: mark pointer stores as legal on AArch64. llvm-svn: 281448	2016-09-14 08:28:54 +00:00
Sjoerd Meijer	724023a1ec	This reapplies r281304. The issue was that I had forgotten to copy the new isAdd field in the tablegen data structure. llvm-svn: 281447	2016-09-14 08:20:03 +00:00
Elena Demikhovsky	0569d9d588	AVX-512: Fixed a bug in kortest.z intrinsic Lowering was wrong - X86ISD::SETCC node should return i8 type. llvm-svn: 281446	2016-09-14 08:06:54 +00:00
Igor Breger	74813fc19c	[AVX512BW] Change truncStore action (v16i16->v16i8). It can be legal only with AVX512VL. Differential Revision: http://reviews.llvm.org/D24547 llvm-svn: 281445	2016-09-14 08:04:28 +00:00
Craig Topper	4e2d5a43cf	[X86] Remove the VCVTSI2SD32 with rounding intrinsic. It's not used by clang and not needed since 32-bit integer to double is always exact. llvm-svn: 281442	2016-09-14 06:27:46 +00:00
Wei Mi	24662395df	Create a getelementptr instead of sub expr for ValueOffsetPair if the value is a pointer. This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair, if the value is of pointer type, we can only create a getelementptr instead of sub expr. Differential Revision: https://reviews.llvm.org/D24088 llvm-svn: 281439	2016-09-14 04:39:50 +00:00
Kostya Serebryany	a00b243c75	[libFuzzer] start using trace-pc-guard as an alternative source of coverage llvm-svn: 281435	2016-09-14 02:13:06 +00:00
Kostya Serebryany	da718e55cf	[sanitizer-coverage] add yet another flavour of coverage instrumentation: trace-pc-guard. The intent is to eventually replace all of {bool coverage, 8bit-counters, trace-pc} with just this one. LLVM part llvm-svn: 281431	2016-09-14 01:39:35 +00:00
Akira Hatanaka	6d5a29489a	Address Pete's review comment and define OrigArg on its own line. This is a follow-up to r281419. llvm-svn: 281421	2016-09-13 23:53:43 +00:00
Akira Hatanaka	dea090e6b2	[ObjCARC] Traverse chain downwards to replace uses of argument passed to ObjC library call with call return. ARC contraction tries to replace uses of an argument passed to an objective-c library call with the call return value. For example, in the following IR, it replaces uses of argument %9 and uses of the values discovered traversing the chain upwards (%7 and %8) with the call return %10, if they are dominated by the call to @objc_autoreleaseReturnValue. This transformation enables code-gen to tail-call the call to @objc_autoreleaseReturnValue, which is necessary to enable auto release return value optimization. %7 = tail call i8* @objc_loadWeakRetained(i8** %6) %8 = bitcast i8* %7 to %0* %9 = bitcast %0* %8 to i8* %10 = tail call i8* @objc_autoreleaseReturnValue(i8* %9) ret %0* %8 Since r276727, llvm started removing redundant bitcasts and as a result started feeding the following IR to ARC contraction: %7 = tail call i8* @objc_loadWeakRetained(i8** %6) %8 = bitcast i8* %7 to %0* %9 = tail call i8* @objc_autoreleaseReturnValue(i8* %7) ret %0* %8 ARC contraction no longer does the optimization described above since it only traverses the chain upwards and fails to recognize that the function return can be replaced by the call return. This commit changes ARC contraction to traverse the chain downwards too and replace uses of bitcasts with the call return. rdar://problem/28011339 Differential Revision: https://reviews.llvm.org/D24523 llvm-svn: 281419	2016-09-13 23:43:11 +00:00
Pawel Bylica	c397f0b272	[CodeGen] Fix invalid shift in mul expansion Summary: When expanding mul in type legalization make sure the type for shift amount can actually fit the value. This fixes PR30354 https://llvm.org/bugs/show_bug.cgi?id=30354. Reviewers: hfinkel, majnemer, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D24478 llvm-svn: 281403	2016-09-13 21:55:41 +00:00
Michael Kuperstein	59f8305305	[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors. This allows us to, in some cases, create a vector_shuffle out of a build_vector, when the inputs to the build are extract_elements from two different vectors, at least one of which is wider than the output. (E.g. a <8 x i16> being constructed out of elements from a <16 x i16> and a <8 x i16>). Differential Revision: https://reviews.llvm.org/D24491 llvm-svn: 281402	2016-09-13 21:53:32 +00:00
Kevin Enderby	f76b56cb9c	Next set of additional error checks for invalid Mach-O files for bad load commands that use the Mach::dyld_info_command type for the load commands that are currently used in the MachOObjectFile constructor. This contains the missing checks for LC_DYLD_INFO and LC_DYLD_INFO_ONLY load commands and the fields for the Mach::dyld_info_command type. llvm-svn: 281400	2016-09-13 21:42:28 +00:00
Krzysztof Parzyszek	d19d0507c8	[Hexagon] Better handling of HVX vector lowering - Expand SELECT_CC and BR_CC for vector types. - Implement TLI::isShuffleMaskLegal. llvm-svn: 281397	2016-09-13 21:16:07 +00:00
Matt Arsenault	e2e6cfee61	Reapply "InstCombine: Reduce trunc (shl x, K) width." This reapplies r272987 with a fix for infinitely looping when the truncated value is another shift of a constant. llvm-svn: 281379	2016-09-13 19:43:57 +00:00
Matthias Braun	1af1414d4d	AArch64: Cleanup tailcall CC check, enable swiftcc. Cleanup/change the code that checks for possible tailcall conventions to look the same as the one in the X86 target. This makes the distinction between calling conventions that can guarantee tailcalls and the ones that may tailcall more obvious. - Add Swift to the mayTailCall list - PreserveMost seemed to be incorrectly part of the guaranteed tail call list, move it to the mayTailCall list. llvm-svn: 281376	2016-09-13 19:27:38 +00:00
Matt Arsenault	a992f71bef	AMDGPU: Remove code I think is dead As far as I can tell, resolveFrameIndex is supposed to be called with a legal offset, so inserting an add shouldn't be necessary. llvm-svn: 281372	2016-09-13 19:15:25 +00:00
Matt Arsenault	25dba30017	AMDGPU: Support commuting a FrameIndex operand llvm-svn: 281369	2016-09-13 19:03:12 +00:00
Matthew Simpson	81335bec96	[LV] Clean up uniform induction variable analysis (NFC) llvm-svn: 281368	2016-09-13 19:01:45 +00:00
Davide Italiano	39ccd24126	[LTO] Don't pass SF_Undefined symbols to the IRmover. This should fix PR 30363. llvm-svn: 281366	2016-09-13 18:45:13 +00:00
Simon Pilgrim	4a8eba3e96	[DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range test To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range. Followup to D23007 llvm-svn: 281362	2016-09-13 18:33:29 +00:00
Nico Weber	e204c48d16	Revert r281336 (and r281337), it caused PR30372. llvm-svn: 281361	2016-09-13 18:17:00 +00:00
Douglas Katzman	8ea02f4e1c	[Myriad]: set LeonCASA processor feature llvm-svn: 281359	2016-09-13 17:51:41 +00:00
Simon Pilgrim	bd28a85d14	[DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue() Followup to D23007 llvm-svn: 281354	2016-09-13 17:15:28 +00:00
Matt Arsenault	30bccade0b	Fix misleading comment for getOrEnforceKnownAlignment It does not return 0 to indicate failure, and returns the known alignment. llvm-svn: 281350	2016-09-13 16:39:43 +00:00
Andrea Di Biagio	7277afeec1	[ConstantFold] Improve the bitcast folding logic for constant vectors. The constant folder didn't know how to always fold bitcasts of constant integer vectors. In particular, it was unable to handle the case where a constant vector had some undef elements, and the resulting (i.e. bitcasted) vector type had more elements than the original vector type. Example: %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32> On a little endian target, %cast could have been folded to: <4 x i32><i32 undef, i32 undef, i32 2, i32 0> This patch improves the folding logic by teaching how to correctly propagate undef elements in the folded vector. Differential Revision: https://reviews.llvm.org/D24301 llvm-svn: 281343	2016-09-13 14:50:47 +00:00
Krzysztof Parzyszek	b558ae2125	[Hexagon] Clear the flow queue after visiting a single instruction llvm-svn: 281339	2016-09-13 14:36:55 +00:00
Nirav Dave	fbd38cadf1	Apply Clang-format to MCAsmParser.cpp NFC. llvm-svn: 281337	2016-09-13 13:57:16 +00:00
Nirav Dave	9fa8af2180	Defer asm errors to post-statement failure Recommitting after fixing AsmParser Initialization. Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281336	2016-09-13 13:55:06 +00:00
Chad Rosier	7ea0d3947a	[LoopInterchange] Minor refactor. NFC. llvm-svn: 281334	2016-09-13 13:30:30 +00:00
Chad Rosier	61683a22cb	Don't use else if after return. Tidy comments. NFC. llvm-svn: 281331	2016-09-13 13:08:53 +00:00
Chad Rosier	d18ea0654b	Typo. NFC. llvm-svn: 281330	2016-09-13 13:00:29 +00:00
Chad Rosier	09c1109b12	[LoopInterchange] Tidy up and remove unnecessary dyn_casts. NFC. llvm-svn: 281328	2016-09-13 12:56:04 +00:00
James Molloy	043d613791	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327	2016-09-13 12:45:51 +00:00
Pablo Barrio	bb6984d401	[ARM] Add ".code 32" to functions in the ARM instruction set Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324	2016-09-13 12:18:15 +00:00
James Molloy	d246c598de	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281323	2016-09-13 12:12:32 +00:00
Sam Parker	214f7bf5cc	Enable simplify libcalls for ARM PCS Teach SimplifyLibcalls that it can treat functions annotated with apcs, aapcs or aapcs_vfp like normal C functions if they only take and return integer or pointer values, and the target is not iOS. Differential Revision: https://reviews.llvm.org/D24453 llvm-svn: 281322	2016-09-13 12:10:14 +00:00
Peter Smith	85bbda191d	[ARM] Support ldr.w in pseudo instruction ldr rd,=immediate The changes made in r269352, r269353 and r269354 to support the transformation of the ldr rd,=immediate to mov introduced a regression from 3.8 (ldr.w rd, =immediate) not supported. This change puts support back in for ldr.w by means of a t2InstAlias for the .w form. The .w is ignored in ARM state and propagated to the ldr in Thumb2. llvm-svn: 281319	2016-09-13 11:15:51 +00:00
James Molloy	3e4bc66134	[ARM] Promote small global constants to constant pools If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281314	2016-09-13 10:28:11 +00:00
Ayman Musa	0c2da88f82	Remove MVT:i1 xor instruction before SELECT. (Performance improvement). Differential Revision: https://reviews.llvm.org/D23764 llvm-svn: 281308	2016-09-13 09:12:45 +00:00
Sjoerd Meijer	520a18df9c	Revert of r281304 as it is causing build bot failures in hexagon hwloop regression tests. These tests pass locally; will be investigating where these differences come from. llvm-svn: 281306	2016-09-13 08:51:59 +00:00
Sjoerd Meijer	05453991fe	This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instruction descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements. Patch by Sam Parker and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D23601 llvm-svn: 281304	2016-09-13 08:08:06 +00:00
Elena Demikhovsky	b906df9fe5	AVX-512: Fix for PR28175 - Scalar code optimization. Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32. Optimizing these patterns is one more step towards i1 optimization on AVX-512. Differential Revision: https://reviews.llvm.org/D24456 llvm-svn: 281302	2016-09-13 07:57:00 +00:00
Diana Picus	4b97288184	[AArch64] Support stackmap/patchpoint in getInstSizeInBytes We currently return 4 for stackmaps and patchpoints, which is very optimistic and can in rare cases cause the branch relaxation pass to fail to relax certain branches. This patch causes getInstSizeInBytes to return a pessimistic estimate of the size as the number of bytes requested in the stackmap/patchpoint. In the future, we could provide a more accurate estimate by sharing some of the logic in AArch64::LowerSTACKMAP/PATCHPOINT. Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750 Differential Revision: https://reviews.llvm.org/D24073 llvm-svn: 281301	2016-09-13 07:45:17 +00:00
Craig Topper	4619c9e6a8	[X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native vector shuffles. They were removed from clang previously but accidentally left in the backend. llvm-svn: 281300	2016-09-13 07:40:53 +00:00
Zachary Turner	d97d5a2cee	Revert "[Support][CommandLine] Add cl::getRegisteredSubcommands()" This reverts r281290, as it breaks unit tests. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/303 llvm-svn: 281292	2016-09-13 04:11:57 +00:00
Dean Michael Berris	d9d290c0c6	[Support][CommandLine] Add cl::getRegisteredSubcommands() This should allow users of the library to get a range to iterate through all the subcommands that are registered to the global parser. This allows users to define subcommands in libraries that self-register to have dispatch done at a different stage (like main). It allows for writing code like the following: for (auto S : cl::getRegisteredSubcommands()) { if (S) { // Dispatch on S->getName(). } } This change also contains tests that show this usage pattern. Reviewers: zturner, dblaikie, echristo Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24489 llvm-svn: 281290	2016-09-13 02:35:00 +00:00
Peter Collingbourne	d4135bbc30	DebugInfo: New metadata representation for global variables. This patch reverses the edge from DIGlobalVariable to GlobalVariable. This will allow us to more easily preserve debug info metadata when manipulating global variables. Fixes PR30362. A program for upgrading test cases is attached to that bug. Differential Revision: http://reviews.llvm.org/D20147 llvm-svn: 281284	2016-09-13 01:12:59 +00:00
Michael Kuperstein	efc0667583	[DAG] Refactor BUILD_VECTOR combine to make it easier to extend. NFCI. This should make it easier to add cases that we currently don't cover, like supporting more kinds of type mismatches and more than 2 input vectors. llvm-svn: 281283	2016-09-13 00:57:43 +00:00
Hans Wennborg	8a42d4b9cc	X86: Conditional tail calls should not have isBarrier = 1 That confuses e.g. machine basic block placement, which then doesn't realize that control can fall through a block that ends with a conditional tail call. Instead, isBranch=1 should be set. Also, mark EFLAGS as used by these instructions. llvm-svn: 281281	2016-09-13 00:21:32 +00:00
Eric Christopher	04c7db31e8	Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots. This reverts commit r281249. llvm-svn: 281280	2016-09-13 00:19:29 +00:00
Philip Reames	9db7948e90	[LVI] Complete the abstraction of the cache layer [NFCI] Convert the previously introduced is-a relationship between the LVICache and LVIImpl classes into a has-a relationship and hide all the implementation details of the cache from the lazy query layer. The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue. As far as I can tell, this was effectively dead code. I think it used to be the case that getCachedValueInfo wasn't const and might end up inserting elements in the cache during lookup. That's no longer true and hasn't been for a while. I did fixup the const usage to make that more obvious. llvm-svn: 281272	2016-09-12 22:38:44 +00:00
Philip Reames	b627aec407	[LVI] Sink a couple more cache manipulation routines into the cache itself [NFCI] The only interesting bit here is the refactor of the handle callback and even that's pretty straight-forward. llvm-svn: 281267	2016-09-12 22:03:36 +00:00
Philip Reames	92e5e1b92d	[LVI] Abstract out the actual cache logic [NFCI] Separate the caching logic from the implementation of the lazy analysis. For the moment, the lazy analysis impl has an is-a relationship with the cache; this will change to a has-a relationship shortly. This was done as two steps merely to keep the changes simple and the diff understandable. llvm-svn: 281266	2016-09-12 21:46:58 +00:00
Nico Weber	7c31d0ebc0	Revert r281215, it caused PR30358. llvm-svn: 281263	2016-09-12 21:40:50 +00:00
Dehao Chen	c32d71253c	Fix the bug introduced in r281252. llvm-svn: 281253	2016-09-12 20:29:54 +00:00
Dehao Chen	9bbb941acf	Lower consecutive select instructions correctly. Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition checks and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same time to avoid inefficient code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095 Reviewers: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24147 llvm-svn: 281252	2016-09-12 20:23:28 +00:00
Nirav Dave	c0c0f7a196	[MC] Defer asm errors to post-statement failure Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281249	2016-09-12 20:03:02 +00:00
Lang Hames	8d4be3aacf	[MCJIT] Fix some inconsistent handling of name mangling inside MCJIT. This patch moves symbol mangling from findSymbol to getSymbolAddress. The findSymbol, findExistingSymbol and findModuleForSymbol methods now always take a mangled name, allowing the 'demangle-and-retry' cruft to be removed from findSymbol. See http://llvm.org/PR28699 for details. Patch by James Holderness. Thanks very much James! llvm-svn: 281238	2016-09-12 17:19:24 +00:00
Sanjay Patel	f5887f1fbd	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors isSignBitCheck could be changed to take a pointer param to avoid the 'UnusedBit' ugliness. llvm-svn: 281231	2016-09-12 16:25:41 +00:00
Nicolai Haehnle	e58e0e3fe3	AMDGPU: Do not clobber SCC in SIWholeQuadMode Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22198 llvm-svn: 281230	2016-09-12 16:25:20 +00:00
Ahmed Bougacha	925961b20c	[GlobalISel] Fix mismatched "<..)" in intrinsic MO printing. NFC. llvm-svn: 281229	2016-09-12 16:21:49 +00:00
James Molloy	3d06ff22b7	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228	2016-09-12 16:18:23 +00:00
Ahmed Bougacha	b678219aa6	[BranchFolding] Unique added live-ins after hoisting code. We're not supposed to have duplicate live-ins. llvm-svn: 281224	2016-09-12 16:05:31 +00:00
Ahmed Bougacha	45bfa8772f	[X86] Copy imp-uses when folding tailcall into conditional branch. r280832 added 32-bit support for emitting conditional tail-calls, but dropped imp-used parameter registers. This went unnoticed until r281113, which added 64-bit support, as this is only exposed with parameter passing via registers. Don't drop the imp-used parameters. llvm-svn: 281223	2016-09-12 16:05:27 +00:00
David Majnemer	c83044d9bb	[FunctionAttrs] Don't try to infer returned if it is already on an argument Trying to infer the 'returned' attribute if an argument is already 'returned' can lead to verification failure: inference might determine that a different argument is passed through which would result in two different arguments marked as 'returned'. This fixes PR30350. llvm-svn: 281221	2016-09-12 16:04:59 +00:00
Sanjay Patel	0531f0a5bb	fix formatting; NFC llvm-svn: 281220	2016-09-12 15:52:28 +00:00
Sanjay Patel	3151dec7f1	[InstCombine] add helper function for foldICmpUsingKnownBits; NFCI llvm-svn: 281217	2016-09-12 15:24:31 +00:00
Sam Kolton	fb0d9d9c13	[AMDGPU] Assembler: Move disabled SDWA and DPP instruction into Disable asm variant Summary: This removes disabled instructions from match tables so we will not match them at all. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: wdng, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D24452 llvm-svn: 281216	2016-09-12 14:42:43 +00:00
James Molloy	1e1b56bd48	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281215	2016-09-12 14:30:48 +00:00
Sanjay Patel	5352331716	fix formatting/typos; NFC llvm-svn: 281214	2016-09-12 14:25:46 +00:00
James Molloy	8f82d45ff4	[ARM] Promote small global constants to constant pools If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281213	2016-09-12 13:42:16 +00:00
Chad Rosier	a4c424654e	[LoopInterchange] Improve debug output. NFC. llvm-svn: 281212	2016-09-12 13:24:47 +00:00
Rafael Espindola	74941239d8	Define a dummy zlib::uncompress when zlib is not available. Should fix link errors in some bots when it is used. llvm-svn: 281208	2016-09-12 13:00:51 +00:00
Tim Northover	032548fc5e	GlobalISel: support translation of global addresses. llvm-svn: 281207	2016-09-12 12:10:41 +00:00
Tim Northover	a7653b3919	GlobalISel: translate GEP instructions. Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking a single byte offset) to preserve the pointer type information through selection. llvm-svn: 281205	2016-09-12 11:20:22 +00:00
Tim Northover	d28d3cc079	GlobalISel: disambiguate types when printing MIR Some generic instructions have multiple types. While in theory these can always be discovered by inspecting the single definition of each generic vreg, in practice those definitions won't always be local and traipsing through a big function to find them will not be fun. So this changes MIRPrinter to print out the type of uses as well as defs, if they're known to be different or not known to be the same. On the parsing side, we're a little more flexible: provided each register is given a type in at least one place it's mentioned (and all types are consistent) we accept the MIR. This doesn't introduce ambiguity but makes writing tests manually a bit less painful. llvm-svn: 281204	2016-09-12 11:20:10 +00:00
Eric Liu	c7e5a9ce17	Fix WebAssembly broken build related to interface change in r281172. Reviewers: bkramer Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D24449 llvm-svn: 281201	2016-09-12 09:35:59 +00:00
Duncan P. N. Exon Smith	cd0fffb6e1	MC: Move MCSection::begin/end to header, NFC llvm-svn: 281188	2016-09-12 00:17:09 +00:00
Sanjay Patel	60312bc45f	[InstCombine] add helper function for folding {and,or,xor} (cast X), C ; NFCI llvm-svn: 281187	2016-09-12 00:16:23 +00:00
Duncan P. N. Exon Smith	23d8306d13	ADT: Add AllocatorList, and use it for yaml::Token - Add AllocatorList, a non-intrusive list that owns an LLVM-style allocator and provides a std::list-like interface (trivially built on top of simple_ilist), - add a typedef (and unit tests) for BumpPtrList, and - use BumpPtrList for the list of llvm::yaml::Token (i.e., TokenQueueT). TokenQueueT has no need for the complexity of an intrusive list. The only reason to inherit from ilist was to customize the allocator. TokenQueueT was the only example in-tree of using ilist<> in a truly non-intrusive way. Moreover, this removes the final use of the non-intrusive ilist_traits<>::createNode (after r280573, r281177, and r281181). I have a WIP patch that removes this customization point (and the API that relies on it) that I plan to commit soon. Note: AllocatorList owns the allocator, which limits the viable API (e.g., splicing must be on the same list). For now I've left out any problematic API. It wouldn't be hard to split AllocatorList into two layers: an Impl class that calls DerivedT::getAlloc (via CRTP), and derived classes that handle Allocator ownership/reference/etc semantics; and then implement splice with appropriate assertions; but TBH we should probably just customize the std::list allocators at that point. llvm-svn: 281182	2016-09-11 22:40:40 +00:00
Craig Topper	7600794dde	[TwoAddressInstruction] When commuting an instruction don't assume that the destination register is operand 0. Pass it from the caller. In practice it probably is 0 so this may not be a functional change. llvm-svn: 281180	2016-09-11 22:10:42 +00:00
Duncan P. N. Exon Smith	8b4e4af5ed	ScalarOpts: Use std::list for Candidates, NFC There is nothing intrusive about the Candidate list; use std::list over llvm::ilist for simplicity. llvm-svn: 281177	2016-09-11 21:29:34 +00:00
Duncan P. N. Exon Smith	077f5b41e4	ScalarOpts: Sort includes, NFC llvm-svn: 281176	2016-09-11 21:04:36 +00:00
Duncan P. N. Exon Smith	1872096f1e	CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI Now that MachineBasicBlock::reverse_instr_iterator knows when it's at the end (since r281168 and r281170), implement MachineBasicBlock::reverse_iterator directly on top of an ilist::reverse_iterator by adding an IsReverse template parameter to MachineInstrBundleIterator. This replaces another hard-to-reason-about use of std::reverse_iterator on list iterators, matching the changes for ilist::reverse_iterator from r280032 (see the "out of scope" section at the end of that commit message). MachineBasicBlock::reverse_iterator now has a handle to the current node and has obvious invalidation semantics. r280032 has a more detailed explanation of how list-style reverse iterators (invalidated when the pointed-at node is deleted) are different from vector-style reverse iterators like std::reverse_iterator (invalidated on every operation). A great motivating example is this commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp. Note: If your out-of-tree backend deletes instructions while iterating on a MachineBasicBlock::reverse_iterator or converts between MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator, you'll need to update your code in similar ways to r280032. The following table might help: [Old] ==> [New] delete &RI, RE = end() delete &RI++ RI->erase(), RE = end() RI++->erase() reverse_iterator(I) std::prev(I).getReverse() reverse_iterator(I) ++I.getReverse() --reverse_iterator(I) I.getReverse() reverse_iterator(std::next(I)) I.getReverse() RI.base() std::prev(RI).getReverse() RI.base() ++RI.getReverse() --RI.base() RI.getReverse() std::next(RI).base() RI.getReverse() (For more details, have a look at r280032.) llvm-svn: 281172	2016-09-11 18:51:28 +00:00
Duncan P. N. Exon Smith	cc9edace0c	CodeGen: Turn on sentinel tracking for MachineInstr iterators This is a prep commit before fixing MachineBasicBlock::reverse_iterator invalidation semantics, ala r281167 for ilist::reverse_iterator. This changes MachineBasicBlock::Instructions to track which node is the sentinel regardless of LLVM_ENABLE_ABI_BREAKING_CHECKS. There's almost no functionality change (aside from ABI). However, in the rare configuration: #if !defined(NDEBUG) && !defined(LLVM_ENABLE_ABI_BREAKING_CHECKS) the isKnownSentinel() assertions in ilist_iterator<>::operator* suddenly have teeth for MachineInstr. If these assertions start firing for your out-of-tree backend, have a look at the suggestions in the commit message for r279314, and at some of the commits leading up to it that avoid dereferencing the end() iterator. llvm-svn: 281168	2016-09-11 16:38:18 +00:00
Igor Breger	e73ef85c6f	[AVX512] Fix pattern for vgetmantsd and all other instructions that use same class. Fix memory operand size, remove unnecessary pattern. Differential Revision: http://reviews.llvm.org/D24443 llvm-svn: 281164	2016-09-11 12:38:46 +00:00
James Molloy	104370ab37	[SimplifyCFG] Be even more conservative in SinkThenElseCodeToEnd This should actually fix PR30244. This cranks up the workaround for PR30188 so that we never sink loads or stores of allocas. The idea is that these should be removed by SROA/Mem2Reg, and any movement of them may well confuse SROA or just cause unwanted code churn. It's not ideal that the midend should be crippled like this, but that unwanted churn can really cause significant regressions in important workloads (tsan). llvm-svn: 281162	2016-09-11 09:00:03 +00:00
James Molloy	18d96e8fa5	[SimplifyCFG] Harden up the profitability heuristic for block splitting during sinking Exposed by PR30244, we will split a block currently if we think we can sink at least one instruction. However this isn't right - the reason we split predecessors is so that we can sink instructions that otherwise couldn't be sunk because it isn't safe to do so - stores, for example. So, change the heuristic to only split if it thinks it can sink at least one non-speculatable instruction. Should fix PR30244. llvm-svn: 281160	2016-09-11 08:07:30 +00:00
Craig Topper	1f81deee1f	[CodeGen] Make the TwoAddressInstructionPass check if the instruction is commutable before calling findCommutedOpIndices for every operand. Also make sure the operand is a register before each call to save some work on commutable instructions that might have an operand. llvm-svn: 281158	2016-09-11 06:00:15 +00:00
Craig Topper	fb4564cf21	[AVX-512] Add VPTERNLOG to load folding tables. llvm-svn: 281156	2016-09-11 05:33:40 +00:00
Craig Topper	69be1bd352	[X86] Make a helper method into a static function local to the cpp file. llvm-svn: 281154	2016-09-11 05:33:35 +00:00
Justin Lebar	11a3204355	Add handling of !invariant.load to PropagateMetadata. Summary: This will let e.g. the load/store vectorizer propagate this metadata appropriately. Reviewers: arsenm Subscribers: tra, jholewinski, hfinkel, mzolotukhin Differential Revision: https://reviews.llvm.org/D23479 llvm-svn: 281153	2016-09-11 01:39:08 +00:00
Justin Lebar	6d6b11a4a6	[NVPTX] Use ldg for explicitly invariant loads. Summary: With this change (plus some changes to prevent !invariant from being clobbered within llvm), clang will be able to model the __ldg CUDA builtin as an invariant load, rather than as a target-specific llvm intrinsic. This will let the optimizer play with these loads -- specifically, we should be able to vectorize them in the load-store vectorizer. Reviewers: tra Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc Differential Revision: https://reviews.llvm.org/D23477 llvm-svn: 281152	2016-09-11 01:39:04 +00:00
Justin Lebar	adbf09e8cf	[CodeGen] Split out the notions of MI invariance and MI dereferenceability. Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151	2016-09-11 01:38:58 +00:00
Arnold Schwaighofer	6c57f4f56d	It should also be legal to pass a swifterror parameter to a call as a swifterror argument. rdar://28233388 llvm-svn: 281147	2016-09-10 19:42:53 +00:00
Arnold Schwaighofer	5d335559b9	InstCombine: Don't combine loads/stores from swifterror to a new type This generates invalid IR: the only users of swifterror can be call arguments, loads, and stores. rdar://28242257 llvm-svn: 281144	2016-09-10 18:14:57 +00:00
Arnold Schwaighofer	ad83002243	Add an isSwiftError predicate to Value llvm-svn: 281143	2016-09-10 18:14:54 +00:00
Sanjay Patel	0a3d72bb93	[InstCombine] clean up foldICmpBinOpEqualityWithConstant / foldICmpIntrinsicWithConstant ; NFC 1. Rename variables to be consistent with related/preceding code (may want to reorganize). 2. Fix comments/formatting. llvm-svn: 281140	2016-09-10 15:33:39 +00:00
Sanjay Patel	f58f68c891	[InstCombine] rename and reorganize some icmp folding functions; NFC Everything under foldICmpInstWithConstant() should now be working for splat vectors via m_APInt matchers. Ie, I've removed all of the FIXMEs that I added while cleaning that section up. Note that not all of the associated FIXMEs in the regression tests are gone though, because some of the tests require earlier folds that are still scalar-only. llvm-svn: 281139	2016-09-10 15:03:44 +00:00
Arnold Schwaighofer	112ff66505	We also need to pass swifterror in R12 under swiftcc not only under ccc rdar://28190687 llvm-svn: 281138	2016-09-10 14:16:55 +00:00
Valery Pykhtin	b66e5eb612	[AMDGPU] Refactor MUBUF/MTBUF instructions Differential revision: https://reviews.llvm.org/D24295 llvm-svn: 281137	2016-09-10 13:09:16 +00:00
Heejin Ahn	99bd16b34b	[WebAssembly] Fix typos in comments llvm-svn: 281131	2016-09-10 02:33:47 +00:00
Kostya Serebryany	8c537c556a	[libFuzzer] print a failed-merge warning only in the merge mode llvm-svn: 281130	2016-09-10 02:17:22 +00:00
Matt Arsenault	3354f42ae7	AMDGPU: Implement is{LoadFrom\|StoreTo}FrameIndex llvm-svn: 281128	2016-09-10 01:20:33 +00:00
Matt Arsenault	7348a7eadd	AMDGPU: Fix scheduling info for spill pseudos These defaulted to Write32Bit. I don't think this actually matters since these don't exist during scheduling. llvm-svn: 281127	2016-09-10 01:20:28 +00:00
Vitaly Buka	3ac3aa50f6	[asan] Add flag to allow lifetime analysis of problematic allocas Summary: Could be useful for comparison when we suspect that alloca was skipped because of this. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24437 llvm-svn: 281126	2016-09-10 01:06:11 +00:00
Justin Lebar	d98cf00c95	[CodeGen] Rename MachineInstr::isInvariantLoad to isDereferenceableInvariantLoad. NFC Summary: I want to separate out the notions of invariance and dereferenceability at the MI level, so that they correspond to the equivalent concepts at the IR level. (Currently an MI load is MI-invariant iff it's IR-invariant and IR-dereferenceable.) First step is renaming this function. Reviewers: chandlerc Subscribers: MatzeB, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D23370 llvm-svn: 281125	2016-09-10 01:03:20 +00:00
Kostya Serebryany	4529960a3b	[libFuzzer] don't print help for internal flags llvm-svn: 281124	2016-09-10 00:35:30 +00:00
Kostya Serebryany	b991cc1f0e	[libFuzzer] print a visible message if merge fails due to a crash llvm-svn: 281122	2016-09-10 00:15:41 +00:00
Matt Arsenault	124384f08d	AMDGPU: Fix immediate folding logic when shrinking instructions If the literal is being folded into src0, it doesn't matter if it's an SGPR because it's being replaced with the literal. Also fixes initially selecting 32-bit versions of some instructions which also confused commuting. llvm-svn: 281117	2016-09-09 23:32:53 +00:00
Arnold Schwaighofer	c9277f40fd	Inliner: Don't mark swifterror allocas with lifetime markers This would create a bitcast use which fails the verifier: swifterror values may only be used by loads, stores, and as function arguments. rdar://28233244 llvm-svn: 281114	2016-09-09 22:40:27 +00:00
Hans Wennborg	6ecf619be9	X86: Fold tail calls into conditional branches also for 64-bit (PR26302) This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet). Differential Revision: https://reviews.llvm.org/D24423 llvm-svn: 281113	2016-09-09 22:37:27 +00:00
Matt Arsenault	0efdd06b22	AMDGPU: Run LoadStoreVectorizer pass by default llvm-svn: 281112	2016-09-09 22:29:28 +00:00
Kostya Serebryany	1837152a34	[libFuzzer] use sizeof() in tests instead of 4 and 8 llvm-svn: 281111	2016-09-09 22:21:16 +00:00
Matt Arsenault	950a82047b	LSV: Fix incorrectly increasing alignment If the unaligned access has a dynamic offset, it may be odd which would make the adjusted alignment incorrect to use. llvm-svn: 281110	2016-09-09 22:20:14 +00:00
Sanjay Patel	58109abe91	[InstCombine] use m_APInt to allow icmp ult X, C folds for splat constant vectors llvm-svn: 281107	2016-09-09 21:59:37 +00:00
Kostya Serebryany	4b17a331ae	[libFuzzer] one more puzzle for value profile llvm-svn: 281106	2016-09-09 21:58:42 +00:00
Simon Pilgrim	a3d1e03cd7	[X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets llvm-svn: 281105	2016-09-09 21:47:21 +00:00
Krzysztof Parzyszek	73e0ad8220	[Hexagon] Fix disassembler crash after r279255 When p0 was added as an explicit operand to the duplex subinstructions, the disassembler was not updated to reflect this. llvm-svn: 281104	2016-09-09 21:45:00 +00:00
Arnold Schwaighofer	7d7b4b4014	Create phi nodes for swifterror values at the end of the phi instructions list ISel makes assumption about the order of phi nodes. rdar://28190150 llvm-svn: 281095	2016-09-09 21:18:47 +00:00
Justin Lebar	b5e884976b	[NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc. Summary: Previously these only worked via NVPTX-specific intrinsics. This change will allow us to convert these target-specific intrinsics into the general LLVM versions, allowing existing LLVM passes to reason about their behavior. It also gets us some minor codegen improvements as-is, from situations where we canonicalize code into one of these llvm intrinsics. Reviewers: majnemer Subscribers: llvm-commits, jholewinski, tra Differential Revision: https://reviews.llvm.org/D24300 llvm-svn: 281092	2016-09-09 21:07:26 +00:00
Saleem Abdulrasool	92e33a3ebc	ARM: move the builtins libcall CC setup Move the target specific setup into the target specific lowering setup. As pointed out by Anton, the initial change was moving this too high up the stack resulting in a violation of the layering (the target generic code path setup target specific bits). Sink this into the ARM specific setup. NFC. llvm-svn: 281088	2016-09-09 20:11:31 +00:00
Rafael Espindola	46cdcb03ed	Add a lower level zlib::uncompress. SmallVectors are convenient, but they don't cover every use case. In particular, they are fairly large (3 pointers + one element) and there is no way to take ownership of the buffer to put it somewhere else. This patch then adds a lower level interface that works with any buffer. llvm-svn: 281082	2016-09-09 19:32:36 +00:00
Wei Ding	06f8d39424	AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type. Differential Revision: http://reviews.llvm.org/D23700 llvm-svn: 281081	2016-09-09 19:31:51 +00:00
Tom Stellard	b2869eb6e9	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24405 llvm-svn: 281080	2016-09-09 19:28:00 +00:00
Zachary Turner	36efbfa6d8	[pdb] Print out some more info when dumping a raw stream. We have various command line options that print the type of a stream, the size of a stream, etc but nowhere that it can all be viewed together. Since a previous patch introduced the ability to dump the bytes of a stream, this seems like a good place to present a full view of the stream's properties including its size, what kind of data it represents, and the blocks it occupies. So I added the ability to print that information to the -stream-data command line option. llvm-svn: 281077	2016-09-09 19:00:49 +00:00
Dehao Chen	22ce5eb051	Do not widen load for different variable in GVN. Summary: Widening load in GVN is too early because it will block other optimizations like PRE, LICM. https://llvm.org/bugs/show_bug.cgi?id=29110 The SPECCPU2006 benchmark impact of this patch: Reference: o2_nopatch (1): o2_patched Benchmark Base:Reference (1) ------------------------------------------------------- spec/2006/fp/C++/444.namd 25.2 -0.08% spec/2006/fp/C++/447.dealII 45.92 +1.05% spec/2006/fp/C++/450.soplex 41.7 -0.26% spec/2006/fp/C++/453.povray 35.65 +1.68% spec/2006/fp/C/433.milc 23.79 +0.42% spec/2006/fp/C/470.lbm 41.88 -1.12% spec/2006/fp/C/482.sphinx3 47.94 +1.67% spec/2006/int/C++/471.omnetpp 22.46 -0.36% spec/2006/int/C++/473.astar 21.19 +0.24% spec/2006/int/C++/483.xalancbmk 36.09 -0.11% spec/2006/int/C/400.perlbench 33.28 +1.35% spec/2006/int/C/401.bzip2 22.76 -0.04% spec/2006/int/C/403.gcc 32.36 +0.12% spec/2006/int/C/429.mcf 41.04 -0.41% spec/2006/int/C/445.gobmk 26.94 +0.04% spec/2006/int/C/456.hmmer 24.5 -0.20% spec/2006/int/C/458.sjeng 28 -0.46% spec/2006/int/C/462.libquantum 55.25 +0.27% spec/2006/int/C/464.h264ref 45.87 +0.72% geometric mean +0.23% For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400. Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames Subscribers: gberry, junbuml Differential Revision: https://reviews.llvm.org/D24096 llvm-svn: 281074	2016-09-09 18:42:35 +00:00
Rui Ueyama	a5edf655af	Fix another -Wunused-variable for non-assert build. llvm-svn: 281073	2016-09-09 18:37:08 +00:00
Rui Ueyama	47320da1ee	Fix -Wunused-variable for non-assert build. llvm-svn: 281069	2016-09-09 18:07:33 +00:00
Zachary Turner	9ba31a5efe	[pdb] Pass CVRecord's through the visitor as non-const references. This simplifies a lot of code, and will actually be necessary for an upcoming patch to serialize TPI record hash values. The idea before was that visitors should be examining records, not modifying them. But this is no longer true with a visitor that constructs a CVRecord from Yaml. To handle this until now, we were doing some fixups on CVRecord objects at a higher level, but the code is really awkward, and it makes sense to just have the visitor write the bytes into the CVRecord. In doing so I uncovered a few bugs related to `Data` and `RawData` and fixed those. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24362 llvm-svn: 281067	2016-09-09 18:03:39 +00:00
Kostya Serebryany	00ef27112e	[libFuzzer] one more puzzle, value_profile cracks it in a second llvm-svn: 281066	2016-09-09 18:00:04 +00:00
Zachary Turner	c6d54da891	[pdb] Write PDB TPI Stream from Yaml. This writes the full sequence of type records described in Yaml to the TPI stream of the PDB file. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D24316 llvm-svn: 281063	2016-09-09 17:46:17 +00:00
Reid Kleckner	1076288e22	[codeview] Don't assert if the array element type is incomplete This can happen when the frontend knows the debug info will be emitted somewhere else. Usually this happens for dynamic classes with out of line constructors or key functions, but it can also happen when modules are enabled. llvm-svn: 281060	2016-09-09 17:29:36 +00:00
Sam Kolton	1eeb11bfd4	[AMDGPU] Assembler: better support for immediate literals in assembler. Summary: Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least. With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values, either integer or floating-point. Then we convert parsed literals to the correct form based on information about the type of operand parsed (was the literal floating or binary) and the type of expected instruction operands (is this an f32/64 or b32/64 instruction). Here are the rules for how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing with this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expects fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bits of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0) For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } ''' This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, artem.tamazov Differential Revision: https://reviews.llvm.org/D22922 llvm-svn: 281050	2016-09-09 14:44:04 +00:00
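The encoding arithmetic quoted in the commit above can be checked with a short sketch. It assumes, as the message implies, that a 32-bit literal feeding a 64-bit floating-point operand supplies the high 32 bits of the value with the low 32 bits zero; the helper names (`f32_bits`, `f64_bits`, `f64_from_high32`, `low32_is_zero`) are hypothetical and only illustrate the numbers, not the assembler's code.

```python
import struct

def f32_bits(x: float) -> int:
    """IEEE-754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f64_bits(x: float) -> int:
    """IEEE-754 double-precision bit pattern of x."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def f64_from_high32(hi: int) -> float:
    """Double whose high 32 bits are `hi` and whose low 32 bits are zero
    (assumed behaviour of a 32-bit literal used as an f64 operand)."""
    return struct.unpack("<d", struct.pack("<Q", hi << 32))[0]

def low32_is_zero(x: float) -> bool:
    """True if x can be carried by a 32-bit literal without losing precision."""
    return f64_bits(x) & 0xFFFFFFFF == 0

# Old behaviour: 0.5 encoded as an f32 literal, then read as the top half of an f64.
print(hex(f32_bits(0.5)), f64_from_high32(f32_bits(0.5)))
# New behaviour: use the high 32 bits of the f64 encoding instead.
print(hex(f64_bits(0.5) >> 32), f64_from_high32(f64_bits(0.5) >> 32))
# 1.5 fits in the high half exactly; 3.1415 does not and would trigger the warning.
print(low32_is_zero(1.5), low32_is_zero(3.1415))
```

The `low32_is_zero` check corresponds to the "low 32 bits of literal are zeroes" rule for 64-bit fp operands in the list above.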
Chris Dewhurst	c59f7c745b	[Sparc][LEON] Removed the parts of the errata fixes implemented using inline assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047	2016-09-09 14:16:51 +00:00
James Molloy	57d9dfa9ac	[ARM] ADD with a negative offset can become SUB for free So model that directly in TTI::getIntImmCost(). llvm-svn: 281044	2016-09-09 13:35:36 +00:00
James Molloy	1454e90f86	[ARM] icmp %x, -C can be lowered to a simple ADDS or CMN Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043	2016-09-09 13:35:28 +00:00
Simon Pilgrim	153b408433	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042	2016-09-09 13:31:52 +00:00
James Molloy	4d86bed0bb	[Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0) The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040	2016-09-09 12:52:24 +00:00
Tim Northover	25d1286e5a	GlobalISel: remove G_TYPE and G_PHI These instructions were only necessary when type information was stored in the MachineInstr (because only generic MachineInstrs possessed a type). Now that it's in MachineRegisterInfo, COPY and PHI work fine. llvm-svn: 281037	2016-09-09 11:47:31 +00:00
Tim Northover	1f8b1db93e	GlobalISel: fix comments and add assertions for valid instructions. llvm-svn: 281036	2016-09-09 11:46:58 +00:00
Tim Northover	0f140c769a	GlobalISel: move type information to MachineRegisterInfo. We want each register to have a canonical type, which means the best place to store this is in MachineRegisterInfo rather than on every MachineInstr that happens to use or define that register. Most changes following from this are pretty simple (you need an MRI anyway if you're going to be doing any transformations, so just check the type there). But legalization doesn't really want to check redundant operands (when, for example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's operand type field to encode these constraints and limit legalization's work. As an added bonus, more validation is possible, both in MachineVerifier and MachineIRBuilder (coming soon). llvm-svn: 281035	2016-09-09 11:46:34 +00:00
Simon Dardis	ba92b034bf	Revert "[mips] Fix c.<cc>.<fmt> instruction definition." This reverts commit r281022. Mips buildbot broke, due to unhandled register class FCC. llvm-svn: 281033	2016-09-09 11:06:01 +00:00
Sam Kolton	a2e5c88baf	[AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec Summary: Also removed duplicate code from AMDGPUTargetAsmStreamer. This change only changes how amd_kernel_code_t is parsed and printed. No variable names are changed. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D24296 llvm-svn: 281028	2016-09-09 10:08:02 +00:00
James Molloy	0f41227b21	[Thumb1] Teach optimizeCompareInstr about thumb1 compares This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027	2016-09-09 09:51:06 +00:00
Sam Kolton	d63d8a7c05	[AMDGPU] Assembler: match e32 VOP instructions before e64. Summary: Split assembler match table in 4 tables with assembler variants: Default - all instructions except VOP3, SDWA and DPP - VOP3 - SDWA - DPP First match Default table then VOP3, SDWA and DPP. Reviewers: tstellarAMD, artem.tamazov, vpykhtin Subscribers: arsenm, wdng, nhaehnle, AMDGPU Differential Revision: https://reviews.llvm.org/D24252 llvm-svn: 281023	2016-09-09 09:37:51 +00:00
Simon Dardis	8efa979029	[mips] Fix c.<cc>.<fmt> instruction definition. As part of this effort, remove MipsFCmp nodes and use tablegen patterns rather than custom lowering through C++. Unexpectedly, this improves codesize for microMIPS as previous floating point setcc expansions would materialize 0 and 1 into GPRs before using the relevant mov[tf].[sd] instruction. Now $zero is used directly. Reviewers: dsanders, vkalintiris, zoran.jovanovic Differential Review: https://reviews.llvm.org/D23118 llvm-svn: 281022	2016-09-09 09:22:52 +00:00
Gor Nishanov	faf36c2e0b	[Coroutines] Part13: Handle single edge PHINodes across suspends Summary: If one of the uses of the value is a single edge PHINode, handle it. Original: %val = something <suspend> %p = PHINode [%val] After Spill + Part13: %val = something %slot = gep val.spill.slot store %val, %slot <suspend> %p = load %slot Plus tiny fixes/changes: * use correct index for coro.free in CoroCleanup * fixup id parameter in coro.free to allow authoring coroutine in plain C with __builtins Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24242 llvm-svn: 281020	2016-09-09 05:39:00 +00:00
Amaury Sechet	5f04d819a5	Rationalise the attribute getter/setter methods on Function and CallSite. Summary: While working on mapping attributes in the C API, it clearly appeared that the recent changes in the API on the C++ side left Function and Call/Invoke with an attribute API that grew in an ad hoc manner. This makes it difficult to work with, because one doesn't know which overloads exist and which do not. Make sure that getter/setter functions exist for both enum and string versions. Remove inconsistent getters/setters, unless they have many callsites. This should make it easier to work with attributes in the future. This doesn't change how attributes work. Reviewers: bkramer, whitequark, mehdi_amini, void Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21514 llvm-svn: 281019	2016-09-09 04:50:38 +00:00
Kostya Serebryany	b76a2a5503	[libFuzzer] improve -print_pcs to not print new PCs coming from libFuzzer itself llvm-svn: 281016	2016-09-09 02:38:28 +00:00
Kostya Serebryany	8ea4f9873b	[libFuzzer] remove unneeded call llvm-svn: 281014	2016-09-09 01:57:38 +00:00
Craig Topper	149e6bdc16	[AVX-512] Add VPCMP instructions to the load folding tables and make them commutable. llvm-svn: 281013	2016-09-09 01:36:10 +00:00
Kostya Serebryany	5c04bd250e	[libFuzzer] remove use_traces=1 since use_value_profile seems to be strictly better llvm-svn: 281007	2016-09-09 01:17:03 +00:00
David Majnemer	2c3ea55498	[X86] Tighten up a comment which confused x64 ABI terminology. The x64 ABI has two major function types: - frame functions - leaf functions A frame function is one which requires a stack frame. A leaf function is one which does not. A frame function may or may not have a frame pointer. A leaf function does not require a stack frame and may never modify SP except via a return (RET, tail call via JMP). A frame function which has a frame pointer is permitted to use the LEA instruction in the epilogue; a frame function which doesn't establish a frame pointer must use ADD to adjust the stack pointer in the epilogue. Fun fact: Leaf functions don't require a function table entry (associated PDATA/XDATA). llvm-svn: 281006	2016-09-09 01:07:01 +00:00
Hans Wennborg	c39ef776fc	Win64: Don't use REX prefix for direct tail calls The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003	2016-09-08 23:35:10 +00:00
Dehao Chen	87823f8e4d	Remove debug info when hoisting instruction from then/else branch. Summary: The hoisted instruction is executed speculatively. It could affect the debugging experience as user would see gdb go into code that may not be expected to execute. It will also affect sample profile accuracy by assigning incorrect frequency to source within then/else branch. Reviewers: davidxl, dblaikie, chandlerc, kcc, echristo Subscribers: mehdi_amini, probinson, eric_niebler, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D24164 llvm-svn: 280995	2016-09-08 21:53:33 +00:00
Matthew Simpson	bfe5e1817b	[LV] Ensure proper handling of multi-use case when collecting uniforms The test case included in r280979 wasn't checking what it was supposed to be checking for the predicated store case. Fixing the test revealed that the multi-use case (when a pointer is used by both vectorized and scalarized memory accesses) wasn't being handled properly. We can't skip over non-consecutive-like pointers since they may have looked consecutive-like with a different memory access. llvm-svn: 280992	2016-09-08 21:38:26 +00:00
Krzysztof Parzyszek	a1218728d3	[RDF] Further improve handling of multiple phis reached from shadows llvm-svn: 280987	2016-09-08 20:48:42 +00:00
Matthew Simpson	408a3abcfe	[LV] Don't mark pointers used by scalarized memory accesses uniform Previously, all consecutive pointers were marked uniform after vectorization. However, if a consecutive pointer is used by a memory access that is eventually scalarized, the pointer won't remain uniform after all. An example is predicated stores. Even though a predicated store may be consecutive, it will still be scalarized, making its pointer operand non-uniform. This patch updates the logic in collectLoopUniforms to consider the cases where a memory access may be scalarized. If a memory access may be scalarized, its pointer operand is not marked uniform. The determination of whether a given memory instruction will be scalarized or not has been moved into a common function that is used by the vectorizer, cost model, and legality analysis. Differential Revision: https://reviews.llvm.org/D24271 llvm-svn: 280979	2016-09-08 19:11:07 +00:00
Zachary Turner	35377f88f5	[YAMLIO] Add the ability to map with context. mapping a yaml field to an object in code has always been a stateless operation. You could still pass state by using the `setContext` function of the YAMLIO object, but this represented global state for the entire yaml input. In order to have context-sensitive state, it is necessary to pass this state in at the granularity of an individual mapping. This patch adds support for this type of context-sensitive state. You simply pass an additional argument of type T to the `mapRequired` or `mapOptional` functions, and provided you have specialized a `MappingContextTraits<U, T>` class with the appropriate mapping function, you can pass this context into the mapping function. Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D24162 llvm-svn: 280977	2016-09-08 18:22:44 +00:00
Matt Arsenault	d745c28945	AMDGPU: Sign extend constants when splitting them This will confuse later passes which try to look at the immediate value and don't truncate first. llvm-svn: 280974	2016-09-08 17:44:36 +00:00
Krzysztof Parzyszek	a696b1b641	[Hexagon] Expand sext- and zextloads of vector types, not just extloads Recent change exposed this issue, breaking the Hexagon buildbots. llvm-svn: 280973	2016-09-08 17:42:14 +00:00
Matt Arsenault	be90f70d3a	AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32 llvm-svn: 280972	2016-09-08 17:35:41 +00:00
Eric Christopher	98ddbdb563	AArch64 .arch directive - Include default arch attributes with extensions. Fix the .arch asm parser to use the full set of features for the architecture and any extensions on the command line. Add and update testcases accordingly as well as add an extension that was used but not supported. llvm-svn: 280971	2016-09-08 17:27:03 +00:00
Matt Arsenault	bbb47da8a1	AMDGPU: Support commuting with immediate in src0 llvm-svn: 280970	2016-09-08 17:19:29 +00:00
Renato Golin	049f387112	Revert "[XRay] ARM 32-bit no-Thumb support in LLVM" And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967	2016-09-08 17:10:39 +00:00
Balaram Makam	c6cebf727c	[LoopDataPrefetch] Use range based for loop; NFCI Switch to range based for loop. No functional change, but more readable code. llvm-svn: 280966	2016-09-08 17:08:20 +00:00
Sanjay Patel	1c608f4323	[InstCombine] return a vector-safe true/false constant I introduced this potential bug by missing this diff in: https://reviews.llvm.org/rL280873 ...however, I'm not sure how to reach this code path with a regression test. We may be able to remove this code and assume that the transform to a constant is always handled by InstSimplify? llvm-svn: 280964	2016-09-08 16:54:02 +00:00
Dehao Chen	db3810771e	revert r280427 Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. llvm-svn: 280949	2016-09-08 15:25:12 +00:00
Renato Golin	d257373887	[ARM XRay] Try to fix Thumb-only failure I missed the check that it had to support ARM to work. This commit tries to fix that, to make sure we don't emit ARM code in Thumb-only mode. llvm-svn: 280935	2016-09-08 13:45:10 +00:00
James Molloy	c6a6144966	[SDAGBuilder] Don't create a binary tree for switches in minsize mode This bloats codesize - all of the non-leaf nodes are extra code. llvm-svn: 280932	2016-09-08 13:12:22 +00:00
James Molloy	753c18f5c0	[Thumb1] AND with a constant operand can be converted into BIC So model the cost of materializing the constant operand C as the minimum of C and ~C. llvm-svn: 280929	2016-09-08 12:58:12 +00:00
James Molloy	7c7255e40b	[Thumb1] Fix cost calculation for complemented immediates Materializing something like "-3" can be done as 2 instructions: MOV r0, #3 MVN r0, r0 This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256. There were no tests failing after changing this... :/ llvm-svn: 280928	2016-09-08 12:58:04 +00:00
Simon Pilgrim	cc7b4b511b	[SelectionDAG] Add BUILD_VECTOR support to computeKnownBits and SimplifyDemandedBits Add the ability to computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from BUILD_VECTOR, returning the known bits that are shared by every vector element. This is an initial step towards determining the sign bits of a vector (PR29079). Differential Revision: https://reviews.llvm.org/D24253 llvm-svn: 280927	2016-09-08 12:57:51 +00:00
Simon Pilgrim	a01ee07a19	[DAGCombiner] Enable AND combines of splatted constant vectors Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. llvm-svn: 280926	2016-09-08 12:36:39 +00:00
Pablo Barrio	2b7ed1339c	Revert "[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)" This reverts commit r280808. It is possible that this change results in an infinite loop. This is causing timeouts in some tests on ARM, and a Chromebook bot is failing. llvm-svn: 280918	2016-09-08 10:05:57 +00:00
Hrvoje Varga	dbe4d96b4f	[mips][microMIPS] Implement DBITSWAP, DLSA and LWUPC and add tests for AUI instructions Differential Revision: https://reviews.llvm.org/D16452 llvm-svn: 280909	2016-09-08 07:41:43 +00:00
Vitaly Buka	58a81c6540	[asan] Avoid lifetime analysis for allocas with can be in ambiguous state Summary: C allows jumping over variable declarations, so lifetime.start can be skipped before the variable is used. To avoid false positives in such rare cases, we detect them and remove them from lifetime analysis. PR27453 PR28267 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24321 llvm-svn: 280907	2016-09-08 06:27:58 +00:00
Michael Zolotukhin	e72997a524	Revert "[LoopUnroll] Properly update loop-info when cloning prologues and epilogues." This reverts commit r280901. This caused a bunch of failures, reverting it until I investigate them. llvm-svn: 280905	2016-09-08 03:51:30 +00:00
Michael Zolotukhin	5e0a20697e	[LoopUnroll] Properly update loop-info when cloning prologues and epilogues. Summary: When cloning blocks for prologue/epilogue we need to replicate the loop structure from the original loop. It wasn't a problem for the innermost loops, but it led to an incorrect loop info when we unrolled a loop with a child loop - in this case created prologue-loop had a child loop, but loop info didn't reflect that. This fixes PR28888. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas Differential Revision: https://reviews.llvm.org/D24203 llvm-svn: 280901	2016-09-08 01:52:26 +00:00
Michael Kuperstein	f79af6f8c4	[CGP] Be less conservative about tail-duplicating a ret to allow tail calls CGP tail-duplicates rets into blocks that end with a call that feed the ret. This puts the call in tail position, potentially allowing the DAG builder to lower it as a tail call. To avoid tail duplication in cases where we won't form the tail call, CGP tried to predict whether this is going to be possible, and avoids doing it when lowering as a tail call will definitely fail. However, it was being too conservative by always throwing away calls to functions with a signext/zeroext attribute on the return type. Instead, we can use the same logic the builder uses to determine whether the attributes work out. Differential Revision: https://reviews.llvm.org/D24315 llvm-svn: 280894	2016-09-08 00:48:37 +00:00
Dean Michael Berris	cf3801eee8	[XRay] Remove unused variable llvm-svn: 280891	2016-09-08 00:38:22 +00:00
Dean Michael Berris	17d94e279e	[XRay] ARM 32-bit no-Thumb support in LLVM This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888	2016-09-08 00:19:04 +00:00
Peter Collingbourne	8f1dd5c41e	IR: Remove Value::intersectOptionalDataWith, replace all calls with calls to Instruction::andIRFlags. The two functions are functionally equivalent. Differential Revision: https://reviews.llvm.org/D22830 llvm-svn: 280884	2016-09-07 23:39:04 +00:00
Vitaly Buka	c5e53b2a53	Revert "[asan] Avoid lifetime analysis for allocas with can be in ambiguous state" Fails on Windows. This reverts commit r280880. llvm-svn: 280883	2016-09-07 23:37:15 +00:00
Vitaly Buka	2ca05b07d6	[asan] Avoid lifetime analysis for allocas with can be in ambiguous state Summary: C allows jumping over variable declarations, so lifetime.start can be skipped before the variable is used. To avoid false positives in such rare cases, we detect them and remove them from lifetime analysis. PR27453 PR28267 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24321 llvm-svn: 280880	2016-09-07 23:18:23 +00:00
Sanjay Patel	9b40f98357	[InstCombine] use m_APInt to allow icmp (and (sh X, Y), C2), C1 folds for splat constant vectors llvm-svn: 280873	2016-09-07 22:33:03 +00:00
Hal Finkel	ac5803ba91	[SimplifyCFG] Don't try to create metadata-valued PHIs We can't create metadata-valued PHIs; don't try to do so when sinking. I created a test case for this using the @llvm.type.test intrinsic, because it takes a metadata parameter and does not have severe side effects (thus SimplifyCFG is willing to otherwise sink it). Previously, running the test case would crash with: Invalid use of metadata! %.sink = select i1 %flag, metadata <...>, metadata <0x4e45dc0> LLVM ERROR: Broken function found, compilation aborted! llvm-svn: 280866	2016-09-07 21:38:22 +00:00
Haicheng Wu	109f4f3509	[LoopUnroll] Correct a debug message. NFC. Differential Revision: https://reviews.llvm.org/D24299 llvm-svn: 280865	2016-09-07 21:30:16 +00:00
Elena Demikhovsky	dcc86d5bb6	Shift-left (ISD::SHL) operation crashes on "DAG Legalization" phase. https://llvm.org/bugs/show_bug.cgi?id=29058. While legalizing a node, we tried to legalize its operands. If an operand node is replaced during legalization, the user node may be destroyed. Differential Revision: https://reviews.llvm.org/D24244 llvm-svn: 280862	2016-09-07 20:54:33 +00:00
Sanjay Patel	def931e76a	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors This is a revert of r280676 which was a revert of r280637; ie, this is r280637 again. It was speculatively reverted to help debug buildbot failures. llvm-svn: 280861	2016-09-07 20:50:44 +00:00
Krzysztof Parzyszek	2db0c8b75f	[RDF] Fix liveness analysis for phi nodes with shadow uses Shadow uses need to be analyzed together, since each individual shadow will only have a partial reaching def. All shadows together may cover a given register ref, while each individual shadow may not. llvm-svn: 280855	2016-09-07 20:37:05 +00:00
Michael Kuperstein	71321563de	Don't reuse a variable name in a nested scope. NFC. llvm-svn: 280853	2016-09-07 20:29:49 +00:00
Krzysztof Parzyszek	1ff99525f7	[RDF] Introduce "undef" flag for ref nodes llvm-svn: 280851	2016-09-07 20:10:56 +00:00
Yaxun Liu	90658fff1b	AMDGPU: Remove a useless variable which caused build failure for lld. llvm-svn: 280841	2016-09-07 18:31:11 +00:00
Wei Mi	f100d4e93d	Don't reduce the width of vector mul if the target doesn't support SSE2. The patch is to fix PR30298, which is caused by rL272694. The solution is to bail out if the target has no SSE2. Differential Revision: https://reviews.llvm.org/D24288 llvm-svn: 280837	2016-09-07 18:22:17 +00:00
Chad Rosier	13bc0d19a8	Typo. NFC. llvm-svn: 280834	2016-09-07 18:15:12 +00:00
Saleem Abdulrasool	02d9851c1c	CodeGen: ensure that libcalls are always AAPCS CC The original commit was too aggressive about marking LibCalls as AAPCS. The libcalls contain libc/libm/libunwind calls which are not AAPCS, but C. llvm-svn: 280833	2016-09-07 17:56:09 +00:00
Hans Wennborg	75e25f6812	X86: Fold tail calls into conditional branches where possible (PR26302) When branching to a block that immediately tail calls, it is possible to fold the call directly into the branch if the call is direct and there is no stack adjustment, saving one byte. Example: define void @f(i32 %x, i32 %y) { entry: %p = icmp eq i32 %x, %y br i1 %p, label %bb1, label %bb2 bb1: tail call void @foo() ret void bb2: tail call void @bar() ret void } before: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne .LBB0_2 jmp foo .LBB0_2: jmp bar after: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne bar .LBB0_1: jmp foo I don't expect any significant size savings from this (on a Clang bootstrap I saw 288 bytes), but it does make the code a little tighter. This patch only does 32-bit, but 64-bit would work similarly. Differential Revision: https://reviews.llvm.org/D24108 llvm-svn: 280832	2016-09-07 17:52:14 +00:00
Davide Italiano	ec9612da1a	[lib/LTO] Add a way to run a custom pipeline Differential Revision: https://reviews.llvm.org/D24095 llvm-svn: 280830	2016-09-07 17:46:16 +00:00
Yaxun Liu	638914009a	AMDGPU: Add hidden kernel arguments to runtime metadata OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden argument should be included in the runtime metadata. Also updated kernel argument kind metadata. Differential Revision: https://reviews.llvm.org/D23424 llvm-svn: 280829	2016-09-07 17:44:00 +00:00
Reid Kleckner	a9f4cc9510	[codeview] Add new directives to record inlined call site line info Summary: Previously we were trying to represent this with the "contains" list of the .cv_inline_linetable directive, which was not enough information. Now we directly represent the chain of inlined call sites, so we know what location to emit when we encounter a .cv_loc directive of an inner inlined call site while emitting the line table of an outer function or inlined call site. Fixes PR29146. Also fixes PR29147, where we would crash when .cv_loc directives crossed sections. Now we write down the section of the first .cv_loc directive, and emit an error if any other .cv_loc directive for that function is in a different section. Also fixes issues with discontiguous inlined source locations, like in this example: volatile int unlikely_cond = 0; extern void __declspec(noreturn) abort(); __forceinline void f() { if (!unlikely_cond) abort(); } int main() { unlikely_cond = 0; f(); unlikely_cond = 0; } Previously our tables gave bad location information for the 'abort' call, and the debugger wouldn't show the inlined stack frame for 'f'. It is important to emit good line tables for this code pattern, because it comes up whenever an asan bug occurs in an inlined function. The __asan_report* stubs are generally placed after the normal function epilogue, leading to discontiguous regions of inlined code. Reviewers: majnemer, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24014 llvm-svn: 280822	2016-09-07 16:15:31 +00:00
Chad Rosier	90bcb9176e	[LoopInterchange] Improve debug output. NFC. llvm-svn: 280820	2016-09-07 16:07:17 +00:00
Chad Rosier	f5814f56b8	[LoopInterchange] Improve debug output. NFC. llvm-svn: 280819	2016-09-07 15:56:59 +00:00
Justin Lebar	3a5f40c191	[LSV] Use the original loads' names for the extractelement instructions. Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D23748 llvm-svn: 280818	2016-09-07 15:49:48 +00:00
Sanjay Patel	0bf9a99c7d	[x86] move combines of 'select of 2 constants' to its own function; NFC There are missing folds here and possibly folds that could be made generic. llvm-svn: 280817	2016-09-07 15:47:34 +00:00
Pablo Barrio	fc752bb70a	[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation. Reviewers: scott-0, jmolloy, compnerd, rengolin Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D24133 llvm-svn: 280808	2016-09-07 12:49:15 +00:00
Andrea Di Biagio	f3fd316223	[InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic. This fixes a similar issue to the one already fixed by r280804 (reviewed in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. llvm-svn: 280807	2016-09-07 12:47:53 +00:00
Andrea Di Biagio	8df5b9cf48	[InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi. The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElement may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens, for example, if a constexpr value is passed as input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 llvm-svn: 280804	2016-09-07 12:03:03 +00:00
Renato Golin	c69e0818e0	Revert "[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address." This reverts commit r280796, as it broke the AArch64 bots for no reason. The tests were passing and we should try to keep them passing, so a proper review should make that happen. llvm-svn: 280802	2016-09-07 10:54:42 +00:00
Vasileios Kalintiris	1ed49fd384	[mips] Disable the TImode shift libcalls for 32-bit targets. Summary: The o32 ABI does not support the TImode helpers. For the time being, disable just the shift libcalls as they break recursive builds on MIPS. Reviewers: sdardis Subscribers: llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D24259 llvm-svn: 280798	2016-09-07 10:01:18 +00:00
Sagar Thakur	69c78d8db7	[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address. Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses. Reviewed by bruening Differential: D23801 llvm-svn: 280796	2016-09-07 09:45:37 +00:00
James Molloy	6c009c1c85	[SimplifyCFG] Followup fix to r280790 In failure cases it's not guaranteed that the PHI we're inspecting is actually in the successor block! In this case we need to bail out early, and never query getIncomingValueForBlock() as that will cause an assert. llvm-svn: 280794	2016-09-07 09:01:22 +00:00
James Molloy	ec905a62ae	[SimplifyCFG] Update workaround for PR30188 to also include loads I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores. Fixes PR30188 (again). llvm-svn: 280792	2016-09-07 08:40:20 +00:00
James Molloy	bf1837d9c9	[SimplifyCFG] Check PHI uses more accurately PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG. Fixes PR30292. llvm-svn: 280790	2016-09-07 08:15:54 +00:00
Hal Finkel	42c83f131e	[PowerPC] Fix address-offset folding for plain addi When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. llvm-svn: 280789	2016-09-07 07:36:11 +00:00
Elena Demikhovsky	f0ddd1b8b5	AVX512F: FMA intrinsic + FNEG - sequence optimization The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set. FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))). It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead. I added pattern match for integer XOR. Differential Revision: https://reviews.llvm.org/D24221 llvm-svn: 280785	2016-09-07 06:54:28 +00:00
Matt Arsenault	479ba3aac0	AMDGPU: Make some scalar instructions commutable llvm-svn: 280784	2016-09-07 06:25:55 +00:00
Matt Arsenault	6cda10c950	Remove unnecessary call to getAllocatableRegClass This reapplies r252565 and r252674, effectively reverting r252956. This allows VS_32/VS_64 to be unallocatable like they should be. llvm-svn: 280783	2016-09-07 06:16:45 +00:00
Craig Topper	0e473955a0	[X86] Add hasSideEffects=0 to some instructions. llvm-svn: 280782	2016-09-07 04:46:15 +00:00
Craig Topper	b880ad3a71	[AVX-512] Add support for commuting masked instructions in findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input. llvm-svn: 280781	2016-09-07 04:46:11 +00:00
Saleem Abdulrasool	a7ade33d16	Revert "CodeGen: ensure that libcalls are always AAPCS CC" This reverts SVN r280683. Revert until I figure out why this is breaking lli tests. llvm-svn: 280778	2016-09-07 03:17:19 +00:00
Nick Lewycky	edd0a7023f	Fix typo in comment, NFC llvm-svn: 280774	2016-09-07 01:49:41 +00:00
Davide Italiano	24c29b1426	[LTO] Rename variables to be more explicative. Thanks to Mehdi for the suggestion! llvm-svn: 280772	2016-09-07 01:08:31 +00:00
Hal Finkel	8ca2ed22b2	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike) I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem in greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) are keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. 
The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and|or node are actual setcc nodes, then this is not an issue (because the and|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 280767	2016-09-06 23:02:23 +00:00
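
Note on r280927 (cc7b4b511b) in the log above: the message describes extracting the known zero/one bits of a BUILD_VECTOR as the bits shared by every vector element. The following is only a rough sketch of that intersection idea, not the LLVM implementation. It assumes every element is a compile-time constant of a fixed width, and the function name `known_bits_of_build_vector` and the `bits` parameter are hypothetical names chosen for illustration.

```python
def known_bits_of_build_vector(elements, bits=8):
    """Return (known_zero, known_one) masks shared by every element.

    Assumption for this sketch: each element is a constant integer, so
    all of its bits are known. A bit stays "known" for the whole vector
    only if every element agrees on it.
    """
    if not elements:
        raise ValueError("a vector needs at least one element")
    mask = (1 << bits) - 1
    known_zero = mask  # start from "all bits known", then intersect
    known_one = mask
    for value in elements:
        value &= mask                 # truncate to the element width
        known_one &= value            # bits set in every element so far
        known_zero &= ~value & mask   # bits clear in every element so far
    return known_zero, known_one


# Elements 0b00010100, 0b00010110, 0b00011100 disagree only on bits 1 and 3.
zero, one = known_bits_of_build_vector([0b00010100, 0b00010110, 0b00011100])
print(f"known zero: {zero:#010b}, known one: {one:#010b}")
```

Bits that appear in neither mask are unknown. In this example bits 1 and 3 differ between elements, so they drop out of both masks.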

95082 Commits