llvm-project

Commit Graph

Author	SHA1	Message	Date
Adam Nemet	053c4e825c	[AVX512] Fix miscompile for unpack r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack between the low and the high 256 bits of src1 into the low part of the destination and another unpack of the low and high 256 bits of src2 into the high part of the destination. I don't think that's how unpack works. AVX512 unpack simply has more 128-bit lanes but other than it works the same way as AVX. So in each 128-bit lane, we're always interleaving certain parts of both operands rather different parts of one of the operands. E.g. for this: __v16sf a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }; __v16sf b = { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 }; __v16sf c = __builtin_shufflevector(a, b, 0, 8, 1, 9, 4, 12, 5, 13, 16, 24, 17, 25, 20, 28, 21, 29); we generated punpcklps (notice how the elements of a and b are not interleaved in the shuffle). In turn, c was set to this: 0 16 1 17 4 20 5 21 8 24 9 25 12 28 13 29 Obviously this should have just returned the mask vector of the shuffle vector. I mostly reverted this change and made sure the original AVX code worked for 512-bit vectors as well. Also updated the tests because they matched the logic from the code. llvm-svn: 217602	2014-09-11 16:51:10 +00:00
Sanjay Patel	1eb5047ddb	Add triple and remove hashes to account for buildbot differences in comment strings. llvm-svn: 217601	2014-09-11 16:08:44 +00:00
Benjamin Kramer	9e5b4a5827	Move constant-sized bitvector to the stack. llvm-svn: 217600	2014-09-11 15:58:39 +00:00
Sanjay Patel	7bd228a82e	Combine fmul vector FP constants when unsafe math is allowed. This is an extension of the change made with r215820: http://llvm.org/viewvc/llvm-project?view=revision&revision=215820 That patch allowed combining of splatted vector FP constants that are multiplied. This patch allows combining non-uniform vector FP constants too by relaxing the check on the type of vector. Also, canonicalize a vector fmul in the same way that we already do for scalars - if only one operand of the fmul is a constant, make it operand 1. Otherwise, we miss potential folds. This fold is also done by -instcombine, but it's possible that extra fmuls may have been generated during lowering. Differential Revision: http://reviews.llvm.org/D5254 llvm-svn: 217599	2014-09-11 15:45:27 +00:00
Sanjay Patel	4cb54e0a78	typo llvm-svn: 217597	2014-09-11 15:41:01 +00:00
Aaron Watry	1885e53a75	R600: Add cmpxchg instruction for evergreen Refactored the R600_LDS_1A2D class a bit to get it to actually work. It seemed to be previously unused and broken. We also have to disable the conversion to the noret variant for now in R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops. Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to work for more than 1A1D variants of LDS operations. It's being left as a future TODO for now. Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217596	2014-09-11 15:02:54 +00:00
Aaron Watry	3ffc560094	R600: Test local atomics for evergreen Now that the operations are all implemented, we can test this sub-arch here. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217595	2014-09-11 15:02:52 +00:00
Aaron Watry	21591670c9	R600: Add LDS_WRXCHG[_RET] instructions for Evergreen. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217594	2014-09-11 15:02:49 +00:00
Aaron Watry	564a22e995	R600: Add LDS_MIN_[U]INT[_RET] instructions for Evergreen Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217593	2014-09-11 15:02:47 +00:00
Aaron Watry	e51794f2fa	R600: Add LDS_XOR[_RET] instructions for Evergreen Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217592	2014-09-11 15:02:46 +00:00
Aaron Watry	cffa0114c7	R600: Add LDS_OR[_RET] instructions for Evergreen Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217591	2014-09-11 15:02:44 +00:00
Aaron Watry	a7f122da60	R600: Add LDS_AND[_RET] instructions for Evergreen Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217590	2014-09-11 15:02:43 +00:00
Aaron Watry	62a0af4a0d	R600: Add LDS_MAX_[U]INT[_RET] instructions for Evergreen This was only present for SI before. Cayman may still be missing, but I am unable to test that currently. v2: Don't create atomicrmw max tests in separate file Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217589	2014-09-11 15:02:41 +00:00
Daniel Sanders	f605184180	[docs] Mention character array constants in docs/LangRef.rst Summary: They were used in the 'Module Structure' example but weren't otherwise documented. Credit to Reed Kotler for noticing. Reviewers: hans Reviewed By: hans Subscribers: hans, llvm-commits Differential Revision: http://reviews.llvm.org/D5191 llvm-svn: 217583	2014-09-11 12:02:59 +00:00
Tilmann Scheller	ee0e49398c	[ARM] Add Thumb-2 code size optimization regression test for LSR (register). llvm-svn: 217582	2014-09-11 10:45:50 +00:00
Tilmann Scheller	579379a6f4	[ARM] Add Thumb-2 code size optimization regression test for LSR (immediate). llvm-svn: 217581	2014-09-11 10:42:17 +00:00
Arnaud A. de Grandmaison	3690266739	[AArch64] Reenable the PBQP test now that the leak issue has been fixed. David Blaikie's commits r217563 & r217564, which added shared_ptr to the CostPool have fixed some memory leak issues exposed by the PBQP with coalescing constraints. The sanitizer bot was failing because of those leaks. Now that the leaks are gone, we can reenable the aarch64/pbqp test. llvm-svn: 217580	2014-09-11 10:39:52 +00:00
Tilmann Scheller	0c1249ac60	[ARM] Add Thumb-2 code size optimization regression test for LSL (register). llvm-svn: 217579	2014-09-11 10:33:39 +00:00
Tilmann Scheller	7430df486e	[ARM] Add Thumb2 code size optimization regression test for LSL (immediate). llvm-svn: 217576	2014-09-11 10:29:42 +00:00
Chandler Carruth	1ec3e4e4bd	[x86] Fixup r217565 which baked in an assumption about the function name that breaks on some platforms. This part of the test just doesn't matter... llvm-svn: 217575	2014-09-11 10:21:25 +00:00
Hal Finkel	f83e1f7f66	[AlignmentFromAssumptions] Don't crash just because the target is 32-bit We used to crash processing any relevant @llvm.assume on a 32-bit target (because we'd ask SE to subtract expressions of differing types). I've copied our 'simple.ll' test, but with the data layout from arm-linux-gnueabihf to get some meaningful test coverage here. llvm-svn: 217574	2014-09-11 08:40:17 +00:00
David Xu	f7aff68fe3	Build correct vector filled with undef nodes llvm-svn: 217570	2014-09-11 05:10:28 +00:00
Justin Bogner	8e5f548b81	utils: Teach lldbDataFormatters how to format ArrayRefs llvm-svn: 217567	2014-09-11 01:47:38 +00:00
Chandler Carruth	292303dd47	[x86] FileCheck-ize this test. llvm-svn: 217565	2014-09-11 00:13:35 +00:00
David Blaikie	792e8f3c02	Use CostPool::PoolRef typedef some more Cleanup to 217563 suggested by Lang Hames in post-commit review. llvm-svn: 217564	2014-09-11 00:08:54 +00:00
David Blaikie	ebd7f671df	shared_ptrify ownershp of PoolEntries in PBQP's CostPool Leveraging both intrusive shared_ptr-ing (std::enable_shared_from_this) and shared_ptr<T>-owning-U (to allow external users to hold std::shared_ptr<CostT> while keeping the underlying PoolEntry alive). The intrusiveness could be removed if we had a weak_set that implicitly removed items from the set when their underlying data went away. This /might/ fix an existing memory leak reported by LeakSanitizer in r217504. llvm-svn: 217563	2014-09-10 23:54:45 +00:00
Matt Arsenault	61a528adc7	R600/SI: Fix losing chain when fixing reg class of loads. The lost chain resulting in earlier side effecting nodes being deleted. llvm-svn: 217561	2014-09-10 23:26:19 +00:00
Matt Arsenault	2e9911205f	R600/SI: Report offset in correct units for st64 DS instructions Need to convert the 64 element offset into bytes, not just the element size like the normal case instructions. Noticed by inspection. This can't be hit now because st64 instructions aren't emitted during instruction selection, and the post-RA scheduler isn't enabled. llvm-svn: 217560	2014-09-10 23:26:16 +00:00
Peter Collingbourne	d0ec5ab948	Add LLVMgold target to test dependencies. llvm-svn: 217557	2014-09-10 22:20:49 +00:00
Matt Arsenault	16e313343d	R600: Custom lower frem llvm-svn: 217553	2014-09-10 21:44:27 +00:00
Rafael Espindola	c435adcde0	Add doInitialization/doFinalization to DataLayoutPass. With this a DataLayoutPass can be reused for multiple modules. Once we have doInitialization/doFinalization, it doesn't seem necessary to pass a Module to the constructor. Overall this change seems in line with the idea of making DataLayout a required part of Module. With it the only way of having a DataLayout used is to add it to the Module. llvm-svn: 217548	2014-09-10 21:27:43 +00:00
Hal Finkel	8123630a21	Enable use of __builtin_assume_aligned when self-hosting Clang/LLVM trunk now have support for __builtin_assume_aligned, turn this && into an \|\| so we can use it ourselves. llvm-svn: 217545	2014-09-10 21:06:11 +00:00
Hal Finkel	71b7084112	[AlignmentFromAssumptions] Don't divide by zero for unknown starting alignment The routine that determines an alignment given some SCEV returns zero if the answer is unknown. In a case where we could determine the increment of an AddRec but not the starting alignment, we would compute the integer modulus by zero (which is illegal and traps). Prevent this by returning early if either the start or increment alignment is unknown (zero). llvm-svn: 217544	2014-09-10 21:05:52 +00:00
Dan Liew	4773d0b4bf	[sphinx cleanup] Fix sphinx warning introduced by r217537 llvm-svn: 217541	2014-09-10 20:43:03 +00:00
Gerolf Hoflehner	7b0abb89c2	[AArch64] Revert r216141 for cyclone The increase of the interleave factor to 4 has side-effects like performance losses eg. due to reminder loops being executed more frequently and may increase code size. It requires more analysis and careful heuristic tuning. Expect double digit gains in small benchmarks like lowercase.c and losses in puzzle.c. llvm-svn: 217540	2014-09-10 20:31:57 +00:00
Gerolf Hoflehner	008e5cdcba	[PassManager] Adding Hidden attribute to EnableMLSM option llvm-svn: 217539	2014-09-10 20:24:03 +00:00
Gerolf Hoflehner	24815d9b8f	[MergedLoadStoreMotion] Move pass enabling option to PassManagerBuilder llvm-svn: 217538	2014-09-10 19:55:29 +00:00
Nico Weber	4b916b21b4	Fix docs reference to inexistent class. Patch sent via telegraph by TNorthover. Thanks! llvm-svn: 217537	2014-09-10 19:50:55 +00:00
Rafael Espindola	71143ed24b	Remember to eraseFromParent after replaceAllUsesWith. llvm-svn: 217536	2014-09-10 19:39:41 +00:00
Adrian Prantl	1383d6f808	Cleanup: Use the appropriate API for accessing the DIVariable of a DBG_VALUE intrinsic. llvm-svn: 217533	2014-09-10 18:52:29 +00:00
Arnaud A. de Grandmaison	d17f96c9ad	[AArch64] Temporarily desactivate the PBQP test, while I investigate some leaks in the allocator llvm-svn: 217531	2014-09-10 18:40:18 +00:00
Alexey Samsonov	17a9cff55c	Make CallingConv::ID an alias of "unsigned". Summary: Make CallingConv::ID a plain unsigned instead of enum with a fixed set of valus. LLVM IR allows arbitraty calling conventions (you are free to write cc12345), and loading them as enum is an undefined behavior. This was reported by UBSan. Test Plan: llvm regression test suite Reviewers: nicholas Reviewed By: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5248 llvm-svn: 217529	2014-09-10 18:00:17 +00:00
Sanjay Patel	b653de1ada	Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528	2014-09-10 17:58:16 +00:00
Gerolf Hoflehner	e4f6684d1b	Removed misleading comment. llvm-svn: 217527	2014-09-10 17:54:50 +00:00
Gerolf Hoflehner	68570c63ca	Added missing blank llvm-svn: 217526	2014-09-10 17:52:27 +00:00
Hans Wennborg	0def0668e4	LangRef: @baz should be @bar in the COMDAT example llvm-svn: 217520	2014-09-10 17:05:08 +00:00
Arnaud A. de Grandmaison	0dbcfba659	[AArch64] Address Chad's post commit review comments for r217504 (PBQP experimental support) llvm-svn: 217518	2014-09-10 17:03:25 +00:00
Sanjay Patel	1893a25a5f	typo llvm-svn: 217516	2014-09-10 16:58:40 +00:00
Frederic Riss	a873414f87	Fix comments of createReplaceableForwardDecl() and createForwardDecl(). Noticed while trying to understand how the merge of forward decalred types and defintions work. Reviewers: echristo, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5291 llvm-svn: 217514	2014-09-10 16:03:14 +00:00
Rafael Espindola	d8bd91ccfc	Replace a few virtual with override. llvm-svn: 217513	2014-09-10 15:50:08 +00:00
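
The AVX512 unpack fix listed above (llvm-svn: 217602) says that unpack interleaves the two operands within each 128-bit lane. A minimal sketch of that rule follows, assuming 16 x float vectors and the usual shufflevector convention that indices 0-15 select from the first operand and 16-31 from the second. The names unpackLowMask and applyShuffleMask are hypothetical helpers for illustration only, not LLVM APIs and not the code from the commit.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kNumElts = 16;     // 16 x float = 512 bits
constexpr std::size_t kEltsPerLane = 4;  // one 128-bit lane holds 4 floats

using Vec16 = std::array<int, kNumElts>;

// Hypothetical helper: the shuffle mask equivalent to a lane-wise
// unpack-low. In every 128-bit lane, the low two elements of the first
// operand are interleaved with the low two elements of the second.
Vec16 unpackLowMask() {
  Vec16 mask{};
  for (std::size_t base = 0; base < kNumElts; base += kEltsPerLane) {
    for (std::size_t i = 0; i < kEltsPerLane / 2; ++i) {
      mask[base + 2 * i] = static_cast<int>(base + i);                 // first operand
      mask[base + 2 * i + 1] = static_cast<int>(kNumElts + base + i);  // second operand
    }
  }
  return mask;
}

// Hypothetical helper: apply a two-operand shuffle mask.
// Indices below kNumElts read from a, the rest read from b.
Vec16 applyShuffleMask(const Vec16 &a, const Vec16 &b, const Vec16 &mask) {
  Vec16 out{};
  for (std::size_t i = 0; i < kNumElts; ++i) {
    std::size_t idx = static_cast<std::size_t>(mask[i]);
    out[i] = idx < kNumElts ? a[idx] : b[idx - kNumElts];
  }
  return out;
}
```

With a = {0..15} and b = {16..31}, the mask from the commit message (0, 8, 1, 9, 4, 12, ...) never reads from b in its first half, so it differs from the lane-wise unpack-low mask and should not be matched as an unpack.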

107571 Commits