llvm-project

Commit Graph

Author	SHA1	Message	Date
Wei Ding	5b2636a152	AMDGPU: Add LLVM IR Intrinsic for v_lerp_u8 Differential Revision: http://reviews.llvm.org/D22239 llvm-svn: 275197	2016-07-12 18:02:14 +00:00
Xinliang David Li	9eb472ba4b	[PGO] Don't include full file path in static function profile counter names Patch by Jake VanAdrighem Differential Revision: http://reviews.llvm.org/D22028 llvm-svn: 275193	2016-07-12 17:14:51 +00:00
Sanjay Patel	4a6a751dce	add tests for missing DeMorgan's Law folds llvm-svn: 275192	2016-07-12 17:05:04 +00:00
Sanjay Patel	3900191ecc	auto-generate checks llvm-svn: 275188	2016-07-12 16:21:55 +00:00
Sanjay Patel	93dffe629a	auto-generate checks llvm-svn: 275187	2016-07-12 16:17:30 +00:00
Sanjay Patel	6d1f227e6b	auto-generate checks llvm-svn: 275186	2016-07-12 16:13:04 +00:00
Haicheng Wu	711ca868fc	[AArch64] Set FMOVS0 and FMOVD0 as isAsCheapAsAMove when needed. If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now only Kryo has both), set FMOVS0 and FMOVD0 isAsCheapAsAMove. Differential Revision: http://reviews.llvm.org/D22256 llvm-svn: 275178	2016-07-12 15:31:41 +00:00
Nemanja Ivanovic	eebbcb6d57	[PowerPC] Canonicalize applicable vector shift immediates as swaps This patch corresponds to review: http://reviews.llvm.org/D21358 Vector shifts that have the same semantics as a vector swap are canonicalized as such to provide additional opportunities for swap removal optimization to remove unnecessary swaps. llvm-svn: 275168	2016-07-12 12:16:27 +00:00
Amjad Aboud	acee568545	[codeview] Improved array type support. Added support for: 1. Multi dimension array. 2. Array of structure type, which previously was declared incompletely. 3. Dynamic size array. 4. Array where element type is a typedef, volatile or constant (this should resolve PR28311). Differential Revision: http://reviews.llvm.org/D21526 llvm-svn: 275167	2016-07-12 12:06:34 +00:00
Nicolai Haehnle	7968c34586	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160	2016-07-12 08:12:16 +00:00
Vitaly Buka	204dc533c5	Revert "New pass manager for LICM." Summary: This reverts commit r275118. Subscribers: sanjoy, mehdi_amini Differential Revision: http://reviews.llvm.org/D22259 llvm-svn: 275156	2016-07-12 06:25:32 +00:00
Craig Topper	a6e6febe2c	[AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR. llvm-svn: 275155	2016-07-12 05:27:53 +00:00
Ivan Krasin	5474645dc8	Print remarks from WholeProgramDevirt pass for each call site. Summary: It's useful to have some visibility about which call sites are devirtualized, especially for debug purposes. Another use case is a regression test on the application side (like, Chromium). Reviewers: pcc Differential Revision: http://reviews.llvm.org/D22252 llvm-svn: 275145	2016-07-12 02:38:37 +00:00
NAKAMURA Takumi	e92e2124f6	llvm/test/CodeGen/AMDGPU/selected-stack-object.ll REQUIRES +Asserts, since it expects assertion failure. llvm-svn: 275144	2016-07-12 02:18:09 +00:00
Haicheng Wu	1e39574e9f	[Kryo] Enable ZCZeroing feature This feature uses immediate #0 to zero a register. Differential Revision: http://reviews.llvm.org/D19985 llvm-svn: 275143	2016-07-12 02:04:01 +00:00
Nico Weber	c7bf646a99	Teach FastISel about thiscall (and, hence, about callee-pop). http://reviews.llvm.org/D22115 llvm-svn: 275135	2016-07-12 01:30:35 +00:00
Matt Arsenault	45f8216cee	AMDGPU: Remove superfluous string attributes from tests Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129	2016-07-11 23:35:48 +00:00
Mehdi Amini	e75aa6f674	Add a libLTO API to query a memory buffer and check if it contains ObjC categories The linker supports a feature to force load an object from a static archive if it defines an Objective-C category. This API supports this feature by looking at every section in the module to find if a category is defined in the module. llvm-svn: 275125	2016-07-11 23:10:18 +00:00
Dehao Chen	7ef5820fa3	New pass manager for LICM. Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275118	2016-07-11 22:45:24 +00:00
Alina Sbirlea	cbc6ac2afd	Correct ordering of loads/stores. Summary: Aiming to correct the ordering of loads/stores. This patch changes the insert point for loads to the position of the first load. It updates the ordering method for loads to insert before, rather than after. Before this patch the following sequence: "load a[1], store a[1], store a[0], load a[2]" Would incorrectly vectorize to "store a[0,1], load a[1,2]". The correctness check was assuming the insertion point for loads is at the position of the first load, when in practice it was at the last load. An alternative fix would have been to invert the correctness check. The current fix changes insert position but also requires reordering of instructions before the vectorized load. Updated testcases to reflect the changes. Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22071 llvm-svn: 275117	2016-07-11 22:34:29 +00:00
Tim Northover	3e0361710a	ARM: validate immediate branch targets in AsmParser. Immediate branch targets aren't commonly used, but if they are we should make sure they can actually be encoded. This means they must be divisible by 2 when targeting Thumb mode, and by 4 when targeting ARM mode. Also do a little naming cleanup while I was changing everything around anyway. llvm-svn: 275116	2016-07-11 22:29:37 +00:00
Nicolai Haehnle	c06bfa1daa	AMDGPU: Treat texture gather instructions more like other MIMG instructions Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113	2016-07-11 21:59:43 +00:00
Zachary Turner	dbeaea7b35	Refactor the PDB writing to use a builder approach llvm-svn: 275110	2016-07-11 21:45:26 +00:00
Zachary Turner	f6b9382467	[pdb] Add a pdb2yaml option to not dump file headers. This will be useful once we start adding the ability to dump type records and symbol records, since it will allow us to generate mergeable information instead of information that specifies an entire file. llvm-svn: 275109	2016-07-11 21:45:09 +00:00
Nicolai Haehnle	f52c3cf272	AMDGPU: fix local stack slot allocation bugs Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108	2016-07-11 21:44:40 +00:00
Michael Kuperstein	f0c59330e9	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00
Quentin Colombet	fb82c7bc94	[X86] Fix tailcall return address clobber bug. This bug (llvm.org/PR28124) was introduced by r237977, which refactored the tail call sequence to be generated in two passes instead of one. Unfortunately, the stack adjustment produced by the first pass was not recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing code such as the following, which clobbers the return address, to be generated: popl %edi popl %edi pushl %eax jmp tailcallee # TAILCALL To fix the problem, the entire stack adjustment is performed in X86ExpandPseudo::ExpandMI() for tail calls. Patch by Magnus Lång <margnus1@gmail.com> Differential Revision: http://reviews.llvm.org/D21325 llvm-svn: 275103	2016-07-11 21:03:03 +00:00
Alina Sbirlea	327955e057	Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast. Reviewers: llvm-commits, jlebar Subscribers: arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21935 llvm-svn: 275100	2016-07-11 20:46:17 +00:00
Michael Kuperstein	cfbac5f361	[X86] Disable FixupSetCC for CodeGenOpt::None It is an optimization pass, and should not run at -O0. Especially since Fast RA will not do the required register coalescing anyway, so it's a loss even from the optimization standpoint. This also works around (but doesn't quite fix) PR28489. llvm-svn: 275099	2016-07-11 20:40:44 +00:00
Chad Rosier	4f0dad1674	[IPRA] Properly compute register usage at call sites. Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 llvm-svn: 275087	2016-07-11 18:45:49 +00:00
Zhan Jun Liau	def708a0f9	[SystemZ] Recognize Load On Condition Immediate (LOCHI/LOCGHI) opportunities Summary: Add support for the z13 instructions LOCHI and LOCGHI which conditionally load immediate values. Add target instruction info hooks so that if conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Committing on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 llvm-svn: 275086	2016-07-11 18:45:03 +00:00
Jingyue Wu	641cfee976	[SLSR] Call getPointerSizeInBits with the correct address space. llvm-svn: 275083	2016-07-11 18:13:28 +00:00
Davide Italiano	e8ae0b5eb4	[PM/IPO] Port LowerTypeTests to the new PassManager. There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082	2016-07-11 18:10:06 +00:00
Jacques Pienaar	c3a162c451	[lanai] Add more tests for assembly of conditional ALU ops llvm-svn: 275081	2016-07-11 17:58:16 +00:00
Dehao Chen	9232f98279	Implement callsite-hotness based inline cost for Sample-based PGO Summary: For sample-based PGO, using BFI to calculate callsite count is sometimes not accurate. This is because with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073	2016-07-11 16:48:54 +00:00
Dehao Chen	29d2641f52	Tune the weight propagation algorithm for sample profile. Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072	2016-07-11 16:40:17 +00:00
Sanjay Patel	8f1d408c74	[x86] make some of the tests 256-bit for testing diversity llvm-svn: 275070	2016-07-11 15:08:37 +00:00
Nirav Dave	8603062ee4	Fix branch relaxation in 16-bit mode. Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068	2016-07-11 14:23:53 +00:00
Sanjay Patel	b428951990	[x86] specify triple to avoid bot failures llvm-svn: 275067	2016-07-11 14:17:54 +00:00
Nicolai Haehnle	889a20cf40	[Sink] Don't move calls to readonly functions across stores Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066	2016-07-11 14:11:51 +00:00
Sanjay Patel	0d38830aca	[x86] update checks llvm-svn: 275064	2016-07-11 14:07:31 +00:00
Nirav Dave	53a72f4d3c	Provide support for preserving assembly comments Preserve assembly comments from input in output assembly and flags to toggle property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 llvm-svn: 275058	2016-07-11 12:42:14 +00:00
Artem Tamazov	53c9de08d2	[AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions. Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054	2016-07-11 12:07:18 +00:00
Zlatko Buljan	cba9f80ba8	[mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050	2016-07-11 07:41:56 +00:00
Elena Demikhovsky	d84f337953	AVX-512: DAG lowering for scalar MIN/MAX commutable ops DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048	2016-07-11 06:08:06 +00:00
Craig Topper	7ee070e7bc	[AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching. llvm-svn: 275046	2016-07-11 05:36:53 +00:00
Craig Topper	516e14cd8e	[AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. llvm-svn: 275045	2016-07-11 05:36:48 +00:00
Hal Finkel	02012bcfee	Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042	2016-07-11 04:51:23 +00:00
Hal Finkel	2cac58f604	Pointer-comparison folding should look through returned-argument functions For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 llvm-svn: 275039	2016-07-11 03:37:59 +00:00
Hal Finkel	bf3957a553	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	e186debb8b	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	6fd5e1f02b	Teach computeKnownBits to look through returned-argument functions If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 llvm-svn: 275036	2016-07-11 02:25:14 +00:00
Hal Finkel	5c12d8fe8f	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Hal Finkel	d66a7b05db	Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027	2016-07-10 22:02:55 +00:00
Jan Vesely	2fa28c330c	AMDGPU/R600: Add implicitarg.ptr intrinsic Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024	2016-07-10 21:20:29 +00:00
Simon Pilgrim	2191faa433	[X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW llvm-svn: 275022	2016-07-10 21:02:47 +00:00
Sanjay Patel	ccd08fc8c4	[x86, SSE, AVX] add tests for icmp+zext (PR28484) Note the inconsistent vpbroadcast generation for AVX2; another bug. llvm-svn: 275020	2016-07-10 20:45:14 +00:00
Simon Pilgrim	51c786bd91	[X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW llvm-svn: 275019	2016-07-10 20:19:56 +00:00
Marcin Koscielnicki	cf7cc724a7	[SystemZ] Utilize Test Data Class instructions. This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32\|64\|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016	2016-07-10 14:41:22 +00:00
Craig Topper	0b0954570a	[AVX512] Add support for lowering to 512-bit SHUFPS. llvm-svn: 275011	2016-07-10 05:55:53 +00:00
Sean Silva	db90d4d9c1	[PM] Port LoopVectorize to the new PM. llvm-svn: 275000	2016-07-09 22:56:50 +00:00
Simon Pilgrim	606126e848	[X86][SSE] Add support for target shuffle combining to INSERTPS llvm-svn: 274990	2016-07-09 21:47:55 +00:00
Simon Pilgrim	890b415902	[X86][SSE] Regenerate vector shift tests llvm-svn: 274987	2016-07-09 20:55:20 +00:00
David Majnemer	28c3646f82	[COFF, Dwarf] Don't emit DW_AT_location for dllimported entities There exists no relocation which can describe the address of a dllimported variable: do not try to describe their location. llvm-svn: 274986	2016-07-09 20:47:48 +00:00
Jingyue Wu	debce55ac3	[SLSR] Fix crash on handling 128-bit integers. ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call getSExtValue only on narrow integers. As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions. llvm-svn: 274982	2016-07-09 19:13:18 +00:00
Jacques Pienaar	b32a912f72	[lanai] Treat .t as optional in assembly parser for RR operands and add predicate operand to ShiftRR llvm-svn: 274980	2016-07-09 18:26:04 +00:00
Matt Arsenault	c1e6a45f2e	AMDGPU: Merge / reorganize tests llvm-svn: 274972	2016-07-09 08:02:28 +00:00
Matt Arsenault	b2cb5f8105	AMDGPU: Simplify tests with per function subtargets llvm-svn: 274971	2016-07-09 07:55:03 +00:00
Matt Arsenault	dfec5ce032	AMDGPU: Fix fdiv lowering when f32 denormals supported Also fix test not actually using function labels. llvm-svn: 274969	2016-07-09 07:48:11 +00:00
Craig Topper	70610cf7b6	[X86] Remove and autoupgrade 512-bit non-temporal store intrinsics. llvm-svn: 274966	2016-07-09 04:38:27 +00:00
Davide Italiano	92b933a55c	[PM] Port CrossDSOCFI to the new pass manager. llvm-svn: 274962	2016-07-09 03:25:35 +00:00
Davide Italiano	cd96cfd8df	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
Matt Arsenault	1322b6f8bb	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Matthias Braun	152e7c8b12	VirtRegMap: Replace some identity copies with KILL instructions. An identity COPY like this: %AL = COPY %AL, %EAX<imp-def> has no semantic effect, but encodes liveness information: Further users of %EAX only depend on this instruction even though it does not define the full register. Replace the COPY with a KILL instruction in those cases to maintain this liveness information. (This reverts a small part of r238588 but this time adds a comment explaining why a KILL instruction is useful). llvm-svn: 274952	2016-07-09 00:19:07 +00:00
Piotr Padlewski	7a298c1df0	Added REQUIRES to TestingGuide documentation Reviewers: alexfh, wolfgangp, rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22172 llvm-svn: 274949	2016-07-08 23:47:29 +00:00
Piotr Padlewski	3b77612839	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified FIXED missing colon on requires. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274947	2016-07-08 23:01:49 +00:00
Piotr Padlewski	d4b792346c	Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata" Reverting because of 17463 http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463 This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c. llvm-svn: 274946	2016-07-08 22:55:48 +00:00
Jacques Pienaar	9e70127b0a	[lanai] Update test to use peephole-opt and not peephole-opts llvm-svn: 274945	2016-07-08 22:28:29 +00:00
Anna Thomas	9ad45adfd7	Revert "InstCombine rule to fold truncs whose value is available" This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943	2016-07-08 22:15:08 +00:00
David Majnemer	230bbfbeec	[MC, COFF] Permit a variable to be redefined Our assertions in WinCOFFStreamer had unexpected side effects resulting in symbols getting unexpectedly marked as used. This fixes PR28462. llvm-svn: 274941	2016-07-08 21:54:16 +00:00
Piotr Padlewski	d6efefa2b8	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274938	2016-07-08 21:25:39 +00:00
Matt Arsenault	3fb8f9eabf	Reapply r274829 with fix for FP vectors llvm-svn: 274937	2016-07-08 21:25:33 +00:00
Adam Nemet	f836067cc0	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Sanjay Patel	664514f7fe	[InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), but we'll try to err on the conservative side by going with the case that has fewer IR instructions. Note: This question came up in http://reviews.llvm.org/D22114 , but this part is independent of that patch proposal, so I'm making this small change ahead of that one. See also: http://reviews.llvm.org/rL274926 llvm-svn: 274932	2016-07-08 21:17:51 +00:00
Sanjay Patel	5246482c7a	add another multi-use test for logic->select transform llvm-svn: 274929	2016-07-08 21:08:16 +00:00
Sanjay Patel	f4a08ede03	[InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops llvm-svn: 274926	2016-07-08 20:53:29 +00:00
Sanjay Patel	297a0e67b6	adjust test so it won't completely optimize away llvm-svn: 274925	2016-07-08 20:35:53 +00:00
Sanjay Patel	0733e6b61c	add tests for multi-use folding to select llvm-svn: 274922	2016-07-08 20:22:27 +00:00
Dehao Chen	429f5c735f	Remove inline hints computation from SampleProfile.cpp Summary: As we will move to use uniformed hotness check in inliner, we do not need inline hints in SampleProfile pass any more. Reviewers: dnovillo, davidxl Subscribers: eraman, llvm-commits Differential Revision: http://reviews.llvm.org/D19287 llvm-svn: 274918	2016-07-08 20:12:44 +00:00
Nico Weber	28410c6846	Revert r274829, it caused PR28472. llvm-svn: 274916	2016-07-08 19:52:19 +00:00
Simon Pilgrim	0a0e0d4e8e	[X86] Regenerated bitreverse tests to demonstrate what is going on. llvm-svn: 274915	2016-07-08 19:51:08 +00:00
Simon Pilgrim	aaaeedb8cb	[X86] Added bitreverse tests for non-legal types Requested on D21578 llvm-svn: 274914	2016-07-08 19:48:33 +00:00
Simon Pilgrim	950419f948	[X86][AVX2] Add support for target shuffle combining to VPERMPD/VPERMQ llvm-svn: 274908	2016-07-08 19:23:29 +00:00
Davide Italiano	d555bde59f	[SCCP] Fold constants as we build them when visiting cast instructions. This should be slightly more efficient and could avoid spurious overdefined markings, as Eli pointed out. Differential Revision: http://reviews.llvm.org/D22122 llvm-svn: 274905	2016-07-08 19:13:40 +00:00
Sanjay Patel	1b6b824548	[InstCombine] check for one-use before turning simple logic op into a select llvm-svn: 274891	2016-07-08 17:26:47 +00:00
Simon Pilgrim	4ca42e232d	[SLPVectorizer][X86] Added fma vectorization tests llvm-svn: 274889	2016-07-08 17:19:13 +00:00
Sanjay Patel	910ce0d511	add test to show multi-use output llvm-svn: 274887	2016-07-08 17:12:27 +00:00
Simon Pilgrim	b600ba3b79	[X86][AVX] Added combine test that should simplify to insertps llvm-svn: 274884	2016-07-08 17:01:42 +00:00
Sanjay Patel	cbfca9e8ef	[InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors llvm-svn: 274883	2016-07-08 17:01:15 +00:00
Zhan Jun Liau	7d4d436c74	[SystemZ] Add support for the .word directive. Summary: Branch off the work to add support for the .word directive, using addAliasForDirective. Reviewers: koriakin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22142 llvm-svn: 274878	2016-07-08 16:50:02 +00:00
Sanjay Patel	647174c8a4	add vector tests to show missing transform llvm-svn: 274876	2016-07-08 16:39:53 +00:00
Matt Arsenault	44540a3db2	PeepholeOptimizer: Make pass name match DEBUG_TYPE llvm-svn: 274874	2016-07-08 16:29:11 +00:00
Zhan Jun Liau	3b4c3f4d51	[SystemZ] Add support for missing instructions Summary: Add support to allow clang integrated assembler to recognize some missing instructions, for openssl. Instructions are: LM, LMH, LMY, STM, STMH, STMY, ICM, ICMH, ICMY, SLA, SLAK, TML, TMH, EX, EXRL. Reviewers: uweigand Subscribers: koriakin, llvm-commits Differential Revision: http://reviews.llvm.org/D22050 llvm-svn: 274869	2016-07-08 16:18:40 +00:00
Sanjay Patel	46df968326	minimize tests The cmp and load aren't required. llvm-svn: 274864	2016-07-08 16:11:48 +00:00
Sanjay Patel	e1acad9b61	regenerate checks llvm-svn: 274860	2016-07-08 16:06:38 +00:00
Chris Dewhurst	3202f065b8	[Sparc] Leon errata fix passes. Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. Note: Running clang-format has changed a few other lines too, unrelated to the implemented errata fixes. These have been left in as this keeps the code formatting consistent. Differential Revision: http://reviews.llvm.org/D21960 llvm-svn: 274856	2016-07-08 15:33:56 +00:00
Sjoerd Meijer	1ee119f897	Do not expand SDIV when compiling for minimum code size Differential Revision: http://reviews.llvm.org/D22139 llvm-svn: 274855	2016-07-08 15:32:01 +00:00
Anna Thomas	3124f6273a	InstCombine rule to fold truncs whose value is available We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didn't match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853	2016-07-08 15:18:56 +00:00
Valery Pykhtin	68853ab2c5	[AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371) Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852	2016-07-08 15:12:46 +00:00
Sjoerd Meijer	46c4c3d31c	Addressing post-commit comments regarding not expanding UDIV; we don't expand only when compiling for minimum code size. llvm-svn: 274847	2016-07-08 14:17:09 +00:00
Simon Pilgrim	4f1877fb57	[X86][SSE] Improve constant folding tests for CVTSD/CVTSS/CVTTSD/CVTTSS As discussed on D22106, improve the testing for constant folding sse scalar conversion intrinsics to ensure we are correctly handling special/out of range cases llvm-svn: 274846	2016-07-08 13:28:34 +00:00
Sjoerd Meijer	a625af3feb	Code size optimisation: don't expand a div to a mul and a shift sequence. As a result, the urem instruction will not be expanded to a sequence of umull, lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod. Differential Revision: http://reviews.llvm.org/D22131 llvm-svn: 274843	2016-07-08 12:54:43 +00:00
Simon Pilgrim	828c731880	[X86][SSE] Accept any shuffle mask that is all zeroes Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes. llvm-svn: 274833	2016-07-08 10:39:12 +00:00
Matt Arsenault	c3a6fe6ecd	Bug 28444: Fix assertion when extract_vector_elt has mismatched type For some reason extract_vector_elt is sometimes allowed to have a different result type than the vector element type. llvm-svn: 274829	2016-07-08 07:05:00 +00:00
Craig Topper	f7bf6de0af	[AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift intrinsics. I'm not sure if clang ever used these builtin names or not. llvm-svn: 274827	2016-07-08 06:14:47 +00:00
Wei Mi	90d195a5fd	[PM] Port UnreachableBlockElim to the new Pass Manager Differential Revision: http://reviews.llvm.org/D22124 llvm-svn: 274824	2016-07-08 03:32:49 +00:00
Saleem Abdulrasool	eb059b0e0a	ARM: support high registers in __builtin_longjmp on WoA Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815	2016-07-08 00:48:22 +00:00
Andrew Kaylor	3387074ae9	Temporarily remove a test case to unblock PPC bots. llvm-svn: 274813	2016-07-08 00:35:39 +00:00
Andrew Kaylor	8b8805c94c	Temporarily remove one test run line to unblock PPC bots. llvm-svn: 274812	2016-07-08 00:32:58 +00:00
Jacques Pienaar	6d3eecc843	[lanai] Use peephole optimizer to generate more conditional ALU operations. Summary: * Similar to the ARM backend use the peephole optimizer to generate more conditional ALU operations; * Add predicated type with default always true to RR instructions in LanaiInstrInfo.td; * Move LanaiSetflagAluCombiner into optimizeCompare; * The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test; * Remove unused MachineOperand flags; Reviewers: eliben Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D22072 llvm-svn: 274807	2016-07-07 23:36:04 +00:00
Michael Kuperstein	3e3652aef2	Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802	2016-07-07 22:50:23 +00:00
Vedant Kumar	0fdffd3709	[tsan] Try harder to not instrument gcov counters GCOVProfiler::emitProfileArcs() can create many variables with names starting with "__llvm_gcov_ctr", so llvm appends a numeric suffix to most of them. Teach tsan about this. llvm-svn: 274801	2016-07-07 22:45:28 +00:00
Kevin Enderby	1851a827a0	Add checks to the MachOObjectFile() constructor to make sure load commands sizes are the correct multiple. llvm-svn: 274798	2016-07-07 22:11:42 +00:00
Davide Italiano	16284df8ec	[PM] Port InstSimplify to the new pass manager. llvm-svn: 274796	2016-07-07 21:14:36 +00:00
Anna Thomas	6a78c78a03	[DSE] Remove dead stores in end blocks containing fence We can remove dead stores in the presence of fence instructions. Fence does not change an otherwise thread local store to visible. reviewers: reames, dexonsmith, jfb Differential Revision: http://reviews.llvm.org/D22001 llvm-svn: 274795	2016-07-07 20:51:42 +00:00
Chad Rosier	112d0e996b	[AArch64] Change the preferred alignment for char and short to word alignment. The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790	2016-07-07 20:02:18 +00:00
Andrew Kaylor	65fa0704aa	Include SelectionDAGISel in the opt-bisect process Differential Revision: http://reviews.llvm.org/D21143 llvm-svn: 274786	2016-07-07 18:55:02 +00:00
Peter Collingbourne	73589f321b	ThinLTO: Do not take into account whether a definition has multiple copies when promoting. We currently do not touch a symbol's linkage in the case where a definition has a single copy. However, this code is effectively unnecessary: either the definition is not exported, in which case the internalize phase sets its linkage to internal, or it is exported, in which case we need to promote linkage to weak. Those two cases are already handled by existing code. I believe that the only real functional change here is in the case where we have a single definition which does not prevail (e.g. because the definition in a native object file prevails). In that case we now lower linkage to available_externally following the existing code path for that case. As a result we can remove the isExported function parameter from the thinLTOResolveWeakForLinkerInIndex function. Differential Revision: http://reviews.llvm.org/D21883 llvm-svn: 274784	2016-07-07 18:31:51 +00:00
Tim Northover	1d106c5fc2	tests: accept different TargetOpcode values. These tests don't actually care about the internal opcode number, but have to be updated whenever we add a new one for GlobalISel. That's bad. llvm-svn: 274774	2016-07-07 17:51:42 +00:00
Michael Kuperstein	edb38a94f8	Revert r274692 to check whether this is what breaks windows selfhost. llvm-svn: 274771	2016-07-07 16:55:35 +00:00
Justin Bogner	a466cc33fa	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Chad Rosier	3972953efd	Revert "[AArch64] Change the preferred alignment for char and short to word alignment" This reverts commit r273279 as the change was not properly approved. llvm-svn: 274768	2016-07-07 16:37:29 +00:00
Valery Pykhtin	af8b1bddbd	[AMDGPU] fix ds_write_src2 encoding (bz26027) Differential revision: http://reviews.llvm.org/D22041 llvm-svn: 274756	2016-07-07 14:23:38 +00:00
Rafael Espindola	b34cba97b7	Don't crash trying to relax 32 loads on COFF. Fixes pr28452. llvm-svn: 274754	2016-07-07 14:00:07 +00:00
Sjoerd Meijer	17c08dc701	Code size optimisation: don't rewrite fputs to fwrite when optimising for size because fwrite requires more arguments and thus extra MOVs are required. llvm-svn: 274753	2016-07-07 13:56:23 +00:00
David Majnemer	7afb46d3c8	[LoopAccessAnalysis] Fix an integer overflow We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737	2016-07-07 06:24:36 +00:00
Craig Topper	d5d2a35013	[AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness. llvm-svn: 274736	2016-07-07 06:11:07 +00:00
Elena Demikhovsky	fc1e969dfc	Fixed a bug in vectorizing GEP before gather/scatter intrinsic. Vectorizing GEP was incorrect and broke SSA in some cases. The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997. Differential revision: http://reviews.llvm.org/D22035 llvm-svn: 274735	2016-07-07 06:06:46 +00:00
David Majnemer	a54fe1acdc	[CodeView] Implement support for thread-local variables llvm-svn: 274734	2016-07-07 05:14:21 +00:00
Qin Zhao	c35b2cba6f	[esan:cfrag] Add option -esan-aux-field-info Summary: Adds option -esan-aux-field-info to control generating binary with auxiliary struct field information. Extracts code for creating auxiliary information from createCacheFragInfoGV into createCacheFragAuxGV. Adds test struct_field_small.ll for -esan-aux-field-info test. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D22019 llvm-svn: 274726	2016-07-07 03:20:16 +00:00
Peter Collingbourne	730c82e6b8	ThinLTO: Remove check for multiple modules before applying weak resolutions. This check is not only unnecessary, it can produce the wrong result. If we are linking a single module and it has an exported linkonce symbol, we need to promote to weak in order to avoid PR19901-style problems. Differential Revision: http://reviews.llvm.org/D21917 llvm-svn: 274722	2016-07-07 01:51:11 +00:00
Sean Silva	284b0324e2	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sean Silva	59fe82f4ce	[PM] Port TailCallElim llvm-svn: 274708	2016-07-06 23:48:41 +00:00
Sean Silva	b025d375a1	[PM] Port CorrelatedValuePropagation llvm-svn: 274705	2016-07-06 23:26:29 +00:00
Peter Collingbourne	d1d2614ee1	ThinLTO: Add test cases for promote+internalize. This tests the effect of both promotion and internalization on a module, and helps show that D21883 is NFC wrt promotion+internalization. Differential Revision: http://reviews.llvm.org/D21915 llvm-svn: 274699	2016-07-06 22:53:02 +00:00
Sanjay Patel	65a51c25c1	[InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we allow these transforms for splat vectors. Differential Revision: http://reviews.llvm.org/D21899 llvm-svn: 274696	2016-07-06 22:23:01 +00:00
Manman Ren	524ca27b90	Add testing coverage for r274582. llvm-svn: 274693	2016-07-06 22:01:28 +00:00
Michael Kuperstein	1ef6c59b1d	[X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. Differential Revision: http://reviews.llvm.org/D21774 llvm-svn: 274692	2016-07-06 21:56:18 +00:00
Vedant Kumar	4c01092a25	[llvm-cov] Add support for creating html reports Based on a patch by Harlan Haskins! Differential Revision: http://reviews.llvm.org/D18278 llvm-svn: 274688	2016-07-06 21:44:05 +00:00
Matthias Braun	ad0032a649	AArch64: Change modeling of zero cycle zeroing. On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should be used to zero a vector register. This was previously done at instruction selection time, however the register coalescer sometimes widened multiple vregs to the Q width because of that leading to extra spills. This patch leaves the decision on how to zero a register to the AsmPrinter phase where it doesn't affect register allocation anymore. This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0. This fixes http://llvm.org/PR27454, rdar://25866262 Differential Revision: http://reviews.llvm.org/D21826 llvm-svn: 274686	2016-07-06 21:39:33 +00:00

37969 Commits