llvm-project

Commit Graph

Author	SHA1	Message	Date
Jingyue Wu	641cfee976	[SLSR] Call getPointerSizeInBits with the correct address space. llvm-svn: 275083	2016-07-11 18:13:28 +00:00
Davide Italiano	e8ae0b5eb4	[PM/IPO] Port LowerTypeTests to the new PassManager. There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082	2016-07-11 18:10:06 +00:00
Jacques Pienaar	c3a162c451	[lanai] Add more tests for assembly of conditional ALU ops llvm-svn: 275081	2016-07-11 17:58:16 +00:00
Dehao Chen	9232f98279	Implement callsite-hotness based inline cost for Sample-based PGO Summary: For sample-based PGO, using BFI to calculate callsite count is sometimes not accurate. This is because with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073	2016-07-11 16:48:54 +00:00
Dehao Chen	29d2641f52	Tune the weight propagation algorithm for sample profile. Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072	2016-07-11 16:40:17 +00:00
Sanjay Patel	8f1d408c74	[x86] make some of the tests 256-bit for testing diversity llvm-svn: 275070	2016-07-11 15:08:37 +00:00
Nirav Dave	8603062ee4	Fix branch relaxation in 16-bit mode. Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068	2016-07-11 14:23:53 +00:00
Sanjay Patel	b428951990	[x86] specify triple to avoid bot failures llvm-svn: 275067	2016-07-11 14:17:54 +00:00
Nicolai Haehnle	889a20cf40	[Sink] Don't move calls to readonly functions across stores Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066	2016-07-11 14:11:51 +00:00
Sanjay Patel	0d38830aca	[x86] update checks llvm-svn: 275064	2016-07-11 14:07:31 +00:00
Nirav Dave	53a72f4d3c	Provide support for preserving assembly comments Preserve assembly comments from input in output assembly and flags to toggle property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 llvm-svn: 275058	2016-07-11 12:42:14 +00:00
Artem Tamazov	53c9de08d2	[AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions. Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054	2016-07-11 12:07:18 +00:00
Zlatko Buljan	cba9f80ba8	[mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050	2016-07-11 07:41:56 +00:00
Elena Demikhovsky	d84f337953	AVX-512: DAG lowering for scalar MIN/MAX commutable ops DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048	2016-07-11 06:08:06 +00:00
Craig Topper	7ee070e7bc	[AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching. llvm-svn: 275046	2016-07-11 05:36:53 +00:00
Craig Topper	516e14cd8e	[AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. llvm-svn: 275045	2016-07-11 05:36:48 +00:00
Hal Finkel	02012bcfee	Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042	2016-07-11 04:51:23 +00:00
Hal Finkel	2cac58f604	Pointer-comparison folding should look through returned-argument functions For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 llvm-svn: 275039	2016-07-11 03:37:59 +00:00
Hal Finkel	bf3957a553	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	e186debb8b	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	6fd5e1f02b	Teach computeKnownBits to look through returned-argument functions If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 llvm-svn: 275036	2016-07-11 02:25:14 +00:00
Hal Finkel	5c12d8fe8f	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Hal Finkel	d66a7b05db	Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027	2016-07-10 22:02:55 +00:00
Jan Vesely	2fa28c330c	AMDGPU/R600: Add implicitarg.ptr intrinsic Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024	2016-07-10 21:20:29 +00:00
Simon Pilgrim	2191faa433	[X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW llvm-svn: 275022	2016-07-10 21:02:47 +00:00
Sanjay Patel	ccd08fc8c4	[x86, SSE, AVX] add tests for icmp+zext (PR28484) Note the inconsistent vpbroadcast generation for AVX2; another bug. llvm-svn: 275020	2016-07-10 20:45:14 +00:00
Simon Pilgrim	51c786bd91	[X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW llvm-svn: 275019	2016-07-10 20:19:56 +00:00
Marcin Koscielnicki	cf7cc724a7	[SystemZ] Utilize Test Data Class instructions. This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016	2016-07-10 14:41:22 +00:00
Craig Topper	0b0954570a	[AVX512] Add support for lowering to 512-bit SHUFPS. llvm-svn: 275011	2016-07-10 05:55:53 +00:00
Sean Silva	db90d4d9c1	[PM] Port LoopVectorize to the new PM. llvm-svn: 275000	2016-07-09 22:56:50 +00:00
Simon Pilgrim	606126e848	[X86][SSE] Add support for target shuffle combining to INSERTPS llvm-svn: 274990	2016-07-09 21:47:55 +00:00
Simon Pilgrim	890b415902	[X86][SSE] Regenerate vector shift tests llvm-svn: 274987	2016-07-09 20:55:20 +00:00
David Majnemer	28c3646f82	[COFF, Dwarf] Don't emit DW_AT_location for dllimported entities There exists no relocation which can describe the address of a dllimported variable: do not try to describe their location. llvm-svn: 274986	2016-07-09 20:47:48 +00:00
Jingyue Wu	debce55ac3	[SLSR] Fix crash on handling 128-bit integers. ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call getSExtValue only on narrow integers. As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions. llvm-svn: 274982	2016-07-09 19:13:18 +00:00
Jacques Pienaar	b32a912f72	[lanai] Treat .t as optional in assembly parser for RR operands and add predicate operand to ShiftRR llvm-svn: 274980	2016-07-09 18:26:04 +00:00
Matt Arsenault	c1e6a45f2e	AMDGPU: Merge / reorganize tests llvm-svn: 274972	2016-07-09 08:02:28 +00:00
Matt Arsenault	b2cb5f8105	AMDGPU: Simplify tests with per function subtargets llvm-svn: 274971	2016-07-09 07:55:03 +00:00
Matt Arsenault	dfec5ce032	AMDGPU: Fix fdiv lowering when f32 denormals supported Also fix test not actually using function labels. llvm-svn: 274969	2016-07-09 07:48:11 +00:00
Craig Topper	70610cf7b6	[X86] Remove and autoupgrade 512-bit non-temporal store intrinsics. llvm-svn: 274966	2016-07-09 04:38:27 +00:00
Davide Italiano	92b933a55c	[PM] Port CrossDSOCFI to the new pass manager. llvm-svn: 274962	2016-07-09 03:25:35 +00:00
Davide Italiano	cd96cfd8df	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
Matt Arsenault	1322b6f8bb	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Matthias Braun	152e7c8b12	VirtRegMap: Replace some identity copies with KILL instructions. An identity COPY like this: %AL = COPY %AL, %EAX<imp-def> has no semantic effect, but encodes liveness information: Further users of %EAX only depend on this instruction even though it does not define the full register. Replace the COPY with a KILL instruction in those cases to maintain this liveness information. (This reverts a small part of r238588 but this time adds a comment explaining why a KILL instruction is useful). llvm-svn: 274952	2016-07-09 00:19:07 +00:00
Piotr Padlewski	7a298c1df0	Added REQUIRES to TestingGuide documentation Reviewers: alexfh, wolfgangp, rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22172 llvm-svn: 274949	2016-07-08 23:47:29 +00:00
Piotr Padlewski	3b77612839	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified FIXED missing colon on requires. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274947	2016-07-08 23:01:49 +00:00
Piotr Padlewski	d4b792346c	Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata" Reverting because of 17463 http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463 This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c. llvm-svn: 274946	2016-07-08 22:55:48 +00:00
Jacques Pienaar	9e70127b0a	[lanai] Update test to use peephole-opt and not peephole-opts llvm-svn: 274945	2016-07-08 22:28:29 +00:00
Anna Thomas	9ad45adfd7	Revert "InstCombine rule to fold truncs whose value is available" This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943	2016-07-08 22:15:08 +00:00
David Majnemer	230bbfbeec	[MC, COFF] Permit a variable to be redefined Our assertions in WinCOFFStreamer had unexpected side effects resulting in symbols getting unexpectedly marked as used. This fixes PR28462. llvm-svn: 274941	2016-07-08 21:54:16 +00:00
Piotr Padlewski	d6efefa2b8	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274938	2016-07-08 21:25:39 +00:00
Matt Arsenault	3fb8f9eabf	Reapply r274829 with fix for FP vectors llvm-svn: 274937	2016-07-08 21:25:33 +00:00
Adam Nemet	f836067cc0	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Sanjay Patel	664514f7fe	[InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), but we'll try to err on the conservative side by going with the case that has fewer IR instructions. Note: This question came up in http://reviews.llvm.org/D22114 , but this part is independent of that patch proposal, so I'm making this small change ahead of that one. See also: http://reviews.llvm.org/rL274926 llvm-svn: 274932	2016-07-08 21:17:51 +00:00
Sanjay Patel	5246482c7a	add another multi-use test for logic->select transform llvm-svn: 274929	2016-07-08 21:08:16 +00:00
Sanjay Patel	f4a08ede03	[InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops llvm-svn: 274926	2016-07-08 20:53:29 +00:00
Sanjay Patel	297a0e67b6	adjust test so it won't completely optimize away llvm-svn: 274925	2016-07-08 20:35:53 +00:00
Sanjay Patel	0733e6b61c	add tests for multi-use folding to select llvm-svn: 274922	2016-07-08 20:22:27 +00:00
Dehao Chen	429f5c735f	Remove inline hints computation from SampleProfile.cpp Summary: As we will move to use a uniform hotness check in the inliner, we do not need inline hints in the SampleProfile pass any more. Reviewers: dnovillo, davidxl Subscribers: eraman, llvm-commits Differential Revision: http://reviews.llvm.org/D19287 llvm-svn: 274918	2016-07-08 20:12:44 +00:00
Nico Weber	28410c6846	Revert r274829, it caused PR28472. llvm-svn: 274916	2016-07-08 19:52:19 +00:00
Simon Pilgrim	0a0e0d4e8e	[X86] Regenerated bitreverse tests to demonstrate what is going on. llvm-svn: 274915	2016-07-08 19:51:08 +00:00
Simon Pilgrim	aaaeedb8cb	[X86] Added bitreverse tests for non-legal types Requested on D21578 llvm-svn: 274914	2016-07-08 19:48:33 +00:00
Simon Pilgrim	950419f948	[X86][AVX2] Add support for target shuffle combining to VPERMPD/VPERMQ llvm-svn: 274908	2016-07-08 19:23:29 +00:00
Davide Italiano	d555bde59f	[SCCP] Fold constants as we build them when visiting cast instructions. This should be slightly more efficient and could avoid spurious overdefined markings, as Eli pointed out. Differential Revision: http://reviews.llvm.org/D22122 llvm-svn: 274905	2016-07-08 19:13:40 +00:00
Sanjay Patel	1b6b824548	[InstCombine] check for one-use before turning simple logic op into a select llvm-svn: 274891	2016-07-08 17:26:47 +00:00
Simon Pilgrim	4ca42e232d	[SLPVectorizer][X86] Added fma vectorization tests llvm-svn: 274889	2016-07-08 17:19:13 +00:00
Sanjay Patel	910ce0d511	add test to show multi-use output llvm-svn: 274887	2016-07-08 17:12:27 +00:00
Simon Pilgrim	b600ba3b79	[X86][AVX] Added combine test that should simplify to insertps llvm-svn: 274884	2016-07-08 17:01:42 +00:00
Sanjay Patel	cbfca9e8ef	[InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors llvm-svn: 274883	2016-07-08 17:01:15 +00:00
Zhan Jun Liau	7d4d436c74	[SystemZ] Add support for the .word directive. Summary: Branch off the work to add support for the .word directive, using addAliasForDirective. Reviewers: koriakin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22142 llvm-svn: 274878	2016-07-08 16:50:02 +00:00
Sanjay Patel	647174c8a4	add vector tests to show missing transform llvm-svn: 274876	2016-07-08 16:39:53 +00:00
Matt Arsenault	44540a3db2	PeepholeOptimizer: Make pass name match DEBUG_TYPE llvm-svn: 274874	2016-07-08 16:29:11 +00:00
Zhan Jun Liau	3b4c3f4d51	[SystemZ] Add support for missing instructions Summary: Add support to allow clang integrated assembler to recognize some missing instructions, for openssl. Instructions are: LM, LMH, LMY, STM, STMH, STMY, ICM, ICMH, ICMY, SLA, SLAK, TML, TMH, EX, EXRL. Reviewers: uweigand Subscribers: koriakin, llvm-commits Differential Revision: http://reviews.llvm.org/D22050 llvm-svn: 274869	2016-07-08 16:18:40 +00:00
Sanjay Patel	46df968326	minimize tests The cmp and load aren't required. llvm-svn: 274864	2016-07-08 16:11:48 +00:00
Sanjay Patel	e1acad9b61	regenerate checks llvm-svn: 274860	2016-07-08 16:06:38 +00:00
Chris Dewhurst	3202f065b8	[Sparc] Leon errata fix passes. Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. Note: Running clang-format has changed a few other lines too, unrelated to the implemented errata fixes. These have been left in as this keeps the code formatting consistent. Differential Revision: http://reviews.llvm.org/D21960 llvm-svn: 274856	2016-07-08 15:33:56 +00:00
Sjoerd Meijer	1ee119f897	Do not expand SDIV when compiling for minimum code size Differential Revision: http://reviews.llvm.org/D22139 llvm-svn: 274855	2016-07-08 15:32:01 +00:00
Anna Thomas	3124f6273a	InstCombine rule to fold truncs whose value is available We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didn't match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853	2016-07-08 15:18:56 +00:00
Valery Pykhtin	68853ab2c5	[AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371) Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852	2016-07-08 15:12:46 +00:00
Sjoerd Meijer	46c4c3d31c	Addressing post-commit comments regarding not expanding UDIV; we don't expand only when compiling for minimum code size. llvm-svn: 274847	2016-07-08 14:17:09 +00:00
Simon Pilgrim	4f1877fb57	[X86][SSE] Improve constant folding tests for CVTSD/CVTSS/CVTTSD/CVTTSS As discussed on D22106, improve the testing for constant folding sse scalar conversion intrinsics to ensure we are correctly handling special/out of range cases llvm-svn: 274846	2016-07-08 13:28:34 +00:00
Sjoerd Meijer	a625af3feb	Code size optimisation: don't expand a div to a mul and a shift sequence. As a result, the urem instruction will not be expanded to a sequence of umull, lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod. Differential Revision: http://reviews.llvm.org/D22131 llvm-svn: 274843	2016-07-08 12:54:43 +00:00
Simon Pilgrim	828c731880	[X86][SSE] Accept any shuffle mask that is all zeroes Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes. llvm-svn: 274833	2016-07-08 10:39:12 +00:00
Matt Arsenault	c3a6fe6ecd	Bug 28444: Fix assertion when extract_vector_elt has mismatched type For some reason extract_vector_elt is sometimes allowed to have a different result type than the vector element type. llvm-svn: 274829	2016-07-08 07:05:00 +00:00
Craig Topper	f7bf6de0af	[AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift intrinsics. I'm not sure if clang ever used these builtin names or not. llvm-svn: 274827	2016-07-08 06:14:47 +00:00
Wei Mi	90d195a5fd	[PM] Port UnreachableBlockElim to the new Pass Manager Differential Revision: http://reviews.llvm.org/D22124 llvm-svn: 274824	2016-07-08 03:32:49 +00:00
Saleem Abdulrasool	eb059b0e0a	ARM: support high registers in __builtin_longjmp on WoA Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815	2016-07-08 00:48:22 +00:00
Andrew Kaylor	3387074ae9	Temporarily remove a test case to unblock PPC bots. llvm-svn: 274813	2016-07-08 00:35:39 +00:00
Andrew Kaylor	8b8805c94c	Temporarily remove one test run line to unblock PPC bots. llvm-svn: 274812	2016-07-08 00:32:58 +00:00
Jacques Pienaar	6d3eecc843	[lanai] Use peephole optimizer to generate more conditional ALU operations. Summary: * Similar to the ARM backend, use the peephole optimizer to generate more conditional ALU operations; * Add predicated type with default always true to RR instructions in LanaiInstrInfo.td; * Move LanaiSetflagAluCombiner into optimizeCompare; * The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test; * Remove unused MachineOperand flags; Reviewers: eliben Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D22072 llvm-svn: 274807	2016-07-07 23:36:04 +00:00
Michael Kuperstein	3e3652aef2	Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802	2016-07-07 22:50:23 +00:00
Vedant Kumar	0fdffd3709	[tsan] Try harder to not instrument gcov counters GCOVProfiler::emitProfileArcs() can create many variables with names starting with "__llvm_gcov_ctr", so llvm appends a numeric suffix to most of them. Teach tsan about this. llvm-svn: 274801	2016-07-07 22:45:28 +00:00
Kevin Enderby	1851a827a0	Add checks to the MachOObjectFile() constructor to make sure load commands sizes are the correct multiple. llvm-svn: 274798	2016-07-07 22:11:42 +00:00
Davide Italiano	16284df8ec	[PM] Port InstSimplify to the new pass manager. llvm-svn: 274796	2016-07-07 21:14:36 +00:00
Anna Thomas	6a78c78a03	[DSE] Remove dead stores in end blocks containing fence We can remove dead stores in the presence of fence instructions. Fence does not change an otherwise thread local store to visible. reviewers: reames, dexonsmith, jfb Differential Revision: http://reviews.llvm.org/D22001 llvm-svn: 274795	2016-07-07 20:51:42 +00:00
Chad Rosier	112d0e996b	[AArch64] Change the preferred alignment for char and short to word alignment. The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790	2016-07-07 20:02:18 +00:00
Andrew Kaylor	65fa0704aa	Include SelectionDAGISel in the opt-bisect process Differential Revision: http://reviews.llvm.org/D21143 llvm-svn: 274786	2016-07-07 18:55:02 +00:00
Peter Collingbourne	73589f321b	ThinLTO: Do not take into account whether a definition has multiple copies when promoting. We currently do not touch a symbol's linkage in the case where a definition has a single copy. However, this code is effectively unnecessary: either the definition is not exported, in which case the internalize phase sets its linkage to internal, or it is exported, in which case we need to promote linkage to weak. Those two cases are already handled by existing code. I believe that the only real functional change here is in the case where we have a single definition which does not prevail (e.g. because the definition in a native object file prevails). In that case we now lower linkage to available_externally following the existing code path for that case. As a result we can remove the isExported function parameter from the thinLTOResolveWeakForLinkerInIndex function. Differential Revision: http://reviews.llvm.org/D21883 llvm-svn: 274784	2016-07-07 18:31:51 +00:00
Tim Northover	1d106c5fc2	tests: accept different TargetOpcode values. These tests don't actually care about the internal opcode number, but have to be updated whenever we add a new one for GlobalISel. That's bad. llvm-svn: 274774	2016-07-07 17:51:42 +00:00
Michael Kuperstein	edb38a94f8	Revert r274692 to check whether this is what breaks windows selfhost. llvm-svn: 274771	2016-07-07 16:55:35 +00:00
Justin Bogner	a466cc33fa	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync to nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Chad Rosier	3972953efd	Revert "[AArch64] Change the preferred alignment for char and short to word alignment" This reverts commit r273279 as the change was not properly approved. llvm-svn: 274768	2016-07-07 16:37:29 +00:00
Valery Pykhtin	af8b1bddbd	[AMDGPU] fix ds_write_src2 encoding (bz26027) Differential revision: http://reviews.llvm.org/D22041 llvm-svn: 274756	2016-07-07 14:23:38 +00:00
Rafael Espindola	b34cba97b7	Don't crash trying to relax 32 loads on COFF. Fixes pr28452. llvm-svn: 274754	2016-07-07 14:00:07 +00:00
Sjoerd Meijer	17c08dc701	Code size optimisation: don't rewrite fputs to fwrite when optimising for size because fwrite requires more arguments and thus extra MOVs are required. llvm-svn: 274753	2016-07-07 13:56:23 +00:00
David Majnemer	7afb46d3c8	[LoopAccessAnalysis] Fix an integer overflow We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737	2016-07-07 06:24:36 +00:00
Craig Topper	d5d2a35013	[AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness. llvm-svn: 274736	2016-07-07 06:11:07 +00:00
Elena Demikhovsky	fc1e969dfc	Fixed a bug in vectorizing GEP before gather/scatter intrinsic. Vectorizing GEP was incorrect and broke SSA in some cases. The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997. Differential revision: http://reviews.llvm.org/D22035 llvm-svn: 274735	2016-07-07 06:06:46 +00:00
David Majnemer	a54fe1acdc	[CodeView] Implement support for thread-local variables llvm-svn: 274734	2016-07-07 05:14:21 +00:00
Qin Zhao	c35b2cba6f	[esan:cfrag] Add option -esan-aux-field-info Summary: Adds option -esan-aux-field-info to control generating binary with auxiliary struct field information. Extracts code for creating auxiliary information from createCacheFragInfoGV into createCacheFragAuxGV. Adds test struct_field_small.ll for -esan-aux-field-info test. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D22019 llvm-svn: 274726	2016-07-07 03:20:16 +00:00
Peter Collingbourne	730c82e6b8	ThinLTO: Remove check for multiple modules before applying weak resolutions. This check is not only unnecessary, it can produce the wrong result. If we are linking a single module and it has an exported linkonce symbol, we need to promote to weak in order to avoid PR19901-style problems. Differential Revision: http://reviews.llvm.org/D21917 llvm-svn: 274722	2016-07-07 01:51:11 +00:00
Sean Silva	284b0324e2	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sean Silva	59fe82f4ce	[PM] Port TailCallElim llvm-svn: 274708	2016-07-06 23:48:41 +00:00
Sean Silva	b025d375a1	[PM] Port CorrelatedValuePropagation llvm-svn: 274705	2016-07-06 23:26:29 +00:00
Peter Collingbourne	d1d2614ee1	ThinLTO: Add test cases for promote+internalize. This tests the effect of both promotion and internalization on a module, and helps show that D21883 is NFC wrt promotion+internalization. Differential Revision: http://reviews.llvm.org/D21915 llvm-svn: 274699	2016-07-06 22:53:02 +00:00
Sanjay Patel	65a51c25c1	[InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we allow these transforms for splat vectors. Differential Revision: http://reviews.llvm.org/D21899 llvm-svn: 274696	2016-07-06 22:23:01 +00:00
Manman Ren	524ca27b90	Add testing coverage for r274582. llvm-svn: 274693	2016-07-06 22:01:28 +00:00
Michael Kuperstein	1ef6c59b1d	[X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. Differential Revision: http://reviews.llvm.org/D21774 llvm-svn: 274692	2016-07-06 21:56:18 +00:00
Vedant Kumar	4c01092a25	[llvm-cov] Add support for creating html reports Based on a patch by Harlan Haskins! Differential Revision: http://reviews.llvm.org/D18278 llvm-svn: 274688	2016-07-06 21:44:05 +00:00
Matthias Braun	ad0032a649	AArch64: Change modeling of zero cycle zeroing. On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should be used to zero a vector register. This was previously done at instruction selection time, however the register coalescer sometimes widened multiple vregs to the Q width because of that leading to extra spills. This patch leaves the decision on how to zero a register to the AsmPrinter phase where it doesn't affect register allocation anymore. This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0. This fixes http://llvm.org/PR27454, rdar://25866262 Differential Revision: http://reviews.llvm.org/D21826 llvm-svn: 274686	2016-07-06 21:39:33 +00:00
Chad Rosier	232e29ebea	[MemorySSA] Reinstate the legacy printer and verifier. Differential Revision: http://reviews.llvm.org/D22058 llvm-svn: 274679	2016-07-06 21:20:47 +00:00
Rafael Espindola	a29971faeb	Add initial support for R_386_GOT32X. This adds it only for movl mov@GOT(%reg), %reg. llvm-svn: 274678	2016-07-06 21:19:11 +00:00
David Majnemer	7abd269aa9	[CodeView] Emit an appropriate symbol kind for globals We emitted debug info for globals/functions as if they all had external linkage. Instead, emit local symbol records when appropriate. llvm-svn: 274676	2016-07-06 21:07:47 +00:00
David Majnemer	e1e7372e93	[CodeView] Unions are always sealed It is impossible to inherit from a union. We are missing a way to represent this in IR for classes/structs... llvm-svn: 274675	2016-07-06 21:07:42 +00:00
Justin Lebar	6f9d01bbd5	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D22068 llvm-svn: 274674	2016-07-06 21:06:10 +00:00
Haicheng Wu	a95cd1267f	[LIR] Fix mis-compilation with unwinding. To fix PR27859, bail out if there is an instruction that may throw. Differential Revision: http://reviews.llvm.org/D20638 llvm-svn: 274673	2016-07-06 21:05:40 +00:00
Piotr Padlewski	6deaa6afae	Add 'thinlto_src_module' metadata to imported function Added metadata to be able to make statistics on how many functions that have been imported have been removed. Also module name might be helpful when debugging. Reviewers: tejohnson, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D21943 llvm-svn: 274668	2016-07-06 20:26:25 +00:00
Derek Bruening	d712a3c10e	[esan|wset] Fix incorrect memory size assert Summary: Fixes an incorrect assert that fails on 128-bit-sized loads or stores. Augments the wset tests to include this case. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D22062 llvm-svn: 274666	2016-07-06 20:13:53 +00:00
Justin Bogner	a463537a36	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 Everywhere where cuda.syncthreads or __syncthreads is used, use the properly namespaced nvvm.barrier0 instead. llvm-svn: 274664	2016-07-06 20:02:45 +00:00
Adrian McCarthy	820ca5404c	Retry: "Emit CodeView type records for nested classes." Now with a corrected test to account for a recently supported properties bit in the debug info of a struct. Original review: http://reviews.llvm.org/D21939 This reverts commit 970c3fd497a28d25dd69526eb52594a696c37968. llvm-svn: 274661	2016-07-06 19:49:51 +00:00
Chad Rosier	dcfce2d0ec	[DSE] Avoid iterator invalidation bugs. The dse_with_dbg_value.ll test committed with r273141 is removed because we no longer perform any type of backtracking, which is what was causing the codegen differences with and without debug information. Differential Revision: http://reviews.llvm.org/D21613 llvm-svn: 274660	2016-07-06 19:48:52 +00:00
Sanjay Patel	04b3496d9b	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. llvm-svn: 274658	2016-07-06 19:15:54 +00:00
Sean Silva	f50d4b6cdc	Work around PR28400 a bit harder. We were still crashing in the "no change" case because LVI was not getting invalidated. See the thread "Should analyses be able to hold AssertingVH to IR? (related to PR28400)" for more discussion. llvm-svn: 274656	2016-07-06 19:05:41 +00:00
Elliot Colp	bc2cfc2291	[SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 274650	2016-07-06 18:13:11 +00:00
Zachary Turner	8848a7a6b2	[pdb] Round trip the PDB stream between YAML and binary PDB. This gets writing of the PDB stream working. llvm-svn: 274647	2016-07-06 18:05:57 +00:00
Kit Barton	f9d0a40573	Ensure all uses of permute instructions feed vector stores There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions. In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed. The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions. This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735). Test case based on the original problem reported. Phabricator Review: http://reviews.llvm.org/D21802 llvm-svn: 274645	2016-07-06 18:03:52 +00:00
Tim Shen	1c3c0afc53	[DAGCombiner] Fix visitSTORE to continue processing current SDNode, if findBetterNeighborChains doesn't actually CombineTo it. Summary: findBetterNeighborChains may or may not find a better chain for each node it finds, which include the node ("St") that visitSTORE is currently processing. If no better chain is found for St, visitSTORE should continue instead of return SDValue(St, 0), as if it's CombinedTo'ed. This fixes bug 28130. There might be other ways to make the test pass (see D21409). I think both of the patches are fixing actual bugs revealed by the same testcase. Reviewers: echristo, wschmidt, hfinkel, kbarton, amehsan, arsenm, nemanjai, bogner Subscribers: mehdi_amini, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D21692 llvm-svn: 274644	2016-07-06 17:44:03 +00:00
Michael Kuperstein	aa71bdd3af	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642	2016-07-06 17:30:56 +00:00
Adrian McCarthy	7649d8388a	Revert "Emit CodeView type records for nested classes." This reverts commit 256b29322c827a2d94da56468c936596f5509032. llvm-svn: 274632	2016-07-06 15:14:10 +00:00
Simon Pilgrim	118da63a9d	[X86][SSE] Added test cases for missed opportunities to combine pshufb to pslldq/psrldq llvm-svn: 274631	2016-07-06 15:09:48 +00:00
Adrian McCarthy	024a7b6358	Emit CodeView type records for nested classes. Differential Revision: http://reviews.llvm.org/D21939 llvm-svn: 274629	2016-07-06 14:47:32 +00:00
Matthew Simpson	433cb1dfe3	[LV] Don't widen trivial induction variables We currently always vectorize induction variables. However, if an induction variable is only used for counting loop iterations or computing addresses with getelementptr instructions, we don't need to do this. Vectorizing these trivial induction variables can create vector code that is difficult to simplify later on. This is especially true when the unroll factor is greater than one, and we create vector arithmetic when computing step vectors. With this patch, we check if an induction variable is only used for counting iterations or computing addresses, and if so, scalarize the arithmetic when computing step vectors instead. This allows for greater simplification. This patch addresses the suboptimal pointer arithmetic sequence seen in PR27881. Reference: https://llvm.org/bugs/show_bug.cgi?id=27881 Differential Revision: http://reviews.llvm.org/D21620 llvm-svn: 274627	2016-07-06 14:26:59 +00:00
Elena Demikhovsky	ad0a56f3da	Re-commit of 274613. The prev commit failed on compilation. A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure. llvm-svn: 274626	2016-07-06 14:15:43 +00:00
Sam Kolton	3c21a69077	[AMDGPU] Assembler: regression tests for bug 28413. NFC llvm-svn: 274623	2016-07-06 12:52:20 +00:00
Diana Picus	b772e409ba	[ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags. This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes two command-line flags that weren't used in any of the tests: widen-vmovs and swift-partial-update-clearance. The former may be easily replaced with the mattr mechanism, but the latter may not (as it is a subtarget property, and not a proper feature). Differential Revision: http://reviews.llvm.org/D21797 llvm-svn: 274620	2016-07-06 11:22:11 +00:00
Elena Demikhovsky	02ced295aa	Reverted 274613 due to compilation failure. llvm-svn: 274615	2016-07-06 09:11:49 +00:00
Elena Demikhovsky	5a4f2476fd	AVX-512: Optimization for patterns with i1 scalar type The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc". I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction. I also changed zext, aext and trunc patterns in the .td file. This allows removing extra "kmov" instructions. This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173. Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails). Differential revision: http://reviews.llvm.org/D21956 llvm-svn: 274613	2016-07-06 09:01:20 +00:00
Nicolai Haehnle	e40530ea7b	AMDGPU: Fix return of non-void-returning shaders Summary: Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that ensures that a non-void-returning shader falls off the end of the last basic block was effectively disabled, since SI_RETURN is now used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21975 llvm-svn: 274612	2016-07-06 08:35:17 +00:00
Elena Demikhovsky	971fbfda1e	Vector GEP test: renamed + some comments Differential revision: http://reviews.llvm.org/D21957 llvm-svn: 274611	2016-07-06 08:11:23 +00:00
Daniel Berlin	fc7e651bfd	Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction llvm-svn: 274606	2016-07-06 05:32:05 +00:00
George Burgess IV	bfa401e5ad	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 llvm-svn: 274589	2016-07-06 00:26:41 +00:00

37938 Commits