llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	4fe77b9329	[SLP] Initial test for fix of PR31690. llvm-svn: 292631	2017-01-20 18:40:21 +00:00
Simon Pilgrim	a50a93fcd0	[InstCombine][X86] Add MULDQ/MULUDQ undef handling llvm-svn: 292627	2017-01-20 18:20:30 +00:00
Alexey Bataev	f5677329a6	[SLP] A new test for horizontal vectorization for non-power-of-2 instructions. llvm-svn: 292626	2017-01-20 18:04:29 +00:00
Matthias Braun	2e8c11e4b3	AArch64LoadStoreOptimizer: Update kill flags when merging stores Kill flags need to be updated correctly when moving stores up/down to form store pair instructions. Those invalid flags have been ignored before but as of r290014 they are recognized when using -mllvm -verify-machineinstrs. Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it to ldst-opt.mir test and adds a new tests for this change. Differential Revision: https://reviews.llvm.org/D28875 llvm-svn: 292625	2017-01-20 18:04:27 +00:00
Petar Jovanovic	dbb39356b4	[mips] Fix debug information for __thread variable This patch fixes debug information for __thread variable on Mips using .dtprelword and .dtpreldword directives. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D28770 llvm-svn: 292624	2017-01-20 17:53:30 +00:00
Wei Mi	d5f7eebd42	[RegisterCoalescing] Recommit the patch "Remove partial redundent copy". The recommit fixes a bug related with live interval update after the partial redundent copy is moved. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292621	2017-01-20 17:38:54 +00:00
Simon Pilgrim	06f125230f	[InstCombine][SSE] Tests showing missed opportunities to handle muldq/muludq with undef arguments Fixed a typo in existing test names at the same time llvm-svn: 292619	2017-01-20 17:06:38 +00:00
Haicheng Wu	71ef5bc0ff	Revert "Recommit "[InlineCost] Use TTI to check if GEP is free." #2" This reverts commit r292616 because the test case still has problem. llvm-svn: 292618	2017-01-20 16:52:22 +00:00
Haicheng Wu	8f34ae2aae	Recommit "[InlineCost] Use TTI to check if GEP is free." #2 This is the second attempt to recommit r292526. The original summary: Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. llvm-svn: 292616	2017-01-20 16:36:34 +00:00
Simon Pilgrim	2817b476e8	[InstCombine][SSE] Tests showing missed opportunities to constant fold packss/packus llvm-svn: 292609	2017-01-20 13:21:30 +00:00
Sjoerd Meijer	2db2a947f6	[Thumb] Add support for tMUL in the compare instruction peephole optimizer. We also want to optimise tests like this: return a*b == 0. The MULS instruction is flag setting, so we don't need the CMP instruction but can instead branch on the result of the MULS. The generated instructions sequence for this example was: MULS, MOVS, MOVS, CMP. The MOVS instruction load the boolean values resulting from the select instruction, but these MOVS instructions are flag setting and were thus preventing this optimisation. Now we first reorder and move the MULS to before the CMP and generate sequence MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the MULS and MOVS is safe to do because the subsequent MOVS instructions just set the CPSR register and don't use it, i.e. the CPSR is dead. Differential Revision: https://reviews.llvm.org/D27990 llvm-svn: 292608	2017-01-20 13:10:12 +00:00
Simon Pilgrim	8942722cbc	[InstCombine][SSE] Tests showing missed opportunities to handle packss/packus with undef arguments llvm-svn: 292601	2017-01-20 11:28:07 +00:00
Chandler Carruth	3cdf650770	[PM] Tidy up the spacing of this new, much nicer test file. llvm-svn: 292592	2017-01-20 09:30:03 +00:00
Simon Pilgrim	51b3b98e3a	[InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions Simplify a packss/packus truncation based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28777 llvm-svn: 292591	2017-01-20 09:28:21 +00:00
Chandler Carruth	e9b18e3d34	[PM] Port LoopSink to the new pass manager. Like several other loop passes (the vectorizer, etc) this pass doesn't really fit the model of a loop pass. The critical distinction is that it isn't intended to be pipelined together with other loop passes. I plan to add some documentation to the loop pass manager to make this more clear on that side. LoopSink is also different because it doesn't really need a lot of the infrastructure of our loop passes. For example, if there aren't loop invariant instructions causing a preheader to exist, there is no need to form a preheader. It also doesn't need LCSSA because this pass is only involved in sinking invariant instructions from a preheader into the loop, not reasoning about live-outs. This allows some nice simplifications to the pass in the new PM where we can directly walk the loops once without restructuring them. Differential Revision: https://reviews.llvm.org/D28921 llvm-svn: 292589	2017-01-20 08:42:19 +00:00
Craig Topper	ae78b5dcff	[AVX-512] Fix a couple test cases to not pass an undef mask to gather intrinsic. This could break if any future optimizations take advantage of the undef. llvm-svn: 292585	2017-01-20 07:12:30 +00:00
Daniel Berlin	89fea6fd9d	NewGVN: Fix PR 31682, an overactive assert. Part of the assert has been left active for further debugging. The other part has been turned into a stat for tracking for the moment. llvm-svn: 292583	2017-01-20 06:38:41 +00:00
Mohammad Shahid	5dc021bf45	[SLP] Add a base test for jumbled store Change-Id: I905ce08a02c76a6896dcfd9629547417c99adc4a llvm-svn: 292581	2017-01-20 06:05:33 +00:00
Saleem Abdulrasool	a63fd043fd	llvm-cxxfilt: support `-t` to demangle types By default c++filt demangles functions, though you can optionally pass `-t` to have it decode types as well, behaving nearly identical to `__cxa_demangle`. Add support for this mode. llvm-svn: 292576	2017-01-20 04:25:26 +00:00
Haicheng Wu	8f2aca388b	Revert "Recommit "[InlineCost] Use TTI to check if GEP is free."" This reverts commit r292570. The test still has problem. llvm-svn: 292572	2017-01-20 03:40:41 +00:00
Haicheng Wu	1af1f071ea	Recommit "[InlineCost] Use TTI to check if GEP is free." This recommits r292526 which is reverted in r292529 after fixing the test case. The original summary: Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. llvm-svn: 292570	2017-01-20 03:09:11 +00:00
Greg Parker	2873766788	[test] Remove an unwanted match for `XFAIL:`. llvm-svn: 292567	2017-01-20 02:01:04 +00:00
Ahmed Bougacha	d294823930	[AArch64][GlobalISel] Widen scalar int->fp conversions. It's incorrect to ignore the higher bits of the integer source. Teach the legalizer how to widen it. llvm-svn: 292563	2017-01-20 01:37:24 +00:00
Michael Kuperstein	568027aabb	[PM] Attempt to pacify windows bots. Another difference in type pretty-printing, this one windows-specific. llvm-svn: 292556	2017-01-20 00:47:32 +00:00
Stanislav Mekhanoshin	6ec3e3a728	[AMDGPU] Prevent spills before exec mask is restored Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and label, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask. Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc. Differential Revision: https://reviews.llvm.org/D27997 llvm-svn: 292554	2017-01-20 00:44:31 +00:00
Ahmed Bougacha	19252ac6f0	[AArch64][GlobalISel] Split FP conversion legalizer tests. NFC. Big functions with large vreg # are quite unwieldy to update. Change it to have one function per test (it does increase boilerplate, but makes the core hopefully more readable and maintainable). llvm-svn: 292552	2017-01-20 00:30:09 +00:00
Ahmed Bougacha	b0de0d1bfc	[AArch64][GlobalISel] Split legalizer combine tests. NFC. Big functions with large vreg # are quite unwieldy to update. This test also relied on legal s8 operations which we're considering removing. Change it to have one function per test (it does increase boilerplate, but makes the core hopefully more readable and maintainable), and use 100% legal operations throughout. llvm-svn: 292551	2017-01-20 00:30:06 +00:00
Ahmed Bougacha	bf480554df	[MIRParser] Allow generic register specification on operand. This completes r292321 by adding support for generic registers, e.g.: %2:_(s32) = G_ADD %0, %1 llvm-svn: 292550	2017-01-20 00:29:59 +00:00
Anna Thomas	698f0deea9	[AliasAnalysis] Fences do not modify constant memory location Summary: Fence instructions are currently marked as `ModRef` for all memory locations. We can improve this for constant memory locations (such as constant globals), since fence instructions cannot modify these locations. This helps us to forward constant loads across fences (added test case in GVN). There were no changes in behaviour for similar test cases in early-cse and licm. Reviewers: dberlin, sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28914 llvm-svn: 292546	2017-01-20 00:21:33 +00:00
Tim Northover	3babfef11d	AArch64: fall back to DAG ISel for inline assembly. We can't currently handle "calls" to inlineasm strings so it's better to let the DAG handle it than generate rubbish. llvm-svn: 292540	2017-01-19 23:59:35 +00:00
Michael Kuperstein	853e3337db	[PM] Make default pipeline test for the new PM strict Use CHECK-NEXT to verify that a test breaks whenever unexpected passes, analyses, or invalidations show up in default pipelines. The test case is constructed so that we don't expect to invalidate anything, and needs to be kept that way. The test is slightly less strict than we'd like because of differences in type pretty-printing. (Right now it does show some invalidations - all of those are intentional and temporary.) Differential Revision: https://reviews.llvm.org/D28887 llvm-svn: 292536	2017-01-19 23:39:28 +00:00
Michael Kuperstein	c9bb572b73	Revert r292530 since it breaks buildbots. llvm-svn: 292534	2017-01-19 23:22:55 +00:00
Davide Italiano	6c2c3e07bf	[SCCP] Teach the pass how to handle `div` with overdefined operands. This can prove that: extern int f; int g() { int x = 0; for (int i = 0; i < 365; ++i) { x /= f; } return x; } always returns zero. Thanks to Sanjoy for confirming this transformation actually made sense (bugs are mine). llvm-svn: 292531	2017-01-19 23:07:51 +00:00
Michael Kuperstein	5a52af0f63	[PM] Make default pipeline test for the new PM strict Use CHECK-NEXT to verify that a test breaks whenever unexpected passes, analyses, or invalidations show up in default pipelines. The test case is constructed so that we don't expect to invalidate anything, and needs to be kept that way. (Right now it does show some invalidations - all of those are intentional and temporary.) Differential Revision: https://reviews.llvm.org/D28887 llvm-svn: 292530	2017-01-19 22:55:46 +00:00
Haicheng Wu	e036df4723	Revert "[InlineCost] Use TTI to check if GEP is free." This reverts commit r292526. The test case has problem. llvm-svn: 292529	2017-01-19 22:51:03 +00:00
Simon Pilgrim	fb32eea1b4	[SelectionDAG] Improve knownbits handling of UMIN/UMAX (PR31293) This patch improves the knownbits logic for unsigned integer min/max opcodes. For UMIN we know that the result will have the maximum of the inputs' known leading zero bits in the result, similarly for UMAX the maximum of the inputs' leading one bits. This is particularly useful for simplifying clamping patterns,. e.g. as SSE doesn't have a uitofp instruction we want to use sitofp instead where possible and for that we need to confirm that the top bit is not set. Differential Revision: https://reviews.llvm.org/D28853 llvm-svn: 292528	2017-01-19 22:41:22 +00:00
Haicheng Wu	da556345dc	[InlineCost] Use TTI to check if GEP is free. Currently, a GEP is considered free only if its indices are all constant. TTI::getGEPCost() can give target-specific more accurate analysis. TTI is already used for the cost of many other instructions. Differential Revision: https://reviews.llvm.org/D28693 llvm-svn: 292526	2017-01-19 22:28:34 +00:00
Serge Rogatch	f83d2a25bf	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292516	2017-01-19 20:24:23 +00:00
Chad Rosier	9245e12f95	[Assembler] Improve error when unable to evaluate expression. Add a SMLoc to MCExpr. Most code does not generate or consume the SMLoc (yet). Patch by Sanne Wouda <sanne.wouda@arm.com>! Differential Revision: https://reviews.llvm.org/D28861 llvm-svn: 292515	2017-01-19 20:06:32 +00:00
Evgeniy Stepanov	f2d9a46b5f	Fix aliases to thumbfunc-based exprs to be thumbfunc. If F is a Thumb function symbol, and G = F + const, and G is a function symbol, then G is Thumb. Because what else could it be? Differential Revision: https://reviews.llvm.org/D28878 llvm-svn: 292514	2017-01-19 20:04:11 +00:00
Xin Tong	5ee40ba400	Improve what can be promoted in LICM. Summary: In case of non-alloca pointers, we check for whether it is a pointer from malloc-like calls and it is not captured. In such case, we can promote the pointer, as the caller will have no way to access this pointer even if there is unwinding in middle of the loop. Reviewers: hfinkel, sanjoy, reames, eli.friedman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28834 llvm-svn: 292510	2017-01-19 19:31:40 +00:00
Davide Italiano	2ef8c4e708	[InstCombine] Simplify gep (gep p, a), (b-a) Patch by Andrea Canciani. Differential Revision: https://reviews.llvm.org/D27413 llvm-svn: 292506	2017-01-19 18:51:56 +00:00
Kevin Enderby	650fca5f28	Remove this test from the r292500 commit till Chris and I figure out why it is failing on a couple of build bots. llvm-svn: 292501	2017-01-19 18:07:22 +00:00
Kevin Enderby	a4579c4184	Add support for the new LC_NOTE load command. It describes a region of arbitrary data included in a Mach-O file. Its initial use is to record extra data in MH_CORE files. rdar://30001545 rdar://30001731 llvm-svn: 292500	2017-01-19 17:36:31 +00:00
Simon Pilgrim	5f2f53b106	[X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further llvm-svn: 292493	2017-01-19 16:25:02 +00:00
Sanjay Patel	291c3d8ff2	[InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1 Try harder to fold icmp with shl nsw as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html This is similar to the 'shl nuw' transforms that were added with D25913. This may eventually help solve: https://llvm.org/bugs/show_bug.cgi?id=30773 Differential Revision: https://reviews.llvm.org/D28406 llvm-svn: 292492	2017-01-19 16:12:10 +00:00
Simon Pilgrim	3b23eac71f	[X86][SSE] Added tests for pre-truncating arithmetic operations that have already been extended As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further llvm-svn: 292487	2017-01-19 15:03:00 +00:00
Mikael Holmen	2074e7497b	[DAG] Don't increase SDNodeOrder for dbg.value/declare. Summary: The SDNodeOrder is saved in the IROrder field in the SDNode, and this field may affects scheduling. Thus, letting dbg.value/declare increase the order numbers may in turn affect scheduling. Because of this change we also need to update the code deciding when dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues. Dbg values now have the same order as the SDNode they are connected to, not the following orders. Test cases provided by Florian Hahn. Reviewers: bogner, aprantl, sunfish, atrick Reviewed By: atrick Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D25318 llvm-svn: 292485	2017-01-19 13:55:55 +00:00
Kristof Beyls	e9412b4d47	[GlobalISel] Pointers are legal operands for G_SELECT on AArch64 Differential Revision: https://reviews.llvm.org/D28805 llvm-svn: 292481	2017-01-19 13:32:14 +00:00
Elena Demikhovsky	e01512cecf	Recommitting unsigned saturation with a bugfix. A test case that crashed is added to avx512-trunc.ll. (PR31589) llvm-svn: 292479	2017-01-19 12:08:21 +00:00
Justin Bogner	ddb80aee7e	GlobalISel: Implement widening for shifts llvm-svn: 292476	2017-01-19 07:51:17 +00:00
Craig Topper	b8e92f775d	[AVX-512] Add test cases that show where we are using two subvector inserts to broadcast a 128-bit subvector into a 512-bit vector. We'd be better off using something like SHUFF32X4. If the subvector comes from a load, we convert to SUBV_BROADCAST and use a broadcast instruction. But if there is no load we keep the inserts. I think we should create the SUBV_BROADCAST even without the load and let isel use the fallback patterns that are used if the load can't be folded. This will use the SHUFF32X4 or similar instruction for the 128-bit into 512-bit case and a single insert for 128 into 256 or 256 into 512. This should be fixed so subvector broadcast intrinsics can be replaced with native IR since some of those currently lower directly to SHUFF32X4. llvm-svn: 292475	2017-01-19 07:37:45 +00:00
Craig Topper	200ea31684	[AVX-512] Support ADD/SUB/MUL of mask vectors Summary: Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND. We already do this for scalar i1 operations so I just extended it to vectors of i1. Reviewers: zvi, delena Reviewed By: delena Subscribers: guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D28888 llvm-svn: 292474	2017-01-19 07:12:35 +00:00
Matt Arsenault	3e6f9b5773	AMDGPU: Disable some fneg combines unless nsz For -(x + y) -> (-x) + (-y), if x == -y, this would change the result from -0.0 to 0.0. Since the fma/fmad combine is an extension of this problem it also applies there. fmul should be fine, and I don't think any of the unary operators or conversions should be a problem either. llvm-svn: 292473	2017-01-19 06:35:27 +00:00
Matt Arsenault	3b99f12a4e	AMDGPU: Remove modifiers from v_div_scale_* They seem to produce nonsense results when used. This should be applied to the release branch. llvm-svn: 292472	2017-01-19 06:04:12 +00:00
Mike Aizatsky	7da919b8b0	[sancov] applying blacklist to covered points too Differential Revision: https://reviews.llvm.org/D28872 llvm-svn: 292468	2017-01-19 03:49:18 +00:00
Saleem Abdulrasool	c8bcda2b56	llvm-cxxfilt: filter out invalid manglings c++filt does not attempt to demangle symbols which do not match its expected format. This means that the symbol must start with _Z or ___Z (block invocation function extension). Any other symbols are returned as is. Note that this is different from the behaviour of __cxa_demangle which will demangle fragments. llvm-svn: 292467	2017-01-19 02:58:46 +00:00
Craig Topper	b561e66384	[AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load. llvm-svn: 292466	2017-01-19 02:34:29 +00:00
Craig Topper	044662d14b	[AVX-512] Add additional test cases for broadcast intrinsics that demonstrate that we don't fold the loads to use a broadcast instruction. llvm-svn: 292465	2017-01-19 02:34:25 +00:00
Michael Kuperstein	8ecc38ef85	[PM] Add LoopVectorize to the default module pipeline LV no longer "requires" LCSSA and LoopSimplify, and instead forms them internally as required. So, there's nothing preventing it from being enabled. llvm-svn: 292464	2017-01-19 02:21:54 +00:00
Peter Collingbourne	22d9d3cdce	LowerTypeTests: Implement exporting of type identifiers. Type identifiers are exported by: - Adding coarse-grained information about how to test the type identifier to the summary. - Creating symbols in the object file (aliases and absolute symbols) containing fine-grained information about the type identifier. Differential Revision: https://reviews.llvm.org/D28424 llvm-svn: 292462	2017-01-19 01:20:11 +00:00
Justin Bogner	d09c3ce6c0	GlobalISel: Implement narrowing for G_LOAD llvm-svn: 292461	2017-01-19 01:05:48 +00:00
Matthias Braun	58f99615d6	Use an actual valid register in test llvm-svn: 292459	2017-01-19 01:04:08 +00:00
Dehao Chen	1ce8d6ca59	Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection Summary: SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info: * start line of all subprograms * linkage name of all subprograms * standalone subprograms (functions that has neither inlined nor been inlined) This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch): -gmlt(orig) -gmlt(patched) -g 433.milc 4.68% 5.40% 19.73% 444.namd 8.45% 8.93% 45.99% 447.dealII 97.43% 115.21% 374.89% 450.soplex 27.75% 31.88% 126.04% 453.povray 21.81% 26.16% 92.03% 470.lbm 0.60% 0.67% 1.96% 482.sphinx3 5.77% 6.47% 26.17% 400.perlbench 17.81% 19.43% 73.08% 401.bzip2 3.73% 3.92% 12.18% 403.gcc 31.75% 34.48% 122.75% 429.mcf 0.78% 0.88% 3.89% 445.gobmk 6.08% 7.92% 42.27% 456.hmmer 10.36% 11.25% 35.23% 458.sjeng 5.08% 5.42% 14.36% 462.libquantum 1.71% 1.96% 6.36% 464.h264ref 15.61% 16.56% 43.92% 471.omnetpp 11.93% 15.84% 60.09% 473.astar 3.11% 3.69% 14.18% 483.xalancbmk 56.29% 81.63% 353.22% geomean 15.60% 18.30% 57.81% Debug info size change for -gmlt binary with this patch: 433.milc 13.46% 444.namd 5.35% 447.dealII 18.21% 450.soplex 14.68% 453.povray 19.65% 470.lbm 6.03% 482.sphinx3 11.21% 400.perlbench 8.91% 401.bzip2 4.41% 403.gcc 8.56% 429.mcf 8.24% 445.gobmk 29.47% 456.hmmer 8.19% 458.sjeng 6.05% 462.libquantum 11.23% 464.h264ref 5.93% 471.omnetpp 31.89% 473.astar 16.20% 483.xalancbmk 44.62% geomean 16.83% Reviewers: davidxl, echristo, dblaikie Reviewed By: echristo, dblaikie Subscribers: aprantl, probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25434 llvm-svn: 292457	2017-01-19 00:44:11 +00:00
Michael Kuperstein	230867e583	[LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them This changes the vectorizer to explicitly use the loopsimplify and lcssa utils, instead of "requiring" the transformations as if they were analyses. This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA for all loops, but rather only for the loops we expect to modify. Differential Revision: https://reviews.llvm.org/D28868 llvm-svn: 292456	2017-01-19 00:42:28 +00:00
Artem Belevich	3d3f6190ab	[NVPTX] Fix lowering of fp16 ISD::FNEG. There's no neg.f16 instruction, so negation has to be done via subtraction from zero. Differential Revision: https://reviews.llvm.org/D28876 llvm-svn: 292452	2017-01-19 00:14:45 +00:00
Eli Friedman	f1f49c8265	[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly. To avoid regressions, make ScalarEvolution::createSCEV a bit more clever. Also get rid of some useless code in ScalarEvolution::howFarToZero which was hiding this bug. No new testcase because it's impossible to actually expose this bug: we don't have any in-tree users of getUDivExactExpr besides the two functions I just mentioned, and they both dodged the problem. I'll try to add some interesting users in a followup. Differential Revision: https://reviews.llvm.org/D28587 llvm-svn: 292449	2017-01-18 23:56:42 +00:00
Krzysztof Parzyszek	de44c9d857	Treat segment [B, E) as not overlapping block with boundaries [A, B) llvm-svn: 292446	2017-01-18 23:12:19 +00:00
Krzysztof Parzyszek	954dd8d9ba	[Hexagon] Remove dead defs from the live set when expanding wstores llvm-svn: 292445	2017-01-18 23:11:40 +00:00
Michael Kuperstein	d3d2925933	Revert r291670 because it introduces a crash. r291670 doesn't crash on the original testcase from PR31589, but it crashes on a slightly more complex one. PR31589 has the new reproducer. llvm-svn: 292444	2017-01-18 23:05:58 +00:00
Sanjay Patel	cfb8a45942	[InstCombine] add tests for shl nsw with icmp eq/ne; NFCI These should be fixed with D28406. llvm-svn: 292441	2017-01-18 21:31:21 +00:00
Peter Collingbourne	20a00933fb	ThinLTOBitcodeWriter: Clear comdats on filtered globals. Differential Revision: https://reviews.llvm.org/D28839 llvm-svn: 292431	2017-01-18 20:03:02 +00:00
Michael Kuperstein	7cefb409b0	[LV] Allow reductions that have several uses outside the loop We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter. Differential Revision: https://reviews.llvm.org/D28830 llvm-svn: 292424	2017-01-18 19:02:52 +00:00
Evandro Menezes	7960b2e19a	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422	2017-01-18 18:57:08 +00:00
Mehdi Amini	67d2cc1fad	[ThinLTO] Add a recursive step in Metadata lazy-loading Summary: Without this, we're stressing the RAUW of unique nodes, which is a costly operation. This is intended to limit the number of RAUW, and is very effective on the total link-time of opt with ThinLTO, before: real 4m4.587s user 15m3.401s sys 0m23.616s after: real 3m25.261s user 12m22.132s sys 0m24.152s Reviewers: tejohnson, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28751 llvm-svn: 292420	2017-01-18 18:36:21 +00:00
Alexey Bataev	f86cca1a42	[SLP] Add a test for a fix for PR30787. Add a test for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops. llvm-svn: 292416	2017-01-18 18:07:46 +00:00
Stanislav Mekhanoshin	a4e63ead4b	[AMDGPU] Do not allow register coalescer to create big superregs Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constraint regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 llvm-svn: 292413	2017-01-18 17:30:05 +00:00
Justin Bogner	fde0104649	GlobalISel: Implement narrowing for G_STORE Legalize stores of types that are too wide by breaking them up into sequences of smaller stores. llvm-svn: 292412	2017-01-18 17:29:54 +00:00
Teresa Johnson	2d384ac381	Don't create a comdat group for a dropped def with initializer Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to available_externally when possible. If they had an initializer in the global_ctors list, a comdat group was being created. This code already had logic to skip available_externally defs, but now the EliminateAvailableExternally pass will drop these symbols to declarations earlier. Change the check to skip all declarations for linker (which includes available_externally along with declarations). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28737 llvm-svn: 292408	2017-01-18 16:58:43 +00:00
Sam Parker	b0de00d545	[ARM] Create SubtargetFeatures from build attrs An ELFObjectFile can now create SubtargetFeatures from the available ARM build attributes, in a similar manner to MIPS. I've moved the MIPS code into its own function and the ARM handler also has a separate function. Differential Revision: https://reviews.llvm.org/D28291 llvm-svn: 292403	2017-01-18 15:52:11 +00:00
Chad Rosier	771db6f895	[Assembler] Fix crash when assembling .quad for AArch32. A 64-bit relocation does not exist in 32-bit ARMELF. Report an error instead of crashing. PR23870 Patch by Sanne Wouda (sanwou01). Differential Revision: https://reviews.llvm.org/D28851 llvm-svn: 292373	2017-01-18 15:02:54 +00:00
Simon Pilgrim	fe2c0ed4cf	[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292371	2017-01-18 14:47:49 +00:00
Simon Pilgrim	970d67c653	[InstCombine][AVX2] Tests showing missed opportunities to pass demanded elts through a vpermd/vpermps shuffle llvm-svn: 292368	2017-01-18 14:23:06 +00:00
Sam Parker	df7c6ef96f	[ARM] Create objdump subtarget from build attrs Enable an ELFObjectFile to read its ARM build attributes to produce a target triple with a specific ARM architecture. llvm-objdump now uses this functionality to automatically produce a more accurate target. Differential Revision: https://reviews.llvm.org/D28769 llvm-svn: 292366	2017-01-18 13:52:12 +00:00
Simon Pilgrim	4b51989635	Fixed parser error on windows shell evaluation of RUN script line llvm-svn: 292363	2017-01-18 11:40:28 +00:00
Simon Pilgrim	d0ccf5e2e3	[X86][SSE] Simplify umax knownbits test combineSRA doesn't detect sign bits splats that it does itself so just use -1 as the demanded input so that its already splatted llvm-svn: 292361	2017-01-18 11:20:31 +00:00
Michael Zuckerman	0c0240ce84	[X86] Improve mul combine for negative multiplier (2^c - 1) This patch improves the mul instruction combine function (combineMul) by adding new layer of logic. In this patch, we are adding the ability to fold (mul x, -((1 << c) -1)) or (mul x, -((1 << c) +1)) into (neg(X << c) -x) or (neg((x << c) + x) respectively. Differential Revision: https://reviews.llvm.org/D28232 llvm-svn: 292358	2017-01-18 09:31:13 +00:00
Renato Golin	03c5e69d07	Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier" This reverts commit r292210, as it broke the Thumb buildbot with: clang-5.0: error: the clang compiler does not support '-fxray-instrument on thumbv7-unknown-linux-gnueabihf'. llvm-svn: 292357	2017-01-18 09:08:43 +00:00
Jonas Paulsson	a9bb00d82b	[SystemZ] Proper handling of undef flag while expanding pseudo. During post-RA pseudo expansion, an 'undef' flag of the source operand should be propagated by emitGRX32Move(). Review: Ulrich Weigand llvm-svn: 292353	2017-01-18 08:32:54 +00:00
Marina Yatsina	197db00e3e	[X86] Fix for bugzilla 31576 - add support for "data32" instruction prefix This patch fixes bugzilla 31576 (https://llvm.org/bugs/show_bug.cgi?id=31576). "data32" instruction prefix was not defined in the llvm. An exception had to be added to the X86 tablegen and AsmPrinter because both "data16" and "data32" are encoded to 0x66 (but in different modes). Differential Revision: https://reviews.llvm.org/D28468 llvm-svn: 292352	2017-01-18 08:07:51 +00:00
Chandler Carruth	d50c5fb13f	[PM] Teach LoopDeletion to correctly update the LPM when loops are deleted. I've expanded its test coverage a bit including adding one test that will crash clearly without this change. llvm-svn: 292332	2017-01-18 02:41:26 +00:00
Chandler Carruth	88c36d7852	[LoopDeletion] (cleanup, NFC) Make this test actually test what it claims to test. LoopSimplify was unifying the multiple exits in this test case, making it never even test the multiple exit handling of LoopDeletion. Doh. Now it works (thanks to a great idea from mkuper) and will fail if we ever change something to make it stop working. llvm-svn: 292331	2017-01-18 02:29:35 +00:00
Matt Arsenault	f411071d63	DAG: Consider nnan in isKnownNeverNaN llvm-svn: 292328	2017-01-18 02:10:08 +00:00
Wei Mi	ce9d04ce58	Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer build bot. llvm-svn: 292327	2017-01-18 01:53:53 +00:00
Dan Gohman	73e3aaa61e	[WebAssembly] Update grow_memory's return type. The grow_memory instruction now returns the previous memory size. Add the return type to the LLVM intrinsic. llvm-svn: 292322	2017-01-18 01:02:45 +00:00
Matthias Braun	de5fea2c30	MIRParser: Allow regclass specification on operand You can now define the register class of a virtual register on the operand itself avoiding the need to use a "registers:" block. Example: "%0:gr64 = COPY %rax" Differential Revision: https://reviews.llvm.org/D22398 llvm-svn: 292321	2017-01-18 00:59:19 +00:00
Justin Lebar	1cf6bf4989	[NVPTX] Support global variables of integer type larger than i64. Reviewers: tra, majnemer Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28825 llvm-svn: 292316	2017-01-18 00:29:53 +00:00
Xin Tong	58e8142f0e	2 returns next to each other =). NFC llvm-svn: 292315	2017-01-18 00:26:17 +00:00
Justin Lebar	cc938fc197	[NVPTX] Implement min/max in tablegen, rather than with custom DAGCombine logic. Summary: This change also lets us use max.{s,u}16. There's a vague warning in a test about this maybe being less efficient, but I could not come up with a case where the resulting SASS (sm_35 or sm_60) was different with or without max.{s,u}16. It's true that nvcc seems to emit only max.{s,u}32, but even ptxas 7.0 seems to have no problem generating efficient SASS from max.{s,u}16 (the casts up to i32 and back down to i16 seem to be implicit and nops, happening via register aliasing). In the absence of evidence, better to have fewer special cases, emit more straightforward code, etc. In particular, if a new GPU has 16-bit min/max instructions, we want to be able to use them. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28732 llvm-svn: 292304	2017-01-18 00:09:01 +00:00
Justin Lebar	7dc3d6c341	[NVPTX] Lower integer absolute value idiom to abs instruction. Summary: Previously we lowered it literally, to shifts and xors. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28722 llvm-svn: 292303	2017-01-18 00:08:44 +00:00
Justin Lebar	1091a9f566	[NVPTX] Improve lowering of llvm.ctpop. Summary: Avoid an unnecessary conversion operation when using the result of ctpop.i32 or ctpop.i16 as an i32, as in both cases the ptx instruction we run returns an i32. (Previously if we used the value as an i32, we'd do an unnecessary zext+trunc.) Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28721 llvm-svn: 292302	2017-01-18 00:08:27 +00:00
Justin Lebar	c7d20128bd	[NVPTX] Add lowering for llvm.bitreverse. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28720 llvm-svn: 292301	2017-01-18 00:08:10 +00:00
Justin Lebar	47087814f1	[NVPTX] Fix function names in ctlz.ll test. Test-only change. Looks like a copy/paste mistake, all the functions in ctlz.ll were named "ctpop". llvm-svn: 292300	2017-01-18 00:07:52 +00:00
Justin Lebar	d17de5380b	[NVPTX] Improve lowering of llvm.ctlz. Summary: * Disable "ctlz speculation", which inserts a branch on every ctlz(x) which has defined behavior on x == 0 to check whether x is, in fact zero. * Add DAG patterns that avoid re-truncating or re-expanding the result of the 16- and 64-bit ctz instructions. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28719 llvm-svn: 292299	2017-01-18 00:07:35 +00:00
Sanjay Patel	0e9f681dee	[InstCombine] add tests to show missed shrinkage; NFC A patch to partially solve this: https://reviews.llvm.org/D28625 llvm-svn: 292296	2017-01-18 00:03:23 +00:00
Dehao Chen	c3f87f02b1	Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unroller. Summary: Partial unrolling should have a separate threshold from full unrolling. Reviewers: efriedma, mzolotukhin Reviewed By: efriedma, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28831 llvm-svn: 292293	2017-01-17 23:39:33 +00:00
Wei Mi	8f4178a59e	[RegisterCoalescing] Remove partial redundent copy. The patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292292	2017-01-17 23:39:07 +00:00
Tim Northover	33a1a0b001	GlobalISel: fix comparison order for G_FCMP As with G_ICMP we'd written the CSET instructions backwards. llvm-svn: 292285	2017-01-17 23:04:01 +00:00
Tim Northover	509091f9e0	GlobalISel: add callseq instructions to record stack usage llvm-svn: 292284	2017-01-17 22:43:34 +00:00
Tim Northover	d943354216	GlobalISel: correctly handle varargs Some platforms (notably iOS) use a different calling convention for unnamed vs named parameters in varargs functions, so we need to keep track of this information when translating calls. Since not many platforms are involved, the guts of the special handling is in the ValueHandler class (with a generic implementation that should work for most targets). llvm-svn: 292283	2017-01-17 22:30:10 +00:00
Matthew Simpson	e2c9ad9483	[LV] Add requires asserts to test case llvm-svn: 292280	2017-01-17 22:21:33 +00:00
Tim Northover	b6636fd392	[GlobalISel] track predecessor mapping during switch lowering. Correctly populating Machine PHIs relies on knowing exactly how the IR level CFG was lowered to MachineIR. This needs to be tracked by any translation phases that meddle (currently only SwitchInst handling). This reapplies r291973 which was reverted because of testing failures. Fixes: + Don't return an ArrayRef to a local temporary. + Incorporate Kristof's suggested comment improvements. llvm-svn: 292278	2017-01-17 22:13:50 +00:00
Simon Pilgrim	421f2d9af8	[X86][SSE] Split UMIN and UMAX known bits tests llvm-svn: 292277	2017-01-17 22:12:25 +00:00
Xin Tong	0bc2977874	Add a test case for LICM when promoting locals that may be read after the throw within the loop. NFCI. Summary: Add a test case for LICM when promoting locals that may be read after the throw within the loop. Reviewers: eli.friedman, hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28822 llvm-svn: 292261	2017-01-17 21:26:36 +00:00
Matthew Simpson	3fbdaa5906	[LV] Mark non-consecutive-like pointers non-uniform If a memory instruction will be vectorized, but its pointer operand is non-consecutive-like, the instruction is a gather or scatter operation. Its pointer operand will be non-uniform. This should fix PR31671. Reference: https://llvm.org/bugs/show_bug.cgi?id=31671 Differential Revision: https://reviews.llvm.org/D28819 llvm-svn: 292254	2017-01-17 20:51:39 +00:00
Xin Tong	8343b5096d	Rename scalar_promote.ll to scalar-promote.ll and scalar_promote-unwind.ll to scalar-promote-unwind.ll. NFCI llvm-svn: 292251	2017-01-17 20:28:36 +00:00
Sanjoy Das	6de072a712	[EarlyCSE] Don't DSE across readnone functions that may throw Summary: Depends on D28740 Reviewers: dberlin, chandlerc, hfinkel, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D28741 llvm-svn: 292249	2017-01-17 20:15:47 +00:00
Sanjay Patel	f42a955c3e	[InstCombine] add tests for shl nsw + icmp sle; NFC We want to handle these cases similarly to icmp sgt, so add the tests for it. See: https://reviews.llvm.org/D28406 llvm-svn: 292248	2017-01-17 20:15:26 +00:00
Joerg Sonnenberger	270dd41f75	Remove an overeager assert from r288844. llvm-svn: 292244	2017-01-17 19:29:15 +00:00
Bob Wilson	f2d0b68b3b	Revert r291640 change to fold X86 comparison with atomic_load_add. Even with the fix from r291630, this still causes problems. I get widespread assertion failures in the Swift runtime's WeakRefCount::increment() function. I sent a reduced testcase in reply to the commit. llvm-svn: 292242	2017-01-17 19:18:57 +00:00
Chandler Carruth	b6e32daa81	[PM] Teach the LoopPassManager to automatically canonicalize loops by running LCSSA over them prior to running the loop pipeline. This also teaches the loop PM to verify that LCSSA form is preserved throughout the pipeline's run across the loop nest. Most of the test updates just leverage this new functionality. One has to be relaxed with the new PM as IVUsers is less powerful when it sees LCSSA input. Differential Revision: https://reviews.llvm.org/D28743 llvm-svn: 292241	2017-01-17 19:18:12 +00:00
Sanjay Patel	9666996563	[ValueTracking] recognize a 'not' of an assumed condition as false Also, add the corresponding match to the AssumptionCache's 'Affected Values' list. Differential Revision: https://reviews.llvm.org/D28485 llvm-svn: 292239	2017-01-17 18:15:49 +00:00
David Majnemer	de55c606d1	[InstCombine] Fold ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) This further extends r292179 to support additional binary operators beyond subtraction. llvm-svn: 292238	2017-01-17 18:08:06 +00:00
Simon Pilgrim	60662cb5d0	[X86][AVX512] Add all_of/any_of avx512vl tests llvm-svn: 292235	2017-01-17 17:33:18 +00:00
Chad Rosier	8520429bdd	[ValueTracking] Extend known bits to understand @llvm.bitreverse. Differential Revision: https://reviews.llvm.org/D28780 llvm-svn: 292233	2017-01-17 17:23:51 +00:00
Sam Kolton	9dffada98b	[AMDGPU] Assembler: fix v_mac_f16 immediates Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28802 llvm-svn: 292224	2017-01-17 15:26:02 +00:00
Simon Pilgrim	8b2996fe1a	[X86][SSE] Tests showing horizontal all_of/any_of of vector comparison results llvm-svn: 292223	2017-01-17 15:02:01 +00:00
Krasimir Georgiev	4cbe21a43c	[llvm-objdump tests] Copy the inputs of tests closer to tests. Summary: Tests under tools/llvm-objdump should not use inputs from Object. Copied the required inputs and aligned the new tests to be more consistent with the existing tests in this respect. Reviewers: ioeric Reviewed By: ioeric Subscribers: davide, djasper, cfe-commits Differential Revision: https://reviews.llvm.org/D28799 llvm-svn: 292222	2017-01-17 14:22:29 +00:00
Serge Rogatch	50be6b45a9	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292210	2017-01-17 11:52:10 +00:00
Simon Pilgrim	d4eb800b03	[InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292209	2017-01-17 11:35:03 +00:00
Matt Arsenault	4165efdc58	AMDGPU: Add replacement export intrinsics llvm-svn: 292205	2017-01-17 07:26:53 +00:00
Alexei Starovoitov	e4975487f5	[bpf] error when unknown bpf helper is called Emit error when BPF backend sees a call to a global function or to an external symbol. The kernel verifier only allows calls to predefined helpers from bpf.h which are defined in 'enum bpf_func_id'. Such calls in assembler must look like 'call [1-9]+' where number matches bpf_func_id. Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 292204	2017-01-17 07:26:17 +00:00
Craig Topper	729d30d0ae	[AVX-512] Add support for taking a bitcast between a SUBV_BROADCAST and VSELECT and moving it to the input of the SUBV_BROADCAST if it will help with using a masked operation. llvm-svn: 292201	2017-01-17 06:49:59 +00:00
Craig Topper	7b003b9cf3	[AVX-512] Add test cases showing missed opportunities to fold subvector broadcasts with a mask operation. llvm-svn: 292200	2017-01-17 06:49:54 +00:00
Sanjoy Das	679bc32c6a	[InstCombine] Don't DSE across readnone functions that may throw Summary: Depends on D28740 Reviewers: dberlin, chandlerc, hfinkel, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D28742 llvm-svn: 292197	2017-01-17 05:45:09 +00:00
Ahmed Bougacha	9e5a085cf1	Revert "[TLI] Robustize SDAG proto checking by merging it into TLI." This reverts commit r292189, as it causes issues on SystemZ bots. llvm-svn: 292191	2017-01-17 03:31:00 +00:00
Ahmed Bougacha	c018efd680	[TLI] Robustize SDAG proto checking by merging it into TLI. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to two tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. I don't think it's worth supporting either of these (IMO) broken testcases. Instead, fix them to be more correct. llvm-svn: 292189	2017-01-17 03:10:06 +00:00
Alexei Starovoitov	05de2e4818	[bpf] error when BPF stack size exceeds 512 bytes Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 292180	2017-01-17 01:05:17 +00:00
David Majnemer	36d382b773	[InstCombine] Fold ((C1-zext(X)) & C2) -> zext((C1-X) & C2) This is valid if C2 fits within the bitwidth of X thanks to two's complement modulo arithmetic. llvm-svn: 292179	2017-01-17 00:45:57 +00:00
Matt Arsenault	c8cc2be9f8	Add comment to test file I forgot to save llvm-svn: 292178	2017-01-17 00:35:28 +00:00
Matt Arsenault	b948b4d8df	SimplifyLibCalls: Remove checks for fabs Use the intrinsic instead of emitting the libcall which will be replaced by the intrinsic. llvm-svn: 292176	2017-01-17 00:30:31 +00:00
Matt Arsenault	7233344c28	SimplifyLibCalls: Replace fabs libcalls with intrinsics Add missing fabs(fpext) optimization that worked with the call, and also fixes it creating a second fpext when there were multiple uses. llvm-svn: 292172	2017-01-17 00:10:40 +00:00
Davide Italiano	1825a03f72	[Object] Fixup permissions of input files. They just need to be read/dumped, so no need to set the exec bit on any of them. NFCI, I guess. llvm-svn: 292171	2017-01-16 23:28:58 +00:00
Davide Italiano	eb9ad9831b	[llvm-objdump] Dump PT_NOTE as part of -p. PR: 31641 llvm-svn: 292170	2017-01-16 23:13:46 +00:00
Davide Italiano	cad192779a	[llvm-objdump] Dump PT_GNU_RELRO as part of -p. PR: 31641 llvm-svn: 292169	2017-01-16 22:58:26 +00:00
Davide Italiano	6cc726ead0	[llvm-objdump] Dump PT_OPENBSD_{BOOTDATA,RANDOMIZE,WXNEEDED}. PR: 31641 llvm-svn: 292167	2017-01-16 22:01:41 +00:00
Simon Pilgrim	a0b0b96d83	[InstCombine][AVX] Tests showing missed opportunities to pass demanded elts through a permilpd/permilps shuffle mask llvm-svn: 292165	2017-01-16 21:34:22 +00:00
Jan Vesely	334f51a6fe	AMDGPU/EG,CM: Implement _noret global atomics _RTN versions will be a lot more complicated Differential Revision: https://reviews.llvm.org/D28067 llvm-svn: 292162	2017-01-16 21:20:13 +00:00
David Blaikie	87299ad2e7	[XRay] Implement the `llvm-xray graph` subcommand Here we define the `graph` subcommand which generates a graph from the function call information and uses it to present the call information graphically with additional annotations. Reviewers: dblaikie, dberris Differential Revision: https://reviews.llvm.org/D27243 llvm-svn: 292156	2017-01-16 20:36:26 +00:00
Tony Jiang	8e8c444d3d	[PowerPC] Expand ISEL instruction into if-then-else sequence. Generally, the ISEL is expanded into if-then-else sequence, in some cases (like when the destination register is the same with the true or false value register), it may just be expanded into just the if or else sequence. llvm-svn: 292154	2017-01-16 20:12:26 +00:00

42253 Commits