llvm-project

Commit Graph

Author	SHA1	Message	Date
Dehao Chen	58fa724494	Use PMADDWD to expand reduction in a loop Summary: PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like: for (int i = 0; i < count; i++) a += x[i] * y[i]; Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 llvm-svn: 299776	2017-04-07 15:41:52 +00:00
Reid Kleckner	d3c87b5332	[lit] Try using process pools by default again Both pickling errors encountered on clang bots and Darwin compiler-rt should now be fixed. This has no impact on testing time on Linux, and on Windows goes from 88s to 63s for 'check'. The tests pass on Mac, but I haven't compared execution time. llvm-svn: 299775	2017-04-07 15:28:32 +00:00
Igor Breger	2953788c36	[GlobalISel] implement narrowing for G_CONSTANT. Summary: [GlobalISel] implement narrowing for G_CONSTANT. Reviewers: bogner, zvi, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31744 llvm-svn: 299772	2017-04-07 14:41:59 +00:00
Gor Nishanov	138ad6c9c0	[coroutines] Insert spills of PHI instructions correctly Summary: Fix a bug where we were inserting a spill in between the PHIs in the beginning of the block. Consider this fragment: ``` begin: %phi1 = phi i32 [ 0, %entry ], [ 2, %alt ] %phi2 = phi i32 [ 1, %entry ], [ 3, %alt ] %sp1 = call i8 @llvm.coro.suspend(token none, i1 false) switch i8 %sp1, label %suspend [i8 0, label %resume i8 1, label %cleanup] resume: call i32 @print(i32 %phi1) ``` Unless we are spilling the argument or result of the invoke, we were always inserting the spill immediately following the instruction. The fix adds a check that if the spilled instruction is a PHI Node, select an appropriate insert point with `getFirstInsertionPt()` that skips all the PHI Nodes and EH pads. Reviewers: majnemer, rnk Reviewed By: rnk Subscribers: qcolombet, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31799 llvm-svn: 299771	2017-04-07 14:16:49 +00:00
Matthew Simpson	11fe2e9f2b	Reapply r298620: [LV] Vectorize GEPs This patch reapplies r298620. The original patch was reverted because of two issues. First, the patch exposed a bug in InstCombine that caused the Chromium builds to fail (PR32414). This issue was fixed in r299017. Second, the patch introduced a bug in the vectorizer's scalars analysis that caused test suite builds to fail on SystemZ. The scalars analysis was too aggressive and marked a memory instruction scalar, even though it was going to be vectorized. This issue has been fixed in the current patch and several new test cases for the scalars analysis have been added. llvm-svn: 299770	2017-04-07 14:15:34 +00:00
Simon Dardis	9f6a5cd91d	[mips] Remove usage of debug only variable (NFC) Fix the lld-x86_64-darwin13 buildbot by removing the declaration of a debug only variable and instead moving the value into the debug statement. llvm-svn: 299769	2017-04-07 13:49:12 +00:00
Petar Jovanovic	bc54eb89ad	[mips][msa] Fix generation of bm(n)zi and bins[lr]i instructions We have two cases here, the first one being the following instruction selection from the builtin function: bm(n)zi builtin -> vselect node -> bins[lr]i machine instruction In case of bm(n)zi having an immediate which has either its high or low bits set, a bins[lr] instruction can be selected through the selectVSplatMask[LR] function. The function counts the number of bits set, and that value is being passed to the bins[lr]i instruction as its immediate, which in turn copies immediate modulo the size of the element in bits plus 1 as per specs, where we get the off-by-one-error. The other case is: bins[lr]i -> vselect node -> bsel.v In this case, a bsel.v instruction gets selected with a mask having one bit less set than required. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D30579 llvm-svn: 299768	2017-04-07 13:31:36 +00:00
Dmitry Preobrazhensky	e5147247b8	[AMDGPU][MC] Fix for Bug 28211 + LIT tests - corrected DS_GWS_* opcodes (see VI_Shader_Programming#16.pdf for detailed description) - address operand is not used - several opcodes have data operand - all opcodes have offset modifier - DS_AND_SRC2_B32: corrected typo in mnemo - DS_WRAP_RTN_F32 replaced with DS_WRAP_RTN_B32 - added CI/VI opcodes: - DS_CONDXCHG32_RTN_B64 - DS_GWS_SEMA_RELEASE_ALL - added VI opcodes: - DS_CONSUME - DS_APPEND - DS_ORDERED_COUNT Differential Revision: https://reviews.llvm.org/D31707 llvm-svn: 299767	2017-04-07 13:07:13 +00:00
Simon Dardis	6470ff0b24	[SelectionDAG] Enable target specific vector scalarization of calls and returns By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 299766	2017-04-07 13:03:52 +00:00
Jonas Paulsson	cad72efee6	[SystemZ] Check for presence of vector support in SystemZISelLowering A test case was found with llvm-stress that caused DAGCombiner to crash when compiling for an older subtarget without vector support. SystemZTargetLowering::combineTruncateExtract() should do nothing for older subtargets. This check was placed in canTreatAsByteVector(), which also helps in a few other places. Review: Ulrich Weigand llvm-svn: 299763	2017-04-07 12:35:11 +00:00
Jonas Paulsson	16100c637e	[SystemZ] Remove confusing comment in combineEXTRACT_VECTOR_ELT() It isn't just one-element vectors that can appear here. llvm-svn: 299762	2017-04-07 12:11:41 +00:00
Diana Picus	fed80723c0	[ARM] GlobalISel: Test hard float properly It turns out -float-abi=hard doesn't set the hard float calling convention for libcalls. We need to use a hard float triple instead (e.g. gnueabihf). llvm-svn: 299761	2017-04-07 12:04:24 +00:00
Sam Kolton	6e79529db4	[AMDGPU] Move SiShrinkInstruction and SDWAPeephole to SSAOptimization passes Summary: The difference between PreRegAlloc() and MachineSSAOptimization() is that the former is run even at the -O0 optimization level. In my understanding SiShrinkInstructions and SDWAPeephole shouldn't run when optimizations are disabled. With this change the order of passes will not change. Reviewers: arsenm, vpykhtin, rampitec Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31705 llvm-svn: 299757	2017-04-07 10:53:12 +00:00
Diana Picus	3c608448e1	[ARM] GlobalISel: Support frem for 64-bit values Legalize to a libcall. llvm-svn: 299756	2017-04-07 10:50:02 +00:00
Diana Picus	a5bab61a8d	[ARM] GlobalISel: Support frem for 32-bit values Legalize to a libcall. On this occasion, also start allowing soft float subtargets. For the moment G_FREM is the only legal floating point operation for them. llvm-svn: 299753	2017-04-07 09:41:39 +00:00
Craig Topper	33e0dbcc58	[InstCombine] Handle more commuted cases of ((A & B) | ~A) -> (~A | B) llvm-svn: 299747	2017-04-07 07:32:00 +00:00
Craig Topper	ccf85f24c8	[InstCombine] Add additional tests with varied commuting to show missing combines. NFC llvm-svn: 299746	2017-04-07 07:31:55 +00:00
Craig Topper	60dd9cd8e4	[InstSimplify] Use Instruction::BinaryOps instead of unsigned for a few function operands to remove some casts. NFC llvm-svn: 299745	2017-04-07 05:57:51 +00:00
Daniel Berlin	d952ceae2f	AliasAnalysis: Be less conservative about volatile than atomic. Summary: getModRefInfo is meant to answer the question "what impact does this instruction have on a given memory location" (not even another instruction). Long debate on this on IRC comes to the conclusion the answer should be "nothing special". That is, a noalias volatile store does not affect a memory location just by being volatile. Note: DSE and GVN and memdep currently believe this, because memdep just goes behind AA's back after it says "modref" right now. see line 635 of memdep. Prior to this patch we would get modref there, then check aliasing, and if it said noalias, we would continue. getModRefInfo already has this same AA check, it just wasn't being used because volatile was lumped in with ordering. (I am separately testing whether this code in memdep is now dead except for the invariant load case) Reviewers: jyknight, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31726 llvm-svn: 299741	2017-04-07 01:28:36 +00:00
Craig Topper	72a622cac7	[InstCombine] Add more commuted patterns to support folding ((~A & B) | A) -> (A | B). llvm-svn: 299737	2017-04-07 00:29:47 +00:00
Derek Schuff	9bb494caf4	[WebAssembly] Fix -Wcovered-switch-default warning llvm-svn: 299736	2017-04-06 23:52:01 +00:00
Zachary Turner	10169b6d0d	Allow specification of what kinds of class members to dump. Previously when dumping class definitions, there were only two modes - on or off. But it's useful to sometimes get a little more fine-grained. For example, you might only want to see the record layout (for example to look for extraneous padding). This patch adds a third mode, layout mode, which does exactly that. Only this-relative data members are displayed in this mode. Differential Revision: https://reviews.llvm.org/D31794 llvm-svn: 299733	2017-04-06 23:43:39 +00:00
Zachary Turner	63230a4e71	[llvm-pdbdump] Allow pretty to only dump specific types of types. Previously we just had the -types option, which would dump all classes, typedefs, and enums. But this produces a lot of output if you only want to view classes, for example. This patch breaks this down into 3 additional options, -classes, -enums, and -typedefs, and keeps the -types option around which implies all 3 more specific options. Differential Revision: https://reviews.llvm.org/D31791 llvm-svn: 299732	2017-04-06 23:43:12 +00:00
Konstantin Zhuravlyov	4b3847e865	AMDGPU/GFX9: Fix shared and private aperture queries Differential Revision: https://reviews.llvm.org/D31786 llvm-svn: 299727	2017-04-06 23:02:33 +00:00
Eric Christopher	380611addc	Remove the default subtarget from the Power port. It's unnecessary and harmful if used. llvm-svn: 299726	2017-04-06 23:01:30 +00:00
Craig Topper	740fe1a6eb	[InstCombine] Add a few cases for OR we fail to optimize due to missing commuted patterns checks. llvm-svn: 299725	2017-04-06 23:00:22 +00:00
Yi Kong	60b5a1cd17	Revert "Revert "[ARM] Add Kryo to available targets"" This reverts commit dc9458d5a747a02a9a8f198b84c2b92a6939a8dd. Added missing case for PreISelOperandLatencyAdjustment. llvm-svn: 299724	2017-04-06 22:47:47 +00:00
Eli Friedman	5fba1e53f2	Turn on -addr-sink-using-gep by default. The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 llvm-svn: 299723	2017-04-06 22:42:18 +00:00
Michael Kuperstein	6129887d21	[X86] Revert r299387 due to AVX legalization infinite loop. llvm-svn: 299720	2017-04-06 22:33:25 +00:00
Craig Topper	a521c30dc6	[InstCombine] Remove testing assert I accidentally left in r299710. llvm-svn: 299715	2017-04-06 21:29:43 +00:00
Zachary Turner	2f3df6137a	iwyu fixes for lldbCore. This adjusts header file includes for headers and source files in Core. In doing so, one dependency cycle is eliminated because all the includes from Core to that project were dead includes anyway. In places where some files in other projects were only compiling due to a transitive include from another header, fixups have been made so that those files also include the header they need. Tested on Windows and Linux, and plan to address failures on OSX and FreeBSD after watching the bots. llvm-svn: 299714	2017-04-06 21:28:29 +00:00
Matt Arsenault	21a438255d	AMDGPU: Diagnose illegal SGPR to VGPR copies This is possible in ways that are not compiler bugs, so stop asserting on them. This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters. llvm-svn: 299712	2017-04-06 21:09:53 +00:00
Craig Topper	b4da6840d8	[InstCombine] When checking to see if we can turn subtracts of 2^n - 1 into xor, we only need to call computeKnownBits on the RHS not the whole subtract. While there use isMask instead of isPowerOf2(C+1) Calling computeKnownBits on the RHS should allow us to recurse one step further. isMask is equivalent to the isPowerOf2(C+1) except in the case where C is all ones. But that was already handled earlier by creating a not which is an Xor with all ones. So this should be fine. llvm-svn: 299710	2017-04-06 21:06:03 +00:00
Matt Arsenault	5cf4271883	AMDGPU: Replace fp16SrcZerosHighBits with a whitelist FCOPYSIGN is lowered to bit operations which don't clear the high bits. llvm-svn: 299708	2017-04-06 20:58:30 +00:00
Rong Xu	2bf4c59025	[PGO] Preserve GlobalsAA in pgo-memop-opt pass. Preserve GlobalsAA analysis in memory intrinsic calls optimization based on profiled size. llvm-svn: 299707	2017-04-06 20:56:00 +00:00
Keno Fischer	1505de5495	[llvm-extract] Add option for recursive extraction Summary: Particularly, with --delete, this can be very useful for testing new optimizations on some hotspots, without having to run it on the whole application. E.g. as such: ``` llvm-extract app.bc --recursive --rfunc .hotspot. > hotspot.bc llvm-extract app.bc --recursive --delete --rfunc .hotspot. > residual.bc llc -filetype=obj residual.bc > residual.o llc -filetype=obj hotspot.bc > hotspot.o cc -o app residual.o hotspot.o ``` Reviewed By: davide Differential Revision: https://reviews.llvm.org/D31722 llvm-svn: 299706	2017-04-06 20:51:40 +00:00
Craig Topper	7226d796aa	[InstCombine] Remove redundant combine from visitAnd This combine is fully handled by SimplifyDemandedInstructionBits as of r299658 where I fixed this code to ensure the Add/Sub had only a single user. Otherwise it would fire and create additional instructions. That fix resulted in an improvement to code generated for tsan which is why I committed it before deleting. Differential Revision: https://reviews.llvm.org/D31543 llvm-svn: 299704	2017-04-06 20:41:48 +00:00
Davide Italiano	18ad20eab5	[BFIterator] Remove an assertion that doesn't hold. NFCI. llvm-svn: 299703	2017-04-06 20:32:10 +00:00
Mehdi Amini	db11fdfda5	Revert "Turn some C-style vararg into variadic templates" This reverts commit r299699, the examples need to be updated. llvm-svn: 299702	2017-04-06 20:23:57 +00:00
Huihui Zhang	98240e9643	[SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL In LowerMUL, the chain information is not preserved for the newly created Load SDNode. For example, if a Store aliases with one of the operands of Mul, the Load for that operand needs to be scheduled before the Store. The dependence is recorded in the chain of Store, in TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL. llvm-svn: 299701	2017-04-06 20:22:51 +00:00
Mehdi Amini	579540a8f7	Turn some C-style vararg into variadic templates Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299699	2017-04-06 20:09:31 +00:00
Evgeniy Stepanov	6c3a8cbc4d	[asan] Fix dead stripping of globals on Linux. Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a re-land of r298158 rebased on D31358. This time, asan.module_ctor is put in a comdat as well to avoid quadratic behavior in Gold. llvm-svn: 299697	2017-04-06 19:55:17 +00:00
Evgeniy Stepanov	5dfe420d10	[asan] Put ctor/dtor in comdat. When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a rebase of r298756. llvm-svn: 299696	2017-04-06 19:55:13 +00:00
Evgeniy Stepanov	039af609f1	[asan] Delay creation of asan ctor. Create the constructor in the module pass. This is needed for the GC-friendly globals change, where the constructor can be put in a comdat in some cases, but we don't know about that in the function pass. This is a rebase of r298731 which was reverted due to a false alarm. llvm-svn: 299695	2017-04-06 19:55:09 +00:00
Peter Collingbourne	db4cafa6c4	Bitcode: Do not create FNENTRYs for aliases of functions. There doesn't seem to be any point in doing this. Differential Revision: https://reviews.llvm.org/D31691 llvm-svn: 299694	2017-04-06 19:39:24 +00:00
Keno Fischer	bacc64b5fa	[StripDeadDebugInfo] Drop dead CUs entirely Summary: Prior to this while it would delete the dead DIGlobalVariables, it would leave dead DICompileUnits and everything referenced therefrom. For a big bitcode file with thousands of compile units those dead nodes easily outnumbered the real ones. Clean that up. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D31720 llvm-svn: 299692	2017-04-06 19:26:22 +00:00
Yaxun Liu	76ae47cb35	[AMDGPU] Temporarily change constant address space from 4 to 2 Our final address space mapping is to let the constant address space be 4 to match nvptx. However for now we will make it 2 to avoid unnecessary work in FE/BE/devlib about intrinsics returning constant pointers. Differential Revision: https://reviews.llvm.org/D31770 llvm-svn: 299690	2017-04-06 19:17:32 +00:00
Yi Kong	5e7059b702	Revert "[ARM] Add Kryo to available targets" This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4. Build breakage. llvm-svn: 299689	2017-04-06 19:16:14 +00:00
Nirav Dave	974f7c23ae	[SDAG] Fix visitAND optimization to deal with vector extract case again. Summary: Fix case elided by rL298920. Fixes PR32545. Reviewers: eli.friedman, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31759 llvm-svn: 299688	2017-04-06 19:05:41 +00:00
Craig Topper	8ef20ea7c2	[InstSimplify] Remove unreachable default from SimplifyBinOp. We have dedicated handlers for every opcode so nothing can get here anymore. The switch doesn't get detected as fully covered because Opcode is an unsigned. Casting to Instruction::BinaryOps still doesn't detect it because BinaryOpsEnd is in the enum and 1 past the last opcode. llvm-svn: 299687	2017-04-06 18:59:08 +00:00
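
The reduction named in r299776 (the `a += x[i] * y[i]` loop) can be illustrated in plain C++. The sketch below is not the LLVM implementation: it emulates in scalar code what one PMADDWD does on eight 16-bit lanes (multiply element-wise to 32 bits, then add adjacent pairs), and the function names are hypothetical. It assumes the inputs are small enough that the 32-bit accumulators do not overflow.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar reference: the reduction loop quoted in the commit message.
int32_t dot_scalar(const int16_t *x, const int16_t *y, std::size_t count) {
  int32_t a = 0;
  for (std::size_t i = 0; i < count; i++)
    a += static_cast<int32_t>(x[i]) * y[i];
  return a;
}

// Scalar emulation of one PMADDWD over eight 16-bit lanes (hypothetical
// helper): each output lane holds the sum of two adjacent 32-bit products.
void pmaddwd_emulated(const int16_t *x, const int16_t *y, int32_t out[4]) {
  for (int lane = 0; lane < 4; lane++) {
    out[lane] = static_cast<int32_t>(x[2 * lane]) * y[2 * lane] +
                static_cast<int32_t>(x[2 * lane + 1]) * y[2 * lane + 1];
  }
}

// Same reduction, restructured the way a PMADDWD-based loop would run:
// eight elements per step into four partial sums, then a final horizontal
// add and a scalar tail for the leftover elements.
int32_t dot_pmaddwd_style(const int16_t *x, const int16_t *y,
                          std::size_t count) {
  int32_t acc[4] = {0, 0, 0, 0};
  std::size_t i = 0;
  for (; i + 8 <= count; i += 8) {
    int32_t partial[4];
    pmaddwd_emulated(x + i, y + i, partial);
    for (int lane = 0; lane < 4; lane++)
      acc[lane] += partial[lane];
  }
  int32_t a = acc[0] + acc[1] + acc[2] + acc[3];
  for (; i < count; i++)
    a += static_cast<int32_t>(x[i]) * y[i];
  return a;
}
```

Both functions compute the same sum; the second only regroups the additions, which is why the transformation is valid for integer reductions.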

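
The two InstCombine folds listed above, `((A & B) | ~A) -> (~A | B)` in r299747 and `((~A & B) | A) -> (A | B)` in r299737, are bitwise identities, so they can be checked exhaustively on a small integer width. The following is a minimal sketch with a hypothetical function name; it verifies the algebra only and says nothing about how InstCombine matches the commuted operand orders.

```cpp
#include <cstdint>

// Returns true if both identities hold for every pair of 8-bit values.
// Because the operations are bitwise, holding for 8 bits implies holding
// for any width.
bool or_folds_hold_for_all_u8() {
  for (unsigned a = 0; a < 256; a++) {
    for (unsigned b = 0; b < 256; b++) {
      const uint8_t A = static_cast<uint8_t>(a);
      const uint8_t B = static_cast<uint8_t>(b);
      const uint8_t notA = static_cast<uint8_t>(~A);

      // ((A & B) | ~A) == (~A | B)
      const uint8_t lhs1 = static_cast<uint8_t>((A & B) | notA);
      const uint8_t rhs1 = static_cast<uint8_t>(notA | B);

      // ((~A & B) | A) == (A | B)
      const uint8_t lhs2 = static_cast<uint8_t>((notA & B) | A);
      const uint8_t rhs2 = static_cast<uint8_t>(A | B);

      if (lhs1 != rhs1 || lhs2 != rhs2)
        return false;
    }
  }
  return true;
}
```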

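
The motivation given in r299699 (a trailing `nullptr` sentinel and no type checking with C-style varargs) applies to any C++ API, not only `Module::getOrInsertFunction`. The sketch below uses hypothetical helper names and plain strings rather than LLVM types; it only contrasts the two calling conventions.

```cpp
#include <cstdarg>
#include <string>
#include <vector>

// C-style varargs: the caller must end the list with a null pointer, and
// the compiler cannot check the argument types.
std::vector<std::string> collect_c_style(const char *first, ...) {
  std::vector<std::string> out;
  va_list ap;
  va_start(ap, first);
  for (const char *s = first; s != nullptr; s = va_arg(ap, const char *))
    out.push_back(s);
  va_end(ap);
  return out;
}

// Variadic template: no sentinel is needed, and every argument must be
// convertible to std::string or the call fails to compile.
template <typename... Args>
std::vector<std::string> collect_variadic(Args... args) {
  return {std::string(args)...};
}
```

A call such as `collect_variadic("i32", 42)` is rejected at compile time, whereas the equivalent mistake in the C-style version compiles and misbehaves at run time.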
147190 Commits