llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	301162c4fe	AMDGPU: Replace i64 add/sub lowering Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases. This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds. llvm-svn: 318340	2017-11-15 21:51:43 +00:00
Dan Gohman	89bf88c87c	[WebAssembly] Update cfg-stackify.ll to remove the workaround added in r318288. Remove -switch-peel-threshold=100 and update the expected results in test10 in cfg-stackify.ll. llvm-svn: 318338	2017-11-15 21:38:33 +00:00
Jake Ehrlich	d49c92b124	[llvm-objcopy] Change -O binary to respect section removal and behave like GNU objcopy The original -O binary implementation just copied segment data from the object and dumped it into a file. This doesn't take into account any operations performed on objects such as section removal. GNU objcopy has some specific behavior that we'd also like to respect. For instance using -O binary and -j <some_section> will dump <some_section> to a file. This change implements GNU objcopy style -O binary to as close of an approximation as I can determine. Differential Revision: https://reviews.llvm.org/D39713 llvm-svn: 318324	2017-11-15 19:13:31 +00:00
Sanjay Patel	03d0cd6a81	[InstCombine] trunc (binop X, C) --> binop (trunc X, C') Note that one-use and shouldChangeType() are checked ahead of the switch. Without the narrowing folds, we can produce inferior vector code as shown in PR35299: https://bugs.llvm.org/show_bug.cgi?id=35299 llvm-svn: 318323	2017-11-15 19:12:01 +00:00
Sean Fertile	0f0837e84e	[PowerPC] Implement mayBeEmittedAsTailCall for PPC Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables CodeGenPrepare to duplicate returns when they might enable a tail-call. Differential Revision: https://reviews.llvm.org/D39777 llvm-svn: 318321	2017-11-15 18:58:27 +00:00
Reid Kleckner	72b819b8ee	[InstCombine] Salvage debug info during initial DCE InstCombine salvages debug info for every instruction it erases from its worklist, but it wasn't doing it during its initial DCE when populating its worklist. This fixes that. This should help improve availability of 'this' in optimized debug info when casts are necessary. llvm-svn: 318320	2017-11-15 18:51:12 +00:00
Sanjay Patel	680c73f049	[InstCombine] add tests for missing trunc folds; NFC As noted in PR35299: https://bugs.llvm.org/show_bug.cgi?id=35299 ...this is likely the root cause for a mis-vectorization transform. llvm-svn: 318319	2017-11-15 18:09:43 +00:00
Simon Pilgrim	56415772d6	[X86] Add CBW/CDQ/CDQE/CQO/CWD/CWDE to WriteALU schedule class Some CPUs are already overriding these sign extension instructions but we should be able to use the WriteALU schedule class by default. Differential Revision: https://reviews.llvm.org/D39899 llvm-svn: 318308	2017-11-15 17:11:24 +00:00
Adam Nemet	572a87c76f	[SLP] Added more missed optimization remarks Summary: Added more remarks to SLP pass, in particular "missed" optimization remarks. Also proposed several tests for new functionality. Patch by Vladimir Miloserdov! For reference you may look at: https://reviews.llvm.org/rL302811 Reviewers: anemet, fhahn Reviewed By: anemet Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits Differential Revision: https://reviews.llvm.org/D38367 llvm-svn: 318307	2017-11-15 17:04:53 +00:00
Sanjay Patel	956dec63fb	[PassManager, SimplifyCFG] add test for PR34603 / D38566; NFC This is a recommit of r316908 which was reverted by r317444. llvm-svn: 318300	2017-11-15 16:37:30 +00:00
Sanjay Patel	3e29890a7f	[(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM This is a recommit of r316869 which was speculatively reverted with r317444 and subsequently shown to not be the cause of PR35210. That crash should be fixed after r318237. Original commit message: The old PM sets the options of what used to be known as "latesimplifycfg" on the instantiation after the vectorizers have run, so that's what we'redoing here. FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not set the "late" options. I'm not sure if that's intentional or not. Differential Revision: https://reviews.llvm.org/D39407 llvm-svn: 318299	2017-11-15 16:33:11 +00:00
Sander de Smalen	8e607346af	[AArch64][SVE] Asm: Report SVE parsing diagnostics only once Summary: Prevent an issue where a diagnostic is reported multiple times by bailing out with a ParseFail if an invalid SVE register element qualifier/suffix is specified, for example: <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ^ <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ... <stdin>:10:18: error: invalid sve vector kind qualifier add z20.h, z2.h, z31.x ^ Reviewers: fhahn, rengolin Reviewed By: rengolin Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D39894 llvm-svn: 318297	2017-11-15 15:44:43 +00:00
Petar Jovanovic	cd729ead01	[mips] Improve genConstMult() to work with arbitrary precision APInt is now used instead of uint64_t in function genConstMult() allowing multiplication optimizations with constants of arbitrary length. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D38130 llvm-svn: 318296	2017-11-15 15:24:04 +00:00
Igor Laevsky	6f065a9f7c	[llvm-opt-fuzzer] Add opt fuzzer to the test-depends list. This should help with the buildbot failures after rL318293. llvm-svn: 318295	2017-11-15 15:07:37 +00:00
Igor Laevsky	445ae853fb	[llvm-opt-fuzzer] Only run tests for the x86 target. This fixes build bot failures after rL318293. llvm-svn: 318294	2017-11-15 13:35:42 +00:00
Igor Laevsky	354fd88fa2	[llvm-opt-fuzzer] NFC. Add sanity tests. llvm-svn: 318293	2017-11-15 12:36:57 +00:00
Jonas Devlieghere	294e689509	[DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs. In constructAbstractSubprogramScopeDIE there can be a potential mismatch between `this` and the CU of ContextDIE when a scope is shared between two DISubprograms belonging to a different CU. In that case, `this` is the CU that was specified in the IR, but the CU of ContextDIE is that of the first subprogram that was emitted. This patch fixes the mismatch by looking up the CU of ContextDIE, and switching to use that. This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212) Patch by Philip Craig! Differential revision: https://reviews.llvm.org/D39981 llvm-svn: 318289	2017-11-15 10:57:05 +00:00
Ilya Biryukov	ee7a96229e	Workaround CodeGen/WebAssembly/cfg-stackify.ll failure after r318202 By disabling the introduced optimization. llvm-svn: 318288	2017-11-15 10:50:43 +00:00
Mikael Holmen	6e60297ee6	[Lint] Don't warn about passing alloca'd value to tail call if using byval Summary: This fixes PR35241. When using byval, the data is effectively copied as part of the call anyway, so the pointer returned by the alloca will not be leaked to the callee and thus there is no reason to issue a warning. Reviewers: rnk Reviewed By: rnk Subscribers: Ka-Ka, llvm-commits Differential Revision: https://reviews.llvm.org/D40009 llvm-svn: 318279	2017-11-15 07:46:48 +00:00
NAKAMURA Takumi	5ce714a334	Fix llvm/test/Transforms/LoopRotate/pr35210.ll in rL318237, it uses debug options. llvm-svn: 318273	2017-11-15 06:46:58 +00:00
Craig Topper	f7b86728fa	[InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition. Summary: This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select. As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero? Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39999 llvm-svn: 318267	2017-11-15 05:23:02 +00:00
Matt Arsenault	45b98189bd	AMDGPU: Don't use MUBUF vaddr if address may overflow Effectively revert r263964. Before we would not allow this if vaddr was not known to be positive. llvm-svn: 318240	2017-11-15 00:45:43 +00:00
Hans Wennborg	45cabacd2f	Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." It crashes building sqlite; see reply on the llvm-commits thread. > [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. > > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > dst++ = src++; > dst++ = src++ + 1; > dst++ = src++ + 2; > dst++ = src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 318239	2017-11-15 00:38:13 +00:00
Craig Topper	bf6495fbcb	[LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened. PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch. Fixes PR35210. Differential Revision: https://reviews.llvm.org/D40035 llvm-svn: 318237	2017-11-15 00:22:42 +00:00
Evgeniy Stepanov	cff19ee233	[asan] Prevent rematerialization of &__asan_shadow. Summary: In the mode when ASan shadow base is computed as the address of an external global (__asan_shadow, currently on android/arm32 only), regalloc prefers to rematerialize this value to save register spills. Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant pool entry). This changes adds an inline asm in the function prologue to suppress this behavior. It reduces AsanTest binary size by 7%. Reviewers: pcc, vitalybuka Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40048 llvm-svn: 318235	2017-11-15 00:11:51 +00:00
Matt Arsenault	c8903125cd	AMDGPU: Handle or in multi-use shl ptr combine llvm-svn: 318223	2017-11-14 23:46:42 +00:00
Hans Wennborg	1403100b6b	Fix switch-lower-peel-top-case.ll isel pass is not registered error The test was doing -stop-after=isel, but that pass is actually the AMDGPUDAGToDAGISel pass, which might not be built when targeting x86_64. This changes the test to -stop-after=expand-isel-pseudos instead. Follow-up to r318202. llvm-svn: 318220	2017-11-14 23:30:28 +00:00
Tim Renouf	39e7ce8f21	[AMDGPU] updated PAL metadata record keys Summary: The ABI changed before specification was finalized. Reviewers: kzhuravl, dstuttard Subscribers: wdng, nhaehnle, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D39807 llvm-svn: 318213	2017-11-14 23:05:36 +00:00
Mitch Phillips	02993892d8	[cfi-verify] Add DOT graph printing for GraphResult objects. Allows users to view GraphResult objects in a DOT directed-graph format. This feature can be turned on through the --print-graphs flag. Also enabled pretty-printing of instructions in output. Together these features make analysis of unprotected CF instructions much easier by providing a visual control flow graph. Reviewers: pcc Subscribers: llvm-commits, kcc, vlad.tsyrklevich Differential Revision: https://reviews.llvm.org/D39819 llvm-svn: 318211	2017-11-14 22:43:13 +00:00
Aditya Nandakumar	e6201c8724	[GISel]: Rework legalization algorithm for better elimination of artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have truly legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210	2017-11-14 22:42:19 +00:00
Simon Dardis	de5ed0c58e	Reland "[mips][mt][6/7] Add support for mftr, mttr instructions." This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 llvm-svn: 318207	2017-11-14 22:26:42 +00:00
Rong Xu	dc07ae259e	[CodeGen] Fix the test case added in r318202 Add the -mtriple option to filter some platforms. llvm-svn: 318206	2017-11-14 22:08:37 +00:00
Reid Kleckner	29a5c03cc2	Make salvageDebugInfo of casts work for dbg.declare and dbg.addr Summary: Instcombine (and probably other passes) sometimes want to change the type of an alloca. To do this, they generally create a new alloca with the desired type, create a bitcast to make the new pointer type match the old pointer type, replace all uses with the cast, and then simplify the casts. We already knew how to salvage dbg.value instructions when removing casts, but we can extend it to cover dbg.addr and dbg.declare. Fixes a debug info quality issue uncovered in Chromium in http://crbug.com/784609 Reviewers: aprantl, vsk Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40042 llvm-svn: 318203	2017-11-14 21:49:06 +00:00
Rong Xu	3573d8da36	[CodeGen] Peel off the dominant case in switch statement in lowering This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold. This will help the branch prediction and avoids the extra compares when lowering into chain of branches. Differential Revision: http://reviews.llvm.org/D39262 llvm-svn: 318202	2017-11-14 21:44:09 +00:00
Hans Wennborg	e1ecd61b98	Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195	2017-11-14 21:09:45 +00:00
Dinar Temirbulatov	2bd1836520	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { dst++ = src++; dst++ = src++ + 1; dst++ = src++ + 2; dst++ = src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 318193	2017-11-14 20:55:08 +00:00
Matt Arsenault	9ba465a972	AMDGPU: Error on stack size overflow llvm-svn: 318189	2017-11-14 20:33:14 +00:00
Ulrich Weigand	5f4373a2fc	[SystemZ] Do not crash when selecting an OR of two constants In rare cases, common code will attempt to select an OR of two constants. This confuses the logic in splitLargeImmediate, causing an internal error during isel. Fixed by simply leaving this case to common code to handle. This fixes PR34859. llvm-svn: 318187	2017-11-14 20:00:34 +00:00
Martin Storsjo	6835cac2f9	[llvm-strings] Add support for the -a/--all options They don't actually change nay behaviour, as llvm-strings currently checks the whole object without looking at individual sections anyway. This allows using llvm-strings in a context that explicitly passes the -a option. Differential Revision: https://reviews.llvm.org/D40020 llvm-svn: 318185	2017-11-14 19:58:36 +00:00
Hiroshi Yamauchi	69c233ac6c	Simplify irreducible loop metadata test code. Summary: Shorten the irreducible loop metadata test code by removing insignificant instructions. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40043 llvm-svn: 318182	2017-11-14 19:48:59 +00:00
Easwaran Raman	0d55b55bb6	[CodeGenPrepare] Disable div bypass when working set size is huge. Summary: Bypass of slow divs based on operand values is currently disabled for -Os. Do the same when profile summary is available and the working set size of the application is huge. This is similar to how loop peeling is guarded by hasHugeWorkingSetSize. In the div bypass case, the generated extra code (and the extra branch) tendss to outweigh the benefits of the bypass. This results in noticeable performance improvement on an internal application. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39992 llvm-svn: 318179	2017-11-14 19:31:51 +00:00
Ulrich Weigand	55b8590e03	[SystemZ] Fix invalid codegen using RISBMux on out-of-range bits Before using the 32-bit RISBMux set of instructions we need to verify that the input bits are actually within range of the 32-bit instruction. This fixer PR35289. llvm-svn: 318177	2017-11-14 19:20:46 +00:00
Simon Dardis	35d90aea7a	[mips] Simplify test for 5.0.1 (NFC) Simplify testing that an emergency spill slot is used when MSA is used so that it can be included in the 5.0.1 release. llvm-svn: 318172	2017-11-14 19:11:45 +00:00
Jake Ehrlich	d56725a042	[llvm-objcopy] Add -strip-non-alloc option to remove all non-allocated sections This change adds a new flag not present in GNU objcopy that we call --strip-non-alloc. Differential Revision: https://reviews.llvm.org/D39926 llvm-svn: 318168	2017-11-14 18:50:24 +00:00
Yaxun Liu	0b2f73fd84	CodeGen: Fix TargetLowering::LowerCallTo for sret value type TargetLowering::LowerCallTo assumes that sret value type corresponds to a pointer in default address space, which is incorrect, since sret value type should correspond to a pointer in alloca address space, which may not be the default address space. This causes assertion for amdgcn target in amdgiz environment. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39996 llvm-svn: 318167	2017-11-14 18:46:52 +00:00
Jake Ehrlich	99e2c41c1a	[llvm-objcopy] Support the rest of the ELF formats We haven't been supporting anything but ELF64LE since the start. Luckily this was always accounted for and the change is pretty trivial. B35281 requests this change for ELF32LE. This change adds support for ELF32LE, ELF64BE, and ELF32BE with all supported features that already existed for ELF64LE. Differential Revision: https://reviews.llvm.org/D39977 llvm-svn: 318166	2017-11-14 18:41:47 +00:00
Adam Nemet	852d12f303	Adjust test after r318159 llvm-svn: 318160	2017-11-14 17:12:36 +00:00
Adam Nemet	1142b2d7b7	[llvm-profdata] Report if profile data file is IR- or FE-level Differential Revision: https://reviews.llvm.org/D39997 llvm-svn: 318159	2017-11-14 16:59:18 +00:00
Ilya Biryukov	e7329a7882	Use input redirection in WebAssembly/comdat.ll test. To match how the other tests do it. llvm-svn: 318153	2017-11-14 14:26:42 +00:00
Simon Pilgrim	600174e740	[X86][AVX] Add scheduling test for vmovntdq 256-bit store Needs to use inline asm as domain will otherwise be changed to float (vmovntps) llvm-svn: 318151	2017-11-14 14:03:29 +00:00
Gil Rapaport	848581cadb	[LV] Introduce VPBlendRecipe, VPWidenMemoryInstructionRecipe This patch is part of D38676. The patch introduces two new Recipes to handle instructions whose vectorization involves masking. These Recipes take VPlan-level masks in D38676, but still rely on ILV's existing createEdgeMask(), createBlockInMask() in this patch. VPBlendRecipe handles intra-loop phi nodes, which are vectorized as a sequence of SELECTs. Its execute() code is refactored out of ILV::widenPHIInstruction(), which now handles only loop-header phi nodes. VPWidenMemoryInstructionRecipe handles load/store which are to be widened (but are not part of an Interleave Group). In this patch it simply calls ILV::vectorizeMemoryInstruction on execute(). Differential Revision: https://reviews.llvm.org/D39068 llvm-svn: 318149	2017-11-14 12:09:30 +00:00
Tim Northover	5cdc4f9c33	ARM: correctly update CFG when splitting BB to fix branch. Because the block-splitting code is multi-purpose, we have to meddle with the branches when using it to fixup a conditional branch destination. We got the code right, but forgot to update the CFG so the verifier complained when expensive checks were on. Probably harmless since constant-islands comes so late, but best to fix it anyway. llvm-svn: 318148	2017-11-14 11:43:54 +00:00
Diana Picus	21a42bcc0b	[ARM GlobalISel] Remove C++ code for G_CONSTANT Get rid of the handwritten instruction selector code for handling G_CONSTANT. This code wasn't checking all the preconditions correctly anyway, so it's better to leave it to TableGen, which can handle at least some cases correctly (e.g. MOVi, MOVi16, folding into binary operations). Also add tests to cover those cases. llvm-svn: 318146	2017-11-14 11:20:32 +00:00
Momchil Velikov	dc86e1444d	[ARM] Fix incorrect conversion of a tail call to an ordinary call When we emit a tail call for Armv8-M, but then discover that the caller needs to save/restore `LR`, we convert the tail call to an ordinary one, since restoring `LR` takes extra instructions, which may negate the benefits of the tail call. If the callee, however, takes stack arguments, this conversion is incorrect, since nothing has been done to pass the stack arguments. Thus the patch reverts https://reviews.llvm.org/rL294000 Also, we improve the instruction sequence for popping `LR` in the case when we couldn't immediately find a scratch low register, but we can use as a temporary one of the callee-saved low registers and restore `LR` before popping other callee-saves. Differential Revision: https://reviews.llvm.org/D39599 llvm-svn: 318143	2017-11-14 10:36:52 +00:00
Matt Arsenault	b3a255eaf9	AMDGPU: Fix test llvm-svn: 318138	2017-11-14 06:40:00 +00:00
Dylan McKay	8443bcc898	[AVR] Remove the select-mbb-placement-bug.ll test This test was originally added when an old bug was fixed that caused broken iterator code to break basic block placement. The issue has an extremely low chance of every being a problem again. This specific test is very flaky and fails often due to upstream changes. I have removed this test because it negates more value than it returns. llvm-svn: 318134	2017-11-14 04:32:49 +00:00
Matt Arsenault	57c37b2dcd	AMDGPU: Fix producing saveexec when the copy is spilled If the register from the copy from exec was spilled, the copy before the spill was deleted leaving a spill of undefined register verifier error and miscompiling. Check for other use instructions of the copy register. llvm-svn: 318132	2017-11-14 02:16:54 +00:00
Chandler Carruth	00a301d568	[PM] Port BoundsChecking to the new PM. Registers it and everything, updates all the references, etc. Next patch will add support to Clang's `-fexperimental-new-pass-manager` path to actually enable BoundsChecking correctly. Differential Revision: https://reviews.llvm.org/D39084 llvm-svn: 318128	2017-11-14 01:30:04 +00:00
Sam Clegg	999660761e	[WebAssembly] Explicily disable comdat support for wasm output For now at least. We clearly need some kind of comdat or linkonce_odr support for wasm but currently COMDAT is not supported. Disable COMDAT support in the same way we do the Mach-O. This also causes clang not to generated COMDATs. Differential Revision: https://reviews.llvm.org/D39873 llvm-svn: 318123	2017-11-14 00:49:16 +00:00
Hans Wennborg	08b34a017a	Update some code.google.com links llvm-svn: 318115	2017-11-13 23:47:58 +00:00
Matt Arsenault	4b7938c658	AMDGPU: Fix not converting d16 load/stores to offset Fixes missed optimization with new MUBUF instructions. llvm-svn: 318106	2017-11-13 23:24:26 +00:00
Matt Arsenault	4eea3f3da3	AMDGPU: Implement computeKnownBitsForTargetNode for mbcnt llvm-svn: 318100	2017-11-13 22:55:05 +00:00
Jake Ehrlich	1bfefc1c72	[llvm-objcopy] Add --strip-debug Many projects use this option. There are two ways to use it. You can either a) Just use --strip-debug and keep the old file with debug content or b) you can use --strip-debug, --only-keep-debug, and --add-gnu-debuglink all in conjunction to create two separate files, the stripped file and the debug file. --only-keep-debug is more complicated than --strip-debug because it keeps the section headers without keeping section contents. That's not really supported by llvm-objcopy at the moment but I plan on adding it. So this change just supports a) and options to support b) will come soon. Differential Revision: https://reviews.llvm.org/D39919 llvm-svn: 318094	2017-11-13 22:13:08 +00:00
Jake Ehrlich	fabddf18a0	[llvm-objcopy] Add --strip-all option to llvm-objcopy This change adds a slightly less extreme form of stripping. It should remove any section that starts with ".debug" and should remove any symbol table or relocations. In general this strips out most of the stuff you don't need to execute but leaves a number of things around. This behavior has been designed to be compatible with GNU strip/objcopy --strip-all so that anywhere you currently use --strip-all you should be able to use llvm-objcopy as a drop in replacement. Differential Revision: https://reviews.llvm.org/D39769 llvm-svn: 318092	2017-11-13 22:02:07 +00:00
Adrian Prantl	73d0e94e82	Fix an assertion in SelectionDAG::transferDbgValues() when transferring debug info describing the lower bits of an extended SDNode. rdar://problem/35504722 llvm-svn: 318086	2017-11-13 21:24:54 +00:00
Evgeniy Stepanov	76d5ac4906	[arm] Fix Unnecessary reloads from GOT. Summary: This fixes PR35221. Use pseudo-instructions to let MachineCSE hoist global address computation. Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39871 llvm-svn: 318081	2017-11-13 20:45:38 +00:00
Sanjay Patel	feabdd18d9	[Reassociation] regenerate test checks; NFC llvm-svn: 318076	2017-11-13 19:46:28 +00:00
Dinar Temirbulatov	a9e47fd7d9	NFC, Allow SystemZ SLP tests only when SystemZ is supported. llvm-svn: 318070	2017-11-13 18:35:43 +00:00
Daniel Sanders	b78ac6e322	[globalisel][tablegen] Add support for extload. llvm-svn: 318068	2017-11-13 18:30:23 +00:00
Craig Topper	c314f461dd	[X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address If the base of our gather corresponds to something contained in X86ISD::Wrapper we should be able to fold it into the address. This patch refactors some of the address matching to more fully use the X86ISelAddressMode struct and the getAddressOperands helper. A new helper function matchVectorAddress is added to call matchWrapper or fall back to matchAddressBase. We should also be able to support constant offsets from a wrapper, but I'll look into that in a future patch. We may even be able to completely reuse matchAddress here, but I wanted to start simple and work up to it. Differential Revision: https://reviews.llvm.org/D39927 llvm-svn: 318057	2017-11-13 17:53:59 +00:00
Sanjay Patel	7822fd884b	[Reassociate] add tests with 'reassoc' FMF; NFC llvm-svn: 318053	2017-11-13 17:29:11 +00:00
Jatin Bhateja	c61ade1ca0	[SCEV] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: sanjoy, junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 318050	2017-11-13 16:43:24 +00:00
Simon Dardis	8222160eb3	Revert "[CodeGenPrepare] Check that erased sunken address are not reused" This reverts commit r318032. The test broke some sanitizer bots. llvm-svn: 318049	2017-11-13 16:41:17 +00:00
Diana Picus	69aa20e3ca	[ARM GlobalISel] Update legalizer test Make one of the legalizer tests a bit more robust by making sure all values we're interested in are used (either in a store or a return) and by using loads instead of constants for obtaining values on fewer than 32 bits. This should make the test less fragile to changes in the legalize combiner, since those loads are legal (as opposed to the constants, which were being widened and thus produced opportunities for the legalize combiner). llvm-svn: 318047	2017-11-13 16:02:42 +00:00
Omer Paparo Bivas	4c679e1435	Inserting a base test for X86 performance nops Change-Id: I69da08b617d7fae8024c5aee04720eb465f39b81 llvm-svn: 318041	2017-11-13 15:02:39 +00:00
Uriel Korach	2aa707bdaa	[X86] test/testn intrinsics lowering to IR. llvm part. Remove builtins from llvm and add AutoUpgrade support. Also add fast-isel tests for the TEST and TESTN instructions. Differential Revision: https://reviews.llvm.org/D38736 llvm-svn: 318036	2017-11-13 12:51:18 +00:00
Momchil Velikov	842aa90192	[ARM] Place jump table as the first operand in additions When generating table jump code for switch statements, place the jump table label as the first operand in the various addition instructions in order to enable addressing mode selectors to better match index computation and possibly fold them into the addressing mode of the table entry load instruction. Differential revision: https://reviews.llvm.org/D39752 llvm-svn: 318033	2017-11-13 11:56:48 +00:00
Simon Dardis	8e2a5bd235	[CodeGenPrepare] Check that erased sunken address are not reused CodeGenPrepare sinks address computations from one basic block to another and attempts to reuse address computations that have already been sunk. If the same address computation appears twice with the first instance as an operand of a load whose result is an operand to a simplifable select, CodeGenPrepare simplifies the select and recursively erases the now dead instructions. CodeGenPrepare then attempts to use the erased address computation for the second load. Fix this by erasing the cached address value if it has zero uses before looking for the address value in the sunken address map. This partially resolves PR35209. Thanks to Alexander Richardson for reporting the issue! Reviewers: john.brawn Differential Revision: https://reviews.llvm.org/D39841 llvm-svn: 318032	2017-11-13 11:47:21 +00:00
Florian Hahn	0e9dec672d	[PartialInliner] Inline vararg functions that forward varargs. Summary: This patch extends the partial inliner to support inlining parts of vararg functions, if the vararg handling is done in the outlined part. It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is non-null, all varargs passed to the inlined function will be added to all calls to `ForwardVarArgsTo`. The partial inliner takes care to only pass `ForwardVarArgsTo` if the varargs handing is done in the outlined function. It checks that vastart is not part of the function to be inlined. `test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part of the repo) checks we do not do partial inlining if vastart is used in a basic block that will be inlined. Reviewers: davide, davidxl, grosser Reviewed By: davide, davidxl, grosser Subscribers: gyiu, grosser, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D39607 llvm-svn: 318028	2017-11-13 10:35:52 +00:00
Jina Nahias	9a7f9f123c	[x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38671 Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661 llvm-svn: 318026	2017-11-13 09:16:39 +00:00
Gadi Haber	c9f2300652	[X86][SKX] Adding scheduling info of non-intrinsic + commutable SKX opcodes. Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to: 1. add regular opcodes in addition to the suffixed "_Int" opcodes 2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS instructions that are equivalent to their counterparts without the 'C' as they are part of a hack to make floating point min/max commutable under fast math. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D39833 Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6 llvm-svn: 318024	2017-11-13 08:42:07 +00:00
Craig Topper	1af2adb9f3	[X86] Limit NOPs to 7 bytes when 'slm' is spelled 'silvermont'. We support 2 spelling for silvermont and we should accept both here. llvm-svn: 318023	2017-11-13 08:17:30 +00:00
Craig Topper	75d71540f8	[X86] Use sse_load_f32/f64 to improve load folding of scalar vfscalefss/sd, vrcp14ss/sd, rsqrt14ss/sd instructions. llvm-svn: 318022	2017-11-13 08:07:33 +00:00
Craig Topper	c748455e51	[X86] Regenerate test. NFC llvm-svn: 318021	2017-11-13 08:07:31 +00:00
Craig Topper	ca8abedb2a	[X86] Use sse_load_f32/f64 to improve load folding for scalar VFPCLASS intrinsics. llvm-svn: 318019	2017-11-13 06:46:48 +00:00
Craig Topper	bf328f263e	[X86] Add tests for missed opportunities to fold a 128-bit vector load into vfpclassss and vpfpclasssd. llvm-svn: 318018	2017-11-13 06:46:46 +00:00
Craig Topper	d4f6094091	[X86] Fix SQRTSS/SQRTSD/RCPSS/RCPSD intrinsics to use sse_load_f32/sse_load_f64 to increase load folding opportunities. llvm-svn: 318016	2017-11-13 05:25:24 +00:00
Craig Topper	24389c6746	[X86] Add tests for full vector loads to fold-load-unops.ll. We should be able to fold a full vector load into a scalar intrinsic. Since it's legal to narrow a load. llvm-svn: 318015	2017-11-13 05:25:23 +00:00
Craig Topper	a95a1fd42d	[X86] Regenerate fold-load-unops.ll and add and avx512f command line. llvm-svn: 318014	2017-11-13 05:25:21 +00:00
Matt Arsenault	fbe9533509	AMDGPU: Fix multi-use shl/add combine This was using a custom function that didn't handle the addressing modes properly for private. Use isLegalAddressingMode to avoid duplicating this. Additionally, skip the combine if there is only one use since the standard combine will handle it. llvm-svn: 318013	2017-11-13 05:11:54 +00:00
Craig Topper	deee24b83c	[X86] Use sse_load_f32/f64 in patterns for the memory forms of VRNDSCALESS/SD. llvm-svn: 318009	2017-11-13 02:03:01 +00:00
Craig Topper	63157c4784	[X86] Use EVEX encoded VRNDSCALE instructions to implement the legacy round intrinsics. The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4-bits of the immediate are 0. This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space. We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used. I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches. llvm-svn: 318008	2017-11-13 02:03:00 +00:00
Matt Arsenault	90e4f719e1	Fix some misc. -enable-var-scope violations llvm-svn: 318006	2017-11-13 01:47:52 +00:00
Matt Arsenault	e1cd482fda	AMDGPU: Select d16 loads into low component of register llvm-svn: 318005	2017-11-13 00:22:09 +00:00
Matt Arsenault	70b9282015	AMDGPU: Fix -enable-var-scope violations llvm-svn: 318004	2017-11-12 23:53:44 +00:00
Matt Arsenault	cf9b6d8d57	AMDGPU: Fix missing gfx9 atomic inc/dec tests The global instructions weren't tested. Plus there were also some -enable-var-scope violations and broken check prefixes. llvm-svn: 318003	2017-11-12 23:40:12 +00:00
Craig Topper	b42a23ff8f	[X86] Add an X86ISD::RANGES opcode to use for the scalar intrinsics. This fixes a bug where we selected packed instructions for scalar intrinsics. llvm-svn: 317999	2017-11-12 18:51:09 +00:00
Craig Topper	6b53c4a982	[X86] Add test cases and command lines demonstrating how we accidentally select vrangeps/vrangepd from vrangess/vrangesd instrinsics when the rounding mode is CUR_DIRECTION llvm-svn: 317998	2017-11-12 18:51:08 +00:00
Craig Topper	d3e5781e53	[InstCombine] Teach visitICmpInst to not break integer absolute value idioms Summary: This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max. In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute valute idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value. I've chosen a simpler case for the test here. Reviewers: spatel, davide, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39766 llvm-svn: 317994	2017-11-12 02:28:21 +00:00
Craig Topper	ac250825c6	[X86] Use vrndscaleps/pd for 128/256 ffloor/ftrunc/fceil/fnearbyint/frint when avx512vl is enabled. This matches what we do for scalar and 512-bit types. llvm-svn: 317991	2017-11-11 21:44:51 +00:00

1 2 3 4 5 ...

48910 Commits