llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	242374e219	[X86] Don't remove sign extend of gather/scatter indices during SelectionDAGBuilder. The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal. It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64. I'll work on trying to recover some of the test cases by removing sign extends in the backend when it's safe to do so with an understanding of the current legalizer capabilities. This should fix PR30690. llvm-svn: 318466	2017-11-16 23:08:57 +00:00
Craig Topper	e85ff4f732	[X86] Pre-truncate gather/scatter indices that have element sizes larger than 64-bits before Legalize. The wider element type will normally cause legalize to try to split and scalarize the gather/scatter, but we can't handle that. Instead, truncate the index early so the gather/scatter node is insulated from the legalization. This really shouldn't happen in practice since InstCombine will normalize index types to the same size as pointers. llvm-svn: 318452	2017-11-16 20:23:22 +00:00
Yonghong Song	ce96738dee	bpf: print backward branch target properly Currently, it prints the backward branch offset as unsigned value like below: 7: 7d 34 0b 00 00 00 00 00 if r4 s>= r3 goto 11 <LBB0_3> 8: b7 00 00 00 00 00 00 00 r0 = 0 LBB0_2: 9: 07 00 00 00 01 00 00 00 r0 += 1 ...... 17: bf 31 00 00 00 00 00 00 r1 = r3 18: 6d 32 f6 ff 00 00 00 00 if r2 s> r3 goto 65526 <LBB0_3+0x7FFB0> The correct print insn 18 should be: 18: 6d 32 f6 ff 00 00 00 00 if r2 s> r3 goto -10 <LBB0_2> To provide better clarity and be consistent with kernel verifier output, the insn 7 output is changed to the following with "+" added to non-negative branch offset: 7: 7d 34 0b 00 00 00 00 00 if r4 s>= r3 goto +11 <LBB0_3> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318442	2017-11-16 19:15:36 +00:00
Guozhi Wei	433e8d3e04	[PPC] Change i32 constant in store instruction to i64 This patch changes all i32 constant in store instruction to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constant. Differential Revision: https://reviews.llvm.org/D39352 llvm-svn: 318436	2017-11-16 18:27:34 +00:00
Yaxun Liu	407ca36b27	Let llvm.invariant.group.barrier accept pointers to any address space llvm.invariant.group.barrier may accept pointers to an arbitrary address space. This patch lets it accept pointers to i8 in any address space and return a pointer to i8 in the same address space. Differential Revision: https://reviews.llvm.org/D39973 llvm-svn: 318413	2017-11-16 16:32:16 +00:00
Simon Pilgrim	e8e6acdac9	[X86] Add scheduling tests for SHLD/SHRD llvm-svn: 318402	2017-11-16 14:13:48 +00:00
Diana Picus	bfdf7b6c39	[ARM GlobalISel] Add tests for BIC. NFC Add instruction selector tests for BICrr and BICri, which are handled by TableGen. llvm-svn: 318398	2017-11-16 13:32:47 +00:00
Diana Picus	4d242b18b2	[ARM GlobalISel] Add tests for REVSH patterns. NFC Add instruction selector tests for some of the REVSH patterns handled by TableGen. llvm-svn: 318393	2017-11-16 12:29:28 +00:00
Yaxun Liu	0844ff2aa7	Fix pointer EVT in SelectionDAGBuilder::visitAlloca SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is incorrect for triple amdgcn---amdgiz and causes isel failure. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40095 llvm-svn: 318392	2017-11-16 12:22:19 +00:00
Sam Parker	43fa5911a1	[DAGCombine] Enable more srl -> load combines Change the calculation for the desired ValueType for non-sign extending loads, as in those cases we don't care about the higher bits. This creates a smaller ExtVT and allows for such combinations as: (srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1]) Differential Revision: https://reviews.llvm.org/D40034 llvm-svn: 318390	2017-11-16 11:28:26 +00:00
Craig Topper	46a5d58b8c	[X86] Update TTI to report that v1iX/v1fX types aren't legal for masked gather/scatter/load/store. The type legalizer will try to scalarize these operations if it sees them, but there is no handling for scalarizing them. This leads to a fatal error. With this change they will now be scalarized by the mem intrinsic scalarizing pass before SelectionDAG. llvm-svn: 318380	2017-11-16 06:02:05 +00:00
Yaxun Liu	4d9a4d7ac8	Fix APInt bit size in processDbgDeclares processDbgDeclares assumes pointer size is the same for different addr spaces. It uses pointer size for addr space 0 for all pointers, which causes assertion in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since pointer in addr space 5 has different size than in addr space 0. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40085 llvm-svn: 318370	2017-11-16 02:54:49 +00:00
Yonghong Song	4c3ce59e61	bpf: enable llvm-objdump to print out symbolized jmp target Add hook in BPF backend so that llvm-objdump can print out the jmp target with label names, e.g., ... if r1 != 2 goto 6 <LBB0_2> ... goto 7 <LBB0_4> ... LBB0_2: ... LBB0_4: ... Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318358	2017-11-16 00:52:30 +00:00
Matt Arsenault	301162c4fe	AMDGPU: Replace i64 add/sub lowering Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases. This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds. llvm-svn: 318340	2017-11-15 21:51:43 +00:00
Dan Gohman	89bf88c87c	[WebAssembly] Update cfg-stackify.ll to remove the workaround added in r318288. Remove -switch-peel-threshold=100 and update the expected results in test10 in cfg-stackify.ll. llvm-svn: 318338	2017-11-15 21:38:33 +00:00
Sean Fertile	0f0837e84e	[PowerPC] Implement mayBeEmittedAsTailCall for PPC Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables CodeGenPrepare to duplicate returns when they might enable a tail-call. Differential Revision: https://reviews.llvm.org/D39777 llvm-svn: 318321	2017-11-15 18:58:27 +00:00
Simon Pilgrim	56415772d6	[X86] Add CBW/CDQ/CDQE/CQO/CWD/CWDE to WriteALU schedule class Some CPUs are already overriding these sign extension instructions but we should be able to use the WriteALU schedule class by default. Differential Revision: https://reviews.llvm.org/D39899 llvm-svn: 318308	2017-11-15 17:11:24 +00:00
Petar Jovanovic	cd729ead01	[mips] Improve genConstMult() to work with arbitrary precision APInt is now used instead of uint64_t in function genConstMult() allowing multiplication optimizations with constants of arbitrary length. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D38130 llvm-svn: 318296	2017-11-15 15:24:04 +00:00
Ilya Biryukov	ee7a96229e	Workaround CodeGen/WebAssembly/cfg-stackify.ll failure after r318202 By disabling the introduced optimization. llvm-svn: 318288	2017-11-15 10:50:43 +00:00
Matt Arsenault	45b98189bd	AMDGPU: Don't use MUBUF vaddr if address may overflow Effectively revert r263964. Before we would not allow this if vaddr was not known to be positive. llvm-svn: 318240	2017-11-15 00:45:43 +00:00
Matt Arsenault	c8903125cd	AMDGPU: Handle or in multi-use shl ptr combine llvm-svn: 318223	2017-11-14 23:46:42 +00:00
Hans Wennborg	1403100b6b	Fix switch-lower-peel-top-case.ll isel pass is not registered error The test was doing -stop-after=isel, but that pass is actually the AMDGPUDAGToDAGISel pass, which might not be built when targeting x86_64. This changes the test to -stop-after=expand-isel-pseudos instead. Follow-up to r318202. llvm-svn: 318220	2017-11-14 23:30:28 +00:00
Tim Renouf	39e7ce8f21	[AMDGPU] updated PAL metadata record keys Summary: The ABI changed before specification was finalized. Reviewers: kzhuravl, dstuttard Subscribers: wdng, nhaehnle, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D39807 llvm-svn: 318213	2017-11-14 23:05:36 +00:00
Aditya Nandakumar	e6201c8724	[GISel]: Rework legalization algorithm for better elimination of artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have truly legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210	2017-11-14 22:42:19 +00:00
Rong Xu	dc07ae259e	[CodeGen] Fix the test case added in r318202 Add the -mtriple option to filter some platforms. llvm-svn: 318206	2017-11-14 22:08:37 +00:00
Rong Xu	3573d8da36	[CodeGen] Peel off the dominant case in switch statement in lowering This patch peels off the top case in switch statement into a branch if the probability exceeds a threshold. This will help the branch prediction and avoids the extra compares when lowering into chain of branches. Differential Revision: http://reviews.llvm.org/D39262 llvm-svn: 318202	2017-11-14 21:44:09 +00:00
Hans Wennborg	e1ecd61b98	Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit. This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.) LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes. Differential Revision: https://reviews.llvm.org/D39287 llvm-svn: 318195	2017-11-14 21:09:45 +00:00
Matt Arsenault	9ba465a972	AMDGPU: Error on stack size overflow llvm-svn: 318189	2017-11-14 20:33:14 +00:00
Ulrich Weigand	5f4373a2fc	[SystemZ] Do not crash when selecting an OR of two constants In rare cases, common code will attempt to select an OR of two constants. This confuses the logic in splitLargeImmediate, causing an internal error during isel. Fixed by simply leaving this case to common code to handle. This fixes PR34859. llvm-svn: 318187	2017-11-14 20:00:34 +00:00
Easwaran Raman	0d55b55bb6	[CodeGenPrepare] Disable div bypass when working set size is huge. Summary: Bypass of slow divs based on operand values is currently disabled for -Os. Do the same when profile summary is available and the working set size of the application is huge. This is similar to how loop peeling is guarded by hasHugeWorkingSetSize. In the div bypass case, the generated extra code (and the extra branch) tends to outweigh the benefits of the bypass. This results in noticeable performance improvement on an internal application. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39992 llvm-svn: 318179	2017-11-14 19:31:51 +00:00
Ulrich Weigand	55b8590e03	[SystemZ] Fix invalid codegen using RISBMux on out-of-range bits Before using the 32-bit RISBMux set of instructions we need to verify that the input bits are actually within range of the 32-bit instruction. This fixes PR35289. llvm-svn: 318177	2017-11-14 19:20:46 +00:00
Simon Dardis	35d90aea7a	[mips] Simplify test for 5.0.1 (NFC) Simplify testing that an emergency spill slot is used when MSA is used so that it can be included in the 5.0.1 release. llvm-svn: 318172	2017-11-14 19:11:45 +00:00
Yaxun Liu	0b2f73fd84	CodeGen: Fix TargetLowering::LowerCallTo for sret value type TargetLowering::LowerCallTo assumes that sret value type corresponds to a pointer in default address space, which is incorrect, since sret value type should correspond to a pointer in alloca address space, which may not be the default address space. This causes assertion for amdgcn target in amdgiz environment. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39996 llvm-svn: 318167	2017-11-14 18:46:52 +00:00
Ilya Biryukov	e7329a7882	Use input redirection in WebAssembly/comdat.ll test. To match how the other tests do it. llvm-svn: 318153	2017-11-14 14:26:42 +00:00
Simon Pilgrim	600174e740	[X86][AVX] Add scheduling test for vmovntdq 256-bit store Needs to use inline asm as domain will otherwise be changed to float (vmovntps) llvm-svn: 318151	2017-11-14 14:03:29 +00:00
Tim Northover	5cdc4f9c33	ARM: correctly update CFG when splitting BB to fix branch. Because the block-splitting code is multi-purpose, we have to meddle with the branches when using it to fixup a conditional branch destination. We got the code right, but forgot to update the CFG so the verifier complained when expensive checks were on. Probably harmless since constant-islands comes so late, but best to fix it anyway. llvm-svn: 318148	2017-11-14 11:43:54 +00:00
Diana Picus	21a42bcc0b	[ARM GlobalISel] Remove C++ code for G_CONSTANT Get rid of the handwritten instruction selector code for handling G_CONSTANT. This code wasn't checking all the preconditions correctly anyway, so it's better to leave it to TableGen, which can handle at least some cases correctly (e.g. MOVi, MOVi16, folding into binary operations). Also add tests to cover those cases. llvm-svn: 318146	2017-11-14 11:20:32 +00:00
Momchil Velikov	dc86e1444d	[ARM] Fix incorrect conversion of a tail call to an ordinary call When we emit a tail call for Armv8-M, but then discover that the caller needs to save/restore `LR`, we convert the tail call to an ordinary one, since restoring `LR` takes extra instructions, which may negate the benefits of the tail call. If the callee, however, takes stack arguments, this conversion is incorrect, since nothing has been done to pass the stack arguments. Thus the patch reverts https://reviews.llvm.org/rL294000 Also, we improve the instruction sequence for popping `LR` in the case when we couldn't immediately find a scratch low register, but we can use as a temporary one of the callee-saved low registers and restore `LR` before popping other callee-saves. Differential Revision: https://reviews.llvm.org/D39599 llvm-svn: 318143	2017-11-14 10:36:52 +00:00
Matt Arsenault	b3a255eaf9	AMDGPU: Fix test llvm-svn: 318138	2017-11-14 06:40:00 +00:00
Dylan McKay	8443bcc898	[AVR] Remove the select-mbb-placement-bug.ll test This test was originally added when an old bug was fixed that caused broken iterator code to break basic block placement. The issue has an extremely low chance of ever being a problem again. This specific test is very flaky and fails often due to upstream changes. I have removed this test because it negates more value than it returns. llvm-svn: 318134	2017-11-14 04:32:49 +00:00
Matt Arsenault	57c37b2dcd	AMDGPU: Fix producing saveexec when the copy is spilled If the register from the copy from exec was spilled, the copy before the spill was deleted leaving a spill of undefined register verifier error and miscompiling. Check for other use instructions of the copy register. llvm-svn: 318132	2017-11-14 02:16:54 +00:00
Sam Clegg	999660761e	[WebAssembly] Explicitly disable comdat support for wasm output For now at least. We clearly need some kind of comdat or linkonce_odr support for wasm but currently COMDAT is not supported. Disable COMDAT support in the same way we do for Mach-O. This also causes clang not to generate COMDATs. Differential Revision: https://reviews.llvm.org/D39873 llvm-svn: 318123	2017-11-14 00:49:16 +00:00
Matt Arsenault	4b7938c658	AMDGPU: Fix not converting d16 load/stores to offset Fixes missed optimization with new MUBUF instructions. llvm-svn: 318106	2017-11-13 23:24:26 +00:00
Matt Arsenault	4eea3f3da3	AMDGPU: Implement computeKnownBitsForTargetNode for mbcnt llvm-svn: 318100	2017-11-13 22:55:05 +00:00
Evgeniy Stepanov	76d5ac4906	[arm] Fix Unnecessary reloads from GOT. Summary: This fixes PR35221. Use pseudo-instructions to let MachineCSE hoist global address computation. Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39871 llvm-svn: 318081	2017-11-13 20:45:38 +00:00
Daniel Sanders	b78ac6e322	[globalisel][tablegen] Add support for extload. llvm-svn: 318068	2017-11-13 18:30:23 +00:00
Craig Topper	c314f461dd	[X86] Allow X86ISD::Wrapper to be folded into the base of gather/scatter address If the base of our gather corresponds to something contained in X86ISD::Wrapper we should be able to fold it into the address. This patch refactors some of the address matching to more fully use the X86ISelAddressMode struct and the getAddressOperands helper. A new helper function matchVectorAddress is added to call matchWrapper or fall back to matchAddressBase. We should also be able to support constant offsets from a wrapper, but I'll look into that in a future patch. We may even be able to completely reuse matchAddress here, but I wanted to start simple and work up to it. Differential Revision: https://reviews.llvm.org/D39927 llvm-svn: 318057	2017-11-13 17:53:59 +00:00
Diana Picus	69aa20e3ca	[ARM GlobalISel] Update legalizer test Make one of the legalizer tests a bit more robust by making sure all values we're interested in are used (either in a store or a return) and by using loads instead of constants for obtaining values on fewer than 32 bits. This should make the test less fragile to changes in the legalize combiner, since those loads are legal (as opposed to the constants, which were being widened and thus produced opportunities for the legalize combiner). llvm-svn: 318047	2017-11-13 16:02:42 +00:00
Omer Paparo Bivas	4c679e1435	Inserting a base test for X86 performance nops Change-Id: I69da08b617d7fae8024c5aee04720eb465f39b81 llvm-svn: 318041	2017-11-13 15:02:39 +00:00
Uriel Korach	2aa707bdaa	[X86] test/testn intrinsics lowering to IR. llvm part. Remove builtins from llvm and add AutoUpgrade support. Also add fast-isel tests for the TEST and TESTN instructions. Differential Revision: https://reviews.llvm.org/D38736 llvm-svn: 318036	2017-11-13 12:51:18 +00:00
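
Note on ce96738dee (bpf: print backward branch target properly): the difference between "goto 65526" and "goto -10" comes down to whether the 16-bit offset field is read as unsigned or signed. The following is a minimal sketch, assuming the usual 8-byte eBPF instruction layout (opcode byte, register byte, 16-bit offset, 32-bit immediate) and a little-endian dump as in the commit message. The helper names decode_branch_offset and format_goto are hypothetical and are not part of LLVM.

```python
def decode_branch_offset(insn: bytes) -> int:
    """Return the signed 16-bit branch offset of one 8-byte eBPF instruction.

    Assumed layout: byte 0 = opcode, byte 1 = registers,
    bytes 2-3 = offset (little-endian), bytes 4-7 = immediate.
    """
    if len(insn) != 8:
        raise ValueError("an eBPF instruction is expected to be 8 bytes long")
    return int.from_bytes(insn[2:4], "little", signed=True)


def format_goto(offset: int) -> str:
    """Format the offset with an explicit sign, as the commit describes."""
    return f"goto {offset:+d}"


# Instruction 18 from the commit message: offset bytes f6 ff.
insn_18 = bytes.fromhex("6d32f6ff00000000")
# Instruction 7 from the commit message: offset bytes 0b 00.
insn_7 = bytes.fromhex("7d340b0000000000")

print(format_goto(decode_branch_offset(insn_18)))  # backward branch
print(format_goto(decode_branch_offset(insn_7)))   # forward branch
```

Reading the same two bytes of instruction 18 as unsigned gives 65526, which is the value the old output printed.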


22237 Commits
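
Note on 43fa5911a1 ([DAGCombine] Enable more srl -> load combines): the rewrite (srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1]) can be checked by hand on a small buffer. This is only an illustrative sketch, assuming a little-endian byte order; the zextload helper is hypothetical and models a zero-extending load, not any LLVM API.

```python
def zextload(mem: bytes, addr: int, nbytes: int) -> int:
    """Model a zero-extending little-endian load of nbytes bytes at addr."""
    return int.from_bytes(mem[addr:addr + nbytes], "little", signed=False)


# Arbitrary example memory; the 16-bit value at address 0 is 0x1234.
mem = bytes([0x34, 0x12, 0xAB, 0xCD])

for addr in range(len(mem) - 1):
    wide_then_shift = zextload(mem, addr, 2) >> 8   # srl of a 16-bit load
    narrow_load = zextload(mem, addr + 1, 1)        # 8-bit load one byte later
    assert wide_then_shift == narrow_load

print(hex(zextload(mem, 0, 2) >> 8), hex(zextload(mem, 1, 1)))
```

With a big-endian layout the high byte sits at the lower address, so the narrow load would not use the same offset.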