llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	6b9bf7ecbc	[X86][AVX] Prefer VPBLENDW+VPBLENDD to VPBLENDVB for v16i16 blend shuffles Noticed while looking at D49562 codegen - we can avoid a large constant mask load and a slow VPBLENDVB select op by using VPBLENDW+VPBLENDD instead. TODO: As discussed on the patch, we should investigate adding VPBLENDVB handling to target shuffle combining as well, that will allow us to extend this to VPBLENDW+VPBLENDW+VPBLENDD. Differential Revision: https://reviews.llvm.org/D50074 llvm-svn: 340913	2018-08-29 10:51:08 +00:00
Krasimir Georgiev	edc318166f	[MC] fix a clang-tidy warning, NFC Summary: Per clang-tidy: function 'llvm::MCStreamer::checkCVLocSection' has a definition with different parameter names .../llvm/lib/MC/MCStreamer.cpp:275:18: the definition seen here .../llvm/include/llvm/MC/MCStreamer.h:235:8: differing parameters are named here: ('FuncId'), in definition: ('FunctionId') Reviewers: bkramer Reviewed By: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51406 llvm-svn: 340912	2018-08-29 10:40:51 +00:00
George Rimar	9fbecc97ae	Revert r340904 "[llvm-mc] - Allow to set custom flags for debug sections." It broke PPC64 BB: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/23252 llvm-svn: 340906	2018-08-29 09:04:52 +00:00
George Rimar	999d1ce517	[llvm-mc] - Allow to set custom flags for debug sections. I am experimenting with a single split dwarf (.dwo sections in .o files). I want to make linker to ignore .dwo sections in .o, for that I am trying to add SHF_EXCLUDE flag ("E") for them in my asm sample. I found that currently, it is impossible to add any flag for debug sections using llvm-mc. That happens because we have a set of predefined unique sections created early with default flags: https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCObjectFileInfo.cpp#L391 This patch allows a user to add any flags he wants. I had to edit TargetLoweringObjectFileImpl.cpp to set MetaData type for debug sections. Their kind was Data by default (so they were allocatable) and so after changes introduced by this patch the SHF_ALLOC flag was applied for them, what does not make sense for debug sections. One of OrcJITTests tests failed because of that. Differential revision: https://reviews.llvm.org/D51361 llvm-svn: 340904	2018-08-29 08:42:02 +00:00
Nicolai Haehnle	283b995097	AMDGPU: Fix getInstSizeInBytes Summary: Add some optional code to validate getInstSizeInBytes for emitted instructions. This flushed out some issues which are fixed by this patch: - Streamline getInstSizeInBytes - Properly define the VI readlane/writelane instruction as VOP3 - Fix the inline constant determination. Specifically, this change fixes an issue where a 32-bit value of 0xffffffff was recorded as unsigned. This is equal to -1 when restricting to a 32-bit comparison, and an inline constant can be used. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D50629 Change-Id: Id87c3b7975839da0de8156a124b0ce98c5fb47f2 llvm-svn: 340903	2018-08-29 07:46:09 +00:00
Hans Wennborg	e0f3e9283f	LoopSink: Don't sink into blocks without an insertion point (PR38462) In the PR, LoopSink was trying to sink into a catchswitch block, which doesn't have a valid insertion point. Differential Revision: https://reviews.llvm.org/D51307 llvm-svn: 340900	2018-08-29 06:55:27 +00:00
Zachary Turner	b2fef1a0b0	Add support for various C++14 demanglings. Mostly this includes <auto> and <decltype-auto> return values. Additionally, this fixes a fairly obscure back-referencing bug that was encountered in one of the C++14 tests, which is that if you have something like Foo<&bar, &bar> then the `bar` forms a backreference. llvm-svn: 340896	2018-08-29 04:12:44 +00:00
Zachary Turner	38d2edd60d	[MS Demangler] Add output flags to all function calls. Previously we had a FunctionSigFlags, but it's more flexible to just have one set of output flags that apply to the entire process and just pipe the entire set of flags through the output process. This will be useful when we start allowing the user to customize the outputting behavior. llvm-svn: 340894	2018-08-29 03:59:17 +00:00
Aditya Nandakumar	6d47a41520	[GISel]: Add legalization support for Widening UADDO/USUBO https://reviews.llvm.org/D51384 Added code in LegalizerHelper to widen UADDO/USUBO along with unit tests. Reviewed by volkan. llvm-svn: 340892	2018-08-29 03:17:08 +00:00
Craig Topper	9f42726cc7	[X86] Support v2i32 gather/scatter indices with -x86-experimental-vector-widening-legalization Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp Reviewers: RKSimon, spatel, efriedma Reviewed By: efriedma Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D51337 llvm-svn: 340891	2018-08-29 02:12:49 +00:00
Peter Collingbourne	9c9c8b22d2	Start reserving x18 by default on Android targets. Differential Revision: https://reviews.llvm.org/D45588 llvm-svn: 340889	2018-08-29 01:38:47 +00:00
Reid Kleckner	689f773317	[codeview] Clean up machinery for deferring .cv_loc emission Now that we create the label at the point of the directive, we don't need to set the "current CV location", and then later when we emit the next instruction, create a label for it and emit it. DWARF still defers the labels used in .debug_loc until the next instruction or value, for reasons unknown. llvm-svn: 340883	2018-08-28 23:25:59 +00:00
Zhaoshi Zheng	35818e2789	[QTOOL-37352] Consider isLegalAddressingImm in Constant Hoisting In Thumb1, legal imm range is [0, 255] for ADD/SUB instructions. However, the legal imm range for LD/ST in (R+Imm) addressing mode is [0, 127]. Imms in [128, 255] are materialized by mov R, #imm, and LD/STs use them in (R+R) addressing mode. This patch checks if a constant is used as offset in (R+Imm), if so, it checks isLegalAddressingMode passing the constant value as BaseOffset. Differential Revision: https://reviews.llvm.org/D50931 llvm-svn: 340882	2018-08-28 23:00:59 +00:00
Reid Kleckner	c8074aa654	[codeview] Emit labels for .cv_loc immediately Previously we followed the DWARF implementation, which waits until the next instruction or data to emit the label to use in the .debug_loc section. We might want to consider re-evaluating that design choice as well, since it means the .loc skips alignment padding, for better or worse. This was the most minimal fix I could come up with, but we should be able to do a lot of cleanups now that we don't need to save a pending CV location on the CodeViewContext. I plan to do those next, but this immediately fixes an assertion for some of our users. llvm-svn: 340878	2018-08-28 22:29:12 +00:00
Lang Hames	6cadc7c06b	[ORC] Replace lookupFlags in JITSymbolResolver with getResponsibilitySet. The new method name/behavior more closely models the way it was being used. It also fixes an assertion that can occur when using the new ORC Core APIs, where flags alone don't necessarily provide enough context to decide whether the caller is responsible for materializing a given symbol (which was always the reason this API existed). The default implementation of getResponsibilitySet uses lookupFlags to determine responsibility as before, so existing JITSymbolResolvers should continue to work. llvm-svn: 340874	2018-08-28 21:18:05 +00:00
Fedor Sergeev	43083111a2	[NFC][PassTiming] factor out generic PassTimingInfo Moving PassTimingInfo from legacy pass manager code into a separate header. Making it suitable for both legacy and new pass manager. Adding a test on -time-passes main functionality. llvm-svn: 340872	2018-08-28 21:06:51 +00:00
Alina Sbirlea	52e97a28d4	[SimpleLoopUnswitch] Form dedicated exits after trivial unswitches. Summary: Form dedicated exits after trivial unswitches. Fixes PR38737, PR38283. Reviewers: chandlerc, fedor.sergeev Subscribers: sanjoy, jlebar, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D51375 llvm-svn: 340871	2018-08-28 20:41:05 +00:00
Lang Hames	37a66413c1	[ORC] Add an addObjectFile method to LLJIT. The addObjectFile method adds the given object file to the JIT session, making its code available for execution. Support for the -extra-object flag is added to lli when operating in -jit-kind=orc-lazy mode to support testing of this feature. llvm-svn: 340870	2018-08-28 20:20:31 +00:00
Craig Topper	9401fd0ed2	[X86] Add intrinsics for KADD instructions These are intrinsics for supporting kadd builtins in clang. These builtins are already in gcc to implement intrinsics from icc. Though they are missing from the Intel Intrinsics Guide. This instruction adds two mask registers together as if they were scalar rather than a vXi1. We might be able to get away with a bitcast to scalar and a normal add instruction, but that would require DAG combine smarts in the backend to recoqnize add+bitcast. For now I'd prefer to go with the easiest implementation so we can get these builtins in to clang with good codegen. Differential Revision: https://reviews.llvm.org/D51370 llvm-svn: 340869	2018-08-28 19:22:55 +00:00
Fangrui Song	9cca227d3e	[AMDGPU] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off llvm-svn: 340868	2018-08-28 19:19:03 +00:00
Matt Morehouse	bab8556f01	Revert "[libFuzzer] Port to Windows" This reverts commit r340860 due to failing tests. llvm-svn: 340867	2018-08-28 19:07:24 +00:00
Matt Arsenault	755f41f3a2	AMDGPU: Don't delete instructions if S_ENDPGM has implicit uses This can leave behind the uses with the defs removed. Since this should only really happen in tests, it's not worth the effort of trying to handle this. llvm-svn: 340866	2018-08-28 18:55:55 +00:00
Aditya Nandakumar	6b4d343e13	[GISel]: Add missing opcodes for overflow intrinsics https://reviews.llvm.org/D51197 Currently, IRTranslator (and GISel) seems to be arbitrarily picking which overflow intrinsics get mapped into opcodes which either have a carry as an input or not. For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM IR). This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO, G_SSUBE and G_SADDE. llvm-svn: 340865	2018-08-28 18:54:10 +00:00
Thomas Lively	adb6da10b8	[WebAssembly][NFC] Document stackifier tablegen backend Summary: Add comments to help readers avoid having to read tablegen backends to understand the code. Also remove unecessary breaks from the output. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51371 llvm-svn: 340864	2018-08-28 18:49:47 +00:00
Matt Arsenault	44a8a756e2	AMDGPU: Force shrinking of add/sub even if the carry is used The original motivating example uses a 64-bit add, so the carry is used. Insert a copy from VCC. This may allow shrinking of the used carry instruction. At worst, we are replacing a mov to materialize the constant with a copy of vcc. llvm-svn: 340862	2018-08-28 18:44:16 +00:00
Matt Morehouse	c6fff3b6f5	[libFuzzer] Port to Windows Summary: Port libFuzzer to windows-msvc. This patch allows libFuzzer targets to be built and run on Windows, using -fsanitize=fuzzer and/or fsanitize=fuzzer-no-link. It allows these forms of coverage instrumentation to work on Windows as well. It does not fix all issues, such as those with -fsanitize-coverage=stack-depth, which is not usable on Windows as of this patch. It also does not fix any libFuzzer integration tests. Nearly all of them fail to compile, fixing them will come in a later patch, so libFuzzer tests are disabled on Windows until them. Patch By: metzman Reviewers: morehouse, rnk Reviewed By: morehouse, rnk Subscribers: morehouse, kcc, eraman Differential Revision: https://reviews.llvm.org/D51022 llvm-svn: 340860	2018-08-28 18:34:32 +00:00
Matt Arsenault	de6c421cc8	AMDGPU: Shrink insts to fold immediates This needs to be done in the SSA fold operands pass to be effective, so there is a bit of overlap with SIShrinkInstructions but I don't think this is practically avoidable. llvm-svn: 340859	2018-08-28 18:34:24 +00:00
Thomas Lively	995ad61f23	[WebAssembly] v128.not Implementation and tests. llvm-svn: 340857	2018-08-28 18:31:15 +00:00
Matt Arsenault	35b1902bce	AMDGPU: Move canShrink into TII llvm-svn: 340855	2018-08-28 18:22:34 +00:00
Nirav Dave	11e39fb6fb	[DAGCombine] Rework MERGE_VALUES to inline in single pass. NFCI. Avoid hyperlinear cost of inlining MERGE_VALUE node by constructing temporary vector and doing a single replacement. llvm-svn: 340853	2018-08-28 18:13:26 +00:00
Nirav Dave	113f2b9058	[DAG] Avoid recomputing Divergence checks. NFCI. When making multiple updates to the same SDNode, recompute node divergence only once after all changes have been made. llvm-svn: 340852	2018-08-28 18:13:00 +00:00
Nirav Dave	0b8cb46e0b	[DAG] Fix updateDivergence calculation Check correct SDNode when deciding if we should update the divergence property. llvm-svn: 340851	2018-08-28 18:12:35 +00:00
Matt Arsenault	10de2775bd	AMDGPU: Remove nan tests in class if src is nnan llvm-svn: 340850	2018-08-28 18:10:02 +00:00
Heejin Ahn	56e79dd048	[WebAssembly] Use getCalleeOpNo utility function (NFC) Reviewers: tlively Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51366 llvm-svn: 340848	2018-08-28 17:49:39 +00:00
Craig Topper	c73095e264	[X86] Mark the FUCOMI instructions as requiring CMOV to be enabled. NFCI These instructions were added on the PentiumPro along with CMOV. This was already comprehended by the lowering process which should emit an alternate sequence using FCOM and FNSTW. This just makes it an explicit error if that doesn't work for some reason. llvm-svn: 340844	2018-08-28 17:17:13 +00:00
Brian Cain	3e0ca5708f	[debuginfo] generate debug info with asm+.file Summary: For assembly input files, generate debug info even when the .file directive is present, provided it does not include a file-number argument. Fixes PR38695. Reviewers: probinson, sidneym Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D51315 llvm-svn: 340839	2018-08-28 16:23:39 +00:00
David Bolvansky	c1b27b562b	[Inliner] Attribute callsites with inline remarks Summary: Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites. An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//. For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons: remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6) remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites: define void @test1() { call void @bar(i1 true) #0 call void @bar(i1 false) #2 ret void } attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" } .. attributes #2 = { "inline-remark"="(cost=never): recursive" } Patch by: yrouban (Yevgeny Rouban) Reviewers: xbolva00, tejohnson, apilipenko Reviewed By: xbolva00, tejohnson Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D50435 llvm-svn: 340834	2018-08-28 15:27:25 +00:00
Ryan Taylor	1f334d0062	[AMDGPU] Add support for a16 modifiear for gfx9 Summary: Adding support for a16 for gfx9. A16 bit replaces r128 bit for gfx9. Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841 Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50575 llvm-svn: 340831	2018-08-28 15:07:30 +00:00
Mikael Holmen	4d652c4ce7	[CloneFunction] Constant fold terminators before checking single predecessor Summary: This fixes PR31105. There is code trying to delete dead code that does so by e.g. checking if the single predecessor of a block is the block itself. That check fails on a block like this bb: br i1 undef, label %bb, label %bb since that has two (identical) predecessors. However, after the check for dead blocks there is a call to ConstantFoldTerminator on the basic block, and that call simplifies the block to bb: br label %bb Therefore we now do the call to ConstantFoldTerminator before the check if the block is dead, so it can realize that it really is. The original behavior lead to the block not being removed, but it was simplified as above, and then we did a call to Dest->replaceAllUsesWith(&*I); with old and new being equal, and an assertion triggered. Reviewers: chandlerc, fhahn Reviewed By: fhahn Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D51280 llvm-svn: 340820	2018-08-28 12:40:11 +00:00
Alexandros Lamprineas	484bd13e2d	[GVNHoist] Prune out useless CHI insertions Fix for the out-of-memory error when compiling SemaChecking.cpp with GVNHoist and ubsan enabled. I've used a cache for inserted CHIs to avoid excessive memory usage. Differential Revision: https://reviews.llvm.org/D50323 llvm-svn: 340818	2018-08-28 11:07:54 +00:00
Simon Pilgrim	af98587095	[X86][SSE] Improve variable scalar shift of vXi8 vectors (PR34694) This patch creates the shift mask and actual shift using the vXi16 vector shift ops. Differential Revision: https://reviews.llvm.org/D51263 llvm-svn: 340813	2018-08-28 10:37:29 +00:00
Simon Pilgrim	f119e27d80	[X86][SSE] Avoid vector extraction/insertion for non-constant uniform shifts As discussed on D51263, we're better off using byte shifts to clear the upper bits on pre-SSE41 hardware. llvm-svn: 340810	2018-08-28 10:14:09 +00:00
Max Kazantsev	0c4b84e2df	[NFC] A loop can never contain Ret instruction llvm-svn: 340808	2018-08-28 09:26:28 +00:00
David Chisnall	5e52cadf89	Fix in getAllocationDataForFunction Summary: Correct to use set like behaviour of AllocType. Should check for subset, not precise value. Reviewers: theraven Reviewed By: theraven Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50959 llvm-svn: 340807	2018-08-28 08:59:06 +00:00
Craig Topper	c1436db753	[X86] Fix some comments to refer to KORTEST not KTEST. NFC KTEST is a different instruction. All of this code uses KORTEST. llvm-svn: 340799	2018-08-28 06:39:35 +00:00
Craig Topper	c7506b28c1	[DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the resulting load is legal for the target. Summary: I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there... I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))). Reviewers: efriedma, atanasyan, arsenm Reviewed By: efriedma Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50491 llvm-svn: 340797	2018-08-28 03:47:20 +00:00
Craig Topper	a6cd4b9bce	[InstCombine] Extend (add (sext x), cst) --> (sext (add x, cst')) and (add (zext x), cst) --> (zext (add x, cst')) to work for vectors Differential Revision: https://reviews.llvm.org/D51236 llvm-svn: 340796	2018-08-28 02:02:29 +00:00
Kit Barton	7c80f98b69	[PPC] Remove Darwin support from POWER backend. This patch issues an error message if Darwin ABI is attempted with the PPC backend. It also cleans up existing test cases, either converting the test to use an alternative triple or removing the test if the coverage is no longer needed. Updated Tests ------------- The majority of test cases were updated to use a different triple that does not include the Darwin ABI. Many tests were also updated to use FileCheck, in place of grep. Deleted Tests ------------- llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test specific functionality of dsymutil using an object file created with an old version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he suggested removing the test. llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a PPC test to a SystemZ test, as the behavior is also reproducible there. All other tests that were deleted were specific to the darwin/ppc ABI and no longer necessary. Phabricator Review: https://reviews.llvm.org/D50988 llvm-svn: 340795	2018-08-28 01:18:29 +00:00
David Blaikie	7d30653259	Revert "[CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC)" This causes crashes due to the interleaved dbg.value intrinsics being left at the end of basic blocks, causing the actual terminators (br, etc) to be not where they should be (not at the end of the block), leading to later crashes. Further discussion on the original commit thread. This reverts commit r340368. llvm-svn: 340794	2018-08-28 00:55:19 +00:00
George Burgess IV	6a9aa02ff3	[MemorySSA] Add NDEBUG checks to verifiers; NFC verify*() methods are intended to have no side-effects (unless we detect broken MSSA, in which case they assert()), and all of the other verify methods are wrapped by `#ifndef NDEBUG`. llvm-svn: 340793	2018-08-28 00:32:32 +00:00

1 2 3 4 5 ...

116198 Commits