llvm-project

Commit Graph

Author	SHA1	Message	Date
Leonard Chan	5652f35817	[NewPM] Port Sancov This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty. Changes: - Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over functions and ModuleSanitizerCoverage for passing over modules. - ModuleSanitizerCoverage exists for adding 2 module level calls to initialization functions but only if there's a function that was instrumented by sancov. - Added legacy and new PM wrapper classes that own instances of the 2 new classes. - Update llvm tests and add clang tests. Differential Revision: https://reviews.llvm.org/D62888 llvm-svn: 365838	2019-07-11 22:35:40 +00:00
Stanislav Mekhanoshin	28550c8680	[AMDGPU] Fixed asan error with agpr spilling Instruction was used after it was erased. llvm-svn: 365837	2019-07-11 22:30:11 +00:00
Diego Novillo	a35a7d49e5	Fix build errors LLVM tests are disabled. Original patch from alanbaker@google.com Fixes the error: CMake Error in <...>/llvm/cmake/modules/CMakeLists.txt: export called with target "LLVMTestingSupport" which requires target "gtest" that is not in the export set. This occurs when LLVM is embedded in a larger project, but is configured not to include tests. If testing is disabled gtest isn't available and LLVM fails to configure. Differential revision: https://reviews.llvm.org/D63097 llvm-svn: 365836	2019-07-11 22:08:35 +00:00
Stanislav Mekhanoshin	937ff6e701	[AMDGPU] gfx908 agpr spilling Differential Revision: https://reviews.llvm.org/D64594 llvm-svn: 365833	2019-07-11 21:54:13 +00:00
Stefan Stipanovic	0626367202	[Attributor] Deduce "nosync" function attribute. Introduce and deduce "nosync" function attribute to indicate that a function does not synchronize with another thread in a way that other thread might free memory. Reviewers: jdoerfert, jfb, nhaehnle, arsenm Subscribers: wdng, hfinkel, nhaenhle, mehdi_amini, steven_wu, dexonsmith, arsenm, uenoku, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D62766 llvm-svn: 365830	2019-07-11 21:37:40 +00:00
Stanislav Mekhanoshin	7d2019bb96	[AMDGPU] gfx908 hazard recognizer Differential Revision: https://reviews.llvm.org/D64593 llvm-svn: 365829	2019-07-11 21:30:34 +00:00
Stanislav Mekhanoshin	b83e283e65	[AMDGPU] gfx908 scheduling Differential Revision: https://reviews.llvm.org/D64590 llvm-svn: 365826	2019-07-11 21:25:00 +00:00
Stanislav Mekhanoshin	e67cc380a8	[AMDGPU] gfx908 mfma support Differential Revision: https://reviews.llvm.org/D64584 llvm-svn: 365824	2019-07-11 21:19:33 +00:00
Reid Kleckner	f002fcb2ad	Open native file handles to avoid converting from FDs, NFC Follow up to r365588. llvm-svn: 365820	2019-07-11 20:29:32 +00:00
Wouter van Oortmerssen	a617967d68	[WebAssembly] Assembler: support negative float constants. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64367 llvm-svn: 365802	2019-07-11 18:18:07 +00:00
Benjamin Kramer	fa1a4e4de5	[NVPTX] Use atomicrmw fadd instead of intrinsics AutoUpgrade the old intrinsics to atomicrmw fadd. llvm-svn: 365796	2019-07-11 17:11:25 +00:00
Sanjay Patel	5cc7c9ab93	[X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) Follow up to D58597, where it was noted that the commuted ISD::SUB variant was having problems with lack of combines. See also D63958 where we untangled setcc/sub pairs. Differential Revision: https://reviews.llvm.org/D58875 llvm-svn: 365791	2019-07-11 15:56:33 +00:00
Nico Weber	96dff91998	Fix a few 'no newline at end of file' warnings that Xcode emits (Xcode even has a snazzy "Fix" button, but clicking that inserts two newlines. So close!) llvm-svn: 365789	2019-07-11 15:26:45 +00:00
Simon Pilgrim	d0307f93a7	[DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y). This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well. In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops. The fast-isel-store.ll load folding regression is annoying but I don't think is that critical. Differential Revision: https://reviews.llvm.org/D63653 llvm-svn: 365785	2019-07-11 14:45:03 +00:00
Matt Arsenault	6eb8ae8f17	RegUsageInfoCollector: Skip calling conventions I missed before llvm-svn: 365784	2019-07-11 14:41:40 +00:00
Matt Arsenault	b725d27350	AMDGPU/GlobalISel: Move kernel argument handling to separate function llvm-svn: 365782	2019-07-11 14:18:25 +00:00
Matt Arsenault	7e71902b79	GlobalISel: Use Register llvm-svn: 365780	2019-07-11 14:18:19 +00:00
Sanjay Patel	3487791fea	[InstCombine] don't move FP negation out of a constant expression -(X * ConstExpr) becomes X * (-ConstExpr), so don't reverse that and infinite loop. llvm-svn: 365774	2019-07-11 13:44:29 +00:00
Tim Northover	67828edbbd	OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC. llvm-svn: 365770	2019-07-11 13:13:02 +00:00
Tim Northover	f2d6597653	OpaquePtr: use byval accessor instead of inspecting pointer type. NFC. The accessor can deal with both "byval(ty)" and "ty* byval" forms seamlessly. llvm-svn: 365769	2019-07-11 13:12:38 +00:00
Tim Northover	27658ed512	OpaquePtr: use load instruction directly for type. NFC. llvm-svn: 365768	2019-07-11 13:12:08 +00:00
Tim Northover	030bb3d363	InstructionSimplify: Simplify InstructionSimplify. NFC. The interface predates CallBase, so both it and implementation were significantly more complicated than they needed to be. There was even some redundancy that could be eliminated. Should also help with OpaquePointers by not trying to derive a function's type from its PointerType. llvm-svn: 365767	2019-07-11 13:11:44 +00:00
George Rimar	eb41f7f081	[yaml2obj] - Allow overriding the sh_size field. There is currently no way to set a broken sh_size field for sections. It can be useful for writing test cases. Differential revision: https://reviews.llvm.org/D64401 llvm-svn: 365766	2019-07-11 12:59:29 +00:00
David Bolvansky	e23be09e66	[InstCombine] Reorder recently added/improved pow transformations Changed cases are now faster with exp2. llvm-svn: 365758	2019-07-11 10:55:04 +00:00
Florian Hahn	3b9994615f	Revert [BitcodeReader] Validate OpNum, before accessing Record array. This reverts r365750 (git commit `8b222ecf27`) llvm-dis runs out of memory while opening invalid-fcmp-opnum.bc on llvm-hexagon-elf, probably because the bitcode file contains other suspicious values. http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/21949 llvm-svn: 365757	2019-07-11 10:53:40 +00:00
Fangrui Song	6dc5962957	[llvm-objcopy] Don't change permissions of non-regular output files There is currently an EPERM error when a regular user executes `llvm-objcopy a.o /dev/null`. Worse, root can even change the mode bits of /dev/null. Fix it by checking if the output file is special. A new overload of llvm::sys::fs::setPermissions with FD as the parameter is added. Users should provide `perm & ~umask` as the parameter if they intend to respect umask. The existing overload of llvm::sys::fs::setPermissions may be deleted if we can find an implementation of fchmod() on Windows. fchmod() is usually better than chmod() because it saves syscalls and can avoid race condition. Reviewed By: jakehehrlich, jhenderson Differential Revision: https://reviews.llvm.org/D64236 llvm-svn: 365753	2019-07-11 10:17:59 +00:00
Fangrui Song	f9ca13cb5f	[X86] -fno-plt: use GOT __tls_get_addr only if GOTPCRELX is enabled Summary: As of binutils 2.32, ld has a bogus TLS relaxation error when the GD/LD code sequence using R_X86_64_GOTPCREL (instead of R_X86_64_GOTPCRELX) is attempted to be relaxed to IE/LE (binutils PR24784). gold and lld are good. In gcc/config/i386/i386.md, there is a configure-time check of as/ld support and the GOT relaxation will not be used if as/ld doesn't support it: if (flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT) return "call\t%P2"; return "call\t{*%p2@GOT(%1)|[DWORD PTR %p2@GOT[%1]]}"; In clang, -DENABLE_X86_RELAX_RELOCATIONS=OFF is the default. The ld.bfd bogus error can be reproduced with: thread_local int a; int main() { return a; } clang -fno-plt -fpic a.cc -fuse-ld=bfd GOTPCRELX gained relatively good support in 2016, which is considered relatively new. It is even difficult to conditionally default to -DENABLE_X86_RELAX_RELOCATIONS=ON due to cross compilation reasons. So work around the ld.bfd bug by only using GOT when GOTPCRELX is enabled. Reviewers: dalias, hjl.tools, nikic, rnk Reviewed By: nikic Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64304 llvm-svn: 365752	2019-07-11 10:10:09 +00:00
Florian Hahn	8b222ecf27	[BitcodeReader] Validate OpNum, before accessing Record array. Currently invalid bitcode files can cause a crash, when OpNum exceeds the number of elements in Record, like in the attached bitcode file. The test case was generated by clusterfuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15698 Reviewers: t.p.northover, thegameg, jfb Reviewed By: jfb Differential Revision: https://reviews.llvm.org/D64507 llvm-svn: 365750	2019-07-11 09:57:00 +00:00
Sam Parker	08b4a8da07	[ARM][LowOverheadLoops] Correct offset checking This patch addresses a couple of problems: 1) The maximum supported offset of LE is -4094. 2) The offset of WLS also needs to be checked, this uses a maximum positive offset of 4094. The use of BasicBlockUtils has been changed because the block offsets weren't being initialised, but the isBBInRange checks both positive and negative offsets. ARMISelLowering has been tweaked because the test case presented another pattern that we weren't supporting. llvm-svn: 365749	2019-07-11 09:56:15 +00:00
Simon Tatham	7916198a41	[ARM] Remove nonexistent unsigned forms of MVE VQDMLAH. The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't actually exist: the Armv8.1-M architecture spec only lists signed forms of that instruction. The unsigned ones were added in error: they existed in an early draft of the spec, but they were removed before the public version, and we missed that particular spec change. Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH. Reviewers: miyuki Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64502 llvm-svn: 365747	2019-07-11 09:52:15 +00:00
Petar Avramovic	962524070a	[MIPS GlobalISel] Skip copies in addUseDef and addDefUses Skip copies between virtual registers during search for UseDefs and DefUses. Since each operand has one def search for UseDefs is straightforward. But since operand can have many uses, we have to check all uses of each copy we traverse during search for DefUses. Differential Revision: https://reviews.llvm.org/D64486 llvm-svn: 365744	2019-07-11 09:28:34 +00:00
Petar Avramovic	e3bb0a72b6	[MIPS GlobalISel] RegBankSelect for chains of ambiguous instructions When one of the uses/defs of an ambiguous instruction is also ambiguous, visit it recursively and search its uses/defs for an instruction with only one mapping available. When all instructions in a chain are ambiguous, an arbitrary mapping can be selected. For s64 operands in an ambiguous chain fprb is selected since it results in fewer instructions than having to narrow scalar s64 to s32. For s32 both gprb and fprb result in the same number of instructions and gprb is selected as the general-purpose option. At the moment we always avoid cross register bank copies. TODO: Implement a model for cost calculations of different mappings on the same instruction and cross bank copies. Allow cross bank copies when appropriate according to the cost model. Differential Revision: https://reviews.llvm.org/D64485 llvm-svn: 365743	2019-07-11 09:22:49 +00:00
Haojian Wu	e6695821e5	Revert Recommit "[CommandLine] Remove OptionCategory and SubCommand caches from the Option class." This reverts r365675 (git commit `43d75f9778`) The patch causes a crash in SupportTests (CommandLineTest.AliasesWithArguments). llvm-svn: 365742	2019-07-11 08:54:28 +00:00
Jay Foad	c1b7db9eda	Remove some redundant code from r290372 and improve a comment. llvm-svn: 365741	2019-07-11 08:49:52 +00:00
Sam Parker	85ad78b1cf	[ARM][ParallelDSP] Change the search for smlads Two functional changes have been made here: - Now search up from any add instruction to find the chains of operations that we may turn into a smlad. This allows the generation of a smlad which doesn't accumulate into a phi. - The search function has been corrected to stop it falsely searching up through an invalid path. The bulk of the changes have been making the Reduction struct a class and making it more C++y with getters and setters. Differential Revision: https://reviews.llvm.org/D61780 llvm-svn: 365740	2019-07-11 07:47:50 +00:00
Heejin Ahn	54c136bbdf	[WebAssembly] Print error message for llvm.clear_cache intrinsic Summary: Wasm does not currently support `llvm.clear_cache` intrinsic, and this prints a proper error message instead of segfault. Reviewers: dschuff, sbc100, sunfish Subscribers: jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64322 llvm-svn: 365731	2019-07-11 05:55:47 +00:00
Chen Zheng	627095ec5b	[SCEV] teach SCEV symbolical execution about overflow intrinsics folding. Differential Revision: https://reviews.llvm.org/D64422 llvm-svn: 365726	2019-07-11 02:18:22 +00:00
Johannes Doerfert	3ed286a388	Replace three "strip & accumulate" implementations with a single one This patch replaces the three almost identical "strip & accumulate" implementations for constant pointer offsets with a single one, combining the respective functionalities. The old interfaces are kept for now. Differential Revision: https://reviews.llvm.org/D64468 llvm-svn: 365723	2019-07-11 01:14:48 +00:00
Craig Topper	88729e3dec	[X86] Don't convert 8 or 16 bit ADDs to LEAs on Atom in FixupLEAPass. We use the functions that convert to three address to do the conversion, but changing an 8 or 16 bit will cause it to create a virtual register. This can't be done after register allocation where this pass runs. I've switched the pass completely to a white list of instructions that can be converted to LEA instead of a blacklist that was incorrect. This will avoid surprises if we enhance the three address conversion function to include additional instructions in the future. Fixes PR42565. llvm-svn: 365720	2019-07-11 01:01:39 +00:00
Stanislav Mekhanoshin	e93279fd1b	[AMDGPU] gfx908 atomic fadd and atomic pk_fadd Differential Revision: https://reviews.llvm.org/D64435 llvm-svn: 365717	2019-07-11 00:10:17 +00:00
Stanislav Mekhanoshin	c0ae1be066	[AMDGPU] gfx908 dot instruction support Differential Revision: https://reviews.llvm.org/D64431 llvm-svn: 365715	2019-07-11 00:00:27 +00:00
Sanjay Patel	138328e45c	[SDAG] commute setcc operands to match a subtract If we have: R = sub X, Y P = cmp Y, X ...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags. Motivated by diffs in D58875 - not sure if this changes anything there, but this seems like a good thing independent of that. There's a more involved version of this transform already in IR (in instcombine although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()". Differential Revision: https://reviews.llvm.org/D63958 llvm-svn: 365711	2019-07-10 23:23:54 +00:00
Vitaly Buka	d03bd1db59	NFC: Pass DataLayout into isBytewiseValue Summary: We will need to handle IntToPtr which I will submit in a separate patch as it's not going to be NFC. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63940 llvm-svn: 365709	2019-07-10 22:53:52 +00:00
Craig Topper	1c327c7e0a	[X86] Add patterns with and_flag_nocf for BLSI and TBM instructions. Fixes similar issues to r352306. llvm-svn: 365705	2019-07-10 22:44:32 +00:00
Craig Topper	d916f23b83	[X86] Add BLSR and BLSMSK to isUseDefConvertible. Unfortunately subo formation in CGP prevents obvious ways of testing this. But we already have BLSI in here and the flag behavior is well understood. Might become more useful if we improve PR42571. llvm-svn: 365702	2019-07-10 22:14:39 +00:00
David Tenty	a2681296e0	[NFC]Fix IR/MC dependency issue for function descriptor SDAG implementation Summary: llvm/IR/GlobalValue.h can't be included in MC, that creates a circular dependency between MC and IR libraries. This circular dependency is causing an issue for build systems that enforce layering. Author: Xiangling_L Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr Reviewed By: gribozavr Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64445 llvm-svn: 365701	2019-07-10 22:13:55 +00:00
Craig Topper	021ba49b31	[X86] Remove unused variable. NFC llvm-svn: 365697	2019-07-10 21:01:34 +00:00
Amara Emerson	7a4d2df04a	[AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values. Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when we try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases. This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction. The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0. Differential Revision: https://reviews.llvm.org/D64377 llvm-svn: 365690	2019-07-10 19:21:43 +00:00
Nikola Prica	bbfa4cf70b	Revert "[ELF] Loose a condition for relocation with a symbol" This reverts commit 8507eca1647118e73435b0ce1de8a1952a021d01. Reverting due to some suspicious failures in sanitizer-x86_64-linux. llvm-svn: 365685	2019-07-10 18:58:05 +00:00
Jessica Paquette	7c95925b13	[GlobalISel][AArch64] Use getOpcodeDef instead of findMIFromReg Some minor cleanup. This function in Utils does the same thing as `findMIFromReg`. It also looks through copies, which `findMIFromReg` didn't. Delete `findMIFromReg` and use `getOpcodeDef` instead. This only happens in `tryOptVectorDup` right now. Update opt-shuffle-splat to show that we can look through the copies now, too. Differential Revision: https://reviews.llvm.org/D64520 llvm-svn: 365684	2019-07-10 18:46:56 +00:00
Jessica Paquette	3132968ae9	[GlobalISel][AArch64][NFC] Use getDefIgnoringCopies from Utils where we can There are a few places where we walk over copies throughout AArch64InstructionSelector.cpp. In Utils, there's a function that does exactly this which we can use instead. Note that the utility function works with the case where we run into a COPY from a physical register. We've run into bugs with this a couple times, so using it should defend us from similar future bugs. Also update opt-fold-compare.mir to show that we still handle physical registers properly. Differential Revision: https://reviews.llvm.org/D64513 llvm-svn: 365683	2019-07-10 18:44:57 +00:00
David Greene	d300a493df	Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces" This broke some PPC prefetching tests. This reverts commit `9fdfb045ae`. llvm-svn: 365680	2019-07-10 18:25:58 +00:00
Michael Berg	f4572249d7	Move three folds for FADD, FSUB and FMUL in the DAG combiner away from Unsafe to more aligned checks that reflect context Summary: Unsafe does not map well alone for each of these three cases as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handling nan and zero contexts directly (NoNan, NSZ) and some tests with it. Unsafe does include NSZ, however there is already precedent for using the target option directly to reflect that context. Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm Reviewed By: arsenm Subscribers: michele.scandale, wdng, javed.absar Differential Revision: https://reviews.llvm.org/D64450 llvm-svn: 365679	2019-07-10 18:23:26 +00:00
David Greene	9fdfb045ae	[System Model] [TTI] Update cache and prefetch TTI interfaces Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Adding a default "no information" subtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default subtarget implementation essentially returns "no information" for these interfaces. None of the existing users of TTI will hit that implementation because they define their own custom TTI implementations and won't use the BasicTTIImpl implementations. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 365676	2019-07-10 18:07:01 +00:00
Don Hinton	43d75f9778	Recommit "[CommandLine] Remove OptionCategory and SubCommand caches from the Option class." Previously reverted in 364141 due to buildbot breakage, and fixed here by making GeneralCategory global a ManagedStatic. Summary: This change processes `OptionCategory`s and `SubCommand`s as they are seen instead of caching them in the Option class and processing them later. Doing so simplifies the work needed to be done by the Global parser and significantly reduces the size of the Option class to a mere 64 bytes. Removing the `OptionCategory` cache saved 24 bytes, and removing the `SubCommand` cache saved an additional 48 bytes, for a total of a 72 byte reduction. Reviewed By: serge-sans-paille Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D62105 llvm-svn: 365675	2019-07-10 17:57:05 +00:00
Simon Pilgrim	5dd2af5248	[X86] EltsFromConsecutiveLoads - clean up element size calcs. NFCI. Determine the element/load size calculations earlier and assert that they are whole bytes in size. llvm-svn: 365674	2019-07-10 17:49:27 +00:00
Alina Sbirlea	58a37754bb	[LoopRotate + MemorySSA] Keep an <instruction-cloned instruction> map. Summary: The map kept in loop rotate is used for instruction remapping, in order to simplify the clones of instructions. Thus, if an instruction can be simplified, its simplified value is placed in the map, even when the clone is added to the IR. MemorySSA in contrast needs to know about that clone, so it can add an access for it. To resolve this: keep a different map for MemorySSA. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63680 llvm-svn: 365672	2019-07-10 17:36:56 +00:00
Lang Hames	843f198a83	[ORC] Add custom IR compiler configuration to LLJITBuilder to enable obj caches. LLJITBuilder now has a setCompileFunctionCreator method which can be used to construct a CompileFunction for the LLJIT instance being created. The motivating use-case for this is supporting ObjectCaches, which can now be set up at compile-function construction time. To demonstrate this an example project, LLJITWithObjectCache, is included. llvm-svn: 365671	2019-07-10 17:24:24 +00:00
Nick Desaulniers	8728e45706	[TargetLowering] support BlockAddress as "i" inline asm constraint Summary: This allows passing address of labels to inline assembly "i" input constraints. Fixes pr/42502. Reviewers: ostannard Reviewed By: ostannard Subscribers: void, echristo, nathanchance, ostannard, javed.absar, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D64167 llvm-svn: 365664	2019-07-10 17:08:25 +00:00
Peter Collingbourne	893f8d719c	MC: AArch64: Add support for pg_hi21_nc relocation specifier. Differential Revision: https://reviews.llvm.org/D64455 llvm-svn: 365661	2019-07-10 16:36:46 +00:00
Vedant Kumar	5eb6ba060a	[CodeExtractor] Fix sinking of allocas with multiple bitcast uses (PR42451) An alloca which can be sunk into the extraction region may have more than one bitcast use. Move these uses along with the alloca to prevent use-before-def. Testing: check-llvm, stage2 build of clang Fixes llvm.org/PR42451. Differential Revision: https://reviews.llvm.org/D64463 llvm-svn: 365660	2019-07-10 16:32:20 +00:00
Vedant Kumar	f65f302cc7	[CodeExtractor] Simplify findAllocas, NFC Split getLifetimeMarkers out into its own method and have it return a struct. Differential Revision: https://reviews.llvm.org/D64467 llvm-svn: 365659	2019-07-10 16:32:16 +00:00
Matt Arsenault	6ce1b4fec5	GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM llvm-svn: 365658	2019-07-10 16:31:19 +00:00
Matt Arsenault	e595a2c964	GlobalISel: Define the full family of FP min/max instructions llvm-svn: 365657	2019-07-10 16:31:15 +00:00
Simon Pilgrim	093f4aa72f	[X86] EltsFromConsecutiveLoads - remove duplicate check for element size. NFCI. We've already checked that each element is the correct contributory size for VT when we inspect the elements for Undef/Zero/Load. llvm-svn: 365656	2019-07-10 16:22:31 +00:00
Simon Pilgrim	893448a3e4	[X86] EltsFromConsecutiveLoads - ensure element reg/store sizes are the same size. NFCI. This renames the type so it doesn't sound like its based off the load size - as we're moving towards supporting combining loads of different sizes. llvm-svn: 365655	2019-07-10 16:14:26 +00:00
Matt Arsenault	58426a3707	AMDGPU: Serialize mode from MachineFunctionInfo llvm-svn: 365653	2019-07-10 16:09:26 +00:00
Roman Lebedev	c5f92bd67b	[PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate As discussed in the original review, this may be useful, so let's just do it. llvm-svn: 365652	2019-07-10 16:07:35 +00:00
Francis Visoiu Mistrih	3700736aa8	[Remarks] Add cl::Hidden to -remarks-yaml-string-table It was showing up in a lot of unrelated tools. llvm-svn: 365647	2019-07-10 15:46:36 +00:00
Jay Foad	bba37e89a5	[AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32 Summary: D59191 added support for these modifiers in the assembler and disassembler. This patch just teaches instruction selection that it can use them. Reviewers: arsenm, tstellar Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64497 llvm-svn: 365640	2019-07-10 14:53:47 +00:00
David Bolvansky	0735cc1954	[InstCombine] pow(C,x) -> exp2(log2(C)x) Summary: Transform pow(C,x) To exp2(log2(C)x) if C > 0, C != inf, C != NaN (and C is not power of 2, since we have some fold for such case already). log(C) is folded by the compiler and exp2 is much faster to compute than pow. Reviewers: spatel, efriedma, evandro Reviewed By: evandro Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64099 llvm-svn: 365637	2019-07-10 14:43:27 +00:00
Simon Pilgrim	0a9479ef39	[X86] EltsFromConsecutiveLoads - cleanup Zero/Undef/Load element collection. NFCI. llvm-svn: 365628	2019-07-10 13:28:13 +00:00
Petar Avramovic	7d0778ea6b	[MIPS GlobalISel] Select float and double phi Select float and double phi for MIPS32. Differential Revision: https://reviews.llvm.org/D64420 llvm-svn: 365627	2019-07-10 13:18:13 +00:00
Petar Avramovic	7b31491ae2	[MIPS GlobalISel] Select float and double load and store Select float and double load and store for MIPS32. Differential Revision: https://reviews.llvm.org/D64419 llvm-svn: 365626	2019-07-10 12:55:21 +00:00
Thomas Preud'homme	2bf04f25ff	[FileCheck] Simplify numeric variable interface Summary: This patch simplifies 2 aspects in the FileCheckNumericVariable code. First, setValue() method is turned into a void function since being called only on undefined variable is an invariant and is now asserted rather than returned. This removes the assert from the callers. Second, clearValue() method is also turned into a void function since the only caller does not check its return value since it may be trying to clear the value of variable that is already cleared without this being noteworthy. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64231 > llvm-svn: 365249 llvm-svn: 365625	2019-07-10 12:49:28 +00:00
Thomas Preud'homme	f6ea43b8b3	[FileCheck] Fix @LINE value after match failure Summary: The value of the FileCheckNumericVariable class instance representing the @LINE numeric variable is set and cleared respectively before and after substitutions are made, if any. However, when a substitution fails, the value is not cleared. This causes the next substitution of @LINE later on to give the wrong value since setValue is a nop if the value is already set. This is what caused failures after commit r365249. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D64449 llvm-svn: 365624	2019-07-10 12:49:17 +00:00
Sam Parker	775b2f598a	[NFC][ARM] Convert lambdas to static helpers Break up and convert some of the lambdas in ARMLowOverheadLoops into static functions. llvm-svn: 365623	2019-07-10 12:29:43 +00:00
Simon Pilgrim	ef1aac3191	[X86] EltsFromConsecutiveLoads - LDBase is non-null. NFCI. Don't bother checking for LDBase != null - it should be (and we assert that it is). llvm-svn: 365622	2019-07-10 12:22:59 +00:00
Simon Pilgrim	94c84aca5d	[DAGCombine] visitINSERT_SUBVECTOR - use uint64_t subvector index. NFCI. Keep the uint64_t type from getZExtValue() to stop truncation/extension overflow warnings in MSVC in subvector index math. llvm-svn: 365621	2019-07-10 12:21:35 +00:00
Simon Pilgrim	c972193583	[X86] EltsFromConsecutiveLoads - store Loads on a per-element basis. NFCI. Cache the LoadSDNode nodes so we can easily map to/from the element index instead of packing them together - this will be useful for future patches for PR16739 etc. llvm-svn: 365620	2019-07-10 11:26:57 +00:00
Nikola Prica	fb163b4b20	[ELF] Loose a condition for relocation with a symbol Deleted code was introduced as a work around for a bug in the gold linker (http://sourceware.org/PR16794). Test case that was given as a reason for this part of code, the one on previous link, now works for the gold. This condition is too strict and when a code is compiled with debug info it forces generation of numerous relocations with symbol for architectures that do not have relocation addend. Reviewers: arsenm, espindola Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D64327 llvm-svn: 365618	2019-07-10 11:17:48 +00:00
Simon Pilgrim	6a58583951	[X86][SSE] EltsFromConsecutiveLoads - add basic dereferenceable support This patch checks to see if the vector element loads are based off a dereferenceable pointer that covers the entire vector width, in which case we don't need to have element loads at both extremes of the vector width - just the start (base pointer) of it. Another step towards partial vector loads...... Differential Revision: https://reviews.llvm.org/D64205 llvm-svn: 365614	2019-07-10 10:46:36 +00:00
Simon Pilgrim	bb1167a3a1	Fix const/non-const lambda return type warning. NFCI. llvm-svn: 365613	2019-07-10 10:45:09 +00:00
Simon Pilgrim	988925c127	Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 365612	2019-07-10 10:34:44 +00:00
Serguei Katkov	d000f8b69f	[SimpleLoopUnswitch] Don't consider unswitching `switch` instructions with one unique successor Only instructions with two or more unique successors should be considered for unswitching. Patch Author: Daniil Suchkov. Reviewers: reames, asbirlea, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64404 llvm-svn: 365611	2019-07-10 10:25:22 +00:00
Mikhail Maltsev	ed143c5d59	[ARM] Enable VPUSH/VPOP aliases when either MVE or VFP is present Summary: Use the same predicates as VSTMDB/VLDMIA since VPUSH/VPOP alias to these. Patch by Momchil Velikov. Reviewers: ostannard, simon_tatham, SjoerdMeijer, samparker, t.p.northover, dmgreen Reviewed By: dmgreen Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64413 llvm-svn: 365604	2019-07-10 08:59:17 +00:00
Craig Topper	50f70de557	[X86] Limit getTargetConstantFromNode to only work on NormalLoads not extending loads. This seems to fix a failure reported by Jordan Rupprecht, but we don't have a reduced test case yet. llvm-svn: 365589	2019-07-10 00:40:01 +00:00
Reid Kleckner	cc418a3af4	[Support] Move llvm::MemoryBuffer to sys::fs::file_t Summary: On Windows, Posix integer file descriptors are a compatibility layer over native file handles provided by the C runtime. There is a hard limit on the maximum number of file descriptors that a process can open, and the limit is 8192. LLD typically doesn't run into this limit because it opens input files, maps them into memory, and then immediately closes the file descriptor. This prevents it from running out of FDs. For various reasons, I'd like to open handles to every input file and keep them open during linking. That requires migrating MemoryBuffer over to taking open native file handles instead of integer FDs. Reviewers: aganea, Bigcheese Reviewed By: aganea Subscribers: smeenai, silvas, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, llvm-commits, zturner Tags: #llvm Differential Revision: https://reviews.llvm.org/D63453 llvm-svn: 365588	2019-07-10 00:34:13 +00:00
Tom Stellard	d0ba79fe7b	AMDGPU/GlobalISel: Add support for wide loads >= 256-bits Summary: This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64> Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57399 llvm-svn: 365586	2019-07-10 00:22:41 +00:00
Matt Arsenault	b1843e130a	GlobalISel: Implement lower for G_FCOPYSIGN In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583	2019-07-09 23:34:29 +00:00
Francis Visoiu Mistrih	1a697aa607	[Bitcode] Explicitly include Bitstream/BitCodes.h and BitstreamWriter.h This fixes a modules issue. llvm-svn: 365580	2019-07-09 23:20:01 +00:00
Craig Topper	1ae60797cd	[X86] Don't form extloads in combineExtInVec unless the load extension is legal. This should prevent doing this on pre-sse4.1 targets or for 256 bit vectors without avx2. I don't know of a failure from this. Op legalization will probably take care of, but seemed better to be safe. llvm-svn: 365577	2019-07-09 23:05:54 +00:00
Matt Arsenault	3f1a34546c	AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTOR llvm-svn: 365575	2019-07-09 22:48:04 +00:00
Stanislav Mekhanoshin	1e9eae95af	[AMDGPU] gfx908 v_pk_fmac_f16 support Differential Revision: https://reviews.llvm.org/D64433 llvm-svn: 365573	2019-07-09 22:42:24 +00:00
Matt Arsenault	14a4495155	GlobalISel: Combine unmerge of merge with intermediate cast This eliminates some illegal intermediate vectors when operations are scalarized. llvm-svn: 365566	2019-07-09 22:19:13 +00:00
Vedant Kumar	d6c15b661a	[Profile] Support raw/indexed profiles larger than 4GB rdar://45955976 llvm-svn: 365565	2019-07-09 22:01:04 +00:00
Stanislav Mekhanoshin	50d7f46460	[AMDGPU] gfx908 mAI instructions, MC part Differential Revision: https://reviews.llvm.org/D64446 llvm-svn: 365563	2019-07-09 21:43:09 +00:00
Nikita Popov	5ca39e828c	[SLP] Optimize getSpillCost(); NFCI For a given set of live values, the spill cost will always be the same for each call. Compute the cost once and multiply it by the number of calls. (I'm not sure this spill cost modeling makes sense if there are multiple calls, as the spill cost will likely be shared across calls in that case. But that's how it currently works.) llvm-svn: 365552	2019-07-09 20:24:44 +00:00
Peter Collingbourne	1366262b74	hwasan: Improve precision of checks using short granule tags. A short granule is a granule of size between 1 and `TG-1` bytes. The size of a short granule is stored at the location in shadow memory where the granule's tag is normally stored, while the granule's actual tag is stored in the last byte of the granule. This means that in order to verify that a pointer tag matches a memory tag, HWASAN must check for two possibilities: * the pointer tag is equal to the memory tag in shadow memory, or * the shadow memory tag is actually a short granule size, the value being loaded is in bounds of the granule and the pointer tag is equal to the last byte of the granule. Pointer tags between 1 to `TG-1` are possible and are as likely as any other tag. This means that these tags in memory have two interpretations: the full tag interpretation (where the pointer tag is between 1 and `TG-1` and the last byte of the granule is ordinary data) and the short tag interpretation (where the pointer tag is stored in the granule). When HWASAN detects an error near a memory tag between 1 and `TG-1`, it will show both the memory tag and the last byte of the granule. Currently, it is up to the user to disambiguate the two possibilities. Because this functionality obsoletes the right aligned heap feature of the HWASAN memory allocator (and because we can no longer easily test it), the feature is removed. Also update the documentation to cover both short granule tags and outlined checks. Differential Revision: https://reviews.llvm.org/D63908 llvm-svn: 365551	2019-07-09 20:22:36 +00:00
Philip Reames	a6548d0437	[PoisonChecking] Flesh out complete todo list for full coverage Note: I don't actually plan to implement all of the cases at the moment, I'm just documenting them for completeness. There's a couple of cases left which are practically useful for me in debugging loop transforms, and I'll probably stop there for the moment. llvm-svn: 365550	2019-07-09 19:59:39 +00:00
Craig Topper	84a1f07363	[X86][AMDGPU][DAGCombiner] Move call to allowsMemoryAccess into isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it Basically the problem is that X86 doesn't set the Fast flag from allowsMemoryAccess on certain CPUs due to slow unaligned memory subtarget features. This prevents bitcasts from being folded into loads and stores. But all vector loads and stores of the same width are the same cost on X86. This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it. Differential Revision: https://reviews.llvm.org/D64295 llvm-svn: 365549	2019-07-09 19:55:28 +00:00
Reid Kleckner	c236eeaf7d	Fix build error for VC STL, use llvm::make_unique llvm-svn: 365548	2019-07-09 19:51:58 +00:00
Stanislav Mekhanoshin	9e77d0c6df	[AMDGPU] gfx908 register file changes Differential Revision: https://reviews.llvm.org/D64438 llvm-svn: 365546	2019-07-09 19:41:51 +00:00
Philip Reames	3dbd7e98d8	[PoisonCheker] Support for out of bounds operands on shifts + insert/extractelement These are sources of poison which don't come from flags, but are clearly documented in the LangRef. Left off support for scalable vectors for the moment, but should be easy to add if anyone is interested. llvm-svn: 365543	2019-07-09 19:26:12 +00:00
Sean Fertile	f09d54ed2a	Boilerplate for producing XCOFF object files from the PowerPC backend. Stubs out a number of the classes needed to produce a new object file format (XCOFF) for the powerpc-aix target. For testing input is an empty module which produces an object file with just a file header. Differential Revision: https://reviews.llvm.org/D61694 llvm-svn: 365541	2019-07-09 19:21:01 +00:00
Simon Pilgrim	294f37561a	[X86] LowerToHorizontalOp - use count_if to count non-UNDEF ops. NFCI. llvm-svn: 365540	2019-07-09 19:19:17 +00:00
Philip Reames	3b38b92541	[PoisonChecking] Add validation rules for "exact" on sdiv/udiv As directly stated in the LangRef, no ambiguity here... llvm-svn: 365538	2019-07-09 18:56:41 +00:00
Bob Haarman	6a4c2e4f0a	[ThinLTO] only emit used or referenced CFI records to index Summary: We emit CFI_FUNCTION_DEFS and CFI_FUNCTION_DECLS to distributed ThinLTO indices to implement indirect function call checking. This change causes us to only emit entries for functions that are either defined or used by the module we're writing the index for (instead of all functions in the combined index), which can make the indices substantially smaller. Fixes PR42378. Reviewers: pcc, vitalybuka, eugenis Subscribers: mehdi_amini, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63887 llvm-svn: 365537	2019-07-09 18:50:55 +00:00
Philip Reames	f47a313e71	Add a transform pass to make the executable semantics of poison explicit in the IR Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language. The target audience for this tool is developers working on or targeting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program. At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there are a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs, which involved UB being triggered by integer induction variables with nsw/nuw flags, would be reported by the current code. (See comment in PoisonChecking.cpp for full explanation and context) Differential Revision: https://reviews.llvm.org/D64215 llvm-svn: 365536	2019-07-09 18:49:29 +00:00
Sean Fertile	210314ae8c	Try to appease the Windows build bots. Several of the conditional operators committed in llvm-svn: 365524 fail to compile on the Windows buildbots. Converting to an if and early return to try to fix. llvm-svn: 365535	2019-07-09 18:44:28 +00:00
Yonghong Song	a1b2a27a38	[BPF] Fix a typo in the file name Fixed the file name from BPFAbstrctMemberAccess.cpp to BPFAbstractMemberAccess.cpp. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 365532	2019-07-09 18:35:46 +00:00
Stanislav Mekhanoshin	22b2c3d651	[AMDGPU] gfx908 target Differential Revision: https://reviews.llvm.org/D64429 llvm-svn: 365525	2019-07-09 18:10:06 +00:00
Sean Fertile	837ae69f8b	[Object][XCOFF] Add support for 64-bit file header and section header dumping. Adds a readobj dumper for 32-bit and 64-bit section header tables, and extends support for the file-header dumping to include 64-bit object files. Also refactors the binary file parsing to be done in a helper function in an attempt to clean up error handling. Differential Revision: https://reviews.llvm.org/D63843 llvm-svn: 365524	2019-07-09 18:09:11 +00:00
Jinsong Ji	06fef0b359	Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()" This reverts commit `d955573065`. llvm-svn: 365520	2019-07-09 17:53:09 +00:00
Christudasan Devadasan	b2d24bd540	[AMDGPU] Created a sub-register class for the return address operand in the return instruction. Function return instruction lowering, currently uses the fixed register pair s[30:31] for holding the return address. It can be any SGPR pair other than the CSRs. Created an SGPR pair sub-register class exclusive of the CSRs, and used this regclass while lowering the return instruction. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D63924 llvm-svn: 365512	2019-07-09 16:48:42 +00:00
Sam Elliott	114d2db49b	[RISCV] Fix ICE in isDesirableToCommuteWithShift Summary: There was an error being thrown from isDesirableToCommuteWithShift in some tests. This was tracked down to the method being called before legalisation, with an extended value type, not a machine value type. In the case I diagnosed, the error was only hit with an instruction sequence involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is instead an Extended ValueType which was causing the issue. I have added a test to cover this case, and fixed the error in the callback. Reviewers: asb, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64425 llvm-svn: 365511	2019-07-09 16:24:16 +00:00
Amara Emerson	6616e269a6	[AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block. Differential Revision: https://reviews.llvm.org/D64354 llvm-svn: 365510	2019-07-09 16:05:59 +00:00
Simon Atanasyan	e3892d84e0	[mips] Show error in case of using FP64 mode on pre MIPS32R2 CPU llvm-svn: 365508	2019-07-09 15:48:16 +00:00
Simon Pilgrim	57603cbde8	[DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI. Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math. llvm-svn: 365504	2019-07-09 15:28:57 +00:00
Yonghong Song	d3d88d08b5	[BPF] Support for compile once and run everywhere Introduction ============ This patch added initial support for bpf program compile once and run everywhere (CO-RE). The main motivation is for bpf programs which depend on kernel headers which may vary between different kernel versions. The initial discussion can be found at https://lwn.net/Articles/773198/. Currently, a bpf program accesses kernel internal data structures through the bpf_probe_read() helper. The idea is to capture the kernel data structures to be accessed through bpf_probe_read() and relocate them on different kernel versions. On each host, right before bpf program load, the bpfloader will look at the types of the native linux through vmlinux BTF, calculate the proper access offset and patch the instruction. To accommodate this, three intrinsic functions preserve_{array,union,struct}_access_index are introduced which in clang will preserve the base pointer, struct/union/array access_index and struct/union debuginfo type information. Later, a bpf IR pass can reconstruct the whole gep access chains without looking at gep itself. This patch did the following: . An IR pass is added to convert preserve_*_access_index to a global variable whose name encodes the getelementptr access pattern. The global variable has metadata attached to describe the corresponding struct/union debuginfo type. . A SimplifyPatchable MachineInstruction pass is added to remove unnecessary loads. . The BTF output pass is enhanced to generate relocation records located in the .BTF.ext section. Typical CO-RE also needs support of global variables which can be assigned different values on different hosts. For example, kernel version can be used to guard different versions of code. This patch added the support for patchable externals as well. Example ======= The following is an example. 
struct pt_regs { long arg1; long arg2; }; struct sk_buff { int i; struct net_device *dev; }; #define _(x) (__builtin_preserve_access_index(x)) static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version; int bpf_prog(struct pt_regs *ctx) { struct net_device *dev = 0; // ctx->arg* does not need bpf_probe_read if (__kernel_version >= 41608) bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev)); else bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev)); return dev != 0; } In the above, we want to translate the third argument of bpf_probe_read() as relocations. -bash-4.4$ clang -target bpf -O2 -g -S trace.c The compiler will generate two new subsections in .BTF.ext, OffsetReloc and ExternReloc. OffsetReloc is to record the structure member offset operations, and ExternReloc is to record the external globals where only u8, u16, u32 and u64 are supported. BPFOffsetReloc Size struct SecOffsetReloc for ELF section #1 A number of struct BPFOffsetReloc for ELF section #1 struct SecOffsetReloc for ELF section #2 A number of struct BPFOffsetReloc for ELF section #2 ... BPFExternReloc Size struct SecExternReloc for ELF section #1 A number of struct BPFExternReloc for ELF section #1 struct SecExternReloc for ELF section #2 A number of struct BPFExternReloc for ELF section #2 struct BPFOffsetReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t TypeID; ///< TypeID for the relocation uint32_t OffsetNameOff; ///< The string to traverse types }; struct BPFExternReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t ExternNameOff; ///< The string for external variable }; Note that only externs with attribute section ".BPF.patchable_externs" are considered for Extern Reloc which will be patched by bpf loader right before the load. 
For the above test case, two offset records and one extern record will be generated: OffsetReloc records: .long .Ltmp12 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String .long .Ltmp18 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String ExternReloc record: .long .Ltmp5 # Insn Offset .long 165 # External Variable In string table: .ascii "0:1" # string offset=242 .ascii "__kernel_version" # string offset=165 The default member offset can be calculated as the 2nd member offset (0 representing the 1st member) of struct "sk_buff". The asm code: .Ltmp5: .Ltmp6: r2 = 0 r3 = 41608 .Ltmp7: .Ltmp8: .loc 1 18 9 is_stmt 0 # t.c:18:9 .Ltmp9: if r3 > r2 goto LBB0_2 .Ltmp10: .Ltmp11: .loc 1 0 9 # t.c:0:9 .Ltmp12: r2 = 8 .Ltmp13: .loc 1 19 66 is_stmt 1 # t.c:19:66 .Ltmp14: .Ltmp15: r3 = *(u64 *)(r1 + 0) goto LBB0_3 .Ltmp16: .Ltmp17: LBB0_2: .loc 1 0 66 is_stmt 0 # t.c:0:66 .Ltmp18: r2 = 8 .loc 1 21 66 is_stmt 1 # t.c:21:66 .Ltmp19: r3 = *(u64 *)(r1 + 8) .Ltmp20: .Ltmp21: LBB0_3: .loc 1 0 66 is_stmt 0 # t.c:0:66 r3 += r2 r1 = r10 .Ltmp22: .Ltmp23: .Ltmp24: r1 += -8 r2 = 8 call 4 For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number 8 is the structure offset based on the current BTF. Loader needs to adjust it if it changes on the host. For instruction .Ltmp5, "r2 = 0", the external variable got a default value 0, loader needs to supply an appropriate value for the particular host. 
Compiling to generate object code and disassemble: 0000000000000000 bpf_prog: 0: b7 02 00 00 00 00 00 00 r2 = 0 1: 7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2 2: b7 02 00 00 00 00 00 00 r2 = 0 3: b7 03 00 00 88 a2 00 00 r3 = 41608 4: 2d 23 03 00 00 00 00 00 if r3 > r2 goto +3 <LBB0_2> 5: b7 02 00 00 08 00 00 00 r2 = 8 6: 79 13 00 00 00 00 00 00 r3 = *(u64 *)(r1 + 0) 7: 05 00 02 00 00 00 00 00 goto +2 <LBB0_3> 0000000000000040 LBB0_2: 8: b7 02 00 00 08 00 00 00 r2 = 8 9: 79 13 08 00 00 00 00 00 r3 = *(u64 *)(r1 + 8) 0000000000000050 LBB0_3: 10: 0f 23 00 00 00 00 00 00 r3 += r2 11: bf a1 00 00 00 00 00 00 r1 = r10 12: 07 01 00 00 f8 ff ff ff r1 += -8 13: b7 02 00 00 08 00 00 00 r2 = 8 14: 85 00 00 00 04 00 00 00 call 4 Instructions #2, #5 and #8 need relocation resolutions from the loader. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61524 llvm-svn: 365503	2019-07-09 15:28:41 +00:00
Chen Zheng	d955573065	[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable() Differential Revision: https://reviews.llvm.org/D64197 llvm-svn: 365497	2019-07-09 14:56:17 +00:00
Petar Avramovic	be20e36107	[MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64351 llvm-svn: 365494	2019-07-09 14:36:17 +00:00
Petar Avramovic	dbb6d01d34	[MIPS GlobalISel] Regbanks for G_SELECT. Select i64, f32 and f64 select Select gprb or fprb when def/use register operand of G_SELECT is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 select is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. For selection of floating point s32 or s64 select it is enough to set fprb of appropriate size and selectImpl will do the rest. Differential Revision: https://reviews.llvm.org/D64350 llvm-svn: 365492	2019-07-09 14:30:29 +00:00
Matt Arsenault	4dd5755d01	AMDGPU/GlobalISel: Legalize more concat_vectors llvm-svn: 365488	2019-07-09 14:17:31 +00:00
Matt Arsenault	6bdb92d833	AMDGPU/GlobalISel: Improve regbankselect for icmp s16 Account for 64-bit scalar eq/ne when available. llvm-svn: 365487	2019-07-09 14:13:09 +00:00
Matt Arsenault	8b8eee5904	AMDGPU/GlobalISel: Make s16 G_ICMP legal llvm-svn: 365486	2019-07-09 14:10:43 +00:00
Matt Arsenault	e6d10f97dd	AMDGPU/GlobalISel: Select G_SUB llvm-svn: 365484	2019-07-09 14:05:11 +00:00
Matt Arsenault	872f38be7e	AMDGPU/GlobalISel: Select G_UNMERGE_VALUES llvm-svn: 365483	2019-07-09 14:02:26 +00:00
Matt Arsenault	9b7ffc4e55	AMDGPU/GlobalISel: Select G_MERGE_VALUES llvm-svn: 365482	2019-07-09 14:02:20 +00:00
Simon Pilgrim	480e8ad217	[CodeGen] AccelTable - remove non-constexpr (MSVC) Atom defs Now that we've dropped VS2015 support (D64326) we can enable the constexpr variables on MSVC builds as VS2017+ correctly handles them llvm-svn: 365477	2019-07-09 13:07:48 +00:00
Simon Atanasyan	2fa6b54635	[mips] Implement sge/sgeu pseudo instructions The `sge/sgeu Dst, Src1, Src2/Imm` pseudo instructions set register `Dst` to 1 if register `Src1` is greater than or equal to `Src2/Imm` and to 0 otherwise. Differential Revision: https://reviews.llvm.org/D64314 llvm-svn: 365476	2019-07-09 12:55:55 +00:00
Simon Atanasyan	00df4d92ed	[mips] Implement sgt/sgtu pseudo instructions with immediate operand The `sgt/sgtu Dst, Src1, Src2/Imm` pseudo instructions set register `Dst` to 1 if register `Src1` is greater than `Src2/Imm` and to 0 otherwise. Differential Revision: https://reviews.llvm.org/D64313 llvm-svn: 365475	2019-07-09 12:55:42 +00:00
Djordje Todorovic	c1e0ea9765	[NFC][AsmPrinter] Fix the formatting for the rL365467 In addition, fix the build failure for the 'unused' variable. The variable was used inside the 'LLVM_DEBUG()'. llvm-svn: 365469	2019-07-09 12:06:21 +00:00
Tim Northover	60afa49abe	OpaquePtr: add Type parameter to Loads analysis API. This makes the functions in Loads.h require a type to be specified independently of the pointer Value so that when pointers have no structure other than address-space, it can still do its job. Most callers had an obvious memory operation handy to provide this type, but SROA and ArgumentPromotion were doing more complicated analysis. They get updated to merge the properties of the various instructions they were considering. llvm-svn: 365468	2019-07-09 11:35:35 +00:00
Djordje Todorovic	01eaae6dd1	[DwarfDebug] Dump call site debug info Dump the DWARF information about call sites and call site parameters into debug info sections. The patch also provides an interface for the interpretation of instructions that could load values of a call site parameters in order to generate DWARF about the call site parameters. ([13/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 365467	2019-07-09 11:33:56 +00:00
Alex Bradbury	e0831dac0c	[RISCV] Fix RISCVTTIImpl::getIntImmCost for immediates where getMinSignedBits() > 64 APInt::getSExtValue will assert if getMinSignedBits() > 64. This can happen, for instance, if examining an i128. Avoid this assertion by checking Imm.getMinSignedBits() <= 64 before doing getTLI()->isLegalAddImmediate(Imm.getSExtValue()). We could directly check getMinSignedBits() <= 12 but it seems better to reuse the isLegalAddImmediate helper for this. Differential Revision: https://reviews.llvm.org/D64390 llvm-svn: 365462	2019-07-09 10:56:18 +00:00
Bjorn Pettersson	051a6a1c33	[SelectionDAG] Simplify some calls to getSetCCResultType. NFC DAGTypeLegalizer and SelectionDAGLegalize have helper functions wrapping the call to TLI.getSetCCResultType(...). Use those helpers in more places. llvm-svn: 365456	2019-07-09 10:27:51 +00:00
Bjorn Pettersson	59029017a6	[LegalizeTypes] Fix saturation bug for smul.fix.sat Summary: Make sure we use SETGE instead of SETGT when checking if the sign bit is zero at SMULFIXSAT expansion. The faulty expansion occurred when doing "expand" of SMULFIXSAT and the scale was exactly matching the size of the smaller type. For example doing i64 Z = SMULFIXSAT X, Y, 32 and expanding X/Y/Z into using two i32 values. The problem was that we sometimes did not saturate to min when overflowing. Here is an example using Q3.4 numbers: Consider that we are multiplying X and Y. X = 0x80 (-8.0 as Q3.4) Y = 0x20 (2.0 as Q3.4) To avoid loss of precision we do a widening multiplication, getting a 16 bit result Z = 0xF000 (-16.0 as Q7.8) To detect negative overflow we should check if the five most significant bits in Z are less than -1. Assume that we name the 4 most significant bits as HH and the next 4 bits as HL. Then we can do the check by examining if (HH < -1) or (HH == -1 && "sign bit in HL is zero"). The fault was that we have been doing the check as (HH < -1) or (HH == -1 && HL > 0) instead of (HH < -1) or (HH == -1 && HL >= 0). In our example HH is -1 and HL is 0, so the old code did not trigger saturation and simply truncated the result to 0x00 (0.0). With the bugfix we instead detect that we should saturate to min, and the result will be set to 0x80 (-8.0). Reviewers: leonardchan, bevinh Reviewed By: leonardchan Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64331 llvm-svn: 365455	2019-07-09 10:24:50 +00:00
Guillaume Chatelet	336f3e1601	Fixing @llvm.memcpy not honoring volatile. This is explicitly not addressing target-specific code, or calls to memcpy. Summary: https://bugs.llvm.org/show_bug.cgi?id=42254 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63215 llvm-svn: 365449	2019-07-09 09:53:36 +00:00
Jeremy Morse	9bebc65d79	Revert r364515 and r364524 Jordan reports on llvm-commits a performance regression with r364515, backing the patch out while it's investigated. llvm-svn: 365448	2019-07-09 09:38:03 +00:00
Djordje Todorovic	12aca5de02	Reland "[LiveDebugValues] Emit the debug entry values" Emit replacements for a clobbered parameter's location if the parameter has an unmodified value throughout the function. This is the basic scenario where we can use the debug entry values. ([12/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D58042 llvm-svn: 365444	2019-07-09 08:36:34 +00:00
Serguei Katkov	77bb3a486f	[Loop Peeling] Add support for peeling of loops with multiple exits This patch modifies the loop peeling transformation so that it does not expect that there is only one loop exit from the latch. It modifies only the transformation. Update of branch weights remains only for the exit from the latch. The motivation is that in a follow-up patch I plan to enable loop peeling for loops with multiple exits, but only if the exits other than the latch exit go to a block with a call to deopt. For now this patch is NFC. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames, fhahn Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D63921 llvm-svn: 365441	2019-07-09 06:07:25 +00:00
Yevgeny Rouban	592f44a7e7	Prepare for making SwitchInstProfUpdateWrapper strict This patch removes the test part that relates to the non-strict behavior of SwitchInstProfUpdateWrapper and changes the assertion to llvm_unreachable() to allow the check in release builds. This patch prepares SwitchInstProfUpdateWrapper to become strict with a one-line change. That is needed so that it can be reverted easily if any failure arises. llvm-svn: 365439	2019-07-09 05:07:28 +00:00
Serguei Katkov	c6caddb73d	[LoopInfo] Update getExitEdges to accept vector of pairs for non const BasicBlock D63921 requires that getExitEdges fill a vector of Edge pairs where BasicBlocks are not constant. The rest of the Loop API mostly returns non-const BasicBlocks, so to be more consistent with the other Loop API, getExitEdges is modified to return non-const BasicBlocks as well. This is an alternative solution to D64060. Reviewers: reames, fhahn Reviewed By: reames, fhahn Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D64309 llvm-svn: 365437	2019-07-09 04:20:43 +00:00
Kai Luo	619e39bc72	[NFC][PowerPC] Fixed unused variable 'NewInstr'. llvm-svn: 365433	2019-07-09 03:33:04 +00:00
Stanislav Mekhanoshin	c776dc0b60	[AMDGPU] Added td definitions for HW regs Infrastructure work for future commit. NFC. Differential Revision: https://reviews.llvm.org/D64370 llvm-svn: 365432	2019-07-09 03:20:33 +00:00
Stanislav Mekhanoshin	818d748a45	[AMDGPU] Always use s_memtime for readcyclecounter Differential Revision: https://reviews.llvm.org/D64369 llvm-svn: 365431	2019-07-09 03:10:18 +00:00
Kai Luo	1931ed73c3	[PowerPC][Peephole] Combine extsw and sldi after instruction selection Summary: `extsw` and `sldi` are supposed to be combined if they are in the same BB in instruction selection phase. This patch handles the case where extsw and sldi are not in the same BB. Differential Revision: https://reviews.llvm.org/D63806 llvm-svn: 365430	2019-07-09 02:55:08 +00:00
Chen Zheng	25ab27e6ef	[PowerPC][NFC] remove redundant function isVFReg(). llvm-svn: 365429	2019-07-09 02:48:30 +00:00
Jinsong Ji	cbd64f7648	[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, a Phi referring to a Phi did not get the value defined by the Phi, hence getting the wrong value later. As the comment mentions, we should "use the value defined by the Phi, unless we're generating the first epilog and the Phi refers to a Phi in a different stage.", so a Phi referring to a same-stage Phi should use the value defined by the Phi here. Reviewers: bcahoon, hfinkel Reviewed By: hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64035 llvm-svn: 365428	2019-07-09 02:27:35 +00:00
Heejin Ahn	947bfe73fc	[WebAssembly] Make sret parameter work with AddMissingPrototypes Summary: Even with functions with `no-prototype` attribute, there can be an argument `sret` (structure return) attribute, which is an optimization when a function return type is a struct. Fixes PR42420. Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64318 llvm-svn: 365426	2019-07-09 02:10:33 +00:00
Philip Reames	0e344e9dc5	[LoopPred] Stylistic improvement to recently added NE/EQ normalization [NFC] llvm-svn: 365425	2019-07-09 02:03:31 +00:00
Yonghong Song	e3919c6baf	[BPF] add new intrinsics preserve_{array,union,struct}_access_index For background of BPF CO-RE project, please refer to http://vger.kernel.org/bpfconf2019.html In summary, BPF CO-RE intends to compile bpf programs adjustable on struct/union layout change so the same program can run on multiple kernels with adjustment before loading based on native kernel structures. In order to do this, we need to keep track of GEP(getelementptr) instruction base and result debuginfo types, so we can adjust on the host based on kernel BTF info. Capturing such information as an IR optimization is hard, as various optimizations may have tweaked the GEP, and since a union is replaced by a structure it is impossible to track the field index for union member accesses. Three intrinsic functions, preserve_{array,union,struct}_access_index, are introduced. addr = preserve_array_access_index(base, index, dimension) addr = preserve_union_access_index(base, di_index) addr = preserve_struct_access_index(base, gep_index, di_index) here, base: the base pointer for the array/union/struct access. index: the last access index for array, the same for IR/DebugInfo layout. dimension: the array dimension. gep_index: the access index based on IR layout. di_index: the access index based on user/debuginfo types. For example, for the following example, $ cat test.c struct sk_buff { int i; int b1:1; int b2:2; union { struct { int o1; int o2; } o; struct { char flags; char dev_id; } dev; int netid; } u[10]; }; static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; #define _(x) (__builtin_preserve_access_index(x)) int bpf_prog(struct sk_buff *ctx) { char dev_id; bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id)); return dev_id; } $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \ test.c >& log The generated IR looks like below: ...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 { %2 = alloca %struct.sk_buff*, align 8 %3 = alloca i8, align 1 store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45 call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49 call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50 call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51 %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45 %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45 %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs( %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19 %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons( [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53 %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons( %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26 %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53 %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s( %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34 %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52 %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55 %13 = sext i8 %12 to i32, !dbg !54 call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56 ret i32 %13, !dbg !57 } !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20) !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27) !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35) Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index attached
to instructions to provide struct/union debuginfo type information. For &ctx->u[5].dev.dev_id, . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout. . The "%7 = ..." represents array subscript "5". . The "%8 = ..." represents union member "dev" with index 1 for DI layout. . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout. Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and examining all preserve_*_access_index calls, the debuginfo struct/union/array access index can be achieved. The intrinsics also contain enough information to regenerate codes for IR layout. For array and structure intrinsics, the proper GEP can be constructed. For union intrinsics, replacing all uses of "addr" with "base" should be enough. The test case ThinLTO/X86/lazyload_metadata.ll is adjusted to reflect the new addition of the metadata. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61810 llvm-svn: 365423	2019-07-09 01:51:36 +00:00
Philip Reames	5a637cbdc7	[LoopPred] Extend LFTR normalization to the inverse EQ case A while back, I added support for NE latches formed by LFTR. I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that. This change just adds handling for that case as well. llvm-svn: 365419	2019-07-09 01:27:45 +00:00
Nilanjana Basu	faed8516e4	Changing CodeView debug info type record representation in assembly files to make it more human-readable & editable & fixing bug introduced in r364987 llvm-svn: 365417	2019-07-09 01:11:02 +00:00
Nico Weber	e3f06b478c	Let unaliased Args track which Alias they were created from, and use that in Arg::getAsString() for diagnostics With this, `clang-cl /source-charset:utf-16 test.cc` now prints `invalid value 'utf-16' in '/source-charset:utf-16'` instead of `invalid value 'utf-16' in '-finput-charset=utf-16'` before, and several other clang-cl flags produce much less confusing output as well. Fixes PR29106. Since an arg and its alias can have different arg types (joined vs not) and different values (because of AliasArgs<>), I chose to give the Alias its own Arg object. For convenience, I just store the alias directly in the unaliased arg – there aren't many arg objects at runtime, so that seems ok. Finally, I changed Arg::getAsString() to use the alias's representation if it's present – that function was already documented as being the suitable function for diagnostics, and most callers already used it for diagnostics. Implementation-wise, Arg::accept() previously used to parse things as the unaliased option. The core of that switch is now extracted into a new function acceptInternal() which parses as the _aliased_ option, and the previously-intermingled unaliasing is now done as an explicit step afterwards. (This also changes one place in lld that didn't use getAsString() for diagnostics, so that that one place now also prints the flag as the user wrote it, not as it looks after it went through unaliasing.) Differential Revision: https://reviews.llvm.org/D64253 llvm-svn: 365413	2019-07-09 00:34:08 +00:00
Johannes Doerfert	accd3e8747	[Attributor] Deduce the "returned" argument attribute Deduce the "returned" argument attribute by collecting all potentially returned values. Not only the unique return value, if any, can be used by subsequent attributes but also the set of all potentially returned values as well as the mapping from returned values to return instructions that they originate from (see AAReturnedValues::checkForallReturnedValues). Change in statistics (-stats) for LLVM-TS + Spec2006, totaling ~19% more "returned" arguments. ADDED: attributor NumAttributesManifested n/a -> 637 ADDED: attributor NumAttributesValidFixpoint n/a -> 25545 ADDED: attributor NumFnArgumentReturned n/a -> 637 ADDED: attributor NumFnKnownReturns n/a -> 25545 ADDED: attributor NumFnUniqueReturned n/a -> 14118 CHANGED: deadargelim NumRetValsEliminated 470 -> 449 ( -4.468%) REMOVED: functionattrs NumReturned 535 -> n/a CHANGED: indvars NumElimIdentity 138 -> 164 ( +18.841%) Reviewers: homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames, efriedma, chandlerc Subscribers: hiraditya, bollu, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59919 llvm-svn: 365407	2019-07-08 23:27:20 +00:00
Jessica Paquette	55d19247ef	[AArch64][GlobalISel] Use TST for comparisons when possible Porting over the part of `emitComparison` in AArch64ISelLowering where we use TST to represent a compare. - Rename `tryOptCMN` to `tryFoldIntegerCompare`, since it now also emits TSTs when possible. - Add a utility function for emitting a TST with register operands. - Rename opt-fold-cmn.mir to opt-fold-compare.mir, since it now also tests the TST fold as well. Differential Revision: https://reviews.llvm.org/D64371 llvm-svn: 365404	2019-07-08 22:58:36 +00:00
Matt Arsenault	9e7cbc0e7d	AMDGPU: Split extload/zextload local load patterns This will help removing the custom load predicates, allowing the global isel emitter to handle them. llvm-svn: 365398	2019-07-08 22:08:23 +00:00
Bill Wendling	c8933c4070	Add parentheses to silence warning. llvm-svn: 365394	2019-07-08 22:00:33 +00:00
Reid Kleckner	2f07c2e9d9	Standardize on MSVC behavior for triples with no environment Summary: This makes it so that IR files using triples without an environment work out of the box, without normalizing them. Typically, the MSVC behavior is more desirable. For example, it tends to enable things like constant merging, use of associative comdats, etc. Addresses PR42491 Reviewers: compnerd Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64109 llvm-svn: 365387	2019-07-08 21:05:20 +00:00
Sanjay Patel	3dee113ebc	[InstCombine] fold insertelement into splat of same scalar Forming the canonical splat shuffle improves analysis and may allow follow-on transforms (although some possibilities are missing as shown in the test diffs). The backend generically turns these patterns into build_vector, so there should be no codegen regressions. All targets are expected to be able to lower splats efficiently. llvm-svn: 365379	2019-07-08 19:48:52 +00:00
Matt Arsenault	8561844321	AMDGPU: Fix unused variable in release build llvm-svn: 365378	2019-07-08 19:47:42 +00:00
Yuanfang Chen	5de4692cc7	Teach the symbolizer lib symbolize objects directly. Currently, the symbolizer lib can only symbolize a file on disk. This patch teaches the symbolizer lib to symbolize objects. llvm-objdump needs this to support archive disassembly with source info. https://bugs.llvm.org/show_bug.cgi?id=41871 Reviewed by: jhenderson, grimar, MaskRay Differential Revision: https://reviews.llvm.org/D63521 llvm-svn: 365376	2019-07-08 19:28:57 +00:00
Matt Arsenault	acc9e1e4c2	AMDGPU: Fix stray typing llvm-svn: 365373	2019-07-08 19:05:19 +00:00
Matt Arsenault	71dfb7ec5c	AMDGPU: Make s34 the FP register Make the FP register callee saved. This is tricky because now the FP needs to be spilled in the prolog relative to the incoming SP register, rather than the frame register used throughout the rest of the function. I don't like how this bypasses the standard mechanism for CSR spills just to get the correct insert point. I may look for a better solution, since all CSR VGPRs may also need to have all lanes activated. Another option might be to make getFrameIndexReference change the base register if the frame index is a CSR, and then try to figure out the right insertion point in emitProlog. If there is a free VGPR lane available for SGPR spilling, try to use it for the FP. If that would require introducing a new VGPR spill, try to use a free call clobbered SGPR. Only fall back to introducing a new VGPR spill as a last resort. This also doesn't attempt to handle SGPR spilling with scalar stores. llvm-svn: 365372	2019-07-08 19:03:38 +00:00
Matt Arsenault	5630e3a1c7	RegUsageInfoCollector: Don't iterate all regs for every reg class This is extremely slow on AMDGPU, which has a lot of physical registers and a lot of register classes. determineCalleeSaves, via MachineRegisterInfo::isPhysRegUsed already added all of the super registers to the saved set. llvm-svn: 365370	2019-07-08 18:48:42 +00:00
Matt Arsenault	5e643036cb	AMDGPU: Move DEBUG_TYPE definition below includes llvm-svn: 365369	2019-07-08 18:48:39 +00:00
Whitney Tsang	7d8f30e6b2	Keep the order of the basic blocks in the cloned loop as the original loop Summary: Do the cloning in two steps, first allocate all the new loops, then clone the basic blocks in the same order as the original loop. Reviewer: Meinersbur, fhahn, kbarton, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, hiraditya, llvm-commits Tag: https://reviews.llvm.org/D64224 Differential Revision: llvm-svn: 365366	2019-07-08 18:30:35 +00:00
Denis Bakhvalov	74be349bcf	[SCEV] Fix for PR42397. SCEVExpander wrongly adds nsw to shl instruction. Change-Id: I76c9f628c092ae3e6e78ebdaf55cec726e25d692 llvm-svn: 365363	2019-07-08 18:03:43 +00:00
Yonghong Song	0d566dbbae	Revert "[BPF] add new intrinsics preserve_{array,union,struct}_access_index" This reverts commit r365352. Test ThinLTO/X86/lazyload_metadata.ll failed. Revert the commit and at the same time to fix the issue. llvm-svn: 365360	2019-07-08 17:47:43 +00:00
Yonghong Song	75c2a6709e	[BPF] add new intrinsics preserve_{array,union,struct}_access_index For background of BPF CO-RE project, please refer to http://vger.kernel.org/bpfconf2019.html In summary, BPF CO-RE intends to compile bpf programs adjustable on struct/union layout change so the same program can run on multiple kernels with adjustment before loading based on native kernel structures. In order to do this, we need to keep track of GEP(getelementptr) instruction base and result debuginfo types, so we can adjust on the host based on kernel BTF info. Capturing such information as an IR optimization is hard, as various optimizations may have tweaked the GEP, and since a union is replaced by a structure it is impossible to track the field index for union member accesses. Three intrinsic functions, preserve_{array,union,struct}_access_index, are introduced. addr = preserve_array_access_index(base, index, dimension) addr = preserve_union_access_index(base, di_index) addr = preserve_struct_access_index(base, gep_index, di_index) here, base: the base pointer for the array/union/struct access. index: the last access index for array, the same for IR/DebugInfo layout. dimension: the array dimension. gep_index: the access index based on IR layout. di_index: the access index based on user/debuginfo types. For example, for the following example, $ cat test.c struct sk_buff { int i; int b1:1; int b2:2; union { struct { int o1; int o2; } o; struct { char flags; char dev_id; } dev; int netid; } u[10]; }; static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; #define _(x) (__builtin_preserve_access_index(x)) int bpf_prog(struct sk_buff *ctx) { char dev_id; bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id)); return dev_id; } $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \ test.c >& log The generated IR looks like below: ...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 { %2 = alloca %struct.sk_buff*, align 8 %3 = alloca i8, align 1 store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45 call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49 call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50 call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51 %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45 %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45 %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs( %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19 %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons( [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53 %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons( %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26 %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53 %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s( %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34 %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52 %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55 %13 = sext i8 %12 to i32, !dbg !54 call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56 ret i32 %13, !dbg !57 } !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20) !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27) !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35) Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index attached
to instructions to provide struct/union debuginfo type information. For &ctx->u[5].dev.dev_id, . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout. . The "%7 = ..." represents array subscript "5". . The "%8 = ..." represents union member "dev" with index 1 for DI layout. . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout. Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and examining all preserve_*_access_index calls, the debuginfo struct/union/array access index can be achieved. The intrinsics also contain enough information to regenerate codes for IR layout. For array and structure intrinsics, the proper GEP can be constructed. For union intrinsics, replacing all uses of "addr" with "base" should be enough. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61810 llvm-svn: 365352	2019-07-08 17:08:28 +00:00
Wouter van Oortmerssen	81db9f543c	[WebAssembly] tablegen: distinguish float/int immediate operands. Summary: Before, they were one category of operands which could cause crashes in non-sensical combinations, e.g. "f32.const symbol". Now these are forced to be an error. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64039 llvm-svn: 365351	2019-07-08 16:58:37 +00:00
Matt Arsenault	224d8cd987	AMDGPU: Remove mubuf specific PatFrags These are identical to the *_global PatFrag, and will only create more work to get the GlobalISel importer to handle them. llvm-svn: 365350	2019-07-08 16:53:53 +00:00
Matt Arsenault	430b0497e7	AMDGPU: Move waitcnt intrinsic to instruction definition pattern llvm-svn: 365349	2019-07-08 16:53:48 +00:00
Brian Homerding	498687bff2	Add, and infer, a nofree function attribute Removing dead code leftover from refactor. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365345	2019-07-08 16:33:32 +00:00
Matt Arsenault	079f77b590	GlobalISel: Convert some build functions to using SrcOp/DstOp llvm-svn: 365343	2019-07-08 16:27:47 +00:00
Sanjay Patel	0b59103a73	[InstCombine] canonicalize insert+splat to/from element 0 of vector We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue() and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts to that form for better matching. The backend generically turns these patterns into build_vector, so there should be no codegen difference. llvm-svn: 365342	2019-07-08 16:26:48 +00:00
Francis Visoiu Mistrih	d6fd354f3f	[Bitcode][NFC] Remove unused variable from BitcodeAnalyzer llvm-svn: 365340	2019-07-08 16:19:45 +00:00
Kevin P. Neal	472e5dda11	Teach the IRBuilder about fadd and friends. The IRBuilder has calls to create floating point instructions like fadd. It does not have calls to create constrained versions of them. This patch adds support for constrained creation of fadd, fsub, fmul, fdiv, and frem. Reviewed by: John McCall, Sanjay Patel Approved by: John McCall Differential Revision: https://reviews.llvm.org/D53157 llvm-svn: 365339	2019-07-08 16:18:18 +00:00
Brian Homerding	b4b21d807e	Add, and infer, a nofree function attribute This patch adds a function attribute, nofree, to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete). Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365336	2019-07-08 15:57:56 +00:00
Simon Pilgrim	e1a9b49d6b	[X86] ISD::INSERT_SUBVECTOR - use uint64_t index. NFCI. Keep the uint64_t type from getConstantOperandVal to stop truncation/extension overflow warnings in MSVC in subvector index math. llvm-svn: 365328	2019-07-08 14:52:56 +00:00
Cameron McInally	771769be90	[Float2Int] Add support for unary FNeg to Float2Int Differential Revision: https://reviews.llvm.org/D63941 llvm-svn: 365324	2019-07-08 14:46:07 +00:00
Petar Avramovic	aa699b20a0	[MIPS GlobalISel] Register bank select for G_LOAD. Select i64 load Select gprb or fprb when loaded value is used by either: copy to physical register or instruction with only one mapping available for that use operand. Load of integer s64 is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64269 llvm-svn: 365323	2019-07-08 14:45:52 +00:00
Petar Avramovic	ec575f6e3e	[MIPS GlobalISel] Register bank select for G_STORE. Select i64 store Select gprb or fprb when stored value is defined by either: copy from physical register or instruction with only one mapping available for that def operand. Store of integer s64 is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64268 llvm-svn: 365322	2019-07-08 14:36:36 +00:00
Dmitry Preobrazhensky	2eff0318c6	[AMDGPU][MC] Corrected parsing of FLAT offset modifier Summary of changes: - simplified handling of FLAT offset: offset_s13 and offset_u12 have been replaced with flat_offset; - provided information about error position for pre-gfx9 targets; - improved errors handling. Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D64244 llvm-svn: 365321	2019-07-08 14:27:37 +00:00
Matt Arsenault	bd791b57f8	GlobalISel: widenScalar for G_BUILD_VECTOR llvm-svn: 365320	2019-07-08 13:48:06 +00:00
Simon Pilgrim	9285bf0fb9	[TargetLowering] SimplifyDemandedBits - just call computeKnownBits for BUILD_VECTOR cases. Don't do this locally, computeKnownBits does this better (and can handle non-constant cases as well). A next step would be to actually simplify non-constant elements - building on what we already do in SimplifyDemandedVectorElts. llvm-svn: 365309	2019-07-08 11:00:39 +00:00
Mikhail Maltsev	ee81051fc9	[ARM] Relax constraints on operands of VQxDMLxDH instructions Summary: According to a recently updated Armv8-M spec (https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf) the 32-bit width versions of the following instructions: * VQDMLADH * VQDMLADHX * VQRDMLADH * VQRDMLADHX * VQDMLSDH * VQDMLSDHX * VQRDMLSDH * VQRDMLSDHX are no longer unpredictable when their output register is the same as one of the input registers. This patch updates the assembler parser and the corresponding tests and also removes @earlyclobber from the instruction constraints. Reviewers: simon_tatham, ostannard, dmgreen, SjoerdMeijer, samparker Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64250 llvm-svn: 365306	2019-07-08 09:44:52 +00:00
Alex Bradbury	0b9addb8c0	[RISCV] Specify registers used in DWARF exception handling Defines RISCV registers for getExceptionPointerRegister() and getExceptionSelectorRegister(). Differential Revision: https://reviews.llvm.org/D63411 Patch by Edward Jones. Modified by Alex Bradbury to add CHECK lines to exception-pointer-register.ll. llvm-svn: 365301	2019-07-08 09:16:47 +00:00
Fangrui Song	7d63be09b6	[ARM] Fix null pointer dereference in CodeGen/ARM/Windows/stack-protector-msvc.ll.test after D64292/r365283 CLI.CS may not be set. llvm-svn: 365299	2019-07-08 08:43:31 +00:00
Jay Foad	38902350ef	[AMDGPU] Use a named predicate instead of a magic number. Reviewers: arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64201 llvm-svn: 365294	2019-07-08 07:04:58 +00:00
Craig Topper	1deca50ab1	[X86] Allow execution domain fixing to turn SHUFPD into SHUFPS. This can help with code size on SSE targets where SHUFPD requires a 0x66 prefix and SHUFPS doesn't. llvm-svn: 365293	2019-07-08 06:52:49 +00:00
Craig Topper	d8261f0288	[X86] Make movsd commutable to shufpd with a 0x02 immediate on pre-SSE4.1 targets. This can help avoid a copy or enable load folding. On SSE4.1 targets we can commute it to blendi instead. I had to make shufpd with a 0x02 immediate commutable as well since we expect commuting to be reversible. llvm-svn: 365292	2019-07-08 06:52:43 +00:00
Alex Bradbury	e1e036a33b	[RISCV] Support z and i operand modifiers Differential Revision: https://reviews.llvm.org/D57792 Patch by James Clarke. llvm-svn: 365291	2019-07-08 05:00:26 +00:00
Craig Topper	46f2b583a2	[X86] Add MOVSDrr->MOVLPDrm entry to load folding table. Add custom handling to turn UNPCKLPDrr->MOVHPDrm when load is under aligned. If the load is aligned we can turn UNPCKLPDrr into UNPCKLPDrm. llvm-svn: 365287	2019-07-08 02:10:20 +00:00
Francis Visoiu Mistrih	4cdb68ebbd	[llvm-bcanalyzer] Refactor and move to libLLVMBitReader This allows us to use the analyzer from unit tests. * Refactor the interface to use proper error handling for most functions after JF's work. * Move everything into a BitstreamAnalyzer class. * Move that to Bitcode/BitcodeAnalyzer.h. Differential Revision: https://reviews.llvm.org/D64116 llvm-svn: 365286	2019-07-08 02:06:34 +00:00
Martin Storsjo	8d9d290d4c	[ARM] Add support for MSVC stack cookie checking Heavily based on the same for AArch64, from SVN r346469. Differential Revision: https://reviews.llvm.org/D64292 llvm-svn: 365283	2019-07-07 18:57:31 +00:00
Craig Topper	ac744d5a86	[X86] Make sure load isn't volatile before shrinking it in MOVDDUP isel patterns. llvm-svn: 365275	2019-07-07 05:33:20 +00:00
David Majnemer	617df204b5	[CodeGen] Add larger vector types for i32 and f32 Some out-of-tree backends require larger vector types. Since maintaining the changes out of tree is difficult, due to the many manual changes needed when adding a new type, we are adding them even if no backend currently uses them. Differential Revision: https://reviews.llvm.org/D64141 Patch by Thomas Raoux! llvm-svn: 365274	2019-07-07 04:47:37 +00:00
Simon Pilgrim	a7145c45a7	[X86] SimplifyDemandedVectorEltsForTargetNode - fix shadow variable warning. NFCI. Fixes cppcheck warning. llvm-svn: 365271	2019-07-06 18:46:09 +00:00
Simon Pilgrim	01f1bad618	[X86] LowerBuildVectorv16i8 - pull out repeated getOperand() call. NFCI. llvm-svn: 365270	2019-07-06 18:33:29 +00:00
Simon Pilgrim	9c68aa33e3	[DAGCombine] convertBuildVecZextToZext - remove duplicate getOpcode() call. NFCI. llvm-svn: 365269	2019-07-06 18:32:15 +00:00
Craig Topper	317d6093df	[X86] Remove patterns from MOVLPSmr and MOVHPSmr instructions. These patterns are the same as the MOVLPDmr and MOVHPDmr patterns, but with a bitcast at the end. We can just select the PD instruction and let execution domain fixing switch to PS. llvm-svn: 365267	2019-07-06 17:59:51 +00:00
Craig Topper	913105ca42	[X86] Add patterns to select MOVLPDrm from MOVSD+load and MOVHPD from UNPCKL+load. These narrow the load so we can only do it if the load isn't volatile. There are also tests in vector-shuffle-128-v4.ll that this should support, but we don't seem to fold bitcast+load on pre-sse4.2 targets due to the slow unaligned mem 16 flag. llvm-svn: 365266	2019-07-06 17:59:45 +00:00
Philip Reames	9e62c86408	[IRBuilder] Introduce helpers for and/or of multiple values at once We had versions of this code scattered around, so consolidate into one location. Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associative, this should not change results. llvm-svn: 365259	2019-07-06 03:46:18 +00:00
Quentin Colombet	0ffe0db6fa	[RegisterCoalescer] Fix an overzealous assert Although removeCopyByCommutingDef deals with full copies, it is still possible to copy undef lanes and thus, we wouldn't have any value number for these lanes. This fixes PR40215. llvm-svn: 365256	2019-07-06 00:34:54 +00:00
Matt Arsenault	705e46f449	RegUsageInfoCollector: Skip AMDGPU entry point functions I'm not sure if it's worth it or not to add a hook to disable the pass for an arbitrary function. This pass is taking up to 5% of compile time in tiny programs by iterating through all of the physical registers in every register class. This pass should be rewritten in terms of regunits. For now, skip doing anything for entry point functions. The vast majority of functions in the real world aren't callable, so just not running this will give the majority of the benefit. llvm-svn: 365255	2019-07-05 23:33:43 +00:00
Michael Liao	88b0d20edf	Revert "[FileCheck] Simplify numeric variable interface" This reverts commit `096600a4b0`. llvm-svn: 365251	2019-07-05 22:23:27 +00:00
Thomas Preud'homme	096600a4b0	[FileCheck] Simplify numeric variable interface Summary: This patch simplifies 2 aspects in the FileCheckNumericVariable code. First, the setValue() method is turned into a void function, since being called only on an undefined variable is an invariant and is now asserted rather than returned. This removes the assert from the callers. Second, the clearValue() method is also turned into a void function, since the only caller does not check its return value: it may be trying to clear the value of a variable that is already cleared, without this being noteworthy. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64231 llvm-svn: 365249	2019-07-05 21:49:59 +00:00
Matt Arsenault	5e9610a3f5	AMDGPU: Fix assert in clang test llvm-svn: 365245	2019-07-05 21:09:53 +00:00
Nikita Popov	a2a09cb606	[SystemZ] Fix addcarry of usubo (PR42512) Only custom lower uaddo+addcarry or usubo+subcarry chains and leave mixtures like usubo+addcarry or uaddo+subcarry to the generic legalizer. Otherwise we run into issues because SystemZ uses different CC values for carries and borrows. Fixes https://bugs.llvm.org/show_bug.cgi?id=42512. Differential Revision: https://reviews.llvm.org/D64213 llvm-svn: 365242	2019-07-05 20:35:11 +00:00
Matt Arsenault	e7e23e3e91	AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass Add a string attribute instead of directly setting MachineFunctionInfo. This avoids trying to get the analysis in the MachineFunctionInfo in a way that doesn't work with the new pass manager. This will also avoid re-visiting the call graph for every single function. llvm-svn: 365241	2019-07-05 20:26:13 +00:00
Michael Liao	8d6ea2d48c	[CodeGen] Enhance `MachineInstrSpan` to allow the end of MBB to be used. Summary: - Explicitly specify the parent MBB to allow the end iterator to be used. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64261 llvm-svn: 365240	2019-07-05 20:23:59 +00:00
Benjamin Kramer	05eebaa949	[PowerPC] Fold another unused variable into assertion. NFC. llvm-svn: 365237	2019-07-05 19:58:39 +00:00
Benjamin Kramer	31f6b13e83	[PowerPC] Fold variable into assert. NFC. Avoids a warning in Release builds. llvm-svn: 365236	2019-07-05 19:46:48 +00:00
Benjamin Kramer	049230b4d2	[PowerPC] Remove unused variable. NFC. llvm-svn: 365235	2019-07-05 19:28:02 +00:00
Craig Topper	d22b2d01ca	[X86] Correct the size check in foldMemoryOperandCustom. The Size either needs to be 0 meaning we aren't folding a stack reload. Or the stack slot needs to be at least 16 bytes. I've also added a paranoia check to ensure the RCSize is at least 16 bytes as well. This avoids any FR32/FR64 surprises, but I think we already filtered those earlier. All of our test cases have Size as either 0 or 16 and RCSize == 16. So the Size <= 16 check worked for those cases. llvm-svn: 365234	2019-07-05 18:54:00 +00:00
Nemanja Ivanovic	6c9a392c8e	[PowerPC] Move TOC save to prologue when profitable The indirect call sequence on PPC requires that the TOC base register be saved prior to the indirect call and restored after the call since the indirect call may branch to a global entry point in another DSO which will update the TOC base. Over the last couple of years, we have improved this to: - be able to hoist TOC saves from loops (with changes to MachineLICM) - avoid multiple saves when one dominates the other[s] However, it is still possible to have multiple TOC saves dynamically in the execution path if there is no dominance relationship between them. This patch moves the TOC save to the prologue when one of the TOC saves is in a block that post-dominates entry (i.e. it cannot be avoided) or if it is in a block that is hotter than entry. Differential revision: https://reviews.llvm.org/D63803 llvm-svn: 365232	2019-07-05 18:38:09 +00:00
Craig Topper	6e6d229e5e	[X86] Update SSE1 MOVLPSrm and MOVHPSrm isel patterns to ensure loads are non-volatile before folding. These patterns use 128-bit loads, but the instructions only load 64 bits. We shouldn't narrow the load if it's volatile. Fixes another variant of PR42079 llvm-svn: 365225	2019-07-05 17:31:29 +00:00
Craig Topper	8a93952a5c	[X86] Remove unnecessary isel pattern for MOVLPSmr. This was identical to a pattern for MOVPQI2QImr with a bitcast as an input. But we should be able to turn MOVPQI2QImr into MOVLPSmr in the execution domain fixup pass so we shouldn't need this. llvm-svn: 365224	2019-07-05 17:31:25 +00:00
Christudasan Devadasan	652ad423bb	[NFC] A test commit to check the access permission. Removed a blank line. llvm-svn: 365223	2019-07-05 17:07:42 +00:00
Thomas Preud'homme	56f6308b2d	[FileCheck] Share variable instance among uses Summary: This patch changes expression support to use one instance of FileCheckNumericVariable per numeric variable rather than one per variable and per definition. The current system was only necessary for the last patch of the numeric expression support patch series in order to handle a line using a variable defined earlier on the same line from the input text. However this can be dealt more efficiently. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64229 llvm-svn: 365220	2019-07-05 16:25:46 +00:00
Thomas Preud'homme	fe7ac170a7	[FileCheck] Don't diagnose undef vars at parse time Summary: Diagnosing use of undefined variables takes place in parseNumericVariableUse() and printSubstitutions() for numeric variables but only takes place in printSubstitutions() for string variables. The reason for the split location of diagnostics is that parsing is not aware of the clearing of variables due to --enable-var-scope and thus use of variables cleared in this way can only be caught by printSubstitutions(). Beyond the code level inconsistency, there is also a user facing inconsistency since diagnostics look different between the two functions. While the diagnostic in printSubstitutions is more verbose, doing the diagnostic there allows diagnosing all undefined variables rather than diagnosing just the first one and erroring out. This patch creates a dummy variable definition when encountering a use of an undefined variable so that parsing can proceed and the use can be diagnosed by printSubstitutions() later. Tests that were testing whether parsing fails in such a case are thus modified accordingly. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64228 llvm-svn: 365219	2019-07-05 16:25:33 +00:00
Yaxun Liu	a62413526d	[AMDGPU] Added a new metadata for multi grid sync implicit argument Patch by Christudasan Devadasan. Differential Revision: https://reviews.llvm.org/D63886 llvm-svn: 365217	2019-07-05 16:05:17 +00:00
Matt Arsenault	27a6985d90	ScheduleDAG: Fix incorrectly killing registers in bundles When looking for uses/defs to add kill flags, the iterator was double incremented, skipping the first instruction in the bundle. The use register in the first bundle instruction was then incorrectly killed. The "First" instruction should be the BUNDLE itself as the proper reverse iterator endpoint. llvm-svn: 365216	2019-07-05 15:32:28 +00:00
Eugene Leviant	3aef35288b	[ThinLTO] Attempt to recommit r365188 after alignment fix llvm-svn: 365215	2019-07-05 15:25:05 +00:00
David Green	47afdaa487	[ARM] MVE patterns for VMVN, VORR and VBIC This adds simple Q register forms of bitwise not instructions. Differential Revision: https://reviews.llvm.org/D63983 llvm-svn: 365214	2019-07-05 15:21:29 +00:00
Jay Foad	7e0c10b55f	[AMDGPU] DPP combiner: recognize identities for more opcodes Summary: This allows the DPP combiner to kick in more often. For example the exclusive scan generated by the atomic optimizer for a divergent atomic add used to look like this: v_mov_b32_e32 v3, v1 v_mov_b32_e32 v5, v1 v_mov_b32_e32 v6, v1 v_mov_b32_dpp v3, v2 wave_shr:1 row_mask:0xf bank_mask:0xf s_nop 1 v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf v_mov_b32_dpp v6, v3 row_shr:3 row_mask:0xf bank_mask:0xf v_add3_u32 v3, v4, v5, v6 v_mov_b32_e32 v4, v1 s_nop 1 v_mov_b32_dpp v4, v3 row_shr:4 row_mask:0xf bank_mask:0xe v_add_u32_e32 v3, v3, v4 v_mov_b32_e32 v4, v1 s_nop 1 v_mov_b32_dpp v4, v3 row_shr:8 row_mask:0xf bank_mask:0xc v_add_u32_e32 v3, v3, v4 v_mov_b32_e32 v4, v1 s_nop 1 v_mov_b32_dpp v4, v3 row_bcast:15 row_mask:0xa bank_mask:0xf v_add_u32_e32 v3, v3, v4 s_nop 1 v_mov_b32_dpp v1, v3 row_bcast:31 row_mask:0xc bank_mask:0xf v_add_u32_e32 v1, v3, v1 v_add_u32_e32 v1, v2, v1 v_readlane_b32 s0, v1, 63 But now most of the dpp movs are combined into adds: v_mov_b32_e32 v3, v1 v_mov_b32_e32 v5, v1 s_nop 0 v_mov_b32_dpp v3, v2 wave_shr:1 row_mask:0xf bank_mask:0xf s_nop 1 v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf v_mov_b32_dpp v1, v3 row_shr:3 row_mask:0xf bank_mask:0xf v_add3_u32 v1, v4, v5, v1 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf v_add_u32_e32 v1, v2, v1 v_readlane_b32 s0, v1, 63 Reviewers: arsenm, vpykhtin Subscribers: kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, MaskRay, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64207 llvm-svn: 365211	2019-07-05 14:52:48 +00:00
Eugene Leviant	e91f86f0ac	Reverted r365188 due to alignment problems on i686-android llvm-svn: 365206	2019-07-05 13:26:05 +00:00
Graham Hunter	957c40db6a	Scalable Vector IR Type with further LTO fixes Reintroduces the scalable vector IR type from D32530, after it was reverted a couple of times due to increasing chromium LTO build times. This latest incarnation removes the walk over aggregate types from the verifier entirely, in favor of rejecting scalable vectors in the isValidElementType methods in ArrayType and StructType. This removes the 70% degradation observed with the second repro tarball from PR42210. Reviewers: thakis, hans, rengolin, sdesmalen Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D64079 llvm-svn: 365203	2019-07-05 12:48:16 +00:00
Robert Lougher	9dcfbbae76	This reverts r365061 and r365062 (test update) Revision r365061 changed a skip of debug instructions for a skip of meta instructions. This is not safe, as IMPLICIT_DEF is classed as a meta instruction. llvm-svn: 365202	2019-07-05 12:42:06 +00:00
Sam Elliott	b2c9eed0d7	[RISCV] Support @llvm.readcyclecounter() Intrinsic On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock cycles executed by the core, from an arbitrary point in the past. This matches the intended semantics of `@llvm.readcyclecounter()`, which we currently leave to the default lowering (to the constant 0). With this patch, we will now correctly lower this intrinsic to the intended semantics, using the user-space instruction `rdcycle`. On 64-bit targets, we can directly lower to this instruction. On 32-bit targets, we need to do more, as `rdcycle` only returns the low 32-bits of the `cycle` CSR. In this case, we perform a custom lowering, based on the PowerPC lowering, using `rdcycleh` to obtain the high 32-bits of the `cycle` CSR. This custom lowering inserts a new basic block which detects overflow in the high 32-bits of the `cycle` CSR during reading (because multiple instructions are required to read). The emitted assembly matches the suggested assembly in the RISC-V specification. Differential Revision: https://reviews.llvm.org/D64125 llvm-svn: 365201	2019-07-05 12:35:21 +00:00
Nico Weber	a780276301	lld, llvm-dlltool, llvm-lib: Use getAsString() instead of getSpelling() for printing unknown args Since OPT_UNKNOWN args never have any values and consist only of spelling (and are never aliased), this doesn't make any difference in practice, but it's more consistent with Arg's guidance to use getAsString() for diagnostics, and it matches what clang does. Also tweak two tests to use an unknown option that contains '=' for additional coverage while here. (The new tests pass fine with the old code too though.) llvm-svn: 365200	2019-07-05 12:31:32 +00:00
Robert Lougher	2478b62098	Revert r365198 as this accidentally committed something that should not have been added. llvm-svn: 365199	2019-07-05 12:30:45 +00:00
Robert Lougher	3bea2b15f5	This reverts r365061 and r365062 (test update) Revision r365061 changed a skip of debug instructions for a skip of meta instructions. This is not safe, as IMPLICIT_DEF is classed as a meta instruction. llvm-svn: 365198	2019-07-05 12:20:21 +00:00
Sam Elliott	6884d5e040	[RISCV][NFC] Replace hard-coded CSR duplication with symbolic references Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: MaskRay, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64139 Patch by James Clarke (jrtc27) llvm-svn: 365195	2019-07-05 12:16:40 +00:00
Thomas Preud'homme	41f2bea60c	[FileCheck] Fix comment in parseNumericVariableUse Summary: Comment explaining the interaction between parsing of numeric variable definition and uses in parseNumericVariableUse is stale since it suggests both use and definition parsing is done in the same function. This was the case in a previous version of the patch committed as `71d3f227a7` but is no longer the case. This patch updates the comment accordingly. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64227 llvm-svn: 365192	2019-07-05 12:01:12 +00:00
Thomas Preud'homme	28196a5da8	[FileCheck] Factor some parsing checks out Summary: Both callers of parseNumericVariableDefinition() perform the same extra check that no character is found after the variable name. This patch factors out this check into parseNumericVariableDefinition(). Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D64226 llvm-svn: 365191	2019-07-05 12:01:06 +00:00
Thomas Preud'homme	a188ad2653	[FileCheck] Add missing final dot in comment llvm-svn: 365190	2019-07-05 12:00:56 +00:00
Eugene Leviant	820cc01d1e	[ThinLTO] Attempt to recommit r365040 after caching fix It's possible that some function can load and store the same variable using the same constant expression: store %Derived* @foo, %Derived bitcast (%Base @bar to %Derived*) %42 = load %Derived, %Derived bitcast (%Base @bar to %Derived**) The bitcast expression was mistakenly cached while processing loads, and never examined later when processing store. This caused @bar to be mistakenly treated as read-only variable. See load-store-caching.ll. llvm-svn: 365188	2019-07-05 12:00:10 +00:00
Nico Weber	cf1a11ded2	Make joined instances of JoinedOrSeparate flags point to the unaliased args, like all other arg types do This fixes an 8-year-old regression. r105763 made it so that aliases always refer to the unaliased option – but it missed the "joined" branch of JoinedOrSeparate flags. (r162231 then made the Args classes non-virtual, and r169344 moved them from clang to llvm.) Back then, there was no JoinedOrSeparate flag that was an alias, so it wasn't observable. Now /U in CLCompatOptions is a JoinedOrSeparate alias in clang, and warn_slash_u_filename incorrectly used the aliased arg id (using the unaliased one isn't really a regression since that warning checks if the undefined macro contains slash or backslash and only then emits the warning – and no valid use will pass "-Ufoo/bar" or similar). Also, lld has many JoinedOrSeparate aliases, and due to this bug it had to explicitly call `getUnaliasedOption()` in a bunch of places, even though that shouldn't be necessary by design. After this fix in Option, these calls really don't have an effect any more, so remove them. No intended behavior change. (I accidentally fixed this bug while working on PR29106 but then wondered why the warn_slash_u_filename broke. When I figured it out, I thought it would make sense to land this in a separate commit.) Differential Revision: https://reviews.llvm.org/D64156 llvm-svn: 365186	2019-07-05 11:45:24 +00:00
George Rimar	d0921a4696	[Object/ELF.h] - Improve error reporting. The errors coming from ELF.h are usually not very useful because they are uninformative. This patch is a first step to improve the situation. I tested this patch with a run of check-llvm and found that few messages are untested. In this patch, I did not add more tests but marked all such cases with a "TODO" comment. For all tested messages I extended the error text to provide more details (see test cases changed). Differential revision: https://reviews.llvm.org/D64014 llvm-svn: 365183	2019-07-05 11:28:49 +00:00
Simon Pilgrim	8b25d9bf01	[X86][SSE] LowerINSERT_VECTOR_ELT - early out for out of range indices Fixes OSS-Fuzz #15662 llvm-svn: 365180	2019-07-05 10:34:53 +00:00
David Green	25cf705097	[ARM] MVE VMOV immediate handling This adds some handling for VMOVimm, using the same method that NEON uses. We create VMOVIMM/VMVNIMM/VMOVFPIMM nodes based on the immediate, and select them using the now renamed ARMvmovImm/etc. There is also an extra 64bit immediate mode that I have not yet added here. Code by David Sherwood Differential Revision: https://reviews.llvm.org/D63884 llvm-svn: 365178	2019-07-05 10:02:43 +00:00
David Green	bb7e97d783	[ARM] MVE fp to int conversions This adds the patterns needed for fptosi and sitofp. Differential Revision: https://reviews.llvm.org/D63729 llvm-svn: 365176	2019-07-05 09:34:30 +00:00
Fangrui Song	6fa850c4fe	[RISCV] Delete a ctor that is commented out. NFC llvm-svn: 365175	2019-07-05 08:25:14 +00:00
Craig Topper	171732aeb3	[X86] Add custom isel to select ADD/SUB/OR/XOR/AND to their non-immediate forms under optsize when the immediate has additional users. Summary: We attempt to prevent folding immediates with multiple users under optsize. But we only do this from store nodes and X86ISD::ADD/SUB/XOR/OR/AND patterns. We don't do it for ISD::ADD/SUB/XOR/OR/AND even though we count them as users when deciding whether to fold into other nodes. This leads to situations where we block folding to a compare for example, but still fold into an AND or OR as seen in PR27202. Unfortunately touching the isel patterns in tablegen for the ISD::ADD/SUB/XOR/OR/AND opcodes will cause the patterns to be unusable for fast isel. And we don't have a way to make a fast isel only pattern. To work around this, this patch adds custom isel in front of the isel table that will select the non-immediate forms if the immediate has additional users. This may create some issues for ANDN and NOT matching. And there's room for improvement with unsigned 32 immediates on 64-bit AND. This patch needs more thorough test cases, but I wanted to get feedback on the direction. Please send me any other test cases you've seen in the wild. I think we probably have the same issue with the immediate matching when we fold RMW from X86ISD::ADD/SUB/XOR/OR/AND. And our TEST immediate shrinking logic. Our cost modeling for immediates that can fit in a sign extended 8-bit immediate on a 16/32/64 bit operation is completely wrong. I also wonder if we should update the ConstantHoisting cost model and block folding for "opaque" constants. But of course constants can still be created by DAG combine and lowering optimizations. Fixes PR27202 Reviewers: spatel, RKSimon, andreadb Reviewed By: RKSimon Subscribers: jsji, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59909 llvm-svn: 365163	2019-07-04 22:53:57 +00:00
Simon Atanasyan	1e9c00308b	[mips] Refactor expandSeq and expandSeqI methods. NFC llvm-svn: 365161	2019-07-04 22:45:07 +00:00
Craig Topper	e9aed963ce	[DAGCombiner] Don't combine (addcarry (uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry) if the Carry comes from the uaddo. Summary: The uaddo won't be removed and the addcarry will still be dependent on the uaddo. So we'll just increase the use count of X and Y and potentially require a COPY. Reviewers: spatel, RKSimon, deadalnix Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64190 llvm-svn: 365149	2019-07-04 18:18:46 +00:00
Tim Renouf	5816889c74	[AMDGPU] Custom lower INSERT_SUBVECTOR v3, v4, v5, v8 Summary: Since the changes to introduce vec3 and vec5, INSERT_VECTOR for these sizes has been marked "expand", which made LegalizeDAG lower it to loads and stores via a stack slot. The code got optimized a bit later, but the now-unused stack slot was never deleted. This commit avoids that problem by custom lowering INSERT_SUBVECTOR into an EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT for each element in the subvector to insert. V2: Addressed review comments re test. Differential Revision: https://reviews.llvm.org/D63160 Change-Id: I9e3c13e36f68cfa3431bb9814851cc1f673274e1 llvm-svn: 365148	2019-07-04 17:38:24 +00:00
Sanjay Patel	75b5edf6a1	[InstCombine] allow undef elements when forming splat from chain of insertelements We allow forming a splat (broadcast) shuffle, but we were conservatively limiting that to cases where all elements of the vector are specified. It should be safe from a codegen perspective to allow undefined lanes of the vector because the expansion of a splat shuffle would become the chain of inserts again. Forming splat shuffles can reduce IR and help enable further IR transforms. Motivating bugs: https://bugs.llvm.org/show_bug.cgi?id=42174 https://bugs.llvm.org/show_bug.cgi?id=16739 Differential Revision: https://reviews.llvm.org/D63848 llvm-svn: 365147	2019-07-04 16:45:34 +00:00
Jay Foad	0cd50b2a95	Fix typos in comments and debug output. llvm-svn: 365146	2019-07-04 15:04:29 +00:00
Michael Liao	7a9ad430fe	[AMDGPU] Correct the setting of `FlatScratchInit`. Summary: - That flag setting should skip spilling stack slot. Reviewers: arsenm, rampitec Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64143 llvm-svn: 365137	2019-07-04 13:29:45 +00:00
Simon Pilgrim	555d743fcf	Fix -Wdocumentation param warning. Don't put the full stop at the end of a param name - it confuses the doxygen parser llvm-svn: 365128	2019-07-04 10:35:31 +00:00
Simon Pilgrim	fde766de4b	[X86][AVX1] Combine concat_vectors(pshufd(x,c),pshufd(y,c)) -> vpermilps(concat_vectors(x,y),c) Bitcast v4i32 to v8f32 and back again - it might be worth adding isel patterns for X86PShufd v8i32 on AVX1 targets like we did for X86Blendi to avoid the bitcasts? llvm-svn: 365125	2019-07-04 10:17:10 +00:00
Simon Pilgrim	8177673fb4	Fix MSVC "not all control paths return a value" warnings. NFCI. llvm-svn: 365119	2019-07-04 09:46:06 +00:00
Mikael Holmen	67dd39f86e	[Remarks] Silence gcc warning by catching unhandled values in switches Without this fix gcc (7.4) complains with ../lib/Remarks/RemarkParser.cpp: In function 'std::unique_ptr<llvm::remarks::ParserImpl> formatToParserImpl(llvm::remarks::ParserFormat, llvm::StringRef)': ../lib/Remarks/RemarkParser.cpp:29:1: error: control reaches end of non-void function [-Werror=return-type] } ^ ../lib/Remarks/RemarkParser.cpp: In function 'std::unique_ptr<llvm::remarks::ParserImpl> formatToParserImpl(llvm::remarks::ParserFormat, llvm::StringRef, const llvm::remarks::ParsedStringTable&)': ../lib/Remarks/RemarkParser.cpp:38:1: error: control reaches end of non-void function [-Werror=return-type] } ^ The Format enum currently only contains the value YAML which is indeed already handled in the switches, but gcc complains anyway. Adding a default case with an llvm_unreachable silences gcc. llvm-svn: 365118	2019-07-04 09:29:18 +00:00
David Green	2b20ee4110	[ARM] Favour PL/MI over GE/LT when possible The arm condition codes for GE is N==V (and for LT is N!=V). If the source of flags cannot set V (overflow), such as a cmp against #0, then we can use the simpler PL and MI conditions that only check N. As these PL/MI conditions are simpler than GE/LT, other passes like the peephole optimiser can have a better time optimising away the redundant CMPs. The exception is the VSEL instruction, which cannot take the PL code, so there the transform favours GE. Differential Revision: https://reviews.llvm.org/D64160 llvm-svn: 365117	2019-07-04 08:58:58 +00:00
David Green	d2a9ec29d0	[ARM] MVE bitwise instruction patterns This adds patterns for the simpler VAND, VORR and VEOR bitwise vector instructions. It also adjusts the top16Zero PatLeaf to not match on vector instructions, which can otherwise cause problems. Code written by David Sherwood. Differential Revision: https://reviews.llvm.org/D63867 llvm-svn: 365113	2019-07-04 08:41:23 +00:00
QingShan Zhang	63e62006cf	[NFC][PowerPC] Make the PowerPC scheduling strategy feature only control the strategy instead of the scheduler. llvm-svn: 365110	2019-07-04 07:43:51 +00:00
Craig Topper	163b8bb3f5	[X86] Use pointer sized indices instead of i32 for EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT in a couple places. Most places already did this. llvm-svn: 365109	2019-07-04 06:21:54 +00:00
Serguei Katkov	6d8813a391	[LoopPeel] Some small comment update. NFC. Follow-up change of comment after https://reviews.llvm.org/D63917 is landed. llvm-svn: 365107	2019-07-04 05:10:14 +00:00
Fangrui Song	1f333562de	[PowerPC] Support constraint code "ww" Summary: "ww" and "ws" are both constraint codes for VSX vector registers that hold scalar double data. "ww" is preferred for float while "ws" is preferred for double. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64119 llvm-svn: 365106	2019-07-04 04:44:42 +00:00
Chen Zheng	469f30abab	[PowerPC] Hardware Loop branch instruction's condition may not be icmp. This fixes pr42492. Differential Revision: https://reviews.llvm.org/D64124 llvm-svn: 365104	2019-07-04 01:51:47 +00:00
Francis Visoiu Mistrih	312f1d7d7c	[Remarks] Require an explicit format to the parser Make the parser require an explicit format. This allows new formats to be easily added by following YAML as an example. llvm-svn: 365102	2019-07-04 00:31:03 +00:00
Francis Visoiu Mistrih	e6ba313a86	[Remarks][NFC] Move the string table parsing out of the parser constructor Make the parser take an already-parsed string table. llvm-svn: 365101	2019-07-04 00:30:58 +00:00
Derek Schuff	51d3c4dfcd	[WebAssembly] Update test failure explanations llvm-svn: 365100	2019-07-04 00:24:35 +00:00
Shoaib Meenai	995798d2d5	[MachO] Add valid architecture function Added array of valid architectures and function returning array. Modified llvm-lipo to include list of valid architectures in error message for invalid arch. Patch by Anusha Basana <anusha.basana@gmail.com> Differential Revision: https://reviews.llvm.org/D63735 llvm-svn: 365099	2019-07-04 00:17:02 +00:00
Lang Hames	f5a885fddd	[JITLink][ORC] Add EHFrameRegistrar interface, use in EHFrameRegistrationPlugin. Replaces direct calls to eh-frame registration with calls to methods on an EHFrameRegistrar instance. This allows clients to substitute a registrar that registers frames in a remote process via IPC/RPC. llvm-svn: 365098	2019-07-04 00:05:12 +00:00
Reid Kleckner	f7e52fbdb5	Revert [ThinLTO] Optimize writeonly globals out This reverts r365040 (git commit `5cacb91475`) Speculatively reverting, since this appears to have broken check-lld on Linux. Partial analysis in https://crbug.com/981168. llvm-svn: 365097	2019-07-04 00:03:30 +00:00
Derek Schuff	ec4be57655	[WebAssembly] Enable IndirectBrExpandPass Wasm doesn't have a direct way to lower indirectbr, so hook up the IndirectBrExpandPass to lower indirectbr into a switch. Fixes PR42498 Reviewers: aheejin Differential Revision: https://reviews.llvm.org/D64161 llvm-svn: 365096	2019-07-03 23:54:06 +00:00
Matt Arsenault	5b0922fe1f	AMDGPU: Add pass to lower SGPR spills This is split out from my patches to split register allocation into a separate SGPR and VGPR phase, and has some parts that aren't yet used (like maintaining LiveIntervals). This simplifies making the frame pointer register callee saved. As it is now, the code to determine callee saves needs to predict all the possible SGPR spills and how many callee saved VGPRs are needed. By handling this before PrologEpilogInserter, it's possible to just check the spill objects that already exist. Change-Id: I29e6df4034afcf949e06f8ef44206acb94696f04 llvm-svn: 365095	2019-07-03 23:32:29 +00:00
Eli Friedman	41ee3977c4	[JumpThreading] Fix threading with unusual PHI nodes. If the block being cloned contains a PHI node, in general, we need to clone that PHI node, even though it's trivial. If the operand of the PHI is an instruction in the block being cloned, the correct value for the operand doesn't exist until SSAUpdater constructs it. We usually don't hit this issue because we try to avoid threading across loop headers, but it's possible to hit this in some cases involving irreducible CFGs. I added a flag to allow threading across loop headers to make the testcase easier to understand. Thanks to Brian Rzycki for reducing the testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=42085. Differential Revision: https://reviews.llvm.org/D63913 llvm-svn: 365094	2019-07-03 23:12:39 +00:00
Matt Arsenault	43cbca50e4	GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUES llvm-svn: 365093	2019-07-03 23:08:06 +00:00
Francis Visoiu Mistrih	e0308279cb	[Bitcode] Move Bitstream to a separate library This moves Bitcode/Bitstream*, Bitcode/BitCodes.h to Bitstream/. This is needed to avoid a circular dependency when using the bitstream code for parsing optimization remarks. Since Bitcode uses Core for the IR part: libLLVMRemarks -> Bitcode -> Core and Core uses libLLVMRemarks to generate remarks (see IR/RemarkStreamer.cpp): Core -> libLLVMRemarks we need to separate the Bitstream and Bitcode part. For clang-doc, it seems that it doesn't need the whole bitcode layer, so I updated the CMake to only use the bitstream part. Differential Revision: https://reviews.llvm.org/D63899 llvm-svn: 365091	2019-07-03 22:40:07 +00:00
Matt Arsenault	c96c174557	Revert "[AMDGPU] Kernel arg metadata: added support for "__hip_texture" type." This reverts commit r365073. This is crashing, and is improperly relying on IR type names. llvm-svn: 365087	2019-07-03 21:34:34 +00:00
Evgeniy Stepanov	50dc28b556	Teach ValueTracking that aarch64.irg result aliases its input. Reviewers: javed.absar, olista01 Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64103 llvm-svn: 365079	2019-07-03 20:19:14 +00:00
Philip Reames	ea06d63c35	[LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation. llvm-svn: 365075	2019-07-03 20:03:46 +00:00
Konstantin Pyzhov	6f419a3370	[AMDGPU] Kernel arg metadata: added support for "__hip_texture" type. Summary: Hip texture type is equivalent to OpenCL image. So, we need to set the Image type for kernel arguments with __hip_texture type. Differential revision: https://reviews.llvm.org/D63850 llvm-svn: 365073	2019-07-03 19:11:35 +00:00
Philip Reames	14f1543425	[LFTR] Remove a stray variable shadow of the same value [NFC] llvm-svn: 365072	2019-07-03 19:08:43 +00:00
Philip Reames	e7a258c6d9	[LFTR] Style and comment changes to clarify the narrow vs wide bitwidth evaluation behavior [NFC] llvm-svn: 365071	2019-07-03 19:03:37 +00:00
Philip Reames	abc8f344d6	[LFTR] Sink the decision not use truncate scheme for constants into genLoopLimit [NFC] We might as well just evaluate the constants using SCEV, and having the cases grouped makes the logic slightly easier to read anyway. llvm-svn: 365070	2019-07-03 18:41:03 +00:00
Jessica Paquette	6584109389	Fix precedence in assert from r364961 Precedence was wrong in an assert added in r364961. Add braces around the assertion condition to make it right. See: https://reviews.llvm.org/D64084 llvm-svn: 365069	2019-07-03 18:30:01 +00:00
Philip Reames	4c80281c96	[LFTR] Remove falsely generalized (dead) code [NFC] llvm-svn: 365067	2019-07-03 18:24:06 +00:00
Philip Reames	83cca94194	[LFTR] Hoist extend expressions outside of loops w/o waiting for LICM The motivation for this is twofold: 1) Make the output (and thus tests) a bit more readable to a human trying to understand the result of the transform 2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default) llvm-svn: 365066	2019-07-03 18:18:36 +00:00
Jessica Paquette	a99cfeea44	[GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for selectArithImmed Instead of just stopping to see if we have a G_CONSTANT, look through G_TRUNCs, G_SEXTs, and G_ZEXTs. This gives an average ~1.3% code size improvement on CINT2000 at -O3. Differential Revision: https://reviews.llvm.org/D64108 llvm-svn: 365063	2019-07-03 17:46:23 +00:00
Robert Lougher	720baf0416	[X86] Avoid SFB - Skip meta instructions This patch generalizes the fix in D61680 to ignore all meta instructions, not just debug info. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D62605 llvm-svn: 365061	2019-07-03 17:43:55 +00:00
Francis Visoiu Mistrih	83bbe2f418	[CodeGen] Make branch funnels pass the machine verifier We previously marked all the tests with branch funnels as `-verify-machineinstrs=0`. This is an attempt to fix it. 1) `ICALL_BRANCH_FUNNEL` has no defs. Mark it as `let OutOperandList = (outs)` 2) After that we hit an assert: ``` Assertion failed: (Op.getValueType() != MVT::Other && Op.getValueType() != MVT::Glue && "Chain and glue operands should occur at end of operand list!"), function AddOperand, file /Users/francisvm/llvm/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp, line 461. ``` The chain operand was added at the beginning of the operand list. Move that to the end. 3) After that we hit another verifier issue in the pseudo expansion where the registers used in the cmps and jmps are not added to the livein lists. Add the `EFLAGS` to all the new MBBs that we create. PR39436 Differential Review: https://reviews.llvm.org/D54155 llvm-svn: 365058	2019-07-03 17:16:45 +00:00
Simon Pilgrim	26812c7675	[X86] ComputeNumSignBitsForTargetNode - add target shuffle support. llvm-svn: 365057	2019-07-03 17:06:59 +00:00
Amaury Sechet	57dfacb32d	Use getAllOnesConstants instead of -1 in DAGCombiner. NFC llvm-svn: 365054	2019-07-03 16:34:36 +00:00
Philip Reames	39e7a97ad7	[SCEV] Preserve flags on add/muls in getSCEVATScope We haven't changed the set of users, just specialized an operand for those users. Given that, the previous wrap flags must still be correct. Sorry for the lack of test case. Noticed this while working on something else, and haven't figured out how to exercise this standalone. llvm-svn: 365053	2019-07-03 16:34:08 +00:00
Amaury Sechet	bddb8c3597	[DAGCombine] More diamond carry pattern optimization. Summary: This diff improves the capability of DAGCombine to generate linear carry propagation in the presence of a diamond pattern. It is now able to match a large variety of different patterns rather than some hardcoded ones. Arguably, the codegen in test cases is not better, but this is to be expected. The goal of this transformation is more about canonicalisation than actual optimisation. Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57302 llvm-svn: 365051	2019-07-03 16:15:59 +00:00
Simon Pilgrim	783dbe402f	[X86][AVX] combineX86ShufflesRecursively - peek through extract_subvector If we have more than 2 shuffle ops to combine, try to use combineX86ShuffleChainWithExtract to see if some are from the same super vector. llvm-svn: 365050	2019-07-03 15:46:08 +00:00
Sam Parker	6005681ac6	[ARM] Fix for NDEBUG builds Fix unused variable warning as well as a nonsense assert. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 365046	2019-07-03 14:39:23 +00:00
Simon Pilgrim	868d0b7fd9	[X86][AVX] Combine vpermi(bitcast(x)) -> bitcast(vpermi(x)) iff the number of elements doesn't change. This gets around an issue with combineX86ShuffleChain not being able to hint which domain is preferred for shuffles that can be done with either. Fixes regression introduced in rL365041 llvm-svn: 365044	2019-07-03 14:34:16 +00:00
James Molloy	fa4aac7335	[SelectionDAG] Propagate alias metadata to target intrinsic nodes When a target intrinsic has been determined to touch memory, we construct a MachineMemOperand during SDAG construction. In this case, we should propagate AAMDNodes metadata to the MachineMemOperand where available. Differential revision: https://reviews.llvm.org/D64131 llvm-svn: 365043	2019-07-03 14:33:29 +00:00
Simon Pilgrim	0c230209fe	[X86][AVX] combineX86ShuffleChainWithExtract - add number of non-zero extract_subvectors to the combine depth This better accounts for the cost/benefit of removing extract_subvectors from the shuffle and will be more useful in future patches. The vpermq predicate regression will be fixed shortly. llvm-svn: 365041	2019-07-03 14:17:21 +00:00
Eugene Leviant	5cacb91475	[ThinLTO] Optimize writeonly globals out Differential revision: https://reviews.llvm.org/D63444 llvm-svn: 365040	2019-07-03 14:14:52 +00:00
Simon Atanasyan	a10bf0939d	[mips] Mark general scheduling model as complete llvm-svn: 365034	2019-07-03 12:28:05 +00:00
Simon Atanasyan	4d364659f9	[mips] Add missing atomic instructions to general scheduling definitions llvm-svn: 365033	2019-07-03 12:27:58 +00:00
Simon Atanasyan	3e4c7eb33e	[mips] Add missing microMIPS instructions to general scheduling definitions llvm-svn: 365032	2019-07-03 12:27:51 +00:00
Simon Pilgrim	8c099cbe7c	[X86][SSE] lowerUINT_TO_FP_v2i32 - explicitly cast half word to double Fixes MSVC analyzer extension->double warning. llvm-svn: 365027	2019-07-03 11:23:27 +00:00
Simon Pilgrim	8df90b843d	[X86][SSE] LowerINSERT_VECTOR_ELT - ensure insertion index correctness. NFCI. Assert that the insertion index is in range and use uint64_t for the index to fix MSVC/cppcheck truncation warning. llvm-svn: 365025	2019-07-03 10:59:52 +00:00
Simon Pilgrim	8853bd9592	[X86][SSE] LowerScalarImmediateShift - ensure shift amount correctness. NFCI. Assert that the shift amount is in range and create vXi8 shift masks in a way that doesn't cause MSVC/cppcheck shift result is truncated then extended warnings. llvm-svn: 365024	2019-07-03 10:47:33 +00:00
Simon Atanasyan	3e41b97f14	[mips] Add SIGRIE,GINVI,GINVT to general scheduling definitions llvm-svn: 365023	2019-07-03 10:33:16 +00:00
Simon Atanasyan	dc3c67bbe2	[mips] Add missing mips16 instructions to general scheduling definitions llvm-svn: 365022	2019-07-03 10:33:09 +00:00
Simon Atanasyan	b04f6a1a25	[mips] Add missing MSA and ASE instructions to general scheduling definitions llvm-svn: 365021	2019-07-03 10:33:01 +00:00
Simon Atanasyan	e5dfbe83b6	[mips] Replace some itineraries by instructions in the general scheduling definitions llvm-svn: 365020	2019-07-03 10:32:54 +00:00
Simon Pilgrim	64e3a51534	Fix uninitialized variable warnings. NFCI. Both MSVC and cppcheck don't like the fact that the variables are initialized via references. llvm-svn: 365018	2019-07-03 10:22:08 +00:00
Simon Pilgrim	7b7b9b78a2	[X86] LowerFunnelShift - use modulo constant shift amount. This avoids the use of getZExtValue and uses the modulo shift amount, which is what's expected for funnel shifts anyhow. llvm-svn: 365016	2019-07-03 10:04:16 +00:00
Oliver Stannard	830b20344b	[ARM] Thumb2: favor R4-R7 over R12/LR in allocation order when opt for minsize For Thumb2, we prefer low regs (costPerUse = 0) to allow narrow encoding. However, the current allocation order is: R0-R3, R12, LR, R4-R11. As a result, a lot of instructions that use R12/LR will be wide instrs. This patch changes the allocation order to: R0-R7, R12, LR, R8-R11 for thumb2 and -Osize. In most cases, there is no extra push/pop instrs as they will be folded into existing ones. There might be slight performance impact due to more stack usage, so we only enable it when opt for min size. https://reviews.llvm.org/D30324 llvm-svn: 365014	2019-07-03 09:58:52 +00:00
Sven van Haastregt	1bc2cccf18	Remove some autoconf references from docs and comments The autoconf build system support was removed a while ago; remove some outdated references. Differential Revision: https://reviews.llvm.org/D63608 llvm-svn: 365013	2019-07-03 09:57:59 +00:00
Roman Lebedev	9f0c83902d	[InstCombine] Y - ~X --> X + Y + 1 fold (PR42457) Summary: I think we'd want this new variant, because we obviously have better handling for `add` as compared to `sub`/`not`. https://rise4fun.com/Alive/WMn Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]] Reviewers: spatel, nikic, huihuiz, efriedma Reviewed By: spatel Subscribers: RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63992 llvm-svn: 365011	2019-07-03 09:41:50 +00:00
Roman Lebedev	c4b83a6054	[Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457) Summary: This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]]. In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which I added tests, all seem to be ok with inc-of-add for scalars, but only X86 prefers that same pattern for vectors. Here I'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010	2019-07-03 09:41:35 +00:00
Eugene Leviant	ac407a7b4a	[SCEV][LSR] Prevent using undefined value in binops On some occasions ReuseOrCreateCast may convert previously expanded value to undefined. That value may be passed by SCEVExpander as an argument to InsertBinop making IV chain undefined. Differential revision: https://reviews.llvm.org/D63928 llvm-svn: 365009	2019-07-03 09:36:32 +00:00
Alexander Potapenko	f82672873a	MSan: handle callbr instructions Summary: Handling callbr is very similar to handling an inline assembly call: MSan must check the instruction's inputs. callbr doesn't (yet) have outputs, so there's nothing to unpoison, and conservative assembly handling doesn't apply either. Fixes PR42479. Reviewers: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64072 llvm-svn: 365008	2019-07-03 09:28:50 +00:00
Serguei Katkov	c22e772a28	[LoopPeel] Re-factor llvm::peelLoop method. NFC. Extract code dealing with branch weights in separate functions. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames, fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D63917 llvm-svn: 365002	2019-07-03 05:59:23 +00:00
Jordan Rupprecht	02647f73d4	Revert [InlineCost] cleanup calculations of Cost and Threshold This reverts r364422 (git commit `1a3dc76186`) The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining. llvm-svn: 365000	2019-07-03 04:01:51 +00:00
Michael Liao	80177ca5a9	[AMDGPU] Enable serializing of argument info. Summary: - Support serialization of all arguments in machine function info. This enables fabricating MIR tests depending on argument info. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64096 llvm-svn: 364995	2019-07-03 02:00:21 +00:00
Amara Emerson	cac1151845	[AArch64][GlobalISel] Overhaul legalization & isel of shifts to select immediate forms. There are two main issues preventing us from generating immediate form shifts: 1) We have partial SelectionDAG imported support for G_ASHR and G_LSHR shift immediate forms, but they currently don't work because the amount type is expected to be an s64 constant, but we only legalize them to have homogeneous types. To deal with this, first we introduce a custom legalizer to only custom legalize s32 shifts which have a constant operand into a s64. There is also an additional artifact combiner to fold zexts(g_constant) to a larger G_CONSTANT if it's legal, a counterpart to the anyext version committed in an earlier patch. 2) For G_SHL the importer can't cope with the pattern. For this I introduced an early selection phase in the arm64 selector to select these forms manually before the tablegen selector pessimizes it to a register-register variant. Differential Revision: https://reviews.llvm.org/D63910 llvm-svn: 364994	2019-07-03 01:49:06 +00:00
Chen Zheng	dfdccbb26b	[PowerPC] exclude ICmpZero in LSR if icmp can be replaced in later hardware loop. Differential Revision: https://reviews.llvm.org/D63477 llvm-svn: 364993	2019-07-03 01:49:03 +00:00
Alex Lorenz	3dbdbbec84	[triple] Use 'macabi' environment name for the Mac Catalyst triples The 'macabi' environment name is preferred instead of 'maccatalyst'. llvm-svn: 364988	2019-07-03 01:02:43 +00:00
Nilanjana Basu	c0b557744a	Revert Changing CodeView debug info type record representation in assembly files to make it more human-readable & editable This reverts r364982 (git commit `2082bf28eb`) llvm-svn: 364987	2019-07-03 00:51:49 +00:00
Guanzhong Chen	b88ebe8cc9	[WebAssembly] Prevent inline assembly from being mangled by SjLj Summary: Before, inline assembly gets mangled by the SjLj transformation. For example, in a function with setjmp/longjmp, this LLVM IR code call void asm sideeffect "", ""() would be transformed into call void @__invoke_void(void ()* asm sideeffect "", "") This is invalid, and results in the error: Cannot take the address of an inline asm! In this diff, we skip the transformation for inline assembly. Reviewers: aheejin, tlively Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64115 llvm-svn: 364985	2019-07-03 00:37:49 +00:00
Matt Arsenault	c04aab9c06	AMDGPU: Look through bundles for existing waitcnts These aren't produced now, but will be in a future patch. llvm-svn: 364983	2019-07-03 00:30:44 +00:00
Nilanjana Basu	2082bf28eb	Changing CodeView debug info type record representation in assembly files to make it more human-readable & editable llvm-svn: 364982	2019-07-03 00:26:23 +00:00
Alex Lorenz	da1dfecd32	Add support for the 'macCatalyst' MachO platform Mac Catalyst is a new MachO platform in macOS Catalina. It always uses the build_version MachO load command. Differential Revision: https://reviews.llvm.org/D64107 llvm-svn: 364981	2019-07-02 23:47:11 +00:00
Craig Topper	b770d2c9d4	[X86] Add a DAG combine for turning *_extend_vector_inreg+load into an appropriate extload if the load isn't volatile. Remove the corresponding isel patterns that did the same thing without checking for volatile. This fixes another variation of PR42079 llvm-svn: 364977	2019-07-02 23:20:03 +00:00
Alex Lorenz	31dee6d6ed	[triple] add 'macCatalyst' environment type Mac Catalyst is a new deployment platform in macOS Catalina. Differential Revision: https://reviews.llvm.org/D64097 llvm-svn: 364971	2019-07-02 21:37:00 +00:00
Eli Friedman	e97aa961d3	[ARM] Fix unwind info for Thumb1 functions that save high registers. There were two issues here: one, some of the relevant instructions were missing the expected "FrameSetup" flag, and two, ARMAsmPrinter::EmitUnwindingInstruction wasn't expecting "mov" instructions in the prologue. I'm sticking the additional state into ARMFunctionInfo so it's obvious it only applies to the current function. I considered a few alternative approaches where we would compute the correct unwind information as part of the prologue/epilogue lowering, but it seems like a lot of work to introduce pseudo-instructions, and the current code seems to be reliable enough. Fixes https://bugs.llvm.org/show_bug.cgi?id=42408. Differential Revision: https://reviews.llvm.org/D63964 llvm-svn: 364970	2019-07-02 21:35:15 +00:00
David Bolvansky	10ee3ac396	[NFC] Strengthen isInteger condition for rL364940 llvm-svn: 364969	2019-07-02 21:16:34 +00:00
Teresa Johnson	5b868285ba	[ThinLTO] Address post-review suggestions for index-based WPD summary Removes a couple of unnecessary and/or redundant checks introduced by r364960. llvm-svn: 364968	2019-07-02 21:07:45 +00:00
Vasileios Porpodas	cf47ff5ffb	[SLP] Recommit: Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364964	2019-07-02 20:20:28 +00:00
Jessica Paquette	99316043bb	[AArch64][GlobalISel] Teach tryOptSelect to handle G_ICMP This teaches `tryOptSelect` to handle folding G_ICMP, and removes the requirement that the G_SELECT we're dealing with is floating point. Some refactoring to make this work nicely as well: - Factor out the scalar case from the selection code for G_ICMP into `emitIntegerCompare`. - Make `tryOptCMN` return a MachineInstr* instead of a bool. - Make `tryOptCMN` not modify the instruction being selected. - Factor out the CMN emission into `emitCMN` for readability. By doing this this way, we can get all of the compare selection optimizations in select emission. Differential Revision: https://reviews.llvm.org/D64084 llvm-svn: 364961	2019-07-02 19:44:16 +00:00
Teresa Johnson	a700436323	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The follow-on index-based WPD patch is D55153. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 364960	2019-07-02 19:38:02 +00:00
Matt Arsenault	5fe851b6cd	AMDGPU: Custom lower vector_shuffle for v4i16/v4f16 Ordinarily it is lowered as a build_vector of each extract_vector_elt, which in turn get lowered to bitcasts and bit shifts. Very little understands the lowered extract pattern, resulting in much worse code. We treat concat_vectors of v2i16 as legal, so prefer that. llvm-svn: 364959	2019-07-02 19:15:45 +00:00
Teresa Johnson	e6768d613a	[RA] Fix spelling of Greedy register allocator internal option The internal option added with r323870 has a typo. It isn't being used by any tests, but I decided to fix the spelling and leave it in for use in debugging the changes added in that patch. llvm-svn: 364958	2019-07-02 18:54:03 +00:00
Erik Pilkington	eee944e7f9	[C++2a] Add __builtin_bit_cast, used to implement std::bit_cast This commit adds a new builtin, __builtin_bit_cast(T, v), which performs a bit_cast from a value v to a type T. This expression can be evaluated at compile time under specific circumstances. The compile time evaluation currently doesn't support bit-fields, but I'm planning on fixing this in a follow up (some of the logic for figuring this out is in CodeGen). I'm also planning follow-ups for supporting some more esoteric types that the constexpr evaluator supports, as well as extending __builtin_memcpy constexpr evaluation to use the same infrastructure. rdar://44987528 Differential revision: https://reviews.llvm.org/D62825 llvm-svn: 364954	2019-07-02 18:28:13 +00:00
Simon Pilgrim	5613874947	[X86] getTargetConstantBitsFromNode - remove unnecessary getZExtValue() (PR42486) Don't use APInt::getZExtValue() if you can avoid it - eventually someone will call it with i128 or something that doesn't fit into 64-bits. In this case it was completely superfluous as we'd moved the rest of the code to always use APInt. Fixes the <1 x i128> addition bug in PR42486 llvm-svn: 364953	2019-07-02 18:20:38 +00:00
Alexander Timofeev	66ac6b409d	[AMDGPU] LCSSA pass added in preISel. Fixing typo in previous commit llvm-svn: 364952	2019-07-02 18:16:42 +00:00
Alexander Timofeev	2ce560f029	[AMDGPU] LCSSA pass added in preISel. Uniform values defined in the divergent loop and used outside Differential Revision: https://reviews.llvm.org/D63953 Reviewers: rampitec, nhaehnle, arsenm llvm-svn: 364950	2019-07-02 17:59:44 +00:00
Craig Topper	cffbaa93b7	[X86] Add patterns to select (scalar_to_vector (loadf32)) as (V)MOVSSrm instead of COPY_TO_REGCLASS + (V)MOVSSrm_alt. Similar for (V)MOVSD. Ultimately, I'd like to see about folding scalar_to_vector+load to vzload. Which would select as (V)MOVSSrm so this is closer to that. llvm-svn: 364948	2019-07-02 17:51:02 +00:00
David Bolvansky	cb1a5a705c	[SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n) Summary: Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190 Reviewers: spatel, nikic, efriedma Reviewed By: efriedma Subscribers: efriedma, nikic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63038 llvm-svn: 364940	2019-07-02 15:58:45 +00:00
Serge Guelton	4137aeb4bf	Provide basic Full LTO extension points Differential Revision: https://reviews.llvm.org/D61738 llvm-svn: 364937	2019-07-02 15:52:39 +00:00
Sam McCall	edf904efff	getMainExecutable: handle realpath() failure, falling back to getprogpath(). Summary: Previously, we'd pass a nullptr to std::string and crash(). This case happens when the binary is deleted while being used (e.g. rebuilding clangd). Reviewers: kadircet Subscribers: ilya-biryukov, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64068 llvm-svn: 364936	2019-07-02 15:42:37 +00:00
Matt Arsenault	50be3481d4	AMDGPU/GlobalISel: Try generated matcher with intrinsics llvm-svn: 364933	2019-07-02 14:52:16 +00:00
Matt Arsenault	a8bff4b963	AMDGPU/GlobalISel: Select mul llvm-svn: 364932	2019-07-02 14:52:14 +00:00
Matt Arsenault	70a4d3f67c	AMDGPU/GlobalISel: Fix G_GEP with mixed SGPR/VGPR operands The register bank for the destination of the sample argument copy was wrong. We shouldn't be constraining each source to the result register bank. Allow constraining the original register to the right size. llvm-svn: 364928	2019-07-02 14:40:22 +00:00
Matt Arsenault	ed63399244	AMDGPU/GlobalISel: Select G_FENCE Manually select to workaround tablegen emitter emitting checks for G_CONSTANT. llvm-svn: 364927	2019-07-02 14:17:38 +00:00

124953 Commits