llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	5e8e89d814	TruncInstCombine.cpp - use auto * to fix llvm-qualified-auto clang-tidy warning. NFCI.	2020-10-02 17:25:11 +01:00
Vinay Madhusudan	f192594956	[AArch64] Generate dot for v16i8 sum reduction to i32 Convert VECREDUCE_ADD( EXTEND(v16i8_type) ) to VECREDUCE_ADD( DOTv16i8(v16i8_type) ) whenever the result type is i32. This gains in one of the SPECCPU 2017 benchmark. This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888 Meta ticket: https://bugs.llvm.org/show_bug.cgi?id=46929 Differential Revision: https://reviews.llvm.org/D88577	2020-10-02 17:11:02 +01:00
Utkarsh Saxena	db2a646c5f	[clangd] Add bencmark for measuring latency of DecisionForest model. Differential Revision: https://reviews.llvm.org/D88590	2020-10-02 18:04:31 +02:00
Diego Caballero	a611f9a5c6	[mlir] Fix call op conversion in bare-ptr calling convention We hit an llvm_unreachable related to unranked memrefs for call ops with scalar types. Removing the llvm_unreachable since the conversion should gracefully bail out in the presence of unranked memrefs. Adding tests to verify that. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D88709	2020-10-02 08:48:21 -07:00
Nicolas Vasilache	86b14d0969	[mlir] Attempt to appease gcc-5 const char* -> StringLiteral conversion issu	2020-10-02 10:53:48 -04:00
serge-sans-paille	f2c6bfa350	Fix interaction between stack alignment and inline-asm stack clash protection As reported in https://github.com/rust-lang/rust/issues/70143 alignment is not taken into account when doing the probing. Fix that by adjusting the first probe if the stack align is small, or by extending the dynamic probing if the alignment is large. Differential Revision: https://reviews.llvm.org/D84419	2020-10-02 16:51:49 +02:00
Denis Antrushin	7b19cd06d7	[Statepoints][ISEL] visitGCRelocate: chain to current DAG root. This is similar to D87251, but for CopyFromRegs nodes. Even for local statepoint uses we generate CopyToRegs/CopyFromRegs nodes. When generating CopyFromRegs in visitGCRelocate, we must chain to current DAG root, not EntryNode, to ensure proper ordering of copy w.r.t. statepoint node producing result for it. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D88639	2020-10-02 21:41:22 +07:00
Yaxun (Sam) Liu	c87c017a4c	Fix failure in test hip-macros.hip requires amdgpu-registered-target.	2020-10-02 10:33:32 -04:00
Kamil Rytarowski	2a9ce60de9	[compiler-rt] [netbsd] Improve the portability of ThreadSelfTlsTcb Use __lwp_gettcb_fast() and __lwp_getprivate_fast(), as _lwp_getprivate() can be a biased pointer and invalid for use in this function on all CPUs.	2020-10-02 16:32:58 +02:00
LLVM GN Syncbot	d9e3972080	[gn build] Port `0c1bb4f885`	2020-10-02 14:24:01 +00:00
Paul C. Anagnostopoulos	0c1bb4f885	[TableGen] New backend to print detailed records. Pertinent lints are fixed.	2020-10-02 10:22:13 -04:00
Yaxun (Sam) Liu	36501b180a	Emit predefined macro for wavefront size for amdgcn Also fix the issue of multiple -m[no-]wavefrontsize64 options to make the last one wins. Differential Revision: https://reviews.llvm.org/D88370	2020-10-02 10:17:21 -04:00
Haojian Wu	0f0cbcc4b1	[clangd] Extend the rename API. several changes: - return a structure result in rename API; - prepareRename now returns more information (main-file occurrences); - remove the duplicated detecting-touch-identifier code in prepareRename (which is implemented in rename API); Differential Revision: https://reviews.llvm.org/D88634	2020-10-02 16:03:44 +02:00
Simon Pilgrim	fa59135bf1	[Analysis] Drop local maxAPInt/minAPInt helpers. NFCI. Use standard APIntOps::smax/smin helpers instead.	2020-10-02 14:56:12 +01:00
Alexandre Ganea	fe1f0a1a19	[LLD] Fix /time formatting for very long runs. NFC.	2020-10-02 09:53:43 -04:00
Alexandre Ganea	55b97a6d2a	[LLD][COFF] Add more type record information to /summary This adds the following two new lines to /summary: 21351 Input OBJ files (expanded from all cmd-line inputs) 61 PDB type server dependencies 38 Precomp OBJ dependencies 1420669231 Input type records <<<< 78665073382 Input type records bytes <<<< 8801393 Merged TPI records 3177158 Merged IPI records 59194 Output PDB strings 71576766 Global symbol records 25416935 Module symbol records 2103431 Public symbol records Differential Revision: https://reviews.llvm.org/D88703	2020-10-02 09:36:11 -04:00
Louis Dionne	c7d4aa711a	[libc++] Move the weak symbols list to libc++abi Those symbols are exported from libc++abi in the first place, so it makes more sense to have them there.	2020-10-02 09:22:23 -04:00
Simon Pilgrim	4edd74a198	BlockFrequencyInfoImpl.h - use const references to avoid FrequencyData copies. NFCI.	2020-10-02 13:56:30 +01:00
Simon Pilgrim	71b89b1493	LoopAccessAnalysis.cpp - use const reference in for-range loops. NFCI.	2020-10-02 13:56:30 +01:00
Florian Hahn	bb448a2483	[SLP] Add test where reduction result is used in PHI. Test case for PR47670.	2020-10-02 13:38:53 +01:00
Simon Pilgrim	53fb9d062b	[InstCombine] Add partial bswap vector test from D88578	2020-10-02 13:19:19 +01:00
Sjoerd Meijer	8825fec37e	[AArch64] Add CPU Cortex-R82 This adds support for -mcpu=cortex-r82. Some more information about this core can be found here: https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82 One note about the system register: that is a bit of a refactoring because of small differences between v8.4-A AArch64 and v8-R AArch64. This is based on patches from Mark Murray and Mikhail Maltsev. Differential Revision: https://reviews.llvm.org/D88660	2020-10-02 12:47:23 +01:00
Sam McCall	57ac47d788	[clangd] Make PopulateSwitch a fix. It fixes the -Wswitch warning, though we mark it as a fix even if that is off. This makes it the "recommended" action on an incomplete switch, which seems OK. Differential Revision: https://reviews.llvm.org/D88726	2020-10-02 13:24:24 +02:00
Florian Hahn	6481a76495	[PhaseOrdering] Add test that requires peeling before vectorization. Test case for PR47671.	2020-10-02 12:19:22 +01:00
Serguei Katkov	8ae1369f79	[GVN LoadPRE] Add test to show an opportunty. We can use context to prove that load can be safely executed at a point where load is being hoisted.	2020-10-02 17:53:37 +07:00
George Mitenkov	d4568ed743	[MLIR][LLVM] Fixed `topologicalSort()` to iterative version Instead of recursive helper method `topologicalSortImpl()`, sort's implementation is moved to `topologicalSort()` function's body directly. `llvm::ReversePostOrderTraversal` is used to create a traversal of blocks in reverse post order. Reviewed By: kiranchandramohan, rriddle Differential Revision: https://reviews.llvm.org/D88544	2020-10-02 13:48:27 +03:00
Nicolas Vasilache	cf9503c1b7	[mlir] Add subtensor_insert operation Differential revision: https://reviews.llvm.org/D88657	2020-10-02 06:32:31 -04:00
Kadir Cetinkaya	54c03d8f7d	[clangd][lit] Update document-link.test to respect custom resource-dir locations Differential Revision: https://reviews.llvm.org/D88721	2020-10-02 12:24:06 +02:00
Simon Pilgrim	ec07ae2a83	[InstCombine] Add some basic vector bswap tests We get the vNi16 cases already via matching as a rotate followed by the fshl -> bswap combines	2020-10-02 11:08:12 +01:00
Nicolas Vasilache	787bf5e383	[mlir] Add canonicalization for the `subtensor` op Differential revision: https://reviews.llvm.org/D88656	2020-10-02 06:05:52 -04:00
Nicolas Vasilache	e3de249a4c	[mlir] Add a subtensor operation This revision introduces a `subtensor` op, which is the counterpart of `subview` for a tensor operand. This also refactors the relevant pieces to allow reusing the `subview` implementation where appropriate. This operation will be used to implement tiling for Linalg on tensors.	2020-10-02 05:35:30 -04:00
Simon Pilgrim	670e60c023	[InstCombine] Add partial bswap test from D88578	2020-10-02 10:34:30 +01:00
Meera Nakrani	f7c0e2b8f2	[ARM] Prevent constants from iCmp instruction from being hoisted if part of a min(max()) pattern Marks constants of an ICmp instruction as free if it's only user is a select instruction that is part of a min(max()) pattern. Ensures that in loops, in particular when loop unrolling is turned on, SSAT will still be correctly generated. Differential Revision: https://reviews.llvm.org/D88662	2020-10-02 09:28:35 +00:00
Hsiangkai Wang	067add7b5f	[RISCV] Support vmsge.vx and vmsgeu.vx pseudo instructions in RVV. Implement vmsge{u}.vx pseudo instruction. According to RISC-V V specification, there are different scenarios for this pseudo instruction. I list them below. unmasked va >= x pseudoinstruction: vmsge{u}.vx vd, va, x expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd masked va >= x, vd != v0 pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0 masked va >= x, vd == v0 pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt expansion: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt Use pseudo instruction to model vmsge{u}.vx. The pseudo instruction will convert to different expansion according to the condition. Differential Revision: https://reviews.llvm.org/D84732	2020-10-02 17:20:34 +08:00
Sam McCall	17747d2ec8	[clangd] Remove Tweak::Intent, use CodeAction kind directly. NFC Intent was a nice idea but it ends up being a bit awkward/heavyweight without adding much. In particular, it makes it hard to implement `CodeActionParams.only` properly (there's an inheritance hierarchy for kinds). Differential Revision: https://reviews.llvm.org/D88427	2020-10-02 11:14:23 +02:00
serge-sans-paille	9573c9f2a3	Fix limit behavior of dynamic alloca When the allocation size is 0, we shouldn't probe. Within [1, PAGE_SIZE], we should probe once etc. This fixes https://bugs.llvm.org/show_bug.cgi?id=47657 Differential Revision: https://reviews.llvm.org/D88548	2020-10-02 11:10:02 +02:00
Georgii Rymar	5829dc9250	[yaml2obj][elf2yaml] - Add a support for the `EntSize` field for `SHT_HASH` sections. Specification for SHT_HASH table says (https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html#hash) that it contains Elf32_Word entries for both 32/64 bit objects. Currently both GNU linkers and LLD sets the `sh_entsize` field to `4`. At the same time, `yaml2obj` ignores the `EntSize` field for SHT_HASH sections. This patch fixes this and also adds a support for obj2yaml: it will not dump this field when the `sh_entsize` contains the default value (`4`). Differential revision: https://reviews.llvm.org/D88652	2020-10-02 12:01:50 +03:00
Tres Popp	bfd7ee92cc	Handle unused variable without asserts	2020-10-02 10:22:55 +02:00
Sam McCall	bc18d8d9b7	[clangd] Drop dependence on standard library in check.test	2020-10-02 09:53:06 +02:00
Thomas Lively	542523a61a	[WebAssembly] Emulate v128.const efficiently v128.const was recently implemented in V8, but until it rolls into Chrome stable, we can't enable it in the WebAssembly backend without breaking origin trial users. So far we have been lowering build_vectors that would otherwise have been lowered to v128.const to splats followed by sequences of replace_lane instructions to initialize each lane individually. That produces large and inefficient code, so this patch introduces new logic to lower integer vector constants to a single i64x2.splat where possible, with at most a single i64x2.replace_lane following it if necessary. Adapted from a patch authored by @omnisip. Differential Revision: https://reviews.llvm.org/D88591	2020-10-02 00:28:06 -07:00
David Sherwood	b0ce9f0f4c	[SVE][CodeGen] Fix implicit TypeSize->uint64_t casts in TypePromotion The TypePromotion pass only operates on scalar types so I've fixed up all places where we were relying upon the implicit cast from TypeSize->uint64_t. Differential Revision: https://reviews.llvm.org/D88575	2020-10-02 08:12:11 +01:00
David Sherwood	b8ce6a6756	[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions When we know that a particular type is always going to be fixed width we have so far been writing code like this: getSizeInBits().getFixedSize() Since we are doing this in quite a few places now it seems to make sense to add a new helper function that allows us to replace these calls with a single getFixedSizeInBits() call. Differential Revision: https://reviews.llvm.org/D88649	2020-10-02 07:47:31 +01:00
Martin Storsjö	afb4e0f289	[AArch64] Omit SEH directives for the epilogue if none are needed For these cases, we already omit the prologue directives, if (!AFI->hasStackFrame() && !windowsRequiresStackProbe && !NumBytes). When writing the epilogue (after the prolog has been written), if the function doesn't have the WinCFI flag set (i.e. if no prologue was generated), assume that no epilogue will be needed either, and don't emit any epilog start pseudo instruction. After completing the epilogue, make sure that it actually matched the prologue. Previously, when epilogue start/end was generated, but no prologue, the unwind info for such functions actually was huge; 12 bytes xdata (4 bytes header, 4 bytes for one non-folded epilogue header, 4 bytes for padded opcodes) and 8 bytes pdata. Because the epilog consisted of one opcode (end) but the prolog was empty (no .seh_endprologue), the epilogue couldn't be folded into the prologue, and thus couldn't be considered for packed form either. On a 6.5 MB DLL with 110 KB pdata and 166 KB xdata, this gets rid of 38 KB pdata and 62 KB xdata. Differential Revision: https://reviews.llvm.org/D88641	2020-10-02 09:12:56 +03:00
Stephen Neuendorffer	47df8c57e4	[MLIR] Updates around MemRef Normalization The documentation for the NormalizeMemRefs pass and the associated MemRefsNormalizable traits was confusing and not on the website. This update clarifies the language around the difference between a MemRef Type, an operation that accesses the value of MemRef Type, and better documents the limitations of the current implementation. This patch also includes some basic debugging information for the pass so people might have a chance of figuring out why it doesn't work on their code. Differential Revision: https://reviews.llvm.org/D88532	2020-10-01 21:11:41 -07:00
Max Kazantsev	b8ac19cf1c	[SCEV] Limited support for unsigned preds in isImpliedViaOperations The logic there only considers `SLT/SGT` predicates. We can use the same logic for proving `ULT/UGT` predicates if all involved values are non-negative. Adding full-scale support for unsigned might be challenging because of code amount, so we can consider this in the future. Differential Revision: https://reviews.llvm.org/D88087 Reviewed By: reames	2020-10-02 10:20:57 +07:00
Philip Reames	f29645e7af	[gvn] Handle a corner case w/vectors of non-integral pointers If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers. In theory, we can handle many of these cases, we just need to use a different code idiom to convert without going through inttoptr and back. This shows up as wrong code bugs, and in some cases, crashes due to failed asserts. Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.	2020-10-01 19:20:21 -07:00
Carl Ritson	2ef9d21e1a	[AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt Convert to use new MachineBasicBlock splitAt function. Place code in splitBlock function for reuse in future changes. Should yield no functional change. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88537	2020-10-02 11:10:55 +09:00
Jason Molenda	a1e97923a0	Have kernel binary scanner load dSYMs as binary+dSYM if best thing found lldb's PlatforDarwinKernel scans the local filesystem (well known locations, plus user-specified directories) for kernels and kexts when doing kernel debugging, and loads them automatically. Sometimes kernel developers want to debug with only a dSYM, in which case they give lldb the DWARF binary + the dSYM as a binary and symbol file. This patch adds code to lldb to do this automatically if that's the best thing lldb can find. A few other bits of cleanup in PlatformDarwinKernel that I undertook at the same time: 1. Remove the 'platform.plugin.darwin-kernel.search-locally-for-kexts' setting. When I added the local filesystem index at start of kernel debugging, I thought people might object to the cost of the search and want a way to disable it. No one has. 2. Change the behavior of 'plugin.dynamic-loader.darwin-kernel.load-kexts' setting so it does not disable the local filesystem scan, or use of the local filesystem binaries. 3. PlatformDarwinKernel::GetSharedModule into GetSharedModuleKext and GetSharedModuleKernel for easier readability & maintenance. 4. Added accounting of .dSYM.yaa files (an archive format akin to tar) that I come across during the scan. I'm not using these for now; it would be very expensive to expand the archives & see if the UUID matches what I'm searching for. <rdar://problem/69774993> Differential Revision: https://reviews.llvm.org/D88632	2020-10-01 18:55:37 -07:00
Carl Ritson	5136f4748a	CodeGen: Fix livein calculation in MachineBasicBlock splitAt Fix and simplify computation of liveins for new block. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88535	2020-10-02 10:45:04 +09:00
Esme-Yi	c4690b0077	[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC. Summary: How we copying the CRRC to GRC is using a single MFOCRF to copy the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of destination register equals to source so we have to put the the contents of CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that. The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in CR register) to GRC. However outputs of these instructions may not in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. Then we confirmed this should be a bug in POST-RA PSEUDO PASS. Reviewed By: nemanjai, shchenz Differential Revision: https://reviews.llvm.org/D88274	2020-10-02 01:26:18 +00:00

1 2 3 4 5 ...

367933 Commits All Branches Search

367933 Commits

All Branches