llvm-project

Commit Graph

Author	SHA1	Message	Date
Eugene Zhulenev	d4f1a3c6e2	[mlir] Add microbenchmark for linalg+async-parallel-for Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D91896	2020-11-21 03:47:14 -08:00
Kazushi (Jam) Marukawa	4a1d230fa6	[VE][NFC] Modify function order and simplify comments	2020-11-21 16:09:37 +09:00
Kazushi (Jam) Marukawa	02b2bcd940	[VE] Correct types of return/argument values for getAdjustedFrameSize() A getAdjustedFrameSize function may need to handle larger than 32 bits integer, so change int to uint64_t. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91862	2020-11-21 16:08:20 +09:00
Tom Stellard	da886bf471	GitHub Actions: Add job for automatically updating the main branch Differential Revision: https://reviews.llvm.org/D91554	2020-11-20 22:26:11 -08:00
Tony	8605d3134c	[NFC][AMDGPU] Document kernel descriptor - Document that the kernel descriptor defined is for code object V3. Document that it also applies to earlier code object formats for CP. - Document the deprecated bits in kernel descriptor. Differential Revision: https://reviews.llvm.org/D91458	2020-11-21 04:54:17 +00:00
Kazushi (Jam) Marukawa	a2dc4ac86b	[VE][NFC] Update missing bulk update tests to use typed sret	2020-11-21 13:11:25 +09:00
Nico Weber	e91b2344ad	[mac/arm] Fix test/Driver/darwin-sdk-version.c on arm macs Two invocations in this test used `-m64`, which on an arm mac means arm64 in the triple, not x86_64.	2020-11-20 22:32:31 -05:00
Nico Weber	c473184914	[mac/arm] Fix clang/test/Sema/wchar.c on mac/arm hosts Part of PR46644.	2020-11-20 21:49:17 -05:00
Matt Arsenault	41083267a9	OpaquePtr: Make byval/sret types mandatory	2020-11-20 21:23:33 -05:00
Matt Arsenault	79f75468b4	AMDGPU: Fix counting kernel arguments towards register usage Also use DataLayout to get type size. Relying on the IR type size is also pretty broken here, since this won't perfectly capture how types are legalized.	2020-11-20 21:23:33 -05:00
Kazu Hirata	226beb494c	[Analysis] Use llvm::is_contained (NFC)	2020-11-20 18:08:05 -08:00
Arthur O'Dwyer	6e965df605	Revert "Revert "[libc++] ADL-proof <vector> by adding _VSTD:: qualification on calls."" This reverts commit `620adacf87`. Fix: unsupport C++03 for the new test, define helpers before __swap_allocator (1) Add _VSTD:: qualification to __swap_allocator. (2) Add _VSTD:: qualification consistently to __to_address. (3) Add some more missing _VSTD:: to <vector>, with a regression test. This part is cleanup after `d9a4f936d0`. Note that a vector whose allocator actually runs afoul of any of these ADL calls will likely also run afoul of simple things like `v1 == v2` (which is also an ADL call). But, still, libc++ should be consistent in qualifying function calls wherever possible. Relevant blog post: https://quuxplusone.github.io/blog/2019/09/26/uglification-doesnt-stop-adl/ Differential Revision: https://reviews.llvm.org/D91708	2020-11-20 20:59:18 -05:00
Valentin Clement	553e364194	[flang][openacc] Add clause validity tests for the host_data directive Add some clause validity tests for the host_data directive to avoid future regressions. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D91889	2020-11-20 20:17:37 -05:00
Matt Arsenault	650fbd569a	Verifier: Fix assert when verifying non-pointer byval or preallocated This would fail on a cast<PointerType> when verifying the attribute if these attributes were incorrectly used with a non-pointer type.	2020-11-20 20:08:43 -05:00
Matt Arsenault	1d1234b2a4	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Valentin Clement	755674b715	[flang][openacc] Add clause validity tests for the parallel directive Add some clause validity tests for parallel directive. Reviewed By: sameeranjoshi Differential Revision: https://reviews.llvm.org/D91871	2020-11-20 20:05:10 -05:00
Aart Bik	af42550523	[mlir][sparse] refine optimization, add few more test cases Adds tests for full sum reduction (tensors summed up into scalars) and the well-known sampled-dense-dense-matrix-product. Refines the optimizations rules slightly to handle the summation better. Reviewed By: penpornk Differential Revision: https://reviews.llvm.org/D91818	2020-11-20 17:01:59 -08:00
Evgenii Stepanov	08d90f72ce	[hwasan] Implement error report callback. Similar to __asan_set_error_report_callback, pass the entire report to a user provided callback function. Differential Revision: https://reviews.llvm.org/D91825	2020-11-20 16:48:19 -08:00
wlei	21c91454a8	[llvm-profgen][NFC]Fix build failure on different platform see titile Test Plan: ninja & ninja check-llvm Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D91897	2020-11-20 16:36:04 -08:00
Michael Jones	8a4ee3550b	[libc] Make more of the libc unit testing llvm independent (WIP, hopefully I'll add more to this patch before submitting) Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D91665	2020-11-21 00:27:08 +00:00
Michael Jones	3e18fb3390	[libc] Switch functions to using global headers This switches all of the files in src/string, src/math, and test/src/math from using relative paths (e.g. `#include “include/string.h”`) to global paths (e.g. `#include <string.h>`) to make bringing up those functions on other platforms, such as fuchsia, easier. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D91394	2020-11-21 00:07:17 +00:00
Samuel Giddins	244022a3cd	Don’t break before nested block param when prior param is not a block Add ScopedTrace to verify methods in FormatTestObjC Add tests from D17700 Reviewed By: keith, kastiglione Differential Revision: https://reviews.llvm.org/D91669	2020-11-20 15:16:04 -08:00
Matt Arsenault	20c43d6bd5	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
wlei	0196b45cea	[CSSPGO][llvm-profgen] Instruction symbolization This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. This change adds the support of instruction symbolization. Given the RVA on an instruction pointer, a full calling context can be printed side-by-side with the disassembly code. E.g. ``` Disassembly of section .text [0x0, 0x4a]: <funcA>: 0: mov eax, edi funcA:0 2: mov ecx, dword ptr [rip] funcLeaf:2 @ funcA:1 8: lea edx, [rcx + 3] fib:2 @ funcLeaf:2 @ funcA:1 b: cmp ecx, 3 fib:2 @ funcLeaf:2 @ funcA:1 e: cmovl edx, ecx fib:2 @ funcLeaf:2 @ funcA:1 11: sub eax, edx funcLeaf:2 @ funcA:1 13: ret funcA:2 14: nop word ptr cs:[rax + rax] 1e: nop <funcLeaf>: 20: mov eax, edi funcLeaf:1 22: mov ecx, dword ptr [rip] funcLeaf:2 28: lea edx, [rcx + 3] fib:2 @ funcLeaf:2 2b: cmp ecx, 3 fib:2 @ funcLeaf:2 2e: cmovl edx, ecx fib:2 @ funcLeaf:2 31: sub eax, edx funcLeaf:2 33: ret funcLeaf:3 34: nop word ptr cs:[rax + rax] 3e: nop <fib>: 40: lea eax, [rdi + 3] fib:2 43: cmp edi, 3 fib:2 46: cmovl eax, edi fib:2 49: ret fib:8 ``` Test Plan: ninja check-llvm Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D89715	2020-11-20 14:26:27 -08:00
wlei	32221694cb	[CSSPGO][llvm-profgen] Disassemble text sections This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. This change enables disassembling the text sections to build various address maps that are potentially used by the virtual unwinder. A switch `--show-disassembly` is being added to print the disassembly code. Like the llvm-objdump tool, this change leverages existing LLVM components to parse and disassemble ELF binary files. So far X86 is supported. Test Plan: ninja check-llvm Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D89712	2020-11-20 14:26:26 -08:00
wlei	a94fa86229	[CSSPGO][llvm-profgen] Parse mmap events from perf script This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. As a starter, this change sets up an entry point by introducing PerfReader to load profiled binaries and perf traces(including perf events and perf samples). For the event, here it parses the mmap2 events from perf script to build the loader snaps, which is used to retrieve the image load address in the subsequent perf tracing parsing. As described in llvm-profgen.rst, the tool being built aims to support multiple input perf data (preprocessed by perf script) as well as multiple input binary images. It should also support dynamic reload/unload shared objects by leveraging the loader snaps being built by this change Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D89707	2020-11-20 14:26:26 -08:00
Amara Emerson	c58df88886	[AArch64][GlobalISel] Make G_EXTRACT_VECTOR_ELT of <2 x p0> legal. Also fix a selection issue for this which was using LLT::isScalar() when it should have been using !isVector(), add test for that too.	2020-11-20 14:07:45 -08:00
Richard Smith	bec968cbb3	Demangling support for class type non-type template parameter extensions. The extensions in question are described in: https://github.com/itanium-cxx-abi/cxx-abi/issues/47 https://github.com/itanium-cxx-abi/cxx-abi/issues/63 Differential Revision: https://reviews.llvm.org/D90003	2020-11-20 13:45:08 -08:00
Alexey Bataev	0b420d674a	[SLP][NFC]Fix assert condition in newTreeEntry, NFC.	2020-11-20 13:25:21 -08:00
Vitaly Buka	3b947cc8ce	[msan] unpoison_file from fclose and fflash Also unpoison IO_write_base/_IO_write_end buffer memcpy from fclose and fflash can copy internal bytes without metadata into user memory. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D91858	2020-11-20 13:09:01 -08:00
Nathan Lanza	33c79f76af	Revert "[lldb] add a missing dependency on intrinsics_gen" This reverts commit `137ff73317`. This belongs in Apple's Swift fork since this is a direct fix for unified Swift + llvm + lldb builds.	2020-11-20 16:02:16 -05:00
Zbigniew Sarbinowski	2c7e24c4b6	Guard init_priority attribute within libc++ Not all platforms support priority attribute. I'm moving conditional definition of this attribute to `include/__config`. Reviewed By: #libc, aaron.ballman Differential Revision: https://reviews.llvm.org/D91565	2020-11-20 15:53:26 -05:00
Craig Topper	9211da4215	[RISCV] Put RV32 before RV64 in the ValueTypeByHwMode and RegInfoByHwMode lists in RISCVRegisterInfo.td Addresses post-commit feedback from `77e25b5bc8`	2020-11-20 12:10:21 -08:00
Thomas Raoux	369c51a74b	[mlir][vector] Add transfer_op LoadToStore forwarding and deadStore optimizations Add transformation to be able to forward transfer_write into transfer_read operation and to be able to remove dead transfer_write when a transfer_write is overwritten before being read. Differential Revision: https://reviews.llvm.org/D91321	2020-11-20 11:59:01 -08:00
Sam McCall	de5b0b776f	[clangd] semanticTokens: fields are 'property', not 'member' This isn't obvious, but vscode maps member as 'entity.name.function.member', so it's really for member functions. Fixes https://github.com/clangd/vscode-clangd/issues/105	2020-11-20 20:53:12 +01:00
Alexey Bataev	c964f30814	[OPENMP]Use the real pointer value as base, not indexed value. After fix for PR48174 the base pointer for pointer-based array-sections/array-subscripts will be emitted as `&ptr[idx]`, but actually it should be just `ptr`, i.e. the address stored in the ponter to point correctly to the beginning of the array. Currently it may lead to a crash in the runtime. Differential Revision: https://reviews.llvm.org/D91805	2020-11-20 11:34:14 -08:00
Craig Topper	77e25b5bc8	[RISCV] Remove RV32 HwMode. Use DefaultMode for RV32 Prior to this the DefaultMode was never selected, but RISCVGenDAGISel.inc, RISCVGenRegisterInfo.inc, RISCVGenGlobalISel.inc all ended up with extra table entries for that mode. This patch removes the RV32 and uses DefaultMode for RV32. This impressively reduces the size of my release+asserts llc binary by about 270K. About 15K from RISCVGenDAGISel.inc, 1-2K from RISCVGenRegisterInfo.inc, but the vast majority from RISCVGenGlobalISel.inc. Differential Revision: https://reviews.llvm.org/D90973	2020-11-20 11:16:06 -08:00
Alexey Bataev	8f51dc4967	[OPENMP]Honor constantness of captured variables. Fixes bug reported via Stackoverflow: https://stackoverflow.com/questions/64179168/clang-overload-resolution-failure-with-templates-and-openmp-collapse Need to honor constantness of private/target variables to make the code compilable. Differential Revision: https://reviews.llvm.org/D91644	2020-11-20 11:11:47 -08:00
Matt Arsenault	06c192d454	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Craig Topper	6a1d8b91ed	[RISCV] Custom type legalize i32 bswap/bitreverse to GREVIW on RV64 with Zbp extension Previously we required a sra to pattern match these properly in isel. If the consumer didn't need the result sign extended we'll have an srl instead of sra and fail to match. This patch switches to custom legalizing to GREVIW using portions of D91259. Differential Revision: https://reviews.llvm.org/D91457	2020-11-20 10:41:01 -08:00
Hongtao Yu	f3c445697d	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Pete Steinfeld	da02327b9c	Update OptionComparison.md	2020-11-20 10:33:21 -08:00
Craig Topper	78767b7f8e	[RISCV] Add RISCVISD::ROLW/RORW use those for custom legalizing i32 rotl/rotr on RV64IZbb. This should result in better utilization of RORIW since we don't need to look for a SIGN_EXTEND_INREG that may not exist. Also remove rotl/rotr isel matching to GREVI and just prefer RORI. This is to keep consistency so we don't have to match ROLW/RORW to GREVIW as well. I imagine RORI/RORIW performance will be the same or better than GREVI. Differential Revision: https://reviews.llvm.org/D91449	2020-11-20 10:25:47 -08:00
Simon Pilgrim	0341029bb4	[X86][AVX] LowerADDSAT_SUBSAT - avoid X86ISD::BLENDV in UADDSAT/USUBSAT v8i32/v4i64 lowering Use the OR(CMP,ADD) / AND(CMP,SUB) patterns like we do on SSE targets. Enable custom lowering for v8i32/v4i64 and generalize the 128-bit lowering code for any vector size - this also lets us use the slightly cheaper codegen for icmp_ugt instead of umin/umax.	2020-11-20 18:16:44 +00:00
William S. Moses	f5c5fd1c50	[MLIR] Correct block merge bug Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D91745	2020-11-20 19:12:59 +01:00
Yitzhak Mandelbaum	88e6208562	[libTooling] Update Transformer's `node` combinator to include the trailing semicolon for decls. Currently, `node` only includes the semicolon for (some) statements. However, declarations have the same issue of (potentially) trailing semicolons, so `node` should behave the same for them. Differential Revision: https://reviews.llvm.org/D91872	2020-11-20 18:11:50 +00:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Alex Zinenko	18d0f7d5c3	[mlir] add canonicalization patterns for trivial SCF 'for' and 'if' Add canoncalization patterns to remove zero-iteration 'for' loops, replace single-iteration 'for' loops with their bodies; remove known-false conditionals with no 'else' branch and replace conditionals with known value by the respective region. Although similar transformations are performed at the CFG level, not all flows reach that level, e.g., the GPU flow may want to remove single-iteration loops before deciding on loop mapping to thread dimensions. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D91865	2020-11-20 19:04:39 +01:00
Dave Lee	dbcc69217a	[lldb] Add examples and reword source-map help string Update the help string for `target.source-map` to remove the use of the word "duple" and to add examples. Additionally I rewrote parts with the goal of making the description more concrete. rdar://68736012 Differential Revision: https://reviews.llvm.org/D91742	2020-11-20 10:01:36 -08:00

1 2 3 4 5 ...

372762 Commits All Branches Search

372762 Commits

All Branches