llvm-project

Commit Graph

Author	SHA1	Message	Date
Valentin Clement	553e364194	[flang][openacc] Add clause validity tests for the host_data directive Add some clause validity tests for the host_data directive to avoid future regressions. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D91889	2020-11-20 20:17:37 -05:00
Matt Arsenault	650fbd569a	Verifier: Fix assert when verifying non-pointer byval or preallocated This would fail on a cast<PointerType> when verifying the attribute if these attributes were incorrectly used with a non-pointer type.	2020-11-20 20:08:43 -05:00
Matt Arsenault	1d1234b2a4	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Valentin Clement	755674b715	[flang][openacc] Add clause validity tests for the parallel directive Add some clause validity tests for parallel directive. Reviewed By: sameeranjoshi Differential Revision: https://reviews.llvm.org/D91871	2020-11-20 20:05:10 -05:00
Aart Bik	af42550523	[mlir][sparse] refine optimization, add few more test cases Adds tests for full sum reduction (tensors summed up into scalars) and the well-known sampled-dense-dense-matrix-product. Refines the optimizations rules slightly to handle the summation better. Reviewed By: penpornk Differential Revision: https://reviews.llvm.org/D91818	2020-11-20 17:01:59 -08:00
Evgenii Stepanov	08d90f72ce	[hwasan] Implement error report callback. Similar to __asan_set_error_report_callback, pass the entire report to a user provided callback function. Differential Revision: https://reviews.llvm.org/D91825	2020-11-20 16:48:19 -08:00
wlei	21c91454a8	[llvm-profgen][NFC]Fix build failure on different platform see title Test Plan: ninja & ninja check-llvm Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D91897	2020-11-20 16:36:04 -08:00
Michael Jones	8a4ee3550b	[libc] Make more of the libc unit testing llvm independent (WIP, hopefully I'll add more to this patch before submitting) Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D91665	2020-11-21 00:27:08 +00:00
Michael Jones	3e18fb3390	[libc] Switch functions to using global headers This switches all of the files in src/string, src/math, and test/src/math from using relative paths (e.g. `#include "include/string.h"`) to global paths (e.g. `#include <string.h>`) to make bringing up those functions on other platforms, such as fuchsia, easier. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D91394	2020-11-21 00:07:17 +00:00
Samuel Giddins	244022a3cd	Don’t break before nested block param when prior param is not a block Add ScopedTrace to verify methods in FormatTestObjC Add tests from D17700 Reviewed By: keith, kastiglione Differential Revision: https://reviews.llvm.org/D91669	2020-11-20 15:16:04 -08:00
Matt Arsenault	20c43d6bd5	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
wlei	0196b45cea	[CSSPGO][llvm-profgen] Instruction symbolization This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. This change adds the support of instruction symbolization. Given the RVA on an instruction pointer, a full calling context can be printed side-by-side with the disassembly code. E.g. ``` Disassembly of section .text [0x0, 0x4a]: <funcA>: 0: mov eax, edi funcA:0 2: mov ecx, dword ptr [rip] funcLeaf:2 @ funcA:1 8: lea edx, [rcx + 3] fib:2 @ funcLeaf:2 @ funcA:1 b: cmp ecx, 3 fib:2 @ funcLeaf:2 @ funcA:1 e: cmovl edx, ecx fib:2 @ funcLeaf:2 @ funcA:1 11: sub eax, edx funcLeaf:2 @ funcA:1 13: ret funcA:2 14: nop word ptr cs:[rax + rax] 1e: nop <funcLeaf>: 20: mov eax, edi funcLeaf:1 22: mov ecx, dword ptr [rip] funcLeaf:2 28: lea edx, [rcx + 3] fib:2 @ funcLeaf:2 2b: cmp ecx, 3 fib:2 @ funcLeaf:2 2e: cmovl edx, ecx fib:2 @ funcLeaf:2 31: sub eax, edx funcLeaf:2 33: ret funcLeaf:3 34: nop word ptr cs:[rax + rax] 3e: nop <fib>: 40: lea eax, [rdi + 3] fib:2 43: cmp edi, 3 fib:2 46: cmovl eax, edi fib:2 49: ret fib:8 ``` Test Plan: ninja check-llvm Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D89715	2020-11-20 14:26:27 -08:00
wlei	32221694cb	[CSSPGO][llvm-profgen] Disassemble text sections This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. This change enables disassembling the text sections to build various address maps that are potentially used by the virtual unwinder. A switch `--show-disassembly` is being added to print the disassembly code. Like the llvm-objdump tool, this change leverages existing LLVM components to parse and disassemble ELF binary files. So far X86 is supported. Test Plan: ninja check-llvm Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D89712	2020-11-20 14:26:26 -08:00
wlei	a94fa86229	[CSSPGO][llvm-profgen] Parse mmap events from perf script This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. As a starter, this change sets up an entry point by introducing PerfReader to load profiled binaries and perf traces(including perf events and perf samples). For the event, here it parses the mmap2 events from perf script to build the loader snaps, which is used to retrieve the image load address in the subsequent perf tracing parsing. As described in llvm-profgen.rst, the tool being built aims to support multiple input perf data (preprocessed by perf script) as well as multiple input binary images. It should also support dynamic reload/unload shared objects by leveraging the loader snaps being built by this change Reviewed By: wenlei, wmi Differential Revision: https://reviews.llvm.org/D89707	2020-11-20 14:26:26 -08:00
Amara Emerson	c58df88886	[AArch64][GlobalISel] Make G_EXTRACT_VECTOR_ELT of <2 x p0> legal. Also fix a selection issue for this which was using LLT::isScalar() when it should have been using !isVector(), add test for that too.	2020-11-20 14:07:45 -08:00
Richard Smith	bec968cbb3	Demangling support for class type non-type template parameter extensions. The extensions in question are described in: https://github.com/itanium-cxx-abi/cxx-abi/issues/47 https://github.com/itanium-cxx-abi/cxx-abi/issues/63 Differential Revision: https://reviews.llvm.org/D90003	2020-11-20 13:45:08 -08:00
Alexey Bataev	0b420d674a	[SLP][NFC]Fix assert condition in newTreeEntry, NFC.	2020-11-20 13:25:21 -08:00
Vitaly Buka	3b947cc8ce	[msan] unpoison_file from fclose and fflush Also unpoison IO_write_base/_IO_write_end buffer memcpy from fclose and fflush can copy internal bytes without metadata into user memory. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D91858	2020-11-20 13:09:01 -08:00
Nathan Lanza	33c79f76af	Revert "[lldb] add a missing dependency on intrinsics_gen" This reverts commit `137ff73317`. This belongs in Apple's Swift fork since this is a direct fix for unified Swift + llvm + lldb builds.	2020-11-20 16:02:16 -05:00
Zbigniew Sarbinowski	2c7e24c4b6	Guard init_priority attribute within libc++ Not all platforms support priority attribute. I'm moving conditional definition of this attribute to `include/__config`. Reviewed By: #libc, aaron.ballman Differential Revision: https://reviews.llvm.org/D91565	2020-11-20 15:53:26 -05:00
Craig Topper	9211da4215	[RISCV] Put RV32 before RV64 in the ValueTypeByHwMode and RegInfoByHwMode lists in RISCVRegisterInfo.td Addresses post-commit feedback from `77e25b5bc8`	2020-11-20 12:10:21 -08:00
Thomas Raoux	369c51a74b	[mlir][vector] Add transfer_op LoadToStore forwarding and deadStore optimizations Add transformation to be able to forward transfer_write into transfer_read operation and to be able to remove dead transfer_write when a transfer_write is overwritten before being read. Differential Revision: https://reviews.llvm.org/D91321	2020-11-20 11:59:01 -08:00
Sam McCall	de5b0b776f	[clangd] semanticTokens: fields are 'property', not 'member' This isn't obvious, but vscode maps member as 'entity.name.function.member', so it's really for member functions. Fixes https://github.com/clangd/vscode-clangd/issues/105	2020-11-20 20:53:12 +01:00
Alexey Bataev	c964f30814	[OPENMP]Use the real pointer value as base, not indexed value. After fix for PR48174 the base pointer for pointer-based array-sections/array-subscripts will be emitted as `&ptr[idx]`, but actually it should be just `ptr`, i.e. the address stored in the pointer to point correctly to the beginning of the array. Currently it may lead to a crash in the runtime. Differential Revision: https://reviews.llvm.org/D91805	2020-11-20 11:34:14 -08:00
Craig Topper	77e25b5bc8	[RISCV] Remove RV32 HwMode. Use DefaultMode for RV32 Prior to this the DefaultMode was never selected, but RISCVGenDAGISel.inc, RISCVGenRegisterInfo.inc, RISCVGenGlobalISel.inc all ended up with extra table entries for that mode. This patch removes the RV32 and uses DefaultMode for RV32. This impressively reduces the size of my release+asserts llc binary by about 270K. About 15K from RISCVGenDAGISel.inc, 1-2K from RISCVGenRegisterInfo.inc, but the vast majority from RISCVGenGlobalISel.inc. Differential Revision: https://reviews.llvm.org/D90973	2020-11-20 11:16:06 -08:00
Alexey Bataev	8f51dc4967	[OPENMP]Honor constantness of captured variables. Fixes bug reported via Stackoverflow: https://stackoverflow.com/questions/64179168/clang-overload-resolution-failure-with-templates-and-openmp-collapse Need to honor constantness of private/target variables to make the code compilable. Differential Revision: https://reviews.llvm.org/D91644	2020-11-20 11:11:47 -08:00
Matt Arsenault	06c192d454	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. 
Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Craig Topper	6a1d8b91ed	[RISCV] Custom type legalize i32 bswap/bitreverse to GREVIW on RV64 with Zbp extension Previously we required a sra to pattern match these properly in isel. If the consumer didn't need the result sign extended we'll have an srl instead of sra and fail to match. This patch switches to custom legalizing to GREVIW using portions of D91259. Differential Revision: https://reviews.llvm.org/D91457	2020-11-20 10:41:01 -08:00
Hongtao Yu	f3c445697d	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. 
The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Pete Steinfeld	da02327b9c	Update OptionComparison.md	2020-11-20 10:33:21 -08:00
Craig Topper	78767b7f8e	[RISCV] Add RISCVISD::ROLW/RORW use those for custom legalizing i32 rotl/rotr on RV64IZbb. This should result in better utilization of RORIW since we don't need to look for a SIGN_EXTEND_INREG that may not exist. Also remove rotl/rotr isel matching to GREVI and just prefer RORI. This is to keep consistency so we don't have to match ROLW/RORW to GREVIW as well. I imagine RORI/RORIW performance will be the same or better than GREVI. Differential Revision: https://reviews.llvm.org/D91449	2020-11-20 10:25:47 -08:00
Simon Pilgrim	0341029bb4	[X86][AVX] LowerADDSAT_SUBSAT - avoid X86ISD::BLENDV in UADDSAT/USUBSAT v8i32/v4i64 lowering Use the OR(CMP,ADD) / AND(CMP,SUB) patterns like we do on SSE targets. Enable custom lowering for v8i32/v4i64 and generalize the 128-bit lowering code for any vector size - this also lets us use the slightly cheaper codegen for icmp_ugt instead of umin/umax.	2020-11-20 18:16:44 +00:00
William S. Moses	f5c5fd1c50	[MLIR] Correct block merge bug Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D91745	2020-11-20 19:12:59 +01:00
Yitzhak Mandelbaum	88e6208562	[libTooling] Update Transformer's `node` combinator to include the trailing semicolon for decls. Currently, `node` only includes the semicolon for (some) statements. However, declarations have the same issue of (potentially) trailing semicolons, so `node` should behave the same for them. Differential Revision: https://reviews.llvm.org/D91872	2020-11-20 18:11:50 +00:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Alex Zinenko	18d0f7d5c3	[mlir] add canonicalization patterns for trivial SCF 'for' and 'if' Add canonicalization patterns to remove zero-iteration 'for' loops, replace single-iteration 'for' loops with their bodies; remove known-false conditionals with no 'else' branch and replace conditionals with known value by the respective region. Although similar transformations are performed at the CFG level, not all flows reach that level, e.g., the GPU flow may want to remove single-iteration loops before deciding on loop mapping to thread dimensions. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D91865	2020-11-20 19:04:39 +01:00
Dave Lee	dbcc69217a	[lldb] Add examples and reword source-map help string Update the help string for `target.source-map` to remove the use of the word "duple" and to add examples. Additionally I rewrote parts with the goal of making the description more concrete. rdar://68736012 Differential Revision: https://reviews.llvm.org/D91742	2020-11-20 10:01:36 -08:00
Arthur Eubanks	ac7419bb4f	[Hexagon][NewPM] Port -hexagon-loop-idiom and add to pipeline Fixes pmpy-mod.ll under NPM Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D91829	2020-11-20 09:34:37 -08:00
Stella Stamenova	370d0bac90	[mlir] Expose parseDimAndSymbolList from affineops.h This was removed from ops.h, but it is used by onnx-mlir Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D91830	2020-11-20 09:26:58 -08:00
Andrew Wei	1cd19fc568	[DeadMachineInstructionElim] Post order visit all blocks and Iteratively run DeadMachineInstructionElim pass until nothing dead Patched by: guopeilin Reviewed By: hliao,rampitec Differential Revision: https://reviews.llvm.org/D91513	2020-11-21 00:43:23 +08:00
Simon Pilgrim	09a081f221	[X86][SSE] LowerADDSAT_SUBSAT - avoid X86ISD::BLENDV in UADDSAT/USUBSAT custom lowering Use the OR(CMP,ADD) / AND(CMP,SUB) patterns like we do on pre-SSE4 targets. We're still using X86ISD::BLENDV on some AVX targets as we don't do custom lowering for >= 256-bit vectors. Really this (and combineVSelectWithAllOnesOrZeros) needs moving to DAGCombiner, but pre-SSE42 we see the vXi64 comparison type as a 2 x 32-bits result so we can't just rely on ComputeNumSignBits to give us the 'all bits' result we need.	2020-11-20 16:53:01 +00:00
Andrzej Warzynski	1b749c0cb5	[flang][driver] Remove unnecessary CMake dependencies (nfc) Remove clangFrontend from the list of dependencies. These should have been removed in: `8d51d37e06`. See also https://reviews.llvm.org/D87774.	2020-11-20 16:44:11 +00:00
Sanjay Patel	e32bd35120	[CostModel] mostly remove cost-kind predicate for intrinsics in basic TTI implementation This is re-applying a combination of `f7eac51b9b` and `8ec7ea3ddc` as one patch to avoid regressions now that we have better testing in place. Those were reverted with `32dd5870ee` because of crashing in experimental intrinsics. That bug should be fixed with `7ae346434`. Paraphrased original commit messages: This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is an improvement. Exception: in general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models. Differential Revision: https://reviews.llvm.org/D90554	2020-11-20 11:21:10 -05:00
Simon Pilgrim	e3f0177deb	[X86] Add SSE42 sat-add test coverage Check SSE42 targets which have PCMPGTQ	2020-11-20 16:00:24 +00:00
Alex Richardson	51e09e1d5a	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already present, to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Alex Richardson	3bc4157556	Add a default address space for globals to DataLayout This is similar to the existing alloca and program address spaces (D37052) and should be used when creating/accessing global variables. We need this in our CHERI fork of LLVM to place all globals in address space 200. This ensures that values are accessed using CHERI load/store instructions instead of the normal MIPS/RISC-V ones. The problem this is trying to fix is that most of the time the type of globals is created using a simple PointerType::getUnqual() (or ::get() with the default address-space value of 0). This does not work for us and we get assertion/compilation/instruction selection failures whenever a new call is added that uses the default value of zero. In our fork we have removed the default parameter value of zero for most address space arguments and use DL.getProgramAddressSpace() or DL.getGlobalsAddressSpace() whenever possible. If this change is accepted, I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead of relying on the default value of 0 for PointerType::get(), etc. This patch and the follow-up changes will not have any functional changes for existing backends with the default globals address space of zero. A follow-up commit will change the default globals address space for AMDGPU to 1. Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D70947	2020-11-20 15:46:52 +00:00
Siva Chandra Reddy	4766a86cf2	[libc] Combine all math differential fuzzers into one target. Also added diffing of a few more math functions. Combining the diff check for all of these functions helps us meet the OSS fuzz bar of a minimum of 100 program edges. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D91817	2020-11-20 07:46:15 -08:00
Anton Afanasyev	6f1c07b23a	[SLP][Test] Update pr47269.ll test. NFC Expand test for PR47269 to better demonstrate changes introduced by D90445.	2020-11-20 18:33:57 +03:00
Jamie Schmeiser	7f6360cdc6	Reland: Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug. Summary: Expand existing loopsink testing to also test loopsinking using new pass manager. Enable memoryssa for loopsink with new pass manager. This combination exposed a bug that was previously fixed for loopsink without memoryssa. When sinking an instruction into a loop, the source block may not be part of the loop but still needs to be checked for pointer invalidation. This is the fix for bugzilla #39695 (PR 54659) expanded to also work with memoryssa. Respond to review comments. Enable Memory SSA in legacy Loop Sink pass under EnableMSSALoopDependency option control. Update tests accordingly. Respond to review comments. Add options controlling whether memoryssa is used for loop sink, defaulting to off. Expand testing based on these options. Respond to review comments. Properly indicated preserved analyses. This relanding addresses a compile-time performance problem by moving test for profile data earlier to avoid unnecessary computations. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: asbirlea (Alina Sbirlea) Differential Revision: https://reviews.llvm.org/D90249	2020-11-20 10:26:33 -05:00
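
Note on `0341029bb4` and `09a081f221`: the OR(CMP,ADD) / AND(CMP,SUB) patterns named in those messages can be modelled on a single unsigned 32-bit lane. The sketch below is a scalar illustration only; the function names and the 32-bit lane width are assumptions made for the example, and the actual X86 lowering works on vector nodes and may choose different comparisons.

```python
MASK32 = 0xFFFFFFFF  # model one unsigned 32-bit lane (assumed width)


def uaddsat32(a: int, b: int) -> int:
    """Unsigned saturating add as OR(CMP, ADD)."""
    total = (a + b) & MASK32                # wrapping ADD
    wrapped = MASK32 if total < a else 0    # CMP: all-ones when the add wrapped
    return total | wrapped                  # OR forces the lane to all-ones


def usubsat32(a: int, b: int) -> int:
    """Unsigned saturating subtract as AND(CMP, SUB)."""
    diff = (a - b) & MASK32                 # wrapping SUB
    in_range = MASK32 if a > b else 0       # CMP: all-ones when no underflow
    return diff & in_range                  # AND clears the lane on underflow
```

The compare result acts as a mask, so no select or blend is needed: OR-ing with all-ones saturates to the maximum, and AND-ing with zero saturates to 0.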
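
Note on `d0e42037bf`: the message says the probe for `bb3` is duplicated into `bb1` and `bb2`, so its execution count is the sum of the two copies. A minimal sketch of that aggregation follows. The `(guid, probe_id, hits)` sample tuples and the hit numbers are hypothetical and are not the format `llvm-profgen` consumes.

```python
from collections import Counter

GUID = 837061429793323041  # function GUID taken from the commit message


def aggregate_probe_counts(samples):
    """Sum hits per (guid, probe_id); duplicated probes add up."""
    counts = Counter()
    for guid, probe_id, hits in samples:
        counts[(guid, probe_id)] += hits
    return counts


# Hypothetical hit counts: bb.0 runs 100 times, bb.1 30 times, bb.2 70 times.
samples = [
    (GUID, 1, 100),              # bb.0: probe 1
    (GUID, 2, 30), (GUID, 4, 30),  # bb.1: probe 2 plus the copy of probe 4
    (GUID, 3, 70), (GUID, 4, 70),  # bb.2: probe 3 plus the copy of probe 4
]
counts = aggregate_probe_counts(samples)
```

With these made-up numbers, probe 4 (the original `bb3`) ends up with the same count as probe 1, as expected for a block that every path reaches.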
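
Note on `18d0f7d5c3`: the zero-iteration and single-iteration cases for an SCF 'for' loop can be told apart from constant bounds alone. The sketch below assumes a positive constant step and a half-open range `[lb, ub)`. The function name and the returned labels are hypothetical, and the real patterns operate on MLIR operations, not plain integers.

```python
def classify_for(lb: int, ub: int, step: int) -> str:
    """Classify a loop over range(lb, ub, step) with constant bounds."""
    if step <= 0:
        raise ValueError("this sketch assumes a positive step")
    if lb >= ub:
        return "erase"        # zero iterations: the loop can be removed
    if lb + step >= ub:
        return "inline-body"  # exactly one iteration: replace loop by its body
    return "keep"             # two or more iterations: leave the loop alone
```

The classification agrees with `len(range(lb, ub, step))` being 0, 1, or greater than 1.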
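
Note on `1cd19fc568`: running dead-instruction elimination "until nothing dead" matters because removing one instruction can make its operands' definitions dead in turn. The toy model below shows only that fixed-point idea. The `(dest, operands)` tuples are a hypothetical representation, every instruction is assumed to be side-effect free, and the real pass works on machine basic blocks in post order.

```python
def eliminate_dead(instrs, live_out):
    """Repeatedly drop instructions whose result is never used.

    instrs:   list of (dest, [operand names]) tuples
    live_out: names that must stay live after the sequence
    """
    instrs = list(instrs)
    changed = True
    while changed:
        changed = False
        used = set(live_out)
        for _, operands in instrs:
            used.update(operands)
        kept = [(dest, ops) for dest, ops in instrs if dest in used]
        if len(kept) != len(instrs):
            instrs, changed = kept, True  # something died, so run again
    return instrs


# c is dead; once it is gone b becomes dead, and then a.
chain = [("a", []), ("b", ["a"]), ("c", ["b"]), ("r", [])]
result = eliminate_dead(chain, live_out={"r"})
```

A single sweep would remove only `c`; the loop needs three sweeps that change something and a fourth that confirms the fixed point.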



372900 Commits

All Branches