llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	1312852040	[AArch64][GlobalISel] Legalize narrow type G_CTPOPs Using `clampScalar` here because we ought to mark s128 as custom eventually. (Right now, it will just fall back.) With this legalization, we get the same code as SDAG: https://godbolt.org/z/TneoPKrKG Differential Revision: https://reviews.llvm.org/D100908	2021-05-07 14:52:23 -07:00
Adrian Prantl	c6ddf669dc	Fix the module-enabled build by removing a redundant type definition.	2021-05-07 14:45:17 -07:00
Petr Hosek	167906c109	[BareMetal] Ensure that sysroot always comes after library paths This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559 where the sysroot path would be ignored, but after that change linker would now pick up the library from the sysroot which resulted in unexpected behavior. The sysroot path should always come after any user provided library paths, followed by compiler runtime paths. We want for libraries in user provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049	2021-05-07 14:42:02 -07:00
Nico Weber	d5a70db193	[lld/mac] Write every weak symbol only once in the output Before this, if an inline function was defined in several input files, lld would write each copy of the inline function the output. With this patch, it only writes one copy. Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster: N Min Max Median Avg Stddev x 10 3.9957051 4.3496981 4.1411121 4.156837 0.10092097 + 10 3.908154 4.169318 3.9712729 3.9846753 0.075773012 Difference at 95.0% confidence -0.172162 +/- 0.083847 -4.14165% +/- 2.01709% (Student's t, pooled s = 0.0892373) Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.) Some restrictions: - not yet done for bitcode inputs - no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions still not stripped. This is wasteful, but harmless. - However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size) - This nopes out on InputSections that are referenced form more than one symbol (eg from .alt_entry) for now Things that work based on symbols Just Work: - map files (change in MapFile.cpp is no-op and not needed; I just found it a bit more explicit) - exports Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info). This patch is useful in itself, but it's also likely also a useful foundation for dead_strip. I used to have a "canoncialRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now. Differential Revision: https://reviews.llvm.org/D102076	2021-05-07 17:11:40 -04:00
thomasraoux	b90b66bcbe	[mlir] Missed clang-format	2021-05-07 13:57:34 -07:00
thomasraoux	d0453a8933	[mlir][vector] Extend pattern to trim lead unit dimension to Splat Op Differential Revision: https://reviews.llvm.org/D102091	2021-05-07 13:54:41 -07:00
Petr Hosek	f97ada27aa	Revert "[BareMetal] Ensure that sysroot always comes after library paths" This reverts commit `6b00b34b8a`.	2021-05-07 13:38:04 -07:00
Florian Hahn	75b9997760	[LV] Remove reference of PHI from comment, they are not recorded (NFC). The comment incorrectly states that the PHI is recorded. That's not accurate, only the recipe for the incoming value is recorded. Suggested post-commit for `4ba8720f88`.	2021-05-07 21:34:23 +01:00
Andrea Di Biagio	3822ac909e	[MCA][RegisterFile] Fix register class check for move elimination (PR50265) The register file should always check if the destination register is from a register class that allows move elimination. Before this change, the check on the register class was only performed in a few very specific cases. However, it should have always been performed. This patch fixes the issue. Note that none of the upstream scheduling models is currently affected by this bug, so there is no test for it. The issue was found by Roman while working on the znver3 model. I was able to reproduce the issue locally by tweaking the btver2 model. I then verified that this patch fixes the issue.	2021-05-07 21:30:25 +01:00
Olivier Goffart	c4adc49a1c	[SEH] Fix regression with SEH in noexpect functions Commit `5baea05601` set the CurCodeDecl because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField, But this was not right to do as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec and cause corruption of the EHStack. Revert the part of the commit that changes the CurCodeDecl, and instead adjust the assert to check for a null CurCodeDecl. Differential Revision: https://reviews.llvm.org/D102027	2021-05-07 13:27:59 -07:00
Florian Hahn	337d765282	[LV] Assert if trying to sink replicate region into another region (NFC) Currently sinking a replicate region into another replicate region is not supported. Add an assert, to make the problem more obvious, should it occur. Discussed post-commit for `ccebf7a109`.	2021-05-07 21:25:35 +01:00
Florian Hahn	01c26d4e04	[LV] Rename Region to TargetRegion, similar to SinkRegion (NFC). Adjust the name to make it clearer this is the region containing the target recipe, similar to SinkRegion below. Suggested post-commit for `ccebf7a109`.	2021-05-07 21:25:35 +01:00
peter klausler	01c78a0b07	[flang] Implement NORM2 in the runtime Implement the reduction transformational intrinsic function NORM2 in the runtime, using infrastructure already in place for MAXVAL & al. Differential Revision: https://reviews.llvm.org/D102024	2021-05-07 13:23:21 -07:00
Petr Hosek	6b00b34b8a	[BareMetal] Ensure that sysroot always comes after library paths This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559 where the sysroot path would be ignored, but after that change linker would now pick up the library from the sysroot which resulted in unexpected behavior. The sysroot path should always come after any user provided library paths, followed by compiler runtime paths. We want for libraries in user provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049	2021-05-07 13:21:07 -07:00
Hsiangkai Wang	c04c66d705	[RISCV] Consider scalar types for required extensions. We have vector operations on double vector and float scalar. For example, vfwadd.wf is such a instruction. vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2); We should specify F and D extensions for it. Differential Revision: https://reviews.llvm.org/D102051	2021-05-08 04:06:45 +08:00
Vyacheslav Zakharin	f2f88f3e7a	An attempt to abandon omptarget out-of-tree builds. I want to start using LLVM component libraries in libomptarget to stop duplicating implementations already available in LLVM (e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM in all libomptarget builds one has to provide fallback implementation for each used LLVM feature. This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget. I understand that I may need to revert this, if this affects downstream projects in a bad way. Differential Revision: https://reviews.llvm.org/D101509	2021-05-07 12:43:50 -07:00
Alexander Belyaev	3444996b4c	[mlir] Add a pattern to bufferize std.index_cast. Differential Revision: https://reviews.llvm.org/D102088	2021-05-07 21:32:02 +02:00
Alexander Belyaev	a3f22d020b	[mlir] Add a pattern to bufferize linalg.tensor_reshape. Differential Revision: https://reviews.llvm.org/D102089	2021-05-07 21:31:17 +02:00
Emilio Cota	21db1e3b01	[mlir][docs] remove stale statement about index type in vectors `b614ada0e8` ("[mlir] add support for index type in vectors.") removed this limitation. Differential Revision: https://reviews.llvm.org/D102081	2021-05-07 19:25:17 +00:00
Arthur Eubanks	7ca26c5fa2	Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" This reverts commit `0791f968fe`. Causing crashes: https://crbug.com/1206764	2021-05-07 12:05:16 -07:00
Florian Hahn	6c99e63120	[SCEV] By more careful when traversing phis in isImpliedViaMerge. I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values InvV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the previous iteration. Hence we incorrectly determine that the previous value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works of irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if will catch all cases and I appreciate a through second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829	2021-05-07 19:52:29 +01:00
Thomas Lively	1e9c39a3f9	[WebAssembly] Use functions instead of macros for const SIMD intrinsics To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018	2021-05-07 11:50:19 -07:00
Fangrui Song	724604901a	[unittest] Fix -Wunused-variable after D94717	2021-05-07 11:42:16 -07:00
Krzysztof Parzyszek	50cf0a1d1a	Allow empty value list in propagateMetadata(Inst, ArrayOf...) This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.	2021-05-07 13:20:50 -05:00
Fangrui Song	d8aba75a76	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
Louis Dionne	8002c5d65f	[libc++][ci] Run longer CI jobs first Jobs that test with a more recent standard version run more tests, so they take longer. We'll decrease the average latency by running them first instead of last.	2021-05-07 13:57:07 -04:00
Saleem Abdulrasool	d319005a37	lit: revert `134b103fc0` Revert the 32-process cap on Windows. When testing with Swift, we found that there was a time reduction for testing with the higher load. This should hopefully not matter much in practice. In the case that the original problem with python remains with a high subprocess count, we can easily revert this change.	2021-05-07 10:22:43 -07:00
Roman Lebedev	b8701dc174	[X86] AMD Zen 3: mark XMM/YMM (but not MMX!) reg moves as eliminatible in RegisterFile	2021-05-07 20:11:21 +03:00
Roman Lebedev	5b1610a250	[X86] AMD Zen 3: MOVSX32rr32 is a zero-cycle move It measures as such, and the reference docs agree. I can't easily add a MCA test, because there's no mnemonic for it, it can only be disassembled or created as a MCInst.	2021-05-07 20:11:20 +03:00
Fangrui Song	6a2850f3fc	[AArch64][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local Similar to X86 D73230 & `46788a21f9` With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode, for default visibility external linkage non-ifunc-non-COMDAT definitions. For such dso_local definitions, variable access/taking the address of a function/calling a function will go through a local alias to avoid GOT/PLT. Note: the 'S' inline assembly constraint refers to an absolute symbolic address or a label reference (D46745). Differential Revision: https://reviews.llvm.org/D101872	2021-05-07 09:44:26 -07:00
Matt Morehouse	f09414499c	[libFuzzer] Fix stack-overflow-with-asan.test. Fix function return type and remove check for SUMMARY, since it doesn't seem to be output in Windows.	2021-05-07 09:18:21 -07:00
Whitney Tsang	1006ac3963	[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect This patch allow more conditional branches to be considered as loop guard, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717	2021-05-07 16:04:18 +00:00
Simon Pilgrim	f744723f75	[X86] combineXor - limit fold to non-opaque constants (PR50254) Ensure we don't try to fold when one might be an opaque constant - the constant fold will fail and then the reverse fold will happen in DAGCombine.....	2021-05-07 16:39:24 +01:00
Roman Lebedev	2819009b5a	[X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cycles (PR50261) Sometimes disassembler picks _REV variants of instructions over the plain ones, which in this case exposed an issue that the _REV variants aren't being modelled as optimizable moves.	2021-05-07 18:27:40 +03:00
Roman Lebedev	a8e30e63ac	[NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move	2021-05-07 18:27:40 +03:00
Sebastian Poeplau	70cbc6dbef	[libFuzzer] Fix stack overflow detection Address sanitizer can detect stack exhaustion via its SEGV handler, which is executed on a separate stack using the sigaltstack mechanism. When libFuzzer is used with address sanitizer, it installs its own signal handlers which defer to those put in place by the sanitizer before performing additional actions. In the particular case of a stack overflow, the current setup fails because libFuzzer doesn't preserve the flag for executing the signal handler on a separate stack: when we run out of stack space, the operating system can't run the SEGV handler, so address sanitizer never reports the issue. See the included test for an example. This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag when installing its signal handlers; the dedicated signal-handler stack set up by the sanitizer runtime appears to be large enough to support the additional frames from the fuzzer. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D101824	2021-05-07 08:18:28 -07:00
thomasraoux	a970e69d6b	[mlir][vector] add pattern to cast away leading unit dim for elementwise op Differential Revision: https://reviews.llvm.org/D102034	2021-05-07 07:54:09 -07:00
thomasraoux	565ee6afc7	[mlir][spirv] add support lowering of extract_slice to scalar type Differential Revision: https://reviews.llvm.org/D102041	2021-05-07 07:52:02 -07:00
Joseph Tremoulet	bc302bfbef	BasicAA: Recognize inttoptr as isEscapeSource Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object. Reviewed By: nikic, nlopes, aqjune Differential Revision: https://reviews.llvm.org/D101541	2021-05-07 07:48:50 -07:00
Sanjay Patel	0a6f11aabd	[AArch64] add test for missed vectorization; NFC This is a reduction of the example in: https://llvm.org/PR50256	2021-05-07 10:45:11 -04:00
Joseph Huber	a15f8589f4	[libomptarget] Add support for target memory allocators to cuda RTL Summary: The allocator interface added in D97883 allows the RTL to allocate shared and host-pinned memory from the cuda plugin. This patch adds support for these to the runtime. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D102000	2021-05-07 10:27:02 -04:00
Tobias Gysi	f31531a30b	[mlir][linalg] Remove redundant indexOp builder. Remove the builder signature taking a signed dimension identifier. Reviewed By: ergawy Differential Revision: https://reviews.llvm.org/D102055	2021-05-07 14:22:12 +00:00
Tres Popp	faab8c140a	[mlir] Rename BufferAliasAnalysis to BufferViewFlowAnalysis This it to make more clear the difference between this and an AliasAnalysis. For example, given a sequence of subviews that create values A -> B -> C -> d: BufferViewFlowAnalysis::resolve(B) => {B, C, D} AliasAnalysis::resolve(B) => {A, B, C, D} Differential Revision: https://reviews.llvm.org/D100838	2021-05-07 16:12:54 +02:00
Ahsan Saghir	25bbff632d	[PowerPC] Provide MMA builtins for compatibility Vector pair intrinsics and builtins were renamed in https://reviews.llvm.org/D91974 to replace the _mma_ prefix by _vsx_. However, some projects used the _mma_ version, so this patch adds these intrinsics to provide compatibility. Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=50159 Reviewed By: nemanjai, amyk Differential Revision: https://reviews.llvm.org/D100482	2021-05-07 09:10:16 -05:00
Roman Lebedev	34de155f7e	[NFC][X86][MCA] AMD Zen3 Decrease iteration count in reg-move-elimination tests Drop it just enough so it still produces the right IPC.	2021-05-07 17:06:45 +03:00
Roman Lebedev	758c173309	[X86] AMD Zen 3: throughput for renameable XMM/YMM moves is 6 They are resolved at the register rename stage without using any execution units.	2021-05-07 17:06:45 +03:00
Roman Lebedev	715c0d0bd4	[X86] AMD Zen 3: AVX YMM moves are zero-cycle I've verified this with llvm-exegesis. This is not limited to zero registers.	2021-05-07 17:06:45 +03:00
Roman Lebedev	ee020b930d	[X86] AMD Zen 3: AVX XMM moves are zero-cycle I've verified this with llvm-exegesis. This is not limited to zero registers.	2021-05-07 17:06:44 +03:00
Roman Lebedev	9db4203883	[X86] AMD Zen 3: SSE XMM moves are zero-cycle I've verified this with llvm-exegesis. This is not limited to zero registers. Refs: AMD SOG 19h, 2.9.4 Zero Cycle Move The processor is able to execute certain register to register mov operations with zero cycle delay. Agner, 22.13 Instructions with no latency Register-to-register move instructions are resolved at the register rename stage without using any execution units. These instructions have zero latency. It is possible to do six such register renamings per clock cycle, and it is even possible to rename the same register multiple times in one clock cycle.	2021-05-07 17:06:44 +03:00
Roman Lebedev	0d961fbd52	[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX YMM moves	2021-05-07 17:06:44 +03:00

1 2 3 4 5 ...

387794 Commits All Branches Search

387794 Commits

All Branches