These intrinsics, not the icmp+select sequence, are the canonical form nowadays,
so we might as well directly emit them.
This should not cause any regressions, but if it does,
then they would need to be fixed regardless.
Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`,
but that is a pessimization, not a correctness issue.
Additionally, the non-intrinsic form has issues with undef,
see https://reviews.llvm.org/D88287#2587863
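As a rough sketch of the difference (using signed max and the IRBuilder API as an assumed example; the actual emission sites in this patch differ):

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
using namespace llvm;

// Illustrative only: instead of emitting the old icmp+select expansion,
//   %cmp = icmp sgt i32 %a, %b
//   %max = select i1 %cmp, i32 %a, i32 %b
// emit the canonical intrinsic form, a single call to llvm.smax.
static Value *emitSMax(IRBuilder<> &Builder, Value *A, Value *B) {
  return Builder.CreateBinaryIntrinsic(Intrinsic::smax, A, B);
}
```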
This patch adds support for passing vector call operands to variadic
functions. Arguments that are fixed shadow GPRs and stack space even
when they are passed in vector registers, while arguments passed through
the ellipsis are passed in properly aligned GPRs if available and on the
stack once all GPR argument registers are consumed.
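As a hedged illustration of the kind of call this enables (the vector type and helpers below are examples, not taken from the patch):

```cpp
#include <cstdarg>

typedef int v4i32 __attribute__((vector_size(16)));

// A vector operand passed through the ellipsis of a variadic function; the
// ABI rules described above decide whether it travels in vector registers,
// shadowed GPRs, or on the stack.
static int use_first_lane(int n, ...) {
  va_list ap;
  va_start(ap, n);
  v4i32 v = va_arg(ap, v4i32);
  va_end(ap);
  return v[0];
}

int caller() {
  v4i32 v = {1, 2, 3, 4};
  return use_first_lane(1, v);
}
```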
Differential Revision: https://reviews.llvm.org/D97956
We have the `enable-loopinterchange` option in the legacy pass manager but not in the NPM.
Add `LoopInterchange` pass to the optimization pipeline (at the same position as before)
when `enable-loopinterchange` is turned on.
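For context, this is roughly the transformation the pass performs (illustrative C++ only, not part of this change):

```cpp
// Before interchange: the inner loop strides across rows of A and B,
// touching a new cache line on almost every iteration.
void slow(float A[256][256], const float B[256][256]) {
  for (int j = 0; j < 256; ++j)
    for (int i = 0; i < 256; ++i)
      A[i][j] += B[i][j];
}

// After interchange: the inner loop walks each row contiguously,
// which is what LoopInterchange aims to produce when profitable.
void fast(float A[256][256], const float B[256][256]) {
  for (int i = 0; i < 256; ++i)
    for (int j = 0; j < 256; ++j)
      A[i][j] += B[i][j];
}
```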
Reviewed By: aeubanks, fhahn
Differential Revision: https://reviews.llvm.org/D98116
This patch modifies the x86_64 XRay trampolines to fix the CFI information
generated by the assembler. One of the main issues in correcting the CFI
directives is the `ALIGNED_CALL_RAX` macro, which makes the CFA dependent on
the alignment of the stack. However, this macro is not really necessary because
some additional assumptions can be made on the alignment of the stack when the
trampolines are called. The code has been written as if the stack were guaranteed
to be 8-byte aligned; however, it is instead guaranteed to be misaligned by 8
bytes with respect to a 16-byte alignment. For this reason, always moving the
stack pointer by 8 bytes is sufficient to restore the appropriate alignment.
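The alignment argument, spelled out as a small self-contained check (illustration only, not the trampoline code):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  // At the call site the x86-64 psABI keeps RSP 16-byte aligned; the CALL
  // into the trampoline then pushes an 8-byte return address.
  uint64_t rsp_at_call_site = 0x7fffffffd000;          // 16-byte aligned (assumed)
  uint64_t rsp_in_trampoline = rsp_at_call_site - 8;   // after CALL
  assert(rsp_in_trampoline % 16 == 8);
  // So a fixed 8-byte adjustment always restores 16-byte alignment,
  // with no need for ALIGNED_CALL_RAX's dynamic realignment.
  assert((rsp_in_trampoline - 8) % 16 == 0);
  return 0;
}
```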
Trampolines that are called from within a function as a result of the builtins
`__xray_typedevent` and `__xray_customevent` are necessarily called with the
stack properly aligned, so in this case too `ALIGNED_CALL_RAX` can be
eliminated.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49060
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D96785
The hackery is due to glibc clock_gettime crashing from preinit_array (D40679).
32-bit musl architectures do not define `__NR_clock_gettime`, so the code causes a compile error.
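A purely illustrative sketch of the split this implies (names and structure are assumptions, not the sanitizer_common code):

```cpp
#include <time.h>
#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

static int monotonic_now(struct timespec *ts) {
#if defined(__GLIBC__) && defined(__NR_clock_gettime)
  // glibc: clock_gettime can crash when invoked from .preinit_array (D40679),
  // so issue the raw syscall instead.
  return syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts);
#else
  // musl and non-Linux: call the libc function; 32-bit musl in particular
  // does not define __NR_clock_gettime at all.
  return clock_gettime(CLOCK_MONOTONIC, ts);
#endif
}
```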
Tested on Alpine Linux x86-64 (musl) and FreeBSD x86-64.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D96925
GVN basically doesn't handle phi nodes at all. This is for a reason: we can't value-number their inputs, since the predecessor blocks have probably not been visited yet.
However, it also creates a significant pass-ordering problem. As it stands, instcombine and simplifycfg end up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress.
This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN, since blocks which use the CSE'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one of the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.)
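A minimal sketch of that preprocessing step (assumed shape and helper name; not the exact GVN code):

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Scan the PHIs at the top of the block and fold any PHI that is
// syntactically identical to an earlier one into that earlier PHI.
static void cseIdenticalPHIs(BasicBlock &BB) {
  SmallVector<PHINode *, 8> Seen;
  for (PHINode &PN : make_early_inc_range(BB.phis())) {
    auto It = find_if(Seen, [&](PHINode *Other) { return PN.isIdenticalTo(Other); });
    if (It != Seen.end()) {
      PN.replaceAllUsesWith(*It);
      PN.eraseFromParent();
    } else {
      Seen.push_back(&PN);
    }
  }
}
```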
Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect.
Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi.
Differential Revision: https://reviews.llvm.org/D98080
As a pragmatic tradeoff, the ease of updating the tests outweighs the slightly easier-to-understand test conditions. Where relevant, debug output was converted to comments to aid human understanding.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere. For ops that
don't have a SPIR-V spec counterpart, we use spv.mlir.snake_case.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D98014
Spack is a package management tool extensively used by the HPC community.
As the HPC community builds ROCm packages with Spack, we need to teach the
clang driver to detect ROCm installations built by Spack.
Reviewed By: Artem Belevich
Differential Revision: https://reviews.llvm.org/D97340
Normally tensors will be stored in buffers before converting to SPIR-V,
given that this is how large amounts of data are sent to the GPU. However,
SPIR-V supports converting from tensors directly too. This is for the
cases where the tensor contains only a small number of elements and it
makes sense to directly inline them as a small data array in the shader.
To handle this, the conversion might internally create new local
variables. SPIR-V consumers in GPU drivers may or may not optimize those
away, so this has implications for register pressure. Therefore, a
threshold is used to control when the patterns should kick in.
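A standalone model of that threshold decision (illustrative only; the names and the default value are assumptions, and the real code is an MLIR rewrite pattern):

```cpp
#include <cstdint>

// Assumed default; the actual patch exposes the threshold as an option.
constexpr int64_t kTensorInlineByteThreshold = 64;

// Inline small, statically shaped tensors directly as constant data in the
// shader; larger tensors should stay in buffers, since the local variables
// the conversion introduces may not be optimized away by the driver and can
// increase register pressure.
bool shouldInlineTensorAsConstant(int64_t numElements, int64_t elementByteWidth) {
  return numElements * elementByteWidth <= kTensorInlineByteThreshold;
}
```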
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D98052
This patch is extracted from D69372.
It fixes https://bugs.llvm.org/show_bug.cgi?id=42219.
In noimplicitfloat mode, the compiler must not generate
floating-point code unless it was directly asked to do so.
Currently this rule does not hold for variadic function arguments.
Although the compiler correctly guards the block of code that copies xmm vararg
parameters with a check of %al, it does not protect the spills of the xmm registers.
Thus, such spills are generated in non-protected areas and can break code
that does not expect floating-point data. The problem happens at the -O0
optimization level, which uses the fast register allocator; it spills virtual
registers at basic block boundaries and does not protect the spills with
additional control-flow modifications.
To resolve the problem, incoming physical xmm registers are no longer copied
into virtual registers; instead, they are stored directly to memory.
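A minimal C-level shape of the problem (illustrative only, not the regression test; compiling with the -mno-implicit-float driver flag is an assumed way to get the noimplicitfloat attribute):

```cpp
#include <cstdarg>

// On x86-64, the ABI passes the number of vector registers used by the caller
// in %al, and the variadic prologue's copy of %xmm0-%xmm7 into the register
// save area is guarded by a test of %al.  At -O0, however, the spills of
// those xmm values were emitted outside the guarded region, producing SSE
// instructions that could run even when no vector arguments were passed.
static int sum_ints(int count, ...) {
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}

int call_it() { return sum_ints(3, 1, 2, 3); }
```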
Differential Revision: https://reviews.llvm.org/D80163
There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).
Rather than adjusting the type check, simply drop it entirely,
as getAggregateElement() is well-defined for non-aggregates: It
simply returns null in that case.
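A small self-contained illustration of that behaviour (a vector splat and a plain scalar; nothing here is from the patch itself):

```cpp
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/LLVMContext.h"
using namespace llvm;

int main() {
  LLVMContext Ctx;
  Type *I32 = Type::getInt32Ty(Ctx);
  // A vector constant: not an aggregate for the type system, but
  // getAggregateElement() still knows how to index it.
  Constant *Splat = ConstantVector::getSplat(ElementCount::getFixed(4),
                                             ConstantInt::get(I32, 7));
  Constant *Elt = Splat->getAggregateElement(0u);      // i32 7
  // A scalar constant: getAggregateElement() simply returns null.
  Constant *Scalar = ConstantInt::get(I32, 1);
  Constant *None = Scalar->getAggregateElement(0u);    // nullptr
  return (Elt && !None) ? 0 : 1;
}
```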
Instead of handling a number of special cases for selects, handle
this generally when inferring ranges from conditions. We already
infer ranges from `x + C pred C2` to `x`, so doing the same for
`x pred C2` to `x + C` is straightforward.
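A hedged illustration of the symmetry using ConstantRange (standalone; not the code in this patch):

```cpp
#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/InstrTypes.h"
using namespace llvm;

int main() {
  // From the condition `x u< 10` we already get a range for x...
  ConstantRange X =
      ConstantRange::makeExactICmpRegion(CmpInst::ICMP_ULT, APInt(8, 10));
  // ...and adding the constant offset gives the corresponding range for
  // `x + 5`, which is the direction this patch infers as well.
  ConstantRange XPlus5 = X.add(ConstantRange(APInt(8, 5)));
  return (XPlus5.getLower() == 5 && XPlus5.getUpper() == 15) ? 0 : 1;
}
```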
It moved the logic for CMake target arguments into llvm_ExternalProject_Add().
No handling was added for CMAKE_CROSSCOMPILING, which has a separate set of compiler_args.
This broke cross-compiling, as the runtimes builds now defaulted to the compiler's default.
I've also added passing of CMAKE_ASM_COMPILER, which was missing before although we were passing the triple for it.
Reviewed By: zero9178
Differential Revision: https://reviews.llvm.org/D97855
These tests didn't test the pattern they were supposed to, because
%a instead of %add was used in the select, which turned this into
a normal min/max.
Noticed this when commenting out the clamp handling code did not
result in any test failures...
Without this patch, the file list of the preamble index contains URIs, while the file lists of other indexes contain file paths.
This makes `indexedFiles()` always return `IndexContents::None` for the preamble index, because the current implementation expects file paths inside the file list of the index.
This patch fixes this problem and also helps to avoid a lot of URI-to-path conversions when merging indexes.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D97535
If cross testing (and manually specifying a LIBCXX_TARGET_INFO in the
cmake configuration, as the default is to match the build platform),
we want the accessors for querying the target platform, is_windows,
is_darwin, to return the right value depending on which target info
class is used, not based on what platform is running the build and
driving the tests.
When LIBCXX_TARGET_INFO isn't defined, the right target info class
is chosen automatically based on the platform one is running on, so
this shouldn't make any practical difference for such setups.
Differential Revision: https://reviews.llvm.org/D98045
For MinGW targets, we distinguish between an explicitly shared unwinder
library (requested via -shared-libgcc), an explicitly static one
(requested via -static-libgcc or -static) and the default case (which
just passes -lunwind to the linker, which will pick either shared or
static depending on what's available, with the normal linker logic).
This makes the implicit default case (as added in D79995) actually work as
it was intended, when using the g++ driver (which is the main use case for
libunwind as far as I know).
Differential Revision: https://reviews.llvm.org/D98023
CLANG_DEFAULT_RTLIB had a typo, and libunwind isn't a valid
option for it.
This keeps the actual behaviour from before, defaulting to none if
using compiler-rt as rtlib.
Differential Revision: https://reviews.llvm.org/D98022
Implements parts of:
- P0898R3 Standard Library Concepts
- P1754 Rename concepts to standard_case for C++20, while we still can
Depends on D96742
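For illustration, a couple of the <concepts> facilities this work covers (which exact concepts land in this particular patch isn't spelled out here, so treat these as examples):

```cpp
#include <concepts>

// snake_case naming per P1754; definitions per P0898R3.
template <std::integral T>
constexpr T twice(T x) { return x + x; }

static_assert(std::same_as<decltype(twice(21)), int>);
static_assert(std::convertible_to<char, int>);
static_assert(!std::same_as<int, long>);
```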
Differential Revision: https://reviews.llvm.org/D97162
Prior to this fix, constrained decltype(auto) behaves exactly the same
as constrained regular auto.
This fixes it so it deduces like decltype(auto).
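A small example of the difference (illustrative; not from the test suite):

```cpp
#include <concepts>

int global = 0;
int &ref() { return global; }

// With the fix, a constrained decltype(auto) still deduces like
// decltype(auto), preserving the reference:
std::same_as<int &> decltype(auto) r = ref();   // deduces int&, constraint OK

// A constrained plain auto deduces like auto and drops the reference:
std::same_as<int> auto v = ref();               // deduces int
```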
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D98087
We've observed this test being significantly flaky on our Mac CI
machines when we're running the full check-clang suite. It fails because
the wait_for condition isn't met within 3 seconds. We believe it's
because our CI machines are somewhat underpowered and pretty heavily
loaded when we're running the full check-clang suite.
I ran some experiments on increasing the timeout. I ran the full
check-clang suite 100 times with each timeout value and recorded how
many flaky failures we encountered in these tests. The results are:
3 second timeout (baseline): 20 failures
10 second timeout: 14 failures
20 second timeout: 4 failures
30 second timeout: 2 failures
40 second timeout: 1 failure
50 second timeout: 0 failures
60 second timeout: 0 failures
I ran another set of 100 tests for the 50 second timeout and observed
one flaky failure. By contrast, I ended up running check-clang 500 times
for the 60 second timeout and didn't observe a single flaky failure.
That's how the 60 second timeout value used in this patch was derived.
While a 60 second timeout might seem high, keep in mind that:
- This is a timeout, not a sleep; the test should require much less time in
the vast majority of instances, especially on more powerful machines.
- The long timeout is most likely to occur when other tests are also
running at the same time, so the latency of the timeout will also be
masked by the latency of the other tests.
See https://reviews.llvm.org/D58418?id=200123#inline-554211 for where
this timeout was originally introduced and where the possibility of raising
it if it wasn't enough was discussed.
Reviewed By: plotfi
Differential Revision: https://reviews.llvm.org/D97878