llvm-project

Commit Graph

Author	SHA1	Message	Date
LLVM GN Syncbot	2b9016745d	[gn build] Port `aff57ff24a`	2021-06-26 11:38:00 +00:00
Lang Hames	aff57ff24a	[JITLink][ELF] Add generic ELFLinkGraphBuilder template. ELFLinkGraphBuilder<ELFT> will hold generic parsing and LinkGraph-building code that can be shared between JITLink ELF backends for different architectures. For now it's just a stub. The plan is to incrementally move functionality down from ELFLinkGraphBuilder_x86_64 into the new template.	2021-06-26 21:37:33 +10:00
Timm Bäder	3255db4919	[clang][tests] Specify unwindlib in aix-ld tests Clang can be configured with a different default unwindlib, for example gcc. In that case, -lunwind will not be present in the output. Fix this by explicitly specifying libunwind as the unwindlib. Differential Revision: https://reviews.llvm.org/D104899	2021-06-26 13:09:29 +02:00
Jim Lin	779d2b0a42	[RISCV][NFC] Combine the control flow for different RetOp of interrupt function Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104838	2021-06-26 17:28:03 +08:00
Saurabh Jha	c8f3f46c69	[Docs] Minor fixes with language extension docs There were some issues in the patch https://reviews.llvm.org/D104198. I also forgot to address one comment. This patch addresses these. Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D104971	2021-06-26 10:07:33 +01:00
Craig Topper	d4f4a1ba62	[RISCV] Add DAG combine to detect opportunities to replace (i64 (any_extend (i32 X)) with sign_extend. If type legalization is going to insert a sign_extend for other users of X and we can fold the sign_extend into ADDW/MULW/SUBW, it is better to replace the ANY_EXTEND so we don't end up with a separate ADD/MUL/SUB instruction for the users of the ANY_EXTEND. I'm only handling setcc uses right now, but there are other instructions that force sign_extends like ashr. There are probably other *W instructions we could use in addition to ADDW/SUBW/MULW. My motivating case was a loop terminating compare and a phi use as seen in the new test file. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D104581	2021-06-25 23:16:37 -07:00
Gus Smith	043ce4e6bd	[MLIR][Sparse] Move `buildLattices` into Merger This allows us to use `buildLattices` in the `Merger` unittests. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D104879	2021-06-26 05:05:05 +00:00
Eric Astor	e074d580b2	[ms] [llvm-ml] Disable C-style comments	2021-06-25 23:09:13 -04:00
Luo, Yuanke	36003c20ad	[X86] Selecting fld0 for undefined value in fast ISEL. When set opt-bisect-limit to some value that is less than ISel pass in command line and CurBisectNum expired, "DAG to DAG" pass lower its opt level to O0. However "processimpdefs" and "X86 FP Stackifier" is not stopped due to the CurBisectNum expiration. So undefined fp0 is generated. This cause crash in the "X86 FP Stackifier" pass, because Stackifier doesn't expect any undefined fp value. Here is the scenario that cause compiler crash. successors: %bb.26 liveins: $r14 ST_FPrr $st0, implicit-def $fpsw, implicit $fpcw renamable $rdi = MOV64ri @.str.3.16422 renamable $rdx = LEA64r %stack.6, 1, $noreg, 0, $noreg ADJCALLSTACKDOWN64 0, 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp dead $esi = MOV32r0 implicit-def dead $eflags, implicit-def $rsi CALL64pcrel32 @foo, implicit $rsp, implicit $ssp, implicit $rdi, implicit $rsi, implicit $rdx, implicit-def dead $fp0 renamable $xmm0 = MOVSDrm_alt %stack.10, 1, $noreg, 0, $noreg :: (load 8 from %stack.10) ADJCALLSTACKUP64 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp renamable $fp2 = CHS_Fp80 killed undef renamable $fp0, implicit-def $fpsw JMP_1 %bb.26 The CALL64pcrel32 mark fp0 dead, so llvm free the stack slot for fp0 and the stack become empty. In the late instruction CHS_Fp80, it use undefined register fp0, the original code assume there must be a stack slot for the src register (fp0) without respecting it is undefined, so llvm report error. We have some discussion in https://reviews.llvm.org/D104440 and we decide to fix it in fast ISel. The fix is to lower undefined fp value to zero value, so that it release the burden of "X86 FP Stackifier" pass. Thank Craig for the suggestion and the initial patch to fix it. Differential Revision: https://reviews.llvm.org/D104678	2021-06-26 08:43:09 +08:00
Jon Chesterfield	50ad3478bd	Disable ReplaceLDS pass, patch up tests to match Most tests passed with an extra argument to explicitly enable the pass. One does not, deleted it as part of this change. I can't see why the codegen would be different between default on and default off but switched on. It can be retrieved from the project history. This would be a revert, but git revert was not clean. Disabling the pass and leaving it in tree is less likely to cause breakage elsewhere than patching up the git revert conflicts on unfamiliar code. It'll be landed without review, as @hsmhsm is believed unavailable at present. Differential Revision: https://reviews.llvm.org/D104962	2021-06-26 01:36:42 +01:00
Andrew Browne	45f6d5522f	[DFSan] Change shadow and origin memory layouts to match MSan. Previously on x86_64: +--------------------+ 0x800000000000 (top of memory) \| application memory \| +--------------------+ 0x700000008000 (kAppAddr) \| \| \| unused \| \| \| +--------------------+ 0x300000000000 (kUnusedAddr) \| origin \| +--------------------+ 0x200000008000 (kOriginAddr) \| unused \| +--------------------+ 0x200000000000 \| shadow memory \| +--------------------+ 0x100000008000 (kShadowAddr) \| unused \| +--------------------+ 0x000000010000 \| reserved by kernel \| +--------------------+ 0x000000000000 MEM_TO_SHADOW(mem) = mem & ~0x600000000000 SHADOW_TO_ORIGIN(shadow) = kOriginAddr - kShadowAddr + shadow Now for x86_64: +--------------------+ 0x800000000000 (top of memory) \| application 3 \| +--------------------+ 0x700000000000 \| invalid \| +--------------------+ 0x610000000000 \| origin 1 \| +--------------------+ 0x600000000000 \| application 2 \| +--------------------+ 0x510000000000 \| shadow 1 \| +--------------------+ 0x500000000000 \| invalid \| +--------------------+ 0x400000000000 \| origin 3 \| +--------------------+ 0x300000000000 \| shadow 3 \| +--------------------+ 0x200000000000 \| origin 2 \| +--------------------+ 0x110000000000 \| invalid \| +--------------------+ 0x100000000000 \| shadow 2 \| +--------------------+ 0x010000000000 \| application 1 \| +--------------------+ 0x000000000000 MEM_TO_SHADOW(mem) = mem ^ 0x500000000000 SHADOW_TO_ORIGIN(shadow) = shadow + 0x100000000000 Reviewed By: stephan.yichao.zhao, gbalats Differential Revision: https://reviews.llvm.org/D104896	2021-06-25 17:00:38 -07:00
Siva Chandra Reddy	2e9c75daff	[libc] Use __builtin_ctzll instead of __builtin_ctzl in elements_x86.h. __builtin_ctzl takes an unsigned long argument which need not be 64-bit long on all platforms. Using __builtin_ctzll, which takes an unsigned long long argument, ensures that 64-bit values will be handled on a wider range of platforms. Without this change, the test corresponding to M512 fails in Windows. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D104897	2021-06-25 22:58:13 +00:00
Nikita Popov	fdd4c199a1	Revert "[InstCombine] Make indexed compare fold opaque ptr compatible" This reverts commit `5cb20ef8a2`. Assertion failures with this patch were reported on https://reviews.llvm.org/rG5cb20ef8a235, revert for now.	2021-06-26 00:32:59 +02:00
Duncan P. N. Exon Smith	4506f614cb	OpaquePtr: Reject 'ptr' again when parsing textual IR Bring back the testcase dropped in `1e6303e60c` and get it passing by checking explicitly for `ptr` in LLParser. Uses `Type::isOpaquePointerTy()` from `ad4bb82809`. Differential Revision: https://reviews.llvm.org/D104938	2021-06-25 15:18:44 -07:00
Aart Bik	557b101ce7	[mlir][sparse] add print methods to Merger (for debugging) Reviewed By: gussmith23 Differential Revision: https://reviews.llvm.org/D104939	2021-06-25 15:10:06 -07:00
Matheus Izvekov	ad14b5b008	[clang] Stop providing builtin overload candidate for relational function pointer comparisons Word on the grapevine was that the committee had some discussion that ended with unanimous agreement on eliminating relational function pointer comparisons. We wanted to be bold and just ban all of them cold turkey. But then we chickened out at the last second and are going for eliminating just the spaceship overload candidate instead, for now. See D104680 for reference. This should be fine and "safe", because the only possible semantic change this would cause is that overload resolution could possibly be ambiguous if there was another viable candidate equally as good. But to save face a little we are going to: * Issue an "error" for three-way comparisons on function pointers. But all this is doing really is changing one vague error message, from an "invalid operands to binary expression" into an "ordered comparison of function pointers", which sounds more like we mean business. * Otherwise "warn" that comparing function pointers like that is totally not cool (unless we are told to keep quiet about this). Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D104892	2021-06-26 00:08:02 +02:00
Jonas Devlieghere	ffc0533855	[lldb] Use the non-locking variant of objc_copyRealizedClassList Avoid standing the Objective-C runtime lock by calling objc_copyRealizedClassList_nolock instead of objc_copyRealizedClassList. We already guarantee that no other threads can run while we're running this utility expression, similar to when we parse the data ourselves from the gdb_objc_realized_classes struct. Worst case this will crash if the list is getting edited, which won't do any harm and we'll just try again later. Differential revision: https://reviews.llvm.org/D104951	2021-06-25 15:02:49 -07:00
Jim Ingham	4eabb12057	Add support for the NSMutableDictionary variant: "__NSFrozenDictionaryM" This was an oversight of the commit: `bb93483c11` that added support for the Frozen variants. Also added a test case for the way that currently produces one of these variants (a copy).	2021-06-25 14:59:26 -07:00
Eli Friedman	8d5bf0709d	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Eric Astor	c8d0d8a8a1	[ms] [llvm-ml] Add support for ALIGN, EVEN, and ORG directives Match ML.EXE's behavior for ALIGN, EVEN, and ORG directives both at file level and in STRUCTs. We currently reject negative offsets passed to ORG inside STRUCTs (in ML.EXE and ML64.EXE, they wrap around as for an unsigned 32-bit integer). Also, if a STRUCT is declared using an ORG directive, no value of that type can be defined. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92507	2021-06-25 17:19:45 -04:00
Adrian Prantl	4cf7c6c6a4	Change PathMappingList::RemapPath to return an optional result (NFC) This is an NFC modernization refactoring that replaces the combination of a bool return + reference argument, with an Optional return value. Differential Revision: https://reviews.llvm.org/D104404	2021-06-25 14:15:29 -07:00
Juneyoung Lee	1605593440	[SimplifyLibCalls] Fix memchr opt to use CreateLogicalAnd This fixes a bug at LibCallSimplifier::optimizeMemChr which does the following transformation: ``` // memchr("\r\n", C, 2) != nullptr -> (1 << C & ((1 << '\r') \| (1 << '\n'))) // != 0 // after bounds check. ``` As written above, a bounds check on C (whether it is less than integer bitwidth) is done before doing `1 << C` otherwise 1 << C will overflow. If the bounds check is false, the result of (1 << C & ...) must not be used at all, otherwise the result of shift (which is poison) will contaminate the whole results. A correct way to encode this is `select i1 (bounds check), (1 << C & ...), false` because select does not allow the unused operand to contaminate the result. However, this optimization was introducing `and (bounds check), (1 << C & ...)` which cannot do that. The bug was found from compilation of this C++ code: https://reviews.llvm.org/rG2fd3037ac615#1007197 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104901	2021-06-26 05:59:35 +09:00
Joseph Huber	5ccb7424fa	[OpenMP] Change OpenMPOpt to check openmp metadata The metadata added in D102361 introduces a module flag that we can check to determine if the module was compiled with `-fopenmp` enables. We can now check for the precense of this instead of scanning the call graph for OpenMP runtime functions. Depends on D102361 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102423	2021-06-25 16:34:22 -04:00
Joseph Huber	9ce02ea8c9	[OpenMP] Add Module metadata for OpenMP compilation This patch adds a module level metadata flag indicating that the module was compiled with the `-fopenmp` flag. This will make it easier for passes like OpenMPOpt to determine if it should be run. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102361	2021-06-25 16:34:19 -04:00
Nemanja Ivanovic	4e22c7265d	[PowerPC] Disable combine 64-bit bswap(load) without LDBRX This causes failures on the big endian bootstrap bot. Disabling this combine temporarily until I can get a proper fix.	2021-06-25 15:11:22 -05:00
Valeriy Savchenko	d646157146	[analyzer] Fix assertion failure on code with transparent unions rdar://76948312 Differential Revision: https://reviews.llvm.org/D104716	2021-06-25 23:09:16 +03:00
Martin Storsjö	bdb03557c0	[llvm-rc] Don't rewrite the arch in the default triple unless necessary When the default target arch isn't one that is supported as a windows target, we want to set a suitable architecture (so that Clang tests that run plain 'llvm-rc' succeed checks for e.g. "#ifdef _WIN32" even for llvm builds that default to e.g. ppc64). But if the default target architecture is usable, don't rewrite it. (Rewriting it, by e.g. "T.setArch(T.getArch())", normalizes the spelling of the architecture, e.g. changing i686 to i386. Such a change can make clang unable to find the right sysroot.) This can't, unfortunately, practically be tested very well because it is entirely dependent on the default triple of the llvm build. Differential Revision: https://reviews.llvm.org/D104589	2021-06-25 22:59:09 +03:00
Fangrui Song	2508733e1b	[ELF] --sysroot: change sysrooted script to not fall back for an absolute path Modify the D13209 logic: for a script inside the sysroot, if an absolute path does not exist, report an error instead of falling back to the path without the sysroot prefix. This matches GNU ld, which makes sense to me: we don't want to find an arbitrary file in the host. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D104894	2021-06-25 12:52:39 -07:00
Ulrich Weigand	b2674670f2	[SystemZ] Add support for .reloc assembler directive Add support for the .reloc directive along the lines of other back-ends. This fixes a regression after https://reviews.llvm.org/D104080 was merged, since that patch presupposed support for .reloc.	2021-06-25 21:51:10 +02:00
Hongtao Yu	3638085ff0	[Coroutines] Define __coro_frame_ty in function scope Types should be defined in function scope instead of a local lexical scope. Field types should be defined inside in its parent type scope. We were seeing a type defined in a local scope causing trouble to the dwarf emitter where a context is required to be a funciton scope, a namespace or a global scope. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D104937	2021-06-25 12:33:20 -07:00
Eugene Zhulenev	34a164c938	[mlir:Async] Submit accidentally omitted changes Accidentally pushed old branches that did not include all the changes discussed in the PRs. https://reviews.llvm.org/rGd43b23608ad664f02f56e965ca78916bde220950 https://reviews.llvm.org/rG86ad0af87054c3cccd68d32e103a6f1f6c6194c7 Differential Revision: https://reviews.llvm.org/D104943	2021-06-25 12:23:02 -07:00
Nikita Popov	5b2573e9c7	[OpaquePtr] Enumerate GlobalAlias value type The type is no longer implicitly enumerated through the pointer type.	2021-06-25 21:21:10 +02:00
Arthur O'Dwyer	585496803c	[libc++] Enable the rvalue overloads of operator<< and operator>> even in C++03. Continuing to eliminate no-longer-needed uses of _LIBCPP_CXX03_LANG. Differential Revision: https://reviews.llvm.org/D104725	2021-06-25 14:59:58 -04:00
Nikita Popov	ad4bb82809	[IR] Add Type::isOpaquePointerTy() helper (NFC) Shortcut to check for opaque pointers without a cast to PointerType.	2021-06-25 20:56:59 +02:00
peter klausler	3cc5d4ff8e	[flang] Fix generic/specific procedure confusion A recent change that extended semantic analysis for actual arguments that associate with procedure dummy arguments exposed some bugs in regression test suites due to points of confusion in symbol table handling in situations where a generic interface contains a specific procedure of the same name. When passing that name as an actual argument, for example, it's necessary to take this possibility into account because the symbol for the generic interface shadows the symbol of the same name for the specific procedure, which is what needs to be checked. So add a small utility that bypasses the symbol for a generic interface in this case, and use it where needed. Differential Revision: https://reviews.llvm.org/D104929	2021-06-25 11:54:29 -07:00
David Green	b8c8bb0769	[DAG] Fold neg(splat(neg(x)) -> splat(x) This add as a fold of sub(0, splat(sub(0, x))) -> splat(x). This can come up in the lowering of right shifts under AArch64, where we generate a shift left of a negated number. Differential Revision: https://reviews.llvm.org/D103755	2021-06-25 19:53:29 +01:00
Craig Topper	0f3bc00a7d	[X86] Simplify part of the isel for X86ISD::FCMP/STRICT_FCMP/STRICT_FCMPS. We don't need to have the compare output a value and then copy it to FPSW for use by FNSTSW. Instead we can just have the compare output Glue and glue the FNSTSW to it. InstrEmitter effectively performed this optimization when emitting the Machine IR. Doing it directly simplifies the codes and reduces the work in InstrEmitter. There's no change in the machine IR at the end of isel before and after this change.	2021-06-25 11:39:01 -07:00
Joel E. Denny	cc60fa2685	[UpdateCCTestChecks] Fix new test from `9eaf0d120d` `clang/test/utils/update_cc_test_checks/check-globals.test` from `9eaf0d120d` broke at: * <https://lab.llvm.org/buildbot/#/builders/110/builds/4415> * <https://lab.llvm.org/buildbot/#/builders/5/builds/9076> The problem is non-deterministic test order because the `.lit_test_times.txt` from one run of a sample test suite affects the other.	2021-06-25 14:29:58 -04:00
David Green	77ae9b364a	[AArch64] Extra negated shift tests. NFC	2021-06-25 19:17:31 +01:00
Jon Chesterfield	f66b8fdc0a	[libomptarget][amdgpu] Build openmp for two more targets [libomptarget][amdgpu] Build openmp for two more targets The 4800U APU is a gfx902 and the MI100 accelerator is a gfx908. Both numbers are listed in ROCT topology.c Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D104922	2021-06-25 19:02:03 +01:00
Nico Weber	fda790fbfa	[clang] Make fewer assumptions about path to lit.site.cfg after `9eaf0d120d`	2021-06-25 14:01:29 -04:00
Philip Reames	9714d08e94	[test] Add coverage for existing overflow rule with uadd.with.overflow	2021-06-25 10:45:00 -07:00
Florian Hahn	cc5ee857f9	[LV] Doxygenize VectorizationFactor member comments (NFC). Minor cleanup for follow-up patch.	2021-06-25 18:35:00 +01:00
Eugene Zhulenev	86ad0af870	[mlir:Async] Implement recursive async work splitting for scf.parallel operation (async-parallel-for pass) Depends On D104780 Recursive work splitting instead of sequential async tasks submission gives ~20%-30% speedup in microbenchmarks. Algorithm outline: 1. Collapse scf.parallel dimensions into a single dimension 2. Compute the block size for the parallel operations from the 1d problem size 3. Launch parallel tasks 4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space 5. Each parallel task computes the original parallel operation body using scf.for loop nest Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D104850	2021-06-25 10:34:39 -07:00
Eugene Zhulenev	d43b23608a	[mlir:Async] Add the size parameter to the async.group Specify the `!async.group` size (the number of tokens that will be added to it) at construction time. `async.await_all` operation can potentially race with `async.execute` operations that keep updating the group, for this reason it is required to know upfront how many tokens will be added to the group. Reviewed By: ftynse, herhut Differential Revision: https://reviews.llvm.org/D104780	2021-06-25 10:26:50 -07:00
Philip Reames	2cd23eb243	[instcombine] Fold overflow check using umulo to comparison If we have a umul.with.overflow where the multiply result is not used and one of the operands is a constant, we can perform the overflow check cheaper with a comparison then by performing the multiply and extracting the overflow flag. (Noticed when looking at the conditions SCEV emits for overflow checks.) Differential Revision: https://reviews.llvm.org/D104665	2021-06-25 10:25:45 -07:00
Joel E. Denny	9eaf0d120d	[UpdateCCTestChecks] Support --check-globals This option is already supported by update_test_checks.py, but it can also be useful in update_cc_test_checks.py. For example, I'd like to use it in OpenMP offload codegen tests to check global variables like `.offload_maptypes*`. Reviewed By: jdoerfert, arichardson, ggeorgakoudis Differential Revision: https://reviews.llvm.org/D104714	2021-06-25 13:17:56 -04:00
Philip Reames	e80a21d632	[test][instcombine] Add test cases for all x.with.overflow overflow checks For each of the x.with.overflow variants, if only the overflow bit is consumed, we can generate a direct overflow comparison. This precommits tests for each of the variants and tries to cover interesting cornercases.	2021-06-25 10:09:58 -07:00
Stephan Herhut	fb0dbc5481	[mlir][memref] Introduce UnrankedMemRefOf to tablegen definitions This enables specifying operations that only support some element types for unranked memrefs. Differential Revision: https://reviews.llvm.org/D104906	2021-06-25 18:52:04 +02:00
Hendrik Greving	e15e1417b9	[ModuloSchedule] Pass loop block explicitly to kernel rewriter. This change is NFC upstream. We pass in the loop's block to the kernel rewriter explicitly, instead of assuming it's the loop's top block. This change is made for downstream targets where this assumption doesn't hold. Differential Revision: https://reviews.llvm.org/D104811	2021-06-25 09:51:22 -07:00

1 2 3 4 5 ...

392120 Commits All Branches Search

392120 Commits

All Branches