llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	f936457f80	Revert "Recommit "[LV] Induction Variable does not remain scalar under tail-folding."" This reverts commit `ae45b4dbe7`. It causes miscompilations, test case on the mailing list.	2020-05-08 14:49:10 +02:00
Simon Pilgrim	5f9f37c42a	[X86][AVX] Don't let X86ISD::BROADCAST peek through bitcasts to illegal types. This was an existing bug exposed by the more aggressive X86ISD::BROADCAST generation by rG8817334ce3c7 Original test case thanks to @mstorsjo	2020-05-08 12:30:50 +01:00
Simon Pilgrim	a0da4466d8	RemarkStringTable.h - reduce StringRef/Remark includes to forward declarations. NFC Move StringRef.h include down to RemarkStringTable.cpp and remove some unused includes there as well.	2020-05-08 12:30:49 +01:00
Simon Pilgrim	8fd9af4518	Remark.h - reduce ArrayRef.h include to SmallVector.h. NFC. We only need to include SmallVector.h in Remark.h, and then the more bulky ArrayRef.h in Remark.cpp.	2020-05-08 11:10:28 +01:00
Simon Pilgrim	ffd9cfa740	AArch64/ARMTargetParser.h - move Triple.h dependency down to cpp file. NFC. Reduce Triple.h include to a forward declaration in the header. Only the implementations in the cpp files need the actual Triple class definition.	2020-05-08 11:10:28 +01:00
Igor Kudrin	6ab09e7177	Fix a failing test. Differential Revision: https://reviews.llvm.org/D79501	2020-05-08 15:36:01 +07:00
Nikita Popov	5a2265647e	Reapply [InstSimplify] Remove known bits constant folding No changes relative to last time, but after a mitigation for an AMDGPU regression landed. --- If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well. On average, we spend ~1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics for known bits computations and how many of them succeed in simplifying the instruction we get (on test-suite): instsimplify.NumKnownBits: 216 instsimplify.NumKnownBitsComputed: 13828375 valuetracking.NumKnownBitsComputed: 45860806 Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed to compute all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change. On lencod we see an improvement (a loop phi is optimized away), on the GCC torture test a regression (a function return value is determined only after IPSCCP, preventing propagation from a noinline function.) There are various regressions in InstSimplify tests. However, all of these cases are already handled by InstCombine, and corresponding tests have already been added there. Differential Revision: https://reviews.llvm.org/D79294	2020-05-08 10:24:53 +02:00
Igor Kudrin	989ae9e848	[DebugInfo] Fix handling DW_OP_call_ref in DWARF64 units. DW_OP_call_ref is the only operation that has an operand which depends on the DWARF format. The patch fixes handling that operation in DWARF64 units. Differential Revision: https://reviews.llvm.org/D79501	2020-05-08 15:14:42 +07:00
Igor Kudrin	050c9dd43a	[DebugInfo] Fix printing values of forms which depend on the DWARF format. The values are 8 bytes long in DWARF64, so they should not be truncated to uint32_t on dumping. Differential Revision: https://reviews.llvm.org/D79093	2020-05-08 15:14:41 +07:00
Nikita Popov	5fa87ec004	[AMDGPU] Try to determine sign bit during div/rem expansion This is preparation for D79294, which removes an expensive InstSimplify optimization, on the assumption that it will be picked up by InstCombine instead. Of course, this does not hold up if a backend performs non-trivial IR expansions without running a canonicalization pipeline afterwards, which turned up as an issue in the context of AMDGPU div/rem expansion. This patch mitigates the issue by explicitly performing a known bits calculation where it matters. No test changes, as those would only be visible after the other patch lands. Differential Revision: https://reviews.llvm.org/D79596	2020-05-08 10:11:26 +02:00
Craig Topper	5e74cf2999	[X86] Add v32i8 and v64i8 tests to vec_smulo.ll and vec_umulo.ll. NFC I was looking at our vXi8 handling in LowerMULH and noticed that vXi8 mulo uses it but we don't test all types.	2020-05-07 22:17:25 -07:00
aartbik	771d30c647	[llvm] [CodeGen] Fixed vector halving bug for masked store Summary: Note that this fix is very similar to what has already been done for the masked load in https://reviews.llvm.org/D78608 Bugs: https://bugs.llvm.org/show_bug.cgi?id=45563 https://bugs.llvm.org/show_bug.cgi?id=45833 Reviewers: craig.topper, nicolasvasilache, mehdi_amini Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79611	2020-05-07 19:01:40 -07:00
Xing GUO	ce86a986c3	[Object] Remove unneeded check in ELFFile<ELFT>::dynamicEntries(). Check for `DynSecSize % sizeof(Elf_Dyn) != 0` is unneeded in this context. 1. If the .dynamic section is acquired from program headers, the .dynamic section is "cut off" by ``` makeArrayRef(..., Phdr.p_filesz / sizeof(Elf_Dyn)); DynSecSize = Phdr.p_filesz; ``` 2. If the .dynamic section is acquired from section headers, the .dynamic section is checked in `getSectionContentsAsArray<Elf_Dyn>(&Sec)`. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D79560	2020-05-08 09:54:36 +08:00
Diego Caballero	f5224d437e	[LoopFusion] Remove unreachable blocks from DT and LI after fusion This patch removes FC0.ExitBlock and FC1GuardBlock from DT and LI after fusion of guarded loops. They become unreachable and LI verification failed when they happened to be inside another loop. Reviewed By: kbarton Differential Revision: https://reviews.llvm.org/D78679	2020-05-07 16:44:40 -07:00
Nico Weber	29396059a4	Revert "[YAMLVFSWriter][Test][NFC] Add couple tests" This reverts commit `7143d79254`. Breaks check-llvm on Windows, see e.g. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/15919/steps/stage%201%20check/logs/stdio	2020-05-07 19:07:08 -04:00
James Y Knight	7af9d386da	Correctly modify the CFG in IfConverter, and then remove the CorrectExtraCFGEdges function. The latter was a workaround for "Various pieces of code" leaving bogus extra CFG edges in place. Where by "various" it meant only IfConverter::MergeBlocks, which failed to clear all of the successors of dead blocks it emptied out. This wouldn't matter a whole lot, except that the dead blocks remained listed as predecessors of still-useful blocks, inhibiting optimizations. This fix slightly changed two thumb tests, because the correct CFG successors allowed for the "diamond" if-conversion pattern to be detected, when it could only use "simple" before. Additionally, the removal of a now-redundant call to analyzeBranch (with AllowModify=true) in BranchFolder::OptimizeFunction caused a later check for an empty block in BranchFolder::OptimizeBlock to fail. Correct this by moving the call to analyzeBranch in OptimizeBlock higher. Differential Revision: https://reviews.llvm.org/D79527	2020-05-07 18:17:07 -04:00
Johannes Doerfert	edf0391491	[Attributor][FIX] Record dependences for assumed dead abstract attributes In a recent patch we introduced a problem with abstract attributes that were assumed dead at some point. Since `Attributor::updateAA` was introduced in `95e0d28b71`, we did not remember the dependence on the liveness AA when an abstract attribute was assumed dead and therefore not updated. Explicit reproducer added in liveness.ll. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 509242 (345483/s) temporary memory allocations: 98666 (66937/s) peak heap memory consumption: 18.60MB peak RSS (including heaptrack overhead): 103.29MB total memory leaked: 269.10KB ``` After: ``` calls to allocation functions: 529332 (355494/s) temporary memory allocations: 102107 (68574/s) peak heap memory consumption: 19.40MB peak RSS (including heaptrack overhead): 102.79MB total memory leaked: 269.10KB ``` Difference: ``` calls to allocation functions: 20090 (1339333/s) temporary memory allocations: 3441 (229400/s) peak heap memory consumption: 801.45KB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ```	2020-05-07 17:00:50 -05:00
Johannes Doerfert	675334daef	[Attributor] Mark dependence as optional	2020-05-07 17:00:50 -05:00
Ed Maste	21e5e1724b	getMainExecutable: Fix hand-rolled AT_EXECPATH for older FreeBSD Once we hit AT_NULL, we need to bail out of the loop; not just the enclosing switch. This fixes basic usage (e.g. `cc --version`) when AT_EXECPATH isn't present on older branches (e.g. under emu-user-static, at the moment), where we would previously run off the end of ::environ. Patch By: kevans Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D79239	2020-05-07 17:05:17 -04:00
Alina Sbirlea	6227f021ad	[SimpleLoopUnswitch] Update DefaultExit condition to check unreachable is not empty. Summary: Update the check for the default exit block to not only check that the terminator is not unreachable, but also check that unreachable block has only the unreachable instruction. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78277	2020-05-07 13:48:30 -07:00
Sanjay Patel	5b48f7d2fc	[VectorCombine] adjust test to make intent clearer; NFC Create a non-zero result to show that the other lane is computed correctly.	2020-05-07 16:21:17 -04:00
Huihui Zhang	e8ea1eb4c1	[NFC] Adjust test check lines for D78267. This wasn't identified through buildbot before.	2020-05-07 13:20:15 -07:00
Huihui Zhang	1ec0cc0f02	[InstCombine][SVE] Fix visitExtractElementInst for scalable type. Summary: This patch fixes the following issues with visitExtractElementInst: 1. Restrict VectorUtils::findScalarElement to fixed-length vectors. For scalable types, the number of elements in the shuffle mask is unknown at compile time. 2. Fix out-of-range calculation for fixed-length vectors. 3. Skip scalable types when the analysis relies on a fixed number of elements. 4. Add unit tests to check functionality of extractelement for scalable types. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78267	2020-05-07 13:03:52 -07:00
Nico Weber	d03838343f	Make -Wnonportable-include-path ignore drive case on Windows. See PR45812 for motivation. No explicit test since I couldn't figure out how to get the current disk drive in lower case into a form in lit where I could mkdir it and cd to it. But the change does have test coverage in that I can remove the case normalization in lit, and tests failed on several bots (and for me locally if in a pwd with a lower-case drive) without that normalization prior to this change. Differential Revision: https://reviews.llvm.org/D79531	2020-05-07 15:54:09 -04:00
Huihui Zhang	08c9c13749	[InstCombine][SVE] Fix visitInsertElementInst for scalable type. Summary: This patch fixes the following issues in visitInsertElementInst: 1. Bail out for scalable types when the analysis requires a fixed number of vector elements. 2. Use cast<FixedVectorType> to get the number of vector elements. This ensures an assertion on scalable vector types. 3. For scalable types, avoid folding a chain of insertelement into a splat: insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ... -> shufflevector(insertelt(X, %k, 0), undef, zero) The length of a scalable vector is unknown at compile time, therefore we don't know if a given insertelement sequence is valid for a splat. Reviewers: sdesmalen, efriedma, spatel, nikic Reviewed By: sdesmalen, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78895	2020-05-07 12:44:52 -07:00
Sanjay Patel	5d0f2fdfa5	[VectorCombine] add tests with undefs; NFC Goes with D79452.	2020-05-07 15:28:26 -04:00
Matt Arsenault	6f17b3e3a7	AMDGPU: Fix broken tests for HSA metadata These were testing byval private kernel arguments, which doesn't make any sense and has never been used. There didn't seem to be any tests for real value struct arguments, which are.	2020-05-07 15:27:12 -04:00
Sanjay Patel	02051c7f3a	[SLP] add another bailout for load-combine patterns (2nd try) The original patch (rG86dfbc676ebe) exposed an existing bug: we could wrongly cast a constant expression to BinaryOperator because the pattern matching allows that. This adds a check for that case, and there's a reduced test case to verify no crashing. Original commit message: This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... 
In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-07 15:04:37 -04:00
Sanjay Patel	62ea77ec02	[SLP] add test for constant expression fake of load-combine pattern; NFC This is a reduction of the test that caused D78997 to be reverted.	2020-05-07 15:04:37 -04:00
Hiroshi Yamauchi	1b4e3def03	[BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare. Summary: This helps detect some missed BFI updates during CodeGenPrepare. This is debug build only and disabled behind a flag. Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts(). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77417	2020-05-07 11:58:00 -07:00
Jan Korous	7143d79254	[YAMLVFSWriter][Test][NFC] Add couple tests Differential Revision: https://reviews.llvm.org/D79552	2020-05-07 11:03:25 -07:00
Jan Korous	c0330bc00f	[YAMLVFSWriter] Fix directory handling For empty directories (except the first one) we've been adding a file with the same name as the directory to the result VFS mapping. Differential Revision: https://reviews.llvm.org/D79551	2020-05-07 10:58:43 -07:00
Jan Korous	5c145034e6	[YAMLVFSWriter][Tests] Fix YAMLVFSWriterTest Differential Revision: https://reviews.llvm.org/D79550	2020-05-07 10:43:22 -07:00
Vedant Kumar	9f889125ab	[dsymutil] Avoid relocating DW_AT_call_pc twice Summary: Avoid relocating DW_AT_call_pc, e.g. when a call PC is equal to the function's low_pc as is the case in the test: ``` __Z5func1v: 0000000100007f94 b __Z5func2v ``` rdar://62952440 Reviewers: friss, JDevlieghere Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79536	2020-05-07 10:36:29 -07:00
Thomas Raoux	dc26dec331	[ModuloSchedule] Fix epilogue peeling with illegal phi. When peeling out the epilogue we need to ignore illegal phis coming from stages greater than the producer stage. Otherwise we end up with circular phi dependencies. Differential Revision: https://reviews.llvm.org/D79581	2020-05-07 10:04:05 -07:00
Christopher Tetreault	b6c6bab9a5	[SVE] Fix incorrect usage of getNumElements() in InstCombineCalls Summary: Remove incorrect usage of getNumElements() from visitCallInst(). The number of elements was being used to construct a DemandedElts bitfield. This operation does not make sense for scalable vectors. Cast to FixedVectorType Identified by test case Clang :: CodeGen/aarch64-sve-intrinsics/acle_sve_mla.c Reviewers: rengolin, efriedma, sdesmalen, c-rhodes, david-arm Reviewed By: david-arm Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79524	2020-05-07 08:46:51 -07:00
Simon Pilgrim	c1dc994083	[cmake] Add headers in include/llvm/Remarks subdirectory Appeases visual studio	2020-05-07 16:43:29 +01:00
Hans Wennborg	c54c6ee1a7	Revert "[SLP] add another bailout for load-combine patterns" It caused asserts building Chromium, see discussion on https://reviews.llvm.org/D78997 This reverts commit `86dfbc676e`.	2020-05-07 16:31:52 +02:00
Sanjay Patel	666c61db79	[VectorCombine] add tests for insert into arbitrary constant; NFC Goes with D79452.	2020-05-07 10:27:25 -04:00
Simon Pilgrim	ecd28d2401	[X86] Add AVX512VL concat-cast tests.	2020-05-07 15:08:17 +01:00
Sjoerd Meijer	3bbc71d6c9	[LV] Fix typo in variable name. NFC.	2020-05-07 13:53:44 +01:00
Simon Pilgrim	b8a725274c	[X86][AVX] combineSignExtendInReg - promote mask arithmetic before v4i64 canonicalization We rely on the combine (sext_in_reg (v4i64 a/sext (v4i32 x)), v4i1) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x, ExtraVT))) to avoid complex v4i64 ashr codegen, but doing so prevents v4i64 comparison mask promotion, so ensure we attempt to promote before canonicalizing the (hopefully now redundant sext_in_reg). Helps with the poor codegen in PR45808.	2020-05-07 13:16:36 +01:00
Sam Parker	751da4d596	[NFC][AArch64] Add test Add cost model test for cast operations.	2020-05-07 13:16:03 +01:00
Calixte Denizet	bec223a9bc	[profile] Don't crash when forking in several threads Summary: When forking in several threads, the counters were written out using the same global static variables (see GCDAProfiling.c), which leads to crashes. So when there is a fork, the counters are reset in the child process and they will be dumped at exit using the interprocess file locking. When there is an exec, the counters are written out, and in case of failure they are reset. Reviewers: jfb, vsk, marco-c, serge-sans-paille Reviewed By: marco-c, serge-sans-paille Subscribers: llvm-commits, serge-sans-paille, dmajor, cfe-commits, hiraditya, dexonsmith, #sanitizers, marco-c, sylvestre.ledru Tags: #sanitizers, #clang, #llvm Differential Revision: https://reviews.llvm.org/D78477	2020-05-07 14:13:11 +02:00
Anna Welker	1e413a8c36	[ARM][MVE] Add support for incrementing gathers Enables the MVEGatherScatterLowering pass to build pre-incrementing gathers. Incrementing writeback gathers are built when it is possible to replace the loop increment instruction. Differential Revision: https://reviews.llvm.org/D76786	2020-05-07 12:33:50 +01:00
Kazushi (Jam) Marukawa	447efdb52b	[VE] Minimum MC layer for VE (2/4) Remove unnecessary EncoderMethod and DecoderMethod which cause errors in supporting MC layer. Differential Revision: https://reviews.llvm.org/D79544	2020-05-07 13:21:37 +02:00
Kazushi (Jam) Marukawa	6999ffcc39	[VE] Implements minimum MC layer for VE (1/4) Summary: Correct instruction bitfield addresses to generate machine code correctly. Also add some variables to represent all instructions correctly and change default values to use registers by default. Differential Revision: https://reviews.llvm.org/D79539	2020-05-07 13:10:36 +02:00
Sjoerd Meijer	ae45b4dbe7	Recommit "[LV] Induction Variable does not remain scalar under tail-folding." With 3 llvm regr tests fixed/updated that I had missed.	2020-05-07 11:52:20 +01:00
Kerry McLaughlin	3bcd3dd473	[CodeGen][SVE] Lowering of shift operations with scalable types Summary: Adds AArch64ISD nodes for: - SHL_PRED (logical shift left) - SHR_PRED (logical shift right) - SRA_PRED (arithmetic shift right) Existing patterns for unpredicated left shift by immediate have also been moved into the appropriate multiclasses in SVEInstrFormats.td. Reviewers: sdesmalen, efriedma, ctetreau, huihuiz, rengolin Reviewed By: efriedma Subscribers: huihuiz, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79478	2020-05-07 11:43:49 +01:00
LLVM GN Syncbot	92c657920e	[gn build] Port `e3ffe7269b`	2020-05-07 10:11:03 +00:00

196368 Commits