llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	5d964e262f	[StackSafety] Check variable lifetime We can't consider variable safe if out-of-lifetime access is possible. So if StackLifetime can't prove that the instruction always uses the variable when it's still alive, we consider it unsafe.	2020-06-22 03:45:29 -07:00
Vitaly Buka	8f592ed333	[StackSafety] Ignore unreachable instructions Usually DominatorTree provides this info, but here we use StackLifetime. The reason is that in the next patch StackLifetime will be used for actual lifetime checks and we can avoid forwarding the DominatorTree into this code.	2020-06-22 03:45:29 -07:00
Anton Korobeynikov	6cb80fbe40	Revert "[MSP430] Update register names" This reverts commit `8f6620f663`.	2020-06-22 13:37:22 +03:00
Anatoly Trosinenko	8f6620f663	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. Differential Revision: https://reviews.llvm.org/D82184	2020-06-22 13:24:03 +03:00
Momchil Velikov	75b0bbca1d	[LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions Fixes an issue with missing nul-terminators and saves us some string copying, compared to a version which would insert nul-terminators. Differential Revision: https://reviews.llvm.org/D82033	2020-06-22 11:22:18 +01:00
Anatoly Trosinenko	a5bd75aab8	[MSP430] Enable some basic support for debug information This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers). This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this: ``` $sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test" (gdb) ... traditional GDB commands follow ... ``` While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself. One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered. The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10. Differential Revision: https://reviews.llvm.org/D81488	2020-06-22 13:14:07 +03:00
Anatoly Trosinenko	359fae6eb0	[DebugInfo] Explicitly permit addr_size = 0x02 when parsing DWARF data Current LLVM implementation uses `MCAsmInfo::CodePointerSize` as addr_size when emitting the DWARF data. llvm-dwarfdump, on the other hand, handles `addr_size`s of 4 and 8 properly and considers all other sizes as an error. This works for most of mainline targets except for MSP430 and AVR. msp430-gcc v8.3.1 emits DWARF32 with addr_size = 4 (DWARF32 does not imply addr_size = 4, 32 refers to internal offset width of 4 bytes) that is handled by llvm-dwarfdump already. Still, emitting 2-byte target pointers on MSP430 seems correct as well (but not for MSP430X that is supported by msp430-gcc but not by LLVM and has 20-bit address space). This patch makes it possible for MSP430 debug info support to be tested with llvm-dwarfdump. Differential Revision: https://reviews.llvm.org/D82055	2020-06-22 13:11:55 +03:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Djordje Todorovic	792786e34d	[CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy When describing parameter value loaded by a COPY instruction, consider case where needed Reg value is a sub- or super- register of the COPY instruction's destination register. Without this patch, compile process will crash with the assertion "TargetInstrInfo::describeLoadedValue can't describe super- or sub-regs for copy instructions". Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D82000	2020-06-22 10:49:02 +02:00
Serguei Katkov	29b2c1ca72	[Peeling] Extend the scope of peeling a bit Currently we allow peeling of the loops if there is an exiting latch block and all other exits are blocks ending with deopt. Actually we want that exit would end up with deopt unconditionally but it is not required that exit itself ends with deopt. Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev Reviewed By: apilipenko Subscribers: hiraditya, zzheng, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D81140	2020-06-22 12:17:44 +07:00
Michael Liao	20a1700293	[amdgpu] Fix REL32 relocations with negative offsets. Summary: - The offset should be treated as a signed one. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82234	2020-06-21 23:09:03 -04:00
Sanjay Patel	6bdd531af5	[VectorCombine] create class for pass to hold analyses, etc; NFC This doesn't change anything currently, but it would make sense to create a class-level IRBuilder instead of recreating that everywhere. As we expand to more optimizations, we will probably also want to hold things like the DataLayout or other constant refs in here too.	2020-06-21 16:07:33 -04:00
David Green	67121d7b82	[CGP] Enable CodeGenPrepares phi type conversion.	2020-06-21 16:46:16 +01:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
David Green	730ecb63ec	[CGP] Convert phi types If a collection of interconnected phi nodes is only ever loaded, stored or bitcast then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phis are passed across basic block boundaries. This has to be done in CodegenPrepare as it naturally straddles basic blocks. The algorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes to not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and Stores are bitcast to the correct type, which should then be folded into the load/store, changing its type. This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from aarch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extract element handling which increased the potency in some cases. This adds it with an option that defaults to off, and disabled for 32bit X86 due to potential issues around canonicalizing NaNs. Differential Revision: https://reviews.llvm.org/D81827	2020-06-21 15:54:17 +01:00
Nikita Popov	37d3030711	[ValueTracking, BasicAA] Don't simplify instructions GetUnderlyingObject() (and by required symmetry DecomposeGEPExpression()) will call SimplifyInstruction() on the passed value if other checks fail. This simplification is very expensive, but has little effect in practice. This patch removes the SimplifyInstruction() call, and replaces it with a check for single-argument phis (which can occur in canonical IR in LCSSA form), which is the only useful simplification case I was able to identify. At O3 the geomean CTMark improvement is -1.7%. The largest improvement is SPASS with ThinLTO at -6%. In test-suite, I see only two tests with a hash difference and no code size difference (PAQ8p, Ptrdist), which indicates that the simplification only ends up being useful very rarely. (I would have liked to figure out which simplification is responsible here, but wasn't able to spot it looking at transformation logs.) The AMDGPU test case that is updated was using two selects with undef condition, in which case GetUnderlyingObject will return the first select operand as the underlying object. This will of course not happen with non-undef conditions, so this was not testing anything realistic. Additionally this illustrates potential unsoundness: While GetUnderlyingObject will pick the first operand, the select might be later replaced by the second operand, resulting in inconsistent assumptions about the undef value. Differential Revision: https://reviews.llvm.org/D82261	2020-06-21 16:31:07 +02:00
Sanjay Patel	2ad42c2653	[ValueTracking] improve analysis for fdiv with same operands (The 'nnan' variant of this pattern is already tested to produce '1.0'.) https://alive2.llvm.org/ce/z/D4hPBy define i1 @src(float %x, i32 %y) { %0: %d = fdiv float %x, %x %uge = fcmp uge float %d, 0.000000 ret i1 %uge } => define i1 @tgt(float %x, i32 %y) { %0: ret i1 1 } Transformation seems to be correct!	2020-06-21 09:07:59 -04:00
Simon Pilgrim	fb9f9dc318	[X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries. This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements. To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code. Differential Revision: https://reviews.llvm.org/D81791	2020-06-21 11:16:07 +01:00
clfbbn	10b0539772	[Attributor][NFC] Fix indentation Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, kuter, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82260	2020-06-21 15:43:32 +08:00
Wenlei He	7c8a6936bf	[Remarks] Add callsite locations to inline remarks Summary: Add call site location info into inline remarks so we can differentiate inline sites. This can be useful for inliner tuning. We can also reconstruct full hierarchical inline tree from parsing such remarks. The message of inline remark is also tweaked so we can differentiate SampleProfileLoader inline from CGSCC inline. Reviewers: wmi, davidxl, hoy Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82213	2020-06-20 23:32:10 -07:00
Amy Kwan	cc95635b1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector signed char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Eric Christopher	dc20419351	Rename function to more accurately reflect what it does.	2020-06-20 14:37:29 -07:00
Eric Christopher	8116d01905	Typos around a -> an.	2020-06-20 14:04:48 -07:00
Sanjay Patel	741e20f3d6	[VectorCombine] fix assert for type of compare operand As shown in the post-commit comment for D81661 - we need to loosen the type assertion to allow scalarization of a compare for vectors of pointers.	2020-06-20 15:20:17 -04:00
Sanjay Patel	7b201bfcac	[InstCombine] remove unused parameter and add assert; NFC	2020-06-20 11:47:00 -04:00
Sanjay Patel	d84cdb81ed	[InstCombine] fabs(X) / fabs(X) -> X / X Also, consolidate related folds so we don't miss/repeat these.	2020-06-20 10:20:21 -04:00
Simon Pilgrim	89dcbdfcfd	[X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI. The comparison value should be the same size - I've added an assert to be absolutely certain.	2020-06-20 12:35:24 +01:00
Simon Pilgrim	56a9332328	[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns	2020-06-20 12:17:32 +01:00
Nikita Popov	d3d4e4bcb7	[LVI] Extract addValueHandle() method (NFC) There will be more places registering value handles.	2020-06-20 13:05:42 +02:00
Nikita Popov	64ecf85f63	[LVI] Use find_as() where possible (NFC) This prevents us from creating temporary PoisoningVHs and AssertingVHs while performing hashmap lookups. As such, it only matters in assertion-enabled builds.	2020-06-20 13:05:42 +02:00
Florian Hahn	9a7d80a32c	Revert "[BasicAA] Use known lower bounds for index values for size based check." This is potentially related to https://bugs.llvm.org/show_bug.cgi?id=46335 and causes a slight compile-time regression. Revert while investigating. This reverts commit `d99a1848c4`.	2020-06-20 10:06:05 +01:00
Eric Christopher	10563e16aa	[Analysis/Transforms/Sanitizers] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:42:26 -07:00
Eric Christopher	858d385578	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:24:57 -07:00
Eric Christopher	cf23852587	[Target] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist. This change affects an internal llvm command line option.	2020-06-20 00:06:39 -07:00
Xing GUO	6770349592	[DWARFYAML][debug_info] Fix array index out of bounds error This patch is trying to fix the array index out of bounds error. I observed it in (https://reviews.llvm.org/harbormaster/unit/view/99638/). Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D82139	2020-06-20 15:08:24 +08:00
Craig Topper	c721bc081e	[X86] Correct the implementation of ud1(a.k.a. ud2b) instruction. We were missing the modrm byte this instruction has according to current Intel SDM. Experiments with gcc indicate that different modrm values are chosen based on 2 operands so I've added those as well. I think our previous implementation was based on an older behavior of binutils that has since been changed.	2020-06-19 23:57:48 -07:00
Craig Topper	0dda5e4ce2	[X86] Ignore bits 2:0 of the modrm byte when disassembling lfence, mfence, and sfence. These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8 respectively. But hardware ignores bits 2:0. So 0xe9-0xef is treated the same as 0xe8. Similar for the other two. Fixing this required adding 8 new formats to the X86 instructions to convey this information. Could have gotten away with 3, but adding all 8 made for a more logical conversion from format to modrm encoding. I renumbered the format encodings to keep the register modrm formats grouped together.	2020-06-19 22:24:24 -07:00
Fangrui Song	2a4317bfb3	[SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D82244	2020-06-19 22:22:47 -07:00
Yevgeny Rouban	6429471e8b	[IR] Convert profile metadata in createCallMatchingInvoke() When an invoke instruction is converted to a call its profile metadata is dropped because it has incompatible format (see commit `16ad6eeb94`). This patch adds an attempt to convert profile data to format of the call instruction. This used to work well before the commit `dcfa78a4cc`. Reviewers: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D82071	2020-06-20 12:10:31 +07:00
Wang Rui	dd48c57da3	[Mips] Error if a non-immediate operand is used while an immediate is expected The 32-bit type relocation (R_MIPS_32) cannot be used for instructions below: ori $4, $4, start ori $4, $4, (start - .) We should print an error instead. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D81908	2020-06-19 22:08:59 -07:00
Vitaly Buka	3d8149db3c	[StackSafety,NFC] Don't rerun on LiveIn change	2020-06-19 21:29:31 -07:00
Xing GUO	1cfdda57fa	[ObjectYAML][ELF] Add support for emitting the .debug_info section. This patch helps add support for emitting the .debug_info section to yaml2elf. Reviewed By: jhenderson, grimar, MaskRay Differential Revision: https://reviews.llvm.org/D82073	2020-06-20 12:13:01 +08:00
Carl Ritson	4a7de36afc	[AMDGPU] Avoid use of V_READLANE into EXEC in SGPR spills Always prefer to clobber input SGPRs and restore them after the spill. This applies to both spills to VGPRs and scratch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81914	2020-06-20 12:10:47 +09:00
romanova-ekaterina	d7fad626e9	Error related to ThinLTO caching needs to be downgraded to a remark This is a fix for PR #46392 (Diagnostic message (error) related to ThinLTO caching needs to be downgraded to a remark). There are diagnostic messages related to ThinLTO caching that contain the word "error", but they are really just notices/remarks for users, and they don't cause a build failure. The word "error" appearing can be confusing to users, and may even cause deeper problems. User's build system might be designed to interpret any error messages (even a benign error message as the one above) reported by the compiler as a build failure, thus causing the build to fail "needlessly". In short, the term "error" in this diagnostic is misleading at best, and may be causing build systems to fail at worst. Differential Revision: https://reviews.llvm.org/D82138	2020-06-19 16:03:29 -07:00
Eric Christopher	b6536e549d	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-19 15:12:18 -07:00
Heejin Ahn	83c26eae23	[WebAssembly] Remove TEEs when dests are unstackified When created in RegStackify pass, `TEE` has two destinations, where op0 is stackified and op1 is not. But it is possible that op0 becomes unstackified in `fixUnwindMismatches` function in CFGStackify pass when a nested try-catch-end is introduced, violating the invariant of `TEE`s destinations. In this case we convert the `TEE` into two `COPY`s, which will eventually be resolved in ExplicitLocals. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D81851	2020-06-19 14:55:21 -07:00
Martin Storsjö	cdbd299800	[Support] Fix building for mingw on a case sensitive file system This fixes cross building on a case sensitive file system after `2e613d2ded`. (The official Windows SDKs don't have self-consistent casing and can't be used as such on case sensitive file systems without case fixups, while mingw headers consistently use lower case.)	2020-06-20 00:39:22 +03:00
Amara Emerson	1feeecf224	[AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support. We were defaulting to the lower action for this, resulting in SHL+ASHR sequences. On AArch64 we can do this in one instruction for an arbitrary extension using SBFM as we do for G_SEXT. Differential Revision: https://reviews.llvm.org/D81992	2020-06-19 13:20:41 -07:00
Sanjay Patel	216a37bb46	[VectorCombine] refactor extract-extract logic; NFCI	2020-06-19 14:52:27 -04:00
Lang Hames	bf783a6aa8	[JITLink] Display host -> target address mapping in debugging output. This can be helpful for sanity checking JITLink memory manager behavior.	2020-06-19 10:05:02 -07:00
Sanjay Patel	6d864097a2	[VectorCombine] fix crash while transforming constants This is a variation of the proposal in D82049 with an extra test.	2020-06-19 12:30:32 -04:00
Stanislav Mekhanoshin	2b87a44c49	[AMDGPU] Some formatting fixes. NFC.	2020-06-19 09:02:59 -07:00
Piotr Sobczak	6d9565d6d5	Revert "[AMDGPU] Select s_cselect" This caused some failures detected by the buildbot with expensive checks enabled. This reverts commit `4067de569f`.	2020-06-19 16:41:04 +02:00
dfukalov	129388ddc4	[AMDGPU][CostModel] Add fneg cost estimation Summary: The estimation uses AMDGPUTargetLowering::isFNegFree() Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82065	2020-06-19 17:31:35 +03:00
Piotr Sobczak	4067de569f	[AMDGPU] Select s_cselect Summary: Add patterns to select s_cselect in the isel. Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81925	2020-06-19 16:17:46 +02:00
Sjoerd Meijer	4aa893b8f2	[ARM][MVE] tail-predication: renamed internal option. Renamed -force-tail-predication to -force-mve-tail-predication because that's more descriptive and consistent.	2020-06-19 15:07:06 +01:00
Mikhail Maltsev	490f78c038	[ARM][BFloat] Implement lowering of bf16 load/store intrinsics Reviewers: labrinea, dmgreen, pratlucas, LukeGeeson Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81486	2020-06-19 14:02:35 +00:00
Mikhail Maltsev	7526881246	[ARM][BFloat] Lowering of create/get/set/dup intrinsics This patch adds codegen for the following BFloat operations to the ARM backend: * concatenation of bf16 vectors * bf16 vector element extraction * bf16 vector element insertion * duplication of a bf16 value into each lane of a vector * duplication of a bf16 vector lane into each lane Differential Revision: https://reviews.llvm.org/D81411	2020-06-19 12:52:40 +00:00
Simon Pilgrim	c143db3b10	[X86][SSE] combineHorizontalPredicateResult - improve all_of(X == 0) for vXi64 on pre-SSE41 targets Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons. This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.	2020-06-19 11:43:25 +01:00
Vitaly Buka	0e1bdeafc9	[StackSafety,NFC] Fix comment	2020-06-19 03:11:13 -07:00
Tyker	67448a8ccc	try to fix build bot after `b7338fb1a6`	2020-06-19 12:02:09 +02:00
Simon Pilgrim	cad2038700	[X86][SSE] combineSetCCMOVMSK - fold MOVMSK(SHUFFLE(X,u)) -> MOVMSK(X) If we're permuting ALL the elements of a single vector, then for allof/anyof MOVMSK tests we can avoid the shuffle entirely.	2020-06-19 10:57:52 +01:00
David Sherwood	584d0d5c17	[SVE] Fall back on DAG ISel at -O0 when encountering scalable types At the moment we use Global ISel by default at -O0, however it is currently not capable of dealing with scalable vectors for two reasons: 1. The register banks know nothing about SVE registers. 2. The LLT (Low Level Type) class knows nothing about scalable vectors. For now, the easiest way to avoid users hitting issues when using the SVE ACLE is to fall back on normal DAG ISel when encountering instructions that operate on scalable vector types. I've added a couple of RUN lines to existing SVE tests to ensure we can compile at -O0. I've also added some new tests to CodeGen/AArch64/GlobalISel/arm64-fallback.ll that demonstrate we correctly fall back to DAG ISel at -O0 when lowering formal arguments or translating instructions that involve scalable vector types. Differential Revision: https://reviews.llvm.org/D81557	2020-06-19 10:57:00 +01:00
David Sherwood	0dc28af219	[CodeGen,AArch64] Fix up warnings in performExtendCombine Try to avoid calling getVectorNumElements() or relying upon the TypeSize conversion to uint64_t. Differential Revision: https://reviews.llvm.org/D81573	2020-06-19 10:34:51 +01:00
Vitaly Buka	f224f3d0f2	[StackSafety] Add StackLifetime::isAliveAfter This function is going to be added into StackSafety checks. This patch uses function in ::print implementation to make sure that it works as expected.	2020-06-19 02:32:17 -07:00
Vitaly Buka	306c257b00	[SafeStack,NFC] Print liveness for all instructions	2020-06-19 02:32:17 -07:00
Vitaly Buka	20b1094a04	[StackSafety,NFC] Replace map with vector We don't need to lookup InstructionNumbering by number, so we can use vector with index as assigned number.	2020-06-19 02:32:17 -07:00
Vitaly Buka	7b27c09f63	[StackSafety,NFC] Don't test terminators Code does not track terminators and does not expose them through interface. State there is just a state of the last instruction or entry. So this information is just redundant and doesn't need to be tested.	2020-06-19 02:32:17 -07:00
Florian Hahn	f9d8e33c32	[SCCP] Turn sext into zext for non-negative ranges. This patch updates SCCP/IPSCCP to use the computed range info to turn sexts into zexts, if the value is known to be non-negative. We already do a similar transform in CorrelatedValuePropagation, but it seems like we can catch a lot of additional cases by doing it in SCCP/IPSCCP as well. The transform is limited to ranges that are known to not include undef. Currently constant ranges from conditions are treated as potentially containing undef, due to PR46144. Once we flip this, the transform will be more effective in practice. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81756	2020-06-19 10:17:55 +01:00
Jay Foad	7cdf4326a8	[LiveIntervals] Fix early-clobber handling in handleMoveUp Without this fix, handleMoveUp can create an invalid live range like this: [98904e,98908r:0)[98908e,227504r:1) where the two segments overlap, but only because we have lost the "e" (early-clobber) on the end point of the first segment. Differential Revision: https://reviews.llvm.org/D82110	2020-06-19 10:17:04 +01:00
Tyker	b7338fb1a6	[AssumeBundles] add canonicalisation to the assume builder Summary: this reduces significantly the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-19 10:32:26 +02:00
David Sherwood	7edc7f6edb	[CodeGen] Fix SimplifyDemandedBits for scalable vectors For now I have changed SimplifyDemandedBits and its various callers to assume we know nothing for scalable vectors and to ignore the demanded bits completely. I have also done something similar for SimplifyDemandedVectorElts. These changes fix up lots of warnings due to calls to EVT::getVectorNumElements() for types with scalable vectors. These functions are all used for optimisations, rather than functional requirements. In future we can revisit this code if there is a need to improve code quality for SVE. Differential Revision: https://reviews.llvm.org/D80537	2020-06-19 07:59:35 +01:00
David Sherwood	9e811b0d93	[CodeGen] Fix ComputeNumSignBits for scalable vectors When trying to calculate the number of sign bits for scalable vectors we should just bail out for now and pretend we know nothing. Differential Revision: https://reviews.llvm.org/D81093	2020-06-19 07:58:42 +01:00
Kristof Beyls	d938ec4509	[AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen. A "BTI c" instruction only allows jumping/calling to using a BLR* instruction. However, the SLSBLR mitigation changes a BLR to a BR to implement the function call. Therefore, a "BTI c" check that passed before could trigger after the BLR->BL change done by the SLSBLR mitigation. However, if the register used in BR is X16 or X17, this trigger will not fire (see ArmARM for further details). Therefore, this patch simply changes the function stubs for the SLSBLR mitigation from __llvm_slsblr_thunk_x<N>: br x<N> SpeculationBarrier to __llvm_slsblr_thunk_x<N>: mov x16, x<N> br x16 SpeculationBarrier Differential Revision: https://reviews.llvm.org/D81405	2020-06-19 06:21:54 +01:00
Ronak Chauhan	5bd33de9c8	[MC] Pass the symbol rather than its name to onSymbolStart() Summary: This allows targets to also consider the symbol's type and/or address if needed. Reviewers: scott.linder, jhenderson, MaskRay, aardappel Reviewed By: scott.linder, MaskRay Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82090	2020-06-19 09:30:12 +05:30
Francesco Petrogalli	d32c134648	[llvm][SVE] Reg + reg addressing mode for LD1RO. Reviewers: efriedma, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80741	2020-06-19 03:56:10 +00:00
Nemanja Ivanovic	1fed131660	[PowerPC] Canonicalize shuffles to match more single-instruction masks on LE We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following: - Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want) - Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges) - Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code Differential revision: https://reviews.llvm.org/D77448	2020-06-18 21:54:22 -05:00
Carl Ritson	8f3b2c8aa3	AMDGPU/GlobalISel: Remove selection of MAD/MAC when not available Add code to respect mad-mac-f32-insts target feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81990	2020-06-19 10:30:19 +09:00
Vitaly Buka	fcd67665a8	[StackSafety] Add "Must Live" logic Summary: Extend StackLifetime with option to calculate liveness where alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124	2020-06-18 16:53:37 -07:00
Nathan James	8b0df1c1a9	[NFC] Refactor Registry loops to range for	2020-06-19 00:40:10 +01:00
Vitaly Buka	f672791e08	[StackSafety] Add pass for StackLifetime testing Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82043	2020-06-18 16:34:18 -07:00
Matt Arsenault	bbd78519f9	ARC: Enforce function alignment at code emission time Don't do this in the MachineFunctionInfo constructor. Also, ensure the alignment rather than overwriting it outright. I vaguely remember there was another place to enforce the target minimum alignment, but I couldn't find it (it's there for instructions).	2020-06-18 17:40:49 -04:00
Matt Arsenault	95605b784b	AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr We probably need to move where intrinsics are lowered to copies to make this useful.	2020-06-18 17:28:00 -04:00
Matt Arsenault	b13f6b0fe0	BypassSlowDivision: Fix dropping debug info I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.	2020-06-18 17:27:19 -04:00
Amy Kwan	c45c161130	[PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang This patch implements builtins for the following prototypes: vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long); vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b); unsigned long long __builtin_pdepd (unsigned long long, unsigned long long); unsigned long long __builtin_pextd (unsigned long long, unsigned long long); Revision Depends on D80758 Differential Revision: https://reviews.llvm.org/D80935	2020-06-18 16:23:56 -05:00
Matt Arsenault	7f8b2e1b91	GlobalISel: Pass LegalizerHelper to custom legalize callbacks This was passing in all the parameters needed to construct a LegalizerHelper in the custom legalization, when it's simpler to just pass in the existing helper. This is slightly more annoying to use in the common case where you don't need the legalizer helper, but we could add back the common parameters back in addition to the helper. I didn't propagate this to all the internal target changes that this logically implies, but did update a sample one for legalizeMinNumMaxNum. This is in preparation for moving AMDGPU load/store legalization entirely into custom lowering. The current set of legalization actions is really constraining and not really capable of expressing all the actions needed to legalize loads/stores. In particular there's no way to express when the memory access itself needs to change size vs. the result type. There's also a lot of redundancy since the same split/widen actions need to be applied in both vector and scalar cases. All of the sub-cases logically belong as steps in the legalizer helper, but it will be easier to consider everything at once in custom lowering.	2020-06-18 17:17:38 -04:00
Christopher Tetreault	8d11ec66b6	[SVE] Remove calls to VectorType::getNumElements from Transforms/Utils Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057	2020-06-18 13:39:14 -07:00
Alexandre Ganea	2ae0df5be7	[CodeView] Revert `8374bf4363` and `403f953792` This reverts: `8374bf4363` [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. `403f953792` [CodeView] Add full repro to LF_BUILDINFO record This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c	2020-06-18 16:18:46 -04:00
Kirill Naumov	41d53194fb	[BasicBlock] Added AnnotationWriter functionality to BasicBlock class This functionality is very similar to Function compatibility with AnnotationWriter. This change allows us to use AnnotationWriter with BasicBlock through BB.print() method. Reviewed-By: apilipenko Differential Revision: https://reviews.llvm.org/D81321	2020-06-18 19:49:58 +00:00
Sanjay Patel	46a285ad9e	[IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.	2020-06-18 15:47:06 -04:00
Davide Italiano	8cdd2a158c	[SimplifyCFG] Update debug location when folding branch to common destination Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062	2020-06-18 12:33:32 -07:00
Michael Liao	2defe55722	[TTI] Expose isNoopAddrSpaceCast in TTI. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82025	2020-06-18 14:40:47 -04:00
serge-sans-paille	4dd332723d	Fix return status of LoopDistribute Move code that may update the IR after the precondition, so that if the precondition fails, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225	2020-06-18 20:13:18 +02:00
Matt Arsenault	779cba79ec	AMDGPU: Remove mayLoad/mayStore from some side effecting intrinsics These don't really modify any memory, and should not expect memory operands.	2020-06-18 14:12:19 -04:00
Stanislav Mekhanoshin	6c7e1b16fa	[AMDGPU] Added new encoding to getMCOpcodeGen Nothing breaks yet, but all encodings shall be in the map. Differential Revision: https://reviews.llvm.org/D81974	2020-06-18 10:11:33 -07:00
Arthur Eubanks	91ef930526	[GlobalOpt] Remove preallocated calls when possible When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951	2020-06-18 09:56:13 -07:00
Alexandros Lamprineas	ecdf48f15b	[ARM] Basic bfloat support This patch adds basic support for BFloat in the Arm backend. For now the code generation relies on fullfp16 being present. Briefly: * adds the bfloat scalar and vector types in the necessary register classes, * adjusts the calling convention to cope with bfloat argument passing and return, * adds codegen patterns for moves, loads and stores. It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy). The following people contributed to this patch: * Alexandros Lamprineas * Ties Stuij Differential Revision: https://reviews.llvm.org/D81373	2020-06-18 17:26:24 +01:00
Simon Pilgrim	2474421398	[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.	2020-06-18 16:41:08 +01:00
Matt Arsenault	6f09bb7da2	AMDGPU: Don't pass MachineFunction if only the IR Function is used	2020-06-18 11:06:46 -04:00
Ayke van Laethem	b4c91462e8	[AVR] Fix miscompilation of zext + add Code like the following: define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) { entry: %conv = zext i1 %b to i32 %add = add nsw i32 %conv, %a ret i32 %add } Would compile to the following (incorrect) code: foo: mov r18, r20 clr r19 add r22, r18 adc r23, r19 sbci r24, 0 sbci r25, 0 ret Those sbci instructions are clearly wrong, they should have been adc instructions. This commit improves codegen to use adc instead: foo: mov r18, r20 clr r19 ldi r20, 0 ldi r21, 0 add r22, r18 adc r23, r19 adc r24, r20 adc r25, r21 ret This code is not optimal (it could be just 5 instructions instead of the current 9) but at least it doesn't miscompile. Differential Revision: https://reviews.llvm.org/D78439	2020-06-18 16:51:37 +02:00
Matt Arsenault	243303f8d7	Lanai: Remove unused method This was depending on the MachineFunction at MachineFunctionInfo construction, which will soon be disallowed.	2020-06-18 10:48:14 -04:00
Simon Pilgrim	fe0a85faf4	[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) == -1 -> PTESTZ(X,X) Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns to be replaced with PTESTZ This is a preliminary patch before properly handling PR35129	2020-06-18 15:38:32 +01:00
Alexandre Ganea	8374bf4363	[CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.	2020-06-18 10:07:30 -04:00
Kamlesh Kumar	7622ea5835	[RISCV64] Emit correct lib call for fp(float/double) to ui/si Since i32 is not legal in riscv64, it is always promoted to i64 before emitting the lib call, and for conversions like float/double to int and float/double to unsigned int the wrong lib call was emitted. This commit fixes it using custom lowering. Differential Revision: https://reviews.llvm.org/D80526	2020-06-18 19:34:16 +05:30
Igor Kudrin	6853cc7221	[MC] Rename a misnamed function. NFC. The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to better reflect an expression it creates and fix a naming style issue. Differential Revision: https://reviews.llvm.org/D82079	2020-06-18 20:18:19 +07:00
Alexandre Ganea	403f953792	[CodeView] Add full repro to LF_BUILDINFO record This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-06-18 09:17:15 -04:00
Alexandre Ganea	24eff42ba4	[CodeView] Add TypeCollection::replaceType to replace type records post-merging The API is not called in this patch. This is to simplify/support https://reviews.llvm.org/D80833	2020-06-18 09:17:14 -04:00
Alexandre Ganea	a45409d885	[Clang] Move clang::Job::printArg to llvm::sys::printArg. NFCI. This patch is to support/simplify https://reviews.llvm.org/D80833	2020-06-18 09:17:13 -04:00
Florian Hahn	1669fddc9f	[Matrix] Use alignment info when lowering loads/stores. This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960	2020-06-18 13:19:31 +01:00
Lucas Prates	92ad6d57c2	[ARM] Moving CMSE handling of half arguments and return to the backend Summary: As half-precision floating point arguments and returns were previously coerced to either float or int32 by clang's codegen, the CMSE handling of those was also performed in clang's side by zeroing the unused MSBs of the coercer values. This patch moves this handling to the backend's calling convention lowering, making sure the high bits of the registers used by half-precision arguments and returns are zeroed. Reviewers: chill, rjmccall, ostannard Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81428	2020-06-18 13:16:29 +01:00
Lucas Prates	a255931c40	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, such as causing arguments to be stored on the wrong bits on big-endian architectures and incurring missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored on the proper bits of the registers and performing the necessary floating point conversions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
Paul Walker	4612f39120	[SVE] Add flag to specify SVE register size, using this to calculate legal vector types. Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE data registers (in bits) to be specified. This allows the code generator to make assumptions it normally couldn't. As a starting point this information is used to mark fixed length vector types that can fit within the specified size as legal. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80384	2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe	7aad220795	[DA] conservatively mark the join of every divergent branch For a loop, a join block is a block that is reachable along multiple disjoint paths from the exiting block of a loop. If the exit condition of the loop is divergent, then such join blocks must also be marked divergent. This currently fails in some cases because not all join blocks are identified correctly. The workaround is to conservatively mark every join block of any branch (not necessarily the exiting block of a loop) as divergent. https://bugs.llvm.org/show_bug.cgi?id=46372 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81806	2020-06-18 17:39:20 +05:30
Florian Hahn	d88acd8f7d	[Matrix] Preserve volatile when lowering loads/stores. Currently the matrix lowering turns volatile loads/stores into non-volatile ones. This patch updates the lowering to preserve the volatile bit. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D81498	2020-06-18 12:14:19 +01:00
Jeremy Morse	3626eba11f	[NFC][LiveDebugValues] Document how LiveDebugValues operates We're missing a plain English explanation of how this pass is supposed to operate -- add one to the file comment. Differential Revision: https://reviews.llvm.org/D80929	2020-06-18 10:54:09 +01:00
Ayke van Laethem	15bf42d503	[AVR] Implement disassembly of 32-bit instructions This needed two fixes: * 32-bit instructions were read in the wrong order. The machine code swaps the two 16-bit instruction words, which wasn't undone when decoding instructions. * Jump and call instructions don't encode the lowest address bit, which is always zero. Therefore, the address needed to be shifted by one to fix that. Differential Revision: https://reviews.llvm.org/D81961	2020-06-18 11:26:58 +02:00
David Sherwood	7e30ef77f6	[CodeGen] Fix warnings in getVectorTypeBreakdown Added NextPowerOf2() routine to TypeSize and rewritten the code in getVectorTypeBreakdown to avoid warnings being generated. Differential Revision: https://reviews.llvm.org/D81578	2020-06-18 09:54:16 +01:00
Florian Hahn	6d18c2067e	[Matrix] Update load/store intrinsics. This patch adjusts the load/store matrix intrinsics, formerly known as llvm.matrix.columnwise.load/store, to improve the naming and allow passing of extra information (volatile). The patch performs the following changes: * Rename columnwise.load/store to column.major.load/store. This is more expressive and also more in line with the naming in Clang. * Changes the stride arguments from i32 to i64. The stride can be larger than i32 and this makes things more uniform with the way things are handled in Clang. * A new boolean argument is added to indicate whether the load/store is volatile. The lowering respects that when emitting vector load/store instructions * MatrixBuilder is updated to require both Alignment and IsVolatile arguments, which are passed through to the generated intrinsic. The alignment is set using the `align` attribute. The changes are grouped together in a single patch, to have a single commit that breaks the compatibility. We probably should be fine with updating the intrinsics, as we did not yet officially support them in the last stable release. If there are any concerns, we can add auto-upgrade rules for the columnwise intrinsics though. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse Reviewed By: anemet, nicolasvasilache Differential Revision: https://reviews.llvm.org/D81472	2020-06-18 09:44:52 +01:00
David Sherwood	65912a9768	[CodeGen] Fix warnings in foldCONCAT_VECTORS Instead of asserting the number of elements is the same, we should be comparing the element counts instead. In addition, when looking at concats of extract_subvectors it's fine to use getVectorMinNumElements() for scalable vectors. I discovered these warnings when compiling the structured loads tests in this file: test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D81936	2020-06-18 09:29:37 +01:00
serge-sans-paille	f9c7e3136e	Correctly report modified status for HWAddressSanitizer Differential Revision: https://reviews.llvm.org/D81238	2020-06-18 10:27:44 +02:00
David Green	158e734af1	[ARM] Adjust AND/OR combines to not call isConstantSplat on i1 vectors. NFC. This rearranges PerformANDCombine and PerformORCombine to try and make sure we don't call isConstantSplat on any i1 vectors. As pointed out in D81860 it may not be very well defined in those cases.	2020-06-18 08:25:44 +01:00
Kristof Beyls	832cfc7672	[IndirectThunks] Make generated MF structure as expected by all instruction selectors. This also enables running the AArch64 SLSHardening pass with GlobalISel, so add a test for that. Differential Revision: https://reviews.llvm.org/D81403	2020-06-18 06:44:53 +01:00
Kristof Beyls	3f0cc96a96	[AArch64] SLSHardening: compute correct thunk name for X29. The enum values for AArch64 registers are not all consecutive. Therefore, the computation "__llvm_slsblr_thunk_x" + utostr(Reg - AArch64::X0) is not always correct. utostr(Reg - AArch64::X0) will not generate the expected string for the registers that do not have consecutive values in the enum. This happened to work for most registers, but does not for AArch64::FP (i.e. register X29). This can get triggered when the X29 is not used as a frame pointer. Differential Revision: https://reviews.llvm.org/D81997	2020-06-18 06:36:49 +01:00
Xing GUO	d261a1c0e0	[DWARFYAML][debug_abbrev] Make the abbreviation code optional. This patch helps make the `Code` optional in abbreviations table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D81826	2020-06-18 13:02:54 +08:00
Mehdi Amini	77b79d79c0	Remove "unused" member ModuleSlice from `struct OpenMPOpt` This is fixing warning from clang: warning: private field 'ModuleSlice' is not used [-Wunused-private-field] SmallPtrSetImpl<Function *> &ModuleSlice; ^ Differential Revision: https://reviews.llvm.org/D82027	2020-06-18 03:02:26 +00:00
Kang Zhang	58e19d465a	[PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator Summary: For PPC, a BinaryOperator of fp128 will become a libcall, and we shouldn't convert a loop to a CTR loop if the loop contains a libcall. But currently, in the PPCTTIImpl::mightUseCTR() function, we only deal with BinaryOperator for ppc_fp128 and don't deal with fp128. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81353	2020-06-18 02:54:19 +00:00
Xing GUO	1f391afbf4	[ObjectYAML][ELF] Add support for emitting the .debug_abbrev section. This patch enables yaml2elf to emit the .debug_abbrev section. The generated .debug_abbrev is verified using `llvm-dwarfdump`. Known issues that will be addressed later: - Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D81820	2020-06-18 10:50:38 +08:00
Esme-Yi	ad6024e29f	[PowerPC] Custom lower rotl v1i128 to vector_shuffle. Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected. In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. While PPC doesn’t support ROTL v1i128. We can custom lower ROTL v1i128 to the vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81076	2020-06-18 01:32:23 +00:00
Sam Clegg	7ee758d691	[WebAssembly] MC: Fix for data aliases with offsets (getelementptr) For some reason we hadn't seen such cases in the wild which makes me think that clang and rustc don't generate these. In the bug which reproduces it only occurs with LTO so my guess is that some LTO pass is creating this alias + gep. See: https://github.com/emscripten-core/emscripten/issues/8731 Differential Revision: https://reviews.llvm.org/D79462	2020-06-17 16:25:50 -07:00
Matt Arsenault	5f5f566b26	AMDGPU: Don't use 16-bit FP inline constants in integer operands It seems to be a hardware defect that the half inline constants do not work as expected for the 16-bit integer operations (the inverse does work correctly). Experimentation seems to show these are really reading the 32-bit inline constants, which can be observed by writing inline asm using op_sel to see what's in the high half of the constant. Theoretically we could fold the high halves of the 32-bit constants using op_sel. The *_asm_all.s MC tests are broken, and I don't know where the script to autogenerate these are. I started manually fixing it, but there's just too many cases to fix. This also does break the assembler/disassembler support for these values, and I'm not sure what to do about it. These are still valid encodings, so it seems like you should be able to use them in some way. If you wrote assembly using them, you could have really meant it (perhaps to read the high bits with op_sel?). The disassembler will print the invalid literal constant which will fail to re-assemble. The behavior is also different depending on the use context. Consider this example, which was previously accepted and encoded using the inline constant: v_mad_i16 v5, v1, -4.0, v3 ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04] In contexts where an inline immediate is required (such as on gfx8/9), this will now be rejected. For gfx10, this will produce the literal encoding and change the printed format: v_mad_i16 v5, v1, 0xc400, v3 ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00] This is just another variation of the issue that we don't perfectly handle round trip assembly/disassembly due to not tracking how immediates were encoded. This doesn't matter much in practice, since compilers don't emit the suboptimal encoding. I doubt any users are relying on this behavior (although I did make use of the old behavior to figure out what was wrong). Fixes bug 46302.	2020-06-17 19:14:10 -04:00
Yonghong Song	89648eb16d	[BPF] fix a bug for BTF pointee type pruning In BTF, pointee type pruning is used to reduce cluttering too many unused types into prog BTF. For example, struct task_struct { ... struct mm_struct mm; ... } If bpf program does not access members of "struct mm_struct", there is no need to bring types for "struct mm_struct" to BTF. This patch fixed a bug where an incorrect pruning happened. The test case like below: struct t; typedef struct t _t; struct s1 { _t c; }; int test1(struct s1 arg) { ... } struct t { int a; int b; }; struct s2 { _t c; } int test2(struct s2 arg) { ... } After processing test1(), among others, BPF backend generates BTF types for "struct s1", "_t" and a placeholder for "struct t". Note that "struct t" is not really generated. If later a direct access to "struct t" member happened, "struct t" BTF type will be generated properly. During processing test2(), when processing member type "_t c", BPF backend sees type "_t" already generated, so returned. This caused the problem that "struct t" BTF type is never generated and eventually causing incorrect type definition for "struct s2". To fix the issue, during DebugInfo type traversal, even if a typedef/const/volatile/restrict derived type has been recorded in BTF, if it is not a type pruning candidate, type traversal of its base type continues. Differential Revision: https://reviews.llvm.org/D82041	2020-06-17 15:13:46 -07:00
Eric Christopher	a8dad30388	Revert "Remove unused class variable ModuleSlice." as it was used in debug only code. This reverts commit `07a1749081`.	2020-06-17 14:45:17 -07:00
Eric Christopher	07a1749081	Remove unused class variable ModuleSlice.	2020-06-17 14:33:29 -07:00
Christopher Tetreault	8819202dfd	[SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold Summary: Assume all usages of this function are explicitly fixed-width operations and cast to FixedVectorType Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80262	2020-06-17 14:19:56 -07:00
Christopher Tetreault	4b776a98f1	[SVE] Fix invalid usages of getNumElements in ShuffleVectorInstruction Summary: Fix invalid usages of getNumElements identified by test case LLVM.Transforms/InstCombine::vscale_extractelement.ll. changesLength: Since the length of the llvm::SmallVector shufflemask is related to the minimum number of elements in a scalable vector, it is fine to just get the Min field of the ElementCount isIdentityWithExtract: Since it is not possible to express the mask needed for this pattern for scalable vectors, we can just bail before calling getNumElements() Reviewers: efriedma, sdesmalen, fpetrogalli, gchatelet, yrouban, craig.topper Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81969	2020-06-17 13:45:34 -07:00
Roman Lebedev	84b4f5a6a6	[InstCombine] Negator: while there, add detection for cycles during negation I don't have any testcases showing it happening, and i haven't succeeded in creating one, but i'm also not positive it can't ever happen, and i recall having something that looked like that in the very beginning of Negator creation. But since we now already have a negation cache, we can now detect such cases practically for free. Let's do so instead of "relying" on stack overflow :D	2020-06-17 22:47:20 +03:00
Roman Lebedev	e3d8cb1e1d	[InstCombine] Negator: cache negation results (PR46362) It is possible that we can try to negate the same value multiple times. For example, PHI nodes may happen to have multiple incoming values (all of which must be the same value) for the same incoming basic block. It may happen that we try to negate such a PHI node, and succeed, and that might result in having now-different incoming values.. To avoid that, and in general to reduce the amount of duplicated work we might be doing, let's introduce a cache where we'll track results of negating each value. The added test was previously failing -verify after -instcombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=46362	2020-06-17 22:47:20 +03:00
Roman Lebedev	c4166f3d84	[NFC][InstCombine] Negator: add thin negate() wrapper before visit()	2020-06-17 22:47:20 +03:00
Roman Lebedev	2b85147337	[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header	2020-06-17 22:47:19 +03:00
Thomas Lively	49754dcf22	[WebAssembly] Fix bug in FixBrTables and use branch analysis utils Summary: This commit fixes a bug in the FixBrTables pass in which an unconditional branch from the switch header block to the jump table block was not removed before the blocks were combined. The result was an invalid CFG in the MachineFunction. This commit also switches from using bespoke branch analysis and deletion code to using the standard utilities for the same. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81909	2020-06-17 12:34:45 -07:00
Nick Desaulniers	e7816f263b	[InlineSpiller] add assert about spills post terminators Summary: This invariant is being violated in the test case https://reviews.llvm.org/D77849, related to the use of the relatively new ability for callbr to have return values, and MachineBasicBlocks with INLINEASM_BR terminators to emit live out register defs. As noted in the comment, this triggers invariant violations in MachineVerifier via `llc -verify-machineinstrs` or `llc -verify-regalloc`, since only MachineInstrs that are terminators are allowed to follow the first terminator. https://reviews.llvm.org/D75098 may rework this very assertion if we're spilling via a (proposed) TCOPY MachineInstr. Reviewers: void, efriedma, arsenm Reviewed By: efriedma Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78166	2020-06-17 11:51:58 -07:00
Nick Desaulniers	88c965ba14	BreakCriticalEdges for callbr indirect dests Summary: llvm::SplitEdge was failing an assertion that the BasicBlock only had one successor (for BasicBlocks terminated by CallBrInst, we typically have multiple successors). It was surprising that the earlier call to SplitCriticalEdge did not handle the critical edge (there was an early return). Removing that triggered another assertion relating to creating a BlockAddress for a BasicBlock that did not (yet) have a parent, which is a simple order of operations issue in llvm::SplitCriticalEdge (a freshly constructed BasicBlock must be inserted into a Function's basic block list to have a parent). Thanks to @nathanchance for the report. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018 Reviewers: craig.topper, jyknight, void, fhahn, efriedma Reviewed By: efriedma Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D81607	2020-06-17 11:45:06 -07:00
Davide Italiano	1cbaf847ab	[CGP] Reset the debug location when promoting zext(s). When the zext gets promoted, it used to retain the original location, which pessimizes the debugging experience causing an unexpected jump in stepping at -Og. Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro). Differential Revision: https://reviews.llvm.org/D81437	2020-06-17 11:13:13 -07:00
Ian Levesque	7c7c8e0da4	[xray] Option to omit the function index Summary: Add a flag to omit the xray_fn_idx to cut size overhead and relocations roughly in half at the cost of reduced performance for single function patching. Minor additions to compiler-rt support per-function patching without the index. Reviewers: dberris, MaskRay, johnislarry Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81995	2020-06-17 13:49:01 -04:00
Alexandre Ganea	acb30f6856	[X86] For 32-bit targets, emit two-byte NOP when possible In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However on 32-bit targets this wasn't the case. PATCHABLE_OP now lowers to an XCHG AX, AX (66 90) like MSVC does. However when targeting pentium3 (/arch:SSE) or i386 (/arch:IA32) targets, we generate MOV EDI,EDI (8B FF) like MSVC does. This is for compatibility reasons with older tools that rely on this two-byte pattern. Differential Revision: https://reviews.llvm.org/D81301	2020-06-17 13:44:38 -04:00
Alexandre Ganea	ad879b31f0	[X86] Change signature of EmitNops. NFC. This is to support https://reviews.llvm.org/D81301.	2020-06-17 13:44:37 -04:00
Fangrui Song	c8b082a3ab	[llvm-cov gcov] Support clang<11 fake 4.2 format Test cases are restored from `a3bed4bd37`	2020-06-17 10:17:15 -07:00
Michał Górny	5c621900a6	[llvm] [CommandLine] Do not suggest really hidden opts in nearest lookup Skip 'really hidden' options when looking up the nearest option after an invalid option was passed. Since these options aren't even documented in --help-hidden, it seems inconsistent to suggest them to users. This fixes clang-tools-extra test failures due to unexpected suggestions when linking the tools to LLVM dylib (that provides more options than the subset of LLVM libraries linked directly). Differential Revision: https://reviews.llvm.org/D82001	2020-06-17 19:00:26 +02:00
Scott Linder	691ff4682f	[AMDGPU] Skip CFIInstructions in SIInsertWaitcnts Summary: CFI emitted during PEI at the beginning of the prologue needs to apply to any inserted waitcnts on function entry. Reviewers: arsenm, t-tye, RamNalamothu Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D76881	2020-06-17 12:41:03 -04:00
vnalamot	2e28009981	[NFC] Move getAll{S,V}GPR{32,128} methods to SIFrameLowering Summary: Future patch needs some of these in multiple places. The definitions of these can't be in the header and be eligible for inlining without making the full declaration of GCNSubtarget visible. I'm not sure what the right trade-off is, but I opted to not bloat SIRegisterInfo.h Reviewers: arsenm, cdevadas Reviewed By: arsenm Subscribers: RamNalamothu, qcolombet, jvesely, wdng, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79878	2020-06-17 12:08:09 -04:00

135948 Commits