llvm-project

Commit Graph

Author	SHA1	Message	Date
Ayke van Laethem	01c2209d51	[AVR] Decode single register instructions This is a set of instructions that take just a single register as an operand, with no immediates. Because all instructions share the same format, I haven't added exhaustive bit testing to all instructions but just to the inc instruction. Differential Revision: https://reviews.llvm.org/D81968	2020-06-23 02:17:15 +02:00
Ayke van Laethem	ff4817ec2a	[AVR] Don't adjust for instruction size I'm not entirely sure why this was ever needed, but when I remove both adjustments all tests still pass. This fixes a bug where a long branch (using the `jmp` instead of the `rjmp` instruction) was incorrectly adjusted by 2 because it jumps to an absolute address instead of a PC-relative address. I could have added AVR::fixup_call to the list of exceptions, but it seemed more sensible to me to just remove this code. Differential Revision: https://reviews.llvm.org/D78459	2020-06-23 02:15:42 +02:00
Sam Clegg	79aad89d8d	[WebAssembly] Add support for externalref to MC and wasm-ld This allows code for handling externref values to be processed by the assembler and linker. Differential Revision: https://reviews.llvm.org/D81977	2020-06-22 15:57:24 -07:00
Christopher Tetreault	cd6848f6e1	[SVE] Remove calls to VectorType::getNumElements from ARM Reviewers: efriedma, greened, c-rhodes, david-arm, dmgreen Reviewed By: dmgreen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82216	2020-06-22 15:18:58 -07:00
Arthur Eubanks	d335c1317b	Fix dynamic alloca detection in CloneBasicBlock Summary: Simply check AI->isStaticAlloca instead of reimplementing checks for static/dynamic allocas. Reviewers: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82328	2020-06-22 15:06:28 -07:00
Craig Topper	23654d9e7a	Recommit "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Hopefully this version will fix the previous buildbot failure	2020-06-22 13:32:03 -07:00
Greg Clayton	ccf5a44917	Fix the verification of DIEs with DW_AT_ranges. Summary: Previous code would try to verify DW_AT_ranges and if any ranges would overlap, it would stop attributing any ranges after this to the DIE which caused incorrect errors to be reported that a DIE's address ranges were not contained in the parent DIE's ranges. Added a fix and a test. Reviewers: aprantl, labath, probinson, JDevlieghere, jhenderson Subscribers: hiraditya, MaskRay, cmtice, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79962	2020-06-22 13:13:48 -07:00
Lei Zhang	315bd96437	Use std::make_tuple instead initializer list Hopefully this pleases GCC-5 and fixes the build error: LowerExpectIntrinsic.cpp:62:53: error: converting to 'std::tuple<unsigned int, unsigned int>' from initializer list would use explicit constructor 'constexpr std::tuple<_T1, _T2>::tuple(_U1&&, _U2&&) [with _U1 = llvm::cl::opt<unsigned int>&; _U2 = llvm::cl::opt<unsigned int>&; <template-parameter-2-3> = void; _T1 = unsigned int; _T2 = unsigned int]' return {LikelyBranchWeight, UnlikelyBranchWeight}; Differential Revision: https://reviews.llvm.org/D82325	2020-06-22 15:43:40 -04:00
Hans Wennborg	1357c06578	Revert "[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions" This caused a Chromium test to miscompile. See discussion on the Phabricator review. > This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero. > > Fixes PR45378 > > Differential Revision: https://reviews.llvm.org/D81547 This reverts `057c9c7ee0`	2020-06-22 21:27:11 +02:00
Christopher Tetreault	5e2c736395	[SVE] Remove calls to VectorType::getNumElements from WebASM Summary: The getNumElements in base VectorType is being deprecated. See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html Reviewers: efriedma, tlively, fpetrogalli, c-rhodes, dschuff Reviewed By: tlively, dschuff Subscribers: dschuff, sbc100, tschuett, jgravelle-google, hiraditya, aheejin, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82217	2020-06-22 12:25:08 -07:00
Craig Topper	bebea4221d	Revert "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Seems to be breaking the build. This reverts commit `5ac144fe64`.	2020-06-22 12:20:40 -07:00
Craig Topper	5ac144fe64	[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum. Move 0 initialization up to the caller so we don't need to know the size.	2020-06-22 11:46:20 -07:00
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin-function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It is mainly the same as __builtin_expect besides one more argument indicating the probability of expression equal to expected value. The probability should be a constant floating-point expression and be in range [0.0, 1.0] inclusive. It is similar to builtin-expect-with-probability function in GCC built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Hiroshi Yamauchi	9e1decf743	[PGO][PGSO] Enable non-cold size opts under partial profile sample PGO. Summary: Similar to D81020. Follow up D78949. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82053	2020-06-22 10:12:48 -07:00
Francesco Petrogalli	ef597eda8e	[sve][acle] Add SVE BFloat16 extensions. Summary: List of intrinsics: svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op) For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4" Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82141	2020-06-22 16:53:02 +00:00
Sanjay Patel	9934cc544c	[VectorCombine] make helper function for shift-shuffle; NFC This will probably be useful for other extract patterns.	2020-06-22 12:23:52 -04:00
Florian Hahn	328c8642e2	[DSE,MSSA] Reorder DSE blocking checks. Currently we stop exploring candidates too early in some cases. In particular, we can continue checking the defining accesses of non-removable MemoryDefs and defs without analyzable write location (read clobbers are already ruled out using MemorySSA at this point).	2020-06-22 17:16:34 +01:00
Fangrui Song	c52bee61e9	[MCParser] Support quoted section name for COFF This features matches ELFAsmParser and makes it possible to use `.section ".llvm.call-graph-profile","n"` Reviewed By: zequanwu Differential Revision: https://reviews.llvm.org/D82240	2020-06-22 09:11:44 -07:00
stozer	539381da26	[DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions Following on from this RFC[0] from a while back, this is the first patch towards implementing variadic debug values. This patch specifically adds a set of functions to MachineInstr for performing operations specific to debug values, and replacing uses of the more general functions where appropriate. The most prevalent of these is replacing getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the operands corresponding to values will no longer be at index 0, but index 2 and upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been added for the other operands, along with some helper functions to replace oft-repeated code and operate on a variable number of value operands. [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html Differential Revision: https://reviews.llvm.org/D81852	2020-06-22 16:01:12 +01:00
Guillaume Chatelet	597a9070b5	[ARC] Add missing return statement	2020-06-22 14:57:29 +00:00
Sanjay Patel	98c2f4eea5	[VectorCombine] add helper to replace uses and rename The tests are regenerated to show a path that missed renaming, but there should be no functional difference from this patch.	2020-06-22 09:58:49 -04:00
Valentin Clement	8383ac6197	Revert commit `9e52530` because of dependencies issue This reverts commit `9e525309fb`.	2020-06-22 09:56:14 -04:00
Valentin Clement	9e525309fb	[openmp] Base of tablegen generated OpenMP common declaration Summary: As discussed previously when landing patch for OpenMP in Flang, the idea is to share common part of the OpenMP declaration between the different Frontend. While doing this it was thought that moving to tablegen instead of Macros will also give a cleaner and more powerful way of generating these declarations. This first part of a future series of patches is setting up the base .td file for DirectiveLanguage as well as the OpenMP version of it. The base file is meant to be used by other directive languages such as OpenACC. In this first patch, the Directive and Clause enums are generated with tablegen instead of the macros on OMPConstants.h. The next patch will extend this to other enums and move the Flang frontend to use it. Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp Reviewed By: jdoerfert, jdenny Subscribers: cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits Tags: #llvm, #openmp, #clang Differential Revision: https://reviews.llvm.org/D81736	2020-06-22 09:34:53 -04:00
Xing GUO	03480c80d3	[DWARFYAML][debug_info] Add support for error handling. This patch helps add support for error handling in `DWARFYAML::emitDebugInfo()`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D82275	2020-06-22 21:36:13 +08:00
Xing GUO	3a48a632d0	[DWARFYAML][debug_info] Use 'AbbrCode' to index the abbreviation. Before this patch, we use `(uint32_t)AbbrCode - (uint32_t)FirstAbbrCode` to index the abbreviation. It's impossible for us to use the preceding abbreviation of the previous one (e.g., if the previous DIE's `AbbrCode` is 2, we are unable to use the abbreviation with index 1). In this patch, we use `AbbrCode` to index the abbreviation directly. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D82173	2020-06-22 21:34:02 +08:00
Simon Pilgrim	48d1a2d6d0	[DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI. We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.	2020-06-22 14:24:39 +01:00
Sanjay Patel	de65b356dc	[VectorCombine] add/use pass-level IRBuilder This saves creating/destroying a builder every time we perform some transform. The tests show instruction ordering diffs resulting from always inserting at the root instruction now, but those should be benign.	2020-06-22 09:01:29 -04:00
Jay Foad	9761d3cf9c	[AMDGPU] Update more live intervals in SIWholeQuadMode This fixes various assertion failures that would otherwise be triggered by a later patch to move SIWholeQuadMode later in the pass pipeline. Differential Revision: https://reviews.llvm.org/D82190	2020-06-22 13:50:15 +01:00
Sanjay Patel	cce625f73d	[VectorCombine] improve IR debugging by providing/salvaging value names The tests are regenerated to show the diffs, but there should be no functional change from this patch.	2020-06-22 08:35:47 -04:00
Tim Corringham	96ecead5a2	[AMDGPU] clang-format of SIModeRegister.cpp Ran clang-format just to ease future reviews. No functional changes.	2020-06-22 13:31:52 +01:00
Simon Pilgrim	ecc5d7ee0d	[DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly. We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.	2020-06-22 12:35:32 +01:00
Tres Popp	09d72ad399	Revert "[CGP] Enable CodeGenPrepares phi type convertion." This reverts commit `67121d7b82`. This is causing compile times to be 2x slower on some large binaries.	2020-06-22 13:06:18 +02:00
Serguei Katkov	eae0d2e9b2	Revert "[Peeling] Extend the scope of peeling a bit" This reverts commit `29b2c1ca72`. The patch causes a DT verifier failure like: DominatorTree is different than a freshly computed one! Not sure the patch itself is wrong, but reverting to investigate the failure.	2020-06-22 17:48:29 +07:00
Vitaly Buka	5d964e262f	[StackSafety] Check variable lifetime We can't consider variable safe if out-of-lifetime access is possible. So if StackLifetime can't prove that the instruction always uses the variable when it's still alive, we consider it unsafe.	2020-06-22 03:45:29 -07:00
Vitaly Buka	8f592ed333	[StackSafety] Ignore unreachable instructions Usually DominatorTree provides this info, but here we use StackLifetime. The reason is that in the next patch StackLifetime will be used for actual lifetime checks and we can avoid forwarding the DominatorTree into this code.	2020-06-22 03:45:29 -07:00
Anton Korobeynikov	6cb80fbe40	Revert "[MSP430] Update register names" This reverts commit `8f6620f663`.	2020-06-22 13:37:22 +03:00
Anatoly Trosinenko	8f6620f663	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. Differential Revision: https://reviews.llvm.org/D82184	2020-06-22 13:24:03 +03:00
Momchil Velikov	75b0bbca1d	[LTO] Use StringRef instead of C-style strings in setCodeGenDebugOptions Fixes an issue with missing nul-terminators and saves us some string copying, compared to a version which would insert nul-terminators. Differential Revision: https://reviews.llvm.org/D82033	2020-06-22 11:22:18 +01:00
Anatoly Trosinenko	a5bd75aab8	[MSP430] Enable some basic support for debug information This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers). This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this: ``` $sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test" (gdb) ... traditional GDB commands follow ... ``` While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself. One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered. The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10. Differential Revision: https://reviews.llvm.org/D81488	2020-06-22 13:14:07 +03:00
Anatoly Trosinenko	359fae6eb0	[DebugInfo] Explicitly permit addr_size = 0x02 when parsing DWARF data Current LLVM implementation uses `MCAsmInfo::CodePointerSize` as addr_size when emitting the DWARF data. llvm-dwarfdump, on the other hand, handles `addr_size`s of 4 and 8 properly and considers all other sizes as an error. This works for most of mainline targets except for MSP430 and AVR. msp430-gcc v8.3.1 emits DWARF32 with addr_size = 4 (DWARF32 does not imply addr_size = 4, 32 refers to internal offset width of 4 bytes) that is handled by llvm-dwarfdump already. Still, emitting 2-byte target pointers on MSP430 seems correct as well (but not for MSP430X that is supported by msp430-gcc but not by LLVM and has 20-bit address space). This patch makes it possible for MSP430 debug info support to be tested with llvm-dwarfdump. Differential Revision: https://reviews.llvm.org/D82055	2020-06-22 13:11:55 +03:00
Florian Hahn	0e19ff02d8	[DSE,MSSA] Remove unused arguments for isDSEBarrier (NFC).	2020-06-22 10:58:53 +01:00
Djordje Todorovic	792786e34d	[CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy When describing parameter value loaded by a COPY instruction, consider case where needed Reg value is a sub- or super- register of the COPY instruction's destination register. Without this patch, compile process will crash with the assertion "TargetInstrInfo::describeLoadedValue can't describe super- or sub-regs for copy instructions". Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D82000	2020-06-22 10:49:02 +02:00
Serguei Katkov	29b2c1ca72	[Peeling] Extend the scope of peeling a bit Currently we allow peeling of the loops if there is an exiting latch block and all other exits are blocks ending with deopt. Actually we want that exit would end up with deopt unconditionally but it is not required that exit itself ends with deopt. Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev Reviewed By: apilipenko Subscribers: hiraditya, zzheng, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D81140	2020-06-22 12:17:44 +07:00
Michael Liao	20a1700293	[amdgpu] Fix REL32 relocations with negative offsets. Summary: - The offset should be treated as a signed one. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82234	2020-06-21 23:09:03 -04:00
Sanjay Patel	6bdd531af5	[VectorCombine] create class for pass to hold analyses, etc; NFC This doesn't change anything currently, but it would make sense to create a class-level IRBuilder instead of recreating that everywhere. As we expand to more optimizations, we will probably also want to hold things like the DataLayout or other constant refs in here too.	2020-06-21 16:07:33 -04:00
David Green	67121d7b82	[CGP] Enable CodeGenPrepares phi type convertion.	2020-06-21 16:46:16 +01:00
Florian Hahn	40569db7b3	[DSE,MSSA] Move reachability check to main loop. As we traverse the CFG backwards, we could end up reaching unreachable blocks. For unreachable blocks, we won't have computed post order numbers and because DomAccess is reachable, unreachable blocks cannot be on any path from it. This fixes a crash with unreachable blocks.	2020-06-21 16:38:10 +01:00
David Green	730ecb63ec	[CGP] Convert phi types If a collection of interconnected phi nodes is only ever loaded, stored or bitcast then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phis are passed across basic block boundaries. This has to be done in CodegenPrepare as it naturally straddles basic blocks. The algorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes to not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and Stores are bitcast to the correct type, which should then be folded into the load/store, changing its type. This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from aarch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extract element handling which increased the potency in some cases. This adds it with an option that defaults to off, and disabled for 32bit X86 due to potential issues around canonicalizing NaNs. Differential Revision: https://reviews.llvm.org/D81827	2020-06-21 15:54:17 +01:00
Nikita Popov	37d3030711	[ValueTracking, BasicAA] Don't simplify instructions GetUnderlyingObject() (and by required symmetry DecomposeGEPExpression()) will call SimplifyInstruction() on the passed value if other checks fail. This simplification is very expensive, but has little effect in practice. This patch removes the SimplifyInstruction() call, and replaces it with a check for single-argument phis (which can occur in canonical IR in LCSSA form), which is the only useful simplification case I was able to identify. At O3 the geomean CTMark improvement is -1.7%. The largest improvement is SPASS with ThinLTO at -6%. In test-suite, I see only two tests with a hash difference and no code size difference (PAQ8p, Ptrdist), which indicates that the simplification only ends up being useful very rarely. (I would have liked to figure out which simplification is responsible here, but wasn't able to spot it looking at transformation logs.) The AMDGPU test case that is updated was using two selects with undef condition, in which case GetUnderlyingObject will return the first select operand as the underlying object. This will of course not happen with non-undef conditions, so this was not testing anything realistic. Additionally this illustrates potential unsoundness: While GetUnderlyingObject will pick the first operand, the select might be later replaced by the second operand, resulting in inconsistent assumptions about the undef value. Differential Revision: https://reviews.llvm.org/D82261	2020-06-21 16:31:07 +02:00
Sanjay Patel	2ad42c2653	[ValueTracking] improve analysis for fdiv with same operands (The 'nnan' variant of this pattern is already tested to produce '1.0'.) https://alive2.llvm.org/ce/z/D4hPBy define i1 @src(float %x, i32 %y) { %0: %d = fdiv float %x, %x %uge = fcmp uge float %d, 0.000000 ret i1 %uge } => define i1 @tgt(float %x, i32 %y) { %0: ret i1 1 } Transformation seems to be correct!	2020-06-21 09:07:59 -04:00
Simon Pilgrim	fb9f9dc318	[X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries. This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements. To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code. Differential Revision: https://reviews.llvm.org/D81791	2020-06-21 11:16:07 +01:00
clfbbn	10b0539772	[Attributor][NFC] Fix indentation Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, kuter, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82260	2020-06-21 15:43:32 +08:00
Wenlei He	7c8a6936bf	[Remarks] Add callsite locations to inline remarks Summary: Add call site location info into inline remarks so we can differentiate inline sites. This can be useful for inliner tuning. We can also reconstruct full hierarchical inline tree from parsing such remarks. The message of inline remark is also tweaked so we can differentiate SampleProfileLoader inline from CGSCC inline. Reviewers: wmi, davidxl, hoy Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82213	2020-06-20 23:32:10 -07:00
Amy Kwan	cc95635b1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector unsigned char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Eric Christopher	dc20419351	Rename function to more accurately reflect what it does.	2020-06-20 14:37:29 -07:00
Eric Christopher	8116d01905	Typos around a -> an.	2020-06-20 14:04:48 -07:00
Sanjay Patel	741e20f3d6	[VectorCombine] fix assert for type of compare operand As shown in the post-commit comment for D81661 - we need to loosen the type assertion to allow scalarization of a compare for vectors of pointers.	2020-06-20 15:20:17 -04:00
Sanjay Patel	7b201bfcac	[InstCombine] remove unused parameter and add assert; NFC	2020-06-20 11:47:00 -04:00
Sanjay Patel	d84cdb81ed	[InstCombine] fabs(X) / fabs(X) -> X / X Also, consolidate related folds so we don't miss/repeat these.	2020-06-20 10:20:21 -04:00
Simon Pilgrim	89dcbdfcfd	[X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI. The comparison value should be the same size - I've added an assert to be absolutely certain.	2020-06-20 12:35:24 +01:00
Simon Pilgrim	56a9332328	[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns	2020-06-20 12:17:32 +01:00
Nikita Popov	d3d4e4bcb7	[LVI] Extract addValueHandle() method (NFC) There will be more places registering value handles.	2020-06-20 13:05:42 +02:00
Nikita Popov	64ecf85f63	[LVI] Use find_as() where possible (NFC) This prevents us from creating temporary PoisoningVHs and AssertingVHs while performing hashmap lookups. As such, it only matters in assertion-enabled builds.	2020-06-20 13:05:42 +02:00
Florian Hahn	9a7d80a32c	Revert "[BasicAA] Use known lower bounds for index values for size based check." This is potentially related to https://bugs.llvm.org/show_bug.cgi?id=46335 and causes a slight compile-time regression. Revert while investigating. This reverts commit `d99a1848c4`.	2020-06-20 10:06:05 +01:00
Eric Christopher	10563e16aa	[Analysis/Transforms/Sanitizers] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:42:26 -07:00
Eric Christopher	858d385578	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:24:57 -07:00
Eric Christopher	cf23852587	[Target] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist. This change affects an internal llvm command line option.	2020-06-20 00:06:39 -07:00
Xing GUO	6770349592	[DWARFYAML][debug_info] Fix array index out of bounds error This patch is trying to fix the array index out of bounds error. I observed it in (https://reviews.llvm.org/harbormaster/unit/view/99638/). Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D82139	2020-06-20 15:08:24 +08:00
Craig Topper	c721bc081e	[X86] Correct the implementation of ud1(a.k.a. ud2b) instruction. We were missing the modrm byte this instruction has according to current Intel SDM. Experiments with gcc indicate that different modrm values are chosen based on 2 operands so I've added those as well. I think our previous implementation was based on an older behavior of binutils that has since been changed.	2020-06-19 23:57:48 -07:00
Craig Topper	0dda5e4ce2	[X86] Ignore bits 2:0 of the modrm byte when disassembling lfence, mfence, and sfence. These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8 respectively. But hardware ignores bits 2:0. So 0xe9-0xef is treated the same as 0xe8. Similar for the other two. Fixing this required adding 8 new formats to the X86 instructions to convey this information. Could have gotten away with 3, but adding all 8 made for a more logical conversion from format to modrm encoding. I renumbered the format encodings to keep the register modrm formats grouped together.	2020-06-19 22:24:24 -07:00
Fangrui Song	2a4317bfb3	[SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D82244	2020-06-19 22:22:47 -07:00
Yevgeny Rouban	6429471e8b	[IR] Convert profile metadata in createCallMatchingInvoke() When an invoke instruction is converted to a call its profile metadata is dropped because it has incompatible format (see commit `16ad6eeb94`). This patch adds an attempt to convert profile data to format of the call instruction. This used to work well before the commit `dcfa78a4cc`. Reviewers: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D82071	2020-06-20 12:10:31 +07:00
Wang Rui	dd48c57da3	[Mips] Error if a non-immediate operand is used while an immediate is expected The 32-bit type relocation (R_MIPS_32) cannot be used for instructions below: ori $4, $4, start ori $4, $4, (start - .) We should print an error instead. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D81908	2020-06-19 22:08:59 -07:00
Vitaly Buka	3d8149db3c	[StackSafety,NFC] Don't rerun on LiveIn change	2020-06-19 21:29:31 -07:00
Xing GUO	1cfdda57fa	[ObjectYAML][ELF] Add support for emitting the .debug_info section. This patch helps add support for emitting the .debug_info section to yaml2elf. Reviewed By: jhenderson, grimar, MaskRay Differential Revision: https://reviews.llvm.org/D82073	2020-06-20 12:13:01 +08:00
Carl Ritson	4a7de36afc	[AMDGPU] Avoid use of V_READLANE into EXEC in SGPR spills Always prefer to clobber input SGPRs and restore them after the spill. This applies to both spills to VGPRs and scratch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81914	2020-06-20 12:10:47 +09:00
romanova-ekaterina	d7fad626e9	Error related to ThinLTO caching needs to be downgraded to a remark This is a fix for PR #46392 (Diagnostic message (error) related to ThinLTO caching needs to be downgraded to a remark). There are diagnostic messages related to ThinLTO caching that contain the word "error", but they are really just notices/remarks for users, and they don't cause a build failure. The word "error" appearing can be confusing to users, and may even cause deeper problems. User's build system might be designed to interpret any error messages (even a benign error message as the one above) reported by the compiler as a build failure, thus causing the build to fail "needlessly". In short, the term "error" in this diagnostic is misleading at best, and may be causing build systems to fail at worst. Differential Revision: https://reviews.llvm.org/D82138	2020-06-19 16:03:29 -07:00
Eric Christopher	b6536e549d	As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-19 15:12:18 -07:00
Heejin Ahn	83c26eae23	[WebAssembly] Remove TEEs when dests are unstackified When created in RegStackify pass, `TEE` has two destinations, where op0 is stackified and op1 is not. But it is possible that op0 becomes unstackified in `fixUnwindMismatches` function in CFGStackify pass when a nested try-catch-end is introduced, violating the invariant of `TEE`s destinations. In this case we convert the `TEE` into two `COPY`s, which will eventually be resolved in ExplicitLocals. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D81851	2020-06-19 14:55:21 -07:00
Martin Storsjö	cdbd299800	[Support] Fix building for mingw on a case sensitive file system This fixes cross building on a case sensitive file system after `2e613d2ded`. (The official Windows SDKs don't have self-consistent casing and can't be used as such on case sensitive file systems without case fixups, while mingw headers consistently use lower case.)	2020-06-20 00:39:22 +03:00
Amara Emerson	1feeecf224	[AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support. We were defaulting to the lower action for this, resulting in SHL+ASHR sequences. On AArch64 we can do this in one instruction for an arbitrary extension using SBFM as we do for G_SEXT. Differential Revision: https://reviews.llvm.org/D81992	2020-06-19 13:20:41 -07:00
Sanjay Patel	216a37bb46	[VectorCombine] refactor extract-extract logic; NFCI	2020-06-19 14:52:27 -04:00
Lang Hames	bf783a6aa8	[JITLink] Display host -> target address mapping in debugging output. This can be helpful for sanity checking JITLink memory manager behavior.	2020-06-19 10:05:02 -07:00
Sanjay Patel	6d864097a2	[VectorCombine] fix crash while transforming constants This is a variation of the proposal in D82049 with an extra test.	2020-06-19 12:30:32 -04:00
Stanislav Mekhanoshin	2b87a44c49	[AMDGPU] Some formatting fixes. NFC.	2020-06-19 09:02:59 -07:00
Piotr Sobczak	6d9565d6d5	Revert "[AMDGPU] Select s_cselect" This caused some failures detected by the buildbot with expensive checks enabled. This reverts commit `4067de569f`.	2020-06-19 16:41:04 +02:00
dfukalov	129388ddc4	[AMDGPU][CostModel] Add fneg cost estimation Summary: The estimation uses AMDGPUTargetLowering::isFNegFree() Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82065	2020-06-19 17:31:35 +03:00
Piotr Sobczak	4067de569f	[AMDGPU] Select s_cselect Summary: Add patterns to select s_cselect in the isel. Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81925	2020-06-19 16:17:46 +02:00
Sjoerd Meijer	4aa893b8f2	[ARM][MVE] tail-predication: renamed internal option. Renamed -force-tail-predication to -force-mve-tail-predication because that's more descriptive and consistent.	2020-06-19 15:07:06 +01:00
Mikhail Maltsev	490f78c038	[ARM][BFloat] Implement lowering of bf16 load/store intrinsics Reviewers: labrinea, dmgreen, pratlucas, LukeGeeson Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81486	2020-06-19 14:02:35 +00:00
Mikhail Maltsev	7526881246	[ARM][BFloat] Lowering of create/get/set/dup intrinsics This patch adds codegen for the following BFloat operations to the ARM backend: * concatenation of bf16 vectors * bf16 vector element extraction * bf16 vector element insertion * duplication of a bf16 value into each lane of a vector * duplication of a bf16 vector lane into each lane Differential Revision: https://reviews.llvm.org/D81411	2020-06-19 12:52:40 +00:00
Simon Pilgrim	c143db3b10	[X86][SSE] combineHorizontalPredicateResult - improve all_of(X == 0) for vXi64 on pre-SSE41 targets Without SSE41 we don't have the PCMPEQQ instruction, making cmp-with-zero reductions more complicated than necessary. We can compare as vXi32 (PCMPEQD) and tweak the MOVMSK comparison to test upper/lower DWORD comparisons. This pre-fixes something that occurs with null tests for vectors of (64-bit) pointers such as in PR35129.	2020-06-19 11:43:25 +01:00
Vitaly Buka	0e1bdeafc9	[StackSafety,NFC] Fix comment	2020-06-19 03:11:13 -07:00
Tyker	67448a8ccc	try to fix build bot after `b7338fb1a6`	2020-06-19 12:02:09 +02:00
Simon Pilgrim	cad2038700	[X86][SSE] combineSetCCMOVMSK - fold MOVMSK(SHUFFLE(X,u)) -> MOVMSK(X) If we're permuting ALL the elements of a single vector, then for allof/anyof MOVMSK tests we can avoid the shuffle entirely.	2020-06-19 10:57:52 +01:00
David Sherwood	584d0d5c17	[SVE] Fall back on DAG ISel at -O0 when encountering scalable types At the moment we use Global ISel by default at -O0, however it is currently not capable of dealing with scalable vectors for two reasons: 1. The register banks know nothing about SVE registers. 2. The LLT (Low Level Type) class knows nothing about scalable vectors. For now, the easiest way to avoid users hitting issues when using the SVE ACLE is to fall back on normal DAG ISel when encountering instructions that operate on scalable vector types. I've added a couple of RUN lines to existing SVE tests to ensure we can compile at -O0. I've also added some new tests to CodeGen/AArch64/GlobalISel/arm64-fallback.ll that demonstrate we correctly fall back to DAG ISel at -O0 when lowering formal arguments or translating instructions that involve scalable vector types. Differential Revision: https://reviews.llvm.org/D81557	2020-06-19 10:57:00 +01:00
David Sherwood	0dc28af219	[CodeGen,AArch64] Fix up warnings in performExtendCombine Try to avoid calling getVectorNumElements() or relying upon the TypeSize conversion to uint64_t. Differential Revision: https://reviews.llvm.org/D81573	2020-06-19 10:34:51 +01:00
Vitaly Buka	f224f3d0f2	[StackSafety] Add StackLifetime::isAliveAfter This function is going to be added into StackSafety checks. This patch uses function in ::print implementation to make sure that it works as expected.	2020-06-19 02:32:17 -07:00
Vitaly Buka	306c257b00	[SafeStack,NFC] Print liveness for all instructions	2020-06-19 02:32:17 -07:00
Vitaly Buka	20b1094a04	[StackSafety,NFC] Replace map with vector We don't need to lookup InstructionNumbering by number, so we can use vector with index as assigned number.	2020-06-19 02:32:17 -07:00
Vitaly Buka	7b27c09f63	[StackSafety,NFC] Don't test terminators Code does not track terminators and does not expose them through the interface. State there is just a state of the last instruction or entry. So this information is just redundant and doesn't need to be tested.	2020-06-19 02:32:17 -07:00
Florian Hahn	f9d8e33c32	[SCCP] Turn sext into zext for non-negative ranges. This patch updates SCCP/IPSCCP to use the computed range info to turn sexts into zexts, if the value is known to be non-negative. We already do a similar transform in CorrelatedValuePropagation, but it seems like we can catch a lot of additional cases by doing it in SCCP/IPSCCP as well. The transform is limited to ranges that are known to not include undef. Currently constant ranges from conditions are treated as potentially containing undef, due to PR46144. Once we flip this, the transform will be more effective in practice. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81756	2020-06-19 10:17:55 +01:00
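As an illustration of the sext-to-zext transform described in `f9d8e33c32`, here is a minimal Python sketch (not LLVM code; the helper names `sext`, `zext` and `extensions_agree` are hypothetical). It only shows why the rewrite is safe: sign extension and zero extension produce the same bits whenever the sign bit of the source value is clear.

```python
def zext(value: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, upper bits become 0."""
    assert 0 < from_bits <= to_bits
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend: copy the sign bit of the from_bits-wide value upward."""
    assert 0 < from_bits <= to_bits
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    if value & sign_bit:
        # Fill bits [from_bits, to_bits) with ones.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def extensions_agree(lo: int, hi: int, from_bits: int, to_bits: int) -> bool:
    """True if sext == zext for every bit pattern in the inclusive range [lo, hi]."""
    return all(
        sext(v, from_bits, to_bits) == zext(v, from_bits, to_bits)
        for v in range(lo, hi + 1)
    )
```

For an i8 source, the range [0, 127] satisfies `extensions_agree`, while any range that includes 128 (sign bit set) does not.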
Jay Foad	7cdf4326a8	[LiveIntervals] Fix early-clobber handling in handleMoveUp Without this fix, handleMoveUp can create an invalid live range like this: [98904e,98908r:0)[98908e,227504r:1) where the two segments overlap, but only because we have lost the "e" (early-clobber) on the end point of the first segment. Differential Revision: https://reviews.llvm.org/D82110	2020-06-19 10:17:04 +01:00
Tyker	b7338fb1a6	[AssumeBundles] add canonicalisation to the assume builder Summary: this significantly reduces the number of assumes generated without affecting too much the information that is preserved. this improves the compile-time cost of enable-knowledge-retention significantly. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650	2020-06-19 10:32:26 +02:00
David Sherwood	7edc7f6edb	[CodeGen] Fix SimplifyDemandedBits for scalable vectors For now I have changed SimplifyDemandedBits and its various callers to assume we know nothing for scalable vectors and to ignore the demanded bits completely. I have also done something similar for SimplifyDemandedVectorElts. These changes fix up lots of warnings due to calls to EVT::getVectorNumElements() for types with scalable vectors. These functions are all used for optimisations, rather than functional requirements. In future we can revisit this code if there is a need to improve code quality for SVE. Differential Revision: https://reviews.llvm.org/D80537	2020-06-19 07:59:35 +01:00
David Sherwood	9e811b0d93	[CodeGen] Fix ComputeNumSignBits for scalable vectors When trying to calculate the number of sign bits for scalable vectors we should just bail out for now and pretend we know nothing. Differential Revision: https://reviews.llvm.org/D81093	2020-06-19 07:58:42 +01:00
Kristof Beyls	d938ec4509	[AArch64] Avoid incompatibility between SLSBLR mitigation and BTI codegen. A "BTI c" instruction only allows jumping/calling to it using a BLR* instruction. However, the SLSBLR mitigation changes a BLR to a BR to implement the function call. Therefore, a "BTI c" check that passed before could trigger after the BLR->BR change done by the SLSBLR mitigation. However, if the register used in BR is X16 or X17, this trigger will not fire (see ArmARM for further details). Therefore, this patch simply changes the function stubs for the SLSBLR mitigation from __llvm_slsblr_thunk_x<N>: br x<N> SpeculationBarrier to __llvm_slsblr_thunk_x<N>: mov x16, x<N> br x16 SpeculationBarrier Differential Revision: https://reviews.llvm.org/D81405	2020-06-19 06:21:54 +01:00
Ronak Chauhan	5bd33de9c8	[MC] Pass the symbol rather than its name to onSymbolStart() Summary: This allows targets to also consider the symbol's type and/or address if needed. Reviewers: scott.linder, jhenderson, MaskRay, aardappel Reviewed By: scott.linder, MaskRay Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82090	2020-06-19 09:30:12 +05:30
Francesco Petrogalli	d32c134648	[llvm][SVE] Reg + reg addressing mode for LD1RO. Reviewers: efriedma, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80741	2020-06-19 03:56:10 +00:00
Nemanja Ivanovic	1fed131660	[PowerPC] Canonicalize shuffles to match more single-instruction masks on LE We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following: - Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want) - Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges) - Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code Differential revision: https://reviews.llvm.org/D77448	2020-06-18 21:54:22 -05:00
Carl Ritson	8f3b2c8aa3	AMDGPU/GlobalISel: Remove selection of MAD/MAC when not available Add code to respect mad-mac-f32-insts target feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81990	2020-06-19 10:30:19 +09:00
Vitaly Buka	fcd67665a8	[StackSafety] Add "Must Live" logic Summary: Extend StackLifetime with option to calculate liveness where alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124	2020-06-18 16:53:37 -07:00
Nathan James	8b0df1c1a9	[NFC] Refactor Registry loops to range for	2020-06-19 00:40:10 +01:00
Vitaly Buka	f672791e08	[StackSafety] Add pass for StackLifetime testing Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82043	2020-06-18 16:34:18 -07:00
Matt Arsenault	bbd78519f9	ARC: Enforce function alignment at code emission time Don't do this in the MachineFunctionInfo constructor. Also, ensure the alignment rather than overwriting it outright. I vaguely remember there was another place to enforce the target minimum alignment, but I couldn't find it (it's there for instructions).	2020-06-18 17:40:49 -04:00
Matt Arsenault	95605b784b	AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr We probably need to move where intrinsics are lowered to copies to make this useful.	2020-06-18 17:28:00 -04:00
Matt Arsenault	b13f6b0fe0	BypassSlowDivision: Fix dropping debug info I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.	2020-06-18 17:27:19 -04:00
Amy Kwan	c45c161130	[PowerPC][Power10] Implement Parallel Bits Deposit/Extract Builtins in LLVM/Clang This patch implements builtins for the following prototypes: vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long); vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b); unsigned long long __builtin_pdepd (unsigned long long, unsigned long long); unsigned long long __builtin_pextd (unsigned long long, unsigned long long); Revision Depends on D80758 Differential Revision: https://reviews.llvm.org/D80935	2020-06-18 16:23:56 -05:00
Matt Arsenault	7f8b2e1b91	GlobalISel: Pass LegalizerHelper to custom legalize callbacks This was passing in all the parameters needed to construct a LegalizerHelper in the custom legalization, when it's simpler to just pass in the existing helper. This is slightly more annoying to use in the common case where you don't need the legalizer helper, but we could add the common parameters back in addition to the helper. I didn't propagate this to all the internal target changes that this logically implies, but did update a sample one for legalizeMinNumMaxNum. This is in preparation for moving AMDGPU load/store legalization entirely into custom lowering. The current set of legalization actions is really constraining and not really capable of expressing all the actions needed to legalize loads/stores. In particular there's no way to express when the memory access itself needs to change size vs. the result type. There's also a lot of redundancy since the same split/widen actions need to be applied in both vector and scalar cases. All of the sub-cases logically belong as steps in the legalizer helper, but it will be easier to consider everything at once in custom lowering.	2020-06-18 17:17:38 -04:00
Christopher Tetreault	8d11ec66b6	[SVE] Remove calls to VectorType::getNumElements from Transforms/Utils Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057	2020-06-18 13:39:14 -07:00
Alexandre Ganea	2ae0df5be7	[CodeView] Revert `8374bf4363` and `403f953792` This reverts: `8374bf4363` [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. `403f953792` [CodeView] Add full repro to LF_BUILDINFO record This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c	2020-06-18 16:18:46 -04:00
Kirill Naumov	41d53194fb	[BasicBlock] Added AnnotationWriter functionality to BasicBlock class This functionality is very similar to Function compatibility with AnnotationWriter. This change allows us to use AnnotationWriter with BasicBlock through BB.print() method. Reviewed-By: apilipenko Differential Revision: https://reviews.llvm.org/D81321	2020-06-18 19:49:58 +00:00
Sanjay Patel	46a285ad9e	[IRBuilder] add/use wrapper to create a generic compare based on predicate type; NFC The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.	2020-06-18 15:47:06 -04:00
Davide Italiano	8cdd2a158c	[SimplifyCFG] Update debug location when folding branch to common destination Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062	2020-06-18 12:33:32 -07:00
Michael Liao	2defe55722	[TTI] Expose isNoopAddrSpaceCast in TTI. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82025	2020-06-18 14:40:47 -04:00
serge-sans-paille	4dd332723d	Fix return status of LoopDistribute Move code that may update the IR after precondition, so that if precondition fail, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225	2020-06-18 20:13:18 +02:00
Matt Arsenault	779cba79ec	AMDGPU: Remove mayLoad/mayStore from some side effecting intrinsics These don't really modify any memory, and should not expect memory operands.	2020-06-18 14:12:19 -04:00
Stanislav Mekhanoshin	6c7e1b16fa	[AMDGPU] Added new encoding to getMCOpcodeGen Nothing breaks yet, but all encodings shall be in the map. Differential Revision: https://reviews.llvm.org/D81974	2020-06-18 10:11:33 -07:00
Arthur Eubanks	91ef930526	[GlobalOpt] Remove preallocated calls when possible When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951	2020-06-18 09:56:13 -07:00
Alexandros Lamprineas	ecdf48f15b	[ARM] Basic bfloat support This patch adds basic support for BFloat in the Arm backend. For now the code generation relies on fullfp16 being present. Briefly: * adds the bfloat scalar and vector types in the necessary register classes, * adjusts the calling convention to cope with bfloat argument passing and return, * adds codegen patterns for moves, loads and stores. It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy). The following people contributed to this patch: * Alexandros Lamprineas * Ties Stuij Differential Revision: https://reviews.llvm.org/D81373	2020-06-18 17:26:24 +01:00
Simon Pilgrim	2474421398	[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.	2020-06-18 16:41:08 +01:00
Matt Arsenault	6f09bb7da2	AMDGPU: Don't pass MachineFunction if only the IR Function is used	2020-06-18 11:06:46 -04:00
Ayke van Laethem	b4c91462e8	[AVR] Fix miscompilation of zext + add Code like the following: define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) { entry: %conv = zext i1 %b to i32 %add = add nsw i32 %conv, %a ret i32 %add } Would compile to the following (incorrect) code: foo: mov r18, r20 clr r19 add r22, r18 adc r23, r19 sbci r24, 0 sbci r25, 0 ret Those sbci instructions are clearly wrong, they should have been adc instructions. This commit improves codegen to use adc instead: foo: mov r18, r20 clr r19 ldi r20, 0 ldi r21, 0 add r22, r18 adc r23, r19 adc r24, r20 adc r25, r21 ret This code is not optimal (it could be just 5 instructions instead of the current 9) but at least it doesn't miscompile. Differential Revision: https://reviews.llvm.org/D78439	2020-06-18 16:51:37 +02:00
Matt Arsenault	243303f8d7	Lanai: Remove unused method This was depending on the MachineFunction at MachineFunctionInfo construction, which will soon be disallowed.	2020-06-18 10:48:14 -04:00
Simon Pilgrim	fe0a85faf4	[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) == -1 -> PTESTZ(X,X) Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns to be replaced with PTESTZ This is a preliminary patch before properly handling PR35129	2020-06-18 15:38:32 +01:00
Alexandre Ganea	8374bf4363	[CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.	2020-06-18 10:07:30 -04:00
Kamlesh Kumar	7622ea5835	[RISCV64] Emit correct lib call for fp(float/double) to ui/si Since i32 is not legal in riscv64, it is always promoted to i64 before emitting the lib call, and for conversions like float/double to int and float/double to unsigned int the wrong lib call was emitted. This commit fixes it using custom lowering. Differential Revision: https://reviews.llvm.org/D80526	2020-06-18 19:34:16 +05:30
Igor Kudrin	6853cc7221	[MC] Rename a misnamed function. NFC. The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to better reflect an expression it creates and fix a naming style issue. Differential Revision: https://reviews.llvm.org/D82079	2020-06-18 20:18:19 +07:00
Alexandre Ganea	403f953792	[CodeView] Add full repro to LF_BUILDINFO record This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-06-18 09:17:15 -04:00
Alexandre Ganea	24eff42ba4	[CodeView] Add TypeCollection::replaceType to replace type records post-merging The API is not called in this patch. This is to simplify/support https://reviews.llvm.org/D80833	2020-06-18 09:17:14 -04:00
Alexandre Ganea	a45409d885	[Clang] Move clang::Job::printArg to llvm::sys::printArg. NFCI. This patch is to support/simplify https://reviews.llvm.org/D80833	2020-06-18 09:17:13 -04:00
Florian Hahn	1669fddc9f	[Matrix] Use alignment info when lowering loads/stores. This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960	2020-06-18 13:19:31 +01:00
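The per-column alignment rule described in `1669fddc9f` can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the LowerMatrixIntrinsics code: `common_alignment` models the idea behind LLVM's `commonAlignment` (the largest power of two dividing both the alignment and the offset), and `column_alignment`, with its parameters, is a hypothetical helper. Strides are assumed to be counted in elements.

```python
def common_alignment(align: int, offset: int) -> int:
    """Largest power of two that divides both align and offset.

    align must be a power of two; an offset of 0 leaves align unchanged.
    """
    assert align > 0 and align & (align - 1) == 0, "align must be a power of two"
    combined = align | offset
    return combined & -combined  # lowest set bit of (align | offset)


def column_alignment(initial_align, column, elem_size, stride=None):
    """Alignment usable for the load/store of one column.

    stride is in elements; None models a stride that is not a constant.
    """
    if column == 0:
        # The first column can always use the specified alignment.
        return initial_align
    if stride is None:
        # Unknown stride: only the element size is guaranteed.
        return common_alignment(initial_align, elem_size)
    # Constant stride: byte offset of the start of this column.
    return common_alignment(initial_align, column * stride * elem_size)
```

For example, with 8-byte elements, an initial alignment of 16 and a constant stride of 3 elements, columns 0, 1 and 2 start at byte offsets 0, 24 and 48 and get alignments 16, 8 and 16.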
Lucas Prates	92ad6d57c2	[ARM] Moving CMSE handling of half arguments and return to the backend Summary: As half-precision floating point arguments and returns were previously coerced to either float or int32 by clang's codegen, the CMSE handling of those was also performed in clang's side by zeroing the unused MSBs of the coerced values. This patch moves this handling to the backend's calling convention lowering, making sure the high bits of the registers used by half-precision arguments and returns are zeroed. Reviewers: chill, rjmccall, ostannard Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81428	2020-06-18 13:16:29 +01:00
Lucas Prates	a255931c40	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, such as causing arguments to be stored on the wrong bits on big-endian architectures and missing overflow detection in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored on the proper bits of the registers and performing the necessary floating point conversions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
Paul Walker	4612f39120	[SVE] Add flag to specify SVE register size, using this to calculate legal vector types. Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE data registers (in bits) to be specified. This allows the code generator to make assumptions it normally couldn't. As a starting point this information is used to mark fixed length vector types that can fit within the specified size as legal. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80384	2020-06-18 12:11:16 +00:00
Sameer Sahasrabuddhe	7aad220795	[DA] conservatively mark the join of every divergent branch For a loop, a join block is a block that is reachable along multiple disjoint paths from the exiting block of a loop. If the exit condition of the loop is divergent, then such join blocks must also be marked divergent. This currently fails in some cases because not all join blocks are identified correctly. The workaround is to conservatively mark every join block of any branch (not necessarily the exiting block of a loop) as divergent. https://bugs.llvm.org/show_bug.cgi?id=46372 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81806	2020-06-18 17:39:20 +05:30
Florian Hahn	d88acd8f7d	[Matrix] Preserve volatile when lowering loads/stores. Currently the matrix lowering turns volatile loads/stores into non-volatile ones. This patch updates the lowering to preserve the volatile bit. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D81498	2020-06-18 12:14:19 +01:00
Jeremy Morse	3626eba11f	[NFC][LiveDebugValues] Document how LiveDebugValues operates We're missing a plain English explanation of how this pass is supposed to operate -- add one to the file comment. Differential Revision: https://reviews.llvm.org/D80929	2020-06-18 10:54:09 +01:00
Ayke van Laethem	15bf42d503	[AVR] Implement disassembly of 32-bit instructions This needed two fixes: * 32-bit instructions were read in the wrong order. The machine code swaps the two 16-bit instruction words, which wasn't undone when decoding instructions. * Jump and call instructions don't encode the lowest address bit, which is always zero. Therefore, the address needed to be shifted by one to fix that. Differential Revision: https://reviews.llvm.org/D81961	2020-06-18 11:26:58 +02:00
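The two fixes listed in `15bf42d503` can be illustrated with a small Python sketch. It is not the AVR disassembler code: the function names are hypothetical, and the bit layout used for `jmp`/`call` (`1001 010k kkkk 11xk` followed by sixteen more `k` bits, giving a 22-bit word address) is an assumption taken from the AVR instruction set encoding rather than from the commit.

```python
import struct


def read_insn32(data: bytes) -> int:
    """Combine two little-endian 16-bit words into one 32-bit value.

    The word that comes first in memory holds the opcode and becomes the
    high half, which undoes the word swap present in the machine code.
    """
    first, second = struct.unpack_from("<HH", data)
    return (first << 16) | second


def jmp_call_target(insn: int) -> int:
    """Byte address targeted by a 32-bit jmp/call instruction.

    Assumed encoding: 1001 010k kkkk 11xk kkkk kkkk kkkk kkkk.
    """
    k = (
        (((insn >> 20) & 0x1F) << 17)   # k[21:17]
        | (((insn >> 16) & 0x1) << 16)  # k[16]
        | (insn & 0xFFFF)               # k[15:0]
    )
    # The lowest address bit is always zero and is not encoded,
    # so shift the word address left by one to get the byte address.
    return k << 1
```

Under these assumptions, the byte sequence `0c 94 34 00` reads as `0x940C0034` and decodes to a jump to byte address `0x68`.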
David Sherwood	7e30ef77f6	[CodeGen] Fix warnings in getVectorTypeBreakdown Added NextPowerOf2() routine to TypeSize and rewritten the code in getVectorTypeBreakdown to avoid warnings being generated. Differential Revision: https://reviews.llvm.org/D81578	2020-06-18 09:54:16 +01:00
Florian Hahn	6d18c2067e	[Matrix] Update load/store intrinsics. This patch adjusts the load/store matrix intrinsics, formerly known as llvm.matrix.columnwise.load/store, to improve the naming and allow passing of extra information (volatile). The patch performs the following changes: * Rename columnwise.load/store to column.major.load/store. This is more expressive and also more in line with the naming in Clang. * Changes the stride arguments from i32 to i64. The stride can be larger than i32 and this makes things more uniform with the way things are handled in Clang. * A new boolean argument is added to indicate whether the load/store is volatile. The lowering respects that when emitting vector load/store instructions * MatrixBuilder is updated to require both Alignment and IsVolatile arguments, which are passed through to the generated intrinsic. The alignment is set using the `align` attribute. The changes are grouped together in a single patch, to have a single commit that breaks the compatibility. We probably should be fine with updating the intrinsics, as we did not yet officially support them in the last stable release. If there are any concerns, we can add auto-upgrade rules for the columnwise intrinsics though. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache, rjmccall, ftynse Reviewed By: anemet, nicolasvasilache Differential Revision: https://reviews.llvm.org/D81472	2020-06-18 09:44:52 +01:00
David Sherwood	65912a9768	[CodeGen] Fix warnings in foldCONCAT_VECTORS Instead of asserting the number of elements is the same, we should be comparing the element counts instead. In addition, when looking at concats of extract_subvectors it's fine to use getVectorMinNumElements() for scalable vectors. I discovered these warnings when compiling the structured loads tests in this file: test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D81936	2020-06-18 09:29:37 +01:00
serge-sans-paille	f9c7e3136e	Correctly report modified status for HWAddressSanitizer Differential Revision: https://reviews.llvm.org/D81238	2020-06-18 10:27:44 +02:00
David Green	158e734af1	[ARM] Adjust AND/OR combines to not call isConstantSplat on i1 vectors. NFC. This rearranges PerformANDCombine and PerformORCombine to try and make sure we don't call isConstantSplat on any i1 vectors. As pointed out in D81860 it may not be very well defined in those cases.	2020-06-18 08:25:44 +01:00
Kristof Beyls	832cfc7672	[IndirectThunks] Make generated MF structure as expected by all instruction selectors. This also enables running the AArch64 SLSHardening pass with GlobalISel, so add a test for that. Differential Revision: https://reviews.llvm.org/D81403	2020-06-18 06:44:53 +01:00
Kristof Beyls	3f0cc96a96	[AArch64] SLSHardening: compute correct thunk name for X29. The enum values for AArch64 registers are not all consecutive. Therefore, the computation "__llvm_slsblr_thunk_x" + utostr(Reg - AArch64::X0) is not always correct. utostr(Reg - AArch64::X0) will not generate the expected string for the registers that do not have consecutive values in the enum. This happened to work for most registers, but does not for AArch64::FP (i.e. register X29). This can get triggered when the X29 is not used as a frame pointer. Differential Revision: https://reviews.llvm.org/D81997	2020-06-18 06:36:49 +01:00
Xing GUO	d261a1c0e0	[DWARFYAML][debug_abbrev] Make the abbreviation code optional. This patch helps make the `Code` optional in abbreviations table. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D81826	2020-06-18 13:02:54 +08:00
Mehdi Amini	77b79d79c0	Remove "unused" member ModuleSlice from `struct OpenMPOpt` This is fixing warning from clang: warning: private field 'ModuleSlice' is not used [-Wunused-private-field] SmallPtrSetImpl<Function *> &ModuleSlice; ^ Differential Revision: https://reviews.llvm.org/D82027	2020-06-18 03:02:26 +00:00
Kang Zhang	58e19d465a	[PowerPC] Don't convert Loop to CTR Loop for fp128 BinaryOperator Summary: On PPC, a BinaryOperator of fp128 becomes a libcall, and we shouldn't convert a loop to a CTR loop if the loop contains a libcall. But currently the PPCTTIImpl::mightUseCTR() function only handles BinaryOperator for ppc_fp128, not for fp128. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81353	2020-06-18 02:54:19 +00:00
Xing GUO	1f391afbf4	[ObjectYAML][ELF] Add support for emitting the .debug_abbrev section. This patch enables yaml2elf to emit the .debug_abbrev section. The generated .debug_abbrev is verified using `llvm-dwarfdump`. Known issues that will be addressed later: - Current implementation doesn't support generating multiple abbreviation tables in one .debug_abbrev section. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D81820	2020-06-18 10:50:38 +08:00
Esme-Yi	ad6024e29f	[PowerPC] Custom lower rotl v1i128 to vector_shuffle. Summary: A bug is reported in bugzilla-45628, where the swap_with_shift case can’t be matched to a single HW instruction xxswapd as expected. In fact the case matches the idiom of rotate. We have MatchRotate to handle an ‘or’ of two operands and generate a rot[lr] if the case matches the idiom of rotate. However, PPC doesn’t support ROTL v1i128, so we custom lower ROTL v1i128 to a vector_shuffle. The vector_shuffle will be matched to a single HW instruction during the phase of instruction selection. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81076	2020-06-18 01:32:23 +00:00
Sam Clegg	7ee758d691	[WebAssembly] MC: Fix for data aliases with offsets (getelementptr) For some reason we hadn't seen such cases in the wild which makes me think that clang and rustc don't generate these. In the bug which reproduces it only occurs with LTO so my guess is that some LTO pass is creating this alias + gep. See: https://github.com/emscripten-core/emscripten/issues/8731 Differential Revision: https://reviews.llvm.org/D79462	2020-06-17 16:25:50 -07:00
Matt Arsenault	5f5f566b26	AMDGPU: Don't use 16-bit FP inline constants in integer operands It seems to be a hardware defect that the half inline constants do not work as expected for the 16-bit integer operations (the inverse does work correctly). Experimentation seems to show these are really reading the 32-bit inline constants, which can be observed by writing inline asm using op_sel to see what's in the high half of the constant. Theoretically we could fold the high halves of the 32-bit constants using op_sel. The *_asm_all.s MC tests are broken, and I don't know where the script to autogenerate these are. I started manually fixing it, but there's just too many cases to fix. This also does break the assembler/disassembler support for these values, and I'm not sure what to do about it. These are still valid encodings, so it seems like you should be able to use them in some way. If you wrote assembly using them, you could have really meant it (perhaps to read the high bits with op_sel?). The disassembler will print the invalid literal constant which will fail to re-assemble. The behavior is also different depending on the use context. Consider this example, which was previously accepted and encoded using the inline constant: v_mad_i16 v5, v1, -4.0, v3 ; encoding: [0x05,0x00,0xec,0xd1,0x01,0xef,0x0d,0x04] In contexts where an inline immediate is required (such as on gfx8/9), this will now be rejected. For gfx10, this will produce the literal encoding and change the printed format: v_mad_i16 v5, v1, 0xc400, v3 ; encoding: [0x05,0x00,0x5e,0xd7,0x01,0xff,0x0d,0x04,0x00,0xc4,0x00,0x00] This is just another variation of the issue that we don't perfectly handle round trip assembly/disassembly due to not tracking how immediates were encoded. This doesn't matter much in practice, since compilers don't emit the suboptimal encoding. I doubt any users are relying on this behavior (although I did make use of the old behavior to figure out what was wrong). Fixes bug 46302.	2020-06-17 19:14:10 -04:00
Yonghong Song	89648eb16d	[BPF] fix a bug for BTF pointee type pruning In BTF, pointee type pruning is used to reduce cluttering too many unused types into prog BTF. For example, struct task_struct { ... struct mm_struct mm; ... } If bpf program does not access members of "struct mm_struct", there is no need to bring types for "struct mm_struct" to BTF. This patch fixed a bug where an incorrect pruning happened. The test case like below: struct t; typedef struct t _t; struct s1 { _t c; }; int test1(struct s1 arg) { ... } struct t { int a; int b; }; struct s2 { _t c; } int test2(struct s2 arg) { ... } After processing test1(), among others, BPF backend generates BTF types for "struct s1", "_t" and a placeholder for "struct t". Note that "struct t" is not really generated. If later a direct access to "struct t" member happened, "struct t" BTF type will be generated properly. During processing test2(), when processing member type "_t c", BPF backend sees type "_t" already generated, so returned. This caused the problem that "struct t" BTF type is never generated and eventually causing incorrect type definition for "struct s2". To fix the issue, during DebugInfo type traversal, even if a typedef/const/volatile/restrict derived type has been recorded in BTF, if it is not a type pruning candidate, type traversal of its base type continues. Differential Revision: https://reviews.llvm.org/D82041	2020-06-17 15:13:46 -07:00
Eric Christopher	a8dad30388	Revert "Remove unused class variable ModuleSlice." as it was used in debug only code. This reverts commit `07a1749081`.	2020-06-17 14:45:17 -07:00
Eric Christopher	07a1749081	Remove unused class variable ModuleSlice.	2020-06-17 14:33:29 -07:00
Christopher Tetreault	8819202dfd	[SVE] Eliminate bad VectorType::getNumElements() calls from ConstantFold Summary: Assume all usages of this function are explicitly fixed-width operations and cast to FixedVectorType Reviewers: efriedma, sdesmalen, c-rhodes, majnemer, dblaikie Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80262	2020-06-17 14:19:56 -07:00
Christopher Tetreault	4b776a98f1	[SVE] Fix invalid usages of getNumElements in ShuffleVectorInstruction Summary: Fix invalid usages of getNumElements identified by test case LLVM.Transforms/InstCombine::vscale_extractelement.ll. changesLength: Since the length of the llvm::SmallVector shufflemask is related to the minimum number of elements in a scalable vector, it is fine to just get the Min field of the ElementCount isIdentityWithExtract: Since it is not possible to express the mask needed for this pattern for scalable vectors, we can just bail before calling getNumElements() Reviewers: efriedma, sdesmalen, fpetrogalli, gchatelet, yrouban, craig.topper Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81969	2020-06-17 13:45:34 -07:00
Roman Lebedev	84b4f5a6a6	[InstCombine] Negator: while there, add detection for cycles during negation I don't have any testcases showing it happening, and i haven't succeeded in creating one, but i'm also not positive it can't ever happen, and i recall having something that looked like that in the very beginning of Negator creation. But since we now already have a negation cache, we can now detect such cases practically for free. Let's do so instead of "relying" on stack overflow :D	2020-06-17 22:47:20 +03:00
Roman Lebedev	e3d8cb1e1d	[InstCombine] Negator: cache negation results (PR46362) It is possible that we can try to negate the same value multiple times. For example, PHI nodes may happen to have multiple incoming values (all of which must be the same value) for the same incoming basic block. It may happen that we try to negate such a PHI node, and succeed, and that might result in having now-different incoming values.. To avoid that, and in general to reduce the amount of duplicated work we might be doing, let's introduce a cache where we'll track results of negating each value. The added test was previously failing -verify after -instcombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=46362	2020-06-17 22:47:20 +03:00
Roman Lebedev	c4166f3d84	[NFC][InstCombine] Negator: add thin negate() wrapped before visit()	2020-06-17 22:47:20 +03:00
Roman Lebedev	2b85147337	[NFC][InstCombine] Negator: do not include unneeded "llvm/IR/DerivedTypes.h" header	2020-06-17 22:47:19 +03:00
Thomas Lively	49754dcf22	[WebAssembly] Fix bug in FixBrTables and use branch analysis utils Summary: This commit fixes a bug in the FixBrTables pass in which an unconditional branch from the switch header block to the jump table block was not removed before the blocks were combined. The result was an invalid CFG in the MachineFunction. This commit also switches from using bespoke branch analysis and deletion code to using the standard utilities for the same. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81909	2020-06-17 12:34:45 -07:00
Nick Desaulniers	e7816f263b	[InlineSpiller] add assert about spills post terminators Summary: This invariant is being violated in the test case https://reviews.llvm.org/D77849, related to the use of the relatively new ability for callbr to have return values, and MachineBasicBlocks with INLINEASM_BR terminators to emit live out register defs. As noted in the comment, this triggers invariant violations in MachineVerifier via `llc -verify-machineinstrs` or `llc -verify-regalloc`, since only MachineInstrs that are terminators are allowed to follow the first terminator. https://reviews.llvm.org/D75098 may rework this very assertion if we're spilling via a (proposed) TCOPY MachineInstr. Reviewers: void, efriedma, arsenm Reviewed By: efriedma Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78166	2020-06-17 11:51:58 -07:00
Nick Desaulniers	88c965ba14	BreakCriticalEdges for callbr indirect dests Summary: llvm::SplitEdge was failing an assertion that the BasicBlock only had one successor (for BasicBlocks terminated by CallBrInst, we typically have multiple successors). It was surprising that the earlier call to SplitCriticalEdge did not handle the critical edge (there was an early return). Removing that triggered another assertion relating to creating a BlockAddress for a BasicBlock that did not (yet) have a parent, which is a simple order of operations issue in llvm::SplitCriticalEdge (a freshly constructed BasicBlock must be inserted into a Function's basic block list to have a parent). Thanks to @nathanchance for the report. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018 Reviewers: craig.topper, jyknight, void, fhahn, efriedma Reviewed By: efriedma Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D81607	2020-06-17 11:45:06 -07:00
Davide Italiano	1cbaf847ab	[CGP] Reset the debug location when promoting zext(s). When the zext gets promoted, it used to retain the original location, which pessimizes the debugging experience causing an unexpected jump in stepping at -Og. Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro). Differential Revision: https://reviews.llvm.org/D81437	2020-06-17 11:13:13 -07:00
Ian Levesque	7c7c8e0da4	[xray] Option to omit the function index Summary: Add a flag to omit the xray_fn_idx to cut size overhead and relocations roughly in half at the cost of reduced performance for single function patching. Minor additions to compiler-rt support per-function patching without the index. Reviewers: dberris, MaskRay, johnislarry Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81995	2020-06-17 13:49:01 -04:00
Alexandre Ganea	acb30f6856	[X86] For 32-bit targets, emit two-byte NOP when possible In order to support hot-patching, we need to make sure the first emitted instruction in a function is a two-byte+ op. This is already the case on x86_64, which seems to always emit two-byte+ ops. However on 32-bit targets this wasn't the case. PATCHABLE_OP now lowers to a XCHG AX, AX, (66 90) like MSVC does. However when targeting pentium3 (/arch:SSE) or i386 (/arch:IA32) targets, we generate MOV EDI,EDI (8B FF) like MSVC does. This is for compatibility reasons with older tools that rely on this two byte pattern. Differential Revision: https://reviews.llvm.org/D81301	2020-06-17 13:44:38 -04:00
Alexandre Ganea	ad879b31f0	[X86] Change signature of EmitNops. NFC. This is to support https://reviews.llvm.org/D81301.	2020-06-17 13:44:37 -04:00
Fangrui Song	c8b082a3ab	[llvm-cov gcov] Support clang<11 fake 4.2 format Test cases are restored from `a3bed4bd37`	2020-06-17 10:17:15 -07:00
Michał Górny	5c621900a6	[llvm] [CommandLine] Do not suggest really hidden opts in nearest lookup Skip 'really hidden' options when performing lookup of the nearest option when invalid option was passed. Since these options aren't even documented in --help-hidden, it seems inconsistent to suggest them to users. This fixes clang-tools-extra test failures due to unexpected suggestions when linking the tools to LLVM dylib (that provides more options than the subset of LLVM libraries linked directly). Differential Revision: https://reviews.llvm.org/D82001	2020-06-17 19:00:26 +02:00
Scott Linder	691ff4682f	[AMDGPU] Skip CFIInstructions in SIInsertWaitcnts Summary: CFI emitted during PEI at the beginning of the prologue needs to apply to any inserted waitcnts on function entry. Reviewers: arsenm, t-tye, RamNalamothu Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D76881	2020-06-17 12:41:03 -04:00
vnalamot	2e28009981	[NFC] Move getAll{S,V}GPR{32,128} methods to SIFrameLowering Summary: Future patch needs some of these in multiple places. The definitions of these can't be in the header and be eligible for inlining without making the full declaration of GCNSubtarget visible. I'm not sure what the right trade-off is, but I opted to not bloat SIRegisterInfo.h Reviewers: arsenm, cdevadas Reviewed By: arsenm Subscribers: RamNalamothu, qcolombet, jvesely, wdng, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79878	2020-06-17 12:08:09 -04:00
sstefan1	7cfd267c51	[OpenMPOPT][NFC] Introducing OMPInformationCache. Summary: Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them. Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81798	2020-06-17 16:56:45 +02:00
Jay Foad	def2e4c47f	[AMDGPU] Simplify GCNPassConfig::addOptimizedRegAlloc. NFC.	2020-06-17 15:56:15 +01:00
Simon Pilgrim	a5f1f9c9b8	ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency. Add implicit header dependency to source files where necessary.	2020-06-17 15:48:23 +01:00
Sjoerd Meijer	d1522513d4	[ARM] Reimplement MVE Tail-Predication pass using @llvm.get.active.lane.mask To set up a tail-predicated loop, we need to calculate the number of elements processed by the loop. We can now use intrinsic @llvm.get.active.lane.mask() to do this, which is emitted by the vectoriser in D79100. This intrinsic generates a predicate for the masked loads/stores, and consumes the Backedge Taken Count (BTC) as its second argument. We can now use that to reconstruct the loop tripcount, instead of the IR pattern match approach we were using before. Many thanks to Eli Friedman and Sam Parker for all their help with this work. This also adds overflow checks for the different, new expressions that we create: the loop tripcount, and the sub expression that calculates the remaining elements to be processed. For the latter, SCEV is not able to calculate precise enough bounds, so we work around that at the moment; the workaround is not entirely correct yet, but it is conservative. The overflow checks can be overruled with a force flag, which is thus potentially unsafe (but not really because the vectoriser is the only place where this intrinsic is emitted at the moment). It's also good to mention that the tail-predication pass is not yet enabled by default. We will follow up to see if we can implement these overflow checks better, either by a change in SCEV or we may want to revise the definition of llvm.get.active.lane.mask. Differential Revision: https://reviews.llvm.org/D79175	2020-06-17 15:17:42 +01:00
Kirill Naumov	ea844c7520	Revert "[InlineCost] InlineCostAnnotationWriterPass introduced" This reverts commit `37e06e8f5c`.	2020-06-17 14:02:34 +00:00
Kirill Naumov	dcf2a9f2ee	Revert "[InlineCost] PrinterPass prints constants to which instructions are simplified" This reverts commit `52b0db22f8`.	2020-06-17 14:02:29 +00:00
Kirill Naumov	39a4505e34	Revert "[InlineCost] GetElementPtr with constant operands" This reverts commit `34fba68d80`.	2020-06-17 14:02:18 +00:00
Kirill Naumov	34fba68d80	[InlineCost] GetElementPtr with constant operands If the GEP instruction contains only constants as its arguments, then it should be recognized as a constant. For now, there was also added a flag to turn off this simplification if it causes any regressions ("disable-gep-const-evaluation") which is off by default. Once I gather needed data of the effectiveness of this simplification, the flag will be deleted. Reviewers: apilipenko, davidxl, mtrofin Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81026	2020-06-17 13:40:19 +00:00
Kirill Naumov	52b0db22f8	[InlineCost] PrinterPass prints constants to which instructions are simplified This patch enables printing of constants to see which instructions were constant-folded. Needed for tests and better visual analysis of inliner's work. Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81024	2020-06-17 13:40:18 +00:00
Kirill Naumov	37e06e8f5c	[InlineCost] InlineCostAnnotationWriterPass introduced This class allows to see the inliner's decisions for better optimization verifications and tests. To use, use flag "-passes="print<inline-cost>"". Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential revision: https://reviews.llvm.org/D81743	2020-06-17 13:40:17 +00:00
Benjamin Kramer	df9a51dab3	Remove global std::strings. NFCI.	2020-06-17 14:29:42 +02:00
Sjoerd Meijer	c1034d044a	Follow up of rGe345d547a0d5, and attempt to pacify buildbot: "error: 'get' is deprecated: The base class version of get with the scalable argument defaulted to false is deprecated." Changed VectorType::get() -> FixedVectorType::get().	2020-06-17 13:24:09 +01:00
Sjoerd Meijer	e345d547a0	Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops" Fixed ARM regression test. Please see the original commit message rG47650451738c for details.	2020-06-17 13:12:15 +01:00
David Green	076e08aa45	[LSR] Filter for postinc formulae In more complicated loops we can easily hit the complexity limits of loop strength reduction. If we do and filtering occurs, it's all too easy to remove the wrong formulae for post-inc preferring accesses due to it attempting to maximise register re-use. The patch adds an alternative filtering step when the target is preferring postinc to pick postinc formulae instead, hopefully lowering the complexity to below the limit so that aggressive filtering is not needed. There is also a change in here to stop considering existing addrecs as free under postinc. We should already be modelling them as a reg so don't want it to cause us to get the cost wrong. (I'm not sure that code makes sense in general, but there are X86 tests specifically for it where it seems to be helping so have left it around for the standard non-post-inc case). Differential Revision: https://reviews.llvm.org/D80273	2020-06-17 12:32:04 +01:00
Carl Ritson	ac8a2f132b	[AMDGPU] Fix failure in VCC spilling Spills of VCC (SGPR64) will fail with new SGPR spill code, because super register is not correctly resolved. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81224	2020-06-17 20:11:15 +09:00
Benjamin Kramer	547b6da73c	[CallPrinter] Remove static constructor. No need to have std::string here. NFC.	2020-06-17 13:02:58 +02:00
Sam Parker	5bf0858c0b	Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant" I originally reverted the patch because it was causing performance issues, but now I think it's just enabling simplify-cfg to do something that I don't want instead :) Sorry for the noise. This reverts commit `3e39760f8e`.	2020-06-17 11:38:59 +01:00
Paul Walker	95db1e7fb9	[FileCheck] Implement * and / operators for ExpressionValue. Subscribers: arichardson, hiraditya, thopre, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80915	2020-06-17 09:39:17 +00:00
Hans Wennborg	16ad6eeb94	[IR] Don't copy profile metadata in createCallMatchingInvoke() The invoke instruction can have profile metadata with branch_weights, which does not make sense for a call instruction and will be rejected by the verifier. Differential revision: https://reviews.llvm.org/D81996	2020-06-17 11:18:23 +02:00
serge-sans-paille	1cafd8a5d1	Fix LoopIdiomRecognize pass return status Introduce a helper class to aggregate the cleanup in case of rollback. Differential Revision: https://reviews.llvm.org/D81230	2020-06-17 11:12:03 +02:00
Sjoerd Meijer	d4e183f686	Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops" This reverts commit `4765045173` while I investigate the build bot failures.	2020-06-17 10:09:54 +01:00
Max Kazantsev	4ac9a6902f	[NFC] Add API for edge domination check in dom tree	2020-06-17 16:05:05 +07:00
Florian Hahn	773353be4e	[SCCP] Move common code to simplify basic block to helper (NFC). Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81755	2020-06-17 10:03:43 +01:00
Sjoerd Meijer	4765045173	[LV] Emit @llvm.get.active.mask for tail-folded loops This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised loops if the intrinsic is supported by the backend, which is checked by querying TargetTransform hook emitGetActiveLaneMask. This intrinsic creates a mask representing active and inactive vector lanes, which is used by the masked load/store instructions that are created for tail-folded loops. The semantics of @llvm.get.active.mask are described here in LangRef: https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics This intrinsic is also used to provide a hint to the backend. That is, the second argument of the intrinsic represents the back-edge taken count of the loop. For MVE, for example, we use that to set up tail-predication, which is a new form of predication in MVE for vector loops that implicitly predicates the last vector loop iteration by implicitly setting active/inactive lanes, i.e. the tail loop is predicated. In order to set up a tail-predicated vector loop, we need to know the number of data elements processed by the vector loop, which corresponds to the tripcount of the scalar loop, which we can now reconstruct using @llvm.get.active.mask. Differential Revision: https://reviews.llvm.org/D79100	2020-06-17 09:53:58 +01:00
Sjoerd Meijer	20835cff27	[TTI] Refactor emitGetActiveLaneMask Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments as suggested in D79100.	2020-06-17 09:53:58 +01:00
Kirill Bobyrev	3847737fa4	[CallPrinter] Handle freq = 0 case Improvement of the following revision: `bbc629ebd6` This might still be problematic if freq = 0, so it's better to check for that.	2020-06-17 10:52:18 +02:00
Kirill Bobyrev	bbc629ebd6	[CallPrinter] Fix maxFreq = 0 case llvm::getHeatColor becomes a problem when maxFreq = 0 -> freq = 0 => log2(double(freq)) / log2(maxFreq) -> log2(0.) / log2(0.) which results in illegal instruction on some architectures. Problematic revision: https://reviews.llvm.org/D77172	2020-06-17 10:44:28 +02:00
Florian Hahn	e4b58ea8c1	[MemDep] Also remove load instructions from NonLocalDesCache. Currently load instructions are added to the cache for invariant pointer group dependencies, but only pointer values are removed currently. That leads to dangling AssertingVHs in the test case below, where we delete a load from an invariant pointer group. We should also remove the entries from the cache. Fixes PR46054. Reviewers: efriedma, hfinkel, asbirlea Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81726	2020-06-17 09:36:53 +01:00
James Henderson	b21794a91c	[DebugInfo] Unify Cursor usage for all debug line opcodes This is a natural extension of the previous changes to use the Cursor class independently in the standard and extended opcode paths, and in turn allows delaying error handling until the entire line has been printed in verbose mode, removing interleaved output in some cases. Reviewed by: MaskRay, JDevlieghere Differential Revision: https://reviews.llvm.org/D81562	2020-06-17 09:19:24 +01:00
Vitaly Buka	d812efb121	[SafeStack,NFC] Fix names after files move Summary: Depends on D81831. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81832	2020-06-17 01:08:40 -07:00
Vitaly Buka	6754a0e2ed	[SafeStack,NFC] Move SafeStackColoring code Summary: This code is going to be used in StackSafety. This patch is file move with minimal changes. Identifiers will be fixed in the followup patch. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81831	2020-06-17 01:07:47 -07:00
Jonas Paulsson	d3f7448e3c	[SystemZ] Bugfix in storeLoadCanUseBlockBinary(). Check that the MemoryVT of LoadA matches that of LoadB. This fixes https://bugs.llvm.org/show_bug.cgi?id=46239. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D81671	2020-06-17 09:49:31 +02:00
Kang Zhang	c2574dc9f7	[NFC][PowerPC] Remove unused intrinsic for old CTR loop pass Summary: In the patch D62907 the PPC CTRLoops pass has been replaced by Generic Hardware Loop pass, and it has imported some new intrinsics for Generic Hardware Loop. The old intrinsics used in PPC CTRLoops, int_ppc_mtctr and int_ppc_is_decremented_ctr_nonzero, have been replaced by int_set_loop_iterations and loop_decrement. This patch removes the two unused intrinsics above. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D81539	2020-06-17 07:06:46 +00:00
Serge Pavlov	2e613d2ded	[Support] Get process statistics in ExecuteAndWait and Wait The functions sys::ExecuteAndWait and sys::Wait now have an additional argument of type pointer to structure, which is filled with process execution statistics upon process termination. These are total and user execution times and peak memory consumption. By default this argument is nullptr so existing users of these functions must not change behavior. Differential Revision: https://reviews.llvm.org/D78901	2020-06-17 13:39:59 +07:00
Igor Kudrin	ccbd7e8d46	[DebugInfo] Support parsing and dumping of DWARF64 macro units. Differential Revision: https://reviews.llvm.org/D81844	2020-06-17 12:57:54 +07:00
Sameer Sahasrabuddhe	d3963b3a5f	[DA] propagate loop live-out values that get used in a branch Values that are uniform within a loop but appear divergent to uses outside the loop are "tainted" so that such uses are marked divergent. But if such a use is a branch, then its divergence needs to be propagated. The simplest way to do that is to put the branch back in the main worklist so that it is processed appropriately. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81822	2020-06-17 09:21:00 +05:30
Itay Bookstein	df9d64ed9c	[IR] Add missing GlobalAlias copying of ThreadLocalMode attribute Summary: Previously, GlobalAlias::copyAttributesFrom did not preserve ThreadLocalMode, causing incorrect IR generation in IR linking flows. This patch pushes the code responsible for copying this attribute from GlobalVariable::copyAttributesFrom down to GlobalValue::copyAttributesFrom so that it is shared by GlobalAlias. Fixes PR46297. Reviewers: tejohnson, pcc, hans Reviewed By: tejohnson, hans Subscribers: hiraditya, ibookstein, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81605	2020-06-16 20:15:27 -07:00
Matt Arsenault	3b34f3fcca	AMDGPU/GlobalISel: Fix obvious bug in ported 32-bit udiv/urem This was hidden by the IR expansion in AMDGPUCodeGenPrepare, which I forgot to turn off.	2020-06-16 22:46:35 -04:00
Xing GUO	9aaa32cfcb	[ObjectYAML][DWARF] Let writeVariableSizedInteger() return Error. This patch helps change the return type of `writeVariableSizedInteger()` from `void` to `Error`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D81915	2020-06-17 09:30:14 +08:00
Matt Arsenault	c5c58fd6b5	AMDGPU: Remove intermediate DAG node for trig_preop intrinsic We weren't doing anything with this, and keeping it would just add more boilerplate for GlobalISel.	2020-06-16 21:06:25 -04:00
Christopher Tetreault	8e204f807b	[SVE] Generalize size checks in Verifier to use getElementCount Summary: Attempts to call getNumElements on scalable vectors identified by test LLVM.Other::scalable-vectors-core-ir.ll. Since these checks are all attempting to find if two vectors are the same size, calling getElementCount will only increase safety. Reviewers: efriedma, aprantl, reames, kmclaughlin, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81895	2020-06-16 16:03:36 -07:00
Aaron Smith	7e01675ea5	[SelectionDAG] Add MVT::bf16 to getConstantFP() Summary: This was probably overlooked in recent bfloat patches. Needed to handle bf16 constants in SelectionDAG. ConstantFP:bf16<APFloat(0)> Reviewers: stuij Reviewed By: stuij Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81779	2020-06-16 15:10:05 -07:00
Fangrui Song	7f7cb79b57	[llvm-cov gcov] Don't suppress .gcov output if .gcda is corrupted If .gcda is corrupted, gcov continues to produce a .gcov and just assumes execution counts are zeros. This is reasonable, because the program can corrupt its .gcda output. The code path should be similar to the code path without .gcda.	2020-06-16 14:55:38 -07:00
Daniel Sanders	e35ba09961	[gicombiner] Allow generated combiners to store additional members Summary: Adds the ability to add members to a generated combiner via a State base class. In the current AArch64PreLegalizerCombiner this is used to make Helper available without having to provide it to every call. As part of this, split the command line processing into a separate object so that it still only runs once even though the generated combiner is constructed more frequently. Depends on D81862 Reviewers: aditya_nandakumar, bogner, volkan, aemerson, paquette, arsenm Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81863	2020-06-16 14:47:04 -07:00
Kirill Naumov	369d00df60	[CallPrinter] Adding heat coloring to CallPrinter This patch introduces heat coloring for the Call Printer, based on the relative "hotness" of each function. The patch is part of a sequence of three patches related to graph heat coloring. Another feature added is a flag similar to "-cfg-dot-filename-prefix", which allows writing the graph into a named .pdf Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D77172	2020-06-16 21:15:29 +00:00
Fangrui Song	def2156389	[gcov] Add -i --intermediate-format Between gcov 4.9~8, `gcov -i $file` prints coverage information to $file.gcov in an intermediate text format (single file, instead of $source.gcov for each source file). lcov newer than 2019-05-24 detects -i support and uses it to increase processing speed. gcov 9 (GCC r265587) removed --intermediate-format and -i was changed to mean --json-format. However, we consider this format still useful and support it. geninfo (part of lcov) supports this format even if we announce that we are compatible with gcov 9.0.0	2020-06-16 14:14:28 -07:00
Fangrui Song	4cd7ba7eca	[gcov] Refactor llvm-cov gcov and add SourceInfo	2020-06-16 14:14:26 -07:00
Christopher Tetreault	616d8d942b	[SVE] Eliminate calls to default-false VectorType::get() from AArch64 Reviewers: efriedma, c-rhodes, david-arm, samparker, greened Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81518	2020-06-16 13:53:25 -07:00
Christopher Tetreault	b265cad93e	[NFC] Bail out for scalable vectors before calling getNumElements Summary: Move the bail out logic to before constructing the Result and Lane vectors. This is both potentially faster, and avoids calling getNumElements on a potentially scalable vector Reviewers: efriedma, sunfish, chandlerc, c-rhodes, fpetrogalli Reviewed By: fpetrogalli Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81619	2020-06-16 13:41:29 -07:00
Christopher Tetreault	747486991c	[SVE] Fix bad FixedVectorType cast in simplifyDivRem Summary: simplifyDivRem attempts to walk a VectorType elementwise. Ensure that it only does so for FixedVectorType Reviewers: efriedma, spatel, lebedev.ri, david-arm, kmclaughlin Reviewed By: spatel, david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81856	2020-06-16 13:17:05 -07:00
Christopher Tetreault	ff628f5f5e	[SVE] Eliminate calls to default-false VectorType::get() from Vectorize Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81521	2020-06-16 12:50:13 -07:00
Matt Arsenault	e4f19d1dda	GlobalISel: Fix not failing on widening G_INSERT_VECTOR_ELT This doesn't actually handle type idx 0, but was reporting Legalized on it. No test changes because nothing was trying to use this.	2020-06-16 15:48:57 -04:00
Ahsan Saghir	37e72f47a4	[PowerPC] Add -m[no-]power10-vector clang and llvm option Summary: This patch adds command line option for enabling power10-vector support. Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc Reviewed By: lei, amyk, #powerpc Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits Tags: #llvm, #clang, #powerpc Differential Revision: https://reviews.llvm.org/D80758	2020-06-16 14:47:35 -05:00
Matt Arsenault	8a3340d25d	GlobalISel: Use early return and reduce indentation	2020-06-16 14:47:08 -04:00
Stanislav Mekhanoshin	3f0c9c1634	Fix ubsan error in tblgen with signed left shift UBSAN complains when tblgen performs SHL of a negative value. Differential Revision: https://reviews.llvm.org/D81952	2020-06-16 11:15:09 -07:00
Hiroshi Yamauchi	6bc2b042f4	[TLI] Add four C++17 delete variants. Summary: delete(void, unsigned int, align_val_t) delete(void, unsigned long, align_val_t) delete[](void, unsigned int, align_val_t) delete[](void, unsigned long, align_val_t) Differential Revision: https://reviews.llvm.org/D81853	2020-06-16 11:12:02 -07:00
Sanjay Patel	ed67f5e7ab	[VectorCombine] scalarize compares with insertelement operand(s) Generalize scalarization (recently enhanced with D80885) to allow compares as well as binops. Similar to binops, we are avoiding scalarization of a loaded value because that could avoid a register transfer in codegen. This requires 1 extra predicate that I am aware of: we do not want to scalarize the condition value of a vector select. That might also invert a transform that we do in instcombine that prefers a vector condition operand for a vector select. I think this is the final step in solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D81661	2020-06-16 13:48:10 -04:00
Jessica Paquette	7caa9caa80	[AArch64][GlobalISel] Avoid creating redundant ubfx when selecting G_ZEXT When selecting 32 b -> 64 b G_ZEXTs, we don't have to always emit the extend. If the instruction feeding into the G_ZEXT implicitly zero extends the high half of the register, we can just emit a SUBREG_TO_REG instead. Differential Revision: https://reviews.llvm.org/D81897	2020-06-16 09:50:47 -07:00
Fangrui Song	4799fb63b5	[GlobalISel] Delete unused variable after r353432	2020-06-16 08:32:09 -07:00
Leandro Vaz	56262a74c3	Fix debug line info when line markers are present inside macros. Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in `.debug_line` section. This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D80381	2020-06-16 16:13:11 +01:00
Luke Geeson	10b6567f49	[AArch64]: BFloat MatMul Intrinsics&CodeGen This patch upstreams support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are provided as needed. AArch32 Intrinsics + CodeGen will come after this patch. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki, stuij Reviewed By: miyuki, stuij Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits, miyuki, chill, pbarrio, stuij Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80752 Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f	2020-06-16 15:23:30 +01:00
Luke Geeson	508a4764c0	[AArch64]: BFloat Load/Store Intrinsics&CodeGen This patch upstreams support for ld / st variants of BFloat intrinsics from __bf16 to AArch64. This includes IR intrinsics. Unittests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Luke Cheeseman Reviewers: fpetrogalli, SjoerdMeijer, sdesmalen, t.p.northover, stuij Reviewed By: stuij Subscribers: arsenm, pratlucas, simon_tatham, labrinea, kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits, pbarrio, stuij Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80716 Change-Id: I22e1dca2a8a9ec25d1e4f4b200cb50ea493d2575	2020-06-16 15:23:30 +01:00
Georgii Rymar	66fb3c39cb	[DebugInfo/DWARF] - Report .eh_frame sections of version != 1. Specification (https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html#AEN1349) says that the value of Version field for .eh_frame should be 1. However, we accept other values and might then attempt to read the section as a .debug_frame, which is wrong. This patch adds a version check. Differential revision: https://reviews.llvm.org/D81469	2020-06-16 15:46:26 +03:00
Tyker	d7deef1206	Revert "[AssumeBundles] add cannonicalisation to the assume builder" This reverts commit `90c50cad19`.	2020-06-16 14:34:55 +02:00
Ayke van Laethem	5aa8014ca8	[AVR] Remove faulty stack pushing behavior An instruction like this will need to allocate some stack space for the last parameter: %x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0) This worked fine when passing an actual value (in this case 0). However, when passing undef, no value was pushed to the stack and therefore no push instructions were created. This caused an unbalanced stack leading to interesting results. This commit fixes that by replacing the push logic with a regular stack adjustment and stack-relative load/stores. This is less efficient but at least it correctly compiles the code. I can think of a few improvements in the future: * The stack should have been adjusted in the function prologue when there are no allocas in the function. * Many (if not most) stack adjustments can be replaced by pushing/popping the values directly. Exactly like the previous code attempted but didn't do correctly. * Small stack adjustments can be done more efficiently with a few push/pop instructions (pushing/popping bogus values), both for code size and for speed. All in all, as long as there are no allocas in the function I think that it is almost always more efficient to emit regular push/pop instructions. This is however left for future optimizations. Differential Revision: https://reviews.llvm.org/D78581	2020-06-16 13:53:32 +02:00
Ayke van Laethem	3ab1c97e35	[AVR] Fix stack size in functions with a frame pointer This patch fixes a bug in stack save/restore code. Because the frame pointer was saved/restored manually (not by marking it as clobbered) the StackSize variable was not updated accordingly. Most code still worked, but code that tried to load a parameter passed on the stack did not. This commit fixes this by marking the frame pointer as a callee-clobbered register. This will let it be saved without any effort in prolog/epilog code and will make sure the correct address is calculated for loading parameters that are passed on the stack. This approach is used by most other targets (such as X86, AArch64 and RISC-V). Differential Revision: https://reviews.llvm.org/D78579	2020-06-16 13:53:32 +02:00
David Green	f269bb7da0	[ARM] Fix crash trying to generate i1 immediates These code patterns attempt to call isVMOVModifiedImm on a splat of i1 values, leading to an unreachable being hit. I've guarded the call on a more specific set of sizes, as i1 vectors are legal under MVE. Differential Revision: https://reviews.llvm.org/D81860	2020-06-16 12:27:24 +01:00

136081 Commits