llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	a19eb1de72	[OpenMP] Add match_{all,any,none} declare variant selector extensions. By default, all traits in the OpenMP context selector have to match for it to be acceptable. Though, we sometimes want a single property out of multiple to match (=any) or no match at all (=none). We offer these choices as extensions via `implementation={extension(match_{all,any,none})}` to the user. The choice will affect the entire context selector not only the traits following the match property. The first user will be D75788. There we can replace ``` #pragma omp begin declare variant match(device={arch(nvptx64)}) #define __CUDA__ #include <__clang_cuda_cmath.h> // TODO: Hack until we support an extension to the match clause that allows "or". #undef __CLANG_CUDA_CMATH_H__ #undef __CUDA__ #pragma omp end declare variant #pragma omp begin declare variant match(device={arch(nvptx)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` with the much simpler ``` #pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)}) #define __CUDA__ #include <__clang_cuda_cmath.h> #undef __CUDA__ #pragma omp end declare variant ``` Reviewed By: mikerice Differential Revision: https://reviews.llvm.org/D77414	2020-04-07 23:33:24 -05:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifies the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivalent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There's a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Fangrui Song	624654fd64	[VE] Migrate to the getMachineMemOperand overload using llvm::Align Just delete the deprecated overload because nothing uses it.	2020-04-07 16:04:54 -07:00
Matt Arsenault	6011627f51	CodeGen: More conversions to use Register	2020-04-07 18:54:36 -04:00
Fangrui Song	d2ef8c1f2c	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Fangrui Song	2f8fb4d1cd	[VE] Adapt `aa26dd9858` and `2481f26ac3`	2020-04-07 15:45:19 -07:00
Wei Mi	b49eac71ad	Recommit [SampleFDO] Add flag for partial profile. Fix the error of show-prof-info.test on some platforms without zlib. The common profile usage is to collect a profile from a target and then use that profile to guide the optimized build for the same target. In some cases no profile can be collected for a target. In those cases, although no full profile is available, it is possible to use a partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell a partial profile apart from a full profile, so that the compiler can use a different strategy for each. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin	96e51ed005	[AMDGPU] Implement copyPhysReg for 16 bit subregs Differential Revision: https://reviews.llvm.org/D74937	2020-04-07 14:22:46 -07:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00
Nikita Popov	fe8abbf442	[BPI] Clear handles when releasing memory (NFC) This reduces max-rss of sqlite compilation by 2.5%.	2020-04-07 22:51:01 +02:00
Matt Arsenault	aa26dd9858	CodeGen: Use Register in more places	2020-04-07 15:59:40 -04:00
Wei Mi	c5da949ae8	Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms. This reverts commit `e3ba652a14`.	2020-04-07 12:54:51 -07:00
Wei Mi	e3ba652a14	[SampleFDO] Add flag for partial profile. The common profile usage is to collect a profile from a target and then use that profile to guide the optimized build for the same target. In some cases no profile can be collected for a target. In those cases, although no full profile is available, it is possible to use a partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell a partial profile apart from a full profile, so that the compiler can use a different strategy for each. Differential Revision: https://reviews.llvm.org/D77426	2020-04-07 12:17:56 -07:00
Nemanja Ivanovic	ecd8435483	[NFC][PowerPC] Fix register class for patterns using XXPERMDIs There are a few patterns where we use a superclass for inputs to this instruction rather than the correct class. This can sometimes lead to unnecessary copies.	2020-04-07 14:06:08 -05:00
Graham Sellers	a19a56f6a1	[AMDGPU] Extend constant folding for logical operations This patch extends existing constant folding in logical operations to handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and V_AND_OR_B32. Also added a couple of tests for existing folds.	2020-04-07 14:37:16 -04:00
Craig Topper	c41685b16f	[SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector. This removes a call to getScalarType from a bunch of call sites. It also makes the behavior consistent with SIGN_EXTEND_INREG. Differential Revision: https://reviews.llvm.org/D77631	2020-04-07 11:34:08 -07:00
Alexey Lapshin	88c2137b6d	[DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker. For implementing "remove obsolete debug info in lld", it is necessary to have DWARF generation code implementation. dsymutil uses DwarfStreamer for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK to use AsmPrinter based code in lld(D74169). This patch moves DwarfStreamer implementation into DWARFLinker, so that it could be reused from lld. Generally, a better place for such a common DWARF generation code would be not DWARFLinker but an additional separate library. Such a library could contain a single version of DWARF generation routines and could also be independent of AsmPrinter. At the current moment, DwarfStreamer does not pretend to be such a general implementation of DWARF generation. So I decided to put it into DWARFLinker since it is the only user of DwarfStreamer. Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewed By: JDevlieghere Differential revision: https://reviews.llvm.org/D77169	2020-04-07 21:21:54 +03:00
Eli Friedman	e9ac757f79	[AArch64] Don't expand memcmp in strict align mode. `7aecf232` fixed the bug where we would miscompile, but we still generate a crazy amount of code. Turn off the expansion until someone implements an appropriate heuristic. Differential Revision: https://reviews.llvm.org/D77599	2020-04-07 10:53:36 -07:00
Matt Arsenault	f596ab4066	AMDGPU: Use early return	2020-04-07 13:48:00 -04:00
Sam Clegg	5be42f36f5	[WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452 Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77627	2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin	12a324393d	[AMDGPU] Limit endcf-collapse to simple if We can only collapse adjacent SI_END_CF if outer statement belongs to a simple SI_IF, otherwise correct mask is not in the register we expect, but is an argument of an S_XOR instruction. Even if SI_IF is simple it might be lowered using S_XOR because lowering is dependent on a basic block layout. It is not considered simple if instruction consuming its output is not an SI_END_CF. Since that SI_END_CF might have already been lowered to an S_OR isSimpleIf() check may return false. This situation is an opportunity for a further optimization of SI_IF lowering, but that is a separate optimization. In the meantime move SI_END_CF post the lowering when we already know how the rest of the CFG was lowered since a non-simple SI_IF case still needs to be handled. Differential Revision: https://reviews.llvm.org/D77610	2020-04-07 10:27:23 -07:00
Matt Arsenault	b281138a1b	DAG: Use the correct getPointerTy in a few places These should not be assuming address space 0. Calling getPointerTy is generally the wrong thing to do, since you should already know the type from the incoming IR.	2020-04-07 12:45:41 -04:00
Nikita Popov	259649a519	[RDA] Avoid full reprocessing of blocks in loops (NFCI) RDA sometimes needs to visit blocks twice, to take into account reaching defs coming in along loop back edges. Currently it handles repeated visitation the same way as usual, which means that it will scan through all instructions and their reg unit defs again. Not only is this very inefficient, it also means that all reaching defs in loops are going to be inserted twice. We can do much better than this. The only thing we need to handle is a new reaching def from a predecessor, which either needs to be prepended to the reaching definitions (if there was no reaching def from a predecessor), or needs to replace an existing predecessor reaching def, if it is more recent. Since D77508 we only store the most recent predecessor reaching def, so that's the only one that may need updating. This also has the nice side-effect that reaching definitions are now automatically sorted and unique, so drop the llvm::sort() call in favor of an assertion. Differential Revision: https://reviews.llvm.org/D77511	2020-04-07 17:55:37 +02:00
Nikita Popov	76e987b372	[RDA] Don't pass down TraversedMBB (NFC) Only pass the MachineBasicBlock itself down to helper methods, they don't need to know about traversal. Move the debug print into the main method.	2020-04-07 17:53:04 +02:00
Nikita Popov	361c29d7ba	[RDA] Avoid inserting duplicate reaching defs (NFCI) An instruction may define the same reg unit multiple times, avoid inserting the same reaching def multiple times in that case. Also print the reg unit, rather than the super-register, in the debug code.	2020-04-07 17:50:38 +02:00
David Tenty	b9245f14b7	[NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs Summary: - Remove the no longer used Darwin CalleeSavedRegs - Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64 - Update tests for 64-bit CSR change Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc Reviewed By: sfertile Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77235	2020-04-07 11:49:10 -04:00
Simon Pilgrim	e3b6059776	[X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443) Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets). If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail. Fixes PR45443	2020-04-07 14:45:29 +01:00
Keith Walker	01dc10774e	[ARM] unwinding .pad instructions missing in execute-only prologue If the stack pointer is altered for local variables and we are generating Thumb2 execute-only code the .pad directive is missing. Usually the size of the adjustment is stored in a PC-relative location and loaded into a register which is then added to the stack pointer. However when we are generating execute-only code the size of the adjustment is instead generated using the MOVW/MOVT instruction pair. As a by-product of handling the execute-only case this also fixes an existing issue that in the non-execute-only case the .pad directive was generated against the load of the constant to a register instruction, instead of the instruction which adds the register to the stack pointer. Differential Revision: https://reviews.llvm.org/D76849	2020-04-07 11:51:59 +01:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Serguei Katkov	b7e3759e17	[DAG] Consolidate require spill slot logic in lambda. NFC. Move the logic whether lowering of deopt value requires a spill slot in a separate lambda. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77629	2020-04-07 16:43:47 +07:00
Peter Smith	14c1e98754	[ARM] Remove condition that could never be true From Arm v8 Architecture Reference Manual F5.1.84 LDREXD The ldrexd instruction in Arm state has the following conditions: t = UInt(Rt); t2 = t + 1; n = UInt(Rn); if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE; That is, when Rt is odd or if Rt is 14 (making t2 15). In the implementation when the pair is the UNPREDICTABLE R14_R15 we would ideally return SOFT_FAIL. We can't because there is no R14_R15 value for us to return so we fail early returning FAIL. The early return for registers outside the bounds of the table makes the check for Rt == 14 (0xE) redundant, which causes a static analyzer to flag the condition as never being true. To fix the warning I've removed the check and replaced it with a comment explaining the difference with the specification. Fixes pr41660 Differential Revision: https://reviews.llvm.org/D77463	2020-04-07 09:50:56 +01:00
Simon Tatham	aab9e9de4d	[Support,Windows] Tolerate failure of CryptGenRandom Summary: In `Unix/Process.inc`, we seed a random number generator from `/dev/urandom` if possible, but if not, we're happy to fall back to ordinary pseudorandom strategies, like the current time and PID. The corresponding function on Windows calls `CryptGenRandom`, but it //doesn't// have a fallback if that strategy fails. But `CryptGenRandom` //can// fail, if a cryptography provider isn't properly initialized, or occasionally (by our observation) simply intermittently. If it's reasonable on Unix to implement traditional pseudorandom-number seeding as a fallback, then it's surely reasonable to do the same on Windows. So this patch adds a last-ditch use of ordinary rand(), using much the same strategy as the Unix fallback code. Reviewers: hans, sammccall Reviewed By: hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77553	2020-04-07 09:18:12 +01:00
Pierre-vh	4fc59a468f	Revert "[CodeGen][SelectionDAG] Flip Booleans More Often" This reverts commit `23342bdcc8`.	2020-04-07 09:09:10 +01:00
Pierre-vh	23342bdcc8	[CodeGen][SelectionDAG] Flip Booleans More Often Differential Revision: https://reviews.llvm.org/D77201	2020-04-07 08:19:57 +01:00
Sam Clegg	f0bbf3d086	[WebAssembly] EmscriptenEHSjLj: Mark more functions as imported These should have been part of https://reviews.llvm.org/D77192 Differential Revision: https://reviews.llvm.org/D77358	2020-04-06 21:27:31 -07:00
Xiang1 Zhang	01a32f2bd3	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Do not commit llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it causes build bot failures (it is not suitable for the Windows 32-bit target). Summary: This patch comes from H.J.'s `2bd54ce7fa`. It fixes the llvm unit tests that fail when run on a CET machine (e.g. ExecutionEngine/MCJIT/MCJITTests). The main reason to enable IBT when the JIT is compiled with CET is that the JIT does not know whether its caller program is CET-enabled. If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code. But if the JIT's caller program is CET-enabled, the JIT must generate CET code or it will cause control protection exceptions. I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed. H.J. also tested it by building and running VNCserver (Virtual Network Console), and it works too. (Without this patch, VNCserver crashes on a CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Reviewed By: LuoYuanke Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-07 09:48:47 +08:00
Jun Ma	46bff786bc	[Coroutines] Remove alignment check in shouldBeMustTail Differential Revision: https://reviews.llvm.org/D77362	2020-04-07 09:07:34 +08:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Davide Italiano	8115e08b05	[MachineCSE] Don't carry the wrong location when hoisting PR: 45425 <rdar://problem/61359768> Differential Revision: https://reviews.llvm.org/D77604	2020-04-06 16:36:22 -07:00
Daniel Sanders	f27cea721e	Add way to omit debug-location from MIR output Summary: In lieu of a proper pass that strips debug info, add a way to omit debug-locations from the MIR output so that instructions with MMO's continue to match CHECK's when mir-debugify is used Reviewers: aprantl, bogner, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77575	2020-04-06 16:22:01 -07:00
Nick Desaulniers	41ba80182c	[CallSite Removal] a CallBase is never an IndirectCall for isInlineAsm Summary: Thanks to Bill Wendling (void) for the report and steps to reproduce. It looks like this was missed during r350508's cleanup of the CallSite split into CallBase, CallInst, and CallBrInst. This was exposed by running pgo on a callbr, which was creating a ptrtoint to the inline asm thinking it was an indirect call. The relevant callchain looks like: IndirectCallPromotionPlugin::run() -> PGOIndirectCallVisitor::findIndirectCalls() -> PGOIndirectCallVisitor::visitCallBase() -> CallBase::isIndirectCall() Reviewers: void, chandlerc Reviewed By: void Subscribers: hiraditya, llvm-commits, craig.topper, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D77600	2020-04-06 16:14:46 -07:00
Vedant Kumar	5f185a8999	[AddressSanitizer] Fix for wrong argument values appearing in backtraces Summary: In some cases, ASan may insert instrumentation before function arguments have been stored into their allocas. This causes two issues: 1) The argument value must be spilled until it can be stored into the reserved alloca, wasting a stack slot. 2) Until the store occurs in a later basic block, the debug location will point to the wrong frame offset, and backtraces will show an uninitialized value. The proposed solution is to move instructions which initialize allocas for arguments up into the entry block, before the position where ASan starts inserting its instrumentation. For the motivating test case, before the patch we see: ``` | 0033: movq %rdi, 0x68(%rbx) | | DW_TAG_formal_parameter | | ... | | DW_AT_name ("a") | | 00d1: movq 0x68(%rbx), %rsi | | DW_AT_location (RBX+0x90) | | 00d5: movq %rsi, 0x90(%rbx) | | ^ not correct ... | ``` and after the patch we see: ``` | 002f: movq %rdi, 0x70(%rbx) | | DW_TAG_formal_parameter | | | | DW_AT_name ("a") | | | | DW_AT_location (RBX+0x70) | ``` rdar://61122691 Reviewers: aprantl, eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77182	2020-04-06 15:59:25 -07:00
Daniel Sanders	35b7b0851b	Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify) Summary: To debugify MIR, we need to be able to create metadata and to do that, we need a non-const Module. However, MachineFunction only had a const reference to the Function preventing this. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77439	2020-04-06 15:19:21 -07:00
Daniel Sanders	15f7bc7857	Add option to limit Debugify to locations (omitting variables) Summary: It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE around. In particular, because DEBUG_VALUE has the potential to change CodeGen behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally don't. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77438	2020-04-06 15:04:55 -07:00
David Blaikie	5aead592f0	X86ISelLowering: Minor refactor to avoid redundant initialization while ensuring compiler warnings can hopefully still prove initialization Based on post-commit review/discussion in fabe52a7412b	2020-04-06 14:25:52 -07:00
Konstantin Pyzhov	72e8754916	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 09:05:58 -04:00
Leonard Chan	a0222ac1f9	[AsmPrinter] Do not define local aliases for global objects in a comdat A global symbol that is defined in a comdat should not generate an alias since call sites that would've referred to that symbol will refer to their own independent local aliases rather than the surviving global comdat one. This could result in something that looks like: ``` ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o) >>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub >>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o) >>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169) >>> minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a ``` We ran into this when experimenting with a new C++ ABI for fuchsia (refer to D72959) which takes relative offsets between comdat'd functions which is why the normal C++ user wouldn't run into this. Differential Revision: https://reviews.llvm.org/D77429	2020-04-06 13:48:05 -07:00
Nick Desaulniers	5bc291be71	[SelectionDAG] fix predecessor list for INLINEASM_BRs' parent Summary: A bug report mentioned that LLVM was producing jumps off the end of a function when using "asm goto with outputs". Further digging pointed to MachineBasicBlocks that had their address taken and were indirect targets of INLINEASM_BR being removed by BranchFolder, because their predecessor list was empty, so they appeared to have no entry. This was a cascading failure caused earlier, during Pre-RA instruction scheduling. We have a few special cases in Pre-RA instruction scheduling where we split a MachineBasicBlock in two. This requires careful handling of predecessor and successor lists for a MachineBasicBlock that was split, and careful handling of PHI MachineInstrs that referred to the MachineBasicBlock before it was split. The clue that led to this fix was the observation that many callers of MachineBasicBlock::splice() frequently call MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI nodes after a splice. We don't want to reuse that method, as we have custom successor transferring logic for this block split. This patch fixes 2 pre-existing bugs, and adds tests. The first bug was that MachineBasicBlock::splice() correctly handles updating most successors and predecessors; we don't need to do anything more than removing the previous fallthrough block from the first half of the split block post splice. Previously, we were updating the successor list incorrectly (updating successors updates predecessors). The second bug was that PHI nodes that needed registers from the first half of the split block were not having entries populated. The register live out information was correct, and the FuncInfo->PHINodesToUpdate was correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock: for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) { MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first); if (!FuncInfo->MBB->isSuccessor(PHI->getParent())) continue; PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB); was `continue`ing because FuncInfo->MBB tracks the second half of the post-split block; no one was updating PHI entries for the first half of the post-split block. SelectionDAGBuilder::UpdateSplitBlock() already expects to perform special handling for MachineBasicBlocks that were split post calls to ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the current MachineBasicBlock being processed), and perform special handling for this in SelectionDAGBuilder::UpdateSplitBlock(). Reviewers: void, craig.topper, efriedma Reviewed By: void, efriedma Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D76961	2020-04-06 13:46:39 -07:00
Matt Arsenault	869f05c834	AMDGPU: Remove dead paths for requiresUniformRegister The extracts from control flow intrinsics are already properly handled by divergence analysis. The inline asm case isn't dead, but has also never really worked correctly so leave it as-is for now.	2020-04-06 16:15:10 -04:00
Francesco Petrogalli	53b7abdd23	[llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`. Reviewers: sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77317	2020-04-06 19:46:11 +01:00
Masoud Ataei jaliseh	9ed0612cca	Add InjectTLIMappings pass to new pass manager This pass was created in `d6de5f12d4` and tested for the new and legacy pass managers, but was never added to the new pass manager pipeline. I am adding it to the new pass manager pipeline. This pass is used by the Vector Function Database (VFDatabase); without it in the new pass manager pipeline, none of the vector libraries work with the new pass manager. Related passes: `66c120f025` https://reviews.llvm.org/D74944 Differential revision: https://reviews.llvm.org/D75354	2020-04-06 13:16:48 -05:00
Craig Topper	07ed1fb597	[SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types. The previous code used the type of the first field for the VT passed to getNode for every field. I've based the implementation here off what is done in visitSelect as it removes the need to special case aggregates. Differential Revision: https://reviews.llvm.org/D77093	2020-04-06 11:03:40 -07:00
Konstantin Pyzhov	51dc028314	Revert `e1730cfeb3`	2020-04-06 05:56:11 -04:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces a distinct structure to store the information needed for the Control Flow Graph, as well as the machinery needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is part of a sequence of three patches related to graph heat coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Konstantin Pyzhov	e1730cfeb3	[AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU. Reviewers: sameerds, dstuttard Differential Revision: https://reviews.llvm.org/D77228	2020-04-06 05:10:37 -04:00
Fangrui Song	a5d375e0cb	[AArch64] Allow logical immediates to have all-1 in top bits So that constant expressions like the following are permitted: and w0, w0, #~(0xfe<<24) and w1, w1, #~(0xff<<24) The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p). Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D75885	2020-04-06 09:56:04 -07:00
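To illustrate why the operands in the entry above need the relaxed check, here is a minimal sketch, not the LLVM implementation. It assumes the assembler evaluates the expression as a 64-bit value, so `~(0xfe<<24)` arrives with all-1 upper bits even though the destination is a 32-bit `w` register. The helper names `upperBitsAllowed` and `low32` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: for a 32-bit register operand, accept a 64-bit
// expression value whose upper 32 bits are either all 0 or all 1.
bool upperBitsAllowed(uint64_t Value) {
  uint64_t Upper = Value >> 32;
  return Upper == 0 || Upper == 0xFFFFFFFFull;
}

// Hypothetical helper: the 32 bits that would then be checked as a
// logical immediate.
uint32_t low32(uint64_t Value) { return static_cast<uint32_t>(Value); }

// The two expressions from the commit message, evaluated in 64 bits
// (an assumption about the expression evaluator).
uint64_t exprFE() { return ~(0xfeull << 24); }
uint64_t exprFF() { return ~(0xffull << 24); }
```

Under these assumptions both expressions pass the upper-bits check and narrow to `0x01FFFFFF` and `0x00FFFFFF` respectively, each a single contiguous run of ones.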
Florian Hahn	7aba6a0333	[LV] Fix value that could be read uninitialized. This should fix http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569	2020-04-06 17:54:50 +01:00
Nikita Popov	e8b83f7ddc	[RDA] Only store most recent reaching def from predecessors (NFCI) When entering a basic block, RDA inserts reaching definitions coming from predecessor blocks (which will be negative numbers) in a rather peculiar way. If you have incoming reaching definitions -4, -3, -2, -1, it will insert those. If you have incoming reaching definitions -1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken at each step. That's probably not what was intended... However, RDA only actually cares about the most recent reaching definition from a predecessor (to calculate clearance), so this ends up working fine as far as behavior is concerned. It does waste memory on unnecessary reaching definitions though. This patch changes the implementation to first compute the most recent reaching definition in one loop, and then insert only that one in a separate loop. Differential Revision: https://reviews.llvm.org/D77508	2020-04-06 18:39:09 +02:00
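The difference described in the entry above can be sketched as follows. This is a simplified model for a single register, not the RDA code itself; the function names `insertRunningMax` and `insertMostRecent` are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Old behaviour as described: the running maximum is appended at each
// step, so the result depends on the order of the incoming definitions.
std::vector<int> insertRunningMax(const std::vector<int> &Incoming) {
  std::vector<int> Defs;
  if (Incoming.empty())
    return Defs;
  int Best = Incoming.front();
  for (int Def : Incoming) {
    Best = std::max(Best, Def);
    Defs.push_back(Best);
  }
  return Defs;
}

// New behaviour as described: first compute the most recent (largest)
// incoming definition, then insert only that one.
std::vector<int> insertMostRecent(const std::vector<int> &Incoming) {
  std::vector<int> Defs;
  if (Incoming.empty())
    return Defs;
  Defs.push_back(*std::max_element(Incoming.begin(), Incoming.end()));
  return Defs;
}
```

With the inputs from the commit message, `insertRunningMax` keeps `-4, -3, -2, -1` unchanged but turns `-1, -2, -3, -4` into four copies of `-1`, while `insertMostRecent` stores a single `-1` in both cases.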
Nikita Popov	8d75df1438	[RDA] Don't adjust ReachingDefDefaultVal (NFCI) At the end of a basic block, RDA adjusts all the reaching defs it found to be relative to the end of the basic block, rather than the start of it. However, it also does this to registers which don't have a reaching def, indicated by ReachingDefDefaultVal. This means that code checking against ReachingDefDefaultVal will not skip them, and may insert them into the reaching definition list. This is ultimately harmless, but causes unnecessary work and is logically not right. Differential Revision: https://reviews.llvm.org/D77506	2020-04-06 18:36:29 +02:00
Sanjay Patel	fbb1b43f13	[ValueTracking] enhance matching of umin/umax with 'not' operands The cmyk test is based on the known regression that resulted from: rGf2fbdf76d8d0 This improves on the equivalent signed min/max change: rG867f0c3c4d8c The underlying icmp equivalence is: ~X pred ~Y --> Y pred X For an icmp with constant, canonicalization results in a swapped pred: ~X < C --> X > ~C	2020-04-06 11:51:59 -04:00
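The icmp equivalences quoted in the entry above can be checked exhaustively on a small type. The sketch below does this for unsigned 8-bit values; it is only a sanity check of the arithmetic, not ValueTracking code, and the function names are hypothetical.

```cpp
#include <cstdint>

// Checks ~X < ~Y  <=>  Y < X for all unsigned 8-bit X, Y.
bool notSwapsOperands() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y) {
      uint8_t NotX = static_cast<uint8_t>(~X);
      uint8_t NotY = static_cast<uint8_t>(~Y);
      if ((NotX < NotY) != (Y < X))
        return false;
    }
  return true;
}

// Checks ~X < C  <=>  X > ~C for all unsigned 8-bit X and constants C.
bool notSwapsPredicateWithConstant() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C = 0; C < 256; ++C) {
      uint8_t NotX = static_cast<uint8_t>(~X);
      uint8_t NotC = static_cast<uint8_t>(~C);
      if ((NotX < C) != (X > NotC))
        return false;
    }
  return true;
}
```

Both hold because `~X` equals `255 - X` on 8 bits, which reverses the unsigned order.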
Matt Arsenault	8a5f0dafd4	AMDGPU/GlobalISel: Select llvm.amdgcn.div.scale	2020-04-06 11:50:19 -04:00
Matt Arsenault	e87ec66762	AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll	2020-04-06 11:50:16 -04:00
Jay Foad	ddd2f4b96f	[AMDGPU] Fix inaccurate comments	2020-04-06 16:44:08 +01:00
Florian Hahn	90be3c24a7	[VPlan] Introduce new VPWidenCallRecipe (NFC). This patch moves calls to their own recipe, to simplify the transition to VPUser for operands of VPWidenRecipe, as discussed in D76992. Subsequently additional information can be added to the recipe rather than computing it during the execute step. Reviewers: rengolin, Ayal, gilr, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77467	2020-04-06 16:07:37 +01:00
Chris Bowler	d6ea82d11c	[AIX][PPC] Implement by-val caller arguments in multiple registers Differential Revision: https://reviews.llvm.org/D76380	2020-04-06 11:06:51 -04:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows us to work backward here and infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works, since the new alignment needs to be greater than the Load/Store alignment, which is at least `1`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
diggerlin	a26a441b99	[llvm-objdump][XCOFF] Use symbol index+symbol name + storage mapping class as label for -D SUMMARY: For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump). In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names. To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format. Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu ,daltenty Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D72973	2020-04-06 10:10:10 -04:00
Benjamin Kramer	880ec421dd	[MC] Use a byte_swap in emitIntValue instead of doing it in a loop. NFCI.	2020-04-06 15:51:24 +02:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows the functions to be used with loop-dependent indices.	2020-04-06 14:48:48 +01:00
Matt Arsenault	cbf719b568	AMDGPU: Use DAG patterns for div_fmas	2020-04-06 09:28:30 -04:00
Matt Arsenault	79b29d6df7	AMDGPU: Remove DisableInst feature I'm not sure why these were bothering to check the instruction profile, since those profiles should only be used with these instruction classes.	2020-04-06 09:27:44 -04:00
Matt Arsenault	70726cec5b	DAG: Combine extract_vector_elt of concat_vectors Fixes extra canonicalize regressions when legalizing vector fminnum/fmaxnum.	2020-04-06 09:26:29 -04:00
Hans Wennborg	64c2312750	Revert `43f031d312` "Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)" ExecutionEngine/MCJIT/cet-code-model-lager.ll is failing on 32-bit windows, see llvm-commits thread for `fef2dab`. This reverts commit `43f031d312` and the follow-ups `fef2dab100` and `6a800f6f62`.	2020-04-06 15:05:25 +02:00
Sourabh Singh Tomar	5d7e9adce2	[DWARF5] Added support for emission of debug_macro section. Summary: This patch adds support for emission of the following DWARFv5 macro forms in the .debug_macro section. 1. DW_MACRO_start_file 2. DW_MACRO_end_file 3. DW_MACRO_define_strp 4. DW_MACRO_undef_strp. Reviewed By: dblaikie, ikudrin Differential Revision: https://reviews.llvm.org/D72828	2020-04-06 17:45:10 +05:30
Pavel Labath	9154a6398e	[llvm/Support] Make more DataExtractor methods error-aware Summary: This patch adds the optional Error argument, and the Cursor variants to more DataExtractor methods. The functions now behave the same way as other error-aware functions (they set the error when they fail, and don't do anything if the error is already set). I have merged the LEB128 implementations via a template (similarly to how fixed-size functions are handled) to reduce code duplication. Depends on D77304. Reviewers: dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77306	2020-04-06 14:14:11 +02:00
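The error-aware convention described in the entry above (set the error on failure, do nothing if an error is already set) can be sketched with a toy reader. This is not the LLVM `DataExtractor` API; `MiniCursor` and `readU8` are hypothetical stand-ins, and the error is modelled as a plain string.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a cursor: a read position plus a sticky error.
struct MiniCursor {
  size_t Offset = 0;
  std::string Error; // empty means "no error"
};

// Reads one byte. If an error is already recorded, the call is a no-op.
// If the read would run past the end, it records an error and leaves the
// offset untouched.
uint8_t readU8(const std::vector<uint8_t> &Data, MiniCursor &C) {
  if (!C.Error.empty())
    return 0;
  if (C.Offset >= Data.size()) {
    C.Error = "unexpected end of data";
    return 0;
  }
  return Data[C.Offset++];
}
```

The design benefit is that a caller can issue a sequence of reads and check the error once at the end, since every read after the first failure is skipped.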
Pavel Labath	a16fffa3f6	[Support] Make DataExtractor string functions error-aware Summary: This patch adds an optional Error argument to DataExtractor functions for string extraction, and makes them behave like other DataExtractor functions (set the error if extraction fails, don't do anything if the error is already set). I have merged the StringRef and C string versions of the functions to reduce code duplication. Reviewers: dblaikie, MaskRay Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77307	2020-04-06 14:14:11 +02:00
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Guillaume Chatelet	39cfba9e33	[Alignment][NFC] Remove deprecated functions introduced in 10.0.0 Summary: 24 March 2020: LLVM 10.0.0 is out. I gathered all deprecated functions introduced between 9 and 10 and cleaned them up so they will be removed from 11. > git log -p -S LLVM_ATTRIBUTE_DEPRECATED llvmorg-9.0.0..llvmorg-10.0.0 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77409	2020-04-06 12:07:18 +00:00
Simon Pilgrim	9bc5b1a489	[X86][SSE] combineVectorSignBitsTruncation - remove minimum vector length limitations truncateVectorWithPACK has its own vector length controls, so we can rely on those directly. This helps some existing truncation to subvector tests, which were being combined later during shuffle lowering at which point the sign/zero bit detection had become obscured preventing lowerShuffleWithPACK working as well as it could.	2020-04-06 12:45:23 +01:00
Benjamin Kramer	232eff55f6	[LTO] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:23:27 +02:00
Benjamin Kramer	e64e516790	[RuntimeDyld] Replace hand-rolled endian conversion with support::endian. NFCI.	2020-04-06 13:22:53 +02:00
Benjamin Kramer	9a9bc23672	[llvm-bcanalyzer] Simplify code. NFCI.	2020-04-06 12:50:50 +02:00
Kazushi (Jam) Marukawa	e981a46a77	[VE] Update lea/load/store instructions Summary: Modify lea/load/store instructions to accept the `disp(index, base)` style addressing mode (called ASX format). Also, make the DAG nodes for these ASX format instructions uniformly take 3 operands, and update the selectADDR functions to lower to the appropriate MI. Reviewers: arsenm, simoll, k-ishizaka Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D76822	2020-04-06 11:49:46 +02:00
Oliver Stannard	a294d9eb21	Revert "[IPRA][ARM] Spill extra registers at -Oz" Reverting because this is causing failures on bots with expensive checks enabled. This reverts commit `73cea83a6f`.	2020-04-06 10:34:59 +01:00
Kerry McLaughlin	944e322f88	[AArch64][SVE] Add SVE intrinsics for saturating add & subtract Summary: Adds the following intrinsics: - @llvm.aarch64.sve.[s\|u]qadd.x - @llvm.aarch64.sve.[s\|u]qsub.x Reviewers: sdesmalen, c-rhodes, dancgr, efriedma, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77054	2020-04-06 10:07:08 +01:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic blocks. Finally we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Igor Kudrin	35819ff3cf	[DebugInfo] Fix reading range lists of v5 units in DWP. In package files, the base offset provided by index sections should be used to find the contribution of a unit. The patch adds that base offset when reading range list tables. Differential revision: https://reviews.llvm.org/D77401	2020-04-06 13:28:06 +07:00
Igor Kudrin	a93b77b97f	[DebugInfo] Fix reading location tables headers of v5 units in DWP. This fixes the reading of location lists headers for compilation units in package files by adjusting the reading offset according to the corresponding record in the unit index. This is required for DW_FORM_loclistx to work. Differential revision: https://reviews.llvm.org/D77146	2020-04-06 13:28:06 +07:00
Igor Kudrin	49737df767	[DebugInfo] Fix reading location tables of v5 units in DWP. Without the patch, all version 5 compile units in a DWP file read location tables from the beginning of a .debug_loclists.dwo section. The patch fixes that by adjusting the reading offset the same way as for pre-v5 units. The section identifier to find the contribution entry corresponds to the version of the unit. Differential revision: https://reviews.llvm.org/D77145	2020-04-06 13:28:06 +07:00
Igor Kudrin	714324b79a	[DebugInfo] Support DWARFv5 index sections. DWARFv5 defines index sections in package files in a slightly different way than the pre-standard GNU proposal, see Section 7.3.5 in the DWARF standard and https://gcc.gnu.org/wiki/DebugFissionDWP for the GNU proposal. The main concern here is the values for section identifiers, which partially overlap but with changed meanings. The patch adds support for v5 index sections and resolves that difficulty by defining a set of identifiers for internal use which can represent and distinguish the values of both standards. Differential Revision: https://reviews.llvm.org/D75929	2020-04-06 13:28:06 +07:00
Igor Kudrin	a0249fe91c	[DebugInfo] Rename section identifiers which are deprecated in DWARFv5. NFC. This is a preparation for an upcoming patch which adds support for DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers which reference sections that are deprecated in the DWARFv5 standard. See D75929 for the discussion. Differential Revision: https://reviews.llvm.org/D77141	2020-04-06 13:28:06 +07:00
Craig Topper	97e57f3b24	[DAGCombiner] Use getAnyExtOrTrunc instead of getSExtOrTrunc in the zext(setcc) combine. We're ANDing with 1 right after, which will cause the SIGN_EXTEND to be combined to ANY_EXTEND later. Might as well just start with an ANY_EXTEND. While there, create the AND using the getZeroExtendInReg helper to remove the need to explicitly create the VecOnes constant.	2020-04-05 22:44:45 -07:00
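The reasoning in the entry above is that the extended bits are discarded by the AND with 1, so the kind of extension cannot matter. A minimal scalar sketch, assuming an 8-bit value widened to 32 bits; the helper names are hypothetical and `Garbage` models the unspecified upper bits of an any-extend.

```cpp
#include <cstdint>

// Sign-extends an 8-bit value to 32 bits.
uint32_t signExtend8(uint8_t V) {
  return (V & 0x80u) ? (0xFFFFFF00u | V) : V;
}

// Models an any-extend: the low 8 bits are V, the upper 24 bits are
// whatever the caller passes in Garbage.
uint32_t anyExtend8(uint8_t V, uint32_t Garbage) {
  return (Garbage & 0xFFFFFF00u) | V;
}

// Returns true if masking with 1 gives the same result for both kinds of
// extension, for every 8-bit input and the given upper-bit pattern.
bool maskHidesExtensionKind(uint32_t Garbage) {
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t B = static_cast<uint8_t>(V);
    if ((signExtend8(B) & 1u) != (anyExtend8(B, Garbage) & 1u))
      return false;
  }
  return true;
}
```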
Johannes Doerfert	931c0cd713	[OpenMP][NFC] Move and simplify directive -> allowed clause mapping Move the listing of allowed clauses per OpenMP directive to the new macro file in `llvm/Frontend/OpenMP`. Also, use a single generic macro that specifies the directive and one allowed clause explicitly instead of a dedicated macro per directive. We save 800 loc and boilerplate for all new directives/clauses with no functional change. We also need to include the macro file only once and not once per directive. Depends on D77112. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D77113	2020-04-06 00:04:08 -05:00
Craig Topper	586c051a27	[DAGCombiner] Replace a hardcoded constant in visitZERO_EXTEND with a proper check for the condition it's trying to protect. This code is replacing a shift with a new shift on an extended type. If the shift amount type can't represent the maximum shift amount for the new type, the amount needs to be extended to a type that can. Previously, the code just hardcoded a check for 256 bits which seems to have been an assumption that the original shift amount was MVT::i8. But that seems more catered to a specific target like X86 that uses i8 as its legal shift amount type. Other targets may use different types. This commit changes the code to look at the real type of the shift amount and makes sure it has enough bits for the Log2 of the new type. There are similar checks to this in SelectionDAGBuilder and LegalizeIntegerTypes.	2020-04-05 20:35:57 -07:00
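The condition described in the entry above can be sketched in plain integer arithmetic. This is an illustration of the bit-width requirement only, not the DAGCombiner code; `log2Ceil` and `shiftAmountTypeFits` are hypothetical names.

```cpp
// Smallest number of bits B such that 2^B >= N (for N >= 1).
unsigned log2Ceil(unsigned N) {
  unsigned Bits = 0;
  while ((1ull << Bits) < N)
    ++Bits;
  return Bits;
}

// A shift of a ValueBits-wide value uses amounts 0 .. ValueBits-1, so the
// amount type needs at least log2Ceil(ValueBits) bits.
bool shiftAmountTypeFits(unsigned AmountBits, unsigned ValueBits) {
  return AmountBits >= log2Ceil(ValueBits);
}
```

For an 8-bit amount type this reproduces the old hardcoded limit (values up to 256 bits fit, 512 bits do not), while also giving the right answer for targets whose shift amount type is narrower or wider than 8 bits.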
Johannes Doerfert	419a559c5a	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D77112	2020-04-05 22:30:29 -05:00
Tarindu Jayatilaka	b43b59fcc0	Expose `attributor-disable` to the new and old pass managers The new and old pass managers (PassManagerBuilder.cpp and PassBuilder.cpp) are exposed to an `extern` declaration of `attributor-disable` option which will guard the addition of the attributor passes to the pass pipelines. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76871	2020-04-05 22:29:34 -05:00
Lang Hames	1b39c6f62c	[ORC] Add MachO universal binary support to StaticLibraryDefinitionGenerator. Add a new overload of StaticLibraryDefinitionGenerator::Load that takes a triple argument and supports loading archives from MachO universal binaries in addition to regular archives. The LLI tool is updated to use this overload.	2020-04-05 20:21:05 -07:00
Simon Pilgrim	a43e233606	Remove unused function 'isInRange'. NFCI.	2020-04-05 23:11:24 +01:00
Simon Pilgrim	4431a29c60	[X86][SSE] Combine unary shuffle(HORIZOP,HORIZOP) -> HORIZOP We had previously limited the shuffle(HORIZOP,HORIZOP) combine to binary shuffles, but we can often merge unary shuffles just as well, folding in UNDEF/ZERO values into the 64-bit half lanes. For the (P)HADD/HSUB cases this is limited to fast-horizontal cases but PACKSS/PACKUS combines under all cases.	2020-04-05 22:49:46 +01:00
Anna Thomas	1d0f757904	[InlineFunction] Update metadata on loads that are return values This patch builds upon D76140 by updating metadata on pointer typed loads in inlined functions, when the load is the return value, and the callsite contains return attributes which can be updated as metadata on the load. Added test cases show this for nonnull, dereferenceable, dereferenceable_or_null Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D76792	2020-04-05 14:50:10 -04:00
Sourabh Singh Tomar	0d71782f4e	[DebugInfo]: Allow DwarfCompileUnit to have line table symbol Previously the line table symbol was represented as `DIE::value_iterator` inside `DwarfCompileUnit`, and the subsequent function `intStmtList` was used to create a local `MCSymbol` to initialize it. This patch removes `DIE::value_iterator` from `DwarfCompileUnit` and introduces an `MCSymbol` to represent this unit's symbol for the `debug_line` section. As a result, `applyStmtList` is also modified to utilize this. Furthermore, a helper function `getLineTableStartSym` is introduced to get this symbol; it will be used by clients which need to access this line table, e.g. `debug_macro`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D77489	2020-04-06 00:14:29 +05:30
Zuojian Lin	a58c8a7866	Remove the additional constant which requires an extra register for statepoint lowering. The newly-created constant zero will need an extra register to hold it in the current statepoint lowering implementation. Remove it if there exists one.	2020-04-05 11:22:09 -04:00
Apelete Seketeli	8aadb442d1	[scan-build] fix dead store warnings emitted on LLVM AMDGPU code base This fixes dead store warnings of the type "dead assignment" reported by Clang Static Analyzer.	2020-04-05 11:19:03 -04:00
Oliver Stannard	cb6aeb2239	[ARM] Add data gathering hint instruction Summary: This patch upstreams support for the optional ARMv8.0 Data Gathering Hint (DGH) extension, which adds the Data Gathering Hint instruction to the hint space. See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77097	2020-04-05 15:21:00 +01:00
Oliver Stannard	6f60eb4a3c	[ARM] Add enhanced counter virtualization system registers Summary: This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization (ECV) extension, which adds 6 new system registers. See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77094	2020-04-05 15:18:35 +01:00
Sanjay Patel	538a8f0227	[InstCombine] convert bitcast-shuffle to vector trunc As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine. Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there. Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well. This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op. Differential Revision: https://reviews.llvm.org/D77299	2020-04-05 09:48:02 -04:00
Oliver Stannard	9e1455dc23	[ARM] Add ARMv8.6 Fine Grain Traps system registers Summary: This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT) extension, which adds 5 new system registers. See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov Reviewed By: SjoerdMeijer Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76991	2020-04-05 14:28:18 +01:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
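The "negate becomes increment" replacement in the entry above rests on the two's-complement identity `-(~x) == x + 1`. A minimal check in unsigned 32-bit arithmetic (wrapping, so it also covers the overflow cases); the function names are hypothetical.

```cpp
#include <cstdint>

// Negation of a 'not': 0 - (~X), computed with wrapping unsigned arithmetic.
uint32_t negateOfNot(uint32_t X) { return 0u - ~X; }

// The one-to-one replacement: X + 1.
uint32_t increment(uint32_t X) { return X + 1u; }
```

Since `~x` equals `-x - 1`, negating it yields `x + 1` for every input, including `0xFFFFFFFF`, where both sides wrap to `0`.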
Sanjay Patel	867f0c3c4d	[ValueTracking] enhance matching of smin/smax with 'not' operands The cmyk tests are based on the known regression that resulted from: rGf2fbdf76d8d0 So this improvement in analysis might be enough to restore that commit.	2020-04-05 08:54:12 -04:00
Diogo Sampaio	59d10dc703	[ARM] add ARMv8.6-A Activity monitors virtualization extension Summary: This patch upstreams v8.6A activity monitors virtualization assembler support, which consists of 32 new system registers (two groups, each with 16 numbered registers). See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more information. Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard Reviewed By: ostannard Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76998	2020-04-05 13:31:06 +01:00
Benjamin Kramer	ff889df356	[X86] Roll some loops. NFCI.	2020-04-05 13:59:50 +02:00
Florian Hahn	47ee404075	[ValueTracking] Use Inst::comesBefore in isValidAssumeForCtx (NFC). D51664 added Instruction::comesBefore which should provide better performance than the manual check. Reviewers: rnk, nikic, spatel Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D76228	2020-04-05 12:38:04 +01:00
Simon Pilgrim	3079e51858	[X86][SSE] Generalize shuffle(HORIZOP,HORIZOP) -> HORIZOP combine Our existing combine allows merging the shuffle of 2 similar 64-bit wide 'horizontal ops' (HADD/PACK/etc.) if the shuffle was a UNPCK/MOVSD. This patch generalizes this to decode any target shuffle mask that can be widened to a 128-bit repeating v2*64 mask, which helps us catch PBLENDW/PBLENDD cases.	2020-04-05 12:09:19 +01:00
Simon Pilgrim	a17de6b91c	[X86][SSE] truncateVectorWithPACK - upper undef for 128->64 packing If we're packing from 128-bits to 64-bits then we don't need the RHS argument. This helps with register allocation, especially as we avoid repeating a use of the input value.	2020-04-05 11:47:36 +01:00
Matt Arsenault	6bfe28e92f	AMDGPU: Fix annotate kernel features through casted calls I thought I was testing this before, but the workitem id x case isn't great since it's mandatory in the parent kernel.	2020-04-04 20:44:44 -04:00
Matt Arsenault	221890d709	AMDGPU: Add feature for fast f32 denormals	2020-04-04 20:01:24 -04:00
Stefanos Baziotis	f3dd3a66d3	[Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions. Query AAValueSimplify on pointers in memory accessing instructions to take advantage of the constant propagation (or any other value simplification) of such values.	2020-04-05 02:46:26 +03:00
Jonathan Roelofs	3ce77142a6	Revert "[DAG] Fix PR45049: LegalizeTypes crash" This reverts commit `17673ae0b2`.	2020-04-04 13:47:22 -06:00
Jonathan Roelofs	17673ae0b2	[DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 https://reviews.llvm.org/D76994	2020-04-04 13:36:22 -06:00
Florian Hahn	a2b18c5a08	[LV] Simplify tryToWiden as recipes are not re-used (NFC). After `49d00824bb`, VPWidenRecipe only stores a single instruction. tryToWiden can simply return the widen recipe, like other helpers in VPRecipeBuilder.	2020-04-04 18:30:50 +01:00
Heejin Ahn	fc5d8b672b	[WebAssembly] Fix a sanitizer error in WasmEHPrepare Summary: D77423 started using a dominator tree in WasmEHPrepare, but we deleted BBs in `prepareThrows` before we used the domtree in `prepareEHPads`, and those CFG changes were not reflected in the domtree. This uses `DomTreeUpdater` to make sure we update the domtree every time we delete BBs from the CFG. This fixes ubsan/msan/expensive_check errors caught in LLVM buildbots. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77465	2020-04-04 09:57:07 -07:00
Nikita Popov	4ede730096	[InstCombine] Don't limit uses in eraseInstFromFunction() eraseInstFromFunction() adds the operands of the erased instructions, as those might now be dead as well. However, this is limited to instructions with less than 8 operands. This check doesn't make a lot of sense to me. As the instruction gets removed afterwards, I don't see a potential for anything overly pathological happening here (as we can only add those operands to the worklist once). The impact on CTMark is in the noise. We also have the same code in instruction sinking and don't limit the operand count there. Differential Revision: https://reviews.llvm.org/D77325	2020-04-04 18:37:30 +02:00
Luofan Chen	eec6d87626	[Attributor] Deduce attributes for non-exact functions This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable. See also [this github issue](https://github.com/llvm/llvm-project/issues/172). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76404	2020-04-04 11:34:58 -05:00
Heejin Ahn	2e9839729d	[WebAssembly] Fix wasm.lsda() optimization in WasmEHPrepare Summary: When we insert a call to the personality function wrapper (`_Unwind_CallPersonality`) for a catch pad, we store some necessary info in the `__wasm_lpad_context` struct and pass it. One piece of that info is the LSDA address for the function. For this, we insert a call to `wasm.lsda()`, which will be lowered down to the address of LSDA, and store it in a field in `__wasm_lpad_context`. There are exceptions to this personality call insertion: catchpads for `catch (...)` and cleanuppads (for destructors) don't need personality function calls, because we don't need to figure out whether the current exception should be caught or not. (They always should.) There was a little optimization to `wasm.lsda()` call insertion. Because the LSDA address is the same throughout a function, we don't need to insert a store of `wasm.lsda()` return value in every catchpad. For example: ``` try { foo(); } catch (int) { // wasm.lsda() call and a store are inserted here, like, in // pseudocode, // %lsda = wasm.lsda(); // store %lsda to a field in __wasm_lpad_context try { foo(); } catch (int) { // We don't need to insert the wasm.lsda() and store again, because // to arrive here, we have already stored the LSDA address to // __wasm_lpad_context in the outer catch. } } ``` So the previous algorithm checked whether the current catch had a parent EH pad, and if so, we didn't insert a call to `wasm.lsda()` and its store. But this was incorrect, because what if the outer catch is `catch (...)` or a cleanuppad? ``` try { foo(); } catch (...) { // wasm.lsda() call and a store are NOT inserted here try { foo(); } catch (int) { // We need wasm.lsda() here! } } ``` In this case we need to insert `wasm.lsda()` in the inner catchpad, because the outer catchpad does not have one. 
To minimize the number of inserted `wasm.lsda()` calls and stores, we need a way to figure out whether we have encountered `wasm.lsda()` call in any of EH pads that dominates the current EH pad. To figure that out, we now visit EH pads in BFS order in the dominator tree so that we visit parent BBs first before visiting its child BBs in the domtree. We keep a set named `ExecutedLSDA`, which basically means "Do we have `wasm.lsda()` either in the current EH pad or any of its parent EH pads in the dominator tree?". This is to prevent scanning the domtree up to the root in the worst case every time we examine an EH pad: each EH pad only needs to examine its immediate parent EH pad. - If any of its parent EH pads in the domtree has `wasm.lsda()`, this means we don't need `wasm.lsda()` in the current EH pad. We also insert the current EH pad in `ExecutedLSDA` set. - If none of its parent EH pad has `wasm.lsda()` - If the current EH pad is a `catch (...)` or a cleanuppad, done. - If the current EH pad is neither a `catch (...)` nor a cleanuppad, add `wasm.lsda()` and the store in the current EH pad, and add the current EH pad to `ExecutedLSDA` set. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77423	2020-04-04 07:02:50 -07:00
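The traversal described in this commit message can be sketched roughly as follows. This is a simplified model, not the code in WasmEHPrepare: the `EHPad` struct, its fields, and the `padsNeedingLSDA` function are hypothetical stand-ins for the real dominator-tree walk, and only the set name `ExecutedLSDA` comes from the message itself.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for an EH pad node; not an LLVM type.
struct EHPad {
  int Parent;                // index of the nearest dominating EH pad, or -1
  bool NeedsPersonality;     // false for `catch (...)` and cleanup pads
  std::vector<int> Children; // EH pads immediately dominated by this one
};

// Returns the indices of the pads that need a wasm.lsda() call and store.
std::vector<int> padsNeedingLSDA(const std::vector<EHPad> &Pads) {
  std::set<int> ExecutedLSDA; // pads reached with the LSDA already stored
  std::vector<int> NeedLSDA;
  std::queue<int> Worklist;
  for (int I = 0; I < static_cast<int>(Pads.size()); ++I)
    if (Pads[I].Parent == -1)
      Worklist.push(I);

  // BFS guarantees a parent is classified before any of its children.
  while (!Worklist.empty()) {
    int Cur = Worklist.front();
    Worklist.pop();
    const EHPad &P = Pads[Cur];
    if (P.Parent != -1 && ExecutedLSDA.count(P.Parent)) {
      ExecutedLSDA.insert(Cur); // inherited from a dominating pad
    } else if (P.NeedsPersonality) {
      NeedLSDA.push_back(Cur); // first pad on this path that needs it
      ExecutedLSDA.insert(Cur);
    } // else: catch (...) or cleanup pad with nothing inherited; done.
    for (int C : P.Children) {
      assert(Pads[C].Parent == Cur && "child must point back to its parent");
      Worklist.push(C);
    }
  }
  return NeedLSDA;
}
```

Because each pad consults only its immediate parent's membership in `ExecutedLSDA`, the sketch never walks up to the root, which is the property the message calls out.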
Simon Pilgrim	e5e719d885	[X86][SSE] lowerV8I16Shuffle - lower compaction shuffles using PACKUSDW(PBLENDW,PBLENDW) on SSE41+ Similar to the lowerV16I8Shuffle implementation, for binary compaction v8i16 shuffles we can avoid the PUNPCKLDQ(PSHUFB,PSHUFB) pattern on SSE41+ targets by using PACKUSDW and PBLENDW. Before SSE41 we would need to use PACKSSDW but that requires sign extension that seems to destroy any gains, even on targets without PSHUFB. This is a bigger gain on AMD than Intel targets but should never be a regression, and avoiding the shuffle mask load(s) is always useful. Noticed in codegen while dealing with PR31443.	2020-04-04 13:08:25 +01:00
Nikita Popov	b90ea4f341	[IRBuilder] Move some code into the cpp file; NFC Since D73835 we no longer need to define the whole IRBuilder implementation in the header. This patch moves some of the larger methods out of line, into the C++ file. Differential Revision: https://reviews.llvm.org/D77332	2020-04-04 12:52:56 +02:00
Nikita Popov	6896d559f3	[VNCoercion] Use IRBuilderBase; NFC And remove include from header.	2020-04-04 12:44:50 +02:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
Nikita Popov	a5eb1236e3	[IVDescriptors] Remove unnecessary DemandedBits.h include; NFC Forward declare DemandedBits in IVDescriptors, and move include into the cpp file. Also drop the include from LoopUtils, which does not need it at all.	2020-04-04 12:07:57 +02:00
Craig Topper	1d42c0db9a	Revert "[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets" This reverts commit `c74dd640fd`. Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:56:08 -07:00
Craig Topper	a505ad58cf	Revert "[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI)" This reverts commit `62c42e29ba` Reverting to address coding standard issues raised in post-commit review.	2020-04-03 16:55:53 -07:00
Scott Constable	62c42e29ba	[X86] Add Support for Load Hardening to Mitigate Load Value Injection (LVI) After finding all such gadgets in a given function, the pass minimally inserts LFENCE instructions in such a manner that the following property is satisfied: for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at least one LFENCE instruction. The algorithm that implements this minimal insertion is influenced by an academic paper that minimally inserts memory fences for high-performance concurrent programs: http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf The algorithm implemented in this pass is as follows: 1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components: -SOURCE instructions (also includes function arguments) -SINK instructions -Basic block entry points -Basic block terminators -LFENCE instructions 2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6. 3. Use a heuristic or plugin to approximate minimal LFENCE insertion. 4. Insert one LFENCE along each CFG edge that was cut in step 3. 5. Go to step 2. 6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified. By default, the heuristic used in Step 3 is a greedy heuristic that avoids inserting LFENCEs into loops unless absolutely necessary. There is also a CLI option to load a plugin that can provide even better optimization, inserting fewer fences, while still mitigating all of the LVI gadgets. The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin, and a description of the pass's behavior with the plugin can be found here: https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection. Differential Revision: https://reviews.llvm.org/D75937	2020-04-03 13:45:50 -07:00
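Step 2 of the algorithm above (deciding whether a SOURCE+SINK pair is already mitigated) amounts to a reachability check that refuses to pass through fences. The following is a minimal sketch under stated assumptions: the adjacency-list graph, the `IsFence` flags, and the `isGadgetMitigated` name are all hypothetical, and the actual pass works on its own gadget-graph data structure rather than on plain vectors.

```cpp
#include <cassert>
#include <vector>

// Returns true if every path from Source to Sink passes through a fence node.
// Succs[N] lists the successors of node N; IsFence[N] marks LFENCE nodes.
bool isGadgetMitigated(const std::vector<std::vector<int>> &Succs,
                       const std::vector<bool> &IsFence, int Source,
                       int Sink) {
  assert(Succs.size() == IsFence.size() && "one fence flag per node");
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<int> Stack{Source};
  Visited[Source] = true;
  while (!Stack.empty()) {
    int N = Stack.back();
    Stack.pop_back();
    if (N == Sink)
      return false; // reached the sink without crossing a fence
    for (int S : Succs[N]) {
      if (Visited[S] || IsFence[S])
        continue; // a fence cuts every path that runs through it
      Visited[S] = true;
      Stack.push_back(S);
    }
  }
  return true; // the sink is unreachable once fence nodes are removed
}
```

A gadget for which this returns false would then be handed to the cut-selection heuristic of step 3, which this sketch does not attempt to model.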
Scott Constable	c74dd640fd	[X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK). Also adds a new target feature to X86: +lvi-load-hardening The feature can be added via the clang CLI using -mlvi-hardening. Differential Revision: https://reviews.llvm.org/D75936	2020-04-03 13:02:04 -07:00
Alina Sbirlea	688450c7f0	[GraphDiff] Extend GraphDiff to track a list of updates. Summary: This patch includes two extensions: 1. It extends the GraphDiff to also keep the original list of updates after legalization, not just the deletes/insert vectors. It also provides an API to pop the first update (the updates are stored in reverse, such that the first update is at the end of the list) 2. It adds a bool to mark whether the given updates should be applied as given, or applied in reverse. This moves the task of reversing the updates (when the caller needs this) to a functionality inside GraphDiff, versus having the caller do this. The two changes could be split into two patches, but they seemed reasonably small to be reviewed together. Reviewers: kuhar, dblaikie Subscribers: hiraditya, george.burgess.iv, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77167	2020-04-03 12:10:36 -07:00
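The reverse-storage idea in this commit can be illustrated with a small standalone class. Everything here is hypothetical (`UpdateList`, `Update`, `popFirst`, and the integer node ids are not GraphDiff's API), and the sketch assumes that "applied in reverse" means each insert is treated as a delete and vice versa.

```cpp
#include <cassert>
#include <vector>

enum class UpdateKind { Insert, Delete };

// Hypothetical edge update between two node ids.
struct Update {
  UpdateKind Kind;
  int From, To;
};

class UpdateList {
  std::vector<Update> Reversed; // the first update sits at the back

public:
  UpdateList(const std::vector<Update> &Updates, bool ReverseApply) {
    for (auto It = Updates.rbegin(); It != Updates.rend(); ++It) {
      Update U = *It;
      if (ReverseApply) // assumption: reversing flips insert and delete
        U.Kind = U.Kind == UpdateKind::Insert ? UpdateKind::Delete
                                              : UpdateKind::Insert;
      Reversed.push_back(U);
    }
  }

  bool empty() const { return Reversed.empty(); }

  // Popping the first update is a cheap pop_back thanks to reverse storage.
  Update popFirst() {
    assert(!empty() && "no updates left to pop");
    Update U = Reversed.back();
    Reversed.pop_back();
    return U;
  }
};
```

Storing the list back to front means consuming updates in their original order never shifts the remaining elements.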
Scott Constable	f95a67d8b8	[X86] Add RET-hardening Support to mitigate Load Value Injection (LVI) Adding a pass that replaces every ret instruction with the sequence: pop <scratch-reg> lfence jmp *<scratch-reg> where <scratch-reg> is some available scratch register, according to the calling convention of the function being mitigated. Differential Revision: https://reviews.llvm.org/D75935	2020-04-03 12:08:34 -07:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
Matt Arsenault	178050c3ba	AMDGPU: Use Register in more places	2020-04-03 14:52:54 -04:00
Matt Arsenault	e8dcb6d05e	AMDGPU: Remove redundant virtual	2020-04-03 14:52:53 -04:00
Christopher Tetreault	b600809688	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: kparzysz, sdesmalen, efriedma Reviewed By: kparzysz Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77267	2020-04-03 11:26:51 -07:00
Stanislav Mekhanoshin	0462795095	[AMDGPU] Propagate AGPR RC from PHI to its PHI operands We can fix register class of PHI based on its all AGPR uses. That leaves behind all PHIs which were already processed earlier. Propagate RC back to PHI operands of a PHI. Differential Revision: https://reviews.llvm.org/D77344	2020-04-03 11:23:02 -07:00
Simon Pilgrim	2225797567	[YAMLParser] Scanner::setError - ensure we use the StringRef::iterator argument (PR45043) As detailed on PR45043, static analysis was warning that the StringRef::iterator Position argument was being ignored and the function was hardwired to use the Current iterator. This patch ensures we use the provided iterator and removes the (barely necessary) setError wrapper that always used Current. Differential Revision: https://reviews.llvm.org/D76512	2020-04-03 18:55:38 +01:00
Sanjay Patel	ce97ce3a5d	[VectorCombine] try to form a better extractelement Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms. Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement. But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet). In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 The combination of subsequent instcombine and codegen transforms gets us this improvement: vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm4 vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3] vaddps %xmm0, %xmm2, %xmm0 vaddps %xmm1, %xmm3, %xmm1 vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3] vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2] --> vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm1 vaddps %xmm0, %xmm2, %xmm0 vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3] Differential Revision: https://reviews.llvm.org/D76623	2020-04-03 13:55:13 -04:00
Sylvain Audi	e4ae0a2e97	[Support/Path] sys::path::replace_path_prefix fix and simplifications Added unit tests for 2 scenarios that were failing. Made replace_path_prefix back to 3 parameters instead of 5, simplifying the implementation. The other 2 were always used with the default value. This commit is intended to be the first of 3: 1) simplify/fix replace_path_prefix. 2) use it in the context of -fdebug-prefix-map and -fmacro-prefix-map (see D76869). 3) Make Windows version of replace_path_prefix insensitive to both case and separators (slash vs backslash). Differential Revision: https://reviews.llvm.org/D77223	2020-04-03 13:50:23 -04:00
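For readers unfamiliar with the helper, a three-parameter prefix replacement can be sketched as below. This is a `std::string` stand-in with a hypothetical name, `replacePathPrefix`; it is not the LLVM function, whose parameter types differ, and it does a purely textual comparison with no handling of case or separators.

```cpp
#include <string>

// Replaces OldPrefix at the start of Path with NewPrefix, in place.
// Returns true if a replacement happened.
bool replacePathPrefix(std::string &Path, const std::string &OldPrefix,
                       const std::string &NewPrefix) {
  if (OldPrefix.empty() && NewPrefix.empty())
    return false; // nothing to do
  // compare() clamps the length, so a Path shorter than OldPrefix never matches.
  if (Path.compare(0, OldPrefix.size(), OldPrefix) != 0)
    return false;
  Path.replace(0, OldPrefix.size(), NewPrefix);
  return true;
}
```

Note that a textual match like this also rewrites `/oldish/x` when the old prefix is `/old`; whether that is desirable depends on the caller.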
Simon Pilgrim	34a497b765	[X86][SSE] lowerShuffleWithPACK - extend to use chained PACKs for larger truncations Extend lowerShuffleWithPACK/matchShuffleWithPACK/createPackShuffleMask to handle compaction style shuffle masks that can be lowered to chains of PACKSS/PACKUS if their inputs are suitably sign/zero extended. This helps avoid PSHUFB (and its mask load) for short shuffle chains, shuffle combining will still replace with a PSHUFB if we have enough shuffles as getFauxShuffleMask should recognise the PACKSS/PACKUS chains.	2020-04-03 18:26:10 +01:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
John Brawn	4ad9ca0f9e	[ARM] Fix incorrect handling of big-endian vmov.i64 Currently when the target is big-endian vmov.i64 reverses the order of the two words of the vector. This is correct only when the underlying element type is 32-bit, as actually what it should be doing is considering it a vector of the underlying type and reversing the elements of that. Differential Revision: https://reviews.llvm.org/D76515	2020-04-03 17:36:50 +01:00
John Brawn	cd58fb6325	[ARM] Avoid pointless vrev of element-wise vmov If an element-wise vmov immediate instruction is followed by a vrev with width greater than or equal to the vmov element width, then that vrev won't do anything. Add a DAG combine to convert bitcasts that would become such vrevs into vector_reg_casts instead. Differential Revision: https://reviews.llvm.org/D76514	2020-04-03 17:36:50 +01:00
Matt Arsenault	57a55313c3	InstCombine: Reduce minnum/maxnum if inputs are casted	2020-04-03 11:57:25 -04:00
jasonliu	d65557d15d	[NFC][XCOFF][AIX] Refactor get/setContainingCsect Summary: In the current architecture, we always require setContainingCsect to be called on every MCSymbol used in an XCOFF context. This is very hard to achieve because symbols get created everywhere and other MCSymbol types (ELF, COFF) do not have similar rules. It's very easy to miss setting the containing csect, and we would need to add a lot of XCOFF-specialized code around some common code areas. This patch intends to: 1. Rely on getFragment().getParent() to get the csect from labels. 2. Only use get/setRepresentedCsect (was get/setContainingCsect) if the symbol itself represents a csect. Reviewers: DiggerLin, hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D77080	2020-04-03 13:33:12 +00:00
Guillaume Chatelet	9068bccbae	[Alignment][NFC] Deprecate InstrTypes getRetAlignment/getParamAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77312	2020-04-03 13:21:58 +00:00
Guillaume Chatelet	1a584a8d50	[Alignment][NFC] Remove unused private functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77297	2020-04-03 09:16:20 +00:00
Guillaume Chatelet	ca11c480e7	[Alignment][NFC] Convert MachineIRBuilder::buildDynStackAlloc to Align Summary: The change in IRTranslator is not trivial but is NFC as far as I can tell. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77292	2020-04-03 09:05:19 +00:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Guillaume Chatelet	9f5c786876	[NFC] G_DYN_STACKALLOC realign iff align > 1, update documentation Summary: I think it would be better to require the alignment to be >= 1. It is currently confusing to allow both values. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77372	2020-04-03 08:12:39 +00:00
scentini	6825920b18	Silence -Wpessimizing-move warning	2020-04-03 09:37:39 +02:00
Scott Constable	5b519cf1fc	[X86] Add Indirect Thunk Support to X86 to mitigate Load Value Injection (LVI) This pass replaces each indirect call/jump with a direct call to a thunk that looks like: lfence jmpq *%r11 This ensures that if the value in register %r11 was loaded from memory, then the value in %r11 is (architecturally) correct prior to the jump. Also adds a new target feature to X86: +lvi-cfi ("cfi" meaning control-flow integrity) The feature can be added via clang CLI using -mlvi-cfi. This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code. Differential Revision: https://reviews.llvm.org/D76812	2020-04-03 00:34:39 -07:00
scentini	0a3845b70f	Silence -Wpessimizing-move warning	2020-04-03 09:24:26 +02:00
Igor Kudrin	f13ce15d44	[DebugInfo] Rename getOffset() to getContribution(). NFC. The old name was a bit misleading because the functions actually return contributions to the corresponding sections. Differential revision: https://reviews.llvm.org/D77302	2020-04-03 14:15:53 +07:00
Sourabh Singh Tomar	69c8fb1c65	[DWARF5] Added support for debug_macro section parsing and dumping in llvm-dwarfdump. Summary: This patch adds parsing and dumping DWARFv5 .debug_macro section in llvm-dwarfdump, it does not introduce any new switch. Existing switch "--debug-macro" should be used to dump macinfo or macro section. Reviewed By: dblaikie, ikudrin, jhenderson Differential Revision: https://reviews.llvm.org/D73086	2020-04-03 12:23:51 +05:30
Serguei Katkov	bd1d70bf0e	[DAG] Change isGCValue detection for statepoint lowering isGCValue should detect whether the deopt value is a GC pointer. Currently it checks by finding the value in SI.Bases and SI.Ptrs. However these data structures contain only those values which have corresponding gc.relocate call. So we can miss GC value if it does not have gc.relocate call (dead after the call). Check GC strategy whether pointer is GC one or consider any pointer to be GC one conservatively. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D77130	2020-04-03 12:36:13 +07:00
Scott Constable	b1d581019f	[X86] Refactor X86IndirectThunks.cpp to Accommodate Mitigations other than Retpoline Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection. Differential Revision: https://reviews.llvm.org/D76811	2020-04-02 22:09:54 -07:00
Scott Constable	71e8021d82	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810	2020-04-02 21:55:13 -07:00
Hongtao Yu	88da019977	Fix a bug in the inliner that causes subsequent double inlining Summary: A recent change in the instruction simplifier enables a call to a function that just returns one of its parameters to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause a compiler ICE when an already-inlined callsite is reused for further inlining. To put it simply, in a C program like the following, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, and therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges. ``` void top() { int t = first(); second(t); } void second(int t) { t = third(t); fourth(t); } int third(int t) { return t; } ``` The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up. 
We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e., which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of inlined callsites is a bit different, but in theory the issue could happen there too. Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification. Reviewers: wenlei, davidxl, tejohnson Reviewed By: wenlei, davidxl Subscribers: eraman, nikic, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76248	2020-04-02 21:08:05 -07:00
Xiang1 Zhang	43f031d312	Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology) Summary: This patch comes from H.J.'s `2bd54ce7fa`. It fixes the LLVM unit tests that fail when run on a CET machine (e.g. ExecutionEngine/MCJIT/MCJITTests). The main reason we enable IBT when the JIT is compiled with CET is that the JIT does not know whether its caller program is CET-enabled. If the JIT's caller program is non-CET, it does not matter whether the JIT generates CET code. But if the JIT's caller program is CET-enabled, the JIT must generate CET code or it will cause control protection exceptions. I have tested the patch with llvm-unit-test and llvm-test-suite on a CET machine, and it passed. H.J. also tested it by building and running VNCserver (Virtual Network Console), and it works too. (Without this patch, VNCserver crashes on a CET machine.) Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei Subscribers: tstellar, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76900	2020-04-03 11:44:07 +08:00
Jessica Paquette	71947ed927	[AArch64][GlobalISel] Constrain reg operands in selectBrJT This was causing a machine verifier failure on the test suite. Make sure that we don't end up with a weird register class here. Failure for reference: * Bad machine code: Illegal virtual register for instruction * - function: check_constrain - basic block: %bb.1 (0x7f8b70839f80) - instruction: early-clobber %6:gpr64, early-clobber %7:gpr64sp = JumpTableDest32 %5:gpr64, %1:gpr64sp, %jump-table.0 - operand 3: %1:gpr64sp Expected a GPR64 register, but got a GPR64sp register Differential Revision: https://reviews.llvm.org/D77349	2020-04-02 20:34:11 -07:00
Wenju He	fe8ac0fe51	[x86] Fix Intel OpenCL builtin CalleeSavedRegs on skx Summary: Align with AVX512 builtins implementations, some of which don't preserve rdi. Reviewers: yubing, tianqing, craig.topper Reviewed By: craig.topper Subscribers: yaxunl, Anastasia, hiraditya Differential Revision: https://reviews.llvm.org/D77032	2020-04-03 11:27:40 +08:00
Qiu Chaofan	71f1ab5354	[PowerPC] Remove unnecessary XSRSP instruction MI peephole will remove unnecessary FRSP instructions. This patch removes such unnecessary XSRSP. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77208	2020-04-03 11:05:14 +08:00
Jun Ma	9c6f32a0ff	[Coroutines] Simplify implementation using removePredecessor Differential Revision: https://reviews.llvm.org/D77035	2020-04-03 09:20:07 +08:00
Austin Kerbow	30f18ed387	[AMDGPU] Handle SMRD signed offset immediate Summary: This fixes a few issues related to SMRD offsets. On gfx9 and gfx10 we have a signed byte offset immediate, however we can overflow into a negative since we treat it as unsigned. Also, the SMRD SOFFSET sgpr is an unsigned offset on all subtargets. We sometimes tried to use negative values here. Third, S_BUFFER instructions should never use a signed offset immediate. Differential Revision: https://reviews.llvm.org/D77082	2020-04-02 17:41:52 -07:00
Adrian Prantl	93fe58c9cf	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.label intrinsic. Debug info for labels is not generated at -gline-tables-only, so this pass should remove them. Differential Revision: https://reviews.llvm.org/D77345	2020-04-02 17:39:33 -07:00
Adrian Prantl	c024f3ebdc	Teach the stripNonLineTableDebugInfo pass about the llvm.dbg.addr intrinsic. This patch also strips llvm.dbg.addr intrinsics when downgrading debug info to linetables-only. Differential Revision: https://reviews.llvm.org/D77343	2020-04-02 17:39:33 -07:00
Lang Hames	05598441de	Re-apply `0071eaaf08`, "[ORC] Export __cxa_atexit ...", with fixes. Forgot to include part of the testcase. Thanks to Nico for spotting that and reverting!	2020-04-02 16:03:35 -07:00
Matt Arsenault	f68cc2a7ed	AMDGPU: Use 128-bit DS operations by default	2020-04-02 17:17:47 -04:00
Matt Arsenault	5660bb6bc9	AMDGPU: Remove denormal subtarget features Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.	2020-04-02 17:17:12 -04:00
Matt Arsenault	75cf30918f	AMDGPU: Assume f32 denormals are enabled by default This will likely introduce catastrophic performance regressions on older subtargets, but should be correct. A follow up change will remove the old fp32-denormals subtarget features, and switch to using the new denormal-fp-math/denormal-fp-math-f32 attributes. Frontends should be making sure to add the denormal-fp-math-f32 attribute when appropriate to avoid performance regressions.	2020-04-02 17:17:12 -04:00
Cyndy Ishida	fd4d07517b	[llvm][TextAPI] adding inlining reexported libraries support Summary: [llvm][TextAPI] adding inlining reexported libraries support * this patch adds reader/writer support for MachO tbd files. The usecase is to represent reexported libraries in top level library that won't need to exist for linker indirection because all of the needed content will be inlined in the same document. Reviewers: ributzka, steven_wu, jhenderson Reviewed By: ributzka Subscribers: JDevlieghere, hiraditya, mgrang, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67646	2020-04-02 13:05:08 -07:00
Craig Topper	4fdb63bbf0	[X86] Enable combineExtSetcc for vectors larger than 256 bits when we've disabled 512 bit vectors. The compares are going to be type legalized to 256 bits so we might as well fold the extend.	2020-04-02 12:44:27 -07:00
Anna Thomas	bf7a16a768	[InlineFunction] Update valid return attributes at callsite within callee body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate valid attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. Also, this is valid only for attributes which are a property of a callsite and not those that are not dependent on the ABI, or a property of the call itself. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-04-02 14:13:12 -04:00
Matt Arsenault	c3d3c22a58	AMDGPU: Hack out noinline on functions using LDS globals This is a workaround for clang adding noinline to all functions at -O0. Previously, we would just add alwaysinline, and the verifier would complain about having both noinline and alwaysinline. We currently can't truly codegen this case as a freestanding function, so override the user forcing noinline.	2020-04-02 14:12:07 -04:00
Sanjay Patel	f4448063cc	[InstCombine] try to reduce shuffle with bitcasted operand shuf (bitcast X), undef, Mask --> bitcast X' The 'inverse shuffles' test (shuf_bitcast_operand) is a pattern in the motivating examples from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 (see also D76727) We can deal with this class of patterns in generic instcombine because we are not creating any new shuffles, just a bitcast. Alive2 proof: http://volta.cs.utah.edu:8080/z/mwDUZf Differential Revision: https://reviews.llvm.org/D76844	2020-04-02 13:44:50 -04:00
Sanjay Patel	b6050ca181	[VectorCombine] transform bitcasted shuffle to narrower elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' We do not attempt this in InstCombine because we do not want to change types and create new shuffle ops that are potentially not lowered as well as the original code. Here, we can check the cost model to see if it is worthwhile. I've aggressively enabled this transform even if the types are the same size and/or equal cost because moving the bitcast allows InstCombine to make further simplifications. In the motivating cases from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 ...this is enough to let instcombine and the backend eliminate the redundant shuffles, but we probably want to extend VectorCombine to handle the inverse pattern (shuffle-of-bitcast) to get that simplification directly in IR. Differential Revision: https://reviews.llvm.org/D76727	2020-04-02 13:30:22 -04:00
Stanislav Mekhanoshin	f2334a7ef2	[AMDGPU] Fix crash in SILoadStoreOptimizer SILoadStoreOptimizer::checkAndPrepareMerge() expects the base and paired instructions to come in order and scans the MBB from the base to the paired instruction. The original order can be changed if there was a dependent instruction in between and the base instruction was moved. Fixed by bailing out of the optimization. In theory it might still be possible to perform a merge by swapping instructions, but in practice it bails anyway because it finds a dependency on that same instruction which has resulted in the base move. Differential Revision: https://reviews.llvm.org/D77245	2020-04-02 10:26:47 -07:00
Sanjay Patel	12fcbcecff	[InstCombine] add tests for cmyk benchmark; NFC These are versions of a function that regressed with: rGf2fbdf76d8d0 That particular problem occurs with an instcombine-simplifycfg-instcombine sequence, but we can show that it exists within instcombine only with other variations of the pattern.	2020-04-02 13:00:46 -04:00
Benjamin Kramer	de8831934a	[LoopDataPrefetch] Remove unused include that's a layering violation	2020-04-02 17:46:10 +02:00
Benjamin Kramer	dffc503187	Revert "[SimplifyLibCalls] Erase replaced instructions" This reverts commit `2a77544ad5`. This introduces a use-after-free in Transforms/InstCombine/sincospi.ll. Found by asan.	2020-04-02 17:30:47 +02:00
Jonas Paulsson	7e02da7db5	[SystemZ] Add isCommutable flag on vector instructions. This does not change much in code generation, but in rare cases MachineCSE can figure out that an instruction is redundant after commuting it. Review: Ulrich Weigand	2020-04-02 16:06:15 +02:00
Sanjay Patel	1008435f3d	Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold" This reverts commit `f2fbdf76d8`. As noted in the post-commit thread: https://reviews.llvm.org/rGf2fbdf76d8d0 ...this can obscure a min/max pattern where the components have extra uses. We can show that the problem is independent of this change with a slightly modified source example, so this revert just delays/reduces the need to fix the real problem. We need to improve our analysis of negation or -- more generally -- subtraction using patches like D77230 or D68408.	2020-04-02 09:15:23 -04:00
Tyker	c00cb76274	[NFC] Split Knowledge retention and place it more appropriatly Summary: Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils allows Queries and Transform/Utils to use Analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77171	2020-04-02 15:01:41 +02:00
Jonas Paulsson	36d4421f50	[LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop. This patch adds - New arguments to getMinPrefetchStride() to let the target decide on a per-loop basis if software prefetching should be done even with a stride within the limit of the hw prefetcher. - New TTI hook enableWritePrefetching() to let a target do write prefetching by default (defaults to false). - In LoopDataPrefetch: - A search through the whole loop to gather information before emitting any prefetches. This way the target can get information via new arguments to getMinPrefetchStride() and emit prefetches more selectively. Collected information includes: Does the loop have a call, how many memory accesses, how many of them are strided, how many prefetches will cover them. This is NFC to before as long as the target does not change its definition of getMinPrefetchStride(). - If a previous access to the same exact address was 'read', and the current one is 'write', make it a 'write' prefetch. - If two accesses that are covered by the same prefetch do not dominate each other, put the prefetch in a block that dominates both of them. - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop. - A SystemZ implementation of getMinPrefetchStride(). Review: Ulrich Weigand, Michael Kruse Differential Revision: https://reviews.llvm.org/D70228	2020-04-02 14:57:46 +02:00
Simon Pilgrim	b02c7a8152	Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI. The shift amount is never more than 31, so the warning is a false positive, but the change is safe and fixes Werror builds.	2020-04-02 12:02:04 +01:00
David Green	fbd53ffc3a	[ARM] MVE VMULL patterns This adds MVE vmull patterns, which are conceptually the same as mul(vmovl, vmovl), and so the tablegen patterns follow the same structure. For i8 and i16 this is simple enough, but in the i32 version the multiply (in 64bits) is illegal, meaning we need to catch the pattern earlier in a dag fold. Because bitcasts are involved in the zext versions, the patterns are a little different in little and big endian, and I have only added little endian support in this patch. Differential Revision: https://reviews.llvm.org/D76740	2020-04-02 10:57:40 +01:00
David Green	c697dd9ffd	[ARM] Make remaining MVE instruction predictable The unpredictable/hasSideEffects flag is usually inferred by tablegen from whether the instruction has a tablegen pattern (and that pattern only has a single output instruction). Now that the MVE intrinsics are all committed and producing code, the remaining instructions still marked as unpredictable need to be specially handled. This adds the flag directly to instructions that need it, notably the V*MLAL instructions and some of the MOV's. Differential Revision: https://reviews.llvm.org/D76910	2020-04-02 10:57:40 +01:00
Guillaume Chatelet	96cae168fa	[NFC] Preparatory work for D77292	2020-04-02 09:30:33 +00:00
Clement Courbet	fb4aa30f27	[ExpandMemCmp] Allow overlaping loads in the zero-relational case. Summary: This allows doing `memcmp(p, q, 7)` with 2 loads instead of a call to memcmp. This fixes part of PR45147. Reviewers: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76133	2020-04-02 11:20:47 +02:00
Florian Hahn	a63b5c9e53	[CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs. As pointed out by @thakis, currently CallSiteSplitting bails out after checking the first PHI node. We should check all PHI nodes, until we find one where call site splitting is beneficial. This patch also slightly simplifies the code using BasicBlock::phis(). Reviewers: davidxl, junbuml, thakis Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D77089	2020-04-02 10:11:27 +01:00
Guillaume Chatelet	189d2e215f	[Alignment][NFC] Use more Align versions of various functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, arsenm, sdardis, jvesely, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77291	2020-04-02 09:00:53 +00:00
OCHyams	550ab58bc1	[NFC] Fix performance issue in LiveDebugVariables When compiling AMDGPUDisassembler.cpp in a stage 1 trunk build with CMAKE_BUILD_TYPE=RelWithDebInfo LLVM_USE_SANITIZER=Address LiveDebugVariables accounts for 21.5% wall clock time. This fix reduces that to 1.2% by switching out a linked list lookup with a map lookup. Note that the linked list is still used to group UserValues by vreg. The vreg lookups don't cause any problems in this pathological case. This is the same idea as D68816, which was reverted, except that it is a less intrusive fix. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D77226	2020-04-02 09:39:33 +01:00
Djordje Todorovic	29d253c4c6	[Object] Add the method for checking if a section is a debug section Different file formats have different naming style for the debug sections. The method is implemented for ELF, COFF and Mach-O formats. Differential Revision: https://reviews.llvm.org/D76276	2020-04-02 10:56:00 +02:00
WangTianQing	d08fadd662	[X86] Add SERIALIZE instruction. Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193	2020-04-02 16:19:23 +08:00
Shengchen Kan	9f92d4612f	Revert "[NFC][X86] Refine code in X86AsmBackend" This reverts commit `a157cde0ac`.	2020-04-02 15:57:06 +08:00
Shengchen Kan	a157cde0ac	[NFC][X86] Refine code in X86AsmBackend Replace pattern getContents().size with universe function call	2020-04-02 15:41:10 +08:00
Johannes Doerfert	1858f4b50d	Revert "[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP`" This reverts commit `c18d55998b`. Bots have reported uses that need changing, e.g., clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp as reported by http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591	2020-04-02 02:23:22 -05:00
Johannes Doerfert	c18d55998b	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Differential Revision: https://reviews.llvm.org/D77112	2020-04-02 01:39:07 -05:00
Fangrui Song	cbd3969e8c	[PPCInstPrinter] Delete an unneeded overload of printBranchOperand. NFC It was added by D76591 for migration purposes (not all printBranchOperand users have migrated to the overload with `uint64_t Address`). Now that all have been migrated, the parameter can go away.	2020-04-01 22:45:25 -07:00
Fangrui Song	85adce3d73	[PPCInstPrinter] Change B to print the target address in hexadecimal form Follow-up of D76591 and D76907	2020-04-01 22:38:24 -07:00
Johannes Doerfert	bcd8009369	[Attributor] Use the proper context instruction in genericValueTraversal There was a TODO in genericValueTraversal to provide the context instruction and due to the lack of it users that wanted one just used something available. Unfortunately, using a fixed instruction is wrong in the presence of PHIs so we need to update the context instruction properly. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D76870	2020-04-01 22:20:47 -05:00
Johannes Doerfert	ac96c8fd85	[Attributor][FIX] Do not compute ranges for arguments of declarations This cannot be triggered right now, as far as I know, but it doesn't make sense to deduce a constant range on arguments of declarations. Exposed during testing of AAValueSimplify extensions.	2020-04-01 22:05:30 -05:00
Johannes Doerfert	54d6a608bf	[Attributor][NFC] Predetermine the module It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-04-01 21:56:17 -05:00
Johannes Doerfert	9e19693994	[Attributor] Derive better alignment for accessed pointers Use DL & ABI information for better alignment deduction, e.g., if a type is accessed and the ABI specifies an alignment requirement for such an access we can use it. This is based on a patch by @lebedev.ri and inspired by getBaseAlign in Loads.cpp. Depends on D76673. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76674	2020-04-01 21:49:57 -05:00
Nico Weber	5bac8d427d	Revert "[ORC] Export __cxa_atexit from the main JITDylib in LLJIT." This reverts commit `0071eaaf08`. Inputs/noop-main.ll wasn't checked in, so this breaks check-llvm everywhere.	2020-04-01 22:49:38 -04:00
Johannes Doerfert	b1c788d051	[Attributor][FIX] Prevent alignment breakage wrt. must-tail calls If we have a must-tail call the callee and caller need to have matching ABIs. Part of that is alignment which we might modify when we deduce alignment of arguments of either. Since we would need to keep them in sync, which is not as simple, we simply avoid deducing alignment for arguments of the must-tail caller or callee. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76673	2020-04-01 21:40:07 -05:00
Lang Hames	0071eaaf08	[ORC] Export __cxa_atexit from the main JITDylib in LLJIT. Failure to export __cxa_atexit can lead to an attempt to import a definition from the process itself (if __cxa_atexit is referenced from another JITDylib), but the process definition will clash with the existing non-exported definition to produce an unexpected DuplicateDefinitionError. This patch fixes the immediate issue by exporting __cxa_atexit. It also fixes a bug where atexit functions in other JITDylibs were not being run by adding a copy of run_atexits_helper to every JITDylib. A follow up patch will deal with the bug where definition generators are called despite a non-exported definition being present.	2020-04-01 19:12:08 -07:00
Johannes Doerfert	41f2a57d0b	[Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s We create a lot of AbstractAttributes and they live as long as the Attributor does. It seems reasonable to allocate them via a BumpPtrAllocator owned by the Attributor. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76589	2020-04-01 20:53:28 -05:00
Sam Clegg	296ccef703	[WebAssembly] EmscriptenEHSjLj: Mark __invoke_ functions as imported This means the linker will expect them to be undefined at link time and will generate imports from the `env` module rather than reporting undefined externals. Differential Revision: https://reviews.llvm.org/D77192	2020-04-01 16:33:33 -07:00
Daniel Sanders	e65e677ee4	[globalisel][legalizer] Fix DebugLoc bugs caught by a prototype lost-location verifier The legalizer has a tendency to lose DebugLoc's when expanding or combining instructions. The verifier that detected these isn't ready for upstreaming yet but this patch fixes the cases that came up when applying it to our out-of-tree backend's CodeGen tests. This pattern comes up a few more times in this file and probably in the backends too but I'd prefer to fix the others separately (and preferably when the lost-location verifier detects them).	2020-04-01 12:50:18 -07:00
Lang Hames	8e5a8f620c	[ORC] Don't require a null-terminator on MemoryBuffers for objects in archives. The MemoryBuffer::getMemBuffer method's RequiresNullTerminator parameter defaults to true, but object files are not null terminated so we need to explicitly pass false here.	2020-04-01 12:16:38 -07:00
Sanjay Patel	3d90048791	[InstCombine] enhance freelyNegateValue() by handling xor Negation is equivalent to bitwise-not + 1, so try to convert more subtracts into adds using this relationship: 0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => (A ^ ~C) + 1 I doubt this will recover the regression noted in rGf2fbdf76d8d0, but seems like we're going to need to improve here and/or revive D68408? Alive2 proofs: http://volta.cs.utah.edu:8080/z/Re5tMU http://volta.cs.utah.edu:8080/z/An-uns Differential Revision: https://reviews.llvm.org/D77230	2020-04-01 15:05:13 -04:00
Jonathan Roelofs	1148f004fa	Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping find() was altering the UserChain, even in cases where it subsequently discovered that the resulting constant was a 0. This confuses rebuildWithoutConstOffset() when it attempts to walk the chain later, since it is expected that the chain itself be a path down the use-def edges of an expression.	2020-04-01 12:38:15 -06:00
Nikita Popov	50a3e8738a	Revert "[InstCombine] Erase old instruction when replacing extractelements" This reverts commit `d40368fdb5`. llvm-clang-x86_64-expensive-checks-debian failure looks related.	2020-04-01 20:10:11 +02:00
Nikita Popov	2a77544ad5	[SimplifyLibCalls] Erase replaced instructions After RAUWing an instruction, also erase it. This makes sure we don't perform extra InstCombine iterations to clean up the garbage.	2020-04-01 20:00:10 +02:00
Uday Bondhugula	6ee11c3b0f	[NewGVN] Make NewGVN aware of aligned_alloc Make the New GVN pass aware of aligned_alloc. Depends on D76975. Differential Revision: https://reviews.llvm.org/D76976	2020-04-01 23:26:51 +05:30
Uday Bondhugula	4cf70af94f	[GVN] Make GVN aware of aligned_alloc Make the GVN pass aware of aligned_alloc. Depends on D76974. Differential Revision: https://reviews.llvm.org/D76975	2020-04-01 23:26:50 +05:30
Uday Bondhugula	c4499e3333	[Attributor] Make attributor aware of aligned_alloc for heap to stack conversion Make the attributor pass aware of aligned_alloc for converting heap allocations to stack ones. Depends on D76971. Differential Revision: https://reviews.llvm.org/D76974	2020-04-01 23:26:50 +05:30
Nikita Popov	d40368fdb5	[InstCombine] Erase old instruction when replacing extractelements As we are not returning the result of replaceInstUsesWith(), we need to clean up ourselves. NFC apart from worklist order.	2020-04-01 19:55:28 +02:00
Nikita Popov	4b35c816ef	[InstCombine] Use replaceOperand() in div transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-04-01 19:55:00 +02:00
Matt Arsenault	5e4e8d0388	AMDGPU/GlobalISel: Change intrinsic ID for _L to _LZ opt Still should handle the other cases that change the opcode this way.	2020-04-01 13:03:02 -04:00
Heejin Ahn	c87b5e7e22	[WebAssembly] Fix subregion relationship in CFGSort Summary: The previous code for determining the innermost region in CFGSort was not correct. We determine subregion relationship by domination of their headers, i.e., if region A's header dominates region B's header, B is a subregion of A. Previously we assumed that if a BB belongs to both a loop and an exception, the region with fewer number of BBs is the innermost one. This may not be true, because while WebAssemblyException contains BBs in all its subregions (loops or exceptions), MachineLoop may not, because MachineLoop does not contain BBs that don't have a path to its header even if they are dominated by its header. For example, consider a CFG in which the loop header branches to an exception header; the exception header branches to A and to B; B continues to C; and A continues to the loop latch, which branches back to the loop header. In this CFG, the loop does not contain B and C, because they don't have a path back to the loop's header. But for CFGSort we consider the exception here belongs to the loop and the exception should be a subregion of the loop and scheduled together. So here we should use `WE->contains(ML->getHeader())` (but not `ML->contains(WE->getHeader())`, for the stated region above). This also fixes some comments and deletes `Regions` vector in `RegionInfo` class, which was not used anywhere. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77181	2020-04-01 08:12:41 -07:00
Jessica Clarke	616289ed29	[LegalizeTypes][RISCV] Correctly sign-extend comparison for ATOMIC_CMP_XCHG Summary: Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised with GetPromotedInteger, which leaves the upper bits of the value undefined. Since this is used for comparing in an LR/SC loop with a full-width comparison, we must sign extend it. We introduce a new getExtendForAtomicCmpSwapArg to complement getExtendForAtomicOps, since many targets have compare-and-swap instructions (or pseudos) that correctly handle an any-extend input, and the existing function determines the extension of the result, whereas we are concerned with the input. This is related to https://reviews.llvm.org/D58829, which solved the issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler ATOMIC_CMP_SWAP. Reviewers: asb, lenary, efriedma Reviewed By: asb Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74453	2020-04-01 15:51:26 +01:00
Guillaume Chatelet	fc63c4d8ce	[Alignment][NFC] Remove remaining uses of MachineFrameInfo::setObjectAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77217	2020-04-01 14:38:05 +00:00
Simon Pilgrim	eb8880562e	[X86][SSE] combinePTESTCC - fold TESTZ(X,~Y) -> TESTC(Y,X)	2020-04-01 15:10:53 +01:00
Guillaume Chatelet	1dffa2550b	[Alignment][NFC] Transition to MachineFrameInfo::getObjectAlign() Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77215	2020-04-01 14:08:28 +00:00
Kai Wang	501522b5b2	[RISCV] Support RISC-V ELF attributes sections in llvm-readobj. Enable llvm-readobj to handle RISC-V ELF attribute sections. Differential Revision: https://reviews.llvm.org/D75833	2020-04-01 21:50:11 +08:00
Simon Pilgrim	be7a233e93	Fix operator precedence warning. NFCI.	2020-04-01 14:36:52 +01:00
Simon Pilgrim	552e46ea1e	Fix unused variable warnings. NFCI.	2020-04-01 14:36:51 +01:00
Benjamin Kramer	b605c56b0f	[ARM] Silence warning in Release builds llvm/lib/Target/ARM/MVEVPTBlockPass.cpp:175:37: error: unused variable 'BlockBeg' [-Werror,-Wunused-variable] MachineBasicBlock::instr_iterator BlockBeg = Iter; ^	2020-04-01 15:29:19 +02:00
Guillaume Chatelet	3a78f44daf	[Alignment][NFC] Convert SelectionDAG::InferPtrAlignment to MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77212	2020-04-01 13:22:11 +00:00
Simon Pilgrim	481413d394	[X86][SSE] matchShuffleWithPACK - generalize zero/signbits matching for any packed src type First step toward making use of canLowerByDroppingEvenElements to match chains of PACKSS/PACKUS for compaction shuffles. At the moment we still only match a single stage but the MatchPACK is now more general.	2020-04-01 14:10:32 +01:00
shchenz	e344f8b9db	Revert "[LSR] re-add testcase for wrongly phi node elimination - NFC" This reverts commit `f25a1b4f58`. ARM and Hexagon fail on the newly added case.	2020-04-01 12:58:06 +00:00
Guillaume Chatelet	bf573bea19	[Alignment][NFC] Convert MIR Yaml to MaybeAlign Summary: Although it may look like a non-NFC change, it is NFC. In particular, the MIRParser may set `0` to the MachineFrameInfo and MachineFunction, but they all deal with `Align` internally and assume that `0` means `1`. `93fc0ba145/llvm/include/llvm/CodeGen/MachineFrameInfo.h (L483)` This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77203	2020-04-01 12:26:31 +00:00
Pierre-vh	2effe8f5e7	[Target][ARM] Improvements to the VPT Block Insertion Pass This allows the MVE VPT Block insertion pass to remove VPNOTs in order to create more complex VPT blocks such as TE, TEET, TETE, etc. Differential Revision: https://reviews.llvm.org/D75993	2020-04-01 12:34:20 +01:00
Pierre-vh	dad848280d	[Target][ARM] Change VPTMaskValues to the correct encoding VPTMaskValue was using the "instruction" encoding to represent the masks (= the same encoding as the one used by the instructions in an object file), but it is only used to build MCOperands, so it should use the MCOperand encoding of the masks, which is slightly different. Differential Revision: https://reviews.llvm.org/D76139	2020-04-01 12:34:20 +01:00
Benjamin Kramer	66b9f5f7f0	[GVNSink] Simplify code. NFC.	2020-04-01 13:13:00 +02:00
shchenz	f25a1b4f58	[LSR] re-add testcase for wrongly phi node elimination - NFC Retest the case on X86/SystemZ/AArch64/PowerPC	2020-04-01 11:11:17 +00:00

133217 Commits