llvm-project

Commit Graph

Author	SHA1	Message	Date
Stefan Stipanovic	5a9ba27c71	Revert "Fixing build error from commit 9285295." This reverts commit `95cbc3da88`. llvm-svn: 366759	2019-07-22 22:55:05 +00:00
Peter Collingbourne	710605c085	Analysis: Don't look through aliases when simplifying GEPs. It is not safe in general to replace an alias in a GEP with its aliasee if the alias can be replaced with another definition (i.e. via strong/weak resolution (linkonce_odr) or via symbol interposition (default visibility in ELF)) while the aliasee cannot. An example of how this can go wrong is in the included test case. I was concerned that this might be a load-bearing misoptimization (it's possible for us to use aliases to share vtables between base and derived classes, and on Windows, vtable symbols will always be aliases in RTTI mode, so this change could theoretically inhibit trivial devirtualization in some cases), so I built Chromium for Linux and Windows with and without this change. The file sizes of the resulting binaries were identical, so it doesn't look like this is going to be a problem. Differential Revision: https://reviews.llvm.org/D65118 llvm-svn: 366754	2019-07-22 22:13:46 +00:00
Stefan Stipanovic	95cbc3da88	Fixing build error from commit `9285295`. [Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366753	2019-07-22 22:10:59 +00:00
Roman Lebedev	3a94765bfc	[NFC][PatternMatch] Refactor code into a proper "matcher for any integral constant" Having it as a proper matcher is better for reusability elsewhere (in a follow-up patch.) llvm-svn: 366752	2019-07-22 22:09:24 +00:00
Matt Arsenault	827427f65b	AMDGPU: Don't use SDNodeXForm for DS offset output The xform has no real value when it's using out of a complex pattern output. The complex pattern was already creating TargetConstants with i16, so this was just unnecessary machinery. This allows global isel to import the simple cases once the complex pattern is implemented. llvm-svn: 366743	2019-07-22 21:38:11 +00:00
Eric Christopher	77dc6d2479	Temporarily Revert "[Attributor] Liveness analysis." as it's breaking the build. This reverts commit `9285295f75`. llvm-svn: 366737	2019-07-22 21:04:23 +00:00
Stefan Stipanovic	9285295f75	[Attributor] Liveness analysis. Liveness analysis abstract attribute used to indicate which BasicBlocks are dead and can therefore be ignored. Right now we are only looking at noreturn calls. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D64162 llvm-svn: 366736	2019-07-22 20:54:30 +00:00
Craig Topper	510e6fadaa	[X86] When using AND+PACKUS in lowerV16I8Shuffle, generate the build vector directly in v16i8 with the correct 0x00 or 0xFF elements rather than using another VT and bitcasting it. The build_vector will become a constant pool load. By using the desired type initially, it ensures we don't generate a bitcast of the constant pool load which will need to be folded with the load. While experimenting with another patch, I noticed that when the load type and the constant pool type don't match, then SimplifyDemandedBits can't handle it. While we should probably fix that, this was a simple way to fix the issue I saw. llvm-svn: 366732	2019-07-22 19:58:49 +00:00
Jason Liu	8dd563ef4b	[NFC][PowerPC]Change ADDIStocHA to ADDIStocHA8 to follow 64-bit naming convention Summary: Since we are planning to add ADDIStocHA for 32bit in later patch, we decided to change 64bit one first to follow naming convention with 8 behind opcode. Patch by: Xiangling_L Differential Revision: https://reviews.llvm.org/D64814 llvm-svn: 366731	2019-07-22 19:55:33 +00:00
Stefan Stipanovic	69ebb02001	[Attributor] NoAlias on return values. Porting function return value attribute noalias to attributor. This will be followed with a patch for callsite and function arguments. Reviewers: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63067 llvm-svn: 366728	2019-07-22 19:36:27 +00:00
Sean Fertile	942537d9fa	Stubs out TLOF for AIX and add support for common vars in assembly output. Stubs out a TargetLoweringObjectFileXCOFF class, implementing only SelectSectionForGlobal for common symbols. Also adds an override of EmitGlobalVariable in PPCAIXAsmPrinter which adds a number of defensive errors and adds support for emitting common globals. llvm-svn: 366727	2019-07-22 19:15:29 +00:00
Petr Hosek	f6cd6ffbc9	[SafeStack] Insert the deref after the offset While debugging code that uses SafeStack, we've noticed that LLVM produces invalid DWARF. Concretely, in the following example: int main(int argc, char* argv[]) { std::string value = ""; printf("%s\n", value.c_str()); return 0; } DWARF would describe the value variable as being located at: DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus The assembly to get this variable is: leaq -32(%r14), %rbx The order of operations in the DWARF symbols is incorrect in this case. Specifically, the deref is incorrect; this appears to be incorrectly re-inserted in replaceOneDbgValueForAlloca. With this change which inserts the deref after the offset instead of before it, LLVM produces correct DWARF: DW_OP_breg14 R14-32 Differential Revision: https://reviews.llvm.org/D64971 llvm-svn: 366726	2019-07-22 18:52:42 +00:00
Peter Collingbourne	ef5cfc2dae	WholeProgramDevirt: Teach the pass to respect the global's alignment. The bytes inserted before an overaligned global need to be padded according to the alignment set on the original global in order for the initializer to meet the global's alignment requirements. The previous implementation that padded to the pointer width happened to be correct for vtables on most platforms but may do the wrong thing if the vtable has a larger alignment. This issue is visible with a prototype implementation of HWASAN for globals, which will overalign all globals including vtables to 16 bytes. There is also no padding requirement for the bytes inserted after the global because they are never read from nor are they significant for alignment purposes, so stop inserting padding there. Differential Revision: https://reviews.llvm.org/D65031 llvm-svn: 366725	2019-07-22 18:50:45 +00:00
Sean Fertile	324d33dd4e	[PowerPC] Fix comment on MO_PLT Target Operand Flag. [NFC] Patch by Xiangling Liao. llvm-svn: 366724	2019-07-22 18:47:59 +00:00
Sean Fertile	8034daca5f	[Object][XCOFF] Remove extra includes from XCOFF related files. [NFC] Differential Revision: https://reviews.llvm.org/D60885 llvm-svn: 366723	2019-07-22 18:47:55 +00:00
Peter Collingbourne	c3b8661df5	LowerTypeTests: Teach the pass to respect global alignments. We were previously ignoring alignment entirely when combining globals together in this pass. There are two main things that we need to do here: add additional padding before each global to meet the alignment requirements, and set the combined global's alignment to the maximum of all of the original globals' alignments. Since we now need to calculate layout as we go anyway, use the calculated layout to produce GlobalLayout instead of using StructLayout. Differential Revision: https://reviews.llvm.org/D65033 llvm-svn: 366722	2019-07-22 18:47:03 +00:00
Nilanjana Basu	06b8fe8d03	Changes to emit CodeView debug info nested type records properly using MCStreamer directives llvm-svn: 366720	2019-07-22 18:22:55 +00:00
Simon Pilgrim	3ebd2fe91a	[SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI. llvm-svn: 366712	2019-07-22 17:57:36 +00:00
Vlad Tsyrklevich	5874a28ac5	Revert "Reland [ELF] Loose a condition for relocation with a symbol" This reverts commit r366686 as it appears to be causing buildbot failures on sanitizer-x86_64-linux-android and sanitizer-x86_64-linux. llvm-svn: 366708	2019-07-22 17:48:53 +00:00
Matt Arsenault	542720b2bc	TableGen: Support physical register inputs > 255 This was truncating register value that didn't fit in unsigned char. Switch AMDGPU sendmsg intrinsics to using a tablegen pattern. llvm-svn: 366695	2019-07-22 15:02:34 +00:00
Sam Parker	4379a40088	[ARM][LowOverheadLoops] Revert remaining pseudos ARMLowOverheadLoops would assert a failure if it did not find all the pseudo instructions that comprise the hardware loop. Instead of doing this, iterate through all the instructions of the function and revert any remaining pseudo instructions that haven't been converted. Differential Revision: https://reviews.llvm.org/D65080 llvm-svn: 366691	2019-07-22 14:16:40 +00:00
Nikola Prica	0166cff09b	Reland [ELF] Loose a condition for relocation with a symbol This patch was not the reason of the buildbot failure. Deleted code was introduced as a work around for a bug in the gold linker (http://sourceware.org/PR16794). Test case that was given as a reason for this part of code, the one on previous link, now works for the gold. This condition is too strict and when a code is compiled with debug info it forces generation of numerous relocations with symbol for architectures that do not have relocation addend. Reviewers: arsenm, espindola Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D64327 llvm-svn: 366686	2019-07-22 13:07:01 +00:00
Matt Arsenault	937d0ee5d8	AMDGPU/GlobalISel: Remove unnecessary code The minnum/maxnum case are dead, and the cvt is handled by the default. llvm-svn: 366685	2019-07-22 13:05:25 +00:00
David Green	8876a312a8	[ARM] Fix for MVE VPT block pass We need to ensure that the number of T's is correct when adding multiple instructions into the same VPT block. Differential revision: https://reviews.llvm.org/D65049 llvm-svn: 366684	2019-07-22 12:51:38 +00:00
Simon Pilgrim	b3d719e1cf	[X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED) This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load. A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte offsetted loads then attempt to be matched against a previous load that has already been confirmed to be a consecutive match. Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle. Fixed out of bounds load assert identified in rL366501 Differential Revision: https://reviews.llvm.org/D64551 llvm-svn: 366681	2019-07-22 12:44:10 +00:00
Christudasan Devadasan	006cf8c03d	Added address-space mangling for stack related intrinsics Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64561 llvm-svn: 366679	2019-07-22 12:42:48 +00:00
Oliver Stannard	6771a89fa0	[IPRA][ARM] Make use of the "returned" parameter attribute ARM has code to recognise uses of the "returned" function parameter attribute which guarantee that the value passed to the function in r0 will be returned in r0 unmodified. IPRA replaces the regmask on call instructions, so needs to be told about this to avoid reverting the optimisation. Differential revision: https://reviews.llvm.org/D64986 llvm-svn: 366669	2019-07-22 08:44:36 +00:00
Jay Foad	298500ae33	[AMDGPU] Save some work when an atomic op has no uses Summary: In the atomic optimizer, save doing a bunch of work and generating a bunch of dead IR in the fairly common case where the result of an atomic op (i.e. the value that was in memory before the atomic op was performed) is not used. NFC. Reviewers: arsenm, dstuttard, tpr Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64981 llvm-svn: 366667	2019-07-22 07:19:44 +00:00
Serguei Katkov	c6c31da867	[Loop Peeling] Fix the handling of branch weights of peeled off branches. The current algorithm to update branch weights of the latch block and its copies is based on the assumption that the number of peeling iterations is approximately equal to the trip count. However, this is not correct. According to the profitability check, in one case we can decide to peel if it helps to reduce the number of phi nodes. In this case the number of peeled iterations can be less than the estimated trip count. This patch introduces another way to set the branch weights of peeled off branches. Let F be the weight of the edge from latch to header. Let E be the weight of the edge from latch to exit. F/(F+E) is the probability to go to the loop and E/(F+E) is the probability to go to the exit. Then, Estimated TripCount = F / E. For the I-th (counting from 0) peeled off iteration we set the weights for the peeled latch as (TC - I, 1). It gives us a reasonable distribution. The probability to go to exit 1/(TC-I) increases. At the same time the estimated trip count of the remaining loop reduces by I. As a result, after peeling off N iterations the weights will be (F - N * E, E) and the trip count of the loop becomes F / E - N or TC - N. The idea is taken from the review of the patch D63918 proposed by Philip. Reviewers: reames, mkuper, iajbar, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64235 llvm-svn: 366665	2019-07-22 05:15:34 +00:00
Simon Pilgrim	86fa3270ef	[X86] SimplifyDemandedVectorEltsForTargetNode - Move SUBV_BROADCAST narrowing handling. NFCI. Move the narrowing of SUBV_BROADCAST to where we handle all the other opcodes. llvm-svn: 366660	2019-07-21 19:04:44 +00:00
Craig Topper	e6cd20ba53	[InstCombine] Update comment I missed in r366649. NFC llvm-svn: 366658	2019-07-21 16:15:03 +00:00
Aditya Nandakumar	d7504a1569	[GISel]: Attach missing range metadata while translating G_LOADs https://reviews.llvm.org/D65048 Attach range information to G_LOAD when only defining one register. reviewed by: arsenm llvm-svn: 366656	2019-07-21 14:07:54 +00:00
Craig Topper	1d149d08d3	[InstCombine] Remove insertRangeTest code that handles the equality case. For equality, the function called getTrue/getFalse with the VT of the comparison input. But getTrue/getFalse need the boolean VT. So if this code ever executed, it would assert. I believe these cases are removed by InstSimplify so we don't get here. So this patch just fixes up an assert to exclude the equality possibility and removes the broken code. llvm-svn: 366649	2019-07-21 06:43:38 +00:00
Craig Topper	8fabdfe9fc	[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI AddOne/SubOne create new Constant objects. That seems heavy for comparing ConstantInts which wrap APInts. Just do the math on the APInts and compare them. llvm-svn: 366648	2019-07-21 05:26:05 +00:00
Roman Lebedev	cd9b19484b	[Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements Summary: Four things here: 1. Generalize the fold to handle non-splat divisors. Reasonably trivial. 2. Unban power-of-two divisors. I don't see any reason why they should be illegal. * There is no ban in Hacker's Delight * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we were ensuring that `D0` is not `1`, which made sense. 3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations: * We know that * `(X u% 1) == 0` can be constant-folded to `1`, * `(X u% 1) != 0` can be constant-folded to `0`, * Also, we know that * `X u<= -1` can be constant-folded to `1`, * `X u> -1` can be constant-folded to `0`, * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p * We know we will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)` * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is an all-ones constant; and both `P` and `K` are anything other than `undef`. * The fold will indeed produce `Q = all-ones`. 4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where divisor was `1`. Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00 Reviewed By: RKSimon Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63963 llvm-svn: 366637	2019-07-20 16:33:15 +00:00
Simon Pilgrim	adec0f2252	[X86][SSE] Use PSADBW to improve vXi8 sum reduction (PR42674) As detailed on PR42674, we can reduce a vXi8 down until we have the final <8 x i8>, and then use PSADBW with zero, to sum those values. We then extract the bottom i8, discarding any overflow from the upper bits of the i16 result. llvm-svn: 366636	2019-07-20 15:20:11 +00:00
Florian Hahn	0a7faa4e3d	[Local] Zap blockaddress without users in ConstantFoldTerminator. If the blockaddress is not destroyed, the destination block will still be marked as having its address taken, limiting further transformations. I think there are other places where the dead blockaddress constants are kept around, I'll look into that as follow up. Reviewers: craig.topper, brzycki, davide Reviewed By: brzycki, davide Differential Revision: https://reviews.llvm.org/D64936 llvm-svn: 366633	2019-07-20 12:25:47 +00:00
Jessica Paquette	41affad967	[GlobalISel][AArch64] Contract trivial same-size cross-bank copies into G_STOREs Sometimes, you can end up with cross-bank copies between same-sized GPRs and FPRs, which feed into G_STOREs. When these copies feed only into stores, they aren't necessary; we can just store using the original register bank. This provides some minor code size savings for some floating point SPEC benchmarks. (Around 0.2% for 453.povray and 450.soplex) This issue doesn't seem to show up due to regbankselect or anything similar. So, this patch introduces an early select function, `contractCrossBankCopyIntoStore` which performs the contraction when possible. The selector then continues normally and selects the correct store opcode, eliminating needless copies along the way. Differential Revision: https://reviews.llvm.org/D65024 llvm-svn: 366625	2019-07-20 01:55:35 +00:00
Guanzhong Chen	5204f7611f	[WebAssembly] Compute and export TLS block alignment Summary: Add immutable WASM global `__tls_align` which stores the alignment requirements of the TLS segment. Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang. The expected usage has now changed to: __wasm_init_tls(memalign(__builtin_wasm_tls_align(), __builtin_wasm_tls_size())); Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton Reviewed By: tlively Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65028 llvm-svn: 366624	2019-07-19 23:34:16 +00:00
Matt Arsenault	f3bfb85bce	AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces llvm-svn: 366621	2019-07-19 22:28:44 +00:00
Stanislav Mekhanoshin	05d9e6a2a3	[AMDGPU] Autogenerate register sequences in tuples Differential Revision: https://reviews.llvm.org/D65007 llvm-svn: 366619	2019-07-19 21:43:42 +00:00
Stanislav Mekhanoshin	7b5a54e369	[AMDGPU] Fixed occupancy calculation for gfx10 Differential Revision: https://reviews.llvm.org/D65010 llvm-svn: 366616	2019-07-19 21:29:51 +00:00
Matt Arsenault	5e23f42820	AMDGPU: Avoid custom predicates for stores with glue llvm-svn: 366613	2019-07-19 21:01:30 +00:00
Matt Arsenault	e3401a9b86	AMDGPU: Redefine setcc condition PatLeafs Avoid using custom code predicates. llvm-svn: 366609	2019-07-19 20:24:40 +00:00
Matt Arsenault	48c0df5d46	AMDGPU: Don't rely on m0 being -1 for GWS offsets This only works if the high bits of m0 are also 0, so m0 would have to be set to 0xffff. llvm-svn: 366608	2019-07-19 20:01:24 +00:00
Matt Arsenault	85f3890126	AMDGPU: Force s_waitcnt after GWS instructions This is apparently required to be the immediately following instruction, so force it into a bundle with a waitcnt. llvm-svn: 366607	2019-07-19 19:47:30 +00:00
Matt Arsenault	c14334e959	LiveIntervals: Fix handleMove asserting on BUNDLE The top-level BUNDLE instruction should behave as an ordinary instruction. It is supposed to have all relevant registers as implicit operands. Moving it should work as any other instruction. I believe the assert intended to avoid moving instructions inside bundles. llvm-svn: 366605	2019-07-19 19:32:00 +00:00
Nick Desaulniers	4e9196ebcb	Revert "Use the MachineBasicBlock symbol for a callbr target" This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea. Two regressions were immediately reported: - https://github.com/ClangBuiltLinux/linux/issues/614 - https://github.com/ClangBuiltLinux/linux/issues/615 Reported-by: nathanchance llvm-svn: 366600	2019-07-19 18:18:02 +00:00
Stanislav Mekhanoshin	01fcf9238f	[AMDGPU] Allow register tuples to set asm names This change reverts most of the previous register name generation. The real problem is that RegisterTuple does not generate asm names. Added optional operand to RegisterTuple. This way we can simplify register name access and dramatically reduce the size of static tables for the backend. Differential Revision: https://reviews.llvm.org/D64967 llvm-svn: 366598	2019-07-19 18:05:01 +00:00
Matt Arsenault	7df225dfc2	AMDGPU/GlobalISel: Fix MMO flags for kernel argument loads The DAG lowering sets dereferenceable and invariant, not nontemporal. llvm-svn: 366597	2019-07-19 17:52:56 +00:00
Matt Arsenault	08494f6231	AMDGPU/GlobalISel: Selection for fminnum/fmaxnum v2f16 case doesn't work yet because the VOP3P complex patterns haven't been ported yet. llvm-svn: 366585	2019-07-19 14:42:40 +00:00
Matt Arsenault	b60a2ae40e	AMDGPU/GlobalISel: Support arguments with multiple registers Handles structs used directly in argument lists. llvm-svn: 366584	2019-07-19 14:29:30 +00:00
Matt Arsenault	fecf43eba3	AMDGPU/GlobalISel: Rewrite lowerFormalArguments This should now handle everything except structs passed as multiple registers. I think most of the packing logic should be handled by handleAssignments, but I'm unclear on what the contract is for multiple registers. This is copying how x86 handles this. This does change the behavior of the test_sgpr_alignment0 amdgpu_vs test. I don't think shader arguments should try to follow the alignment, and registers need to be repacked. I also don't think it matters, since I think the pointers are packed to the beginning of the argument list anyway. llvm-svn: 366582	2019-07-19 14:15:18 +00:00
Matt Arsenault	1022c0dfde	AMDGPU: Decompose all values to 32-bit pieces for calling conventions This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit. This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel. llvm-svn: 366578	2019-07-19 13:57:44 +00:00
Matt Arsenault	5905aae169	DAG: Handle dbg_value for arguments split into multiple subregs This was handled previously for arguments split due to not fitting in an MVT. This was dropping the register for argument registers split due to TLI::getRegisterTypeForCallingConv. llvm-svn: 366574	2019-07-19 13:36:46 +00:00
Dmitry Preobrazhensky	4ccb7f8c45	[AMDGPU][MC] Corrected parsing of branch offsets See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D64629 llvm-svn: 366571	2019-07-19 13:12:47 +00:00
Kai Luo	dec624682e	[MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs. Summary: Current PRE hoists common computations into CMBB = DT->findNearestCommonDominator(MBB, MBB1). However, if CMBB is in a hot loop body, we might get performance degradation. Differential Revision: https://reviews.llvm.org/D64394 llvm-svn: 366570	2019-07-19 12:58:16 +00:00
Than McIntosh	e238a4c757	[X86] for split stack, not save/restore nested arg if unused Summary: For split-stack, if the nested argument (i.e. R10) is not used, no need to save/restore it in the prologue. Reviewers: thanm Reviewed By: thanm Subscribers: mstorsjo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64673 llvm-svn: 366569	2019-07-19 12:54:44 +00:00
Oliver Stannard	8780c0dda2	Don't update NoTrappingFPMath and FPDenormalMode in resetTargetOptions We'd like to remove this whole function, because these are properties of functions, not the target as a whole. These two are easy to remove because they are only used for emitting ARM build attributes, which expects them to represent the defaults for the whole module, not just the last function generated. This is needed to get correct build attributes when using IPRA on ARM, because IPRA causes resetTargetOptions to get called before ARMAsmPrinter::emitAttributes. Differential revision: https://reviews.llvm.org/D64929 llvm-svn: 366562	2019-07-19 10:37:37 +00:00
Oliver Stannard	0ed7732671	[IPRA] Don't rely on non-exact function definitions If a function definition is not exact, then the linker could select a differently-compiled version of it, which could use different registers. https://reviews.llvm.org/D64909 llvm-svn: 366557	2019-07-19 09:59:26 +00:00
Mikhail Maltsev	0b001f94a5	[ARM] Add <saturate> operand to SQRSHRL and UQRSHLL Summary: According to the new Armv8-M specification https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the instructions SQRSHRL and UQRSHLL now have an additional immediate operand <saturate>. The new assembly syntax is: SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm where <saturate> can be either 64 (the existing behavior) or 48, in that case the result is saturated to 48 bits. The new operand is encoded as follows: #64 Encoded as sat = 0 #48 Encoded as sat = 1 sat is bit 7 of the instruction bit pattern. This patch adds a new assembler operand class MveSaturateOperand which implements parsing and encoding. Decoding is implemented in DecodeMVEOverlappingLongShift. Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64810 llvm-svn: 366555	2019-07-19 09:46:28 +00:00
Hubert Tong	2711e16b35	[sanitizers] Use covering ObjectFormatType switches Summary: This patch removes the `default` case from some switches on `llvm::Triple::ObjectFormatType`, and cases for the missing enumerators (`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added. For `UnknownObjectFormat`, the effect of the action for the `default` case is maintained; otherwise, where `llvm_unreachable` is called, `report_fatal_error` is used instead. Where the `default` case returns a default value, `report_fatal_error` is used for XCOFF as a placeholder. For `Wasm`, the effect of the action for the `default` case is maintained. The code is structured to avoid strongly implying that the `Wasm` case is present for any reason other than to make the switch cover all `ObjectFormatType` enumerator values. Reviewers: sfertile, jasonliu, daltenty Reviewed By: sfertile Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64222 llvm-svn: 366544	2019-07-19 08:46:18 +00:00
Jay Foad	7d06ffff46	[AMDGPU] Simplify the exclusive scan used for optimized atomics Summary: Change the scan algorithm to use only power-of-two shifts (1, 2, 4, 8, 16, 32) instead of starting off shifting by 1, 2 and 3 and then doing a 3-way ADD, because: 1. It simplifies the compiler a little. 2. It minimizes vgpr pressure because each instruction is now of the form vn = vn + vn << c. 3. It is more friendly to the DPP combiner, which currently can't combine into an ADD3 instruction. Because of #2 and #3 the end result is improved from this: v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf v_mov_b32_dpp v1, v3 row_shr:3 row_mask:0xf bank_mask:0xf v_add3_u32 v1, v4, v5, v1 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf To this: v_add_u32_dpp v1, v1, v1 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf I.e. two fewer computational instructions, one extra nop where we could schedule something else. Reviewers: arsenm, sheredom, critson, rampitec, vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64411 llvm-svn: 366543	2019-07-19 08:40:37 +00:00
Serguei Katkov	bde33af85a	[Loop Peeling] Enable peeling of multiple exits by default. Enable loop peeling with multiple exits where all non-latch exits end up with deopt by default. Reviewers: reames, fhahn Reviewed By: reames Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64619 llvm-svn: 366542	2019-07-19 08:35:45 +00:00
Roman Lebedev	f2eb403144	[InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) Normally, the inner pattern is sign-extend, but for our purposes it's no different to other patterns: alive proofs: f: https://rise4fun.com/Alive/7U3 For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64524 llvm-svn: 366540	2019-07-19 08:26:58 +00:00
Roman Lebedev	441c9d6ca8	[InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: e: https://rise4fun.com/Alive/0FT For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64521 llvm-svn: 366539	2019-07-19 08:26:47 +00:00
Roman Lebedev	3c212ce305	[InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: d: https://rise4fun.com/Alive/I5Y For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64519 llvm-svn: 366538	2019-07-19 08:26:37 +00:00
Roman Lebedev	2ebe57386d	[InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: c: https://rise4fun.com/Alive/RgJh For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64517 llvm-svn: 366537	2019-07-19 08:26:25 +00:00
Roman Lebedev	4422a1657c	[InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: b. `(x & (~(-1 << maskNbits))) << shiftNbits` All these patterns can be simplified to just: `x << ShiftShAmt` iff: b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: b: https://rise4fun.com/Alive/y8M For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64514 llvm-svn: 366536	2019-07-19 08:26:13 +00:00
Roman Lebedev	a5f0824eb5	[InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: a: https://rise4fun.com/Alive/wi9 Indeed, not all of these patterns are canonical. But since this fold will only produce a single instruction i'm really interested in handling even uncanonical patterns, since i have this general kind of pattern in hotpaths, and it is not totally outlandish for bit-twiddling code. For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, huihuiz, xbolva00 Reviewed By: xbolva00 Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64512 llvm-svn: 366535	2019-07-19 08:25:43 +00:00
Hsiangkai Wang	c5ecdd3c5a	[DebugInfo] Some fields do not need relocations even relax is enabled. In debug frame information, some fields, e.g., Length in CIE/FDE and Offset in FDE are attributes to describe the structure of CIE/FDE. They are not related to the relaxed code. However, these attributes are symbol differences. So, in current design, these attributes will be filled as zero and LLVM generates relocations for them. We only need to generate relocations for symbols in executable sections. So, if the symbols are not located in executable sections, we still evaluate their values under relaxation. Differential Revision: https://reviews.llvm.org/D61584 llvm-svn: 366531	2019-07-19 06:10:36 +00:00
Hsiangkai Wang	18ccfadd46	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366524	2019-07-19 02:03:34 +00:00
Bill Wendling	ccbffefcca	Use the MachineBasicBlock symbol for a callbr target Summary: Inline asm doesn't use labels when compiled as an object file. Therefore, we shouldn't create one for the (potential) callbr destination. Instead, use the symbol for the MachineBasicBlock. Reviewers: nickdesaulniers, craig.topper Reviewed By: nickdesaulniers Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64888 llvm-svn: 366523	2019-07-19 01:10:28 +00:00
Amara Emerson	cf12c7815f	[GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then has each target specify, using the new custom legalizer hook for intrinsics, that it wants them expanded to a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516	2019-07-19 00:24:45 +00:00
Stanislav Mekhanoshin	a9c71e01e7	[AMDGPU] Drop Reg32 and use regular AsmName This allows to reduce generated AMDGPUGenAsmWriter.inc by ~100Kb. Differential Revision: https://reviews.llvm.org/D64952 llvm-svn: 366505	2019-07-18 22:18:33 +00:00
Jessica Paquette	7a1dcc5ff1	[GlobalISel][AArch64] Add support for base register + offset register loads Add support for folding G_GEPs into loads of the form ``` ldr reg, [base, off] ``` when possible. This can save an add before the load. Currently, this is only supported for loads of 64 bits into 64 bit registers. Add a new addressing mode function, `selectAddrModeRegisterOffset` which performs this folding when it is profitable. Also add a test for addressing modes for G_LOAD. Differential Revision: https://reviews.llvm.org/D64944 llvm-svn: 366503	2019-07-18 21:50:11 +00:00
Peter Collingbourne	50057f3288	CodeGen: Allow !associated metadata to point to aliases. This is a small extension of !associated, mostly useful for the implementation convenience of instrumentation passes that RAUW globals with aliases, such as LowerTypeTests. Differential Revision: https://reviews.llvm.org/D64951 llvm-svn: 366502	2019-07-18 21:37:16 +00:00
Reid Kleckner	ba9c9e62cb	Revert [X86] EltsFromConsecutiveLoads - support common source loads This reverts r366441 (git commit `48104ef7c9`) This causes clang to fail to compile some file in Skia. Reduction soon. llvm-svn: 366501	2019-07-18 21:26:41 +00:00
Guanzhong Chen	df4479200b	[WebAssembly] Fix __builtin_wasm_tls_base intrinsic Summary: Properly generate the outchain for the `__builtin_wasm_tls_base` intrinsic. Also marked the intrinsic pure, per @sunfish's suggestion. Reviewers: tlively, aheejin, sbc100, sunfish Reviewed By: tlively Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits, sunfish Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64949 llvm-svn: 366499	2019-07-18 21:17:52 +00:00
Peter Collingbourne	68f3fc2d91	Fix typo in r366494. Spotted by Yuanfang Chen. llvm-svn: 366497	2019-07-18 21:03:37 +00:00
Steven Wu	dac7fca530	Remove the static initializer introduced in r365099 Summary: Some polish for r365099 which adds a static initializer to MachOObjectFile. Remove it by moving it to file scope. Reviewers: smeenai, alexshap, compnerd, mtrent, anushabasana Reviewed By: smeenai Subscribers: hiraditya, jkorous, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64873 llvm-svn: 366496	2019-07-18 21:01:21 +00:00
Peter Collingbourne	d1ec8eb84f	IR: Teach Constant::needsRelocation() that relative pointers don't need to be relocated. This causes sections with relative pointers to be marked as read only, which means that they won't end up sharing pages with writable data. Differential Revision: https://reviews.llvm.org/D64948 llvm-svn: 366494	2019-07-18 20:56:21 +00:00
Jordan Rose	887d31ccee	FileSystem: Check for DTTOIF alone, not _DIRENT_HAVE_D_TYPE While 'd_type' is a non-standard extension to `struct dirent`, only glibc signals its presence with a macro '_DIRENT_HAVE_D_TYPE'. However, any platform with 'd_type' also includes a way to convert to mode_t values using the macro 'DTTOIF', so we can check for that alone and still be confident that the 'd_type' member exists. (If this turns out to be wrong, I'll go back and set up an actual CMake check.) I couldn't think of how to write a test for this, because I couldn't think of how to test that a 'stat' call doesn't happen without controlling the filesystem or intercepting 'stat', and there's no good cross-platform way to do that that I know of. Follow-up (almost a year later) to r342089. rdar://problem/50592673 https://reviews.llvm.org/D64940 llvm-svn: 366486	2019-07-18 20:05:11 +00:00
Lang Hames	9e52d0576a	[ORC] Suppress an ORCv1 deprecation warning. llvm-svn: 366485	2019-07-18 19:55:42 +00:00
Amy Huang	f332fe642c	[COFF] Change a variable type to be const in the HeapAllocSite map. llvm-svn: 366479	2019-07-18 18:22:52 +00:00
Guanzhong Chen	801fa8e6b9	[WebAssembly] Implement __builtin_wasm_tls_base intrinsic Summary: Add `__builtin_wasm_tls_base` so that LeakSanitizer can find the thread-local block and scan through it for memory leaks. Reviewers: tlively, aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64900 llvm-svn: 366475	2019-07-18 17:53:22 +00:00
Michael Liao	17a8a9277c	[LAA] Re-check bit-width of pointers after stripping. Summary: - As the pointer stripping now tracks through `addrspacecast`, prepare to handle the bit-width difference from the result pointer. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64928 llvm-svn: 366470	2019-07-18 17:30:27 +00:00
Peter Collingbourne	aa6a7df64a	MC: AArch64: Add support for prel_g* relocation specifiers. Differential Revision: https://reviews.llvm.org/D64683 llvm-svn: 366462	2019-07-18 16:54:33 +00:00
Peter Collingbourne	76427f849f	AArch64: Unify relocation restrictions between MOVK/MOVN/MOVZ. There doesn't seem to be a practical reason for these instructions to have different restrictions on the types of relocations that they may be used with, notwithstanding the language in the ELF AArch64 spec that implies that specific relocations are meant to be used with specific instructions. For example, we currently forbid the first instruction in the following sequence, despite it currently being used by clang to generate a global reference under -mcmodel=large: movz x0, #:abs_g0_nc:foo movk x0, #:abs_g1_nc:foo movk x0, #:abs_g2_nc:foo movk x0, #:abs_g3:foo Therefore, allow MOVK/MOVN/MOVZ to accept the union of the set of relocations that they currently accept individually. Differential Revision: https://reviews.llvm.org/D64466 llvm-svn: 366461	2019-07-18 16:51:53 +00:00
Hsiangkai Wang	657277e0f1	Revert "[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame." This reverts commit 17e3cbf5fe656483d9016d0ba9e1d0cd8629379e. llvm-svn: 366444	2019-07-18 15:06:50 +00:00
Hsiangkai Wang	e43ce1a958	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366442	2019-07-18 14:47:34 +00:00
Simon Pilgrim	48104ef7c9	[X86] EltsFromConsecutiveLoads - support common source loads This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load. A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte-offset loads then attempt to match against a previous load that has already been confirmed to be a consecutive match. Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle. Differential Revision: https://reviews.llvm.org/D64551 llvm-svn: 366441	2019-07-18 14:33:25 +00:00
Simon Pilgrim	8b525e357f	[DAGCombine] Pull getSubVectorSrc helper out of narrowInsertExtractVectorBinOp. NFCI. NFC step towards reusing this in other EXTRACT_SUBVECTOR combines. llvm-svn: 366435	2019-07-18 13:45:53 +00:00
Thomas Preud'homme	70494494c1	[FileCheck] Fix numeric variable redefinition Summary: Commit r365249 changed usage of FileCheckNumericVariable to have one instance of that class per variable as opposed to one instance per definition of a given variable as was done before. However, it retained the safety check in setValue that it should only be called with the variable unset, even after r365625. This causes an assert failure when a non-pseudo variable is redefined. And while redefinition of @LINE at each CHECK line works in the general case, it caused problems when a substitution failed (fixed in r365624) and still causes problems when a CHECK line does not match, since @LINE's value is cleared after the substitutions in match() have happened but printSubstitutions also attempts a substitution. This commit solves the root of the problem by changing setValue to set a new value regardless of whether a value was set or not, thus fixing all the aforementioned issues. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D64882 llvm-svn: 366434	2019-07-18 13:39:04 +00:00
Sanjay Patel	e654785912	[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483) LEA doesn't affect flags, so use it more liberally to replace an ADD when we know that the ADD operands affect flags. In the motivating example from PR40483: https://bugs.llvm.org/show_bug.cgi?id=40483 ...this lets us avoid duplicating a math op just to avoid flag conflict. As mentioned in the TODO comments, this heuristic can be extended to fire more often if that leads to more improvements. Differential Revision: https://reviews.llvm.org/D64707 llvm-svn: 366431	2019-07-18 12:48:01 +00:00
Diogo N. Sampaio	11512e742b	[ARM][DAGCOMBINE][FIX] PerformVMOVRRDCombine Summary: PerformVMOVRRDCombine omits adding an offset of 4 to the PointerInfo when converting a f64 = load[M] to {i32, i32} = {load[M], load[M + 4]}, which would allow the machine scheduler to break dependencies with the second load. - pr42638 Reviewers: eli.friedman, dmgreen, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64870 llvm-svn: 366423	2019-07-18 10:05:56 +00:00
Chen Zheng	c38e3efe27	[SCEV] add no wrap flag for SCEVAddExpr. Differential Revision: https://reviews.llvm.org/D64868 llvm-svn: 366419	2019-07-18 09:23:19 +00:00
Alex Bradbury	b8d352a08b	[RISCV] Reset NoPHIs MachineFunctionProperty in emitSelectPseudo We inserted PHIs where there were none before, so the property must be reset. This error was found on an EXPENSIVE_CHECKS build. llvm-svn: 366412	2019-07-18 07:52:41 +00:00
Serguei Katkov	0ffa833d54	[LoopInfo] Use early return in branch weight update functions. NFC. llvm-svn: 366411	2019-07-18 07:36:20 +00:00
Craig Topper	8da0402210	[X86] Disable combineConcatVectors for vXi1 vectors. I'm not convinced the code this calls is properly vetted for vXi1 vectors. Experimental vector widening legalization testing for D55251 is now hitting an assertion failure inside EltsFromConsecutiveLoads. This is occurring from a v2i1 load having a store size different than its VT size. Hopefully this commit will keep such issues from happening. llvm-svn: 366405	2019-07-18 06:18:06 +00:00
Alex Bradbury	44deaf7e54	[DWARF][RISCV] Add support for RISC-V relocations needed for debug info When code relaxation is enabled many RISC-V fixups are not resolved but instead relocations are emitted. This happens even for DWARF debug sections. Therefore, to properly support the parsing of DWARF debug info we need to be able to resolve RISC-V relocations. This patch adds: * Support for RISC-V relocations in RelocationResolver * DWARF support for two relocations per object file offset * DWARF changes to support relocations in more DIE fields The two relocations per offset change is needed because some RISC-V relocations (used for label differences) come in pairs. Relocations can also be emitted for DWARF fields where relocations were not yet evaluated. Adding relocation support for some of these fields is essential. On the other hand, LLVM currently emits RISC-V relocations for fixups that could be safely evaluated, since they can never be affected by code relaxations. This patch also adds relocation support for the fields affected by those extraneous relocations (the DWARF unit entry Length, and the DWARF debug line entry TotalLength and PrologueLength), for testing purposes. Differential Revision: https://reviews.llvm.org/D62062 Patch by Luís Marques. llvm-svn: 366402	2019-07-18 05:22:55 +00:00
Alex Bradbury	8aba95d64c	[RISCV] Avoid signed integer overflow UB in RISCVMatInt::generateInstSeq Found by UBSan. llvm-svn: 366398	2019-07-18 04:02:58 +00:00
Alex Bradbury	ad73a436dc	[RISCV] Don't access an invalidated iterator in RISCVInstrInfo::removeBranch Issue found by ASan. llvm-svn: 366397	2019-07-18 03:23:47 +00:00
Fangrui Song	f358cf8de2	[AArch64] Add dependency from AArch64CodeGen to TransformUtils to fix -DBUILD_SHARED_LIBS=on link error after D64173/r366361 This fixes: ld.lld: error: undefined symbol: llvm::findAllocaForValue(llvm::Value, llvm::DenseMap<llvm::Value, llvm::AllocaInst, llvm::DenseMapInfo<llvm::Value>, llvm::detail::DenseMapPair<llvm::Value, llvm::AllocaInst> >&) >>> referenced by AArch64StackTagging.cpp llvm-svn: 366396	2019-07-18 01:53:08 +00:00
Nilanjana Basu	4e22770219	Changes to display code view debug info type records in hex format llvm-svn: 366390	2019-07-17 23:43:58 +00:00
Evgeniy Stepanov	6abd78cc7c	Make DT a transitive dependency of LI. Summary: LoopInfoWrapperPass::verify uses DT, which means DT must be alive even if it has no direct users. Fixes a crash in expensive checks mode. Reviewers: pcc, leonardchan Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64896 llvm-svn: 366388	2019-07-17 23:31:59 +00:00
Denis Bakhvalov	3eab4819f2	[llvm-bcanalyzer] Fixed error 'Expected<T> must be checked before access or destruction' After rL365286 I had a failing test: LLVM :: tools/gold/X86/v1.12/thinlto_emit_linked_objects.ll It was failing with the output: $ llvm-bcanalyzer --dump llvm/test/tools/gold/X86/v1.12/Output/thinlto_emit_linked_objects.ll.tmp3.o.thinlto.bc Expected<T> must be checked before access or destruction. Unchecked Expected<T> contained error: Unexpected end of file reading 0 of 0 bytes Stack dump: Change-Id: I07e03262074ea5e0aae7a8d787d5487c87f914a2 llvm-svn: 366387	2019-07-17 23:28:39 +00:00
Nico Weber	7bb5fc0583	llvm-pdbdump: Fix several smaller issues with injected source compression handling - getCompression() used to return a PDB_SourceCompression even though the docs for IDiaInjectedSource are explicit about the return value being compiler-dependent. Return an uint32_t instead, and make the printing code handle unknown values better by printing "Unknown" and the int value instead of not printing any compression. - Print compressed contents as hex dump, not as string. - Add compression type "DotNet", which is used (at least) by csc.exe, the C# compiler. Also add a lengthy comment describing the stream contents (derived from looking at the raw hex contents long enough to see the GUIDs, which led me to the roslyn and mono implementations for handling this). - The native injected source dumper was dumping the contents of the whole data stream -- but csc.exe writes a stream that's padded with zero bytes to the next 512 boundary, and the dia api doesn't display those padding bytes. So make NativeInjectedSource::getCode() do the same thing. Differential Revision: https://reviews.llvm.org/D64879 llvm-svn: 366386	2019-07-17 22:59:52 +00:00
Stanislav Mekhanoshin	7872d76a16	[AMDGPU] Simplify AMDGPUInstPrinter::printRegOperand() Differential Revision: https://reviews.llvm.org/D64892 llvm-svn: 366385	2019-07-17 22:58:43 +00:00
Craig Topper	61fff7a337	[X86] Make sure we mark 128/256 MLOAD as Legal with VLX when min-legal-vector-width=256 is in effect. This started triggering an assertion after r364718 when we made these Custom under AVX2. llvm-svn: 366382	2019-07-17 22:26:00 +00:00
Peter Collingbourne	3b82b92c6b	hwasan: Initialize the pass only once. This will let us instrument globals during initialization. This required making the new PM pass a module pass, which should still provide access to analyses via the ModuleAnalysisManager. Differential Revision: https://reviews.llvm.org/D64843 llvm-svn: 366379	2019-07-17 21:45:19 +00:00
Stanislav Mekhanoshin	9c7f4264d3	[AMDGPU] Stop special casing flat_scratch for register name Differential Revision: https://reviews.llvm.org/D64885 llvm-svn: 366376	2019-07-17 21:35:11 +00:00
Evgeniy Stepanov	f45fd429b7	Speculative fix for stack-tagging.ll failure. Depending on the evaluation order of function call arguments, the current code may insert a use before def. llvm-svn: 366375	2019-07-17 21:27:44 +00:00
Hideto Ueno	4a09a73fb0	[Attributor][NFC] Remove unnecessary debug output llvm-svn: 366373	2019-07-17 21:11:02 +00:00
Nilanjana Basu	6e4076699c	Adding inline comments to code view type record directives for better readability llvm-svn: 366372	2019-07-17 21:01:12 +00:00
Francis Visoiu Mistrih	9f2b290add	[PEI] Don't re-allocate a pre-allocated stack protector slot The LocalStackSlotPass pre-allocates a stack protector and makes sure that it comes before the local variables on the stack. We need to make sure that later during PEI we don't re-allocate a new stack protector slot. If that happens, the new stack protector slot will end up being after the local variables that it should be protecting. Therefore, we would have two slots assigned for two different stack protectors, one at the top of the stack, and one at the bottom. Since PEI will overwrite the assigned slot for the stack protector, the load that is used to compare the value of the stack protector will use the slot assigned by PEI, which is wrong. For this, we need to check if the object is pre-allocated, and re-use that pre-allocated slot. Differential Revision: https://reviews.llvm.org/D64757 llvm-svn: 366371	2019-07-17 20:46:19 +00:00
Francis Visoiu Mistrih	90ba54bf67	[CodeGen][NFC] Simplify checks for stack protector index checking Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex() >= 0`. llvm-svn: 366369	2019-07-17 20:46:09 +00:00
Matt Arsenault	0966dd0d69	GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367	2019-07-17 20:22:44 +00:00
Matt Arsenault	914a59cad8	GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366	2019-07-17 20:22:38 +00:00
Evgeniy Stepanov	851339fb29	Basic MTE stack tagging instrumentation. Summary: Use MTE intrinsics to tag stack variables in functions with sanitize_memtag attribute. Reviewers: pcc, vitalybuka, hctim, ostannard Subscribers: srhines, mgorny, javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64173 llvm-svn: 366361	2019-07-17 19:24:12 +00:00
Evgeniy Stepanov	d752f5e953	Basic codegen for MTE stack tagging. Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360	2019-07-17 19:24:02 +00:00
Momchil Velikov	0e2b74a2b0	Revert [AArch64] Add support for Transactional Memory Extension (TME) This reverts r366322 (git commit `4b8da3a503`) llvm-svn: 366355	2019-07-17 17:43:32 +00:00
Daniil Fukalov	d912a9ba9b	[AMDGPU] Tune inlining parameters for AMDGPU target Summary: Since the target has no significant advantage of vectorization, the vector instructions threshold bonus should be optional. The amdgpu-inline-arg-alloca-cost parameter default value and the target InliningThresholdMultiplier value are tuned accordingly. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64642 llvm-svn: 366348	2019-07-17 16:51:29 +00:00
Lang Hames	1716454027	[ORC] Add deprecation warnings to ORCv1 layers and utilities. Summary: ORCv1 is deprecated. The current aim is to remove it before the LLVM 10.0 release. This patch adds deprecation attributes to the ORCv1 layers and utilities to warn clients of the change. Reviewers: dblaikie, sgraenitz, AlexDenisov Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64609 llvm-svn: 366344	2019-07-17 16:40:52 +00:00
Matt Arsenault	06eed42213	AMDGPU: Use getTargetConstant Avoids creating an extra intermediate mov. llvm-svn: 366340	2019-07-17 15:35:36 +00:00
Hideto Ueno	11d3710c1c	[Attributor] Deduce "willreturn" function attribute Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335	2019-07-17 15:15:43 +00:00
Alex Bradbury	ab009a602e	[AsmPrinter] Make the encoding of call sites in .gcc_except_table configurable and use for RISC-V The original behavior was to always emit the offsets to each call site in the call site table as uleb128 values, however on some architectures (eg RISCV) these uleb128 offsets into the code cannot always be resolved until link time (because relaxation will invalidate any calculated offsets), and there are no appropriate relocations for uleb128 values. As a consequence it needs to be possible to specify an alternative. This also switches RISCV to use DW_EH_PE_udata4 for call site encodings in .gcc_except_table Differential Revision: https://reviews.llvm.org/D63415 Patch by Edward Jones. llvm-svn: 366329	2019-07-17 14:00:35 +00:00
Alex Bradbury	b94c233d06	[RISCV] Set correct encodings for DWARF exception handling This patch sets correct encodings for DWARF exception handling for RISC-V (other than call site encoding, which must be udata4 rather than uleb128 and is handled by D63415). This has the same intent as D63409, except this version matches GCC/binutils behaviour which uses the same encodings regardless of PIC/non-PIC and medlow/medany code model. llvm-svn: 366327	2019-07-17 13:54:38 +00:00
Jay Foad	70235c642e	[AMDGPU] Optimize atomic AND/OR/XOR Summary: Extend the atomic optimizer to handle AND, OR and XOR. Reviewers: arsenm, sheredom Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64809 llvm-svn: 366323	2019-07-17 13:40:03 +00:00
Momchil Velikov	4b8da3a503	[AArch64] Add support for Transactional Memory Extension (TME) TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322	2019-07-17 13:23:27 +00:00
Justin Hibbits	0257c6b659	PowerPC: Fix register spilling for SPE registers Summary: Missed in the original commit, use the correct callee-saved register list for spilling, instead of the standard SVR432 list. This avoids needlessly spilling the SPE non-volatile registers when they're not used. As part of this, also add where missing, and sort, the spill opcode checks for SPE and SPE4 register classes. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56703 llvm-svn: 366319	2019-07-17 12:30:48 +00:00
Justin Hibbits	5214956eaa	PowerPC/SPE: Fix load/store handling for SPE Summary: Pointed out in a comment for D49754, register spilling will currently spill SPE registers at almost any offset. However, the instructions `evstdd` and `evldd` require a) 8-byte alignment, and b) a limit of 256 (unsigned) bytes from the base register, as the offset must fit into a 5-bit offset, which ranges from 0-31 (indexed in double-words). The update to the register spill test is taken partially from the test case shown in D49754. Additionally, pointed out by Kei Thomsen, globals will currently use evldd/evstdd, though the offset isn't known at compile time, so may exceed the 8-bit (unsigned) offset permitted. This fixes that as well, by forcing it to always use evlddx/evstddx when accessing globals. Part of the patch contributed by Kei Thomsen. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54409 llvm-svn: 366318	2019-07-17 12:30:04 +00:00
Petar Avramovic	1e62635d05	[MIPS GlobalISel] ClampScalar and select pointer G_ICMP Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is the same as for integers; it is enough to declare them as a legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317	2019-07-17 12:08:01 +00:00
Nicolai Haehnle	8b7041a5c6	AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXEC Summary: Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630 Reviewers: rampitec, mareko Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64807 Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc llvm-svn: 366314	2019-07-17 11:22:57 +00:00
Nicolai Haehnle	a256b8b7d7	AMDGPU: Improve alias analysis for GDS Summary: GDS cannot alias anything else. Original patch by: Marek Olšák Reviewers: arsenm, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64114 Change-Id: I07bfbd96f5d5c37a6dfba7997df12f291dd794b0 llvm-svn: 366313	2019-07-17 11:22:19 +00:00
Diana Picus	37e403d18c	[ARM GlobalISel] Cleanup CallLowering. NFC Migrate CallLowering::lowerReturnVal to use the same infrastructure as lowerCall/FormalArguments and remove the now obsolete code path from splitToValueTypes. Forgot to push this earlier. llvm-svn: 366308	2019-07-17 10:01:27 +00:00
Simon Atanasyan	4c1e440892	[mips] Use mult/mflo pattern on 64-bit targets prior to MIPS64 The `MUL` instruction is available starting from the MIPS32/MIPS64 targets. llvm-svn: 366301	2019-07-17 08:11:40 +00:00
Simon Atanasyan	a884afb6f8	[mips] Implement .cplocal directive This directive forces to use the alternate register for context pointer. For example, this code: .cplocal $4 jal foo expands to: ld $25, %call16(foo)($4) jalr $25 Differential Revision: https://reviews.llvm.org/D64743 llvm-svn: 366300	2019-07-17 08:11:31 +00:00
Simon Atanasyan	7f308af5ee	[mips] Support the "o" inline asm constraint Like other LLVM targets, we do not handle "offsettable" memory addresses in any special way. In other words, the "o" constraint is an exact equivalent of the "m" one. But some existing code requires the "o" constraint support. This fixes PR42589. Differential Revision: https://reviews.llvm.org/D64792 llvm-svn: 366299	2019-07-17 08:11:15 +00:00
Stanislav Mekhanoshin	e5012ab308	[AMDGPU] Autogenerate register asm names Differential Revision: https://reviews.llvm.org/D64839 llvm-svn: 366283	2019-07-16 23:44:21 +00:00
Matt Arsenault	1c3f4ec7fc	GlobalISel: Add overload of handleAssignments with CCState AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState. The ArgLocs argument is only really necessary because CCState doesn't allow access to it. llvm-svn: 366279	2019-07-16 22:41:34 +00:00
Guanzhong Chen	0a8d4df799	[WebAssembly] Compile all TLS on Emscripten as local-exec Summary: Currently, on Emscripten, dynamic linking is not supported with threads. This means that if thread-local storage is used, it must be used in a statically-linked executable. Hence, local-exec is the only possible model. This diff compiles all TLS variables to use local-exec on Emscripten as a temporary measure until dynamic linking is supported with threads. The goal for this is to allow C++ types with constructors to be thread-local. Currently, when `clang` compiles a `thread_local` variable with a constructor, it generates `__tls_guard` variable: @__tls_guard = internal thread_local global i8 0, align 1 As no TLS model is specified, this is treated as general-dynamic, which we do not support (and cannot support without implementing dynamic linking support with threads in Emscripten). As a result, any C++ constructor in `thread_local` variables would not compile. By compiling all `thread_local` as local-exec, `__tls_guard` will compile and we can support C++ constructors with TLS without implementing dynamic linking with threads. Depends on D64537 Reviewers: tlively, aheejin, sbc100 Reviewed By: aheejin Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64776 llvm-svn: 366275	2019-07-16 22:22:08 +00:00
Guanzhong Chen	42bba4b852	[WebAssembly] Implement thread-local storage (local-exec model) Summary: Thread local variables are placed inside a `.tdata` segment. Their symbols are offsets from the start of the segment. The address of a thread local variable is computed as `__tls_base` + the offset from the start of the segment. `.tdata` segment is a passive segment and `memory.init` is used once per thread to initialize the thread local storage. `__tls_base` is a wasm global. Since each thread has its own wasm instance, it is effectively thread local. Currently, `__tls_base` must be initialized at thread startup, and so cannot be used with dynamic libraries. `__tls_base` is to be initialized with a new linker-synthesized function, `__wasm_init_tls`, which takes as an argument a block of memory to use as the storage for thread locals. It then initializes the block of memory and sets `__tls_base`. As `__wasm_init_tls` will handle the memory initialization, the memory does not have to be zeroed. To help allocate memory for thread-local storage, a new compiler intrinsic is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns the size of the thread-local storage for the current function. The expected usage is to run something like the following upon thread startup: __wasm_init_tls(malloc(__builtin_wasm_tls_size())); Reviewers: tlively, aheejin, kripken, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64537 llvm-svn: 366272	2019-07-16 22:00:45 +00:00
Sanjay Patel	d746a210e1	[x86] use more phadd for reductions This is part of what is requested by PR42023: https://bugs.llvm.org/show_bug.cgi?id=42023 There's an extension needed for FP add, but exactly how we would specify that using flags is not clear to me, so I left that as a TODO. We're still missing patterns for partial reductions when the input vector is 256-bit or 512-bit, but I think that's a failure of vector narrowing. If we can reduce the widths, then this matching should work on those tests. Differential Revision: https://reviews.llvm.org/D64760 llvm-svn: 366268	2019-07-16 21:30:41 +00:00
David Blaikie	40580d36c4	DWARF: Skip zero column for inline call sites D64033 <https://reviews.llvm.org/D64033> added DW_AT_call_column for inline sites. However, that change wasn't aware of "-gno-column-info". To avoid adding column info when "-gno-column-info" is used, now DW_AT_call_column is only added when we have non-zero column (when "-gno-column-info" is used, column will be zero). Patch by Wenlei He! Differential Revision: https://reviews.llvm.org/D64784 llvm-svn: 366264	2019-07-16 21:15:19 +00:00
Matt Arsenault	f8c8284455	AMDGPU/GlobalISel: Select G_ASHR llvm-svn: 366257	2019-07-16 20:31:25 +00:00
Matt Arsenault	e5b28b98e9	AMDGPU/GlobalISel: Select G_LSHR llvm-svn: 366256	2019-07-16 20:25:43 +00:00
Jinsong Ji	65e34a3143	[PowerPC][HTM] Fix impossible reg-to-reg copy assert with ttest builtin Summary: This is exposed by our internal testing. The reduced testcase will assert with "Impossible reg-to-reg copy" We can't use COPY to do 32-bit to 64-bit conversion. Reviewers: kbarton, hfinkel, nemanjai Reviewed By: hfinkel Subscribers: hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64499 llvm-svn: 366255	2019-07-16 20:24:33 +00:00
Matt Arsenault	1b69fd275d	AMDGPU/GlobalISel: Select G_SHL I think this manages to not break the DAG handling with the divergent predicates because the standalone divergent patterns end up with a higher priority than the pattern on the instruction definition. The 16-bit versions don't work yet. llvm-svn: 366254	2019-07-16 20:15:30 +00:00
Stanislav Mekhanoshin	6e0fa292c2	[AMDGPU] Change register type for v32 vectors When it is AReg_1024 this results in unnecessary copying into AGPRs of 32-element vectors even though they are not intended for an mfma instruction. Differential Revision: https://reviews.llvm.org/D64815 llvm-svn: 366252	2019-07-16 20:06:00 +00:00
Michael Liao	ccf22ef94c	Fix -Wreturn-type warning. NFC. llvm-svn: 366251	2019-07-16 19:59:08 +00:00
Matt Arsenault	2d10407719	AMDGPU/GlobalISel: Fix selection of private stores llvm-svn: 366249	2019-07-16 19:27:44 +00:00
Matt Arsenault	7161fb0be5	AMDGPU/GlobalISel: Select private loads llvm-svn: 366248	2019-07-16 19:22:21 +00:00
Matt Arsenault	dad1f89210	AMDGPU/GlobalISel: Select flat stores llvm-svn: 366246	2019-07-16 18:42:53 +00:00
Matt Arsenault	7eb1902cd5	AMDGPU: Add register classes to flat store patterns For some reason GlobalISelEmitter needs register classes to import these, although it works for the load patterns. llvm-svn: 366242	2019-07-16 18:26:42 +00:00
Philip Reames	6e1c3bb181	[IndVars] Speculative fix for an assertion failure seen in bots I don't have an IR sample which is actually failing, but the issue described in the comment is theoretically possible, and should be guarded against even if there's a different root cause for the bot failures. llvm-svn: 366241	2019-07-16 18:23:49 +00:00
Matt Arsenault	8f8d07e93b	AMDGPU: Replace store PatFrags Convert the easy cases to formats understood for GlobalISel. llvm-svn: 366240	2019-07-16 18:21:25 +00:00
Matt Arsenault	35c96598b1	AMDGPU/GlobalISel: Select flat loads Now that the patterns use the new PatFrag address space support, the only blocker to importing most load patterns is the addressing mode complex patterns. llvm-svn: 366237	2019-07-16 18:05:29 +00:00
Nico Weber	d100b5dd01	Teach `llvm-pdbutil pretty -native` about `-injected-sources` `pretty -native -injected-sources -injected-source-content` works with this patch, and produces identical output to the dia version. Differential Revision: https://reviews.llvm.org/D64428 llvm-svn: 366236	2019-07-16 18:04:26 +00:00
Jay Foad	17060f0a54	[AMDGPU] Optimize atomic max/min Summary: Extend the atomic optimizer to handle signed and unsigned max and min operations, as well as add and subtract. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64328 llvm-svn: 366235	2019-07-16 17:44:54 +00:00
Matt Arsenault	c6fd5abecc	AMDGPU: Redefine load PatFrags Rewrite PatFrags using the new PatFrag address space matching in tablegen. These will now work with both SelectionDAG and GlobalISel. llvm-svn: 366234	2019-07-16 17:38:50 +00:00
Michael Liao	b3f967d411	[AMDGPU] Add the adjusted FP as a livein register. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64145 llvm-svn: 366223	2019-07-16 15:57:12 +00:00
Ulrich Weigand	450c62e33e	[Strict FP] Allow more relaxed scheduling Reimplement scheduling constraints for strict FP instructions in ScheduleDAGInstrs::buildSchedGraph to allow for more relaxed scheduling. Specifically, allow one strict FP instruction to be scheduled across another, as long as it is not moved across any global barrier. Differential Revision: https://reviews.llvm.org/D64412 Reviewed By: cameron.mcinally llvm-svn: 366222	2019-07-16 15:55:45 +00:00
Francis Visoiu Mistrih	94bad22c2c	[Remarks] Simplify and refactor the RemarkParser interface Before, everything was based on some kind of type erased parser implementation which contained a lot of boilerplate code when multiple formats were to be supported. This simplifies it by: * the remark now owns its arguments * always returning an error from the implementation side * working around the way the YAML parser reports errors: catch them through callbacks and re-insert them in a proper llvm::Error * add a CParser wrapper that is used when implementing the C API to avoid cluttering the C++ API with useless state * LLVMRemarkParserGetNext now returns an object that needs to be released to avoid leaking resources * add a new API to dispose of a remark entry: LLVMRemarkEntryDispose llvm-svn: 366217	2019-07-16 15:25:05 +00:00
Francis Visoiu Mistrih	cc909812a3	[Remarks][NFC] Combine ParserFormat and SerializerFormat It's useless to have both. llvm-svn: 366216	2019-07-16 15:24:59 +00:00
Amara Emerson	228a7b4f2a	[ADCE] Fix non-deterministic behaviour due to iterating over a pointer set. Original patch by Yann Laigle-Chapuy Differential Revision: https://reviews.llvm.org/D64785 llvm-svn: 366215	2019-07-16 15:23:10 +00:00
Amaury Sechet	f34a69c2e2	[DAGCombiner] fold (addcarry (xor a, -1), b, c) -> (subcarry b, a, !c) and flip carry. Summary: As per title. DAGCombiner only matches the special case where b = 0, this patch extends the pattern to match any value of b. Depends on D57302 Reviewers: hfinkel, RKSimon, craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59208 llvm-svn: 366214	2019-07-16 15:17:00 +00:00
Matt Arsenault	22c4a147a9	AMDGPU/GlobalISel: Fix test failures in release build Apparently the check for legal instructions during instruction select does not happen without an asserts build, so these would successfully select in release, and fail in debug. Make s16 and/or/xor legal. These can just be selected directly to the 32-bit operation, as is already done in SelectionDAG, so just make them legal. llvm-svn: 366210	2019-07-16 14:28:30 +00:00
Kyrylo Tkachov	eb72138340	[AArch64] Implement __jcvt intrinsic from Armv8.3-A The jcvt intrinsic defined in ACLE [1] is available when ARM_FEATURE_JCVT is defined. This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function. The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used. I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example). make check-all didn't show any new failures. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics Differential Revision: https://reviews.llvm.org/D64495 llvm-svn: 366197	2019-07-16 09:27:39 +00:00
Kyrylo Tkachov	a3e26d1a6c	[NFC] Test commit: add full stop at end of comment llvm-svn: 366195	2019-07-16 09:15:01 +00:00
Igor Kudrin	f48bc01812	[DWARF] Fix the reserved values for unit length in DWARFDebugLine. The DWARF3 documentation had inconsistency concerning the reserved range for unit length values. The issue was fixed in DWARF4. Differential Revision: https://reviews.llvm.org/D64622 llvm-svn: 366190	2019-07-16 07:01:08 +00:00
Igor Kudrin	74c350af21	[DWARF] Fix an incorrect format specifier. This adjusts the format specifier because PCOffset is uint16_t. Differential Revision: https://reviews.llvm.org/D64620 llvm-svn: 366189	2019-07-16 06:56:10 +00:00
Igor Kudrin	860f7ec058	[DWARF] Simplify DWARFAttribute. NFC. The first argument in the constructor was ignored, and the remaining arguments were always passed as their defaults. Differential Revision: https://reviews.llvm.org/D64407 llvm-svn: 366188	2019-07-16 06:53:06 +00:00
Craig Topper	c0b2ed664b	[X86] In combineStore, don't convert v2f32 load/store pairs to f64 loads/stores. Type legalization can take care of this. This gives DAG combine a little more time with the original types. llvm-svn: 366182	2019-07-16 05:52:27 +00:00
Alex Bradbury	1ffceaa543	[RISCV] Match GNU tools canonical JALR and add aliases The canonical GNU form of JALR resembles a load/store instruction rather than placing the immediate offset as a separate argument, so match this behaviour. Also add parser-only aliases for the three-operand form, and add other shorter aliases also emitted by GNU tools. Differential Revision: https://reviews.llvm.org/D55277 Patch by James Clarke. llvm-svn: 366179	2019-07-16 04:56:43 +00:00
Rui Ueyama	49a3ad21d6	Fix parameter name comments using clang-tidy. NFC. This patch applies clang-tidy's bugprone-argument-comment tool to LLVM, clang and lld source trees. Here is how I created this patch: $ git clone https://github.com/llvm/llvm-project.git $ cd llvm-project $ mkdir build $ cd build $ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_ENABLE_PROJECTS='clang;lld;clang-tools-extra' \ -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DLLVM_ENABLE_LLD=On \ -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ../llvm $ ninja $ parallel clang-tidy -checks='-*,bugprone-argument-comment' \ -config='{CheckOptions: [{key: StrictMode, value: 1}]}' -fix \ ::: ../llvm/lib/**/*.{cpp,h} ../clang/lib/**/*.{cpp,h} ../lld/**/*.{cpp,h} llvm-svn: 366177	2019-07-16 04:46:31 +00:00
Alex Bradbury	bb479ca311	[RISCV] Avoid overflow when determining number of nops for code align RISCVAsmBackend::shouldInsertExtraNopBytesForCodeAlign() assumed that the align specified would be greater than or equal to the minimum nop length, but that is not always the case - for example if a user specifies ".align 0" in assembly. Differential Revision: https://reviews.llvm.org/D63274 Patch by Edward Jones. llvm-svn: 366176	2019-07-16 04:40:25 +00:00
Alex Bradbury	e9ad0cf6cf	[RISCV] Fix a potential issue in shouldInsertFixupForCodeAlign() The bool result of shouldInsertExtraNopBytesForCodeAlign() is not checked but the returned nop count is unconditionally read even though it could be uninitialized. Differential Revision: https://reviews.llvm.org/D63285 Patch by Edward Jones. llvm-svn: 366175	2019-07-16 04:37:19 +00:00
Alex Bradbury	ef8577ef98	[RISCV][NFC] Split PseudoCALL pattern out from instruction Since PseudoCALL defines AsmString, it can be generated from assembly, and so code-gen patterns should be defined separately to be consistent with the style of the RISCV backend. Other pseudo-instructions exist that have code-gen patterns defined directly, but these instructions are purely for code-gen and cannot be written in assembly. Differential Revision: https://reviews.llvm.org/D64012 Patch by James Clarke. llvm-svn: 366174	2019-07-16 03:56:45 +00:00
Alex Bradbury	a3c7b27419	[RISCV][NFC] Fix HasStedExtA -> HasStdExtA typo in comment Differential Revision: https://reviews.llvm.org/D64011 Patch by James Clarke. llvm-svn: 366173	2019-07-16 03:54:08 +00:00
Alex Bradbury	4ac0b9be23	[RISCV] Make RISCVELFObjectWriter::getRelocType check IsPCRel Previously, this function didn't check the IsPCRel argument. But doing so is a useful check for errors, and also seemingly necessary for FK_Data_4 (which we produce a R_RISCV_32_PCREL relocation for if IsPCRel). Other than R_RISCV_32_PCREL, this should be NFC. Future exception handling related patches will include tests that capture this behaviour. llvm-svn: 366172	2019-07-16 03:47:34 +00:00
Peter Collingbourne	e5c4b468f0	hwasan: Pad arrays with non-1 size correctly. Spotted by eugenis. Differential Revision: https://reviews.llvm.org/D64783 llvm-svn: 366171	2019-07-16 03:25:50 +00:00
Matt Arsenault	1739b700b1	AMDGPU: Avoid code predicates for extload PatFrags Use the MemoryVT field. This will be necessary for tablegen to automatically handle patterns for GlobalISel. Doesn't handle the d16 lo/hi patterns. Those are a special case since it involves the custom node type. llvm-svn: 366168	2019-07-16 02:46:05 +00:00
Jonas Devlieghere	ca16d280f7	Re-land "[DebugInfo] Move function from line table to the prologue (NFC)" In LLDB, when parsing type units, we don't need to parse the whole line table. Instead, we only need to parse the "support files" from the line table prologue. To make that possible, this patch moves the respective functions from the LineTable into the Prologue. Because I don't think users of the LineTable should have to know that these files come from the Prologue, I've left the original methods in place, and made them redirect to the LineTable. Differential revision: https://reviews.llvm.org/D64774 llvm-svn: 366164	2019-07-16 01:21:25 +00:00
Michael Liao	543ba4e9e0	[InstructionSimplify] Apply sext/trunc after pointer stripping Summary: - As the pointer stripping could trace through `addrspacecast` now, need to sext/trunc the offset to ensure it has the same width as the pointer after stripping. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64768 llvm-svn: 366162	2019-07-16 01:03:06 +00:00
Jonas Devlieghere	01ee172e9e	Revert "[DebugInfo] Move function from line table to the prologue (NFC)" This broke LLD, which I didn't have enabled. llvm-svn: 366160	2019-07-16 00:59:04 +00:00
Jonas Devlieghere	509903e887	[DebugInfo] Move function from line table to the prologue (NFC) In LLDB, when parsing type units, we don't need to parse the whole line table. Instead, we only need to parse the "support files" from the line table prologue. To make that possible, this patch moves the respective functions from the LineTable into the Prologue. Because I don't think users of the LineTable should have to know that these files come from the Prologue, I've left the original methods in place, and made them redirect to the LineTable. Differential revision: https://reviews.llvm.org/D64774 llvm-svn: 366158	2019-07-16 00:37:17 +00:00
Eric Christopher	93dfb93ad6	Temporarily Revert "[SLP] Recommit: Look-ahead operand reordering heuristic." As there are some reported miscompiles with AVX512 and performance regressions in Eigen. Verified with the original committer and testcases will be forthcoming. This reverts commit r364964. llvm-svn: 366154	2019-07-15 23:36:02 +00:00
Leonard Chan	bb147aabc6	Revert "[NewPM] Port Sancov" This reverts commit `5652f35817`. llvm-svn: 366153	2019-07-15 23:18:31 +00:00
Craig Topper	51193871da	[X86] Teach convertToThreeAddress to handle SUB with immediate We mostly avoid sub with immediate but there are a couple cases that can create them. One is the add 128, %rax -> sub -128, %rax trick in isel. The other is when a SUB immediate gets created for a compare where both the flags and the subtract value are used. If we are unable to linearize the SelectionDAG to satisfy the flag user and the sub result user from the same instruction, we will clone the sub immediate for the two uses. The one that produces flags will eventually become a compare. The other will have its flag output dead, and could then be considered for LEA creation. I added additional test cases to add.ll to show that the sub -128 trick gets converted to LEA and a case where we don't need to convert it. This showed up in the current codegen for PR42571. Differential Revision: https://reviews.llvm.org/D64574 llvm-svn: 366151	2019-07-15 23:07:56 +00:00
Heejin Ahn	1cf6922660	[WebAssembly] Add missing utility methods for exnref type Summary: This adds missing utility methods and copy instruction handling for `exnref` type and also adds tests. `tee` instruction tests are missing because `isTee` is currently only used in ExplicitLocals pass and testing that pass in mir requires serialization of stackified registers in mir files, which is a bit nontrivial because `MachineFunctionInfo` only has info of vreg numbers (which are large integers) but not the mir's register numbers. But this change is quite trivial anyway. Reviewers: tlively Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64705 llvm-svn: 366149	2019-07-15 23:04:00 +00:00
Heejin Ahn	9f96a58ccc	[WebAssembly] Rename except_ref type to exnref Summary: We agreed to rename `except_ref` to `exnref` for consistency with other reference types in https://github.com/WebAssembly/exception-handling/issues/79. This also renames WebAssemblyInstrExceptRef.td to WebAssemblyInstrRef.td in order to use the file for other reference types in future. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64703 llvm-svn: 366145	2019-07-15 22:49:25 +00:00
Wouter van Oortmerssen	292e21d8bc	[WebAssembly] Assembler: support special floats: infinity / nan Summary: These are emitted as identifiers by the InstPrinter, so we should parse them as such. These could potentially clash with symbols of the same name, but that is out of our (the WebAssembly backend) control. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64770 llvm-svn: 366139	2019-07-15 22:13:39 +00:00
Austin Kerbow	423b4a18a4	[AMDGPU] Enable merging m0 initializations. Summary: Enable hoisting and merging m0 defs that are initialized with the same immediate value. Fixes bug where removed instructions are not considered to interfere with other inits, and make sure to not hoist inits before block prologues. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64766 llvm-svn: 366135	2019-07-15 22:07:05 +00:00
Simon Atanasyan	becae2b232	[mips] Print BEQZL and BNEZL pseudo instructions One of the reasons - to be compatible with GNU tools. llvm-svn: 366133	2019-07-15 21:46:38 +00:00
Matt Arsenault	b082f1055b	AMDGPU: Use standalone MUBUF load patterns We already do this for the flat and DS instructions, although it is certainly uglier and more verbose. This will allow using separate pattern definitions for extload and zextload. Currently we get away with using a single PatFrag with custom predicate code to check if the extension type is a zextload or anyextload. The generic mechanism the global isel emitter understands treats these as mutually exclusive. I was considering making the pattern emitter accept zextload or sextload extensions for anyextload patterns, but in global isel, the different extending loads have distinct opcodes, and there is currently no mechanism for an opcode matcher to try multiple (and there probably is very little need for one beyond this case). llvm-svn: 366132	2019-07-15 21:41:44 +00:00
Nick Desaulniers	c4f245b40a	[LoopUnroll+LoopUnswitch] do not transform loops containing callbr Summary: There is currently a correctness issue when unrolling loops containing callbr's where their indirect targets are being updated correctly to the newly created labels, but their operands are not. This manifests in unrolled loops where the second and subsequent copies of callbr instructions have blockaddresses of the label from the first instance of the unrolled loop, which would result in nonsensical runtime control flow. For now, conservatively do not unroll the loop. In the future, I think we can pursue unrolling such loops provided we transform the cloned callbr's operands correctly. Such a transform and its legalities are being discussed in: https://reviews.llvm.org/D64101 Link: https://bugs.llvm.org/show_bug.cgi?id=42489 Link: https://groups.google.com/forum/#!topic/clang-built-linux/z-hRWP9KqPI Reviewers: fhahn, hfinkel, efriedma Reviewed By: fhahn, hfinkel, efriedma Subscribers: efriedma, hiraditya, zzheng, dmgreen, llvm-commits, pirama, kees, nathanchance, E5ten, craig.topper, chandlerc, glider, void, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D64368 llvm-svn: 366130	2019-07-15 21:16:29 +00:00
Matt Arsenault	66ee934440	AMDGPU/GlobalISel: Allow scalar s1 and/or/xor If a 1-bit value is in a 32-bit VGPR, the scalar opcodes set SCC to whether the result is 0. If the inputs are SCC, these can be copied to a 32-bit SGPR to produce an SCC result. llvm-svn: 366125	2019-07-15 20:20:18 +00:00
Evgeniy Stepanov	c5e7f56249	ARM MTE stack sanitizer. Add "memtag" sanitizer that detects and mitigates stack memory issues using armv8.5 Memory Tagging Extension. It is similar in principle to HWASan, which is a software implementation of the same idea, but there are enough differences to warrant a new sanitizer type IMHO. It is also expected to have very different performance properties. The new sanitizer does not have a runtime library (it may grow one later, along with a "debugging" mode). Similar to SafeStack and StackProtector, the instrumentation pass (in a follow up change) will be inserted in all cases, but will only affect functions marked with the new sanitize_memtag attribute. Reviewers: pcc, hctim, vitalybuka, ostannard Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64169 llvm-svn: 366123	2019-07-15 20:02:23 +00:00
Matt Arsenault	c8291c94f8	AMDGPU/GlobalISel: Select G_AND/G_OR/G_XOR llvm-svn: 366121	2019-07-15 19:50:07 +00:00
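
Note on f34a69c2e2 (the DAGCombiner addcarry/subcarry fold listed above): the following is a minimal sketch, not taken from the patch, that checks the arithmetic identity behind the fold on small bit widths. The helpers `addcarry`, `subcarry`, and `fold_holds` are hypothetical names that model plain n-bit unsigned arithmetic; they are an assumption for illustration and are not LLVM's ISD nodes or any LLVM API.

```python
def addcarry(x, y, carry_in, bits):
    """Model of an n-bit add with carry-in; returns (result, carry_out)."""
    mask = (1 << bits) - 1
    total = x + y + carry_in
    return total & mask, total >> bits


def subcarry(x, y, borrow_in, bits):
    """Model of an n-bit subtract with borrow-in; returns (result, borrow_out)."""
    mask = (1 << bits) - 1
    diff = x - y - borrow_in
    return diff & mask, 1 if diff < 0 else 0


def fold_holds(bits):
    """Exhaustively check addcarry(~a, b, c) against subcarry(b, a, !c).

    The results must match and the carry-out must be the inverse of the
    borrow-out, which is the "flip carry" part of the commit title.
    """
    mask = (1 << bits) - 1
    for a in range(1 << bits):
        for b in range(1 << bits):
            for c in (0, 1):
                sum_res, carry = addcarry(a ^ mask, b, c, bits)  # xor a, -1
                sub_res, borrow = subcarry(b, a, 1 - c, bits)    # !c
                if sum_res != sub_res or carry != 1 - borrow:
                    return False
    return True


if __name__ == "__main__":
    for width in (1, 2, 4, 6):
        print(width, fold_holds(width))
```

The check is exhaustive only for the small widths it is run on; it illustrates why the rewrite is value-preserving for any `b`, and says nothing about how the combine is implemented in DAGCombiner.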

125108 Commits