llvm-project

Commit Graph

Author	SHA1	Message	Date
Dmitry Preobrazhensky	4ccb7f8c45	[AMDGPU][MC] Corrected parsing of branch offsets See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D64629 llvm-svn: 366571	2019-07-19 13:12:47 +00:00
Kai Luo	dec624682e	[MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs. Summary: Current PRE hoists common computations into CMBB = DT->findNearestCommonDominator(MBB, MBB1). However, if CMBB is in a hot loop body, we might get performance degradation. Differential Revision: https://reviews.llvm.org/D64394 llvm-svn: 366570	2019-07-19 12:58:16 +00:00
Than McIntosh	e238a4c757	[X86] for split stack, not save/restore nested arg if unused Summary: For split-stack, if the nested argument (i.e. R10) is not used, no need to save/restore it in the prologue. Reviewers: thanm Reviewed By: thanm Subscribers: mstorsjo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64673 llvm-svn: 366569	2019-07-19 12:54:44 +00:00
Oliver Stannard	8780c0dda2	Don't update NoTrappingFPMath and FPDenormalMode in resetTargetOptions We'd like to remove this whole function, because these are properties of functions, not the target as a whole. These two are easy to remove because they are only used for emitting ARM build attributes, which expects them to represent the defaults for the whole module, not just the last function generated. This is needed to get correct build attributes when using IPRA on ARM, because IPRA causes resetTargetOptions to get called before ARMAsmPrinter::emitAttributes. Differential revision: https://reviews.llvm.org/D64929 llvm-svn: 366562	2019-07-19 10:37:37 +00:00
Oliver Stannard	0ed7732671	[IPRA] Don't rely on non-exact function definitions If a function definition is not exact, then the linker could select a differently-compiled version of it, which could use different registers. https://reviews.llvm.org/D64909 llvm-svn: 366557	2019-07-19 09:59:26 +00:00
Mikhail Maltsev	0b001f94a5	[ARM] Add <saturate> operand to SQRSHRL and UQRSHLL Summary: According to the new Armv8-M specification https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the instructions SQRSHRL and UQRSHLL now have an additional immediate operand <saturate>. The new assembly syntax is: SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm where <saturate> can be either 64 (the existing behavior) or 48, in that case the result is saturated to 48 bits. The new operand is encoded as follows: #64 Encoded as sat = 0 #48 Encoded as sat = 1 sat is bit 7 of the instruction bit pattern. This patch adds a new assembler operand class MveSaturateOperand which implements parsing and encoding. Decoding is implemented in DecodeMVEOverlappingLongShift. Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64810 llvm-svn: 366555	2019-07-19 09:46:28 +00:00
Hubert Tong	2711e16b35	[sanitizers] Use covering ObjectFormatType switches Summary: This patch removes the `default` case from some switches on `llvm::Triple::ObjectFormatType`, and cases for the missing enumerators (`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added. For `UnknownObjectFormat`, the effect of the action for the `default` case is maintained; otherwise, where `llvm_unreachable` is called, `report_fatal_error` is used instead. Where the `default` case returns a default value, `report_fatal_error` is used for XCOFF as a placeholder. For `Wasm`, the effect of the action for the `default` case is maintained. The code is structured to avoid strongly implying that the `Wasm` case is present for any reason other than to make the switch cover all `ObjectFormatType` enumerator values. Reviewers: sfertile, jasonliu, daltenty Reviewed By: sfertile Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64222 llvm-svn: 366544	2019-07-19 08:46:18 +00:00
Jay Foad	7d06ffff46	[AMDGPU] Simplify the exclusive scan used for optimized atomics Summary: Change the scan algorithm to use only power-of-two shifts (1, 2, 4, 8, 16, 32) instead of starting off shifting by 1, 2 and 3 and then doing a 3-way ADD, because: 1. It simplifies the compiler a little. 2. It minimizes vgpr pressure because each instruction is now of the form vn = vn + vn << c. 3. It is more friendly to the DPP combiner, which currently can't combine into an ADD3 instruction. Because of #2 and #3 the end result is improved from this: v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf v_mov_b32_dpp v1, v3 row_shr:3 row_mask:0xf bank_mask:0xf v_add3_u32 v1, v4, v5, v1 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf To this: v_add_u32_dpp v1, v1, v1 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0 s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe s_nop 1 v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf s_nop 1 v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf I.e. two fewer computational instructions, one extra nop where we could schedule something else. Reviewers: arsenm, sheredom, critson, rampitec, vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64411 llvm-svn: 366543	2019-07-19 08:40:37 +00:00
Serguei Katkov	bde33af85a	[Loop Peeling] Enable peeling of multiple exits by default. Enable loop peeling with multiple exits where all non-latch exits end up with deopt by default. Reviewers: reames, fhahn Reviewed By: reames Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D64619 llvm-svn: 366542	2019-07-19 08:35:45 +00:00
Roman Lebedev	f2eb403144	[InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) Normally, the inner pattern is sign-extend, but for our purposes it's no different to other patterns: alive proofs: f: https://rise4fun.com/Alive/7U3 For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64524 llvm-svn: 366540	2019-07-19 08:26:58 +00:00
Roman Lebedev	441c9d6ca8	[InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: e: https://rise4fun.com/Alive/0FT For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64521 llvm-svn: 366539	2019-07-19 08:26:47 +00:00
Roman Lebedev	3c212ce305	[InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: d: https://rise4fun.com/Alive/I5Y For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64519 llvm-svn: 366538	2019-07-19 08:26:37 +00:00
Roman Lebedev	2ebe57386d	[InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`) alive proofs: c: https://rise4fun.com/Alive/RgJh For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64517 llvm-svn: 366537	2019-07-19 08:26:25 +00:00
Roman Lebedev	4422a1657c	[InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: b. `(x & (~(-1 << maskNbits))) << shiftNbits` All these patterns can be simplified to just: `x << ShiftShAmt` iff: b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: b: https://rise4fun.com/Alive/y8M For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Differential Revision: https://reviews.llvm.org/D64514 llvm-svn: 366536	2019-07-19 08:26:13 +00:00
Roman Lebedev	a5f0824eb5	[InstCombine] Dropping redundant masking before left-shift [0/5] (PR42563) Summary: If we have some pattern that leaves only some low bits set, and then performs left-shift of those bits, if none of the bits that are left after the final shift are modified by the mask, we can omit the mask. There are many variants to this pattern: a. `(x & ((1 << MaskShAmt) - 1)) << ShiftShAmt` All these patterns can be simplified to just: `x << ShiftShAmt` iff: a. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` alive proof: a: https://rise4fun.com/Alive/wi9 Indeed, not all of these patterns are canonical. But since this fold will only produce a single instruction i'm really interested in handling even uncanonical patterns, since i have this general kind of pattern in hotpaths, and it is not totally outlandish for bit-twiddling code. For now let's start with patterns where both shift amounts are variable, with trivial constant "offset" between them, since i believe this is both simplest to handle and i think this is most common. But again, there are likely other variants where we could use ValueTracking/ConstantRange to handle more cases. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic, huihuiz, xbolva00 Reviewed By: xbolva00 Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64512 llvm-svn: 366535	2019-07-19 08:25:43 +00:00
Hsiangkai Wang	c5ecdd3c5a	[DebugInfo] Some fields do not need relocations even relax is enabled. In debug frame information, some fields, e.g., Length in CIE/FDE and Offset in FDE are attributes to describe the structure of CIE/FDE. They are not related to the relaxed code. However, these attributes are symbol differences. So, in current design, these attributes will be filled as zero and LLVM generates relocations for them. We only need to generate relocations for symbols in executable sections. So, if the symbols are not located in executable sections, we still evaluate their values under relaxation. Differential Revision: https://reviews.llvm.org/D61584 llvm-svn: 366531	2019-07-19 06:10:36 +00:00
Hsiangkai Wang	18ccfadd46	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366524	2019-07-19 02:03:34 +00:00
Bill Wendling	ccbffefcca	Use the MachineBasicBlock symbol for a callbr target Summary: Inline asm doesn't use labels when compiled as an object file. Therefore, we shouldn't create one for the (potential) callbr destination. Instead, use the symbol for the MachineBasicBlock. Reviewers: nickdesaulniers, craig.topper Reviewed By: nickdesaulniers Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64888 llvm-svn: 366523	2019-07-19 01:10:28 +00:00
Amara Emerson	cf12c7815f	[GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then has each target specify, using the new custom legalizer hook for intrinsics, that it wants them expanded to a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516	2019-07-19 00:24:45 +00:00
Stanislav Mekhanoshin	a9c71e01e7	[AMDGPU] Drop Reg32 and use regular AsmName This allows to reduce generated AMDGPUGenAsmWriter.inc by ~100Kb. Differential Revision: https://reviews.llvm.org/D64952 llvm-svn: 366505	2019-07-18 22:18:33 +00:00
Jessica Paquette	7a1dcc5ff1	[GlobalISel][AArch64] Add support for base register + offset register loads Add support for folding G_GEPs into loads of the form ``` ldr reg, [base, off] ``` when possible. This can save an add before the load. Currently, this is only supported for loads of 64 bits into 64 bit registers. Add a new addressing mode function, `selectAddrModeRegisterOffset` which performs this folding when it is profitable. Also add a test for addressing modes for G_LOAD. Differential Revision: https://reviews.llvm.org/D64944 llvm-svn: 366503	2019-07-18 21:50:11 +00:00
Peter Collingbourne	50057f3288	CodeGen: Allow !associated metadata to point to aliases. This is a small extension of !associated, mostly useful for the implementation convenience of instrumentation passes that RAUW globals with aliases, such as LowerTypeTests. Differential Revision: https://reviews.llvm.org/D64951 llvm-svn: 366502	2019-07-18 21:37:16 +00:00
Reid Kleckner	ba9c9e62cb	Revert [X86] EltsFromConsecutiveLoads - support common source loads This reverts r366441 (git commit `48104ef7c9`) This causes clang to fail to compile some file in Skia. Reduction soon. llvm-svn: 366501	2019-07-18 21:26:41 +00:00
Guanzhong Chen	df4479200b	[WebAssembly] Fix __builtin_wasm_tls_base intrinsic Summary: Properly generate the outchain for the `__builtin_wasm_tls_base` intrinsic. Also marked the intrinsic pure, per @sunfish's suggestion. Reviewers: tlively, aheejin, sbc100, sunfish Reviewed By: tlively Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits, sunfish Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64949 llvm-svn: 366499	2019-07-18 21:17:52 +00:00
Peter Collingbourne	68f3fc2d91	Fix typo in r366494. Spotted by Yuanfang Chen. llvm-svn: 366497	2019-07-18 21:03:37 +00:00
Steven Wu	dac7fca530	Remove the static initializer introduced in r365099 Summary: Some polish for r365099 which adds a static initializer to MachOObjectFile. Remove it by moving it to file scope. Reviewers: smeenai, alexshap, compnerd, mtrent, anushabasana Reviewed By: smeenai Subscribers: hiraditya, jkorous, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64873 llvm-svn: 366496	2019-07-18 21:01:21 +00:00
Peter Collingbourne	d1ec8eb84f	IR: Teach Constant::needsRelocation() that relative pointers don't need to be relocated. This causes sections with relative pointers to be marked as read only, which means that they won't end up sharing pages with writable data. Differential Revision: https://reviews.llvm.org/D64948 llvm-svn: 366494	2019-07-18 20:56:21 +00:00
Jordan Rose	887d31ccee	FileSystem: Check for DTTOIF alone, not _DIRENT_HAVE_D_TYPE While 'd_type' is a non-standard extension to `struct dirent`, only glibc signals its presence with a macro '_DIRENT_HAVE_D_TYPE'. However, any platform with 'd_type' also includes a way to convert to mode_t values using the macro 'DTTOIF', so we can check for that alone and still be confident that the 'd_type' member exists. (If this turns out to be wrong, I'll go back and set up an actual CMake check.) I couldn't think of how to write a test for this, because I couldn't think of how to test that a 'stat' call doesn't happen without controlling the filesystem or intercepting 'stat', and there's no good cross-platform way to do that that I know of. Follow-up (almost a year later) to r342089. rdar://problem/50592673 https://reviews.llvm.org/D64940 llvm-svn: 366486	2019-07-18 20:05:11 +00:00
Lang Hames	9e52d0576a	[ORC] Suppress an ORCv1 deprecation warning. llvm-svn: 366485	2019-07-18 19:55:42 +00:00
Amy Huang	f332fe642c	[COFF] Change a variable type to be const in the HeapAllocSite map. llvm-svn: 366479	2019-07-18 18:22:52 +00:00
Guanzhong Chen	801fa8e6b9	[WebAssembly] Implement __builtin_wasm_tls_base intrinsic Summary: Add `__builtin_wasm_tls_base` so that LeakSanitizer can find the thread-local block and scan through it for memory leaks. Reviewers: tlively, aheejin, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64900 llvm-svn: 366475	2019-07-18 17:53:22 +00:00
Michael Liao	17a8a9277c	[LAA] Re-check bit-width of pointers after stripping. Summary: - As the pointer stripping now tracks through `addrspacecast`, prepare to handle the bit-width difference from the result pointer. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64928 llvm-svn: 366470	2019-07-18 17:30:27 +00:00
Peter Collingbourne	aa6a7df64a	MC: AArch64: Add support for prel_g* relocation specifiers. Differential Revision: https://reviews.llvm.org/D64683 llvm-svn: 366462	2019-07-18 16:54:33 +00:00
Peter Collingbourne	76427f849f	AArch64: Unify relocation restrictions between MOVK/MOVN/MOVZ. There doesn't seem to be a practical reason for these instructions to have different restrictions on the types of relocations that they may be used with, notwithstanding the language in the ELF AArch64 spec that implies that specific relocations are meant to be used with specific instructions. For example, we currently forbid the first instruction in the following sequence, despite it currently being used by clang to generate a global reference under -mcmodel=large: movz x0, #:abs_g0_nc:foo movk x0, #:abs_g1_nc:foo movk x0, #:abs_g2_nc:foo movk x0, #:abs_g3:foo Therefore, allow MOVK/MOVN/MOVZ to accept the union of the set of relocations that they currently accept individually. Differential Revision: https://reviews.llvm.org/D64466 llvm-svn: 366461	2019-07-18 16:51:53 +00:00
Hsiangkai Wang	657277e0f1	Revert "[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame." This reverts commit 17e3cbf5fe656483d9016d0ba9e1d0cd8629379e. llvm-svn: 366444	2019-07-18 15:06:50 +00:00
Hsiangkai Wang	e43ce1a958	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366442	2019-07-18 14:47:34 +00:00
Simon Pilgrim	48104ef7c9	[X86] EltsFromConsecutiveLoads - support common source loads This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load. A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte-offset loads are then matched against a previous load that has already been confirmed to be a consecutive match. Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle. Differential Revision: https://reviews.llvm.org/D64551 llvm-svn: 366441	2019-07-18 14:33:25 +00:00
Simon Pilgrim	8b525e357f	[DAGCombine] Pull getSubVectorSrc helper out of narrowInsertExtractVectorBinOp. NFCI. NFC step towards reusing this in other EXTRACT_SUBVECTOR combines. llvm-svn: 366435	2019-07-18 13:45:53 +00:00
Thomas Preud'homme	70494494c1	[FileCheck] Fix numeric variable redefinition Summary: Commit r365249 changed usage of FileCheckNumericVariable to have one instance of that class per variable as opposed to one instance per definition of a given variable as was done before. However, it retained the safety check in setValue that it should only be called with the variable unset, even after r365625. However, this causes an assertion failure when a non-pseudo variable is being redefined. And while redefinition of @LINE at each CHECK line works in the general case, it caused a problem when a substitution failed (fixed in r365624) and still causes a problem when a CHECK line does not match, since @LINE's value is cleared after substitutions in match() happened but printSubstitutions also attempts a substitution. This commit solves the root of the problem by changing setValue to set a new value regardless of whether a value was set or not, thus fixing all the aforementioned issues. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D64882 llvm-svn: 366434	2019-07-18 13:39:04 +00:00
Sanjay Patel	e654785912	[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483) LEA doesn't affect flags, so use it more liberally to replace an ADD when we know that the ADD operands affect flags. In the motivating example from PR40483: https://bugs.llvm.org/show_bug.cgi?id=40483 ...this lets us avoid duplicating a math op just to avoid flag conflict. As mentioned in the TODO comments, this heuristic can be extended to fire more often if that leads to more improvements. Differential Revision: https://reviews.llvm.org/D64707 llvm-svn: 366431	2019-07-18 12:48:01 +00:00
Diogo N. Sampaio	11512e742b	[ARM][DAGCOMBINE][FIX] PerformVMOVRRDCombine Summary: PerformVMOVRRDCombine omits adding an offset of 4 to the PointerInfo when converting an f64 = load[M] to {i32, i32} = {load[M], load[M + 4]}, which would allow the machine scheduler to break dependencies with the second load. - pr42638 Reviewers: eli.friedman, dmgreen, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64870 llvm-svn: 366423	2019-07-18 10:05:56 +00:00
Chen Zheng	c38e3efe27	[SCEV] add no wrap flag for SCEVAddExpr. Differential Revision: https://reviews.llvm.org/D64868 llvm-svn: 366419	2019-07-18 09:23:19 +00:00
Alex Bradbury	b8d352a08b	[RISCV] Reset NoPHIS MachineFunctionProperty in emitSelectPseudo We inserted PHIs where there were none before, so the property must be reset. This error was found on an EXPENSIVE_CHECKS build. llvm-svn: 366412	2019-07-18 07:52:41 +00:00
Serguei Katkov	0ffa833d54	[LoopInfo] Use early return in branch weight update functions. NFC. llvm-svn: 366411	2019-07-18 07:36:20 +00:00
Craig Topper	8da0402210	[X86] Disable combineConcatVectors for vXi1 vectors. I'm not convinced the code this calls is properly vetted for vXi1 vectors. Experimental vector widening legalization testing for D55251 is now hitting an assertion failure inside EltsFromConsecutiveLoads. This is occurring from a v2i1 load having a store size different than its VT size. Hopefully this commit will keep such issues from happening. llvm-svn: 366405	2019-07-18 06:18:06 +00:00
Alex Bradbury	44deaf7e54	[DWARF][RISCV] Add support for RISC-V relocations needed for debug info When code relaxation is enabled many RISC-V fixups are not resolved but instead relocations are emitted. This happens even for DWARF debug sections. Therefore, to properly support the parsing of DWARF debug info we need to be able to resolve RISC-V relocations. This patch adds: * Support for RISC-V relocations in RelocationResolver * DWARF support for two relocations per object file offset * DWARF changes to support relocations in more DIE fields The two relocations per offset change is needed because some RISC-V relocations (used for label differences) come in pairs. Relocations can also be emitted for DWARF fields where relocations were not yet evaluated. Adding relocation support for some of these fields is essential. On the other hand, LLVM currently emits RISC-V relocations for fixups that could be safely evaluated, since they can never be affected by code relaxations. This patch also adds relocation support for the fields affected by those extraneous relocations (the DWARF unit entry Length, and the DWARF debug line entry TotalLength and PrologueLength), for testing purposes. Differential Revision: https://reviews.llvm.org/D62062 Patch by Luís Marques. llvm-svn: 366402	2019-07-18 05:22:55 +00:00
Alex Bradbury	8aba95d64c	[RISCV] Avoid signed integer overflow UB in RISCVMatInt::generateInstSeq Found by UBSan. llvm-svn: 366398	2019-07-18 04:02:58 +00:00
Alex Bradbury	ad73a436dc	[RISCV] Don't access an invalidated iterator in RISCVInstrInfo::removeBranch Issue found by ASan. llvm-svn: 366397	2019-07-18 03:23:47 +00:00
Fangrui Song	f358cf8de2	[AArch64] Add dependency from AArch64CodeGen to TransformUtils to fix -DBUILD_SHARED_LIBS=on link error after D64173/r366361 This fixes: ld.lld: error: undefined symbol: llvm::findAllocaForValue(llvm::Value, llvm::DenseMap<llvm::Value, llvm::AllocaInst, llvm::DenseMapInfo<llvm::Value>, llvm::detail::DenseMapPair<llvm::Value, llvm::AllocaInst> >&) >>> referenced by AArch64StackTagging.cpp llvm-svn: 366396	2019-07-18 01:53:08 +00:00
Nilanjana Basu	4e22770219	Changes to display code view debug info type records in hex format llvm-svn: 366390	2019-07-17 23:43:58 +00:00
Evgeniy Stepanov	6abd78cc7c	Make DT a transitive dependency of LI. Summary: LoopInfoWrapperPass::verify uses DT, which means DT must be alive even if it has no direct users. Fixes a crash in expensive checks mode. Reviewers: pcc, leonardchan Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64896 llvm-svn: 366388	2019-07-17 23:31:59 +00:00
Denis Bakhvalov	3eab4819f2	[llvm-bcanalyzer] Fixed error 'Expected<T> must be checked before access or destruction' After rL365286 I had failing test: LLVM :: tools/gold/X86/v1.12/thinlto_emit_linked_objects.ll It was failing with the output: $ llvm-bcanalyzer --dump llvm/test/tools/gold/X86/v1.12/Output/thinlto_emit_linked_objects.ll.tmp3.o.thinlto.bc Expected<T> must be checked before access or destruction. Unchecked Expected<T> contained error: Unexpected end of file reading 0 of 0 bytesStack dump: Change-Id: I07e03262074ea5e0aae7a8d787d5487c87f914a2 llvm-svn: 366387	2019-07-17 23:28:39 +00:00
Nico Weber	7bb5fc0583	llvm-pdbdump: Fix several smaller issues with injected source compression handling - getCompression() used to return a PDB_SourceCompression even though the docs for IDiaInjectedSource are explicit about the return value being compiler-dependent. Return an uint32_t instead, and make the printing code handle unknown values better by printing "Unknown" and the int value instead of not printing any compression. - Print compressed contents as hex dump, not as string. - Add compression type "DotNet", which is used (at least) by csc.exe, the C# compiler. Also add a lengthy comment describing the stream contents (derived from looking at the raw hex contents long enough to see the GUIDs, which led me to the roslyn and mono implementations for handling this). - The native injected source dumper was dumping the contents of the whole data stream -- but csc.exe writes a stream that's padded with zero bytes to the next 512 boundary, and the dia api doesn't display those padding bytes. So make NativeInjectedSource::getCode() do the same thing. Differential Revision: https://reviews.llvm.org/D64879 llvm-svn: 366386	2019-07-17 22:59:52 +00:00
Stanislav Mekhanoshin	7872d76a16	[AMDGPU] Simplify AMDGPUInstPrinter::printRegOperand() Differential Revision: https://reviews.llvm.org/D64892 llvm-svn: 366385	2019-07-17 22:58:43 +00:00
Craig Topper	61fff7a337	[X86] Make sure we mark 128/256 MLOAD as Legal with VLX when min-legal-vector-width=256 is in effect. This started triggering an assertion after r364718 when we made these Custom under AVX2. llvm-svn: 366382	2019-07-17 22:26:00 +00:00
Peter Collingbourne	3b82b92c6b	hwasan: Initialize the pass only once. This will let us instrument globals during initialization. This required making the new PM pass a module pass, which should still provide access to analyses via the ModuleAnalysisManager. Differential Revision: https://reviews.llvm.org/D64843 llvm-svn: 366379	2019-07-17 21:45:19 +00:00
Stanislav Mekhanoshin	9c7f4264d3	[AMDGPU] Stop special casing flat_scratch for register name Differential Revision: https://reviews.llvm.org/D64885 llvm-svn: 366376	2019-07-17 21:35:11 +00:00
Evgeniy Stepanov	f45fd429b7	Speculative fix for stack-tagging.ll failure. Depending on the evaluation order of function call arguments, the current code may insert a use before def. llvm-svn: 366375	2019-07-17 21:27:44 +00:00
Hideto Ueno	4a09a73fb0	[Attributor][NFC] Remove unnecessary debug output llvm-svn: 366373	2019-07-17 21:11:02 +00:00
Nilanjana Basu	6e4076699c	Adding inline comments to code view type record directives for better readability llvm-svn: 366372	2019-07-17 21:01:12 +00:00
Francis Visoiu Mistrih	9f2b290add	[PEI] Don't re-allocate a pre-allocated stack protector slot The LocalStackSlotPass pre-allocates a stack protector and makes sure that it comes before the local variables on the stack. We need to make sure that later during PEI we don't re-allocate a new stack protector slot. If that happens, the new stack protector slot will end up being after the local variables that it should be protecting. Therefore, we would have two slots assigned for two different stack protectors, one at the top of the stack, and one at the bottom. Since PEI will overwrite the assigned slot for the stack protector, the load that is used to compare the value of the stack protector will use the slot assigned by PEI, which is wrong. For this, we need to check if the object is pre-allocated, and re-use that pre-allocated slot. Differential Revision: https://reviews.llvm.org/D64757 llvm-svn: 366371	2019-07-17 20:46:19 +00:00
Francis Visoiu Mistrih	90ba54bf67	[CodeGen][NFC] Simplify checks for stack protector index checking Use `hasStackProtectorIndex()` instead of `getStackProtectorIndex() >= 0`. llvm-svn: 366369	2019-07-17 20:46:09 +00:00
Matt Arsenault	0966dd0d69	GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367	2019-07-17 20:22:44 +00:00
Matt Arsenault	914a59cad8	GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366	2019-07-17 20:22:38 +00:00
Evgeniy Stepanov	851339fb29	Basic MTE stack tagging instrumentation. Summary: Use MTE intrinsics to tag stack variables in functions with sanitize_memtag attribute. Reviewers: pcc, vitalybuka, hctim, ostannard Subscribers: srhines, mgorny, javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64173 llvm-svn: 366361	2019-07-17 19:24:12 +00:00
Evgeniy Stepanov	d752f5e953	Basic codegen for MTE stack tagging. Implement IR intrinsics for stack tagging. Generated code is very unoptimized for now. Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are used to implement a tagged stack frame pointer in a virtual register. Differential Revision: https://reviews.llvm.org/D64172 llvm-svn: 366360	2019-07-17 19:24:02 +00:00
Momchil Velikov	0e2b74a2b0	Revert [AArch64] Add support for Transactional Memory Extension (TME) This reverts r366322 (git commit `4b8da3a503`) llvm-svn: 366355	2019-07-17 17:43:32 +00:00
Daniil Fukalov	d912a9ba9b	[AMDGPU] Tune inlining parameters for AMDGPU target Summary: Since the target gains no significant advantage from vectorization, the vector instructions threshold bonus should be optional. The amdgpu-inline-arg-alloca-cost parameter default value and the target InliningThresholdMultiplier value are tuned accordingly. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64642 llvm-svn: 366348	2019-07-17 16:51:29 +00:00
Lang Hames	1716454027	[ORC] Add deprecation warnings to ORCv1 layers and utilities. Summary: ORCv1 is deprecated. The current aim is to remove it before the LLVM 10.0 release. This patch adds deprecation attributes to the ORCv1 layers and utilities to warn clients of the change. Reviewers: dblaikie, sgraenitz, AlexDenisov Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64609 llvm-svn: 366344	2019-07-17 16:40:52 +00:00
Matt Arsenault	06eed42213	AMDGPU: Use getTargetConstant Avoids creating an extra intermediate mov. llvm-svn: 366340	2019-07-17 15:35:36 +00:00
Hideto Ueno	11d3710c1c	[Attributor] Deduce "willreturn" function attribute Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335	2019-07-17 15:15:43 +00:00
Alex Bradbury	ab009a602e	[AsmPrinter] Make the encoding of call sites in .gcc_except_table configurable and use for RISC-V The original behavior was to always emit the offsets to each call site in the call site table as uleb128 values, however on some architectures (e.g. RISCV) these uleb128 offsets into the code cannot always be resolved until link time (because relaxation will invalidate any calculated offsets), and there are no appropriate relocations for uleb128 values. As a consequence it needs to be possible to specify an alternative. This also switches RISCV to use DW_EH_PE_udata4 for call site encodings in .gcc_except_table Differential Revision: https://reviews.llvm.org/D63415 Patch by Edward Jones. llvm-svn: 366329	2019-07-17 14:00:35 +00:00
Alex Bradbury	b94c233d06	[RISCV] Set correct encodings for DWARF exception handling This patch sets correct encodings for DWARF exception handling for RISC-V (other than call site encoding, which must be udata4 rather than uleb128 and is handled by D63415). This has the same intent as D63409, except this version matches GCC/binutils behaviour which uses the same encodings regardless of PIC/non-PIC and medlow/medany code model. llvm-svn: 366327	2019-07-17 13:54:38 +00:00
Jay Foad	70235c642e	[AMDGPU] Optimize atomic AND/OR/XOR Summary: Extend the atomic optimizer to handle AND, OR and XOR. Reviewers: arsenm, sheredom Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64809 llvm-svn: 366323	2019-07-17 13:40:03 +00:00
Momchil Velikov	4b8da3a503	[AArch64] Add support for Transactional Memory Extension (TME) TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Patch by Javed Absar and Momchil Velikov Differential Revision: https://reviews.llvm.org/D64416 llvm-svn: 366322	2019-07-17 13:23:27 +00:00
Justin Hibbits	0257c6b659	PowerPC: Fix register spilling for SPE registers Summary: Missed in the original commit, use the correct callee-saved register list for spilling, instead of the standard SVR432 list. This avoids needlessly spilling the SPE non-volatile registers when they're not used. As part of this, also add where missing, and sort, the spill opcode checks for SPE and SPE4 register classes. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D56703 llvm-svn: 366319	2019-07-17 12:30:48 +00:00
Justin Hibbits	5214956eaa	PowerPC/SPE: Fix load/store handling for SPE Summary: Pointed out in a comment for D49754, register spilling will currently spill SPE registers at almost any offset. However, the instructions `evstdd` and `evldd` require a) 8-byte alignment, and b) a limit of 256 (unsigned) bytes from the base register, as the offset must fit into a 5-bit offset, which ranges from 0-31 (indexed in double-words). The update to the register spill test is taken partially from the test case shown in D49754. Additionally, pointed out by Kei Thomsen, globals will currently use evldd/evstdd, though the offset isn't known at compile time, so may exceed the 8-bit (unsigned) offset permitted. This fixes that as well, by forcing it to always use evlddx/evstddx when accessing globals. Part of the patch contributed by Kei Thomsen. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54409 llvm-svn: 366318	2019-07-17 12:30:04 +00:00
Petar Avramovic	1e62635d05	[MIPS GlobalISel] ClampScalar and select pointer G_ICMP Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is same as for integers, it is enough to declare them as legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317	2019-07-17 12:08:01 +00:00
Nicolai Haehnle	8b7041a5c6	AMDGPU/GFX10: Apply the VMEM-to-scalar-write hazard also to writes to EXEC Summary: Change-Id: I854fbf7d48e937bef9f8f3f5d0c8aeb970652630 Reviewers: rampitec, mareko Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64807 Change-Id: I4405b3a7f84186acea5a78d291bff71056e745fc llvm-svn: 366314	2019-07-17 11:22:57 +00:00
Nicolai Haehnle	a256b8b7d7	AMDGPU: Improve alias analysis for GDS Summary: GDS cannot alias anything else. Original patch by: Marek Olšák Reviewers: arsenm, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64114 Change-Id: I07bfbd96f5d5c37a6dfba7997df12f291dd794b0 llvm-svn: 366313	2019-07-17 11:22:19 +00:00
Diana Picus	37e403d18c	[ARM GlobalISel] Cleanup CallLowering. NFC Migrate CallLowering::lowerReturnVal to use the same infrastructure as lowerCall/FormalArguments and remove the now obsolete code path from splitToValueTypes. Forgot to push this earlier. llvm-svn: 366308	2019-07-17 10:01:27 +00:00
Simon Atanasyan	4c1e440892	[mips] Use mult/mflo pattern on 64-bit targets prior to MIPS64 The `MUL` instruction is available starting from the MIPS32/MIPS64 targets. llvm-svn: 366301	2019-07-17 08:11:40 +00:00
Simon Atanasyan	a884afb6f8	[mips] Implement .cplocal directive This directive forces the use of an alternate register for the context pointer. For example, this code: .cplocal $4 jal foo expands to: ld $25, %call16(foo)($4) jalr $25 Differential Revision: https://reviews.llvm.org/D64743 llvm-svn: 366300	2019-07-17 08:11:31 +00:00
Simon Atanasyan	7f308af5ee	[mips] Support the "o" inline asm constraint Like other LLVM targets, we do not handle "offsettable" memory addresses in any special way. In other words, the "o" constraint is an exact equivalent of the "m" one. But some existing code requires the "o" constraint support. This fixes PR42589. Differential Revision: https://reviews.llvm.org/D64792 llvm-svn: 366299	2019-07-17 08:11:15 +00:00
Stanislav Mekhanoshin	e5012ab308	[AMDGPU] Autogenerate register asm names Differential Revision: https://reviews.llvm.org/D64839 llvm-svn: 366283	2019-07-16 23:44:21 +00:00
Matt Arsenault	1c3f4ec7fc	GlobalISel: Add overload of handleAssignments with CCState AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState. The ArgLocs argument is only really necessary because CCState doesn't allow access to it. llvm-svn: 366279	2019-07-16 22:41:34 +00:00
Guanzhong Chen	0a8d4df799	[WebAssembly] Compile all TLS on Emscripten as local-exec Summary: Currently, on Emscripten, dynamic linking is not supported with threads. This means that if thread-local storage is used, it must be used in a statically-linked executable. Hence, local-exec is the only possible model. This diff compiles all TLS variables to use local-exec on Emscripten as a temporary measure until dynamic linking is supported with threads. The goal for this is to allow C++ types with constructors to be thread-local. Currently, when `clang` compiles a `thread_local` variable with a constructor, it generates `__tls_guard` variable: @__tls_guard = internal thread_local global i8 0, align 1 As no TLS model is specified, this is treated as general-dynamic, which we do not support (and cannot support without implementing dynamic linking support with threads in Emscripten). As a result, any C++ constructor in `thread_local` variables would not compile. By compiling all `thread_local` as local-exec, `__tls_guard` will compile and we can support C++ constructors with TLS without implementing dynamic linking with threads. Depends on D64537 Reviewers: tlively, aheejin, sbc100 Reviewed By: aheejin Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64776 llvm-svn: 366275	2019-07-16 22:22:08 +00:00
Guanzhong Chen	42bba4b852	[WebAssembly] Implement thread-local storage (local-exec model) Summary: Thread local variables are placed inside a `.tdata` segment. Their symbols are offsets from the start of the segment. The address of a thread local variable is computed as `__tls_base` + the offset from the start of the segment. `.tdata` segment is a passive segment and `memory.init` is used once per thread to initialize the thread local storage. `__tls_base` is a wasm global. Since each thread has its own wasm instance, it is effectively thread local. Currently, `__tls_base` must be initialized at thread startup, and so cannot be used with dynamic libraries. `__tls_base` is to be initialized with a new linker-synthesized function, `__wasm_init_tls`, which takes as an argument a block of memory to use as the storage for thread locals. It then initializes the block of memory and sets `__tls_base`. As `__wasm_init_tls` will handle the memory initialization, the memory does not have to be zeroed. To help allocate memory for thread-local storage, a new compiler intrinsic is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns the size of the thread-local storage for the current function. The expected usage is to run something like the following upon thread startup: __wasm_init_tls(malloc(__builtin_wasm_tls_size())); Reviewers: tlively, aheejin, kripken, sbc100 Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D64537 llvm-svn: 366272	2019-07-16 22:00:45 +00:00
Sanjay Patel	d746a210e1	[x86] use more phadd for reductions This is part of what is requested by PR42023: https://bugs.llvm.org/show_bug.cgi?id=42023 There's an extension needed for FP add, but exactly how we would specify that using flags is not clear to me, so I left that as a TODO. We're still missing patterns for partial reductions when the input vector is 256-bit or 512-bit, but I think that's a failure of vector narrowing. If we can reduce the widths, then this matching should work on those tests. Differential Revision: https://reviews.llvm.org/D64760 llvm-svn: 366268	2019-07-16 21:30:41 +00:00
David Blaikie	40580d36c4	DWARF: Skip zero column for inline call sites D64033 <https://reviews.llvm.org/D64033> added DW_AT_call_column for inline sites. However, that change wasn't aware of "-gno-column-info". To avoid adding column info when "-gno-column-info" is used, now DW_AT_call_column is only added when we have non-zero column (when "-gno-column-info" is used, column will be zero). Patch by Wenlei He! Differential Revision: https://reviews.llvm.org/D64784 llvm-svn: 366264	2019-07-16 21:15:19 +00:00
Matt Arsenault	f8c8284455	AMDGPU/GlobalISel: Select G_ASHR llvm-svn: 366257	2019-07-16 20:31:25 +00:00
Matt Arsenault	e5b28b98e9	AMDGPU/GlobalISel: Select G_LSHR llvm-svn: 366256	2019-07-16 20:25:43 +00:00
Jinsong Ji	65e34a3143	[PowerPC][HTM] Fix impossible reg-to-reg copy assert with ttest builtin Summary: This is exposed by our internal testing. The reduced testcase will assert with "Impossible reg-to-reg copy" We can't use COPY to do 32-bit to 64-bit conversion. Reviewers: kbarton, hfinkel, nemanjai Reviewed By: hfinkel Subscribers: hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64499 llvm-svn: 366255	2019-07-16 20:24:33 +00:00
Matt Arsenault	1b69fd275d	AMDGPU/GlobalISel: Select G_SHL I think this manages to not break the DAG handling with the divergent predicates because the standalone divergent patterns end up with a higher priority than the pattern on the instruction definition. The 16-bit versions don't work yet. llvm-svn: 366254	2019-07-16 20:15:30 +00:00
Stanislav Mekhanoshin	6e0fa292c2	[AMDGPU] Change register type for v32 vectors When it is AReg_1024 this results in unnecessary copying of 32-element vectors into AGPRs even though they are not intended for an mfma instruction. Differential Revision: https://reviews.llvm.org/D64815 llvm-svn: 366252	2019-07-16 20:06:00 +00:00
Michael Liao	ccf22ef94c	Fix -Wreturn-type warning. NFC. llvm-svn: 366251	2019-07-16 19:59:08 +00:00
Matt Arsenault	2d10407719	AMDGPU/GlobalISel: Fix selection of private stores llvm-svn: 366249	2019-07-16 19:27:44 +00:00
Matt Arsenault	7161fb0be5	AMDGPU/GlobalISel: Select private loads llvm-svn: 366248	2019-07-16 19:22:21 +00:00
Matt Arsenault	dad1f89210	AMDGPU/GlobalISel: Select flat stores llvm-svn: 366246	2019-07-16 18:42:53 +00:00
Matt Arsenault	7eb1902cd5	AMDGPU: Add register classes to flat store patterns For some reason GlobalISelEmitter needs register classes to import these, although it works for the load patterns. llvm-svn: 366242	2019-07-16 18:26:42 +00:00
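
Commit 5214956eaa in the list above describes the addressing constraint on `evstdd`/`evldd`: the offset must be 8-byte aligned and fit a 5-bit field counted in double-words (0-31), and globals always take the indexed `evlddx`/`evstddx` forms. The following is a minimal Python sketch of that rule only, not the LLVM implementation; the helper names `fits_spe_immediate` and `pick_spe_store` are hypothetical and introduced purely for illustration.

```python
# Illustrative model of the rule stated in commit 5214956eaa.
# All names here are hypothetical; none of this is LLVM API.

SPE_DWORD_BYTES = 8   # evldd/evstdd transfer one 64-bit double-word
SPE_OFFSET_BITS = 5   # immediate field width, counted in double-words


def fits_spe_immediate(offset_bytes: int) -> bool:
    """Return True if the byte offset is encodable in evldd/evstdd."""
    # Requirement (a): the offset must be 8-byte aligned (and unsigned).
    if offset_bytes < 0 or offset_bytes % SPE_DWORD_BYTES != 0:
        return False
    # Requirement (b): the double-word index must fit in 5 bits (0-31).
    return offset_bytes // SPE_DWORD_BYTES < (1 << SPE_OFFSET_BITS)


def pick_spe_store(offset_bytes: int, is_global: bool) -> str:
    """Choose the immediate or the indexed store form."""
    # Globals have no compile-time offset, so the indexed form is forced.
    if is_global or not fits_spe_immediate(offset_bytes):
        return "evstddx"
    return "evstdd"
```

Under these assumptions the largest encodable byte offset is 31 * 8 = 248, so an offset of 256 already needs the indexed form.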

124953 Commits