llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	eb16435b5e	Migrate function attribute "no-frame-pointer-elim-non-leaf" to "frame-pointer"="non-leaf" as cleanups after D56351	2019-12-24 16:05:15 -08:00
Fangrui Song	502a77f125	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351	2019-12-24 15:57:33 -08:00
Craig Topper	c06e53119b	[X86] Use 128-bit vector instructions for f32/f64->i64 conversions on 32-bit targets with avx512dq and avx512vl instructions. On 32-bit targets we can't use the scalar instruction so we insert the scalar into a vector and use packed conversions. Previously we used either v4f32->v4i64 or v4f64->v4i64 to avoid some complexity creating target specific ISD opcodes for v4f32->v2i64. But this causes extra vzeroupper instructions and possibly frequency throttling on Intel CPUs. This patch changes this to create a 128-bit vector and uses a target specific ISD opcode if needed.	2019-12-24 11:20:10 -08:00
Craig Topper	a21beccea2	[X86] Add STRICT versions of CVTTP2SI, CVTTP2UI, CMPM, and CMPP. Differential Revision: https://reviews.llvm.org/D71850	2019-12-24 10:07:04 -08:00
Matt Arsenault	df5c2159d0	AMDGPU/GlobalISel: Legalize some 16-bit round instructions	2019-12-24 09:53:01 -05:00
Matt Arsenault	e351256c0d	GlobalISel: Define equivalent node for G_INTRINSIC_TRUNC	2019-12-24 09:53:01 -05:00
Matt Arsenault	9035fa6b54	AMDGPU/GlobalISel: Lower llvm.amdgcn.else	2019-12-24 09:53:01 -05:00
David Blaikie	fccac1ec16	DebugInfo: Correct the form of DW_AT_macro_info in .dwo files (sec_offset, rather than data4)	2019-12-24 01:23:21 -08:00
Georgii Rymar	301cb91428	[llvm-readobj] - Remove an excessive helper for printing dynamic tags. This removes the `getTypeString` from readobj source because it almost duplicates the existing method: `ELFFile<ELFT>::getDynamicTagAsString`. Side effect: now it prints "<unknown:>0xHEXVALUE" instead of "(unknown)" for unknown values. llvm-readelf before this patch printed: ``` 0x0000000012345678 (unknown) 0x8765432187654321 0x000000006abcdef0 (unknown) 0x9988776655443322 0x0000000076543210 (unknown) 0x5555666677778888 ``` and now it prints: ``` 0x0000000012345678 (<unknown:>0x12345678) 0x8765432187654321 0x000000006abcdef0 (<unknown:>0x6abcdef0) 0x9988776655443322 0x0000000076543210 (<unknown:>0x76543210) 0x5555666677778888 ``` GNU readelf prints something different: ``` 0x0000000012345678 (<unknown>: 12345678) 0x8765432187654321 0x000000006abcdef0 (Operating System specific: 6abcdef0) 0x9988776655443322 0x0000000076543210 (Processor Specific: 76543210) 0x5555666677778888 ``` I am not sure we want to follow GNU here. Even if we do, it should probably be a separate patch. The new output looks better and closer to GNU anyway, and the code is a bit simpler. Differential revision: https://reviews.llvm.org/D71835	2019-12-24 11:55:45 +03:00
Sourabh Singh Tomar	0a72515d33	[DebugInfo] Fix v4 macinfo for dwo files. Dwo files must have the DW_AT_macro_info attribute when macro information is emitted. Adjusted the test case for the same.	2019-12-24 12:50:34 +05:30
David Blaikie	199700a5cf	DebugInfo: Support dumping any exprloc as an expression Now that DWARFv5 provides a way to identify DWARF expressions based on form, rather than only by attribute - use it to always provide pretty printing for any exprloc attribute, not only the attributes known to contain expressions.	2019-12-23 19:18:47 -08:00
Igor Kudrin	6f635f9092	[DWARF] Check that all fields of a Unit Header are read. Tests "dwarfdump-rnglists-dwarf64.s" and "dwarfdump-rnglists.s" were malformed because they had missing required DWO ID fields in split compilation unit headers. The patch fixes the tests and checks the reading of a unit header more thoroughly. Differential Revision: https://reviews.llvm.org/D71704	2019-12-24 09:38:20 +07:00
Sanjay Patel	25cf5d97ac	[InstCombine] add test for copysign; NFC	2019-12-23 17:54:31 -05:00
Sanjay Patel	9a77c20954	[InstCombine] add tests for not(select ...); NFC	2019-12-23 17:14:32 -05:00
Ulrich Weigand	0d3f782e41	[FPEnv][X86] More strict int <-> FP conversion fixes Fix several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, including [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840	2019-12-23 21:11:45 +01:00
David Blaikie	e028cee66a	MC: Ensure test only reads from the Inputs directory	2019-12-23 11:08:26 -08:00
Jay Foad	c7c05b0c8a	[AMDGPU] Don't create MachinePointerInfos with an UndefValue pointer Summary: The only useful information the UndefValue conveys is the address space, which MachinePointerInfo can represent directly without referring to an IR value. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71838	2019-12-23 15:58:19 +00:00
Luís Marques	5b1d0dc6bf	[RISCV][NFC] Fix use of missing attribute groups in tests	2019-12-23 15:39:04 +00:00
czhengsz	79b3325be0	[PowerPC] NFC - fix the testcase bug of folding rlwinm	2019-12-23 10:28:22 -05:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Florian Hahn	8d6f59b78a	[Matrix] Use fmuladd for matrix.multiply if allowed. If the matrix.multiply calls have the contract fast math flag, we can use fmuladd. This also adds a command line option to force fmuladd generation. We can retire this option once there is a clang-level option. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70951	2019-12-23 14:49:14 +01:00
Florian Hahn	109e4e3851	[Matrix] Add forward shape propagation and first shape aware lowerings. This patch adds infrastructure for forward shape propagation to LowerMatrixIntrinsics. It also updates the pass to make use of the shape information to break up larger vector operations and to eliminate unnecessary conversion operations between columnwise matrixes and flattened vectors: if shape information is available for an instruction, lower the operation to a set of instructions operating on columns. For example, a store of a matrix is broken down into separate stores for each column. For users that do not have shape information (e.g. because they do not yet support shape information aware lowering), we pack the result columns into a flat vector and update those users. It also adds shape aware lowering for the first non-intrinsic instruction: vector stores. Example: For %c = call <4 x double> @llvm.matrix.transpose(<4 x double> %a, i32 2, i32 2) store <4 x double> %c, <4 x double>* %Ptr We generate the code below without shape propagation. Note %9 which combines the columns of the transposed matrix into a flat vector. %split = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 0, i32 1> %split1 = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 2, i32 3> %1 = extractelement <2 x double> %split, i64 0 %2 = insertelement <2 x double> undef, double %1, i64 0 %3 = extractelement <2 x double> %split1, i64 0 %4 = insertelement <2 x double> %2, double %3, i64 1 %5 = extractelement <2 x double> %split, i64 1 %6 = insertelement <2 x double> undef, double %5, i64 0 %7 = extractelement <2 x double> %split1, i64 1 %8 = insertelement <2 x double> %6, double %7, i64 1 %9 = shufflevector <2 x double> %4, <2 x double> %8, <4 x i32> <i32 0, i32 1, i32 2, i32 3> store <4 x double> %9, <4 x double>* %Ptr With this patch, we propagate the 2x2 shape information from the transpose to the store and we generate the code below. Note that we store the columns directly and do not need an extra shuffle. %9 = bitcast <4 x double>* %Ptr to double* %10 = bitcast double* %9 to <2 x double>* store <2 x double> %4, <2 x double>* %10, align 8 %11 = getelementptr double, double* %9, i32 2 %12 = bitcast double* %11 to <2 x double>* store <2 x double> %8, <2 x double>* %12, align 8 Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70897	2019-12-23 13:51:56 +01:00
Georgii Rymar	f027e1a68d	[yaml2obj] - Allow using an arbitrary value for OSABI. There was no way to set an unsupported or unknown OS ABI. With this patch it is possible to use any numeric value. Differential revision: https://reviews.llvm.org/D71765	2019-12-23 13:29:52 +03:00
Georgii Rymar	1f98577556	[yaml2obj] - Add support for ELFOSABI_LINUX. ELFOSABI_LINUX is an alias for ELFOSABI_GNU. It is not that obvious probably. Differential revision: https://reviews.llvm.org/D71764	2019-12-23 13:25:58 +03:00
Georgii Rymar	2cebc1a717	[yaml2obj] - Add testing for OSABI field. We have no such testing. This makes it impossible to add support for new ELFOSABI_* tags. Differential revision: https://reviews.llvm.org/D71763	2019-12-23 13:18:18 +03:00
Martin Storsjö	5a751e747d	[AArch64] [Windows] Use COFF stubs for calls to extern_weak functions As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in `6bf108d77a`. Differential Revision: https://reviews.llvm.org/D71721	2019-12-23 12:13:49 +02:00
Martin Storsjö	b774aa1011	[ARM] [Windows] Use COFF stubs for calls to extern_weak functions As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Instead check the shouldAssumeDSOLocal method and load the address from a COFF stub. This matches what was done for X86 in `6bf108d77a`. Differential Revision: https://reviews.llvm.org/D71720	2019-12-23 12:13:49 +02:00
Georgii Rymar	cc522bc4e3	[llvm-readobj][test] - Stop using Inputs/trivial.obj.elf-x86-64. This rewrites a few tests to stop using the trivial.obj.elf-x86-64 precompiled object and removes it. Differential revision: https://reviews.llvm.org/D71662	2019-12-23 13:10:26 +03:00
QingShan Zhang	6d5e35e89d	[Power9] Remove the PPCISD::XXREVERSE as it has exactly the same semantics as ISD::BSWAP The custom node PPCISD::XXREVERSE has exactly the same semantics as the generic node ISD::BSWAP. We need to clean it up as we have the combine rules for bswap in the base class, while nothing for xxreverse. Differential Revision: https://reviews.llvm.org/D70657	2019-12-23 07:44:33 +00:00
QingShan Zhang	9d1071eac4	[NFC][Test][PowerPC] Add more tests for 'and mask'	2019-12-23 06:59:14 +00:00
Jim Lin	da0fe5db99	[AVR] Fix codegen for rotate instructions Summary: This patch introduces the ROLBRd and RORBRd pseudo-instructions, which implement the "traditional" rotate operations; instead of the AVR rotate instructions that use the carry bit. The code is not optimized at all. Especially when dealing with loops of rotate instructions, this codegen should be improved some day. Related bug: 41358 <https://bugs.llvm.org/show_bug.cgi?id=41358> //Note//: This is my first submitted patch. Reviewers: dylanmckay, Jim Reviewed By: dylanmckay Subscribers: hiraditya, llvm-commits, dylanmckay, dsprenkels Tags: #llvm Patched by dsprenkels (Daan Sprenkels) Differential Revision: https://reviews.llvm.org/D60365	2019-12-23 11:41:28 +08:00
Kai Luo	9681dc9627	[PowerPC] Exploit `vrl(b\|h\|w\|d)` to perform vector rotation Summary: Currently, we set legalization action of `ISD::ROTL` vectors as `Expand` in `PPCISelLowering`. However, we can exploit `vrl(b\|h\|w\|d)` to lower `ISD::ROTL` directly. Differential Revision: https://reviews.llvm.org/D71324	2019-12-23 03:04:43 +00:00
Shengchen Kan	fb53396c49	[NFC] Remove unnecessary blank and rename align-branch-64-5b.s to align-branch-64-6a.s	2019-12-23 10:22:02 +08:00
czhengsz	7259f04dde	[SCEV] add testcase for get accurate range for addrecexpr with nuw flag	2019-12-22 20:58:19 -05:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Valentin Churavy	fb0ccff6e5	[SelectionDAG] Copy FP flags when visiting a binary instruction. Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6. ``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ``` Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags. This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further. Reviewers: spatel, mcberg2017 Reviewed By: spatel Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71495	2019-12-22 14:29:36 -05:00
Reid Kleckner	b2c1ba5b1f	Revert "[ARM][TypePromotion] Enable by default" This reverts commit `ee7579409b`. It causes crashes during ThinLTO. I suspect the issue is related to races on the global TypeSize variable, which is 80 at the time of the crash.	2019-12-22 11:27:11 -08:00
Craig Topper	a4aa40cebc	[X86] Autogenerate complete checks. NFC	2019-12-22 11:18:37 -08:00
Craig Topper	fa303ea5d3	[X86] Fix typo of intrinsic name in test cases. NFC These said test_f32_olt_s for the type of an overloaded intrinsic. But the parser doesn't use that part of the name and just uses the types of the arguments.	2019-12-22 11:18:32 -08:00
Philip Reames	be051f4312	[Test] Add examples of problematic assembler auto-padding This is in the context of the automatic padding work for the jcc erratum mitigation. These are example cases we need to not pad for correctness. Exact mechanism to suppress is still TBD, but saving the tests which have come up.	2019-12-22 09:01:04 -08:00
Sanjay Patel	9cdcd81d3f	[InstCombine] enhance fold for copysign with known sign arg This is another optimization suggested in PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153	2019-12-22 10:07:01 -05:00
Eric Astor	dc5b614fa9	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677	2019-12-22 09:16:34 -05:00
Sanjay Patel	0b38af89e2	[AArch64] match splat of bitcasted extract subvector to DUPLANE This is another potential regression exposed by D63815. Here we peek through a bitcast to find an extract subvector and scale the splat offset based on that: splat (bitcast (extract X, C)), LaneC --> duplane (bitcast X), LaneC' Differential Revision: https://reviews.llvm.org/D71672	2019-12-22 08:37:03 -05:00
Sanjay Patel	79c7fa31f3	[InstCombine] check alloc size in bitcast of geps fold (PR44321) We missed a constraint in D44833 when folding a bitcast into a GEP with vector/array types. If the alloc sizes specified by the datalayout don't match, this could miscompile as shown in: https://bugs.llvm.org/show_bug.cgi?id=44321 Differential Revision: https://reviews.llvm.org/D71771	2019-12-21 10:31:21 -05:00
Sanjay Patel	19f9f374d9	[SimplifyLibCalls] require fast-math-flags for pow(X, -0.5) transforms As discussed in PR44330: https://bugs.llvm.org/show_bug.cgi?id=44330 ...the transform from pow(X, -0.5) libcall/intrinsic to reciprocal square root can result in small deviations from the expected result due to differences in the pow() implementation and/or the extra rounding step from the division. This patch proposes to allow that difference with either the 'approximate functions' or 'reassociate' FMF: http://llvm.org/docs/LangRef.html#fast-math-flags In practice, this likely means that the code is compiled with all of 'fast' (-ffast-math), but I have preserved the existing specializations for -0.0/-INF that enable generating safe code if those special values are allowed simultaneously with allowing approximation/reassociation. The question about whether a similar restriction is needed for the non-reciprocal case -- pow(X, 0.5) -- is deferred. That transform is allowed without FMF currently, and this patch does not change that behavior. Differential Revision: https://reviews.llvm.org/D71706	2019-12-21 10:00:53 -05:00
Florian Hahn	d269255b95	[AArch64] Respect reserved registers while renaming in LdSt opt. We cannot pick reserved registers as rename registers. Fixes https://bugs.llvm.org/show_bug.cgi?id=44358	2019-12-21 15:10:07 +01:00
Matt Arsenault	f9677c4757	Mips: Make test resistant to future changes This seems to have been relying on extra spills being inserted in these blocks to increase the code size to trigger branch relaxation. This broke when these spills were avoided. Add some asm to pad the size of the blocks to make it not matter.	2019-12-21 04:56:20 -05:00
Matt Arsenault	42a26445f9	AMDGPU/GlobalISel: Fix misuse of div_scale intrinsics Confusingly, the intrinsic operands do not match the instruction/custom node. The order is shuffled, and the 3rd operand is an immediate to select operands. I'm not 100% sure I did this right, but fdiv still doesn't select end to end and it will be easier to tell when it does. This at least avoids an assertion in RegBankSelect and allows hitting the fallback on selection.	2019-12-21 04:55:36 -05:00
Matt Arsenault	dff3f8d742	AMDGPU/GlobalISel: Fix missing scc imp-def on scalar and/or/xor	2019-12-21 04:55:36 -05:00
Michael Trent	b4dfa74a5d	Constrain the macho-stabs test added in `f72d001e09` to run on systems configured with an x86 backend. Summary: This fixes a failure on the Builder clang-cmake-armv7-quick bot. Reviewers: lhames, jhenderson Reviewed By: lhames Subscribers: kristof.beyls, rupprecht, seiya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71792	2019-12-20 17:40:37 -08:00
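
The output change described in commit 301cb91428 (unknown dynamic tags printed as "<unknown:>0x..." instead of "(unknown)") can be sketched roughly as follows. This is only an illustration in Python, not the llvm-readobj implementation: the function names `describe_unknown_old`, `describe_unknown_new`, and `format_dynamic_row` are hypothetical, and the row layout is reduced to the three columns shown in the commit message.

```python
# Hypothetical sketch of the before/after formatting from commit 301cb91428.
# None of these names exist in LLVM; they only mirror the sample output
# quoted in the commit message.

def describe_unknown_old(tag: int) -> str:
    """Old behaviour: every unknown tag is rendered the same way."""
    return "(unknown)"


def describe_unknown_new(tag: int) -> str:
    """New behaviour: the unknown tag's value is included in hex."""
    return f"(<unknown:>0x{tag:x})"


def format_dynamic_row(tag: int, value: int, describe) -> str:
    """Render one row: zero-padded 64-bit tag, description, 64-bit value."""
    return f"0x{tag:016x} {describe(tag)} 0x{value:016x}"


if __name__ == "__main__":
    rows = [
        (0x12345678, 0x8765432187654321),
        (0x6ABCDEF0, 0x9988776655443322),
        (0x76543210, 0x5555666677778888),
    ]
    for tag, value in rows:
        print(format_dynamic_row(tag, value, describe_unknown_old))
    for tag, value in rows:
        print(format_dynamic_row(tag, value, describe_unknown_new))
```

With the new form, two different unknown tags no longer collapse to the same description, which is the side effect the commit message points out.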


67313 Commits