llvm-project


Author	SHA1	Message	Date
Jessica Paquette	809d708b8a	[MachineOutliner] NFC: Split up getOutliningBenefit This is some more cleanup in preparation for some actual functional changes. This splits getOutliningBenefit into two cost functions: getOutliningCallOverhead and getOutliningFrameOverhead. These functions return the number of instructions that would be required to call a specific function and the number of instructions that would be required to construct a frame for a specific function. The actual outlining benefit logic is moved into the outliner, which calls these functions. The goal of refactoring getOutliningBenefit is to: - Get us closer to getting rid of the IsTailCall flag - Further split up "target-specific" things and "general algorithm" things llvm-svn: 309356	2017-07-28 03:21:58 +00:00
Matthias Braun	c618a466f1	ARMFrameLowering: Only set ExtraCSSpill for actually unused registers. The code assumed that unclobbered/unspilled callee saved registers are unused in the function. This is not true for callee saved registers that are also used to pass parameters such as swiftself. rdar://33401922 llvm-svn: 309350	2017-07-28 01:36:32 +00:00
Reid Kleckner	07a5d4372e	[X86] Fix latent bug in sibcall eligibility logic The X86 tail call eligibility logic was correct when it was written, but the addition of inalloca and argument copy elision broke its assumptions. It was assuming that fixed stack objects were immutable. Currently, we aim to emit a tail call if no arguments have to be re-arranged in memory. This code would trace the outgoing argument values back to check if they are loads from an incoming stack object. If the stack argument is immutable, then we won't need to store it back to the stack when we tail call. Fortunately, stack objects track their mutability, so we can just make the obvious check to fix the bug. This was http://crbug.com/749826 llvm-svn: 309343	2017-07-28 00:58:35 +00:00
Adrian Prantl	8f4b353ee1	Remove unused function from AArch64 backend (NFC) llvm-svn: 309336	2017-07-27 23:52:06 +00:00
Ahmed Bougacha	c890993726	[X86] Don't lie about legality to TLI's demanded bits. Like r309323, X86 had a typo where it passed the wrong flags to TLO. Found by inspection; I haven't been able to tickle this into having observable behavior. I don't think it does, given that X86 doesn't have custom demanded bits logic, and the generic logic doesn't have a lot of exposure to illegal constructs. llvm-svn: 309325	2017-07-27 21:28:59 +00:00
Ahmed Bougacha	52cecb1f27	[AArch64] Remove outdated comment. NFC. There hasn't been a ternary since r231987. llvm-svn: 309324	2017-07-27 21:27:58 +00:00
Ahmed Bougacha	87807c5a86	[AArch64] Fix legality info passed to demanded bits for TBI opt. The (seldom-used) TBI-aware optimization had a typo lying dormant since it was first introduced, in r252573: when asking for demanded bits, it told TLI that it was running after legalize, where the opposite was true. This is an important piece of information, that the demanded bits analysis uses to make assumptions about the node. r301019 added such an assumption, which was broken by the TBI combine. Instead, pass the correct flags to TLO. llvm-svn: 309323	2017-07-27 21:27:25 +00:00
Florian Hahn	e3583bdf91	[ARM] Add use-misched feature, to enable the MachineScheduler. Summary: This change makes it easier to experiment with the MachineScheduler in the ARM backend and also makes it very explicit which CPUs use the MachineScheduler (currently only swift and cyclone). Reviewers: MatzeB, t.p.northover, javed.absar Reviewed By: MatzeB Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35935 llvm-svn: 309316	2017-07-27 19:56:44 +00:00
Dinar Temirbulatov	aead31a36f	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35839 llvm-svn: 309298	2017-07-27 17:47:01 +00:00
Florian Hahn	67ddd1d08f	[TargetParser] Use enum classes for various ARM kind enums. Summary: Using c++11 enum classes ensures that only valid enum values are used for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the need for checks that the provided values map to a proper enum value, allows us to get rid of AK_LAST and prevents comparing values from different enums. It also removes a bunch of static_cast from unsigned to enum values and vice versa, at the cost of introducing static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind. FPUKind and ArchExtKind are the only remaining old-style enum in TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style enum, but FPUKind can be converted too, but this patch is quite big, so could do this in a follow-up patch. I could also split this patch up a bit, if people would prefer that. Reviewers: rengolin, javed.absar, chandlerc, rovka Reviewed By: rovka Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35882 llvm-svn: 309287	2017-07-27 16:27:56 +00:00
Florian Hahn	db479524dd	[ARM] Mark labels in skipAlignedDPRCS2Spills as fallthrough (NFC). The comment at the top of the switch statement indicates that the fall-through behavior is intentional. Using LLVM_FALLTHROUGH silences the -Wimplicit-fallthrough warnings, which are enabled by default in GCC 7. llvm-svn: 309272	2017-07-27 14:37:17 +00:00
Andrew V. Tischenko	e255526d0b	Added cost of ZEROALL and ZEROUPPER instrs in btver2 cpu. Differential Revision https://reviews.llvm.org/D35834 llvm-svn: 309269	2017-07-27 13:12:08 +00:00
Daniel Sanders	8e82af2be6	Re-commit: r309094 [globalisel][tablegen] Fuse the generated tables together. Summary: Now that we have control flow in place, fuse the per-rule tables into a single table. This is a compile-time saving at this point. However, this will also enable the optimization of a table so that similar instructions can be tested together, reducing the time spent on matching the code. This is NFC in terms of externally visible behaviour but some internals have changed slightly. State.MIs is no longer reset between each rule that is attempted because it's not necessary to do so. As a consequence of this the restriction on the order that instructions are added to State.MIs has been relaxed to only affect recorded instructions that require new elements to be added to the vector. GIM_RecordInsn can now write to any element from 1 to State.MIs.size() instead of just State.MIs.size(). The compile-time regressions from the last commit were caused by the ARM target including a non-const variable (zero_reg) in the table and therefore generating an initializer for it. That variable is now const. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35681 llvm-svn: 309264	2017-07-27 11:03:45 +00:00
Simon Pilgrim	afc1ac2735	[X86] Tidyup MaskedLoad/Store mask creation. NFCI. Assign all concat elements to zero and then just replace the first element, instead of setting them all to null and copying everything in. llvm-svn: 309261	2017-07-27 10:29:04 +00:00
Hiroshi Inoue	967dc58ac1	[PowerPC] enable optimizeCompareInstr for branch with static branch hint In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible. If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen. This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations. Differential Revision: https://reviews.llvm.org/D35801 llvm-svn: 309255	2017-07-27 08:14:48 +00:00
Eugene Zelenko	569932d1e6	[Hexagon] Fix expensive checks build bot broken in r309230. llvm-svn: 309236	2017-07-26 23:56:29 +00:00
Eugene Zelenko	efd3d5887b	[Hexagon] Partially revert r309230 which caused some build bots failures. llvm-svn: 309233	2017-07-26 23:45:28 +00:00
Eugene Zelenko	e4fc6ee790	[Hexagon] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 309230	2017-07-26 23:20:35 +00:00
Stanislav Mekhanoshin	3197eb6981	[AMDGPU] Optimize SI_IF lowering for simple if regions Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64. The xor is used to extract only the changed bits. In case of a simple if region where the only use of that value is in the SI_END_CF to restore the old exec mask, we can omit the xor and perform an or of the exec mask with the original exec value saved by the s_and_saveexec_b64. Differential Revision: https://reviews.llvm.org/D35861 llvm-svn: 309185	2017-07-26 21:29:15 +00:00
Evandro Menezes	b3ed4bcb8f	[ARM] Minor cosmetic edits (NFC) Change the order of a case and the description for Exynos Mx processors. llvm-svn: 309184	2017-07-26 21:28:20 +00:00
Evandro Menezes	d192a8ae7d	[AArch64] Adjust the cost model for Exynos M1 and M2 Add the information for the scalar reciprocal square root approximation. llvm-svn: 309183	2017-07-26 21:28:15 +00:00
Wei Ding	a126a13bb3	AMDGPU : Widen extending scalar loads to 32-bits. Differential Revision: http://reviews.llvm.org/D35146 llvm-svn: 309178	2017-07-26 21:07:28 +00:00
Matt Arsenault	894e53d6ac	AMDGPU: Fix using SMRD instructions for argument loads in functions These are not actually uniform values except in kernels. llvm-svn: 309172	2017-07-26 20:39:42 +00:00
Tom Stellard	55038cd1d3	AMDGPU/GlobalISel: Mark 32-bit G_OR as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35127 llvm-svn: 309165	2017-07-26 20:00:53 +00:00
Peter Collingbourne	081ffe2ff2	Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159	2017-07-26 19:15:29 +00:00
Rafael Espindola	e06e4df8be	Simplify. NFC. llvm-svn: 309141	2017-07-26 17:27:27 +00:00
Florian Hahn	239e4b9301	[Hexagon] Mark raise_relocation_error as NORETURN. Summary: This silences a couple of implicit fallthrough warnings with GCC 7.1 in this file. Reviewers: colinl, kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35889 llvm-svn: 309129	2017-07-26 16:07:51 +00:00
Stefan Pintilie	df0ee9e1b9	[NFC] test commit. Added a comment to explain how to add a PPCISD node. llvm-svn: 309114	2017-07-26 13:44:59 +00:00
Zvi Rackover	092f199188	DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offset Summary: Adding support for combining power2-strided build_vector's where the first build_vector's operand is extracted from a non-zero index. Example: v4i32 build_vector((extract_elt V, 1), (extract_elt V, 3), (extract_elt V, 5), (extract_elt V, 7)) --> v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64) Reviewers: delena, RKSimon, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35700 llvm-svn: 309108	2017-07-26 12:57:03 +00:00
Martin Storsjo	0b7bf7a2e3	[COFF, ARM64] Fix symbol offsets in ADRP/ADD/LDR/STR relocations In COFF, a symbol offset can't be stored in the relocation (as is done in ELF or MachO), but is stored as the immediate in the instruction itself. The immediate in the ADRP thus is the symbol offset in bytes, not in pages. For the PAGEOFFSET_12A/L relocations, ignore any offset outside of the lowest 12 bits; they won't have any effect on the ADD/LDR/STR instruction itself but only on the associated ADRP. This is similar to how the same issue is handled for MOVW/MOVT instructions in ELF (see e.g. SVN r307713, and r307728 in lld). This fixes "fixup out of range" errors while building larger object files, where temporary symbols end up as a plain section symbol and an offset, and fixes any cases where the symbol offset means that the actual target ended up on a different page than the symbol itself. Differential Revision: https://reviews.llvm.org/D35791 llvm-svn: 309105	2017-07-26 11:19:17 +00:00
Diana Picus	a5d6518e93	[ARM] GlobalISel: Map G_GLOBAL_VALUE to GPR A G_GLOBAL_VALUE is basically a pointer, so it should live in the GPR. llvm-svn: 309101	2017-07-26 11:01:13 +00:00
Diana Picus	b1fd784936	[ARM] GlobalISel: Mark G_GLOBAL_VALUE as legal llvm-svn: 309090	2017-07-26 09:25:15 +00:00
Michael Zuckerman	c1918ad571	[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess. This patch expands the support of lowerInterleavedStore to 32x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4 VF=32) and we plan to include more patterns in the future. To reach our goal of "more patterns", we include two mask creators. The first function creates a shuffle mask equivalent to unpacklo/unpackhi instructions. The other creator creates a mask equivalent to a concat of two half vectors (high/low). The goal of the patch is to optimize the following sequence: At the end of the computation, we have ymm2, ymm0, ymm12 and ymm3, each holding 32 chars: c0, c1, ..., c31 m0, m1, ..., m31 y0, y1, ..., y31 k0, k1, ..., k31 And these need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Reviewers: dorit Farhana RKSimon guyblank DavidKreitzer Differential Revision: https://reviews.llvm.org/D34601 llvm-svn: 309086	2017-07-26 08:10:14 +00:00
Zvi Rackover	1b73682243	TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI. Changing mask argument type from const SmallVectorImpl<int>& to ArrayRef<int>. This came up in D35700 where a mask is received as an ArrayRef<int> and we want to pass it to TargetLowering::isShuffleMaskLegal(). Also saves a few lines of code. llvm-svn: 309085	2017-07-26 08:06:58 +00:00
Michael Zuckerman	60bc7e0f0a	[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess part1. Splitting patch D34601 into two parts. This part changes the location of two functions. The second part will be based on that patch. This was requested by @RKSimon. Reviewers: 1. dorit 2. Farhana 3. RKSimon 4. guyblank 5. DavidKreitzer llvm-svn: 309084	2017-07-26 07:45:02 +00:00
Craig Topper	050c9c8f83	[X86] Prevent selecting masked aligned load instructions if the load should be non-temporal Summary: The aligned load predicates don't suppress themselves if the load is non-temporal the way the unaligned predicates do. For the most part this isn't a problem because the aligned predicates are mostly used for instructions that only load, and the non-temporal loads have priority over those. The exception is masked loads. Reviewers: RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35712 llvm-svn: 309079	2017-07-26 04:31:04 +00:00
Eugene Zelenko	96d933da4f	[AArch64] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 309062	2017-07-25 23:51:02 +00:00
Eric Christopher	97ae58686f	Update the comments on default subtargets based on feedback. llvm-svn: 309041	2017-07-25 22:21:08 +00:00
Marek Olsak	6096f542d1	AMDGPU/SI: Fix Depth and Height computation for SI scheduler Patch by: Axel Davy Differential Revision: https://reviews.llvm.org/D34967 llvm-svn: 309028	2017-07-25 20:37:03 +00:00
Marek Olsak	e6f74384b1	AMDGPU/SI: Force exports at the end for SI scheduler Patch by: Axel Davy Differential Revision: https://reviews.llvm.org/D34965 llvm-svn: 309027	2017-07-25 20:36:58 +00:00
Eric Christopher	adfe5368ee	Revert "This patch enables the usage of constant Enum identifiers within Microsoft style inline assembly statements." This reverts commit r308966. llvm-svn: 309005	2017-07-25 19:22:09 +00:00
Nemanja Ivanovic	009016bb70	[PowerPC] Pretty-print CR bits the way the binutils disassembler does This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001	2017-07-25 18:26:35 +00:00
Nemanja Ivanovic	864c953773	[PowerPC] - Recommit r304907 now that the issue has been fixed This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995	2017-07-25 17:54:51 +00:00
Simon Pilgrim	18b97f78fe	[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914) D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining which resulted in regressions for some optimized libc memcmp implementations (PR33914). Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os). This patch should be considered for the 5.0.0 release branch as well Differential Revision: https://reviews.llvm.org/D35830 llvm-svn: 308986	2017-07-25 17:04:37 +00:00
Fedor Sergeev	7856a3205f	[Sparc] invalid adjustments in TLS_LE/TLS_LDO relocations removed Summary: Some SPARC TLS relocations were applying nontrivial adjustments to zero value, leading to unexpected non-zero values in ELF and then Solaris linker failures. Getting rid of these adjustments. Fixes PR33825. Reviewers: rafael, asb, jyknight Subscribers: joerg, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D35567 llvm-svn: 308978	2017-07-25 15:28:28 +00:00
Andrew V. Tischenko	32e9b1ad0b	X86 Asm uses assertions instead of proper diagnostic. This patch fixes that. Differential Revision: https://reviews.llvm.org/D35115 llvm-svn: 308972	2017-07-25 13:05:12 +00:00
Matan Haroush	2f21017be2	This patch enables the usage of constant Enum identifiers within Microsoft style inline assembly statements. Differential Revision: https://reviews.llvm.org/D33277 https://reviews.llvm.org/D33278 llvm-svn: 308966	2017-07-25 10:44:09 +00:00
Sam Parker	19a08e42a8	[ARM] Enable partial and runtime unrolling Enable runtime and partial loop unrolling of simple loops without calls on M-class cores. The thresholds are calculated based on whether the target is Thumb or Thumb-2. Differential Revision: https://reviews.llvm.org/D34619 llvm-svn: 308956	2017-07-25 08:51:30 +00:00
Martin Storsjo	8cb3667541	[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs Create a dummy 8 byte fixed object for the unused slot below the first stored vararg. Alternative ideas tested but skipped: One could try to align the whole fixed object to 16, but I haven't found how to add an offset to the stack frame used in LowerWin64_VASTART. If only the size of the fixed stack object is padded but not the offset, via MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false), PrologEpilogInserter crashes due to "Attempted to reset backwards range!". This fixes misconceptions about where registers are spilled, since AArch64FrameLowering.cpp assumes the offset from fixed objects is aligned to 16 bytes (and the Win64 case there already manually aligns the offset to 16 bytes). This fixes cases where local stack allocations could overwrite callee saved registers on the stack. Differential Revision: https://reviews.llvm.org/D35720 llvm-svn: 308950	2017-07-25 05:20:01 +00:00
Reid Kleckner	c990b5d916	Revert "[X86][InlineAsm][Ms Compatibility]Prefer variable name over a register when the two collides" This reverts r308867 and r308866. It broke the sanitizer-windows buildbot on C++ code similar to the following: namespace cl { } void f() { __asm { mov al, cl } } t.cpp(4,13): error: unexpected namespace name 'cl': expected expression mov al, cl ^ In this case, MSVC parses 'cl' as a register, not a namespace. llvm-svn: 308926	2017-07-24 20:48:15 +00:00


43349 Commits