llvm-project

Commit Graph

Author	SHA1	Message	Date
Zoran Jovanovic	3a7654c15d	[mips][microMIPS] Extending size reduction pass with LWP and SWP Author: milena.vujosevic.janicic Reviewers: sdardis The patch extends size reduction pass for MicroMIPS. It introduces reduction of two instructions into one instruction: Two SW instructions are transformed into one SWP instruction. Two LW instructions are transformed into one LWP instruction. Differential Revision: https://reviews.llvm.org/D39115 llvm-svn: 334595	2018-06-13 12:51:37 +00:00
Sanjay Patel	b983ac6fe1	[x86] eliminate even more sign-bit tests with vector select This shortcoming was noted in D47330, and the test diffs show we already had other examples where we failed to fold to a SHRUNKBLEND: /// Dynamic (non-constant condition) vector blend where only the sign bits /// of the condition elements are used. This is used to enforce that the /// condition mask is not valid for generic VSELECT optimizations. This patch implements an idea from D48043 and would obsolete that patch because it catches more cases (notably the AVX1 case that was missed there). All we're doing is allowing the existing transform to fire more often by removing the post-legalize constraint. All of the relevant feature checks and other predicates are left as-is. Differential Revision: https://reviews.llvm.org/D48078 llvm-svn: 334592	2018-06-13 12:28:32 +00:00
Alex Bradbury	96f492d7df	[RISCV] Add codegen support for atomic load/stores with RV32A Fences are inserted according to table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group. Instruction selection failures will now occur for 8/16/32-bit atomicrmw and cmpxchg operations when targeting RV32IA until lowering for these operations is added in a follow-on patch. Differential Revision: https://reviews.llvm.org/D47589 llvm-svn: 334591	2018-06-13 12:04:51 +00:00
Alex Bradbury	dc790dd5d0	[RISCV] Codegen support for atomic operations on RV32I This patch adds lowering for atomic fences and relies on AtomicExpandPass to lower atomic loads/stores, atomic rmw, and cmpxchg to __atomic_* libcalls. test/CodeGen/RISCV/atomic-* are modelled on the exhaustive test/CodeGen/PPC/atomics-regression.ll, and will prove more useful once RV32A codegen support is introduced. Fence mappings are taken from table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group. Differential Revision: https://reviews.llvm.org/D47587 llvm-svn: 334590	2018-06-13 11:58:46 +00:00
Clement Courbet	5eeed77f87	[TableGen] Emit a fatal error on inconsistencies in resource units vs cycles. Summary: For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files. For more obvious cases, I've ventured a fix. Some notes: - Exynos is especially fishy. - AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice. - The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit. Also see PR37310. Reviewers: RKSimon, craig.topper, javed.absar Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D46356 llvm-svn: 334586	2018-06-13 09:41:49 +00:00
Hiroshi Inoue	0f7f59f073	[PowerPC] fix trivial typos in comment, NFC llvm-svn: 334583	2018-06-13 08:54:13 +00:00
Hiroshi Inoue	9bffc94cf0	[PowerPC] avoid verification failure due to PowerPC VSX Swap Removal pass This patch fixes a failure in lnt tests with -verify-machineinstrs option. When VSX Swap Removal pass swaps two register operands, it did not maintain kill flags associated with operands. This patch swaps flags as well as register number to avoid inconsistent kill flags information. llvm-svn: 334579	2018-06-13 08:25:14 +00:00
Craig Topper	3829d258ee	[X86] Remove masking from avx512vbmi2 concat and shift by immediate intrinsics. Use select in IR instead. llvm-svn: 334576	2018-06-13 07:19:21 +00:00
Craig Topper	55488731be	[X86] Mark all instructions that have masked store semantics with NotMemoryFoldable. Remove dependency on SchedRW from memory table autogenerator. Previously we were whitelisting instructions based on their SchedRW value. With the masked store instructions explicitly removed via NotMemoryFoldable, we don't seem to need this check anymore. llvm-svn: 334563	2018-06-13 00:04:08 +00:00
Craig Topper	4f9cac667b	[X86] Remove VPCOMPRESSB/W from the autogenerated load folding table. llvm-svn: 334562	2018-06-13 00:04:04 +00:00
Stanislav Mekhanoshin	8fd3c4e431	[AMDGPU] DAG combine to produce V_PERM_B32 Differential Revision: https://reviews.llvm.org/D48099 llvm-svn: 334559	2018-06-12 23:50:37 +00:00
Krzysztof Parzyszek	82d284c1d2	[DAGCombiner] Recognize more patterns for ABS Differential Revision: https://reviews.llvm.org/D47831 llvm-svn: 334553	2018-06-12 21:51:49 +00:00
Petr Hosek	7250908016	[AArch64] Support reserving x20 register Register x20 is a callee-saved register which may be used for other purposes in certain contexts, for example to hold special variables within the kernel. This change adds support for reserving this register both to frontend and backend to make this register usable for these purposes. Differential Revision: https://reviews.llvm.org/D46552 llvm-svn: 334531	2018-06-12 20:00:50 +00:00
Craig Topper	3a34c3596d	[X86] Remove mayLoad flag from AVX512 truncating store instructions. llvm-svn: 334529	2018-06-12 19:59:08 +00:00
Reid Kleckner	98117a47e6	[MS][ARM64] Hoist __ImageBase handling into TargetLoweringObjectFileCOFF All COFF targets should use @IMGREL32 relocations for symbol differences against __ImageBase. Do the same for getSectionForConstant, so that immediates lowered to globals get merged across TUs. Patch by Chris January Differential Revision: https://reviews.llvm.org/D47783 llvm-svn: 334523	2018-06-12 18:56:05 +00:00
Konstantin Zhuravlyov	ce25bc3e82	AMDHSA/NFC: Code object v3 updates (additional): - Move section selection and alignment to AMDGPUAsmPrinter llvm-svn: 334521	2018-06-12 18:33:51 +00:00
Konstantin Zhuravlyov	00f2cb1116	AMDHSA: Code object v3 updates - Do not emit following assembler directives: - .hsa_code_object_version - .hsa_code_object_isa - .amd_amdgpu_isa - .amd_amdgpu_hsa_metadata - .amd_amdgpu_pal_metadata - Do not emit .note entries - Cleanup and bring in sync kernel descriptor header file - Emit kernel descriptor into .rodata with appropriate relocations and alignments llvm-svn: 334519	2018-06-12 18:02:46 +00:00
Fangrui Song	f72cdb50be	[MC] [X86] Teach leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 to use R_X86_64_GOTPC32 instead of R_X86_64_PC32 Summary: This is similar to D46319 (ARM). x86-64 psABI p40 gives an example: leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32. Reviewers: javed.absar, echristo Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar Differential Revision: https://reviews.llvm.org/D47507 llvm-svn: 334515	2018-06-12 16:20:44 +00:00
Simon Pilgrim	e39fa6cbbb	[CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744) As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources: e.g. v4f32: <0,5,2,7> or <4,1,6,3> This seems far too restrictive as most SIMD hardware will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline: e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc. This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns. Differential Revision: https://reviews.llvm.org/D47985 llvm-svn: 334513	2018-06-12 16:12:29 +00:00
Craig Topper	ede97c9548	[X86] Remove TB_ALIGN_16 from VEXTRACTF128/VEXTRACTI128 in the memory folding table. llvm-svn: 334511	2018-06-12 15:48:03 +00:00
Krzysztof Parzyszek	bea23d065e	[Hexagon] Make floating point operations expensive for vectorization llvm-svn: 334508	2018-06-12 15:12:50 +00:00
Sanjay Patel	c3466d2568	[x86] move shrunkblend transform to helper function; NFCI We should be able to obsolete D48043 by easing the constraints on this existing code. llvm-svn: 334504	2018-06-12 14:21:51 +00:00
Krzysztof Parzyszek	3d671248ab	[SelectionDAG] Provide default expansion for rotates Implement default legalization of rotates: either in terms of the rotation in the opposite direction (if legal), or in terms of shifts and ors. Implement generating of rotate instructions for Hexagon. Hexagon only supports rotates by an immediate value, so implement custom lowering of ROTL/ROTR on Hexagon. If a rotate is not legal, use the default expansion. Differential Revision: https://reviews.llvm.org/D47725 llvm-svn: 334497	2018-06-12 12:49:36 +00:00
Simon Dardis	74fb5e6789	[mips] Guard some floating point instructions correctly Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47636 llvm-svn: 334491	2018-06-12 10:28:06 +00:00
Aleksandar Beserminji	8acdc10220	[mips] Extend LONG_BRANCH_LUi/ADDiu with extra parameter Extend LONG_BRANCH_LUi and LONG_BRANCH_ADDiu pseudo instructions with additional flag, so instead of always lowering to lui %hi(...), addiu %lo(...) or addiu %hi(...), now they can lower to either %lo, %hi, %higher or %highest depending on the added flag. Differential Revision: https://reviews.llvm.org/D47941 llvm-svn: 334490	2018-06-12 10:23:49 +00:00
Luke Geeson	dc82aa44e6	[AArch64] Audit on rL333879 to fix FP16 64bit bitpatterns llvm-svn: 334488	2018-06-12 09:35:20 +00:00
Craig Topper	88c230265b	[X86] Add NotMemoryFoldable to the VPCOMPRESS instructions. llvm-svn: 334481	2018-06-12 07:32:19 +00:00
Craig Topper	5799e4df75	[X86] Add NotMemoryFoldable to more instructions. These include PUSH/POP instructions that don't match the manual table. This also includes CMPXCHG which we never emit in non-locked form. llvm-svn: 334479	2018-06-12 07:32:17 +00:00
Craig Topper	66572df76e	[X86] Add NotMemoryFoldable to a bunch of instructions to suppress them from the autogenerated load folding table. Most of these are system instructions or other instructions we don't use in CodeGen. No point wasting space for them in the table. Removing them from the autogenerated table makes it easier to review the manual table. A few are real opcode collisions where the memory and register forms are completely different instructions. llvm-svn: 334474	2018-06-12 04:34:59 +00:00
Craig Topper	957b738432	[X86] Add isel patterns for folding loads when creating ROUND instructions from ffloor/fnearbyint/fceil/frint/ftrunc. We were missing packed isel folding patterns for all of sse41, avx, and avx512. For some reason avx512 had scalar load folding patterns under optsize(due to partial/undef reg update), but we didn't have the equivalent sse41 and avx patterns. Sometimes we would get load folding due to peephole pass anyway, but we're also missing avx512 instructions from the load folding table. I'll try to fix that in another patch. Some of this was spotted in the review for D47993. This patch adds all the folds to isel, adds a few spot tests, and disables the peephole pass on a few tests to ensure we're testing some of these patterns. llvm-svn: 334460	2018-06-12 00:48:57 +00:00
Mark Searles	987f292c56	[AMDGPU] prevent hitting Assertion `isReg() && "Wrong MachineOperand accessor"' The use iterator, used within findMaskOperands(), can return anything which is not a def. isUse() requires a register, so check isReg() before calling isUse(). Differential Revision: https://reviews.llvm.org/D48047 llvm-svn: 334459	2018-06-12 00:41:26 +00:00
George Burgess IV	c72204d5b5	Simplify; NFC Not shown in the diff: AQ is a `vector<SUnit *>`, and SU is a `SUnit *` llvm-svn: 334451	2018-06-11 22:58:32 +00:00
Konstantin Zhuravlyov	3e5d66ac66	AMDGPU: Add 64-bit relative variant kind Differential Revision: https://reviews.llvm.org/D47601 llvm-svn: 334443	2018-06-11 21:37:57 +00:00
Craig Topper	3efdb7ce19	[X86] Push some variable declarations down into the individual switch cases that need them. NFC All of the cases are already wrapped in curly braces so declaring a variable there isn't an issue. And the variables aren't assigned or used in the larger scope. llvm-svn: 334436	2018-06-11 20:50:58 +00:00
Craig Topper	ceed99baf0	[X86] Reorder some type constraints to force things to be vectors and integer/fp before forcing them to be the same size. This may be needed by another patch that I'm working on. It should have no effect on any of the generated outputs. llvm-svn: 334430	2018-06-11 19:20:15 +00:00
Krzysztof Parzyszek	dd9415d550	[Hexagon] Late predicate producers cannot be used as dot-new sources llvm-svn: 334426	2018-06-11 18:45:52 +00:00
Simon Pilgrim	14ee66ef37	[X86][AVX512] Tag AVX5124FMAPS/AVX5124VNNIW with missing scheduler classes Necessary for D46276 as even though btver2 doesn't use these instructions, it's now flagged as complete, so it complains if ANY instruction isn't tagged. UnsupportedFeatures wouldn't help here as these instructions don't appear to have a feature predicate (like a lot of AVX512). llvm-svn: 334423	2018-06-11 17:28:00 +00:00
Stanislav Mekhanoshin	7ba3fc730c	[AMDGPU] Do not consider indirect access through phi for wave limiter Rationale: if there is indirect access that is usually an issue because load is not ready by the use. However, if use is inside a loop and load is outside that is potentially an issue for a first iteration only. Differential Revision: https://reviews.llvm.org/D47740 llvm-svn: 334420	2018-06-11 16:50:49 +00:00
Aleksandar Beserminji	62cf9d21ab	[mips] Fix spill slot for mips3, n64 abi When a program is compiled for mips3 with the n64 abi, the wrong register class is used for creating an emergency spill slot. This patch ensures the correct register class is chosen. This patch resolves PR35859. Thanks to John Baldwin for reporting the issue! Differential Revision: https://reviews.llvm.org/D47938 llvm-svn: 334419	2018-06-11 16:50:28 +00:00
Dylan McKay	d011869c82	[AVR] Set trackLivenessAfterRegAlloc This sets trackLivenessAfterRegAlloc on AVRRegisterInfo. Most existing targets set this flag. Without it, specific IR inputs cause LLVM to fail with: Assertion failed: (getParent()->getProperties().hasProperty( MachineFunctionProperties::Property::TracksLiveness) && "Liveness information is accurate"), function livein_begin file MachineBasicBlock.cpp, line 1354. With this commit, this no longer happens. Patch by Peter Nimmervoll. llvm-svn: 334409	2018-06-11 14:46:48 +00:00
Clement Courbet	7db69cc08a	[X86] Fix skylake server scheduling info. Summary: This fixes most of the scheduling info for SKX vector operations. I had to split a lot of the YMM/ZMM classes into separate classes for YMM and ZMM. The before/after llvm-exegesis analysis are in the phabricator diff. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47721 llvm-svn: 334407	2018-06-11 14:37:53 +00:00
Clement Courbet	f4f6899cdf	[ExynosM1][Sched] Fix resource usage in scheduling model. This is part of https://reviews.llvm.org/D46356. llvm-svn: 334391	2018-06-11 07:33:08 +00:00
Clement Courbet	c48435bfe5	[X86] Explicitly mark unsupported classes in scheduling models. Summary: In preparation for D47721. HSW and SNB still define unsupported classes as they are used by KNL and generic models respectively. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47763 llvm-svn: 334389	2018-06-11 07:00:08 +00:00
Craig Topper	0e25c8239a	[X86] Remove masking from dbpsadbw intrinsics, use select in IR instead. llvm-svn: 334384	2018-06-11 06:18:22 +00:00
Daniel Cederman	33f67a256b	[Sparc] Add support for 13-bit PIC Summary: When compiling with -fpic, in contrast to -fPIC, use only the immediate field to index into the GOT. This saves space if the GOT is known to be small. The linker will warn if the GOT is too large for this method. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: brad, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D47136 llvm-svn: 334383	2018-06-11 05:50:08 +00:00
Craig Topper	e71ad1f6d0	[X86] Remove and autoupgrade the expandload and compressstore intrinsics. We use the target independent intrinsics now. llvm-svn: 334381	2018-06-11 01:25:22 +00:00
Craig Topper	860562c915	[X86] Miscellaneous fixes to get the load folding table generator to work again. llvm-svn: 334377	2018-06-10 21:48:24 +00:00
Ivan A. Kosarev	847daa11f8	[NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47447 llvm-svn: 334361	2018-06-10 09:27:27 +00:00
Craig Topper	98a79934af	[X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead. llvm-svn: 334358	2018-06-10 06:01:36 +00:00
Gabor Buella	5aa26980c4	[X86] NFC Use member initialization in X86Subtarget The separate initializeEnvironment function was sort of useless since r217071. ARM did this move already with r273556. llvm-svn: 334345	2018-06-09 09:19:40 +00:00
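
The difference between the SK_Alternate and SK_Select mask rules described in e39fa6cbbb can be illustrated with a small sketch. This is not LLVM's implementation: the helper names `lane_source`, `is_select_mask`, and `is_alternate_mask` are hypothetical, undef mask elements are not handled, and the rules are inferred only from the v4f32 examples in the commit message (a lane "stays in place" if it takes element i from the first source or element i + N from the second).

```python
from typing import Optional, Sequence


def lane_source(mask: Sequence[int], lane: int) -> Optional[int]:
    """Return 0 or 1 if the lane keeps its position from that source, else None."""
    num_elts = len(mask)
    if mask[lane] == lane:
        return 0  # element taken in place from the first source
    if mask[lane] == lane + num_elts:
        return 1  # element taken in place from the second source
    return None  # element moved to a different lane


def is_select_mask(mask: Sequence[int]) -> bool:
    """SK_Select-style rule (assumed): every lane stays in place, from either source."""
    return all(lane_source(mask, i) is not None for i in range(len(mask)))


def is_alternate_mask(mask: Sequence[int]) -> bool:
    """SK_Alternate-style rule (assumed): in place, and adjacent lanes switch source."""
    sources = [lane_source(mask, i) for i in range(len(mask))]
    if None in sources:
        return False
    return all(sources[i] != sources[i + 1] for i in range(len(sources) - 1))
```

Under these assumptions, `<0,5,2,7>` and `<4,1,6,3>` satisfy both rules, while `<0,1,6,7>` and `<4,1,2,3>` satisfy only the select rule, which matches the examples given in the commit message.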

47898 Commits