Summary:
The Mips target is the only user of mnemonicIsValid. This patch
moves this method from AsmMatcherEmitter.cpp to MipsAsmParser.cpp,
getting rid of the method in all other targets where it generated
warnings about an unused function.
Patch by Gonsolo.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: sdardis
Differential Revision: https://reviews.llvm.org/D28748
llvm-svn: 294400
They are currently modelled incorrectly (as calls, which clobber
registers, confusing e.g. Machine Copy Propagation).
Reverting until we figure out the proper solution.
llvm-svn: 294348
Summary:
This change allows usage of store instruction for implicit null check.
Memory aliasing analysis is not used, and the change conservatively assumes
that any store and any load may access the same memory. As a result,
re-ordering of store-store, store-load and load-store pairs is prohibited.
Patch by Serguei Katkov!
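As a rough conceptual C++ sketch (illustrative only, not code from the patch; Obj and handleNull are hypothetical), the kind of transformation being extended to stores looks like this:
struct Obj { int field; };
void handleNull();                 // hypothetical out-of-line slow path
// Before: an explicit test guards the store.
void set(Obj *p, int v) {
  if (p == nullptr) { handleNull(); return; }
  p->field = v;                    // the store
}
// After the pass (conceptually): the store itself acts as the null check; a
// fault on the null page is routed to the slow path, so the compare and
// branch disappear.
void setImplicit(Obj *p, int v) {
  p->field = v;                    // faulting store doubles as the check
}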
Reviewers: reames, sanjoy
Reviewed By: sanjoy
Subscribers: atrick, llvm-commits
Differential Revision: https://reviews.llvm.org/D29400
llvm-svn: 294338
Adds the vnot extended mnemonic for the vnor instruction.
Committing on behalf of brunoalr (Bruno Rosa).
Differential Revision: https://reviews.llvm.org/D29225
llvm-svn: 294330
vXi8/vXi64 vector shifts are often performed as vYi16/vYi32 shifts, but we weren't always remembering to bitcast the input.
Tested with a new assert as we don't currently manipulate these shifts enough for test cases to catch them.
llvm-svn: 294308
When constructing global address literals while targeting the RWPI
relocation model, LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Besides being more
efficient, this allows -arm-execute-only to work with
-relocation-model=RWPI as well.
When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.
Differential Revision: https://reviews.llvm.org/D29487
Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d
llvm-svn: 294298
This includes unmasked forms of variable shift and shifting by the lower element of a register.
Still need to do shift by immediate which was not foldable prior to avx512 and all the masked forms.
llvm-svn: 294285
For amdgcn target Clang generates addrspacecast to represent null pointers in private and local address spaces.
In LLVM codegen, the static variable initializer is lowered by virtual function AsmPrinter::lowerConstant which is target generic. Since addrspacecast is target specific, AsmPrinter::lowerConstant is unable to handle it.
This patch overrides AsmPrinter::lowerConstant with AMDGPUAsmPrinter::lowerConstant, which is able to lower the target-specific addrspacecast in the null pointer representation so that -1 is correctly emitted.
Differential Revision: https://reviews.llvm.org/D29284
llvm-svn: 294265
AArch64 was asserting when it was asked to provide a register-bank of a size it
couldn't deal with (in this case an s128 IMPLICIT_DEF). But we want a robust
fallback path so this isn't allowed.
llvm-svn: 294248
We don't handle all cases yet (see arm64-fallback.ll for an example), but this
is enough to cover most common C++ code so it's a good place to start.
llvm-svn: 294247
Changes include:
- Updates to the instruction descriptor flags.
- Improvements to the packet shuffler and checker.
- Updates to the handling of certain relocations.
- Better handling of duplex instructions.
llvm-svn: 294226
When splitting up one store into several in splitStoreSplat we have to
make sure we get the MachinePointerInfo right, otherwise alias
analysis thinks they all store to the same location. This can then
cause invalid scheduling later on.
Differential Revision: https://reviews.llvm.org/D29446
llvm-svn: 294203
Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines.
We don't take into account that in some cases we will already have combined all the users while recursively walking up the shuffle tree.
This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all its users are in that list.
Differential Revision: https://reviews.llvm.org/D29399
llvm-svn: 294183
Previously only the superscalar scheduled expansion of the dla macro for
MIPS64 was implemented. If the assembler temporary register is not available
and the optional source register is not the destination register, synthesize
the address using the naive solution of adds and shifts.
This partially resolves PR/30383.
Thanks to Sean Bruno for reporting the issue!
Reviewers: slthakur, seanbruno
Differential Revision: https://reviews.llvm.org/D29328
llvm-svn: 294182
Similar to what we already do for zero elt insertion, we can quickly rematerialize 'allbits' vectors so as to avoid an unnecessary GPR value and insertion into a vector.
llvm-svn: 294162
Summary:
Make this interface reusable similarly to std::call_once and std::once_flag interface.
This makes porting LLDB to NetBSD easier, as in the original approach there was no portable way to specify a non-static once_flag. With this change, translating std::once_flag to llvm::once_flag is mechanical.
Sponsored by <The NetBSD Foundation>
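As a rough usage sketch (not part of the patch), a non-static llvm::once_flag can now be used just like std::once_flag; Cache and populate() are hypothetical:
#include "llvm/Support/Threading.h"
struct Cache {
  llvm::once_flag InitFlag;        // non-static flag, like std::once_flag
  void populate();                 // expensive one-time setup
  void ensure() {
    llvm::call_once(InitFlag, [this] { populate(); });
  }
};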
Reviewers: mehdi_amini, labath, joerg
Reviewed By: mehdi_amini
Subscribers: emaste, clayborg
Differential Revision: https://reviews.llvm.org/D29566
llvm-svn: 294143
Similar was already done for several other shuffles in this function.
The test changes are because the old code used explicit zeroing for elements that could have been undef.
While I was here I also changed other shuffle vectors in the same function to use the same input twice instead of creating UNDEF nodes. getVectorShuffle can create the UNDEF for us.
llvm-svn: 294130
Summary:
This patch tablegen-erates the ARM register bank information so that the
static tables added in D27807 no longer need to be maintained.
Depends on D27338
Reviewers: t.p.northover, rovka, ab, qcolombet, aditya_nandakumar
Reviewed By: rovka
Subscribers: aemerson, rengolin, mgorny, dberris, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D28567
llvm-svn: 294124
It is sufficient to skip emission of these arguments as we have nothing
to actually pass through the function call.
The AVR-GCC reference has nothing to say about zero-sized arguments,
presumably because C/C++ doesn't support them. This means we don't have
to worry about ABI differences.
llvm-svn: 294119
This has quite a positive performance impact according to measurements.
Previously the threshold was too high and blew up compile time and
scratch usage, but fixes to limit the optimization have since landed,
so we can now bump the threshold.
Differential Revision: https://reviews.llvm.org/D29505
llvm-svn: 294032
The .end <symbol> directive for MIPS marks the end of a symbol and sets the
symbol's size. Previously, the corresponding emitDirective handler asserted
that a function's size could be evaluated to an absolute value at that point
in time.
This cannot be done when directives like .align have been encountered.
Instead, set the function's size to the corresponding symbolic expression and
let ELFObjectWriter resolve the expression to an absolute value. This avoids
a redundant call to evaluateAsAbsolute.
llvm-svn: 294012
Summary:
The tail call optimisation is performed before register allocation, so
at that point we don't know if LR is being spilt or not. If LR was spilt
to the stack, then we cannot do a tail call optimisation. That would
involve popping back into LR which is not possible in Thumb1 code.
Reviewers: rengolin, jmolloy, rovka, olista01
Reviewed By: olista01
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D29020
llvm-svn: 294000
Exit loop analysis early if suitable private access found.
Do not account for GEPs which are invariant to loop induction variable.
Do not account for Allocas which are too big to fit into register file anyway.
Add option for tuning: -amdgpu-unroll-threshold-private.
Differential Revision: https://reviews.llvm.org/D29473
llvm-svn: 293991
On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI.
llvm-svn: 293944
On ELF every section can have a corresponding section symbol. When in
an assembly file we have
.quad .text
the '.text' refers to that symbol.
The way we used to handle them is to leave .text an undefined symbol
until the very end when the object writer would map them to the
actual section symbol.
The problem with that is that anything before the end would see an
undefined symbol. This could result in bad diagnostics
(test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results
when using the asm streamer (test/MC/Mips/expansion-jal-sym-pic.s).
Fixing this will also allow using the section symbol earlier for
setting sh_link of SHF_METADATA sections.
This patch includes a few hacks to avoid changing our behaviour when
handling conflicts between section symbols and other symbols. I
reported pr31850 to track that.
llvm-svn: 293936
This is the second in the series of patches to enable adding
of machine sched-models for ARM processors easier and compact.
This patch focuses on integer instructions and adds missing
sched definitions.
Reviewers: rovka, rengolin
Differential Revision: https://reviews.llvm.org/D29127
llvm-svn: 293935
Recommiting after fixing X86 inc/dec chain bug.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner,
as it separates the handling of non-interfering loads/stores from the
store-merging logic.
When merging stores, search up the chain through a single load, and
find all possible stores by looking down from a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly construct
wider loads, but then promote them to float operations, which appear
to require more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit, as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 293893
Merging a load-add-store pattern into an increment op previously dropped
the load's chain from the instruction's dependences if the store is
chained to a TokenFactor.
llvm-svn: 293892
It is important to change the ArgInfo's type from pointer to integer, otherwise
the CC assign function won't know what to do. Instead of hacking it up, we use
ComputeValueVTs and introduce some of the helpers that we will need later on for
lowering more complex types.
llvm-svn: 293889
Allow unknown types in TLI.getValueType, otherwise we get asserts for certain
types that we do not support yet (instead of returning that we don't support
them and falling through the normal error path).
llvm-svn: 293888
This is a first attempt at using the MOVMSK instructions to replace all_of/any_of reduction patterns (i.e. an and/or + shuffle chain).
So far this only matches patterns where we are reducing an all/none bits source vector (i.e. a comparison result) but we should be able to expand on this in conjunction with improvements to 'bool vector' handling both in the x86 backend as well as the vectorizers etc.
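For reference, a rough C++ intrinsics sketch of the reduction shape being matched (illustrative only, not code from the patch):
#include <emmintrin.h>
// 'cmp' is an all-ones/all-zeros per-byte comparison result.
bool anyOf(__m128i cmp) {
  return _mm_movemask_epi8(cmp) != 0;      // PMOVMSKB + TEST, no shuffle chain
}
bool allOf(__m128i cmp) {
  return _mm_movemask_epi8(cmp) == 0xFFFF; // all 16 mask bits set
}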
Differential Revision: https://reviews.llvm.org/D28810
llvm-svn: 293880
This moves creation of SUBV_BROADCAST and merging of adjacent loads that are being inserted together.
This is a step towards removing legalizing of INSERT_SUBVECTOR except for vXi1 cases.
llvm-svn: 293872
The operand types were defined to fit the fp16_to_fp node, which
has the half as an integer type. v_cvt_f32_f16 does support
source modifiers, so change this to have an FP type and modifiers.
For targets without legal f16, this requires recognizing the
bit operations and trying to produce them.
llvm-svn: 293857
Functions matching LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers have to be adjusted for bigger workgroups.
For example a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each 64
workitems are still limited by the same number, but 4 subgroups of 64 workitems
each can afford 4 times more LDS to get the same occupancy.
In addition, the change initializes the LDS size in the subtarget to a real value
for SI+ targets. This is required since LDS size is a variable in these
calculations.
Differential Revision: https://reviews.llvm.org/D29423
llvm-svn: 293837
The GAS assembler supports the ".set bopt" directive but according
to the sources it doesn't do anything. It's supposed to optimize
branches by filling the delay slot of a branch with its target.
This patch teaches the MIPS asm parser to accept both ".set bopt" and
".set nobopt", and to warn in the case of 'bopt' that the directive is unsupported.
This resolves PR/31841.
Thanks to Sean Bruno for reporting the issue!
llvm-svn: 293798
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.
Differential Revision: https://reviews.llvm.org/D29398
llvm-svn: 293788
These were simply preserving the flags of the original operation,
which was too conservative in most cases and incorrect for mul.
nsw/nuw may be needed for some combines to cleanup messes when
intermediate sext_inregs are introduced later.
Tested valid combinations with alive.
llvm-svn: 293776
DebugInfoDWARFTests is the only user so far which initializes the
MCObjectStreamer without initializing the ASMParser. The MIPS backend
relies on the ASMParser to initialize the MipsABIInfo object and to
update the target streamer with it. This should turn the mips buildbots
green.
Reviewers: atanasyan, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D28025
llvm-svn: 293772
The following instructions:
- LD/LWZ (expanded from sjLj pseudo-instructions)
- LXVL/LXVLL vector loads
- STXVL/STXVLL vector stores
all require G8RC_NOX0 class registers for RA.
Differential Revision: https://reviews.llvm.org/D29289
Committed for Lei Huang
llvm-svn: 293769
Add both cores to the target parser and TableGen. Test that eabi
attributes are set correctly for both cores. Additionally, test the
absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively.
Committed on behalf of Sanne Wouda.
Reviewers: rengolin, olista01
Differential Revision: https://reviews.llvm.org/D29073
llvm-svn: 293761
For SSE we use fp because of the smaller encoding, but that doesn't apply to AVX. So just do the natural thing so we don't have to explain why we aren't. We can't do this for 256-bit loads/stores since integer loads and stores aren't available in AVX1 so we need fallback patterns since the integer types are legal.
This doesn't affect any tests because execution domain fixing freely converts the instructions anyway. Honestly, we could probably rely on it for the SSE size optimization too.
llvm-svn: 293743
This feature enables the fusion of such operations on Cortex A57, as
recommended in its Software Optimisation Guide, sections 4.14 and 4.15.
Differential revision: https://reviews.llvm.org/D28698
llvm-svn: 293739
This feature enables the fusion of such operations on Cortex A57, as
recommended in its Software Optimisation Guide, section 4.13, and on Exynos
M1.
Differential revision: https://reviews.llvm.org/D28491
llvm-svn: 293738
This patch moves the class for scheduling adjacent instructions,
MacroFusion, to the target.
In AArch64, it also expands the fusion to all instructions pairs in a
scheduling block, beyond just among the predecessors of the branch at the
end.
Differential revision: https://reviews.llvm.org/D28489
llvm-svn: 293737
x*rsqrt(x) returns NaN for x == 0, whereas 1/rsqrt(x) returns 0, as
desired.
Verified that the particular nvptx approximate instructions here do in
fact return 0 for x = 0.
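A small worked example of the difference (standalone C++, for illustration only):
#include <cmath>
#include <cstdio>
int main() {
  float x = 0.0f;
  float r = 1.0f / std::sqrt(x); // rsqrt(0) == +inf
  std::printf("%f\n", x * r);    // 0 * inf -> nan (the problematic expansion)
  std::printf("%f\n", 1.0f / r); // 1 / inf -> 0   (the desired result)
  return 0;
}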
llvm-svn: 293713
Technically, this actually changes the expression and the original
assert was "wrong", but since the conjunction is with true, it doesn't
matter in this case.
llvm-svn: 293709
@ABS8 can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. R_386_8
or R_X86_64_8) for the symbol.
Differential Revision: https://reviews.llvm.org/D28688
llvm-svn: 293667
Summary:
For some reason instructions are being inserted in the wrong order with some
builds. I'm not sure why this is happening.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D29325
llvm-svn: 293639
The Requires class overrides the target requirements of an instruction,
rather than adding to them, so all ARM instructions need to include the
IsARM predicate when they have overridden requirements.
This caused the swp and swpb instructions to be allowed in thumb mode
assembly, and the ARM encoding of CDP to be selected in codegen (which
is different for conditional instructions).
Differential Revision: https://reviews.llvm.org/D29283
llvm-svn: 293634
Also add the ability to recognise PINSR(Vex, 0, Idx).
Target shuffle combines won't replace multiple insertions with a bit mask until a depth of 3 or more, so we avoid codesize bloat.
The unnecessary vpblendw in clearupper8xi16a will be fixed in an upcoming patch.
llvm-svn: 293627
Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in
the PPC back end.
Committing on behalf of brunoalr (Bruno Rosa).
Differential Revision: https://reviews.llvm.org/D29133
llvm-svn: 293626
Summary:
This lets us lower to sqrt.approx and rsqrt.approx under more
circumstances.
* Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32,
when fast-math is enabled. Previously, we would only emit it for
calls to @llvm.nvvm.sqrt.f. (With this patch we no longer emit
sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on instcombine to
simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.)
* Now we emit the ftz version of rsqrt.approx when ftz is enabled.
Previously, we only emitted rsqrt.approx when ftz was disabled.
Reviewers: hfinkel
Subscribers: llvm-commits, tra, jholewinski
Differential Revision: https://reviews.llvm.org/D28508
llvm-svn: 293605
I think this is safe as long as no inputs are known to ever
be nans.
Also add an intrinsic for fmed3 to be able to handle all safe
math cases.
llvm-svn: 293598
For now just port some of the existing NVPTX tests
and from an old HSAIL optimization pass which
approximately did the same thing.
Don't enable the pass yet until more testing is done.
llvm-svn: 293580
- Move DEBUG_TYPE below includes
- Change unknown address space constant to be consistent with other
passes
- Grammar fixes in debug output
llvm-svn: 293567
combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs.
This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns.
llvm-svn: 293560
Since we have no call support and late linking we can produce code
only for used symbols. This saves compilation time, size of the final
executable, and size of any intermediate dumps.
Run Internalize pass early in the opt pipeline followed by global
DCE pass. To enable it, the RT can pass the -amdgpu-internalize-symbols option.
Differential Revision: https://reviews.llvm.org/D29214
llvm-svn: 293549
This is worse if the original constant is an inline immediate.
This should also be done for 64-bit adds, but requires fixing
operand folding bugs first.
llvm-svn: 293540
Different architectures can have different meanings for flags in the
SHF_MASKPROC mask, so we should always check which architecture is in
use before checking the flag.
NFC for now, but will allow fixing the value of an xmos flag.
llvm-svn: 293484
Previously this test case fired an assertion in getNode because we tried to create an insert_subvector with both input types the same size and the index pointing to half the vector width.
llvm-svn: 293446
Replaces an xor+movd/movq with an xorps which will be shorter in codesize, avoid an int-fpu transfer, allow modern cores to fast path the result during decode and helps other combines recognise an all-zero vector.
The only reason I can think of that we'd want to keep scalar_to_vector in this case is to help recognise the upper elts are undef but this doesn't seem to be a problem.
Differential Revision: https://reviews.llvm.org/D29097
llvm-svn: 293438
Support lowering AEABI TLS access (__aeabi_read_tp) with long calls.
This requires adjusting the call sequence to use an indirect call to get
full addressability.
Resolves PR31769!
llvm-svn: 293433
PACKUSWB converts signed words to unsigned bytes with saturation (the same applies to the doubleword variant), so it can't be used for the umin+truncate pattern.
AVX-512 VPMOVUS* instructions fit the pattern since they convert Unsigned to Unsigned.
See https://llvm.org/bugs/show_bug.cgi?id=31773
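A scalar C++ sketch of why the two operations disagree (illustrative only; the bit pattern 0x8000 shows the difference):
#include <algorithm>
#include <cstdint>
// PACKUSWB-style conversion: signed 16-bit -> unsigned 8-bit with saturation.
uint8_t packus(int16_t w) {
  return (uint8_t)std::min<int32_t>(std::max<int32_t>(w, 0), 255);
}
// The umin+truncate pattern treats the input as unsigned.
uint8_t uminTrunc(uint16_t w) {
  return (uint8_t)std::min<uint32_t>(w, 255u);
}
// packus((int16_t)0x8000)     == 0   (negative word saturates to zero)
// uminTrunc((uint16_t)0x8000) == 255 (32768 > 255, clamps to 255)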
Differential Revision: https://reviews.llvm.org/D29196
llvm-svn: 293431
Summary:
Adds the following instructions:
* mfpmr
* mtpmr
* icblc
* icblq
* icbtls
Fix the scheduling for mtspr on e5500, which uses CFX0, instead of
SFX0/SFX1 as on e500mc.
Addresses PR 31538.
Differential Revision: https://reviews.llvm.org/D29002
llvm-svn: 293417
Support for barrier synchronization between a subset of threads
in a CTA through one of sixteen explicitly specified barriers.
These intrinsics are not directly exposed in CUDA but are
critical for forthcoming support of OpenMP on NVPTX GPUs.
The intrinsics allow the synchronization of an arbitrary
(multiple of 32) number of threads in a CTA at one of 16
distinct barriers. The two intrinsics added are as follows:
call void @llvm.nvvm.barrier.n(i32 10)
waits for all threads in a CTA to arrive at named barrier #10.
call void @llvm.nvvm.barrier(i32 15, i32 992)
waits for 992 threads in a CTA to arrive at barrier #15.
Detailed descriptions of these intrinsics are available in the PTX manual.
http://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions
Reviewers: hfinkel, jlebar
Differential Revision: https://reviews.llvm.org/D17657
llvm-svn: 293384
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html).
For reference:
- Public headers should just declare the dump() method but not use
LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void MyClass::dump() {
// print stuff to dbgs()...
}
#endif
llvm-svn: 293359
Coverage/smoke Gfx7/8 tests were committed in r292922 but then reverted
by r292974 due to AddressSanitizer failure, which is fixed by this patch.
Tests to be re-committed soon.
llvm-svn: 293338
Summary: Small change to get the FREEP instruction to decode properly.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29193
llvm-svn: 293314
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 293310
With the adjustPassManager interface it is now possible to use
custom early module passes.
Differential Revision: https://reviews.llvm.org/D29189
llvm-svn: 293300
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.
Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.
The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.
The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.
This partially resolves PR/23485.
Thanks to Brooks Davis for reporting the issue!
This version corrects a "Conditional jump or move depends on uninitialised
value(s)" error detected by valgrind present in the original commit.
Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
Differential Revision: https://reviews.llvm.org/D23652
llvm-svn: 293279
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks. However, it is
designed to only handle non-vectorized division. ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoke a division routine for each one, preferring
to lower using NEON instructions. Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.
Resolves PR31778!
llvm-svn: 293259
This is an attempt to fix the win7 bot that does not seem to be very
good at inferring the type when it gets used in an initializer list.
llvm-svn: 293246
Summary: Refine floating point SQRT and DIV with accurate latency information.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D29191
llvm-svn: 293204
1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
instructions.
2) Updated the flags on a number of intrinsics indicating that they write
memory.
3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
propagate their memory operand
Review: https://reviews.llvm.org/D28818
llvm-svn: 293200
This change introduces adjustPassManager target callback giving a
target an opportunity to tweak PassManagerBuilder before pass
managers are populated.
This generalizes and replaces addEarlyAsPossiblePasses target
callback. In particular that can be used to add custom passes to
extension points other than EP_EarlyAsPossible.
Differential Revision: https://reviews.llvm.org/D28336
llvm-svn: 293189
Summary:
This patch provides more staging for tail calls in XRay Arm32. When the logging part of XRay is ready for tail calls, its support in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2.
Coupled patch:
- https://reviews.llvm.org/D28674
Reviewers: dberris, rengolin
Reviewed By: dberris
Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris
Differential Revision: https://reviews.llvm.org/D28673
llvm-svn: 293185
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This is cleaner,
as it separates the handling of non-interfering loads/stores from the
store-merging logic.
When merging stores, search up the chain through a single load, and
find all possible stores by looking down from a load and a
TokenFactor to all stores visited.
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly construct
wider loads, but then promote them to float operations, which appear
to require more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyfromReg
nodes as these are captured by data dependence
6. Forward loads-store values through tokenfactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll)
8. Store merging for the ARM target is restricted to 32-bit, as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
llvm-svn: 293184
And teach shouldAssumeDSOLocal that ppc has no copy relocations.
The resulting code handles a few more cases than before. For example, it
knows that a weak symbol can be resolved to another .o file, but it
will still be in the main executable.
llvm-svn: 293180
Pulled out code that removed unused inputs from a target shuffle mask into a helper function to allow it to be reused in a future commit.
llvm-svn: 293175
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.
Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.
The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.
The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.
This partially resolves PR/23485.
Thanks to Brooks Davis for reporting the issue!
Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris
Differential Revision: https://reviews.llvm.org/D23652
llvm-svn: 293164
Add support for loading i1, i8 and i16 arguments from the stack, with or without
the ABI extension flags.
When the ABI extension flags are present, we load a 4-byte value, otherwise we
preserve the size of the load and let the instruction selector replace it with a
LDRB/LDRH. This generates the same thing as DAGISel.
Differential Revision: https://reviews.llvm.org/D27803
llvm-svn: 293163
Refactoring to remove duplications of this method.
New method getOperandsScalarizationOverhead() that looks at the present unique
operands and adds extract costs for them. The old behaviour was to just add extract
costs for one operand of the type always, which still happens in
getArithmeticInstrCost() if no operands are provided by the caller.
This is a good start of improving on this, but there are more places
that can be improved by using getOperandsScalarizationOverhead().
Review: Hal Finkel
https://reviews.llvm.org/D29017
llvm-svn: 293155
This reverts commit r292680. It is causing significantly worse
performance and test timeouts in our internal builds. I have already
routed reproduction instructions your way.
llvm-svn: 293092
Summary:
This patch prepares more for tail call support in XRay. Until the logging part supports tail calls, this is just staging, so it seems the LLVM part is mostly ready with this patch.
Related: https://reviews.llvm.org/D28948 (compiler-rt)
Reviewers: dberris, rengolin
Reviewed By: dberris
Subscribers: llvm-commits, iid_iunknown, aemerson
Differential Revision: https://reviews.llvm.org/D28947
llvm-svn: 293080
Change getReservedRegs() to not mark a register as reserved and then
revert that decision in some cases. Motivated by the discussion in
https://reviews.llvm.org/D29056
llvm-svn: 293073
Summary:
Lifetime extension wasn't triggered on the result of BuildMI because the
reference was non-const. However, instead of adding a const, I've
removed the reference entirely as RVO should kick in anyway.
Reviewers: rovka, bkramer
Reviewed By: bkramer
Subscribers: aemerson, rengolin, dberris, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D29124
llvm-svn: 293059
Add support for:
* i1 add
* i1 function arguments, if passed through registers
* i1 returns, with ABI signext/zeroext
Differential Revision: https://reviews.llvm.org/D27706
llvm-svn: 293035
At the moment, this means supporting the signext/zeroext attribute on the return
type of the function. For function arguments, signext/zeroext should be handled
by the caller, so there's nothing for us to do until we start lowering calls.
Note that this does not include support for other extensions (i8 to i16), those
will be added later.
Differential Revision: https://reviews.llvm.org/D27705
llvm-svn: 293034
Enable the following form (Intel style):
"mov <reg64>, <largeImm>"
which should be available,
where <largeImm> stands for immediates which exceed the range of a signed 32-bit integer
Differential Revision: https://reviews.llvm.org/D28988
llvm-svn: 293030
clang already emits this with -cl-no-signed-zeros, but codegen
doesn't do anything with it. Treat it like the other fast math
attributes, and change one place to use it.
llvm-svn: 293024
Leave early ifcvt disabled for now since there are some
shader-db regressions.
This causes some immediate improvements, but could be better.
The cost checking that the pass does is based on critical path
length for out-of-order CPUs, which is not what we want, so it skips
many cases we do want.
llvm-svn: 293016
This surprisingly isn't NFC because there are patterns to select GPR
sub to SUBSWrr (rather than SUBWrr/rs); SUBS is later optimized to
SUB if NZCV is dead. From ISel's perspective, both are fine.
llvm-svn: 293010
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].
Patch By: Dave Airlie
Reviewers: nhaehnle, arsenm, tstellarAMD
Reviewed By: arsenm
Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25428
llvm-svn: 293000
A sequence like this:
v_cmpx_le_f32_e32 vcc, 0, v0
s_branch BB0_30
s_cbranch_execnz BB0_30
; BB#29:
exp null off, off, off, off done vm
s_endpgm
BB0_30:
; %endif110
is likely wrong. The s_branch instruction will unconditionally jump
to BB0_30 and the skip block (exp done + endpgm) inserted for
performing the kill instruction will never be executed. This results
in a GPU hang with Star Ruler 2.
The s_branch instruction is added during the "Control Flow Optimizer"
pass which seems to re-organize the basic blocks, and we assume
that SI_KILL_TERMINATOR is always the last instruction inside a
basic block. Thus, after inserting a skip block we just go to the
next BB without looking at the subsequent instructions after the
kill, and the s_branch op is never removed.
Instead, we should remove the unconditional out branches and just
skip the two instructions if the exec mask is non-zero.
This patch fixes the GPU hang and doesn't introduce any regressions
with "make check".
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019
Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 292985
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
llvm-svn: 292982
Regalloc creates COPY instructions which do not formally use VALU.
That results in v_mov instructions displaced after exec mask modification.
One pass which does this is SIOptimizeExecMasking, but potentially it can be
done by other passes too.
This patch adds a pass immediately after regalloc to add implicit exec
use operand to all VGPR copy instructions.
Differential Revision: https://reviews.llvm.org/D28874
llvm-svn: 292956
In order to follow the pattern of the existing 'slow-misaligned-128store'
option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'.
llvm-svn: 292954
Summary:
This is in keeping with LLVM convention. The classes are InstPrinters, but the library is ${target}AsmPrinter.
This patch is in response to bryant pointing out to me that Lanai was the only backend deviating from convention here. Thanks!
Reviewers: jpienaar, bryant
Subscribers: mgorny, jgosnell, llvm-commits
Differential Revision: https://reviews.llvm.org/D29043
llvm-svn: 292953
The GeneralShuffle::add() method used to have an assert that made sure that
source elements were at least as big as the destination elements. This was
wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a
smaller source element type than the return type gets extended.
Therefore, instead of asserting this, it is just checked and if this is the
case 'false' is returned from the GeneralShuffle::add() method. This case
should be very rare and is not handled further by the backend.
Review: Ulrich Weigand.
llvm-svn: 292888
When a register like R1 is reserved, X1 should be reserved as well. This
was already done "manually" when 64bit code was enabled, however using
the markSuperRegs() function on the base register is more convenient and
allows to use the checksAllSuperRegsMarked() function even in 32bit mode
to avoid accidental breakage in the future.
This is also necessary to allow https://reviews.llvm.org/D28881
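A hypothetical fragment of a getReservedRegs() implementation using the helpers (names and surrounding code are illustrative, not the exact patch):
BitVector Reserved(getNumRegs());
// Reserving R1 also reserves its 64-bit super-register X1.
markSuperRegs(Reserved, PPC::R1);
// ... mark the remaining reserved registers ...
assert(checkAllSuperRegsMarked(Reserved));
return Reserved;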
Differential Revision: https://reviews.llvm.org/D29056
llvm-svn: 292870
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
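A rough usage sketch after the rename (the surrounding code is hypothetical):
// 'TLI' is a TargetLibraryInfo, 'Callee' the called Function.
LibFunc F;
if (TLI.getLibFunc(Callee->getName(), F) && F == LibFunc_fopen64) {
  // Safe even when libc does "#define fopen64 fopen": the enumerator is
  // LibFunc_fopen64, not the bare, macro-expandable name.
}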
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
When calculating kills, a register may be considered live because a part
of it is live, but if there is a use of that (whole) register, the whole
register (and its subregisters) need to be added to the live set.
llvm-svn: 292845
Since we're now avoiding operations using narrow scalar integer types,
we have to legalize the integer side of the FP conversions.
This requires teaching the legalizer how to do that.
llvm-svn: 292828
Since r279760, we've been marking as legal operations on narrow integer
types that have wider legal equivalents (for instance, G_ADD s8).
Compared to legalizing these operations, this reduced the amount of
extends/truncates required, but was always a weird legalization decision
made at selection time.
So far, we haven't been able to formalize it in a way that permits the
selector generated from SelectionDAG patterns to be sufficient.
Using a wide instruction (say, s64), when a narrower instruction exists
(s32) would introduce register class incompatibilities (when one narrow
generic instruction is selected to the wider variant, but another is
selected to the narrower variant).
It's also impractical to limit which narrow operations are matched for
which instruction, as restricting "narrow selection" to ranges of types
clashes with potentially incompatible instruction predicates.
Concerns were also raised regarding MIPS64's sign-extended register
assumptions, as well as wrapping behavior.
See discussions in https://reviews.llvm.org/D26878.
Instead, legalize the operations.
Should we ever revert to selecting these narrow operations, we should
try to represent this more accurately: for instance, by separating
a "concrete" type on operations, and an "underlying" type on vregs, we
could move the "this narrow-looking op is really legal" decision to the
legalizer, and let the selector use the "underlying" vreg type only,
which would be guaranteed to map to a register class.
In any case, we eventually should mitigate:
- the performance impact by selecting no-op extract/truncates to COPYs
(which we currently do), and the COPYs to register reuses (which we
don't do yet).
- the compile-time impact by optimizing away extract/truncate sequences
in the legalizer.
llvm-svn: 292827
This is a series of patches to enable adding of machine sched
models for ARM processors easier and compact. They define new
sched-readwrites for groups of ARM instructions. This has been
missing so far, and as a consequence, machine scheduler models
for individual sub-targets have tended to be larger than they
needed to be.
The current patch focuses on floating-point instructions.
Reviewers: Diana Picus (rovka), Renato Golin (rengolin)
Differential Revision: https://reviews.llvm.org/D28194
llvm-svn: 292825
Vector immediate load instructions should have the isAsCheapAsAMove, isMoveImm
and isReMaterializable flags set. With them, these instructions will get
hoisted out of loops.
Review: Ulrich Weigand
llvm-svn: 292790
Re-Commit r292543 with a fix for the situation when the chain end is
MBB.end().
This function can be used to accumulate the set of all read and modified
register in a sequence of instructions.
Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove
the concept.
- The AArch64A57LoadBalancing code is using a backwards analysis now
which is irrespective of kill flags. This is the main motivation for
this change.
Differential Revision: http://reviews.llvm.org/D22082
llvm-svn: 292705
Summary:
Specifically, we upgrade llvm.nvvm.:
* brev{32,64}
* clz.{i,ll}
* popc.{i,ll}
* abs.{i,ll}
* {min,max}.{i,ll,u,ull}
* h2f
These either map directly to an existing LLVM target-generic
intrinsic or map to a simple LLVM target-generic idiom.
In all cases, we check that the code we generate is lowered to PTX as we
expect.
These builtins don't need to be backfilled in clang: They're not
accessible to user code from nvcc.
Reviewers: tra
Subscribers: majnemer, cfe-commits, llvm-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D28793
llvm-svn: 292694
Summary:
DAGToDAG has access to TargetLowering, but not vice versa, so this is
the more general location for these functions.
NFC
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D28795
llvm-svn: 292693
Newer ppc supports unaligned memory access, which reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.
This patch fixes pr31492.
Differential Revision: https://reviews.llvm.org/D28630
llvm-svn: 292680
WebAssembly varargs functions use a significantly different ABI than
non-varargs functions, and the current code in
WebAssemblyFixFunctionBitcasts doesn't handle that difference. For now,
just avoid creating wrapper functions in the presence of varargs.
llvm-svn: 292645
Kill flags need to be updated correctly when moving stores up/down to
form store pair instructions.
Those invalid flags have been ignored before but as of r290014 they are
recognized when using -mllvm -verify-machineinstrs.
Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it
to ldst-opt.mir test and adds a new tests for this change.
Differential Revision: https://reviews.llvm.org/D28875
llvm-svn: 292625
This patch fixes debug information for __thread variable on Mips
using .dtprelword and .dtpreldword directives.
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D28770
llvm-svn: 292624
We also want to optimise tests like this: return a*b == 0. The MULS
instruction is flag setting, so we don't need the CMP instruction but can
instead branch on the result of the MULS. The generated instructions sequence
for this example was: MULS, MOVS, MOVS, CMP. The MOVS instructions load the
boolean values resulting from the select instruction, but these MOVS
instructions are flag setting and were thus preventing this optimisation. Now
we first reorder and move the MULS to before the CMP and generate sequence
MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the
MULS and MOVS is safe to do because the subsequent MOVS instructions just set
the CPSR register and don't use it, i.e. the CPSR is dead.
Differential Revision: https://reviews.llvm.org/D27990
llvm-svn: 292608
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).
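An illustrative before/after fragment (not lifted verbatim from the patch):
// Before: the trailing operands are opaque at the call site.
BuildMI(MBB, MI, DL, TII.get(ARM::MOVr), DestReg)
    .addReg(SrcReg).addImm(ARMCC::AL).addReg(0).addReg(0);
// After: the predicate and the optional cc_out operand are named explicitly.
BuildMI(MBB, MI, DL, TII.get(ARM::MOVr), DestReg)
    .addReg(SrcReg).add(predOps(ARMCC::AL)).add(condCodeOp());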
Differential Revision: https://reviews.llvm.org/D27984
llvm-svn: 292587
Inline spiller can decide to move a spill as early as possible in the basic block.
It will skip phis and labels, but we also need to make sure it skips instructions
in the basic block prologue which restore exec mask.
Added isPositionLike callback in TargetInstrInfo to detect instructions which
shall be skipped in addition to common phis, labels etc.
Differential Revision: https://reviews.llvm.org/D27997
llvm-svn: 292554
This function can be used to accumulate the set of all read and modified
register in a sequence of instructions.
Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove
the concept.
- The AArch64A57LoadBalancing code is using a backwards analysis now
which is irrespective of kill flags. This is the main motivation for
this change.
Differential Revision: http://reviews.llvm.org/D22082
llvm-svn: 292543
This instruction is missing from LiveIntervals.
I'm not aware of any problems because of this though.
Differential Revision: https://reviews.llvm.org/D28879
llvm-svn: 292521
Summary:
Emission of the XRay table was accidentally disabled for Arm32, but this bug was not detected because earlier (also by mistake) testing of XRay had been disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future.
This patch is one of a series, see also
- https://reviews.llvm.org/D28623
Reviewers: rengolin, dberris
Reviewed By: dberris
Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown
Differential Revision: https://reviews.llvm.org/D28624
llvm-svn: 292516
As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further
llvm-svn: 292493
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292478
Summary:
Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND.
We already do this for scalar i1 operations so I just extended it to vectors of i1.
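The underlying identities, as a trivial C++ sketch (illustrative only):
// For 1-bit values, modular arithmetic collapses to logic:
//   a + b (mod 2) == a - b (mod 2) == a ^ b   -> ADD/SUB of vXi1 can be KXOR
//   a * b                          == a & b   -> MUL of vXi1 can be KAND
bool add1(bool a, bool b) { return a != b; }  // i.e. a ^ b
bool mul1(bool a, bool b) { return a && b; }  // i.e. a & b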
Reviewers: zvi, delena
Reviewed By: delena
Subscribers: guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D28888
llvm-svn: 292474
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
llvm-svn: 292473
There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.
Differential Revision: https://reviews.llvm.org/D28876
llvm-svn: 292452
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.
PR31589 has the new reproducer.
llvm-svn: 292444
ARM seems to prefer that long literals be formed from their little end in
order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on
Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section
4.14).
Differential revision: https://reviews.llvm.org/D28697
llvm-svn: 292422
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.
With more super-regs we would need to allocate adjacent registers
and constrain regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.
Differential Revision: https://reviews.llvm.org/D28782
llvm-svn: 292413
A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.
PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851
llvm-svn: 292373
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.
This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484
Reviewers: rnk, bkramer, olista01
Reviewed By: olista01
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D28829
llvm-svn: 292372
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292367
Enable an ELFObjectFile to read its ARM build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.
Differential Revision: https://reviews.llvm.org/D28769
llvm-svn: 292366
This patch improves the mul instruction combine function (combineMul)
by adding a new layer of logic.
In this patch, we are adding the ability to fold (mul x, -((1 << c) - 1))
or (mul x, -((1 << c) + 1)) into (neg((x << c) - x)) or (neg((x << c) + x)), respectively.
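A worked example with c == 3 (illustrative arithmetic, not code from the patch):
// -((1 << 3) - 1) == -7 and -((1 << 3) + 1) == -9
int mulNeg7(int x) { return x * -7; } // == -((x << 3) - x) == x - (x << 3)
int mulNeg9(int x) { return x * -9; } // == -((x << 3) + x)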
Differential Revision: https://reviews.llvm.org/D28232
llvm-svn: 292358
This reverts commit r292210, as it broke the Thumb buildbot with:
clang-5.0: error: the clang compiler does not support '-fxray-instrument
on thumbv7-unknown-linux-gnueabihf'.
llvm-svn: 292357
During post-RA pseudo expansion, an 'undef' flag of the source operand should
be propagated by emitGRX32Move().
Review: Ulrich Weigand
llvm-svn: 292353
This patch fixes bugzilla 31576 (https://llvm.org/bugs/show_bug.cgi?id=31576).
"data32" instruction prefix was not defined in the llvm.
An exception had to be added to the X86 tablegen and AsmPrinter because both "data16" and "data32" are encoded to 0x66 (but in different modes).
Differential Revision: https://reviews.llvm.org/D28468
llvm-svn: 292352
Summary:
This change also lets us use max.{s,u}16. There's a vague warning in a
test about this maybe being less efficient, but I could not come up with
a case where the resulting SASS (sm_35 or sm_60) was different with or
without max.{s,u}16. It's true that nvcc seems to emit only
max.{s,u}32, but even ptxas 7.0 seems to have no problem generating
efficient SASS from max.{s,u}16 (the casts up to i32 and back down to
i16 seem to be implicit and nops, happening via register aliasing).
In the absence of evidence, better to have fewer special cases, emit
more straightforward code, etc. In particular, if a new GPU has 16-bit
min/max instructions, we want to be able to use them.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D28732
llvm-svn: 292304
Summary: Previously we lowered it literally, to shifts and xors.
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D28722
llvm-svn: 292303
Summary:
Avoid an unnecessary conversion operation when using the result of
ctpop.i32 or ctpop.i16 as an i32, as in both cases the ptx instruction
we run returns an i32.
(Previously if we used the value as an i32, we'd do an unnecessary
zext+trunc.)
Reviewers: tra
Subscribers: jholewinski, llvm-commits
Differential Revision: https://reviews.llvm.org/D28721
llvm-svn: 292302
Summary:
* Disable "ctlz speculation", which inserts a branch on every ctlz(x) which
has defined behavior on x == 0 to check whether x is, in fact zero.
* Add DAG patterns that avoid re-truncating or re-expanding the result
of the 16- and 64-bit ctz instructions.
Reviewers: tra
Subscribers: llvm-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D28719
llvm-svn: 292299