llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	5a69a0011b	[X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) llvm-svn: 328076	2018-03-21 06:28:42 +00:00
Craig Topper	137a4dd84d	[X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads. llvm-svn: 328071	2018-03-21 03:41:33 +00:00
Craig Topper	d25f1acf67	[X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis. Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server. Fixes PR36808. llvm-svn: 328061	2018-03-20 23:39:48 +00:00
Derek Schuff	73a98f5eea	[WebAssembly] Update torture compile test expectations The tests compile after r328049 llvm-svn: 328057	2018-03-20 23:00:13 +00:00
Derek Schuff	39b5367cba	[WebAssembly] Strip threadlocal attribute from globals in single thread mode The default thread model for wasm is single, and in this mode thread-local global variables can be lowered identically to non-thread-local variables. Differential Revision: https://reviews.llvm.org/D44703 llvm-svn: 328049	2018-03-20 22:01:32 +00:00
Simon Pilgrim	572bfa562a	[X86] Drop unnecessary InstRW overrides for WriteFMA As noticed on D44687, these already match the WriteFMA def so can be removed. llvm-svn: 328045	2018-03-20 21:15:23 +00:00
Martin Storsjo	07589fc496	[X86] Don't use the MSVC stack protector names on mingw Mingw uses the same stack protector functions as GCC provides on other platforms as well. Patch by Valentin Churavy! Differential Revision: https://reviews.llvm.org/D27296 llvm-svn: 328039	2018-03-20 20:37:51 +00:00
Derek Schuff	e4825975d8	[WebAssembly] Added initial AsmParser implementation. It uses the MC framework and the tablegen matcher to do the heavy lifting. Can handle both explicit and implicit locals (-disable-wasm-explicit-locals). Comes with a small regression test. This is a first basic implementation that can parse most llvm .s output and round-trips most instructions successfully, but in order to keep the commit small, does not address all issues. There are a fair number of mismatches between what MC / assembly matcher think a "CPU" should look like and what WASM provides, some already have workarounds in this commit (e.g. the way it deals with register operands) and some that require further work. Some of that further work may involve changing what the Disassembler outputs (and what s2wasm parses), so are probably best left to followups. Some known things missing: - Many directives are ignored and not emitted. - Vararg calls are parsed but extra args not emitted. - Loop signatures are likely incorrect. - $drop= is not emitted. - Disassembler does not output SIMD types correctly, so assembler can't test them. Patch by Wouter van Oortmerssen Differential Revision: https://reviews.llvm.org/D44329 llvm-svn: 328028	2018-03-20 20:06:35 +00:00
Evandro Menezes	36afbee1d8	[AArch64] Adjust the cost model for Exynos M3 Fix typo in the number of integer dividers. llvm-svn: 328027	2018-03-20 20:00:29 +00:00
Krzysztof Parzyszek	65059ee284	[Hexagon] Add heuristic to exclude critical path cost for scheduling Patch by Brendon Cahoon. llvm-svn: 328022	2018-03-20 19:26:27 +00:00
Krzysztof Parzyszek	9315c0de9b	[Hexagon] Fix fall-through warnings in HexagonMCDuplexInfo.cpp llvm-svn: 328021	2018-03-20 19:23:18 +00:00
Nirav Dave	ce71989188	[MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI. llvm-svn: 328019	2018-03-20 19:12:41 +00:00
Craig Topper	c2dbd677bd	[PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything. If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all. I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it. Differential Revision: https://reviews.llvm.org/D44061 llvm-svn: 328017	2018-03-20 18:49:28 +00:00
Krzysztof Parzyszek	eb0c510ecd	[X86] Add phony registers for high halves of regs with low halves Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters that cover the low halves of these registers. This change adds artificial subregisters for the high halves in order to differentiate (in terms of register units) between the 32- and the low 16-bit registers. This patch contains parts that aim to preserve the calculated register pressure. This is in order to preserve the current codegen (minimize the impact of this patch). The approach of having artificial subregisters could be used to fix PR23423, but the pressure calculation would need to be changed. Differential Revision: https://reviews.llvm.org/D43353 llvm-svn: 328016	2018-03-20 18:46:55 +00:00
Artem Belevich	914d4babec	[NVPTX] Make tensor load/store intrinsics overloaded. This way we can support address-space specific variants without explicitly encoding the space in the name of the intrinsic. Less intrinsics to deal with -> less boilerplate. Added a bit of tablegen magic to match/replace an intrinsics with a pointer argument in particular address space with the space-specific instruction variant. Updated tests to use non-default address spaces. Differential Revision: https://reviews.llvm.org/D43268 llvm-svn: 328006	2018-03-20 17:18:59 +00:00
Krzysztof Parzyszek	4c6b65f685	[Hexagon] Correct the computation of TopReadyCycle and BotReadyCycle of SU TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either the first instruction or the last instruction in a packet. Patch by Ikhlas Ajbar. llvm-svn: 328000	2018-03-20 17:03:27 +00:00
Krzysztof Parzyszek	73be83dec5	[Hexagon] Check weak dependences when only 1 instruction is available Patch by Brendon Cahoon. llvm-svn: 327997	2018-03-20 16:22:06 +00:00
Simon Pilgrim	62690e9d0e	[X86][Haswell][Znver1] Fix typo in fldl instregexs Missing comma was causing 2 instregex entries to be concatenated together by mistake. Found while investigating PR35548 llvm-svn: 327992	2018-03-20 15:44:47 +00:00
Krzysztof Parzyszek	5ffd808a27	[Hexagon] Improve scheduling heuristic for large basic blocks This patch changes the isLatencyBound heuristic to look at the path length based upon the number of packets needed to schedule a basic block. For small basic blocks, the heuristic uses a small threshold for isLatencyBound. For large basic blocks, the heuristic uses a large threshold. The goal is to increase the priority of an instruction in a small basic block that has a large height or depth relative to the code size. For large functions, the height and depth are ignored because it increases the live range of a register and causes more spills. That is, for large functions, it is more important to schedule instructions when available, and attempt to keep the defs and uses closer together. Patch by Brendon Cahoon. llvm-svn: 327987	2018-03-20 14:54:01 +00:00
Geoff Berry	0b64402adb	[AArch64][Falkor] Correct load/store increment scheduling details llvm-svn: 327982	2018-03-20 13:46:35 +00:00
Krzysztof Parzyszek	2c4231d888	[Hexagon] Fix division by zero in machine scheduler llvm-svn: 327980	2018-03-20 13:28:46 +00:00
Alex Bradbury	80c8eb7696	[RISCV] Add codegen for RV32F floating point load/store As part of this, add support for load/store from the constant pool. This is used to materialise f32 constants. llvm-svn: 327979	2018-03-20 13:26:12 +00:00
Alex Bradbury	76c29ee815	[RISCV] Add codegen for RV32F arithmetic and conversion operations Currently, only a soft floating point ABI is supported. llvm-svn: 327976	2018-03-20 12:45:35 +00:00
Krzysztof Parzyszek	dca383123f	[Hexagon] Improve scheduling based on register pressure Patch by Brendon Cahoon. llvm-svn: 327975	2018-03-20 12:28:43 +00:00
Simon Pilgrim	4a83f802cc	[X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do. llvm-svn: 327974	2018-03-20 12:26:55 +00:00
Martin Storsjo	802b434156	[X86] Properly implement the calling convention for f80 for mingw/x86_64 In these cases, both parameters and return values are passed as a pointer to a stack allocation. MSVC doesn't use the f80 data type at all, while it is used for long doubles on mingw. Normally, this part of the calling convention is handled within clang, but for intrinsics that are lowered to libcalls, it may need to be handled within llvm as well. Differential Revision: https://reviews.llvm.org/D44592 llvm-svn: 327957	2018-03-20 06:19:38 +00:00
Craig Topper	ad7c685791	[X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version. Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8 llvm-svn: 327948	2018-03-20 05:00:20 +00:00
Craig Topper	4778fa7e8a	[X86] Fix the SchedRW for memory forms of CMP and TEST. They were incorrectly marked as RMW operations. Some of the CMP instructions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW. TEST used the same tablegen class as some of the CMPs. llvm-svn: 327947	2018-03-20 03:55:17 +00:00
Craig Topper	3e9462607e	[X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models. Move it from a load+store group on SNB to a load only group, the same group as CMP. llvm-svn: 327944	2018-03-20 03:02:03 +00:00
Craig Topper	7c90e29cf8	[X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model. I assume these match the generic immediate version like they do in the other models. llvm-svn: 327943	2018-03-20 03:01:59 +00:00
Shiva Chen	cbd498ac10	[RISCV] Preserve stack space for outgoing arguments when the function contain variable size objects E.g. bar (int x) { char p[x]; push outgoing variables for foo. call foo } We need to generate stack adjustment instructions for outgoing arguments by eliminateCallFramePseudoInstr when the function contains variable size objects to avoid outgoing variables corrupt the variable size object. Default hasReservedCallFrame will return !hasFP(). We don't want to generate extra sp adjustment instructions when hasFP() return true, So We override hasReservedCallFrame as !hasVarSizedObjects(). Differential Revision: https://reviews.llvm.org/D43752 llvm-svn: 327938	2018-03-20 01:39:17 +00:00
Craig Topper	2330d6cd55	[X86] Fix the SNB scheduler for BLENDVB. PBLENDVBrr0 was with the memory version of VBLENDVB and PBLENDVBrm0 was missing. llvm-svn: 327937	2018-03-20 01:30:21 +00:00
Jessica Paquette	563548d8f3	[MachineOutliner] AArch64: Emit CFI instructions when outlining calls When outlining calls, the outliner needs to update CFI to ensure that, say, exception handling works. This commit adds that functionality and adds a test just for call outlining. Call outlining stuff in machine-outliner.mir should be moved into machine-outliner-calls.mir in a later commit. llvm-svn: 327917	2018-03-19 22:48:40 +00:00
Craig Topper	ab6076514d	[X86] Simplify the AVX512 code in LowerTruncate a little. We don't need to create an ISD::TRUNCATE node to return, we started with one and can return it. Also remove the call to getExtendInVec, the result is just going to be a getNode of that value passed in. llvm-svn: 327914	2018-03-19 21:58:02 +00:00
Craig Topper	3b967466d5	[X86] Replace a couple calls to getExtendInVec with getNode and the appropriate target independent EXTEND_VECTOR_INREG opcode. llvm-svn: 327899	2018-03-19 20:20:22 +00:00
Nirav Dave	3264c1bdf6	[DAG, X86] Revert r327197 "Revert r327170, r327171, r327172" Reland ISel cycle checking improvements after simplifying node id invariant traversal and correcting typo. llvm-svn: 327898	2018-03-19 20:19:46 +00:00
Martin Storsjo	9a55c1b0dc	[ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897	2018-03-19 20:06:50 +00:00
Lei Huang	ecfede94a7	[Power9]Legalize and emit code for quad-precision copySign/abs/nabs/neg/sqrt Legalize and emit code for quad-precision floating point operations: * xscpsgnqp * xsabsqp * xsnabsqp * xsnegqp * xssqrtqp Differential Revision: https://reviews.llvm.org/D44530 llvm-svn: 327889	2018-03-19 19:22:52 +00:00
Craig Topper	9770107b5f	[X86] Add JMP16r and JMP32r to Sandybridge scheduler model. Fixes PR36010 llvm-svn: 327883	2018-03-19 19:00:37 +00:00
Craig Topper	5e65996fac	[X86] Remove OUT32rr/OUT8rr/OUT32ri/OUT8ri from Sandybridge scheduler model. PR35590 was already filed for this information being wrong. It's probably better to default to WriteSystem behavior instead of using something completely wrong. llvm-svn: 327882	2018-03-19 19:00:35 +00:00
Craig Topper	b4c7873f8c	[X86] Add JCXZ/JECXZ to Sandybridge/Haswell/Broadwell/Skylake scheduler models. JRCXZ was already present, but not the others. We never codegen this instruction so this doesn't affect much just trying to get them all into a single generated scheduler class in the output. llvm-svn: 327881	2018-03-19 19:00:32 +00:00
Craig Topper	afabf36505	[X86] Correct regular expression in Zen scheduler model that was excluding JECXZ instruction. The regex was looking for JECXZ_32 or JECXZ_64, but there is just one instruction called JECXZ. They used to exist as separate instructions, but were merged over 3 years ago. llvm-svn: 327880	2018-03-19 19:00:29 +00:00
Craig Topper	591f44df54	[X86] Correct the SchedRW on (V)MOVAPSrr_REV and similar to match their non _REV counterparts. llvm-svn: 327879	2018-03-19 19:00:26 +00:00
Lei Huang	6d1596a98c	[PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/sub Legalize and emit code for quad-precision floating point operations: * xsaddqp * xssubqp * xsdivqp * xsmulqp Differential Revision: https://reviews.llvm.org/D44506 llvm-svn: 327878	2018-03-19 18:52:20 +00:00
Nemanja Ivanovic	d9d5bd3067	[PowerPC] Make AddrSpaceCast noop PowerPC targets do not use address spaces. As a result, we can get selection failures with address space casts. This patch makes those casts noops. Patch by Valentin Churavy. Differential revision: https://reviews.llvm.org/D43781 llvm-svn: 327877	2018-03-19 18:50:02 +00:00
Craig Topper	836cfb3a4c	[X86] Add the rest of the TEST with immediate instructions to the scheduler models to match their 8-bit counterpart. llvm-svn: 327874	2018-03-19 17:58:41 +00:00
Craig Topper	645e531a69	[X86] Add MOV16ri/MOV32ri/MOV64ri* to scheduler models to match MOV8ri. Correct SchedRW and itinerary for MOV32ri64. llvm-svn: 327872	2018-03-19 17:46:59 +00:00
Craig Topper	259eaa6e7c	[X86] Remove sse41 specific code from lowering v16i8 multiply With the SRAs removed from the SSE2 code in D44267, then there doesn't appear to be any advantage to the sse41 code. The punpcklbw instruction and pmovsx seem to have the same latency and throughput on most CPUs. And the SSE41 code requires moving the upper 64-bits into the lower 64-bit before the sign extend can be done. The unpckhbw in sse2 code can do better than that. llvm-svn: 327869	2018-03-19 17:31:41 +00:00
Craig Topper	5ccd87233f	[X86] Make the multiply and divide itineraries more consistent. Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage. We also used the MUL8 itinerary for MULX32/64 which was also weird. The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result. llvm-svn: 327866	2018-03-19 16:38:33 +00:00
Zaara Syeda	01f414baaa	Revert [MachineLICM] This reverts commit rL327856 Failing build bots. Revert the commit now. llvm-svn: 327864	2018-03-19 16:19:44 +00:00

46619 Commits