llvm-project

Commit Graph

Author	SHA1	Message	Date
Luis Marques	e38695a025	Patch from Phabricator llvm-svn: 372092	2019-09-17 09:43:08 +00:00
Alex Bradbury	3966b02cc8	[RISCV][NFC] Add nounwind attribute to functions missing it in test/CodeGen/RISCV This is in preparation for emitting CFI directives. llvm-svn: 360897	2019-05-16 13:56:23 +00:00
Alex Bradbury	8a70468a27	[RISCV] Only mark fp as reserved if the function has a dedicated frame pointer This follows similar logic in the ARM and Mips backends, and allows the free use of s0 in functions without a dedicated frame pointer. The changes in callee-saved-gprs.ll most clearly show the effect of this patch. llvm-svn: 356063	2019-03-13 16:33:45 +00:00
Alex Bradbury	192df587d1	[RISCV] Regenerate umulo-128-legalisation-lowering.ll Upstream changes have improved codegen, reducing stack usage. Regenerate the test. llvm-svn: 356044	2019-03-13 12:33:44 +00:00
Alex Bradbury	1cc2d0b9fb	[RISCV] Avoid unnecessary XOR for seteq/setne 0 Differential Revision: https://reviews.llvm.org/D53492 Patch by James Clarke. llvm-svn: 346497	2018-11-09 14:47:36 +00:00
Alex Bradbury	efceb59801	[RISCV] Remove RV64 test lines from umulo-128-legalisation-lowering.ll The generated code is incorrect anyway, and this test adds noise to the upcoming set of patches that flesh out RV64 support. llvm-svn: 343675	2018-10-03 10:59:42 +00:00
Eli Friedman	73e8a784e6	[SelectionDAG] Improve the legalisation lowering of UMULO.
There is no way in the universe that doing a full-width division in software will be faster than doing overflowing multiplication in software in the first place, especially given that this same full-width multiplication needs to be done anyway.
This patch replaces the previous implementation with a direct lowering into an overflowing multiplication algorithm based on half-width operations.
Correctness of the algorithm was verified by exhaustively checking the output of this algorithm for overflowing multiplication of 16-bit integers against an obviously correct widening multiplication. Barring any oversights introduced by porting the algorithm to DAG, confidence in correctness of this algorithm is extremely high.
The following table shows the change in both t = runtime and s = space. The change is expressed as a multiplier of the original, so anything under 1 is “better” and anything above 1 is worse.
+-------+-----------+-----------+-------------+-------------+
| Arch  | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
| X64   | -         | -         | ~0.5        | ~0.64       |
| i686  | ~0.5      | ~0.6666   | ~0.05       | ~0.9        |
| armv7 | -         | ~0.75     | -           | ~1.4        |
+-------+-----------+-----------+-------------+-------------+
Performance numbers have been collected by running overflowing multiplication in a loop under `perf` on two x86_64 (one Intel Haswell, the other AMD Ryzen) based machines. Size numbers have been collected by looking at the size of a function containing an overflowing multiply in a loop.
All in all, it can be seen that both performance and size have improved except in the case of armv7, where code size has regressed for 128-bit multiply. u128*u128 overflowing multiply on 32-bit platforms seems to benefit from this change a lot, taking only 5% of the time compared to the original algorithm to calculate the same thing.
The final benefit of this change is that LLVM is now capable of lowering the overflowing unsigned multiply for integers of any bit-width as long as the target is capable of lowering regular multiplication for the same bit-width. Previously, 128-bit overflowing multiply was the widest possible.
Patch by Simonas Kazlauskas! Differential Revision: https://reviews.llvm.org/D50310 llvm-svn: 339922	2018-08-16 18:39:39 +00:00
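
The UMULO commit (73e8a784e6) describes lowering an overflowing unsigned multiply into half-width operations and checking it exhaustively against a widening multiplication. The Python sketch below illustrates one way such an algorithm can be structured. It is not the code from the patch: the names `umulo` and `check_exhaustively` are hypothetical, and the check uses 8-bit inputs rather than the 16-bit inputs mentioned in the commit, only to keep the run short.

```python
from typing import Tuple


def umulo(a: int, b: int, bits: int) -> Tuple[int, bool]:
    """Return (a * b mod 2**bits, overflowed), built from half-width pieces.

    Illustrative sketch only; `bits` is assumed to be even.
    """
    half = bits // 2
    half_mask = (1 << half) - 1

    a_lo, a_hi = a & half_mask, a >> half
    b_lo, b_hi = b & half_mask, b >> half

    # If both high halves are nonzero, the product is at least 2**bits.
    overflow = a_hi != 0 and b_hi != 0

    # Cross terms: each must itself fit in half the width, otherwise the
    # shifted term already exceeds the full width.
    cross1 = a_hi * b_lo
    cross2 = b_hi * a_lo
    overflow = overflow or cross1 > half_mask or cross2 > half_mask

    # Widening multiply of the low halves (always fits in `bits` bits).
    low = a_lo * b_lo

    # High half of the result: carry out of this sum means overflow.
    high = (low >> half) + (cross1 & half_mask) + (cross2 & half_mask)
    overflow = overflow or high > half_mask

    result = ((high & half_mask) << half) | (low & half_mask)
    return result, overflow


def check_exhaustively(bits: int) -> bool:
    """Compare umulo against a plain widening multiply for all inputs."""
    mask = (1 << bits) - 1
    for a in range(mask + 1):
        for b in range(mask + 1):
            wide = a * b  # Python ints never overflow, so this is the reference
            if umulo(a, b, bits) != (wide & mask, wide > mask):
                return False
    return True
```

Calling `check_exhaustively(8)` compares all 65,536 input pairs against the reference. The same function accepts `bits=16`, but a pure-Python loop over 2^32 pairs is slow.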

7 Commits