llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	301162c4fe	AMDGPU: Replace i64 add/sub lowering. Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases. This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds. llvm-svn: 318340	2017-11-15 21:51:43 +00:00
Matt Arsenault	aafff87dda	AMDGPU: Do not fold clamp instructions when sources are different. Patch by hakzsam (Samuel Pitoiset). llvm-svn: 314951	2017-10-05 00:13:17 +00:00
Matt Arsenault	6b114d2c50	AMDGPU: Select clamp pattern with v2f16 llvm-svn: 312087	2017-08-30 01:20:17 +00:00
Stanislav Mekhanoshin	79da2a7698	[AMDGPU] Remove getBidirectionalReasonRank. This method inverts the Reason field of a scheduling candidate. It compares RegCritical and RegExcess correctly, but everything else is broken. In fact it can prefer a weaker reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove the artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536	2017-03-11 00:29:27 +00:00
Matt Arsenault	79a45db7f5	AMDGPU: Use clamp with f64 llvm-svn: 295908	2017-02-22 23:53:37 +00:00
Matt Arsenault	d5c6515b68	AMDGPU: Fold FP clamp as modifier bit. The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. llvm-svn: 295905	2017-02-22 23:27:53 +00:00
Matt Arsenault	2fdf2a1a18	AMDGPU: Redefine clamp node as clamp 0.0-1.0. Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788	2017-02-21 23:35:48 +00:00
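
For readers unfamiliar with the clamp node that commit 2fdf2a1a18 redefines, the following is a minimal host-side sketch of a clamp to the range 0.0-1.0 built from max and min. It is not the backend's lowering code: the function name clamp01 is hypothetical, and the NaN result noted in the comments comes from the C library's fmax semantics, not from any statement in the commit message.

```cpp
#include <cmath>

// Hypothetical helper, for illustration only: clamp a value to [0.0, 1.0]
// using max followed by min, the shape the commit message describes
// ("use max instead of add").
inline float clamp01(float X) {
  // std::fmax returns the non-NaN operand when exactly one operand is NaN,
  // so a NaN input becomes 0.0 here. Treating that as the intended NaN
  // behaviour of the hardware clamp is an assumption of this sketch.
  float Lower = std::fmax(X, 0.0f);
  return std::fmin(Lower, 1.0f);
}
```

Values already inside the range pass through unchanged, so under this sketch clamp01(0.25f) stays 0.25f while clamp01(2.0f) and clamp01(-0.5f) saturate to 1.0f and 0.0f.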

7 Commits
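
The ranking problem described in commit 79da2a7698 can be reproduced with a small sketch. Everything below is a simplified stand-in and not the scheduler's actual code: the enum is a trimmed, hypothetical version of CandReason in which smaller values mean stronger reasons, and the assumption that the candidate with the larger rank was preferred is inferred from the message's remark that Weak > -RegCritical.

```cpp
// Hypothetical, trimmed-down stand-in for the CandReason enum.
// Assumption: the enum is sorted so that smaller values are stronger reasons.
enum CandReason { NoCand = 0, Only1, RegExcess, RegCritical, Stall, Cluster, Weak };

// Sketch of the removed ranking: negate the two register-pressure reasons
// and leave every other reason as its plain enum value.
constexpr int bidirectionalRank(CandReason R) {
  return (R == RegExcess || R == RegCritical) ? -static_cast<int>(R)
                                              : static_cast<int>(R);
}

// Assumption: under the old scheme the candidate with the larger rank won.
constexpr bool oldPrefersA(CandReason A, CandReason B) {
  return bidirectionalRank(A) > bidirectionalRank(B);
}

// Without the artificial ranking: compare the sorted enum directly,
// so the stronger (smaller) reason wins.
constexpr bool newPrefersA(CandReason A, CandReason B) { return A < B; }
```

Under these assumptions, oldPrefersA(RegExcess, RegCritical) holds, which matches the one comparison the message says was right, but oldPrefersA(Weak, RegCritical) also holds because a positive rank always beats a negated one. Comparing the enum values directly avoids that inversion.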