Commit Graph

11 Commits

Author SHA1 Message Date
Matt Arsenault 3b5d4aa258 GlobalISel: Infer nofpexcept flag during selection for non-strict ops
Match SelectionDAG's behavior of adding nofpexcept to out instructions
that may raise fp exceptions that are selected from instructions that
do not.
2020-06-05 13:59:46 -04:00
Matt Arsenault 4b4496312e AMDGPU: Start adding MODE register uses to instructions
This is the groundwork required to implement strictfp. For now, this
should be NFC for regular instructoins (many instructions just gain an
extra use of a reserved register). Regalloc won't rematerialize
instructions with reads of physical registers, but we were suffering
from that anyway with the exec reads.

Should add it for all the related FP uses (possibly with some
extras). I did not add it to either the gpr index mode instructions
(or every single VALU instruction) since it's a ridiculous feature
already modeled as an arbitrary side effect.

Also work towards marking instructions with FP exceptions. This
doesn't actually set the bit yet since this would start to change
codegen. It seems nofpexcept is currently not implied from the regular
IR FP operations. Add it to some MIR tests where I think it might
matter.
2020-05-27 14:47:00 -04:00
Matt Arsenault 89c8c80bd5 AMDGPU: Change pre-gfx9 implementation of fcanonicalize to mul
If f32 denormals were enabled pre-gfx9, we would still try to
implement this with v_max_f32. Pre-gfx9, these instructions ignored
the denormal mode and did not flush. Switch to the multiply form for
f32 as a workaround which should always work in any case.

This fixes conformance failures when the library implementation of
fmin/fmax were accidentally not inlined, forcing the assumption of no
flushing on targets where denormals are not enabled by default. This
is a workaround, since really we should not be mixing code with
different FP mode expectations, but prefer the lowering that will work
in any mode.

Now this will always use max to implement canonicalize on gfx9+. This
is only really beneficial for f64. For f32/f16 it's a neutral choice
(and worse in terms of code size in 1 case), but possibly worse for
the compiler since it does add an extra register use operand. Leave
this change for later.
2020-04-23 15:24:13 -04:00
Matt Arsenault 5fe3f06596 AMDGPU/GlobalISel: Add new baseline checks for canonicalize 2020-04-23 15:04:32 -04:00
Matt Arsenault dfce5fd50a AMDGPU/GlobalISel: Select VOP3P instructions
This only handles the basic cases. More work is needed to make better
use of op_sel.
2020-02-21 13:35:40 -05:00
Matt Arsenault 1024b73ef5 AMDGPU: Split denormal mode tracking bits
Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.

This is just a mechanical change, and mostly still assumes the input
and output mode match. This should be refined for some cases. For
example, fcanonicalize lowering should use the flushing variant if
either input or output flushing is enabled
2020-02-04 10:44:21 -08:00
Matt Arsenault db0ed3e429 AMDGPU: Refactor treatment of denormal mode
Start moving towards treating this as a property of the calling
convention, and not the subtarget. The default denormal mode should
not be part of the subtarget, and be moved into a separate function
attribute.

This patch is still NFC. The denormal mode remains as a subtarget
feature for now, but make the necessary changes to switch to using an
attribute.
2019-11-19 19:55:43 +05:30
Matt Arsenault ea23b6428b AMDGPU: Be explicit about denormal mode in MIR tests
Start checking the machine function in GlobalISel instead of the
target directly.

This temporarily breaks fcanonicalize selection in GlobalISel.
2019-11-19 19:55:43 +05:30
Matt Arsenault e1895aba3d AMDGPU/GlobalISel: Select G_FABS/G_FNEG
f64 doesn't work yet because tablegen currently doesn't handlde
REG_SEQUENCE.

This does regress some multi use VALU fneg cases since now the
immediate remains in an SGPR, and more moves are used for legalizing
the xor. This is a SIFixSGPRCopies deficiency.

llvm-svn: 371540
2019-09-10 17:19:46 +00:00
Matt Arsenault 4f64ade04c AMDGPU/GlobalISel: Select src modifiers
llvm-svn: 364782
2019-07-01 15:18:56 +00:00
Matt Arsenault fbf67d88de GlobalISel: Add DAG compat for G_FCANONICALIZE
llvm-svn: 364758
2019-07-01 13:22:00 +00:00