Matt Arsenault
687ec75d10
DAG: Change behavior of fminnum/fmaxnum nodes
...
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.
There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select, which should be fixed in a future commit.
llvm-svn: 344914
2018-10-22 16:27:27 +00:00
Alexander Timofeev
db7ee7660a
[AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed
...
Differential revision: https://reviews.llvm.org/D51734
Reviewers: rampitec
llvm-svn: 341928
2018-09-11 11:56:50 +00:00
Matt Arsenault
6c7ba82900
AMDGPU: Address todo for handling 1/(2 pi)
...
llvm-svn: 339814
2018-08-15 21:03:55 +00:00
Matt Arsenault
de496c32a4
AMDGPU: Reduce code size with fcanonicalize (fneg x)
...
When fcanonicalize is lowered to a mul, we can
use -1.0 for free and avoid the cost of the bigger
encoding for source modifiers.
llvm-svn: 338244
2018-07-30 12:16:58 +00:00
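Note on the entry above (de496c32a4): the rewrite relies on negating the input and multiplying by 1.0 giving the same value as multiplying the input by -1.0. The sketch below checks only that arithmetic identity on ordinary doubles. It does not model the AMDGPU instruction encodings, and the function names are hypothetical, chosen for illustration. NaN inputs are left out on purpose.

```python
import struct


def bits(value):
    """Return the raw IEEE-754 double encoding so that 0.0 and -0.0 differ."""
    return struct.pack("<d", value)


def mul_with_negated_source(x):
    # Models "canonicalize(fneg x)" as a multiply by 1.0 with a negated input.
    return (-x) * 1.0


def mul_with_negative_one(x):
    # Models the same operation as a multiply of the plain input by -1.0.
    return x * -1.0


# Hypothetical sample inputs: zeros, normals, infinities, smallest subnormal.
samples = [0.0, -0.0, 1.5, -2.25, float("inf"), float("-inf"), 5e-324]

all_match = all(
    bits(mul_with_negated_source(x)) == bits(mul_with_negative_one(x))
    for x in samples
)
print(all_match)
```

Comparing raw bytes rather than using == matters here because 0.0 == -0.0 is true in Python, which would hide a sign difference.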
Matt Arsenault
f3c9a34def
AMDGPU: Make fneg combine handle fcanonicalize
...
llvm-svn: 338243
2018-07-30 12:16:47 +00:00
Matt Arsenault
70b9282015
AMDGPU: Fix -enable-var-scope violations
...
llvm-svn: 318004
2017-11-12 23:53:44 +00:00
Matt Arsenault
6c29c5acfe
AMDGPU: Allow SIShrinkInstructions to work in non-SSA
...
Immediates can be folded as long as the immediate is a vreg.
Also undo commuting instructions if it didn't fold an immediate.
llvm-svn: 307575
2017-07-10 19:53:57 +00:00
Matt Arsenault
bf5482e4bb
AMDGPU: Pull fneg out of extract_vector_elt
...
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.
llvm-svn: 302813
2017-05-11 17:26:25 +00:00
Matt Arsenault
3dbeefa978
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
...
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Matthias Braun
dbcf9e2ee4
LiveRegMatrix: Fix some subreg interference checks
...
Surprisingly, one of the three interference checks in LiveRegMatrix was
using the main live range instead of the appropriate subregister range,
resulting in unnecessarily conservative results.
llvm-svn: 296722
2017-03-02 00:35:08 +00:00
Matt Arsenault
e1b595306d
AMDGPU: Fold fneg into fmin/fmax_legacy
...
llvm-svn: 293972
2017-02-03 00:51:50 +00:00
Matt Arsenault
2511c031de
AMDGPU: Fold fneg into fminnum/fmaxnum
...
llvm-svn: 293968
2017-02-03 00:23:15 +00:00
Matt Arsenault
a8fcfadf46
AMDGPU: Check if users of fneg can fold mods
...
In multi-use cases this can save a few instructions.
llvm-svn: 293962
2017-02-02 23:21:23 +00:00
Matt Arsenault
53f0cc238c
AMDGPU: Fold fneg into round instructions
...
llvm-svn: 293127
2017-01-26 01:25:36 +00:00
Matt Arsenault
74a576e7d3
AMDGPU: Check nsz instead of unsafe math
...
llvm-svn: 293028
2017-01-25 06:27:02 +00:00
Matt Arsenault
8a27aee6ae
DAGCombiner: Allow negating ConstantFP after legalize
...
llvm-svn: 293019
2017-01-25 04:54:34 +00:00
Matt Arsenault
3e6f9b5773
AMDGPU: Disable some fneg combines unless nsz
...
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
llvm-svn: 292473
2017-01-19 06:35:27 +00:00
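Note on the entry above (3e6f9b5773): the signed-zero problem it describes can be reproduced with ordinary doubles. This is a minimal sketch under default round-to-nearest arithmetic, not the DAG combine itself, and the helper names are hypothetical. It shows that distributing the negation over an add changes the sign of a zero result when x == -y, while the same rewrite on a multiply does not.

```python
import math


def neg_of_sum(x, y):
    # Original expression: -(x + y)
    return -(x + y)


def sum_of_negs(x, y):
    # Rewritten expression: (-x) + (-y)
    return (-x) + (-y)


def neg_of_product(x, y):
    # Original expression: -(x * y)
    return -(x * y)


def product_with_neg(x, y):
    # Rewritten expression: (-x) * y
    return (-x) * y


def is_negative_zero(value):
    # copysign exposes the sign bit, which == cannot see on zeros.
    return value == 0.0 and math.copysign(1.0, value) < 0.0


x, y = 1.0, -1.0  # x == -y, the case called out in the commit message
before = neg_of_sum(x, y)   # -(+0.0)
after = sum_of_negs(x, y)   # (-1.0) + (1.0)

print(is_negative_zero(before), is_negative_zero(after))

# For the multiply, the sign of the result is the XOR of the operand signs,
# so moving the negation onto an operand keeps the sign of a zero result.
mul_before = neg_of_product(0.0, 5.0)
mul_after = product_with_neg(0.0, 5.0)
print(is_negative_zero(mul_before), is_negative_zero(mul_after))
```

The two add variants compare equal with ==, so the difference only shows up to code that inspects the sign bit. That is why the combine is acceptable once nsz (no signed zeros) is set.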
Matt Arsenault
31c039ef2e
AMDGPU: Fold free fneg into sin
...
llvm-svn: 291790
2017-01-12 18:48:09 +00:00
Matt Arsenault
a8c325e2f5
AMDGPU: Fold fneg into fmul_legacy
...
llvm-svn: 291784
2017-01-12 18:26:30 +00:00
Matt Arsenault
ff7e5aadf5
AMDGPU: Fold fneg into rcp
...
llvm-svn: 291779
2017-01-12 17:46:35 +00:00
Matt Arsenault
4242d48c36
AMDGPU: Fold fneg into fp_round
...
llvm-svn: 291778
2017-01-12 17:46:33 +00:00
Matt Arsenault
98d2bf1024
AMDGPU: Fold fneg into fp_extend
...
llvm-svn: 291777
2017-01-12 17:46:28 +00:00
Matt Arsenault
63f953795e
AMDGPU: Fold fneg into fma or fmad
...
Patch mostly by Fiona Glaser
llvm-svn: 291733
2017-01-12 00:32:16 +00:00
Matt Arsenault
4103a81d6d
AMDGPU: Fold fneg into fmul
...
Patch mostly by Fiona Glaser
llvm-svn: 291732
2017-01-12 00:23:20 +00:00
Matt Arsenault
2529fba989
AMDGPU: Fold fneg into fadd
...
Patch mostly by Fiona Glaser
llvm-svn: 291731
2017-01-12 00:09:34 +00:00