Commit Graph

24861 Commits

Author SHA1 Message Date
Nicolai Haehnle 7a9c03f484 AMDGPU: Select MIMG instructions manually in SITargetLowering
Summary:
Having TableGen patterns for image intrinsics is hitting limitations:
for D16 we already have to manually pre-lower the packing of data
values, and we will have to do the same for A16 eventually.

Since there is already some custom C++ code anyway, it is arguably easier
to just do everything in C++, now that we can use the beefed-up generic
tables backend of TableGen to provide all the required metadata and map
intrinsics to corresponding opcodes. With this approach, all image
intrinsic lowering happens in SITargetLowering::lowerImage. That code is
dense due to all the cases that it handles, but it should still be easier
to follow than what we had before, by virtue of it all being done in a
single location, and by virtue of not relying on the TableGen pattern
magic that very few people really understand.

This means that we will have MachineSDNodes with MIMG instructions
during DAG combining, but that seems alright: previously we had
intrinsic nodes instead, but those are similarly opaque to the generic
CodeGen infrastructure, and the final pattern matching just did a 1:1
translation to machine instructions anyway. If anything, the fact that
we now merge the address words into a vector before DAG combine should
be an advantage.

Change-Id: I417f26bd88f54ce9781c1668acc01f3f99774de6

Reviewers: arsenm, rampitec, rtaylor, tstellar

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48017

llvm-svn: 335228
2018-06-21 13:36:57 +00:00
Nicolai Haehnle b3a9b68513 AMDGPU: Add implicit def of SCC to kill and indirect pseudos
Summary:
Kill instructions sometimes do use SCC in unusual circumstances, when
v_cmpx cannot be used due to the operands that are involved.

Additionally, even if SCC was never defined by the expansion, kill pseudos
could previously occur between an s_cmp and an s_cbranch_scc, which breaks
the SCC liveness tracking when the pseudo is expanded to split the basic
block. While it would be possible to explicitly mark the SCC as live-in for
the successor basic block, it's simpler to just mark the pseudo as using SCC,
so that such a sequence is never emitted by instruction selection in the
first place.

A similar issue affects indirect source/dest pseudos in principle, although
I haven't been able to come up with a test case where it actually matters
(this affects instruction selection, so a MIR test can't be used).

Fixes: dEQP-GLES3.functional.shaders.discard.dynamic_loop_always
Change-Id: Ica8d82ecff1a763b892a1112cf1b06c948863a4f

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47761

llvm-svn: 335223
2018-06-21 13:36:08 +00:00
Nicolai Haehnle f267431901 AMDGPU: Turn D16 for MIMG instructions into a regular operand
Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.

We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instructions loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.

We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.

The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which choose the correct variant for
other MIMG instruction is extended to cover gather4 as well.

As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).

While we're at it, delete a whole bunch of dead legacy TableGen code.

Change-Id: I89b02c2841c06f95e662541433e597f5d4553978

Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor

Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47434

llvm-svn: 335222
2018-06-21 13:36:01 +00:00
Mikael Holmen 42f7bc96dd [DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property
Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.

This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.

Patch by: JesperAntonsson

Reviewers: aprantl, rnk, echristo, javed.absar

Reviewed By: aprantl

Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D48319

llvm-svn: 335214
2018-06-21 10:03:34 +00:00
David Green a465188500 [DAGCombine] Fix alignment for offset loads/stores
The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.

This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.

Differential Revision: https://reviews.llvm.org/D48029

llvm-svn: 335210
2018-06-21 08:30:07 +00:00
Craig Topper 76df3d61a3 [X86] Go through some tests that still reference old intrinsics that have been autoupgraded and replace them with the upgraded IR.
This is mostly the stack folding tests and is by no means a thorough audit of tests.

llvm-svn: 335204
2018-06-21 06:17:16 +00:00
Chandler Carruth e5f1b8e20b [RISC-V] Fix a test case to not include label names as those aren't
stable in non-asserts builds. This fixes a test failure in release
config.

llvm-svn: 335202
2018-06-21 05:42:05 +00:00
Craig Topper 296526bf46 [X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead.
llvm-svn: 335199
2018-06-21 05:00:56 +00:00
Simon Dardis 0f111dd704 [mips] Add microMIPS specific addressing patterns.
These are identical but use microMIPS instructions instead of MIPS instructions.

Also, flatten the 'let AdditionalPredicates = [InMicroMips]' by using the
ISA_MICROMIPS adjective. Add tests for constant materialization.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48275

llvm-svn: 335185
2018-06-20 22:40:12 +00:00
Alina Sbirlea dfd14adeb0 Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.

2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
  i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
  ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
  - Since Pred is not deleted (BB is), the entry block does not need updating.
  - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  - adding some test owner as subscribers for the interesting tests modified:
    test/CodeGen/X86/avx-cmp.ll
    test/CodeGen/AMDGPU/nested-loop-conditions.ll
    test/CodeGen/AMDGPU/si-annotate-cf.ll
    test/CodeGen/X86/hoist-spill.ll
    test/CodeGen/X86/2006-11-17-IllegalMove.ll

3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar

Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D48202

llvm-svn: 335183
2018-06-20 22:01:04 +00:00
Craig Topper c2696d577b [X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel
I don't believe there is any real reason to have separate X86 specific opcodes for vector compares. Setcc has the same behavior just uses a different encoding for the condition code.

I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.

Differential Revision: https://reviews.llvm.org/D43608

llvm-svn: 335173
2018-06-20 21:05:02 +00:00
Stanislav Mekhanoshin 20279dc025 Allow binop C1, (select cc, CF, CT) -> select folding
Previously this folding was done only if select is a first operand.
However, for non-commutative operations constant may go before
select.

Differential Revision: https://reviews.llvm.org/D48223

llvm-svn: 335167
2018-06-20 20:24:20 +00:00
Simon Dardis 6021424c10 [mips] Correct predicates for loads, bit manipulation instructions and some pseudos
Additionally, correct the definition of the rdhwr instruction.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48216

llvm-svn: 335162
2018-06-20 19:59:58 +00:00
Matt Arsenault 5a4ec8127f AMDGPU: Fix scalar_to_vector for v4i16/v4f16
llvm-svn: 335161
2018-06-20 19:45:48 +00:00
Krzysztof Parzyszek 95486fd56a [Hexagon] Replace .ll test for expanding post-ra pesudos with .mir
llvm-svn: 335158
2018-06-20 19:22:27 +00:00
Bjorn Pettersson 7bf676662a [DAG] Don't map a TableId to itself in the ReplacedValues map
Summary:
Found some regressions (infinite loop in DAGTypeLegalizer::RemapId)
after r334880. This patch makes sure that we do map a TableId to
itself.

Reviewers: niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48364

llvm-svn: 335141
2018-06-20 16:06:09 +00:00
Nirav Dave cd558887d3 [DAG] Fix and-mask folding when narrowing loads.
Summary:
Check that and masks are strictly smaller than implicit mask from
narrowed load.

Fixes PR37820.

Reviewers: samparker, RKSimon, nemanjai

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48335

llvm-svn: 335137
2018-06-20 15:36:29 +00:00
Mikhail Dvoretckii 027fd8068f [X86] Adding a test for PR37879
llvm-svn: 335126
2018-06-20 14:01:57 +00:00
Tim Northover 644a819534 ARM: convert ORR instructions to ADD where possible on Thumb.
Thumb has more 16-bit encoding space dedicated to ADD than ORR, allowing both a
3-address encoding and a wider range of immediates. So, particularly when
optimizing for code size (but it doesn't make things worse elsewhere) it's
beneficial to select an OR operation to an ADD if we know overflow won't occur.

This is made even better by LLVM's penchant for putting operations in canonical
form by converting the other way.

llvm-svn: 335119
2018-06-20 12:09:44 +00:00
Tim Northover 70666e7765 [AArch64] Implement FLT_ROUNDS macro.
Very similar to ARM implementation, just maps to an MRS.

Should fix PR25191.

Patch by Michael Brase.

llvm-svn: 335118
2018-06-20 12:09:01 +00:00
Clement Courbet 7b9913fb9f [X86] Add sched class WriteLAHFSAHF and fix values.
Summary:
I ran llvm-exegesis on SKX, SKL, BDW, HSW, SNB.
Atom is from Agner and SLM is a guess.
I've left AMD processors alone.

Reviewers: RKSimon, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48079

llvm-svn: 335097
2018-06-20 06:13:39 +00:00
Craig Topper 31a64ee76c [X86] Remove a fptosi from the test_mm512_mask_reduce_max_pd fast-isel test.
The clang test inadvertently turned a floating point value into a double by having the wrong return type on the test function relative to the intrinsic it was testing.

This resulted in an extra fptosi instruction that propagated into this test when I copied the clang output.

llvm-svn: 335094
2018-06-20 04:32:06 +00:00
Craig Topper 59da74370b [SelectionDAG] Don't crash on inline assembly errors when the inline assembly return type is a struct.
Summary:
If we get an error building the SelectionDAG for inline assembly we try to continue and still build the DAG.

But if the return type for the inline assembly is a struct we end up crashing because we try to create an UNDEF node with a struct type which isn't valid.

Instead we need to create an UNDEF for each element of the struct and join them with merge_values.

This patch relies on single operand merge_values being handled gracefully by getMergeValues. If the return type is void there will be no VTs returned by ComputeValueVTs and now we just return instead of calling setValue. Hopefully that's ok, I assumed nothing would need to look up the mapped value for void node.

Fixes PR37359

Reviewers: rengolin, rovka, echristo, efriedma, bogner

Reviewed By: efriedma

Subscribers: craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D46560

llvm-svn: 335093
2018-06-20 04:32:05 +00:00
Philip Reames 62bbd54ed3 Add more test cases for deopt-operands via regalloc
This time, focused on reuse of arguments slots.  Only one minor todo here.

llvm-svn: 335091
2018-06-20 02:43:46 +00:00
Philip Reames 8befa295ea [InlineSpiller] Fix a crash due to lack of forward progress from remat specifically for STATEPOINT
This patch covers up a fairly fundemental issue around remat and register allocation which shows up with psuedo instructions with more vreg uses than there are physical registers.  This patch essentially just disables remat for STATEPOINTs which are the only case we've seen so far, but long term we need a better fix.

For STATEPOINTs specifically, this is a strict improvement.  It unblocks progress towards enabling a currently off-by-default mode which integrates deopt bundle operand lowering with register allocator spilling so that we end up with smaller stack sizes and more optimally placed spills.  Assming no other issues turn up during my next round of integration testing - which based on experience so far, is admittedly unlikely - we might finally be able to enable something I've been working towards in small bits and pieces for years now.  :)

For psuedo ops in general, there are a couple of ideas for a "proper fix" discussed on the bug, but I'm far enough outside my knowledge area to not be able to see any of them through to a successful conclusion.  If anyone wants to help out here, please do.  

Differential Revision: https://reviews.llvm.org/D41098

llvm-svn: 335077
2018-06-19 21:19:59 +00:00
Heejin Ahn 891a747266 [WebAssembly] Fix liveness tracking info after drop insertion
Summary:
This fixes liveness tracking information after `drop` instruction
insertion in ExplicitLocals pass.

When a drop instruction is inserted to drop a dead register operand, the
original operand should be marked not dead anymore because it is now
used by the new drop instruction. And the operand to the new drop
instruction should be marked killed instead. This bug caused some
programs to fail when `llc` is run with `-verify-machineinstrs` option.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48253

llvm-svn: 335074
2018-06-19 20:30:42 +00:00
Craig Topper 31961f051f [X86] Update fast-isel tests for clang's avx512f reduction intrinsics to match the codegen from r335070.
llvm-svn: 335071
2018-06-19 19:14:50 +00:00
Craig Topper 858afbd165 [X86] Add fast-isel tests for clang's AVX512F vector reduction intrinsics.
llvm-svn: 335068
2018-06-19 18:52:15 +00:00
Matt Davis a245c765a8 [MIRParser] Update a diagnostic message to use the correct register sigil. NFC
Summary:
Patch r323922 changed the sigil for physical registers to '$',  instead of '%'.
An error message was missed during this change, and reports the wrong sigil.
This patch corrects that diagnostic and the tests that check that error string.


Reviewers: zer0, bjope

Reviewed By: bjope

Subscribers: bjope, thegameg, plotfi, llvm-commits

Differential Revision: https://reviews.llvm.org/D48086

llvm-svn: 335066
2018-06-19 18:39:40 +00:00
Craig Topper 7ffa976993 [X86] Don't fold unaligned loads into SSE ROUNDPS/ROUNDPD for ceil/floor/nearbyint/rint/trunc.
Incorrect patterns were added in r334460. This changes them to check alignment properly for SSE.

llvm-svn: 335062
2018-06-19 17:51:42 +00:00
Krzysztof Parzyszek 5c2944c4f2 [Hexagon] Enforce restrictions on packetizing cache instructions
llvm-svn: 335061
2018-06-19 17:26:20 +00:00
Simon Dardis af38a8fed6 [mips] Mark microMIPS64 as being unsupported.
There are no provided instruction definitions for this architecture.

Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D48320

llvm-svn: 335057
2018-06-19 16:05:44 +00:00
Strahinja Petrovic bb2b00bb80 [PowerPC] Fix label address calculation for ppc32
This patch fixes calculating address of label on ppc32 (for -fPIC).

Differential Revision: https://reviews.llvm.org/D46582

llvm-svn: 335043
2018-06-19 13:07:40 +00:00
Mikhail Dvoretckii b1ce7765be [X86] VRNDSCALE* folding from masked and scalar ffloor and fceil patterns
This patch handles back-end folding of generic patterns created by lowering the
X86 rounding intrinsics to native IR in cases where the instruction isn't a
straightforward packed values rounding operation, but a masked operation or a
scalar operation.

Differential Revision: https://reviews.llvm.org/D45203

llvm-svn: 335037
2018-06-19 10:37:52 +00:00
QingShan Zhang 9f0fe9a3f8 If the arch is P9, we will select the DFLOADf32/DFLOADf64 pseudo instruction when we are loading a floating,
and expand it post RA basing on the register pressure. However, we miss to do the add-imm peephole for these pseudo instruction.

Differential Revision: https://reviews.llvm.org/D47568
Reviewed By: Nemanjai

llvm-svn: 335024
2018-06-19 06:54:51 +00:00
Roger Ferrer Ibanez ec03fbe8bb [RISCV] Add tests for overflow intrinsics
This is using the existing codegen so we can see the change once we custom
lower ISD::{U,S}{ADD,SUB}O nodes.

llvm-svn: 335023
2018-06-19 06:45:47 +00:00
Eli Friedman de735c977d [ARM] Thumb2 constant cmp testcases.
Shows some missed optimizations for the -7929856 and -2166 testcases.
-7929856 is due to a bug in ARMTargetLowering::getARMCmp, I think;
the -2166 case is a missing pattern.

llvm-svn: 335004
2018-06-19 00:14:10 +00:00
Eli Friedman 9e3bb196cb [ARM] Testcase for Thumb1 cmp with constants.
Even if a comparison isn't legal, we should try to prefer constants
which can be materialized with a two-instruction sequence. (Thinking
about it a bit more, there might be some more clever sequence we could
generate for certain comparisons invoving powers of two, but I'm not
sure exactly what that would look like.)

llvm-svn: 335003
2018-06-19 00:12:13 +00:00
Eli Friedman e6b4719244 [ARM] Add Thumb1 coverage for cmn testcases.
There's a missed optimization for immediates: we can save two
instructions by using adds instead of movs+mvns+cmp.

llvm-svn: 335002
2018-06-19 00:09:44 +00:00
Eli Friedman 892366d025 [ARM] Testcase for missed optimization for masking.
When the result of masking is truncated to i16, we should try to use
"bic" instead of "and".

llvm-svn: 335001
2018-06-19 00:08:32 +00:00
Eli Friedman 801c2f4c3a [ARM] Testcase for missed optimization with i16 compare.
The result looks weird because the DAG actually has an explicit
shift; I haven't figured out why, exactly.

llvm-svn: 335000
2018-06-19 00:07:30 +00:00
Michael Berg 7b993d762f Utilize new SDNode flag functionality to expand current support for fadd
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.

Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar

Reviewed By: spatel

Subscribers: wdng, nhaehnle

Differential Revision: https://reviews.llvm.org/D47909

llvm-svn: 334996
2018-06-18 23:44:59 +00:00
Stanislav Mekhanoshin 9347c7b939 Tests for dag combine select (binop) -> select. NFC.
Tests will be updated with https://reviews.llvm.org/D48223

llvm-svn: 334987
2018-06-18 21:49:07 +00:00
Sanjay Patel 3e52deb144 [x86] regenerate checks and adjust tests
2 of these tests were clearly not doing what the comments
said they were doing.

The last test was added at rL177933 with no assertions
(presumably it used to crash). But either we don't have 
that problem anymore, or this test is folded sooner,
so we don't hit the bug that was fixed by disabling late
FP constant creation. Looking at this as part of reviewing
D48289.

llvm-svn: 334977
2018-06-18 20:05:16 +00:00
Krzysztof Parzyszek 546017322f Shrink interval after moving copy in removePartialRedundancy
llvm-svn: 334963
2018-06-18 17:16:39 +00:00
Clement Courbet 0d9da88d18 [X86] Fix NOOP sched overrides on BDW/HSW/SKL.
Summary: Noop certainly does not use resources.

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits, gchatelet

Differential Revision: https://reviews.llvm.org/D48028

llvm-svn: 334927
2018-06-18 06:48:22 +00:00
Craig Topper b0e986f88e [X86] Pass the parent SDNode to X86DAGToDAGISel::selectScalarSSELoad to simplify the hasSingleUseFromRoot handling.
Some of the calls to hasSingleUseFromRoot were passing the load itself. If the load's chain result has a user this would count against that. By getting the true parent of the match and ensuring any intermediate between the match and the load have a single use we can avoid this case. isLegalToFold will take care of checking users of the load's data output.

This fixed at least fma-scalar-memfold.ll to succed without the peephole pass.

llvm-svn: 334908
2018-06-17 16:29:46 +00:00
Stanislav Mekhanoshin 3b11794dbf [AMDGPU] setcc (select cc, CT, CF), CF, eq | ne -> xor cc, -1 | cc
This is the common case in the BE when we serialize condition and then
rematerialize it. Use either original or inverted condition.

Differential Revision: https://reviews.llvm.org/D48246

llvm-svn: 334882
2018-06-16 03:46:59 +00:00
Michael Berg 8e570c3390 Utilize new SDNode flag functionality to expand current support for fma
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.

Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai

Reviewed By: rampitec, nhaehnle

Subscribers: tpr, nemanjai, wdng

Differential Revision: https://reviews.llvm.org/D47918

llvm-svn: 334876
2018-06-16 00:03:06 +00:00
Cameron McInally 7caac670b2 [FPEnv] Expand constrained FP POWI
Modify ExpandStrictFPOp(...) to handle nodes that have scalar
operands. 

Also, add a Strict FMA test and do some other light cleanup in the
Strict FP code.

Differential Revision: https://reviews.llvm.org/D48149

llvm-svn: 334863
2018-06-15 20:57:55 +00:00