Commit Graph

117701 Commits

Author SHA1 Message Date
Reid Kleckner 953bdce68d [MC] Separate masm integer literal lexer support from inline asm
Summary:
This renames the IsParsingMSInlineAsm member variable of AsmLexer to
LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior
controlled by that variable. I added a public setter, so that it can be
set from outside or from the llvm-mc command line. We may need to
arrange things so that users can get this behavior from clang, but
that's future work.

I also put additional hex literal lexing functionality under this flag
to fix PR32973. It appears that this hex literal parsing wasn't intended
to be enabled in non-masm-style blocks.

Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang,
but 0b label references work when using .intel_syntax in standalone .s
files.

However, 0b label references will *not* work from __asm blocks in clang.
They will work from GCC inline asm blocks, which it sounds like is
important for Crypto++ as mentioned in PR36144.

Essentially, we only lex masm literals for inline asm blobs that use
intel syntax. If the .intel_syntax directive is used inside a gnu-style
inline asm statement, masm literals will not be lexed, which is
compatible with gas and llvm-mc standalone .s assembly.

This fixes PR36144 and PR32973.

Reviewers: Gerolf, avt77

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D53535

llvm-svn: 345189
2018-10-24 20:23:57 +00:00
Tim Northover 1c353419ab AArch64: add a pass to compress jump-table entries when possible.
llvm-svn: 345188
2018-10-24 20:19:09 +00:00
Evandro Menezes 769d4cebad [AArch64] Refactor Exynos machine model (NFC)
llvm-svn: 345187
2018-10-24 20:03:24 +00:00
Evandro Menezes 80bc136732 [AArch64] Fix overlapping instructions
Fix overlapping instruction descriptions in the machine model for Exynos M3.
Effectively, NFC.

llvm-svn: 345186
2018-10-24 20:03:20 +00:00
Craig Topper 7bb8c2e6e5 [X86] Explicitly list all KNL features of inheriting from IVB. NFC
I'm not sure all the microarchitectural tuning flags that have been added to IVBFeatures are relevant for KNL. Separating will allow us to see and audit them. There might even be some simplification opportunities in the Sandy Bridge through Icelake inheritance line without KNL using the same chain.

llvm-svn: 345183
2018-10-24 19:24:44 +00:00
Simon Pilgrim c5bb362b13 [X86][SSE] Add SimplifyDemandedBitsForTargetNode PMULDQ/PMULUDQ handling
Add X86 SimplifyDemandedBitsForTargetNode and use it to simplify PMULDQ/PMULUDQ target nodes.

This enables us to repeatedly simplify the node's arguments after the previous approach had to be reverted due to PR39398.

Differential Revision: https://reviews.llvm.org/D53643

llvm-svn: 345182
2018-10-24 19:11:28 +00:00
Simon Pilgrim 6f53b38fd4 [TargetLowering] Add SimplifyDemandedBitsForTargetNode callback
Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes.

Differential Revision: https://reviews.llvm.org/D53643

llvm-svn: 345179
2018-10-24 19:00:56 +00:00
Teresa Johnson c8dba682bb [hot-cold-split] Name split functions with ".cold" suffix
Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.

Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits, erik.pilkington

Differential Revision: https://reviews.llvm.org/D53534

llvm-svn: 345178
2018-10-24 18:53:47 +00:00
Simon Pilgrim ac84005841 [CostModel][X86] Add vXi8 vector division by constants costs.
ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV.

llvm-svn: 345175
2018-10-24 18:44:12 +00:00
Peter Collingbourne 4bb928c110 ARM: Use BKPT instead of TRAP to implement llvm.debugtrap.
The BKPT instruction is specified to cause a software breakpoint,
and at least on Linux results in a SIGTRAP. This makes it more
suitable for implementing debugtrap than TRAP (aka UDF #254), which
is specified to cause an undefined instruction exception and results
in a SIGILL on Linux.

Moreover, BKPT is not marked as a terminator, which is not only
consistent with the IR instruction but allows the analyzeBlock
function to correctly analyze a basic block containing the instruction,
which fixes an assertion failure in the machine block placement pass
previously triggered by the included test case.

Because BKPT is only supported starting with ARMv5T, we continue to
use UDF #254 when targeting v4T.

Differential Revision: https://reviews.llvm.org/D53614

llvm-svn: 345171
2018-10-24 18:10:38 +00:00
Krzysztof Parzyszek 57b5ac1431 [Hexagon] Flip hexagon-autohvx to be true by default
This will allow other generators of LLVM IR to use the auto-vectorizer
without having to change that flag.

Note: on its own, this patch will enable auto-vectorization on Hexagon
in all cases, regardless of the -fvectorize flag. There is a companion
clang patch that together with this one forms an NFC for clang users.

llvm-svn: 345169
2018-10-24 17:55:13 +00:00
Craig Topper 2417273255 [X86] Bring back the MOV64r0 pseudo instruction
This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.

My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at.

There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.

Differential Revision: https://reviews.llvm.org/D52757

llvm-svn: 345165
2018-10-24 17:32:09 +00:00
Simon Pilgrim 2cce074e8c [CostModel][X86] Enable non-uniform vector division by constants costs.
Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases.

llvm-svn: 345164
2018-10-24 17:30:29 +00:00
Robert Lougher 18bfb3a5ec [CodeGen] skip lifetime end marker in isInTailCallPosition
A lifetime end intrinsic between a tail call and the return should not
prevent the call from being tail call optimized.

Differential Revision: https://reviews.llvm.org/D53519

llvm-svn: 345163
2018-10-24 17:03:19 +00:00
Simon Pilgrim c8c7451063 [LegalizeDAG] ExpandLegalINT_TO_FP - cleanup UINT_TO_FP i64 -> f32 expansion.
Use SrcVT/DestVT types and correct shift type.

Part of prep work for D52965

llvm-svn: 345158
2018-10-24 16:35:01 +00:00
Krasimir Georgiev 09ea204964 IR: Optimize FunctionType::get to perform one hash lookup instead of two, NFCI
Summary: This function was performing two hash lookups when a new function type was requested: first checking if it exists and second to insert it. This patch updates the function to perform a single hash lookup in this case by updating the value in the hash table in-place in case the function type was not there before.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53471

llvm-svn: 345151
2018-10-24 15:18:51 +00:00
Sanjay Patel 3b206305fd [InstCombine] try harder to form select from logic ops (2nd try)
The original patch was committed here:
rL344609
...and reverted:
rL344612
...because it did not properly check/test data types before calling
ComputeNumSignBits(). 

The tests that caused bot failures for the previous commit are 
over-reaching front-end tests that run the entire -O optimizer 
pipeline: 
    Clang :: CodeGen/builtins-systemz-zvector.c
    Clang :: CodeGen/builtins-systemz-zvector2.c

I've added a negative test here to ensure coverage for that case.
The new early exit check also tests the type of the 'B' parameter,
so we don't waste time on matching if either value is unsuitable.

Original commit message:

This is part of solving PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

The patterns shown here are a special case of something
that we already convert to select. Using ComputeNumSignBits()
catches that case (but not the more complicated motivating
patterns yet).

The backend has hooks/logic to convert back to logic ops
if that's better for the target.

llvm-svn: 345149
2018-10-24 15:17:56 +00:00
Cameron McInally 678f43f666 [FPEnv] Convert more BinaryOperator::isFNeg(...) to m_FNeg(...)
This work is to avoid regressions when we seperate FNeg from the FSub IR instruction. 

Differential Revision: https://reviews.llvm.org/D53205

llvm-svn: 345146
2018-10-24 14:45:18 +00:00
Alexey Bataev c15c853c3a [DEBUGINFO, NVPTX] Try to pack bytes data into a single string.
Summary:
If the target does not support `.asciz` and `.ascii` directives, the
strings are represented as bytes and each byte is placed on the new line
as a separate byte directive `.b8 <data>`. NVPTX target allows to
represent the vector of the data of the same type as a vector, where
values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This
allows to reduce the size of the final PTX file. Ptxas tool includes ptx
files into the resulting binary object, so reducing the size of the PTX
file is important.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D45822

llvm-svn: 345142
2018-10-24 14:04:00 +00:00
Eugene Leviant 9465a1a580 [ThinLTO] Change parameter type. NFC
Change destination module type for consistency with r345118

llvm-svn: 345124
2018-10-24 08:59:58 +00:00
Gil Rapaport c523036fd2 Revert r345114
Investigating fails.

llvm-svn: 345123
2018-10-24 08:41:22 +00:00
Tim Renouf 2a1b1d94b6 [AMDGPU] Defined gfx909 Raven Ridge 2
Differential Revision: https://reviews.llvm.org/D53418

Change-Id: Ie3d054f2e956c2768988c0f4c0ffd29a47294eef
llvm-svn: 345120
2018-10-24 08:14:07 +00:00
Eugene Leviant 1f54500af0 [ThinLTO] Fix dot dumper for regular LTO modules
Regular LTO module identifier is (unsigned)-1. This patch emits correct
module identifier while printing edges with source summary in regular
LTO module.

Differential revision: https://reviews.llvm.org/D53583

llvm-svn: 345118
2018-10-24 07:48:32 +00:00
Dorit Nuzman 5114390e48 [LV] Don't have fold-tail under optsize invalidate interleave-groups when
masked-interleaving is enabled

Enable interleave-groups under fold-tail scenario for Opt for size compilation;
D50480 added support for vectorizing loops of arbitrary trip-count without a
remiander, which in turn makes everything in the loop conditional, including
interleave-groups if any. It therefore invalidated all interleave-groups
because we didn't have support for vectorizing predicated interleaved-groups
at the time. In the meantime, D53011 introduced this support, so we don't
have to invalidate interleave-groups when masked-interleaved support is enabled.

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: hsaito

Differential Revision: https://reviews.llvm.org/D53559

llvm-svn: 345115
2018-10-24 07:11:38 +00:00
Gil Rapaport 5012e7f6ac [LSR] Combine unfolded offset into invariant register
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.

Differential Revision: https://reviews.llvm.org/D51861

llvm-svn: 345114
2018-10-24 07:08:38 +00:00
Craig Topper da54bbf52a [X86] Correct a bad isel predicate. Though I don't think it can be exposed.
This B/W VPTEST instructions are only available with AVX512BW. But lowering should prevent any byte or word elements from getting to isel so this can't be exposed.

llvm-svn: 345112
2018-10-24 06:13:36 +00:00
Saleem Abdulrasool 4005f9a860 ARM: handle checking aliases with out-of-bounds GEPs
A global alias may use indices which are not considered in bounds.  In
such a case, accessing the base object will fail as it only peers
through inbounds accesses.  This pattern is used by the swift compiler
to create references to preceeding members in the type metadata.  This
would cause the code generation to fail when targeting a platform that
used ELF as the object file format.  Be conservative and fail the
read-only check if we run into an alias that we cannot peer through.

llvm-svn: 345107
2018-10-24 00:00:52 +00:00
Reid Kleckner 5fa1e35bcc Commit missing comment edit and use correct cast to fix std::min overload
llvm-svn: 345105
2018-10-23 23:44:44 +00:00
Reid Kleckner 1500effacd [hurd] Make getMainExecutable get the real binary path
On GNU/Hurd, llvm-config is returning bogus value, such as:

$ llvm-config-6.0 --includedir
/usr/include

while it should be:
$ llvm-config-6.0 --includedir
/usr/lib/llvm-6.0/include

This is because getMainExecutable does not get the actual installation
path. On GNU/Hurd, /proc/self/exe is indeed a symlink to the path that
was used to start the program, and not the eventual binary file. Llvm's
getMainExecutable thus needs to run realpath over it to get the actual
place where llvm was installed (/usr/lib/llvm-6.0/bin/llvm-config), and
not /usr/bin/llvm-config-6.0. This will not change the result on Linux,
where /proc/self/exe already points to the eventual file.

Patch by Samuel Thibault!

While making changes here, I reformatted this block a bit to reduce
indentation and match 2 space indent style.

Differential Revision: https://reviews.llvm.org/D53557

llvm-svn: 345104
2018-10-23 23:35:43 +00:00
Wei Mi 80a0c97e07 [PM] keeping history when original SCC split and then merge into itself
in the same round of SCC update.

In https://reviews.llvm.org/rL309784, inline history is added to prevent
infinite inlining across multiple run of inliner and SCC update, but the
history will only be kept when new SCC is actually generated during SCC update.

We found a case that SCC can be split and then merge into itself in the same
round of SCC update, so the same SCC will be pop out from UR.CWorklist and
then added back immediately, without any new SCC generated, that is why the
existing patch cannot catch the infinite inline case.

What the patch does is even if no new SCC is generated, if only the current
SCC appears in UR.CWorklist again, then keep the inline history.

Differential Revision: https://reviews.llvm.org/D52915

llvm-svn: 345103
2018-10-23 23:29:45 +00:00
Matthias Braun 4f82406c46 SelectionDAG: Reuse bigger sized constants in memset expansion.
When implementing memset's today we often see this pattern:
$x0 = MOV 0xXYXYXYXYXYXYXYXY
store $x0, ...
$w1 = MOV 0xXYXYXYXY
store $w1, ...

We first create a 64bit constant in a 64bit register with all bytes the
same and then create a 32bit constant with all bytes the same in a 32bit
register. In many targets we could just access the lower byte of the
64bit register instead.

- Ideally this would be handled by the ConstantHoist pass but it runs
  too early when memset isn't expanded yet.
- The memset expansion code already had this optimization implemented,
  however SelectionDAG constantfolding would constantfold the
  "trunc(bigconstnat)" pattern to "smallconstant".
- This patch makes the memset expansion mark the constant as Opaque and
  stop DAGCombiner from constant folding in this situation. (Similar to
  how ConstantHoisting marks things as Opaque to avoid folding
  ADD/SUB/etc.)

Differential Revision: https://reviews.llvm.org/D53181

llvm-svn: 345102
2018-10-23 23:19:23 +00:00
Lang Hames 23cb2e7f77 [ORC] Re-apply r345077 with fixes to remove ambiguity in lookup calls.
llvm-svn: 345098
2018-10-23 23:01:39 +00:00
Teresa Johnson 7c6344a64f Revert "[ThinLTO] Fix a crash in lazy loading of Metadata"
This reverts commit r345095. It was accidentally committed.

llvm-svn: 345097
2018-10-23 23:00:29 +00:00
Teresa Johnson d725335bd1 [hot-cold-split] Only perform splitting in ThinLTO backend post-link
Summary:
Fix the new PM to only perform hot cold splitting once during ThinLTO,
by skipping it in the pre-link phase.

This was already fixed in the old PM by the move of the hot cold split
pass later (after the early return when PrepareForThinLTO) by r344869.

Reviewers: vsk, sebpop, hiraditya

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D53611

llvm-svn: 345096
2018-10-23 22:57:40 +00:00
Teresa Johnson 3513dc245e [ThinLTO] Fix a crash in lazy loading of Metadata
Summary:
This is a revised version of D41474.

When the debug location is parsed in BitcodeReader::parseFunction, the
scope and inlinedAt MDNodes are obtained via MDLoader->getMDNodeFwdRefOrNull(),
which will create a forward ref if they were not yet loaded.
Specifically, if one of these MDNodes is in the module level metadata
block, and this is during ThinLTO importing, that metadata block is
lazily loaded.

Most places in that invoke getMDNodeFwdRefOrNull have a corresponding call
to resolveForwardRefsAndPlaceholders which will take care of resolving them.
E.g. places that call getMetadataFwdRefOrLoad, or at the end of parsing a
function-level metadata block, or at the end of the initial lazy load of
module level metadata in order to handle invocations of getMDNodeFwdRefOrNull
for named metadata and global object attachments. However, the calls for
the scope/inlinedAt of debug locations are not backed by any such call to
resolveForwardRefsAndPlaceholders.

To fix this, change the scope and inlinedAt parsing to instead use
getMetadataFwdRefOrLoad, which will ensure the forward refs to lazily
loaded metadata are resolved.

Fixes PR35472.

Reviewers: dexonsmith, Sunil_Srivastava, vsk

Subscribers: inglorion, eraman, steven_wu, sebpop, mehdi_amini, dmikulin, vsk, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D53596

llvm-svn: 345095
2018-10-23 22:57:21 +00:00
Zhizhou Yang 13f76f84bc Print out DebugCounter info with -print-debug-counter
Summary:
This patch will print out {Counter, Skip, StopAfter} info of all passes which have DebugCounter set at destruction.

It can be used to monitor how many times does certain transformation happen in a pass, and also help check if -debug-counter option is set correctly.

Please refer to this [[ http://lists.llvm.org/pipermail/llvm-dev/2018-July/124722.html  | thread ]] for motivation.

Reviewers: george.burgess.iv, davide, greened

Reviewed By: greened

Subscribers: kristina, llozano, mgorny, llvm-commits, mgrang

Differential Revision: https://reviews.llvm.org/D50031

llvm-svn: 345085
2018-10-23 21:51:56 +00:00
Matt Arsenault 9ef8e51cec Fix typo in verifier error message
llvm-svn: 345083
2018-10-23 21:23:52 +00:00
Peter Collingbourne abd820a92b CGP: Clear data structures at the end of a loop iteration instead of the beginning.
Clearing LargeOffsetGEPMap at the end fixes a bug where if a large
offset GEP is in a dead basic block, we fail an assertion when trying
to delete the block due to the asserting VH in LargeOffsetGEPMap.

Differential Revision: https://reviews.llvm.org/D53464

llvm-svn: 345082
2018-10-23 21:23:18 +00:00
Reid Kleckner db367e952e Revert r345077 "[ORC] Change how non-exported symbols are matched during lookup."
Doesn't build on Windows. The call to 'lookup' is ambiguous. Clang and
MSVC agree, anyway.

http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/787
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): error C2668: 'llvm::orc::ExecutionSession::lookup': ambiguous call to overloaded function
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(823): note: could be 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(llvm::ArrayRef<llvm::orc::JITDylib *>,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(817): note: or       'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(const llvm::orc::JITDylibSearchList &,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): note: while trying to match the argument list '(initializer list, llvm::orc::SymbolStringPtr)'

llvm-svn: 345078
2018-10-23 20:54:43 +00:00
Lang Hames 841796decd [ORC] Change how non-exported symbols are matched during lookup.
In the new scheme the client passes a list of (JITDylib&, bool) pairs, rather
than a list of JITDylibs. For each JITDylib the boolean indicates whether or not
to match against non-exported symbols (true means that they should be found,
false means that they should not). The MatchNonExportedInJD and MatchNonExported
parameters on lookup are removed.

The new scheme is more flexible, and easier to understand.

This patch also updates JITDylib search orders to be lists of (JITDylib&, bool)
pairs to match the new lookup scheme. Error handling is also plumbed through
the LLJIT class to allow regression tests to fail predictably when a lookup from
a lazy call-through fails.

llvm-svn: 345077
2018-10-23 20:20:22 +00:00
Vedant Kumar 503154615d [HotColdSplitting] Attach MinSize to outlined code
Outlined code is cold by assumption, so it makes sense to optimize it
for minimal code size rather than performance.

After r344869 moved the splitting pass to the end of the IR pipeline,
this does not result in much of a code size reduction. This is probably
because a comparatively small number backend transforms make use of the
MinSize hint.

Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of
919 bytes of TEXT reduction. I didn't measure a significant performance
impact.

Differential Revision: https://reviews.llvm.org/D53518

llvm-svn: 345072
2018-10-23 19:41:12 +00:00
Simon Pilgrim b6c57075c0 [X86][SSE] Revert rL343922 combinePMULDQ AddToWorklist (PR39398)
We can't add the MULDQ node back to the worklist after the demanded bits change has been committed in case the node has been removed entirely. This will have to wait until we have SimplifyDemandedBitsForTargetNode.

llvm-svn: 345070
2018-10-23 19:07:53 +00:00
Simon Pilgrim 8c4796deb4 [LegalizeDAG] Share Vector/Scalar CTPOP Expansion
As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.

Proper vector support will be added by D53258.

llvm-svn: 345066
2018-10-23 18:28:24 +00:00
Roman Lebedev 2fae985793 X86DAGToDAGISel::matchBitExtract(): lambdas can't have default arguments.
As reported by ctopper.
That is a gcc-only warning at the moment.

llvm-svn: 345065
2018-10-23 18:27:10 +00:00
Simon Pilgrim d705ba97dd [LegalizeDAG] Share Vector/Scalar CTLZ Expansion
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.

Extension to D53474

llvm-svn: 345060
2018-10-23 17:48:30 +00:00
Fangrui Song 531e3d0cd9 [IR] Fix -Wunused-function after r345052
llvm-svn: 345057
2018-10-23 17:24:15 +00:00
Reid Kleckner 075897292f [PDB] Fix -Wunused-private-field in DIA
llvm-svn: 345054
2018-10-23 17:20:16 +00:00
Stefan Pintilie 927e8bf316 [Power9] Add __float128 support in the backend for bitcast to a i128
Add support to allow bit-casting from f128 to i128 and then
extracting 64 bits from the result.

Differential Revision: https://reviews.llvm.org/D49507

llvm-svn: 345053
2018-10-23 17:11:36 +00:00
Sanjay Patel 07076cfdf6 [IR] remove fake binop queries for not/neg
The initial motivation is that we want to remove the
fneg API because that would silently fail if we add
an actual fneg instruction to IR. The same would be
true for the integer ops, so we might as well get rid 
of these too.

We have a newer 'match' API that makes checking for
these patterns simpler. It also works with vectors
that may include undef elements in constants.

If any out-of-tree users need updating, they can model
their code changes on these commits:
rL345050
rL345043
rL345042
rL345041
rL345036
rL345030

llvm-svn: 345052
2018-10-23 17:06:03 +00:00
Sanjay Patel 95790c546f [InstCombine] use 'match' to simplify code
There's probably some vector-with-undef-element pattern
that shows an improvement, so this is probably not quite
'NFC'.

This is the last step towards removing the fake binop 
queries for not/neg. Ie, there are no more uses of those
functions in trunk. Fneg should follow.

llvm-svn: 345050
2018-10-23 16:54:28 +00:00