Normally DCE kills these, but at -O0 these get left behind
leaving suspicious looking illegal copies.
Replace with IMPLICIT_DEF to avoid iterator issues.
llvm-svn: 327842
This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which
uses v4f16 and v8f16 vector operands and return values. All the moving parts
are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16
intrinsic. In a follow-up patch the rest of the intrinsics and tests will be
added.
Differential Revision: https://reviews.llvm.org/D44538
llvm-svn: 327839
If DoneMBB becomes empty it must have CC added to its live-in list, since it
will fall-through into EndMBB. This happens when the CLC loop does the
complete range.
Review: Ulrich Weigand
llvm-svn: 327834
This is re-land of https://reviews.llvm.org/rL327362 with a fix
and regression test.
The crash was due to it is possible that for found MDL loop,
LHS or RHS may contain an invariant unknown SCEV which
does not dominate the MDL. Please see regression
test for an example.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44553
llvm-svn: 327822
Also move ADC8i8 and SBB8i8 in the Sandy Bridge model to the same class as ADC8ri and SBB8ri. That seems more accurate since its the 8i8 is just the register forced to AL instead of coming from modrm.
llvm-svn: 327820
This patch adds i128 division support by instruction LLVM to lower
128-bit divisions to the __udivmodti4 and __divmodti4 rtlib functions.
This also adds test for 64-bit division and 128-bit division.
Patch by Peter Nimmervoll.
llvm-svn: 327814
LICM deletes trivially dead instructions which it won't attempt to sink.
Attempt to salvage debug values which reference these instructions.
llvm-svn: 327800
As shown in the code comment, we don't need all of 'fast',
but we do need reassoc + nsz + nnan.
Differential Revision: https://reviews.llvm.org/D43765
llvm-svn: 327796
Jaguar's FPU has 2 scheduler pipes (JFPU0/JFPU1) which forward to multiple functional sub-units each. We need to model that an micro-op will both consume the scheduler pipe and a functional unit.
This patch just handles the ops defined through JWriteResFpuPair, I'll go through the custom cases later.
llvm-svn: 327791
Now that almost all functionality of Apple's dsymutil has been
upstreamed, the open source variant can be used as a drop in
replacement. Hence we feel it's no longer necessary to have the llvm
prefix.
Differential revision: https://reviews.llvm.org/D44527
llvm-svn: 327790
Hopefully these tests can be easily reused should any other subtarget get in depth llvm-mca coverage (we can either copy the tests or move them into a common dir and run it with multiple prefixes).
llvm-svn: 327788
The information was so wildly inaccurate and incomplete its better to just remove it.
MMX_MASKMOVQ64 showed up twice in several scheduler models. In Haswell and Broadwell they were on adjacent lines. On Skylake the copies had different information.
MMX_MASKMOVQ and MASKMOVDQU were completely missing.
MMX_MASKMOVQ64 was listed on Haswell/Broadwell as 1 cycle on port 1 despite it being a store instruction.
Filed PR36780 to track fixing this right.
llvm-svn: 327783
X86 Supports Indirect Branch Tracking (IBT) as part of Control-Flow Enforcement Technology (CET).
IBT instruments ENDBR instructions used to specify valid targets of indirect call / jmp.
The `nocf_check` attribute has two roles in the context of X86 IBT technology:
1. Appertains to a function - do not add ENDBR instruction at the beginning of the function.
2. Appertains to a function pointer - do not track the target function of this pointer by adding nocf_check prefix to the indirect-call instruction.
This patch implements `nocf_check` context for Indirect Branch Tracking.
It also auto generates `nocf_check` prefixes before indirect branchs to jump tables that are guarded by range checks.
Differential Revision: https://reviews.llvm.org/D41879
llvm-svn: 327767
Improve/implement these methods to improve DAG combining. This mainly
concerns intrinsics.
Some constant operands to SystemZISD nodes have been marked Opaque to avoid
transforming back and forth between generic and target nodes infinitely.
Review: Ulrich Weigand
llvm-svn: 327765
The BITCAST handling in computeKnownBits() previously only worked for little
endian.
This patch reverses the iteration over elements for a big endian target which
allows this to work in this case also.
SystemZ test case.
Review: Eli Friedman
https://reviews.llvm.org/D44249
llvm-svn: 327764
At the point the outliner runs, KILLs don't impact anything, but they're still
considered unique instructions. This commit makes them invisible like
DebugValues so that they can still be outlined without impacting outlining
decisions.
llvm-svn: 327760
Avoid scheduling two loads in such a way that they would end up in the
same packet. If there is a load in a packet, try to schedule a non-load
next.
Patch by Brendon Cahoon.
llvm-svn: 327742
Now the Windows mangling modes ('w' and 'x') do not do any mangling for
symbols starting with '?'. This means that clang can stop adding the
hideous '\01' leading escape. This means LLVM debug logs are less likely
to contain ASCII escape characters and it will be easier to copy and
paste MS symbol names from IR.
Finally.
For non-Windows platforms, names starting with '?' still get IR
mangling, so once clang stops escaping MS C++ names, we will get extra
'_' prefixing on MachO. That's fine, since it is currently impossible to
construct a triple that uses the MS C++ ABI in clang and emits macho
object files.
Differential Revision: https://reviews.llvm.org/D7775
llvm-svn: 327734
We previously avoided inserting these moves during isel in a few cases which is implemented using a whitelist of opcodes. But it's too difficult to generate a perfect list of opcodes to whitelist. Especially with AVX512F without AVX512VL using 512 bit vectors to implement some 128/256 bit operations. Since isel is done bottoms up, we'd have to check the VT and opcode and subtarget in order to determine whether an EXTRACT_SUBREG would be generated for some operations.
So instead of doing that, this patch adds a post processing step that detects when the moves are unnecesssary after isel. At that point any EXTRACT_SUBREGs would have already been created and appear in the DAG. So then we just need to ensure the input to the move isn't one.
Differential Revision: https://reviews.llvm.org/D44289
llvm-svn: 327724
This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now.
As an aside, our -O3 handling of a floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize).
Differential Revision: https://reviews.llvm.org/D44542
llvm-svn: 327722
JumpThreading iterates over F until the IR quiesces. Transforming
unreachable BBs increases compile time and it is also possible to
never stabilize causing JumpThreading to hang. An older attempt at
fixing this problem was D3991 where removeUnreachableBlocks(F)
was called before JumpThreading began. This has a few drawbacks:
* expensive - the routine attempts to fix up the IR to identify
additional BBs that can be removed along with unreachable BBs.
* aggressive - does not identify and preserve the shape of the IR.
At a minimum it does not preserve loop hierarchies.
* invasive - altering reachable blocks it may disrupt IR shapes
that could have otherwise been JumpThreaded.
This patch avoids removeUnreachableBlocks(F) and instead tracks
unreachable BBs in a SmallPtrSet using DominatorTree to validate the
initial state of all BBs. We then rely on subsequent passes to identify
and remove these unreachable blocks from F.
Reviewers: dberlin, sebpop, kuhar, dinesh.d
Reviewed by: sebpop, kuhar
Subscribers: hiraditya, uabelho, llvm-commits
Differential Revision: https://reviews.llvm.org/D44177
llvm-svn: 327713
Summary:
Currently the LLVM MC assembler is able to convert e.g.
vmov.i32 d0, #0xabababab
(which is technically invalid) into a valid instruction
vmov.i8 d0, #0xab
this patch adds support for vmov.i64 and for cases with the resulting
load types other than i8, e.g.:
vmov.i32 d0, #0xab00ab00 ->
vmov.i16 d0, #0xab00
Reviewers: olista01, rengolin
Reviewed By: rengolin
Subscribers: rengolin, javed.absar, kristof.beyls, rogfer01, llvm-commits
Differential Revision: https://reviews.llvm.org/D44467
llvm-svn: 327709