Commit Graph

Mark de Wever 31262d6722 [NVPTX] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adding -Wrange-loop-analysis to -Wall.

Also removed the top-level const as requested by Aaron Ballman in similar
patches.

Differential Revision: https://reviews.llvm.org/D71812
2019-12-22 19:27:44 +01:00
Mark de Wever 1b344e7967 [PowerPC] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adding -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71811
2019-12-22 19:23:57 +01:00
Mark de Wever 098d3347e7 [Transforms] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adding -Wrange-loop-analysis to -Wall.

Differential Revision: https://reviews.llvm.org/D71810
2019-12-22 19:20:17 +01:00
Sanjay Patel 9cdcd81d3f [InstCombine] enhance fold for copysign with known sign arg
This is another optimization suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153
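
As a rough illustration of the kind of fold this targets (the exact patterns handled are in the patch and PR; the values here are purely illustrative), copysign with a second operand whose sign is known can be reduced to a plain fabs:

  #include <cmath>

  double before(double x) {
    return std::copysign(x, 2.0); // sign of the second operand is known (positive)
  }

  double after(double x) {
    return std::fabs(x);          // equivalent result the fold can produce
  }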
2019-12-22 10:07:01 -05:00
Eric Astor dc5b614fa9 [ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly.
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.

Also adds support for operand modifiers in Intel-style inline assembly.
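
For illustration, a hedged sketch of the kind of MS-style (Intel-syntax) inline assembly this affects; the function names are made up, and building it requires an x86 target with MS extensions enabled (e.g. clang-cl or -fms-extensions):

  extern "C" void external_helper();

  void caller() {
    __asm {
      call external_helper  // the call operand is printed via the "P" template modifier
    }
  }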

Reviewers: rnk

Reviewed By: rnk

Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71677
2019-12-22 09:16:34 -05:00
Sanjay Patel 0b38af89e2 [AArch64] match splat of bitcasted extract subvector to DUPLANE
This is another potential regression exposed by D63815.

Here we peek through a bitcast to find an extract subvector and
scale the splat offset based on that:
splat (bitcast (extract X, C)), LaneC --> duplane (bitcast X), LaneC'

Differential Revision: https://reviews.llvm.org/D71672
2019-12-22 08:37:03 -05:00
David Blaikie d0bfb3c583 DebugInfo: Remove out of date comment 2019-12-21 23:13:26 -08:00
Simon Pilgrim 6945d383b9 Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. 2019-12-21 17:45:30 +00:00
Sanjay Patel 79c7fa31f3 [InstCombine] check alloc size in bitcast of geps fold (PR44321)
We missed a constraint in D44833
when folding a bitcast into a GEP with vector/array types.
If the alloc sizes specified by the datalayout don't match,
this could miscompile as shown in:
https://bugs.llvm.org/show_bug.cgi?id=44321

Differential Revision: https://reviews.llvm.org/D71771
2019-12-21 10:31:21 -05:00
Sanjay Patel 19f9f374d9 [SimplifyLibCalls] require fast-math-flags for pow(X, -0.5) transforms
As discussed in PR44330:
https://bugs.llvm.org/show_bug.cgi?id=44330
...the transform from pow(X, -0.5) libcall/intrinsic to
reciprocal square root can result in small deviations from
the expected result due to differences in the pow()
implementation and/or the extra rounding step from the division.

This patch proposes to allow that difference with either the
'approximate functions' or 'reassociate' FMF:
http://llvm.org/docs/LangRef.html#fast-math-flags

In practice, this likely means that the code is compiled with
all of 'fast' (-ffast-math), but I have preserved the existing
specializations for -0.0/-INF that enable generating safe code
if those special values are allowed simultaneously with
allowing approximation/reassociation.

The question about whether a similar restriction is needed for
the non-reciprocal case -- pow(X, 0.5) -- is deferred. That
transform is allowed without FMF currently, and this patch does
not change that behavior.
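
As a hedged illustration (not the pass's code), these are the two forms involved; with 'afn' or 'reassoc' fast-math flags (e.g. from -ffast-math) the first may be rewritten into the second, which can round slightly differently:

  #include <cmath>

  double recip_sqrt_via_pow(double x) {
    return std::pow(x, -0.5);   // libcall/intrinsic form
  }

  double recip_sqrt_direct(double x) {
    return 1.0 / std::sqrt(x);  // reciprocal-square-root form after the transform
  }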

Differential Revision: https://reviews.llvm.org/D71706
2019-12-21 10:00:53 -05:00
Florian Hahn d269255b95 [AArch64] Respect reserved registers while renaming in LdSt opt.
We cannot pick reserved registers as rename registers.

Fixes https://bugs.llvm.org/show_bug.cgi?id=44358
2019-12-21 15:10:07 +01:00
Matt Arsenault 42a26445f9 AMDGPU/GlobalISel: Fix misuse of div_scale intrinsics
Confusingly, the intrinsic operands do not match the
instruction/custom node. The order is shuffled, and the 3rd operand is
an immediate to select operands.

I'm not 100% sure I did this right, but fdiv still doesn't select end
to end and it will be easier to tell when it does. This at least
avoids an assertion in RegBankSelect and allows hitting the fallback
on selection.
2019-12-21 04:55:36 -05:00
Matt Arsenault dff3f8d742 AMDGPU/GlobalISel: Fix missing scc imp-def on scalar and/or/xor 2019-12-21 04:55:36 -05:00
Matt Arsenault d688a6739d AMDGPU/GlobalISel: Simplify code
This can directly access the register bank, and doesn't need to get it
through the ID.
2019-12-21 04:55:36 -05:00
Lang Hames 9f4f237e29 [ORC] De-register eh-frames in the RTDyldObjectLinkingLayer destructor.
This matches the behavior of the legacy layer, which automatically deregistered
frames.
2019-12-20 21:10:49 -08:00
Jessica Paquette d5750770eb [NFC][MachineOutliner] Rewrite setSuffixIndices to be iterative
Having this function be recursive could use up way too much stack space.

Rewrite it as an iterative traversal of the tree instead to prevent this.
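
As a generic sketch of the idea (not the outliner's actual suffix-tree code; the names are illustrative), a recursive walk can be replaced by an explicit work stack so the depth of the tree no longer grows the call stack:

  #include <stack>
  #include <vector>

  struct Node {
    std::vector<Node *> Children;
  };

  // Visit every node without recursion, bounding stack usage by the work list.
  void visitIteratively(Node *Root) {
    if (!Root)
      return;
    std::stack<Node *> Work;
    Work.push(Root);
    while (!Work.empty()) {
      Node *N = Work.top();
      Work.pop();
      // ... process N here (e.g. set its suffix index) ...
      for (Node *C : N->Children)
        Work.push(C);
    }
  }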

Fixes PR44344.
2019-12-20 16:12:37 -08:00
Vedant Kumar fa4701e197 [DWARF] Defer creating declaration DIEs until we prepare call site info
It isn't necessary to create DIEs for all of the declaration subprograms
in a CU's retainedTypes list. We can defer creating these subprograms
until we need to prepare a call site tag that refers to one.

This cleanup was mentioned in passing in D70350.
2019-12-20 15:26:31 -08:00
Vedant Kumar 79daafc903 Reland: [DWARF] Allow cross-CU references of subprogram definitions
This allows a call site tag in CU A to reference a callee DIE in CU B
without resorting to creating an incomplete duplicate DIE for the callee
inside of CU A.

We already allow cross-CU references of subprogram declarations, so it
doesn't seem like definitions ought to be special.

This improves entry value evaluation and tail call frame synthesis in
the LTO setting. During LTO, it's common for cross-module inlining to
produce a call in some CU A where the callee resides in a different CU,
and there is no declaration subprogram for the callee anywhere. In this
case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram
in order to fill in the call site tag. That empty 'definition' defeats
entry value evaluation etc., because the debugger can't figure out what
it means.

As a follow-up, maybe we could add a DWARF verifier check that a
DW_TAG_subprogram at least has a DW_AT_name attribute.

Update:

Reland with a fix to create a declaration DIE when the declaration is
missing from the CU's retainedTypes list. The declaration is left out
of the retainedTypes list in two cases:

1) Re-compiling pre-r266445 bitcode (in which declarations weren't added
   to the retainedTypes list), and
2) Doing LTO function importing (which doesn't update the retainedTypes
   list).

It's possible to handle (1) and (2) by modifying the retainedTypes list
(in AutoUpgrade, or in the LTO importing logic resp.), but I don't see
an advantage to doing it this way, as it would cause more DWARF to be
emitted compared to creating the declaration DIEs lazily.

Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a
ReleaseLTO-g build of the test suite.

rdar://46577651, rdar://57855316, rdar://57840415

Differential Revision: https://reviews.llvm.org/D70350
2019-12-20 15:26:31 -08:00
Yury Delendik adf7a0a558 [WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly operands locations
Extends the DWARF expression language to express the locations of locals and
globals (via target-index operands for now; possible variants are non-virtual
registers or address spaces).

The WebAssemblyExplicitLocals pass can replace virtual registers with
target-index operands at the point where the WebAssembly backend introduces
{get,set,tee}_local instructions in place of the corresponding virtual registers.

Reviewed By: aprantl, dschuff

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D52634
2019-12-20 14:39:05 -08:00
Jakub Kuderski c431c407eb [InstCombine] Improve infinite loop detection
Summary:
This patch limits the default number of iterations performed by InstCombine. It also exposes a new option that specifies how many iterations are considered to be stuck in an infinite loop.

Based on experiments performed on real-world C++ programs, InstCombine seems to perform at most ~8-20 iterations, so treating 1000 iterations as an infinite loop seems like a safe choice. See D71145 for details.

The two limits can be specified via command line options.
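
Sketching the idea in isolation (the identifiers below are illustrative, not InstCombine's actual option names or code): keep iterating to a fixed point, but stop at a configurable maximum and treat an implausibly high count as an infinite-loop bug:

  #include <functional>

  // Returns the number of iterations actually performed.
  unsigned runToFixedPoint(const std::function<bool()> &RunOnePass,
                           unsigned MaxIterations,
                           unsigned InfiniteLoopThreshold) {
    unsigned Iteration = 0;
    while (Iteration < MaxIterations && RunOnePass()) {
      ++Iteration;
      if (Iteration >= InfiniteLoopThreshold)
        break; // in the real pass this would be reported as a probable bug
    }
    return Iteration;
  }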

Reviewers: spatel, lebedev.ri, nikic, xbolva00, grosser

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71673
2019-12-20 16:15:04 -05:00
Adrian Prantl 44b4b833ad Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot
This is a purely cosmetic change that is NFC in terms of the binary
output. It bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.

This attribute only appears in Clang module debug info.

Differential Revision: https://reviews.llvm.org/D71722
2019-12-20 13:11:17 -08:00
Bill Wendling dc03b960d0 Add parentheses to silence warning 2019-12-20 12:52:36 -08:00
Philip Reames c148e2e2ef More style cleanups following rG14fc20ca6282 [NFC]
Demote member functions to static functions where possible
Use early continue/early return to reduce nesting
Clarify comments slightly.
Reuse a previously defined expression in one case.
2019-12-20 12:19:33 -08:00
Philip Reames 4024d49edc Fix a memory leak introduced w/the instruction padding support in rG14fc20ca6282
Should have caught this in review, but only noticed when addressing post-commit style items.  We were creating a new instance of the X86MCInstrInfo class and then never reclaiming the memory.  This wasn't even conditional on the new off-by-default flags, so it was an unconditional leak.
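
As a generic illustration of the pattern (not the actual X86 code; the type name is a stand-in), an owning smart pointer is one common way to avoid this kind of unconditional leak:

  #include <memory>

  struct InstrInfoLike {};

  void leaky() {
    auto *Info = new InstrInfoLike(); // never deleted: leaks on every call
    (void)Info;
  }

  void scoped() {
    auto Info = std::make_unique<InstrInfoLike>(); // reclaimed automatically
    (void)Info;
  }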
2019-12-20 12:04:07 -08:00
Philip Reames 14fc20ca62 Align branches within 32-Byte boundary (NOP padding)
WARNING: If you're looking at this patch because you're looking for a full
performance mitigation of the Intel JCC Erratum, this is not it!

This is a preliminary patch on the path towards mitigating the performance
regressions caused by Intel's microcode update for the Jump Conditional Code
Erratum.  For context, see:
https://www.intel.com/content/www/us/en/support/articles/000055650.html

The patch adds the required assembler infrastructure and command line options
needed to exercise the logic for INTERNAL TESTING.  These are NOT public flags,
and should not be used for anything other than LLVM's own testing/debugging
purposes.  They are likely to change both in spelling and meaning.

WARNING: This patch is knowingly incorrect in some corner cases.  We need, and
do not yet provide, a mechanism to selectively enable/disable the padding.
Conversation on this will continue in parallel with work on extending this
infrastructure to support prefix padding.

The goal here is to have the assembler align specific instructions such that
they neither cross nor end at a 32-byte boundary.  The impacted instructions are:
a. Conditional jump.
b. Fused conditional jump.
c. Unconditional jump.
d. Indirect jump.
e. Ret.
f. Call.

The new options for llvm-mc are:
    -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary.
    -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align.

A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit
NOP to align the fused/unfused branch.

alignBranchesBegin inserts MCBoundaryAlignFragment before instructions,
alignBranchesEnd marks the end of the branch to be aligned,
relaxBoundaryAlign grows or shrinks the NOP padding to align the target branch.

Nop padding is disabled when the instruction may be rewritten by the linker,
such as TLS Call.

Process Note: I am landing a patch by skan as it has been LGTMed, and
continuing to iterate on the review is simply slowing us down at this point.
We can and will continue to iterate in tree.

Patch By: skan
Differential Revision: https://reviews.llvm.org/D70157
2019-12-20 11:35:50 -08:00
Fangrui Song e8054f0933 [PPC32] Emit R_PPC_PLTREL24 for calls to dso_local ifunc
  static void *ifunc(void) __attribute__((ifunc("resolver")));
  void foo() { ifunc(); }

The relocation produced by the ifunc() call:

1. gcc -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
2. gcc -msecure-plt -PIE => R_PPC_PLTREL24 r_addend=0x8000
3. clang -msecure-plt -fPIC => R_PPC_PLTREL24 r_addend=0x8000
4. clang -msecure-plt -fPIE => R_PPC_REL24

4 is incorrect. The R_PPC_REL24 needs a call stub due to ifunc. If this
relocation is mixed with other R_PPC_PLTREL24(r_addend=0x8000) in a
function, both GNU ld and lld (after D71621 fix) may produce a wrong
result.

This patch fixes 4 to use R_PPC_PLTREL24, which matches GCC.
Both GNU ld and lld (after D71621) will be happy.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D71649
2019-12-20 11:32:02 -08:00
Craig Topper de2378b4f3 [X86] Fix a KNL miscompile caused by combineSetCC swapping LHS/RHS variables before a later use.
The setcc operands are copied into LHS and RHS variables at the top of the function. We also capture the condition code.

A later piece of code swaps the operands and changes the CC variable as part of a canonicalization to make some other checks simpler. But we might not make the transform we canonicalized for. So we continue on through the function where we can use the swapped LHS/RHS variables and access the original condition code operand instead of the modified CC variable. This leads to a setcc being created with the original condition code, but with swapped operands.

To mitigate this, this patch does a couple of things. The LHS/RHS/CC variables are made const to keep them from being modified like this again. The transform that needs the swap now uses temporary copies of the variables. And the transform that used the original condition code operand has been altered to use the CC variable we cached originally. Either of these changes is enough to fix the issue, but we do both to make this code very safe.

I also considered rewriting the swap code in some way to check both permutations without explicitly swapping or needing temporary variables, but held off on that.
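
A generic sketch of the hazard and the chosen mitigation (illustrative only, not the DAG-combine code): the cached copies are made const, and the canonicalization works on temporaries, so later code cannot accidentally see swapped values:

  enum class Cond { LT, GT };
  struct Value { int Id; };

  Cond swapCond(Cond C) { return C == Cond::LT ? Cond::GT : Cond::LT; }

  void combine(Value LHS0, Value RHS0, Cond CC0, bool TryCanonicalForm) {
    const Value LHS = LHS0;   // const prevents the in-place swap that caused the bug
    const Value RHS = RHS0;
    const Cond CC = CC0;

    if (TryCanonicalForm) {
      Value L = RHS, R = LHS; // temporaries for the swapped form
      Cond C = swapCond(CC);
      (void)L; (void)R; (void)C; // ... attempt the canonical transform here ...
    }

    (void)LHS; (void)RHS; (void)CC; // later code still sees the original values
  }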

Differential Revision: https://reviews.llvm.org/D71736
2019-12-20 11:24:45 -08:00
Danilo Carvalho Grael 15bfd2cd54 [AArch64][SVE] Replace integer immediate intrinsics with splat vector variant
Summary: Replace the integer immediate intrinsics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics.

Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71614
2019-12-20 13:52:19 -05:00
Jonas Paulsson 9fcebad5e5 [SystemZ] Add a mapping from "select register" to "load on condition" (2-addr).
The SELR(Mux) instructions can be converted to two-address form as LOCR(Mux)
instructions whenever one of the sources is the same reg as the dest. By adding
this mapping in getTwoOperandOpcode(), we get:

- Two-address hints in getRegAllocationHints() for select register
  instructions.

- No need anymore for special handling in SystemZShortenInst.cpp -
  shortenSelect() removed.

The two-address hints are now added before the GRX32 hints, which should be
preferred.

Review: Ulrich Weigand
https://reviews.llvm.org/D68870
2019-12-20 10:44:58 -08:00
Evgenii Stepanov b538a2aa07 llvm-symbolizer: support DW_FORM_loclistx locations.
Summary:
With -gdwarf-5 local variable locations are emitted as DW_FORM_loclistx
form instead of the regular DW_FORM_sec_offset. Teach
DWARFDie::getLocations to understand the new format and use it in
llvm-symbolizer "FRAME" command.

Reviewers: pcc, jdoerfert

Subscribers: srhines, aprantl, hiraditya, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70756
2019-12-20 10:36:14 -08:00
Jordan Rupprecht 02a6b0bc3b Temporarily revert "Reapply [LVI] Normalize pointer behavior" and "[LVI] Restructure caching"
This reverts commits 7e18aeba50 (D70376) 21fbd5587c (D69914) due to increased memory usage.
2019-12-20 10:25:57 -08:00
Jonas Paulsson 3174683e21 [SystemZ] Bugfix and improve the handling of CC values.
It was recently discovered that the handling of CC values was actually broken
since overflow was not properly handled (the 'nsw' flag was not checked).

Add and sub instructions now have a new target specific instruction flag
named SystemZII::CCIfNoSignedWrap. It means that the CC result can be used
instead of a compare with 0, but only if the instruction has the 'nsw' flag
set.

This patch also adds the improvements of conversion to logical instructions
and the analyzing of add with immediates, to be able to eliminate more
compares.

Review: Ulrich Weigand
https://reviews.llvm.org/D66868
2019-12-20 10:20:23 -08:00
Victor Campos 2ff5a596cb Revert "[ARM] Improve codegen of volatile load/store of i64"
This reverts commit bbcf1c3496.
2019-12-20 18:08:24 +00:00
Ulrich Weigand ede8293d7d [SystemZ][FPEnv] Enable strict vector FP extends/truncations
The back-end currently has special DAGCombine code to detect
cases where two floating-point extend or truncate operations
can be combined into a single vector operation.

This patch extends that support to also handle strict FP operations.

Note that currently only the case where both operations have the
same input chain are supported.  This already suffices to cover
the common case where the operations result from scalarizing a
non-legal vector type.  More general cases can be supported in
the future.
2019-12-20 15:36:56 +01:00
Paul Walker 6cba90dc4d [AArch64][SVE] Correct intrinsics and patterns for logical predicate instructions
In general, SVE intrinsics are considered predicated and merging,
with everything else having suitable decoration.  For predicated
zeroing operations (like the predicate logical instructions) we
use the "_z" suffix.  After this change all intrinsics use their
expected names (i.e. orr instead of or and eor instead of xor).

I've removed intrinsics and patterns for condition code setting
instructions as that data is not returned as part of the intrinsic.
The expectation is to ask for a cc flag explicitly.

For example:
  a = and_z(pg, p1, p2)
  cc = ptest_<flag>(pg, a)

With the code generator expected to use "s" variants of instructions
when available.

Differential Revision: https://reviews.llvm.org/D71715
2019-12-20 14:22:27 +00:00
Tom Weaver 453dc4d7ec [OPT-DBG] Teach DbgEntityHistoryCalculator about meta-instructions.
The calculator was considering instructions such as KILLs as clobbers
of a physical address. This is wrong as meta instructions such as KILLs
produce no output in the final program and thus don't clobber or change
any physical location's value. As a result they're safe to ignore whilst
calculating location list ranges.

reviewers: aprantl, vsk

diff revision: https://reviews.llvm.org/D70497

fixes: https://bugs.llvm.org/show_bug.cgi?id=38753
2019-12-20 14:03:34 +00:00
Ayal Zaks e498be5738 [LV] Strip wrap flags from vectorized reductions
A sequence of additions or multiplications that is known not to wrap may wrap
if its order is changed (i.e., reassociated). Therefore, when vectorizing
integer sum or product reductions, their no-wrap flags need to be removed.
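
A small illustration of why reordering is unsafe (the values are purely illustrative): evaluated left to right the sum below stays in range, but the reassociated grouping would overflow, so the no-wrap flags cannot be kept:

  #include <cstdint>
  #include <cstdio>

  int main() {
    int64_t a = -1, b = INT64_MAX, c = 1;
    int64_t ok = (a + b) + c; // (-1 + MAX) + 1 == MAX, no overflow
    // a + (b + c) would evaluate b + c == MAX + 1 first, which overflows,
    // so a reassociated (vectorized) sum must drop the nsw/nuw flags.
    std::printf("%lld\n", static_cast<long long>(ok));
    return 0;
  }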

Fixes PR43828

Patch by Denis Antrushin

Differential Revision: https://reviews.llvm.org/D69563
2019-12-20 14:48:53 +02:00
Cullen Rhodes 974f00a436 [AArch64][SVE] Fold constant multiply of element count
Summary:
E.g.

  %0 = tail call i64 @llvm.aarch64.sve.cntw(i32 31)
  %mul = mul i64 %0, <const>

Should emit:

  cntw    x0, all, mul #<const>

For <const> in the range 1-16.

Patch by Kerry McLaughlin

Reviewers: sdesmalen, huntergr, dancgr, rengolin, efriedma

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71014
2019-12-20 11:58:00 +00:00
Andrzej Warzynski be2b7ea89a [AArch64][SVE] Add intrinsics for saturating scalar arithmetic
Summary:
The following intrinsics are added:
  * @llvm.aarch64.sve.sqdec{b|h|w|d|p}
  * @llvm.aarch64.sve.sqinc{b|h|w|d|p}
  * @llvm.aarch64.sve.uqdec{b|h|w|d|p}
  * @llvm.aarch64.sve.uqinc{b|h|w|d|p}

For every intrinsic there are scalar variants (with an n32 or n64 suffix) and
vector variants (no suffix).

Reviewers: sdesmalen, rengolin, efriedma

Reviewed By: sdesmalen, efriedma

Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71252
2019-12-20 11:09:40 +00:00
Cullen Rhodes 3f9005eb89 Recommit "[AArch64][SVE] Add permutation and selection intrinsics"
Recommit 23c28c4043 (reverted in
dcb48f50bd) with a fix for an assert
"Request for a fixed size on a scalable object" being triggered in
`LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the
TypeSize object.
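
A minimal sketch of the fix described above (simplified; the real code queries the sizes of scalable vector types via DataLayout):

  #include "llvm/Support/TypeSize.h"
  #include <cstdint>

  uint64_t minSizeInBits(llvm::TypeSize TS) {
    // getFixedSize() asserts "Request for a fixed size on a scalable object"
    // when TS is scalable; getKnownMinSize() is valid for both cases.
    return TS.getKnownMinSize();
  }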
2019-12-20 10:45:17 +00:00
Andrzej Warzynski 88a973cf68 [AArch64][SVE] Add intrinsics for binary narrowing operations
Summary:
The following intrinsics for binary narrowing shift right operations are
added:
  * @llvm.aarch64.sve.shrnb
  * @llvm.aarch64.sve.uqshrnb
  * @llvm.aarch64.sve.sqshrnb
  * @llvm.aarch64.sve.sqshrunb
  * @llvm.aarch64.sve.uqrshrnb
  * @llvm.aarch64.sve.sqrshrnb
  * @llvm.aarch64.sve.sqrshrunb
  * @llvm.aarch64.sve.shrnt
  * @llvm.aarch64.sve.uqshrnt
  * @llvm.aarch64.sve.sqshrnt
  * @llvm.aarch64.sve.sqshrunt
  * @llvm.aarch64.sve.uqrshrnt
  * @llvm.aarch64.sve.sqrshrnt
  * @llvm.aarch64.sve.sqrshrunt

Reviewers: sdesmalen, rengolin, efriedma

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71552
2019-12-20 10:20:30 +00:00
Sam Parker acbc9aed72 [ARM][MVE] Fixes for tail predication.
1) Fix an issue with the incorrect value being used for the number of
   elements being passed to [d|w]lstp. We were trying to check that
   the value was available at LoopStart, but this doesn't consider
   that the last instruction in the block could also define the
   register. Two helpers have been added to RDA for this.
2) Insert some code to now try to move the element count def or the
   insertion point so that we can perform more tail predication.
3) Related to (1), the same off-by-one could prevent us from
   generating a low-overhead loop when a mov lr could have been
   the last instruction in the block.
4) Fix up some instruction attributes so that not all the
   low-overhead loop instructions are labelled as branches and
   terminators - as this is not true for dls/dlstp.

Differential Revision: https://reviews.llvm.org/D71609
2019-12-20 09:34:18 +00:00
Sam Parker 4042518335 [ARM][MVE] Tail predicate in the presence of vcmp
Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing more than one instruction to define vpr, but each block must
somehow be predicated using the vctp. This leaves us with several
scenarios which need fixing up:
1) A VPT block which is only predicated by the vctp and has no
   internal vpr defs.
2) A VPT block which is only predicated by the vctp but has an
   internal vpr def.
3) A VPT block which is predicated upon the vctp as well as another
   vpr def.
4) A VPT block which is not predicated upon a vctp, but contains it
   and all instructions within the block are predicated upon it.

The changes needed are, for:
1) The easy one, just remove the vpst and unpredicate the
   instructions in the block.
2) Remove the vpst and unpredicate the instructions up to the
   internal vpr def. A new vpst needs to be inserted to predicate the
   remaining instructions.
3) Do nothing.
4) The vctp will be inside a vpt and the instruction will be removed,
   so adjust the size of the mask on the vpst.

Differential Revision: https://reviews.llvm.org/D71107
2019-12-20 08:42:11 +00:00
Sam Parker 4f0fe6b97e [ARM][MVE] Tail predicate bottom/top muls.
Add VMULL and VQDMULL variants to our tail predication white list.

Differential Revision: https://reviews.llvm.org/D71465
2019-12-20 08:33:01 +00:00
Craig Topper bf507d4259 [X86] Make EmitCmp into a static function and explicitly return chain result for STRICT_FCMP. NFCI
The only thing it's getting from the X86TargetLowering class is
the subtarget, which we can easily pass. This function only has
one call site now, so this might help the compiler inline it.

Explicitly return both the flag result and the chain result for
STRICT_FCMP nodes. This removes an assumption in the caller that
getValue(1) is the right way to get the chain.
2019-12-19 23:03:15 -08:00
Craig Topper 9b6fafa399 [X86] Directly call EmitTest in two places instead of creating a null constant and calling EmitCmp. NFCI
EmitCmp will just immediately call EmitTest and discard the null
constant only to have EmitTest create it again if it doesn't fold.

So just skip all that and go directly to EmitTest.
2019-12-19 23:03:06 -08:00
Lang Hames 07ac3145cc [Orc][LLJIT] Re-apply 298e183e81 (use JITLink for LLJIT where supported).
Patch d9220b580b fixed the underlying issue that caused 298e183e81 to fail.
2019-12-19 20:42:26 -08:00
Lang Hames d9220b580b [JITLink][MachO] Fix common symbol size plumbing.
This fixes the underlying bug that was exposed by 298e183e81.
2019-12-19 20:41:59 -08:00
River Riddle a77a290a4d [CommandLine] Add template instantiations of cl::parser for long and long long.
This allows cl::opt<int64_t>.
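
A minimal sketch of what this enables (the option name is made up for illustration):

  #include "llvm/Support/CommandLine.h"
  #include <cstdint>

  static llvm::cl::opt<int64_t>
      ExampleThreshold("example-threshold",
                       llvm::cl::desc("A 64-bit threshold value"),
                       llvm::cl::init(0));

  int main(int argc, char **argv) {
    llvm::cl::ParseCommandLineOptions(argc, argv);
    return ExampleThreshold > 0 ? 0 : 1;
  }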

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D71729
2019-12-19 17:01:22 -08:00
Mircea Trofin 93ac81cc9d [NFC][InlineCost] Simplify internal inlining cost interface
Summary:
All the use cases of CallAnalyzer use the same call site parameter both
to construct the CallAnalyzer and to pass to the analysis member.
This change removes this duplication.

Reviewers: davidxl, eraman, Jim

Reviewed By: davidxl

Subscribers: Jim, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71645
2019-12-19 15:32:15 -08:00