Commit Graph

26580 Commits

Author SHA1 Message Date
Jinsong Ji 7aafdef627 [MachineScheduler] checkResourceLimit boundary condition update
When we call checkResourceLimit in bumpCycle or bumpNode, and we
know the resource count has just reached the limit (the equations
 are equal). We should return true to mark that we are resource
limited for next schedule, or else we might continue to schedule
in favor of latency for 1 more schedule and create a schedule that
 actually overbook the resource.

When we call checkResourceLimit to estimate the resource limite before
scheduling, we don't need to return true even if the equations are
equal, as it shouldn't limit the schedule for it .

Differential Revision: https://reviews.llvm.org/D62345

llvm-svn: 362805
2019-06-07 14:54:47 +00:00
Matt Arsenault 94a609e343 TailDuplicator: Remove no-op analyzeBranch call
This could fail, which looked concerning. However nothing was actually
using the results of this. I assume this was intended to use the
anti-feature of analyzeBranch of removing instructions, but wasn't
actually calling it with AllowModify = true.

Fixes bug 42162.

llvm-svn: 362800
2019-06-07 13:33:34 +00:00
Sam Parker 67f9dc60b8 Fix for lld buildbot
Removed unused (in non-debug builds) variable.

llvm-svn: 362775
2019-06-07 08:04:18 +00:00
Sam Parker c5ef502ee8 [CodeGen] Generic Hardware Loop Support
Patch which introduces a target-independent framework for generating
hardware loops at the IR level. Most of the code has been taken from
PowerPC CTRLoops and PowerPC has been ported over to use this generic
pass. The target dependent parts have been moved into
TargetTransformInfo, via isHardwareLoopProfitable, with
HardwareLoopInfo introduced to transfer information from the backend.
    
Three generic intrinsics have been introduced:
- void @llvm.set_loop_iterations
  Takes as a single operand, the number of iterations to be executed.
- i1 @llvm.loop_decrement(anyint)
  Takes the maximum number of elements processed in an iteration of
  the loop body and subtracts this from the total count. Returns
  false when the loop should exit.
- anyint @llvm.loop_decrement_reg(anyint, anyint)
  Takes the number of elements remaining to be processed as well as
  the maximum numbe of elements processed in an iteration of the loop
  body. Returns the updated number of elements remaining.

llvm-svn: 362774
2019-06-07 07:35:30 +00:00
Alexey Lapshin b9f1e7b16e [DebugInfo] Incorrect debug info record generated for loop counter.
Incorrect Debug Variable Range was calculated while "COMPUTING LIVE DEBUG VARIABLES" stage.
Range for Debug Variable("i") computed according to current state of instructions
inside of basic block. But Register Allocator creates new instructions which were not taken
into account when Live Debug Variables computed. In the result DBG_VALUE instruction for
the "i" variable was put after these newly inserted instructions. This is incorrect.
Debug Value for the loop counter should be inserted before any loop instruction.

Differential Revision: https://reviews.llvm.org/D62650

llvm-svn: 362750
2019-06-06 21:19:39 +00:00
Jason Liu 60ec248148 [AIX] Implement function descriptor on SDAG
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
 * A function descriptor (Name)
 * A function entry point (.Name)

The descriptor structure on AIX is the same as those in the ELF V1 ABI:
 * The address of the entry point of the function.
 * The TOC base address for the function.
 * The environment pointer.

The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
 function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".

Which symbol gets referenced depends on the context:
 * Taking the address of the function references the descriptor symbol.
 * Calling the function references the entry point symbol.

(2) Speaking of implementation on AIX, for direct function call target, we
 create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
 replace original TargetGlobalAddress SDNode. Then down the path, we can
 take advantage of this MCSymbol.

Patch by: Xiangling_L

Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara

Differential Revision: https://reviews.llvm.org/D62532

llvm-svn: 362735
2019-06-06 19:13:36 +00:00
Simon Pilgrim 842c7792aa [DAGCombine] MergeConsecutiveStores - improve non-temporal load\store handling (PR42123)
This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores:

1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another.

2 - The merged load\store node must retain the non-temporal flag.

Differential Revision: https://reviews.llvm.org/D62910

llvm-svn: 362723
2019-06-06 17:04:13 +00:00
Simon Pilgrim da993d08c8 [DAGCombine] Cleanup isNegatibleForFree/GetNegatedExpression. NFCI.
Prep work for PR42105 - clang-format, use auto for cast and merge nested if()s

llvm-svn: 362695
2019-06-06 10:21:18 +00:00
Petar Avramovic faaa2b5d21 [MIPS GlobalISel] Select floor and ceil
Select G_FFLOOR and G_FCEIL for MIPS32.

Differential Revision: https://reviews.llvm.org/D62901

llvm-svn: 362688
2019-06-06 09:02:24 +00:00
Ulrich Weigand 6c5d5ce551 Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point
intrinsics are currently never passed to the target back-end, which makes
it impossible to handle them correctly (e.g. mark instructions are depending
on a floating-point status and control register, or mark instructions as
possibly trapping).

This patch allows the target to use setOperationAction to switch the action
on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code
will stop converting the STRICT nodes to regular floating-point nodes, but
instead pass the STRICT nodes to the target using normal SelectionDAG
matching rules.

To avoid having the back-end duplicate all the floating-point instruction
patterns to handle both strict and non-strict variants, we make the MI
codegen explicitly aware of the floating-point exceptions by introducing
two new concepts:

- A new MCID flag "mayRaiseFPException" that the target should set on any
  instruction that possibly can raise FP exception according to the
  architecture definition.
- A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI
  instruction resulting from expansion of any constrained FP intrinsic.

Any MI instruction that is *both* marked as mayRaiseFPException *and*
FPExcept then needs to be considered as raising exceptions by MI-level
codegen (e.g. scheduling).

Setting those two new flags is straightforward. The mayRaiseFPException
flag is simply set via TableGen by marking all relevant instruction
patterns in the .td files.

The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes
in the SelectionDAG, and gets inherited in the MachineSDNode nodes created
from it during instruction selection. The flag is then transfered to an
MIFlag when creating the MI from the MachineSDNode. This is handled just
like fast-math flags like no-nans are handled today.

This patch includes both common code changes required to implement the
new features, and the SystemZ implementation.

Reviewed By: andrew.w.kaylor

Differential Revision: https://reviews.llvm.org/D55506

llvm-svn: 362663
2019-06-05 22:33:10 +00:00
Tim Northover 607c8a9d14 IR: make getParamByValType Just Work. NFC.
Most parts of LLVM don't care whether the byval type is derived from an
explicit Attribute or from the parameter's pointee type, so it makes
sense for the main access function to just return the right value.

The very few users who do care (only BitcodeReader so far) can find out
how it's specified by accessing the Attribute directly.

llvm-svn: 362642
2019-06-05 20:37:47 +00:00
Simon Pilgrim 77d6adc491 Fix shadow local variable warning. NFCI.
llvm-svn: 362622
2019-06-05 17:26:29 +00:00
Sanjay Patel ad62a3a299 [LoopUtils][SLPVectorizer] clean up management of fast-math-flags
Instead of passing around fast-math-flags as a parameter, we can set those
using an IRBuilder guard object. This is no-functional-change-intended.

The motivation is to eventually fix the vectorizers to use and set the
correct fast-math-flags for reductions. Examples of that not behaving as
expected are:
https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
D61802 (should be able to reduce with IR-level FMF)

Differential Revision: https://reviews.llvm.org/D62272

llvm-svn: 362612
2019-06-05 14:58:04 +00:00
Simon Pilgrim 5a81af547c [TargetLowering] SimplifyDemandedBits - pull out shift value type. NFCI.
Will be used more in an upcoming patch.

llvm-svn: 362595
2019-06-05 10:59:04 +00:00
Johannes Doerfert 6b432dca5d [SelectionDAG][FIX] Allow "returned" arguments to be bit-casted
Summary:
An argument that is return by a function but bit-casted before can still
be annotated as "returned". Make sure we do not crash for this case.

Reviewers: sunfish, stephenwlin, niravd, arsenm

Subscribers: wdng, hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59917

llvm-svn: 362546
2019-06-04 20:34:43 +00:00
Nemanja Ivanovic aed7227b71 Revert r362472 as it is breaking PPC build bots
The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots.
Reverting it to bring the bots back to green.

llvm-svn: 362539
2019-06-04 18:48:43 +00:00
Craig Topper 09a4415803 [DAGCombiner][X86] Fold (not (neg X)) -> (add X, -1)
This is a special case of a more general transform (not (sub Y, X)) -> (add X, ~Y). InstCombine knows the general form. I've restricted to the special case to fix the motivating case PR42118. I tried handling any case where Y was constant, but got some changes on some Mips tests that I couldn't quickly prove where beneficial.

Fixes PR42118

Differential Revision: https://reviews.llvm.org/D62828

llvm-svn: 362533
2019-06-04 17:44:18 +00:00
Sanjay Patel 1e63dd0b44 [SelectionDAG][x86] limit post-legalization store merging by type
The proposal in D62498 showed that x86 would benefit from vector
store splitting, but that may conflict with the generic DAG
combiner's store merging transforms.

Add memory type to the existing TLI hook that enables the merging
transforms, so we can limit those changes to scalars only for x86.

llvm-svn: 362507
2019-06-04 15:15:59 +00:00
Roman Lebedev 3dce0326fe [DAGCombine][X86][AArch64][MIPS][LANAI] (C - x) - y -> C - (x + y) fold (PR41952)
Summary:
This *might* be the last fold for `sink-addsub-of-const.ll`, but i'm not sure yet.

As far as i can tell, there are no regressions here (ignoring x86-32),
all changes are either good or neutral.

This, almost surprisingly to me, fixes the motivational tests (in `shift-amount-mod.ll`)
`@reg32_lshr_by_sub_from_negated` from [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/vMd3

Reviewers: RKSimon, t.p.northover, craig.topper, spatel, efriedma

Reviewed By: RKSimon

Subscribers: sdardis, javed.absar, arichardson, kristof.beyls, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62774

llvm-svn: 362488
2019-06-04 11:06:21 +00:00
Roman Lebedev be6ce7b3f2 [DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C fold
Summary:
All changes except ARM look **great**.
https://rise4fun.com/Alive/R2M

The regression `test/CodeGen/ARM/addsubcarry-promotion.ll`
is recovered fully by D62392 + D62450.

Reviewers: RKSimon, craig.topper, spatel, rogfer01, efriedma

Reviewed By: efriedma

Subscribers: dmgreen, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62266

llvm-svn: 362487
2019-06-04 11:06:08 +00:00
Simon Pilgrim ad298f86b7 [SelectionDAG] ComputeNumSignBits - support constant pool values from target
As I mentioned on D61887 we don't get many hits on ComputeNumSignBits as we did on computeKnownBits.

The case we do get is interesting though - it allows us to use the 'ConditionalNegate' combine in combineLogicBlendIntoPBLENDV to remove a select.

It comes too late for SSE41 (BLENDV) cases, but SSE2 tests can hit it now. We should probably try to make use of this for SSE41+ targets as well - avoiding variable blends is usually a good idea. I'll investigate as a followup.

Differential Revision: https://reviews.llvm.org/D62777

llvm-svn: 362486
2019-06-04 10:49:06 +00:00
Simon Pilgrim 3178546a27 [SelectionDAG] ComputeNumSignBits - clang-format + improve *EXTLOAD comments. NFCI.
Pre-commit requested for D62777.

llvm-svn: 362485
2019-06-04 10:17:56 +00:00
Simon Pilgrim 3018d505a3 [SelectionDAG] Add fpto[us]i(undef) --> undef constant fold
Follow up to D62807.

Differential Revision: https://reviews.llvm.org/D62811

llvm-svn: 362483
2019-06-04 10:04:55 +00:00
QingShan Zhang 11de0e71b0 [DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores
This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c

static void store64(u64 x, unsigned char* y)
{
    for(int i = 0; i != 8; ++i)
        y[i] = (x >> ((7-i) * 8)) & 255;
}

static u64 load64(const unsigned char* y)
{
    u64 res = 0;
    for(int i = 0; i != 8; ++i)
        res |= (u64)(y[i]) << ((7-i) * 8);
    return res;
}
The load64 has been implemented by https://reviews.llvm.org/D26149
This patch is trying to implement the store pattern.

Match a pattern where a wide type scalar value is stored by several narrow
stores. Fold it into a single store or a BSWAP and a store if the targets
supports it.

Assuming little endian target:
i8 *p = ...
i32 val = ...
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
p[3] = (val >> 24) & 0xFF;

>
*((i32)p) = val;

i8 *p = ...
i32 val = ...
p[0] = (val >> 24) & 0xFF;
p[1] = (val >> 16) & 0xFF;
p[2] = (val >> 8) & 0xFF;
p[3] = (val >> 0) & 0xFF;

>
*((i32)p) = BSWAP(val);

Differential Revision: https://reviews.llvm.org/D61843

llvm-svn: 362472
2019-06-04 08:53:53 +00:00
Michael Berg 6ff978ee05 Propagate fmf for setcc in SDAG for select folds
llvm-svn: 362448
2019-06-03 21:53:26 +00:00
Michael Berg 0b7f98da65 Propagate fmf for setcc/select folds
Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG.

Reviewers: qcolombet, spatel

Reviewed By: qcolombet

Subscribers: nemanjai, jsji

Differential Revision: https://reviews.llvm.org/D62552

llvm-svn: 362439
2019-06-03 19:12:15 +00:00
Matt Arsenault 8dbeb9256c TTI: Improve default costs for addrspacecast
For some reason multiple places need to do this, and the variant the
loop unroller and inliner use was not handling it.

Also, introduce a new wrapper to be slightly more precise, since on
AMDGPU some addrspacecasts are free, but not no-ops.

llvm-svn: 362436
2019-06-03 18:41:34 +00:00
Simon Pilgrim cb7e4e8193 [SelectionDAG] Add [us]itofp(undef) --> 0 constant fold (PR39205)
We were missing this fold in the DAG, which I've copied directly from llvm::ConstantFoldCastInstruction

Differential Revision: https://reviews.llvm.org/D62807

llvm-svn: 362397
2019-06-03 13:02:07 +00:00
Nikola Prica 2d0106a110 [LiveDebugValues] Close range for previous variable's location when adding newly deduced location
When LiveDebugValues deduces new variable's location from spill, restore or
register copy instruction it should close old variable's location. Otherwise
we can have multiple block output locations for same variable. That could lead
to inserting two DBG_VALUEs for same variable to the beginning of the successor
block which results to ignoring of first DBG_VALUE.

Reviewers: aprantl, jmorse, wolfgangp, dstenb

Reviewed By: aprantl

Subscribers: probinson, asowda, ivanbaev, petarj, djtodoro

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D62196

llvm-svn: 362373
2019-06-03 09:48:29 +00:00
Florian Hahn e71963c850 Recommit r360171: [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor.
If we hit the limit, we do expand the outstanding tokenfactors.
Otherwise, we might drop nodes with users in the unexpanded
tokenfactors. This fixes the crashes reported by Jordan Rupprecht.

Reviewers: niravd, spatel, craig.topper, rupprecht

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D62633

llvm-svn: 362350
2019-06-03 01:30:19 +00:00
Craig Topper 50b35caf30 [DAGCombiner][X86] Fold away masked store and scatter with all zeroes mask.
Similar to what was done for masked load and gather.

llvm-svn: 362342
2019-06-02 22:52:38 +00:00
Craig Topper 5f79d74946 [X86] Add test cases for masked store and masked scatter with an all zeroes mask. Fix bug in ScalarizeMaskedMemIntrin
Need to cast only to Constant instead of ConstantVector to allow
ConstantAggregateZero.

llvm-svn: 362341
2019-06-02 22:52:34 +00:00
Craig Topper a7bc31ebc6 [DAGCombiner] Replace masked loads with a zero mask with the passthru value
Similar to what was recently done for gathers in r362015.

llvm-svn: 362337
2019-06-02 18:58:46 +00:00
Simon Pilgrim 7a869e7036 [DAGCombine] Fold insert_subvector(bitcast(x),bitcast(y),c1) -> bitcast(insert_subvector(x,y),c2)
Move this combine from x86 into generic DAGCombine, which currently only manages cases where the bitcast is between types of the same scalarsize.

Differential Revision: https://reviews.llvm.org/D59188

llvm-svn: 362324
2019-06-02 14:42:11 +00:00
Simon Pilgrim ffb4d2bff7 [DAG] isBitwiseNot / isConstOrConstSplat - add support for build vector undefs + truncation (PR41020)
Add (opt-in) support for implicit truncation to isConstOrConstSplat, which allows us to match truncated 'all ones' cases in isBitwiseNot.

PR41020 compares against using ISD::isBuildVectorAllOnes() instead, but that predicate silently accepts any UNDEF elements in the build vector which might not be what we want in isBitwiseNot - so I've added an opt-in 'AllowUndefs' flag that is set to false by default but will allow us to enable it on individual cases where its safe.

Differential Revision: https://reviews.llvm.org/D62783

llvm-svn: 362323
2019-06-02 11:56:39 +00:00
Simon Pilgrim 88522ce388 [TargetLowering] SimplifyDemandedBits - don't use OriginalDemanded variables in analysis.
These might have been replaced in multiple use cases.

llvm-svn: 362322
2019-06-02 10:12:55 +00:00
Simon Pilgrim 30a6caa3e7 [TargetLowering] SimplifyDemandedVectorElts - use same arg names as SimplifyDemandedBits. NFCI.
Helps with debugging as we recurse between them.

llvm-svn: 362321
2019-06-02 10:03:56 +00:00
Craig Topper f58ef87bb7 [DAGCombiner] Replace two unchecked dyn_casts with casts.
The results of the dyn_casts were immediately dereferenced on the next line
so they had better not be null.

I don't think there's any way for these dyn_casts to fail, so use a cast
of adding null check.

llvm-svn: 362315
2019-06-02 03:31:01 +00:00
Craig Topper 78c794a70b [X86] Fix several places that weren't passing what they though they were to MachineInstr::print
Over a year ago, MachineInstr gained a fourth boolean parameter that occurs
before the TII pointer. When this happened, several places started accidentally
passing TII into this boolean parameter instead of the TII parameter.

llvm-svn: 362312
2019-06-02 01:36:48 +00:00
Eli Friedman d8e8722791 [CodeGen] Fix hashing for MO_ExternalSymbol MachineOperands.
We were hashing the string pointer, not the string, so two instructions
could be identical (isIdenticalTo), but have different hash codes.

This showed up as a very rare, non-deterministic assertion failure
rehashing a DenseMap constructed by MachineOutliner.  So there's no
"real" testcase, just a unittest which checks that the hash function
behaves correctly.

I'm a little scared fixing this is going to cause a regression in
outlining or MachineCSE, but hopefully we won't run into any issues.

Differential Revision: https://reviews.llvm.org/D61975

llvm-svn: 362281
2019-06-01 00:08:54 +00:00
Craig Topper bc9e04d0c3 [SelectionDAG] Make the code in mutateStrictFPToFP less aware of how many operands each node has. NFCI
Just copy all of the operands except the chain and call MorphNode on that.
This removes the IsUnary and IsTernary flags.

Also always get the result type from the result type of the original
nodes. Previously we got it from the operand except for two nodes
where that didn't work.

llvm-svn: 362269
2019-05-31 22:18:45 +00:00
Nick Desaulniers 103bd108a7 [RegisterCoalescer] fix potential use of undef value. NFC
Summary:
Fixes a warning produced from scan-build (llvm.org/reports/scan-build/),
further warnings found by annotation isMoveInstr [[nodiscard]].

isMoveInstr potentially does not assign to its parameters, so if they
were uninitialized, they will potentially stay uninitialized.  It seems
most call sites pass references to uninitialized values, then use them
without checking the return value.

Reviewers: wmi

Reviewed By: wmi

Subscribers: MatzeB, qcolombet, hiraditya, tpr, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62109

llvm-svn: 362265
2019-05-31 21:20:13 +00:00
Puyan Lotfi 3ea6b24f41 [MIR-Canon] Don't do vreg skip for independent instructions if there are none.
We don't want to create vregs if there is nothing to use them for. That causes
verifier errors.

Differential Revision: https://reviews.llvm.org/D62740

llvm-svn: 362247
2019-05-31 17:34:25 +00:00
Kevin P. Neal ac79007205 Revert revert of r362112 with minor SystemZ test file corrections.
[FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes

This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.

Submitted by:	Drew Wock <drew.wock@sas.com>
Reviewed by:	Cameron McInally, Kevin P. Neal
Approved by:	Cameron McInally
Differential Revision:	https://reviews.llvm.org/D62546

llvm-svn: 362241
2019-05-31 16:32:12 +00:00
Jinsong Ji 18e7bf5c4d [MachinePipeliner][NFC] Add some debug log and statistics
This is to add some log and statistics for debugging

Differential Revision: https://reviews.llvm.org/D62165

llvm-svn: 362233
2019-05-31 15:35:19 +00:00
Puyan Lotfi 0d63cef180 [MIR-Canon] Skip the first N vreg names lazily.
This consolidates the vreg skip code into one function (SkipVRegs()).
SkipVRegs() now knows if it should skip as if it is the first initialization or
subsequent skips.

The first skip is also done the first time createVirtualRegister is called by
the cursor instead of by the cursor's constructor. This prevents verifier
errors on machine functions that have no vregs (where the verifier will
complain that there are vregs when the function uses none).

Differential Revision: https://reviews.llvm.org/D62717

llvm-svn: 362195
2019-05-31 06:02:38 +00:00
Puyan Lotfi 2a901401fe [MIR-Canon] Hardening propagateLocalCopies.
This is am almost NFC, it does the following:
- If there is no register class for a COPY's src or dst, bail.
- Fixes uses iterator invalidation bug.

Differential Revision: https://reviews.llvm.org/D62713

llvm-svn: 362191
2019-05-31 04:49:58 +00:00
Matt Arsenault 18659f84b2 MISched: Fix -misched-regpressure=0 if subreg liveness enabled
Test is waiting on fixing several more crashes in the AMDGPU scheduler
implementation with this.

llvm-svn: 362174
2019-05-30 23:31:36 +00:00
Francis Visoiu Mistrih 6ada11f134 [Remarks][NFC] Move the serialization to lib/Remarks
Separate the remark serialization to YAML from the LLVM Diagnostics.

This adds a new serialization abstraction: remarks::Serializer. It's
completely independent from lib/IR and it provides an easy way to
replace YAML by providing a new remarks::Serializer.

Differential Revision: https://reviews.llvm.org/D62632

llvm-svn: 362160
2019-05-30 21:45:59 +00:00
Puyan Lotfi daaecf98c9 [MIR-Canon] Fixing case where MachineFunction is empty.
In cases where the machine function is empty: bail on the RPO traversal.

Differential Revision: https://reviews.llvm.org/D62617

llvm-svn: 362158
2019-05-30 21:37:25 +00:00
Roman Lebedev 46511d75b5 [DAGCombine] Limit 'hoist add/sub binop w/ constant op' to non-opaque consts
I don't have a test case for these, but there is a test case for D62266
where, even after all the constant-folding patches, we still end up
with endless combine loop. Which makes sense, since we don't constant
fold for opaque constants.

llvm-svn: 362156
2019-05-30 21:10:37 +00:00
Roman Lebedev a4e3b50e26 [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold. Try 2
Summary:
Only vector tests are being affected here,
since subtraction by scalar constant is rewritten
as addition by negated constant.

No surprising test changes.

https://rise4fun.com/Alive/pbT

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62257

llvm-svn: 362146
2019-05-30 20:37:49 +00:00
Roman Lebedev 57aa36ff91 [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 3
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

llvm-svn: 362145
2019-05-30 20:37:39 +00:00
Roman Lebedev 63b4741534 [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold. Try 3
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

llvm-svn: 362144
2019-05-30 20:37:29 +00:00
Roman Lebedev 05ad5fd213 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 3
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

llvm-svn: 362143
2019-05-30 20:37:18 +00:00
Roman Lebedev 1d9ec7a81b [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 3
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

llvm-svn: 362142
2019-05-30 20:36:54 +00:00
Roman Lebedev 7eb8b5b5dd [DAGCombine] ((c1-A)-c2) -> ((c1-c2)-A) constant-fold
Summary: https://rise4fun.com/Alive/B0A

Reviewers: t.p.northover, RKSimon, spatel, craig.topper

Reviewed By: RKSimon

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62691

llvm-svn: 362135
2019-05-30 19:27:51 +00:00
Roman Lebedev 691b5e2ecc [DAGCombine] (A-C1)-C2 -> A-(C1+C2) constant-fold
Summary: https://rise4fun.com/Alive/Mb1M

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62689

llvm-svn: 362134
2019-05-30 19:27:42 +00:00
Roman Lebedev 0a3dbbcdfb [DAGCombine] (A+C1)-C2 -> A+(C1-C2) constant-fold
Summary:
Direct sibling of D62662, the root cause of the endless combine loop in D62257

https://rise4fun.com/Alive/d3W

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62664

llvm-svn: 362133
2019-05-30 19:27:32 +00:00
Roman Lebedev 9ff3159b4a [DAGCombine] Use FoldConstantArithmetic() to perform C2-(A+C1) -> (C2-C1)-A fold
Summary:
No tests change, and i'm not sure how to test this, but it's better safe than sorry.

Reviewers: spatel, RKSimon, craig.topper, t.p.northover

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62663

llvm-svn: 362132
2019-05-30 19:27:26 +00:00
Roman Lebedev cc9a9cf237 [DAGCombine] ((A-c1)+c2) -> (A+(c2-c1)) constant-fold
Summary:
This was the root cause of the endless combine loop in D62257

https://rise4fun.com/Alive/d3W

Reviewers: RKSimon, spatel, craig.topper, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62662

llvm-svn: 362131
2019-05-30 19:27:19 +00:00
Roman Lebedev ef95679741 [DAGCombine] Use FoldConstantArithmetic() to perform ((c1-A)+c2) -> (c1+c2)-A fold
Summary: No tests change, and i'm not sure how to test this, but it's better safe than sorry.

Reviewers: spatel, RKSimon, craig.topper, t.p.northover

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62661

llvm-svn: 362130
2019-05-30 19:27:10 +00:00
Tim Northover b7141207a4 Reapply: IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe
how many bytes a 'byval' parameter should occupy on the stack. This adds
a (for now) optional extra type parameter.

If present, the type must match the pointee type of the argument.

The original commit did not remap byval types when linking modules, which broke
LTO. This version fixes that.

Note to front-end maintainers: if this causes test failures, it's probably
because the "byval" attribute is printed after attributes without any parameter
after this change.

llvm-svn: 362128
2019-05-30 18:48:23 +00:00
Puyan Lotfi 0f4446b270 [MIR-Canon] Add support for rewriting VRegs that are typed but don't have an RC.
There were crashes (addrspace-memoperands.mir was only one of them) in MIR that
had operands that came from before register classes were set. With these
operands, creating a replacement vreg (for MIR-Canon's renaming) needs to use
the vreg type rather than the RegisterClass which is not present.

Differential Revision: https://reviews.llvm.org/D62543

llvm-svn: 362122
2019-05-30 18:06:28 +00:00
Kevin P. Neal 51ce0b196a Correct error in revert of r362112.
Differential Revision:	http://reviews.llvm.org/D62546

llvm-svn: 362118
2019-05-30 17:21:45 +00:00
Kevin P. Neal d3db7b40b0 Revert r362112, it broke the bots with the message "Unsupported vector argument or return type"
Differential Revision:	http://reviews.llvm.org/D62546

llvm-svn: 362117
2019-05-30 17:10:21 +00:00
Kevin P. Neal 2e1807678d [FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes
This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.

Submitted by:	Drew Wock <drew.wock@sas.com>
Reviewed by:	Cameron McInally, Kevin P. Neal
Approved by:	Cameron McInally
Differential Revision:	http://reviews.llvm.org/D62546

llvm-svn: 362112
2019-05-30 16:44:47 +00:00
Roman Lebedev 019d270e43 [DAGCombine] Revert of recommit of "binop-with-const hoisting" patches
I was looking into an endless combine loop the uncommitted follow-up patch
was causing, and it appears even these patches can exibit such an
endless loop. The root cause is that we try to hoist one binop (add/sub) with
constant operand, and if we get two such binops both of which are
eligible for this hoisting, we get stuck.

Some cases may highlight missing constant-folds.

Reverts r361871,r361872,r361873,r361874.

llvm-svn: 362109
2019-05-30 16:07:11 +00:00
Amy Huang 325003be02 CodeView - add static data members to global variable debug info.
Summary:
Add static data members to IR debug info's list of global variables
so that they are emitted as S_CONSTANT records.

Related to https://bugs.llvm.org/show_bug.cgi?id=41615.

Reviewers: rnk

Subscribers: aprantl, cfe-commits, llvm-commits, thakis

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62167

llvm-svn: 362038
2019-05-29 21:45:34 +00:00
Tim Northover 71ee3d0237 Revert "IR: add optional type to 'byval' function parameters"
The IRLinker doesn't delve into the new byval attribute when mapping types, and
this breaks LTO.

llvm-svn: 362029
2019-05-29 20:46:38 +00:00
Benjamin Kramer 107f8d9873 [DAGCombiner] Replace gathers with a zero mask with the passthru value
These can be created by the legalizer when splitting a larger gather.

See https://llvm.org/PR42055 for a motivating example.

Differential Revision: https://reviews.llvm.org/D62613

llvm-svn: 362015
2019-05-29 19:24:19 +00:00
Tim Northover 6e07f16fae IR: add optional type to 'byval' function parameters
When we switch to opaque pointer types we will need some way to describe
how many bytes a 'byval' parameter should occupy on the stack. This adds
a (for now) optional extra type parameter.

If present, the type must match the pointee type of the argument.

Note to front-end maintainers: if this causes test failures, it's probably
because the "byval" attribute is printed after attributes without any parameter
after this change.

llvm-svn: 362012
2019-05-29 19:12:48 +00:00
Sam McCall 4f09d9fcfa Qualify use of llvm::empty that's ambiguous with std::empty
llvm-svn: 361968
2019-05-29 15:02:16 +00:00
Richard Trieu c77aff7e17 Inline a variable into debug section to fix unused variable warning.
llvm-svn: 361927
2019-05-29 04:09:32 +00:00
Richard Trieu e8698ead9d Inline value into debug statement to avoid unused variable warning.
llvm-svn: 361924
2019-05-29 03:43:01 +00:00
Peter Collingbourne 31fda09b2d Add IR support, ELF section and user documentation for partitioning feature.
The partitioning feature was proposed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html

This is mostly just documentation. The feature itself will be contributed
in subsequent patches.

Differential Revision: https://reviews.llvm.org/D60242

llvm-svn: 361923
2019-05-29 03:29:01 +00:00
Jinsong Ji f6cb3bcb4c Support resource tracking with InstrSchedModel
The current design use DFA to do resource tracking in SMS,
and DFA only support InstrItins, and also has scaling limitation.

This patch extend SMS to allow Subtarget to use ProcResource in
InstrSchedModel instead.

Differential Revision: https://reviews.llvm.org/D62163

llvm-svn: 361919
2019-05-29 03:02:59 +00:00
Pengfei Wang 72e3f9662b Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to"
This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b.

llvm-svn: 361918
2019-05-29 02:49:59 +00:00
Pengfei Wang 818c652643 [X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to
avoid static check fail

RegClassOrBank is an object of RegClassOrRegBank, which is defined as
using llvm::RegClassOrRegBank = typedef PointerUnion<const
TargetRegisterClass *, const RegisterBank *>
so control flow can not get here. Use ""llvm_unreachable" here to avoid
"null pointer" confusion.

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62006

Signed-off-by: pengfei <pengfei.wang@intel.com>
llvm-svn: 361912
2019-05-29 02:20:37 +00:00
Quentin Colombet a6f57ad2c9 [RegUsageInfoCollector] Don't mark as saved registers that don't have subregister lanes
To determine the list of clobbered registers, the RegUsageInfoCollector pass
uses the list of callee saved registers provided by the target and then augments
it with the list of registers which have all their subregisters saved. It then
basically does the difference between all the registers and the saved registers
to come up with what is clobbered (plus it checks that the register is defined
within that functions).

The patch fixes a bug where when register does not have any subregister lane,
hence when checking if any of its subregister are not saved, we would find none
and think the register is saved as well.

That's obviously wrong.

The code was actually kind of checking for something like that with the
CoveredBySubRegs bit. What this bit says is that a register is completely
covered by its subregisters.
We required that this bit was set, to check that a register was saved by its
subregister lanes, since without this bit, we potentially would miss to check
some part of the register.

However, this bit is used de facto on registers that don't have any
subregisters (e.g., on ARM) and the code was not prepared for that.

This patch fixes this by checking that a register has subregisters before
declaring it saved when none of its lanes are modified.

llvm-svn: 361901
2019-05-28 23:43:12 +00:00
Adhemerval Zanella 6d7bf5e8df [CodeGen] Add lrint/llrint builtins
This patch add the ISD::LRINT and ISD::LLRINT along with new
intrinsics.  The changes are straightforward as for other
floating-point rounding functions, with just some adjustments
required to handle the return value being an interger.

The idea is to optimize lrint/llrint generation for AArch64
in a subsequent patch.  Current semantic is just route it to libm
symbol.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D62017

llvm-svn: 361875
2019-05-28 20:47:44 +00:00
Roman Lebedev dfc34f0211 [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 2
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

This is a recommit, originally committed in rL361856, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

llvm-svn: 361874
2019-05-28 20:40:10 +00:00
Roman Lebedev d485c6bc9f [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold. Try 2
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

This is a recommit, originally committed in rL361855, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

llvm-svn: 361873
2019-05-28 20:40:03 +00:00
Roman Lebedev 96c9986199 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 2
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361853, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

llvm-svn: 361872
2019-05-28 20:39:55 +00:00
Roman Lebedev 2feb7e56e2 [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 2
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs.

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

llvm-svn: 361871
2019-05-28 20:39:39 +00:00
Roman Lebedev 272d70c366 Revert DAGCombine "hoist binop with const" folds
Appear to introduce test-suite compile-time hang.

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/22825

This reverts r361852,r361853,r361854,r361855,r361856

llvm-svn: 361865
2019-05-28 19:04:21 +00:00
Roman Lebedev 7669665432 [DAGCombine] (x - C) - y -> (x - y) - C fold
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

llvm-svn: 361856
2019-05-28 17:54:21 +00:00
Roman Lebedev 8c9b3e4e4a [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

llvm-svn: 361855
2019-05-28 17:54:13 +00:00
Roman Lebedev 6a24c9b9ab [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold
Summary:
Only vector tests are being affected here,
since subtraction by scalar constant is rewritten
as addition by negated constant.

No surprising test changes.

https://rise4fun.com/Alive/pbT

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62257

llvm-svn: 361854
2019-05-28 17:54:04 +00:00
Roman Lebedev 1499f65ac1 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

llvm-svn: 361853
2019-05-28 17:53:54 +00:00
Roman Lebedev 19f51ec04a [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

llvm-svn: 361852
2019-05-28 17:53:43 +00:00
Simon Pilgrim 9cd9624fb6 [DAG] LegalizeVectorTypes - reduce scope of local variables. NFCI.
Move the element index/count variables into the block where they are actually used - appeases cppcheck and helps avoid shadow variable warnings.

llvm-svn: 361821
2019-05-28 13:46:26 +00:00
David Stenberg 5d0e6b6755 Stop undef fragments from closing non-overlapping fragments
Summary:
When DwarfDebug::buildLocationList() encountered an undef debug value,
it would truncate all open values, regardless if they were overlapping or
not. This patch fixes so that it only does that for overlapping fragments.

This change unearthed a bug that I had introduced in D57511,
which I have fixed in this patch. The code in DebugHandlerBase that
changes labels for parameter debug values could break DwarfDebug's
assumption that the labels for the entries in the debug value history
are monotonically increasing. Before this patch, that bug could result
in location list entries whose ending address was lower than the
beginning address, and with the changes for undef debug values that this
patch introduces it could trigger an assertion, due to attempting to
emit location list entries with empty ranges. A reproducer for the bug
is added in param-reg-const-mix.mir.

Reviewers: aprantl, jmorse, probinson

Reviewed By: aprantl

Subscribers: javed.absar, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D62379

llvm-svn: 361820
2019-05-28 13:23:25 +00:00
Matt Arsenault d3ed418ad3 MIR: Fix printer crashing on dead CSR frame indexes
llvm-svn: 361819
2019-05-28 13:08:31 +00:00
Benjamin Kramer 57e267a2e9 [X86] Custom lower CONCAT_VECTORS of v2i1
The generic legalizer cannot handle this. Add an assert instead of
silently miscompiling vectors with elements smaller than 8 bits.

llvm-svn: 361814
2019-05-28 12:52:57 +00:00
Matt Arsenault ca84c4be4b RegAllocFast: Set MayLiveAcrossBlocks when allocating uses
Setting mayLiveOut based only on use instructions after allocating the
def block did not work if the use block was allocated before the def
block, since the virtual register uses were already removed.

Fixes bug 41973.

llvm-svn: 361781
2019-05-27 20:37:31 +00:00
Sanjay Patel 2f99d009c1 [SelectionDAG] fold concat of extract subvectors
This is derived from the related fold for build vectors.
We also have a version of this in DAGCombiner. The benefit of
having this fold at node creation time is (1) efficiency and
(2) preventing infinite looping from creating patterns that
should not exist in the first place.

Currently, the inf-loop could happen with MergeConsecutiveStores()
because it naively creates concat of extracts when forming a wider
vector store. That could fight with target-specific store narrowing.

llvm-svn: 361780
2019-05-27 20:26:21 +00:00
Sanjay Patel e13ae3e4d8 [SelectionDAG] fix formatting and redundant comments; NFC
There's a possible missing fold here for extracting from the
same source vector. It's similar to a check that we use to
squash a build vector with all extracted elements from the
same source vector.

llvm-svn: 361778
2019-05-27 18:26:43 +00:00
Michael Liao 9c70c574b4 [SelectionDAG] Enhance the simplification of `copyto` from `implicit-def`.
Summary:
- The current implementation simplifies the case where the source of
  `copyto` is `implicit-def`ed. However, it only works when that
  `implicit-def` is single-used since it detects that from
  `implicit-def` and cannot determine which destination vreg should be
  used if there are multiple uses.
- This patch changes that detection when `copyto` is being emitted. If
  that `copyto`'s source is defined from `implicit-def`, it simplifies
  it. Hence, it works even that `implicit-def` is multi-used.
- Except it simplifies the internal IR, it won't improve the quality of
  code generation. However, it helps to detect 'implicit-def` in a
  straight-forward manner in some passes, such as `si-i1-copies`. A test
  case is added.

Reviewers: sunfish, nhaehnle

Subscribers: jvesely, hiraditya, asbirlea, llvm-commits, yaxunl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62342

llvm-svn: 361777
2019-05-27 18:26:29 +00:00
Simon Pilgrim ebb053b139 [SelectionDAG] GetDemandedBits - add demanded elements wrapper implementation
The DemandedElts variable is pretty much inert at the moment - the original GetDemandedBits implementation calls it with an 'all ones' DemandedElts value so the function is active and behaves exactly as it used to.

llvm-svn: 361773
2019-05-27 16:39:25 +00:00
Nikola Prica 441ad62531 Test commit (NFC)
Add blank line.

llvm-svn: 361761
2019-05-27 13:51:30 +00:00
David L. Jones 0ff41b8a5a Revert r361356: "[MIR] Add simple PRE pass to MachineCSE"
This is problematic on buildbots, as discussed here: https://reviews.llvm.org/rL361356

It seems like the plan already was to revert, but that hasn't happened yet.

llvm-svn: 361746
2019-05-27 06:00:00 +00:00
Alexander Timofeev ba447bae74 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
             the correct register classes to the cross block values beforehand. For the divergent targets
             same value type requires different register classes dependent on the value divergence.

    Reviewers: rampitec, nhaehnle

    Differential Revision: https://reviews.llvm.org/D59990

    This commit was reverted because of the build failure.
    The reason was mlformed patch.
    Build failure fixed.

llvm-svn: 361741
2019-05-26 20:33:26 +00:00
Simon Pilgrim 06e02856ab [SelectionDAG] GetDemandedBits - cleanup to more closely match SimplifyDemandedBits. NFCI.
Prep work before adding demanded elts support.

llvm-svn: 361739
2019-05-26 18:58:14 +00:00
Simon Pilgrim 2916b9e28c [SelectionDAG] MaskedValueIsZero - add demanded elements implementation
Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage.

llvm-svn: 361738
2019-05-26 18:43:44 +00:00
Sanjay Patel 91131b6500 [SelectionDAG] soften assertion when legalizing narrow vector FP ops
The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not
be so aggressive with an assert here. There's no telling what out-of-tree
targets look like.

llvm-svn: 361696
2019-05-25 13:48:07 +00:00
Peter Collingbourne 3b93737446 Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

llvm-svn: 361688
2019-05-25 01:52:38 +00:00
Alexander Timofeev dffedea014 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
         the correct register classes to the cross block values beforehand. For the divergent targets
         same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

llvm-svn: 361644
2019-05-24 15:32:18 +00:00
Simon Pilgrim 95b8d9bbf8 [SelectionDAG] computeKnownBits - support constant pool values from target
This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets.

computeKnownBits then uses this function to improve codegen, notably vector code after legalization.

A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit.

This required a couple of fixes:
* SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR).
* Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets.

Differential Revision: https://reviews.llvm.org/D61887

llvm-svn: 361620
2019-05-24 10:03:11 +00:00
Bjorn Pettersson b4771425f5 Use the DataLayout::typeSizeEqualsStoreSize helper. NFC
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.

llvm-svn: 361613
2019-05-24 09:20:20 +00:00
Tim Northover 3b2157aeed GlobalISel: support swifterror attribute on AArch64.
swifterror marks an argument as a register pretending to be a pointer, so we
need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the
infrastructure can be reused from the DAG world.

llvm-svn: 361608
2019-05-24 08:40:13 +00:00
Tim Northover 3d7a057b0d CodeGen: factor out swifterror value tracking.
llvm-svn: 361607
2019-05-24 08:39:43 +00:00
Sanjay Patel 7d6c0bce50 [DAGCombiner] make folds of binops safe for opcodes that produce >1 value
This is no-functional-change-intended currently because the definition
of isBinOp() only includes opcodes that produce 1 value. But if we
share that implementation with isCommutativeBinOp() as proposed in
D62191, then we need to make sure that the callers bail out for
opcodes that they are not prepared to handle correctly.

llvm-svn: 361547
2019-05-23 20:17:25 +00:00
Matt Arsenault 0f3ba44b57 AMDGPU/GlobalISel: Legality for integer min/max
llvm-svn: 361519
2019-05-23 17:58:48 +00:00
Shoaib Meenai 87226a7202 [AsmPrinter] Treat a narrowing PtrToInt like Trunc
When printing assembly for PtrToInt, AsmPrinter::lowerConstant
incorrectly assumed that if PtrToInt was not converting to an
int with exactly the same number of bits, it must be widening
to a larger int. But this isn't necessarily true; PtrToInt can
also shrink the size, which is useful when you want to produce
a known 32-bit pointer on a 64-bit platform (on x86_64 ELF
this yields a R_X86_64_32 relocation).

The old behavior of falling through to the widening case for a
narrowing PtrToInt yields bogus assembly code like this, which
fails to assemble because the no-op bit and it accidentally
creates is not a valid relocation:

```
        .long   a&-1
```

The fix is to treat a narrowing PtrToInt exactly the same as
it already treats Trunc: just emit the expression and let
the assembler deal with truncating it in the appropriate way.

Patch by Mat Hostetter <mjh@fb.com>.

Differential Revision: https://reviews.llvm.org/D61325

llvm-svn: 361508
2019-05-23 16:29:09 +00:00
Petar Jovanovic aa28b6d198 [LiveDebugValues] Rename 'DMI' into 'DebugInstr' (NFC)
This will improve code readability.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D62295

llvm-svn: 361497
2019-05-23 13:49:06 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Petar Jovanovic ff47d83e78 [DwarfExpression] Refactor dwarf expression (NFC)
Refactor location description kind in order to be easier for extensions
(needed for D60866).
In addition, cut off some bits from the other class fields.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D62002

llvm-svn: 361480
2019-05-23 10:37:13 +00:00
Matt Arsenault ca64ef2043 MC: Allow getMaxInstLength to depend on the subtarget
Keep it optional in cases this is ever needed in some global
context. Currently it's only used for getting an upper bound inline
asm code size.

For AMDGPU, gfx10 increases the maximum instruction size to
20-bytes. This avoids penalizing older subtargets when estimating code
size, and making some annoying branch relaxation test adjustments.

llvm-svn: 361405
2019-05-22 16:28:41 +00:00
Kees Cook c2187c20a4 [TargetLowering] Extend bool args to inline-asm according to getBooleanType
Summary:
This extends Krzysztof Parzyszek's X86-specific solution
(https://reviews.llvm.org/D60208) to the generic code pointed out by
James Y Knight.

Reviewers: kparzysz, craig.topper, nickdesaulniers

Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60224

llvm-svn: 361404
2019-05-22 16:16:15 +00:00
Kees Cook a7a687e500 [TargetLowering] Add blank line (test commit)
llvm-svn: 361403
2019-05-22 16:02:13 +00:00
Anton Afanasyev df00c6a54f [MIR] Add simple PRE pass to MachineCSE
This is the second part of the commit fixing PR38917 (hoisting
partitially redundant machine instruction). Most of PRE (partitial
redundancy elimination) and CSE work is done on LLVM IR, but some of
redundancy arises during DAG legalization. Machine CSE is not enough
to deal with it. This simple PRE implementation works a little bit
intricately: it passes before CSE, looking for partitial redundancy
and transforming it to fully redundancy, anticipating that the next
CSE step will eliminate this created redundancy. If CSE doesn't
eliminate this, than created instruction will remain dead and eliminated
later by Remove Dead Machine Instructions pass.

The third part of the commit is supposed to refactor MachineCSE,
to make it more clear and to merge MachinePRE with MachineCSE,
so one need no rely on further Remove Dead pass to clear instrs
not eliminated by CSE.

First step: https://reviews.llvm.org/D54839

Fixes llvm.org/PR38917

llvm-svn: 361356
2019-05-22 07:41:34 +00:00
Stanislav Mekhanoshin 44d17ca02e Fix register coalescer failure to prune value
Register coalescer fails for the test in the patch with the assertion in
JoinVals::ConflictResolution `DefMI != nullptr'. It attempts to join
live intervals for two adjacent instructions and erase the copy:

    %2:vreg_256 = COPY %1
    %3:vreg_256 = COPY killed %1

The LI needs to be adjusted to kill subrange for the erased instruction
and extend the subrange of the original def. That was done for the main
interval only but not for the subrange. As a result subrange had a VNI
pointing to the erased slot resulting in the above failure.

Differential Revision: https://reviews.llvm.org/D62162

llvm-svn: 361293
2019-05-21 19:32:41 +00:00
Leonard Chan 0bada7ce6c [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
Add an intrinsic that takes 2 signed integers with the scale of them provided
as the third argument and performs fixed point multiplication on them. The
result is saturated and clamped between the largest and smallest representable
values of the first 2 operands.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D55720

llvm-svn: 361289
2019-05-21 19:17:19 +00:00
Sanjay Patel 10f6b39899 [SelectionDAG] fold insert subvector of undef into undef
DAGCombiner simplifies this more liberally as:
  // If inserting an UNDEF, just return the original vector.
  if (N1.isUndef())
    return N0;

So there's no way to make this visible in output AFAIK, but
doing this at node creation time should be slightly more efficient.

llvm-svn: 361287
2019-05-21 18:53:53 +00:00
Sanjay Patel 51dc59d090 [SelectionDAG] remove redundant code; NFCI
getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS():
  // Concat of UNDEFs is UNDEF.
  if (llvm::all_of(Ops, [](SDValue Op) { return Op.isUndef(); }))
    return DAG.getUNDEF(VT);

llvm-svn: 361284
2019-05-21 18:28:22 +00:00
Sanjay Patel 78c3f58122 [DAGCombiner] prevent unsafe reassociation of FP ops
There are no FP callers of DAGCombiner::reassociateOps() currently,
but we can add a fast-math check to make sure this API is not being
misused.

This was noted as a potential risk (and that risk might increase) with:
D62191

llvm-svn: 361268
2019-05-21 14:47:38 +00:00
Florian Hahn f9b28e53c7 [ScheduleDAGInstrs] Compute topological ordering on demand.
In most cases, the topological ordering does not get changed in
ScheduleDAGInstrs. We can compute the ordering on demand, similar to
D60125.

This drastically cuts down the number of times we need to compute the
topological ordering, e.g. for SPEC2006, SPEC2k and MultiSource, we get
the following stats for -O3 -flto on X86 (showing the top reductions,
with small absolute values filtered). The smallest reduction is -50%.

Slightly positive impact on compile-time (-0.1 % geomean speedup for
test-suite + SPEC & co, with -O1 on X86)

Tests: 243
Metric: pre-RA-sched.NumTopoInits

Program                                        base       patch  diff
 test-suite...ngs-C/fixoutput/fixoutput.test   115.00      3.00   -97.4%
 test-suite...ks/Prolangs-C/cdecl/cdecl.test   957.00     26.00   -97.3%
 test-suite...math/automotive-basicmath.test   107.00      3.00   -97.2%
 test-suite...rolangs-C++/deriv2/deriv2.test   144.00      6.00   -95.8%
 test-suite...lowfish/security-blowfish.test   410.00     18.00   -95.6%
 test-suite...frame_layout/frame_layout.test   441.00     23.00   -94.8%
 test-suite...rolangs-C++/employ/employ.test   159.00     11.00   -93.1%
 test-suite...s/Ptrdist/anagram/anagram.test   157.00     11.00   -93.0%
 test-suite...s-C/unix-smail/unix-smail.test   829.00     59.00   -92.9%
 test-suite...chmarks/Olden/power/power.test   154.00     11.00   -92.9%
 test-suite...T95/147.vortex/147.vortex.test   19876.00  1434.00  -92.8%
 test-suite...000/255.vortex/255.vortex.test   19881.00  1435.00  -92.8%
 test-suite...ce/Applications/Burg/burg.test   2203.00   168.00   -92.4%
 test-suite...urce/Applications/hbd/hbd.test   1067.00    85.00   -92.0%
 test-suite...ternal/HMMER/hmmcalibrate.test   3145.00   251.00   -92.0%
 test-suite.../Applications/spiff/spiff.test   1037.00    84.00   -91.9%
 test-suite...SPEC/CINT95/130.li/130.li.test   5913.00   487.00   -91.8%
 test-suite.../CINT95/134.perl/134.perl.test   12532.00  1041.00  -91.7%
 test-suite...ce/Benchmarks/Olden/bh/bh.test   220.00     19.00   -91.4%
 test-suite :: External/Nurbs/nurbs.test       2304.00   206.00   -91.1%
 test-suite...arks/VersaBench/dbms/dbms.test   773.00     75.00   -90.3%
 test-suite...ce/Applications/siod/siod.test   9043.00   878.00   -90.3%
 test-suite...pplications/treecc/treecc.test   4510.00   438.00   -90.3%
 test-suite...T2006/456.hmmer/456.hmmer.test   7093.00   697.00   -90.2%
 test-suite...s-C/Pathfinder/PathFinder.test   882.00     87.00   -90.1%
 test-suite.../CINT2000/176.gcc/176.gcc.test   64978.00  6721.00  -89.7%
 test-suite...cations/hexxagon/hexxagon.test   657.00     69.00   -89.5%
 test-suite...fice-ispell/office-ispell.test   2712.00   285.00   -89.5%
 test-suite.../CINT2006/403.gcc/403.gcc.test   139613.00 14992.00 -89.3%
 test-suite...lications/ClamAV/clamscan.test   25880.00  2785.00  -89.2%

Reviewers: MatzeB, atrick, efriedma, niravd

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D60839

llvm-svn: 361253
2019-05-21 13:04:53 +00:00
Dylan McKay e967308da4 Add TargetLoweringInfo hook for explicitly setting the ABI calling convention endianess
Summary:
The endianess used in the calling convention does not always match the
endianess of the target on all architectures, namely AVR.

When an argument is too large to be legalised by the architecture and is
split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian
is queried to find the endianess that function arguments must be laid
out in.

This approach was recommended by Eli Friedman.

Originally reported in https://github.com/avr-rust/rust/issues/129.

Patch by Carl Peto.

Reviewers: bogner, t.p.northover, RKSimon, niravd, efriedma

Reviewed By: efriedma

Subscribers: JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62003

llvm-svn: 361222
2019-05-21 06:38:02 +00:00
Craig Topper 97d4f7c194 [SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node for asm goto.
Since INLINEASM_BR is a terminator we need to flush the pending exports before
emitting it. If we don't do this, a TokenFactor can be inserted between it and
the BR instruction emitted to finish the callbr lowering.

It looks like nodes are glued to the INLINEASM_BR so I had to make sure we emit
the TokenFactor before that.

Differential Revision: https://reviews.llvm.org/D59981

llvm-svn: 361177
2019-05-20 17:08:02 +00:00
Craig Topper af7a188453 [Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well.

Differential Revision: https://reviews.llvm.org/D62026

llvm-svn: 361169
2019-05-20 16:27:09 +00:00
Craig Topper 203bfdd0f0 [DAGCombiner] Refactor code in visitShiftByConstant slightly to make it more readable. NFC
This changes the isShift variable to include the constant operand
check that was previously in the if statement.

While there fix an 80 column violation and an unnecessary use of
getNode. Also fix variable name capitalization.

llvm-svn: 361168
2019-05-20 16:26:55 +00:00
Nikita Popov 9060b6df97 [SDAG] Vector op legalization for overflow ops
Fixes issue reported by aemerson on D57348. Vector op legalization
support is added for uaddo, usubo, saddo and ssubo (umulo and smulo
were already supported). As usual, by extracting TargetLowering methods
and calling them from vector op legalization.

Vector op legalization doesn't really deal with multiple result nodes,
so I'm explicitly performing a recursive legalization call on the
result value that is not being legalized.

There are some existing test changes because expansion happens
earlier, so we don't get a DAG combiner run in between anymore.

Differential Revision: https://reviews.llvm.org/D61692

llvm-svn: 361166
2019-05-20 16:09:22 +00:00
Matt Arsenault 7c8ec18964 RegAlloc: Fix verifier error with undef identity copies
The code did not match the example in the comment, and was checking
the undef flag on the copy dest instead of source. The existing tests
were only hitting the > 2 operands case.

llvm-svn: 361156
2019-05-20 14:09:36 +00:00
Guillaume Chatelet e386a01e84 [NFC] Refactor visitIntrinsicCall so it doesn't return a const char*
Summary: API simplification

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61306

llvm-svn: 361140
2019-05-20 11:01:30 +00:00
Petar Jovanovic e85bbf564d [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC)
Refactor DIExpression::With* into a flag enum in order to be less
error-prone to use (as discussed on D60866).

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D61943

llvm-svn: 361137
2019-05-20 10:35:57 +00:00
Guillaume Chatelet a760e69840 Revert "[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*"
This reverts commit 706d3cd6388cc3446aab282f3af879862b10cbed.

llvm-svn: 361130
2019-05-20 09:00:12 +00:00
Guillaume Chatelet fa8c152576 [NFC] Refactor visitIntrinsicCall so it doesn't return a const char*
Summary: API simplification

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61306

llvm-svn: 361129
2019-05-20 08:52:10 +00:00
Simon Pilgrim 2b45a70fd6 MemCmpExpansion::getCompareLoadPairs - assert we find a comparison diff. NFCI.
Fix scan-build uninitialized warning and assert the final diff isn't null.

llvm-svn: 361095
2019-05-18 11:31:48 +00:00
Matt Arsenault 02b5ca8cd1 GlobalISel: Implement lower for S64->S32 [SU]ITOFP
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.

This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.

llvm-svn: 361081
2019-05-17 23:05:13 +00:00
Matt Arsenault f3cedf4823 GlobalISel: Define integer min/max instructions
Doesn't attempt to emit them for anything yet, but some legalizations
I want to port use them.

llvm-svn: 361061
2019-05-17 18:36:31 +00:00
Roman Lebedev 64c756b991 [DAGCombiner] visitShiftByConstant(): drop bogus signbit check
Summary:
That check claims that the transform is illegal otherwise.
That isn't true:
1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter
   https://rise4fun.com/Alive/K4A
2. For `ISD::AND`, there is no restriction on constants:
   https://rise4fun.com/Alive/Wy3
3. For `ISD::OR`, there is no restriction on constants:
   https://rise4fun.com/Alive/GOH
3. For `ISD::XOR`, there is no restriction on constants:
   https://rise4fun.com/Alive/ml6

So, why is it there then?

This changes the testcase that was touched by @spatel in rL347478,
but i'm not sure that test tests anything particular?

Reviewers: RKSimon, spatel, craig.topper, jojo, rengolin

Reviewed By: spatel

Subscribers: javed.absar, llvm-commits, spatel

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61918

llvm-svn: 361044
2019-05-17 15:52:58 +00:00
Matt Arsenault 1448f5689e AMDGPU/GlobalISel: Legalize G_FCOPYSIGN
llvm-svn: 361025
2019-05-17 12:19:52 +00:00
Fangrui Song ec6dc3089e [GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on builds
llvm-svn: 360989
2019-05-17 05:53:39 +00:00
Ben Dunbobbin 1d16515fb4 [ELF] Implement Dependent Libraries Feature
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.

Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.

The design goals were to provide:

- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
  environments (MSVC in particular).

Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.

In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:

1. There are no competing defined symbols in a given set of libraries, or
   if they exist, the program owner doesn't care which is linked to their
   program.
2. There may be circular dependencies between libraries.

The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:

.section ".deplibs","MS",@llvm_dependent_libraries,1
         .asciz "foo"

For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.

LLD processes the dependent library specifiers in the following way:

1. Dependent libraries which are found from the specifiers in .deplibs sections
   of relocatable object files are added when the linker decides to include that
   file (which could itself be in a library) in the link. Dependent libraries
   behave as if they were appended to the command line after all other options. As
   a consequence the set of dependent libraries are searched last to resolve
   symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
   to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
   strings in a .deplibs section by; first, handling the string as if it was
   specified on the command line; second, by looking for the string in each of the
   library search paths in turn; third, by looking for a lib<string>.a or
   lib<string>.so (depending on the current mode of the linker) in each of the
   library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
   dependent libraries.

Rationale for the above points:

1. Adding the dependent libraries last makes the process simple to understand
   from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
   failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
   will affect all of the dependent libraries. There is a potential problem of
   surprise for developers, who might not realize that these options would apply
   to these "invisible" input files; however, despite the potential for surprise,
   this is easy for developers to reason about and gives developers the control
   that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
   find input files. The different search methods are tried by the linker in most
   obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
   ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
   is not necessary: if finer control is required developers can fall back to using
   the command line directly.

RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.

Differential Revision: https://reviews.llvm.org/D60274

llvm-svn: 360984
2019-05-17 03:44:15 +00:00
Amy Huang c2029068bc Emit global variables as S_CONSTANT records for codeview debug info.
Summary:
This emits S_CONSTANT records for global variables.
Currently this emits records for the global variables already being tracked in the
LLVM IR metadata, which are just constant global variables; we'll also want S_CONSTANTs
for static data members and enums.

Related to https://bugs.llvm.org/show_bug.cgi?id=41615

Reviewers: rnk

Subscribers: aprantl, hiraditya, llvm-commits, thakis

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61926

llvm-svn: 360948
2019-05-16 22:28:52 +00:00
Tim Renouf e3cbdaf1b5 [CodeGen] Fixed de-optimization of legalize subvector extract
The recent introduction of v3i32 etc as an MVT, and its use in AMDGPU
3-dword memory instructions, caused a de-optimization problem for code
with such a load that then bitcasts via vector of i8, because v12i8 is
not an MVT so it legalizes the bitcast by widening it.

This commit adds the ability to widen a bitcast using extract_subvector
on the result, so the value does not need to go via memory.

Differential Revision: https://reviews.llvm.org/D60457

Change-Id: Ie4abb7760547e54a2445961992eafc78e80d4b64
llvm-svn: 360942
2019-05-16 21:49:06 +00:00
Adhemerval Zanella 73643b5041 [CodeGen] Add lround/llround builtins
This patch add the ISD::LROUND and ISD::LLROUND along with new
intrinsics.  The changes are straightforward as for other
floating-point rounding functions, with just some adjustments
required to handle the return value being an interger.

The idea is to optimize lround/llround generation for AArch64
in a subsequent patch.  Current semantic is just route it to libm
symbol.

llvm-svn: 360889
2019-05-16 13:15:27 +00:00
Matt Arsenault 828b685ebe RegAllocFast: Improve hinting heuristic
Trace through multiple COPYs when looking for a physreg source. Add
hinting for vregs that will be copied into physregs (we only hinted
for vregs getting copied to a physreg previously).  Give hinted a
register a bonus when deciding which value to spill.  This is part of
my rewrite regallocfast series. In fact this one doesn't even have an
effect unless you also flip the allocation to happen from back to
front of a basic block. Nonetheless it helps to split this up to ease
review of D52010

Patch by Matthias Braun

llvm-svn: 360887
2019-05-16 12:50:39 +00:00
Matt Arsenault 27ac8408f6 GlobalISel: Add DstOp version of buildIntrinsic
llvm-svn: 360879
2019-05-16 12:22:56 +00:00
Matt Arsenault 11be78bc7a GlobalISel: Add buildFConstant for APFloat
llvm-svn: 360853
2019-05-16 04:09:06 +00:00
Matt Arsenault 012ecbbbba GlobalISel: Fix indentation
llvm-svn: 360851
2019-05-16 04:08:46 +00:00
Matt Arsenault 55146d3139 GlobalISel: Add G_FCOPYSIGN
llvm-svn: 360850
2019-05-16 04:08:39 +00:00
Reid Kleckner 4882490349 [codeview] Fix SDNode representation of annotation labels
Before this change, they were erroneously constructed with the EH_LABEL
SDNode opcode, which caused other passes to interact with them in
incorrect ways. See the FIXME about fastisel that this addresses in the
existing test case.

Fixes PR41890

llvm-svn: 360818
2019-05-15 21:46:05 +00:00
Nicolai Haehnle f672b6170c [MachineOperand] Add a ChangeToGA method
Summary:
Analogous to the other ChangeToXXX methods. See the next patch for a
use case.

Change-Id: I6548d614706834fb9109ab3c8fe915e9c6ece2a7

Reviewers: arsenm, kzhuravl

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61651

llvm-svn: 360789
2019-05-15 17:48:10 +00:00
Nicolai Haehnle 664ceeda68 RegAlloc: try to fail more gracefully when out of registers
Summary:
The emitError path allows the program to continue, unlike report_fatal_error.
This is friendlier to use cases where LLVM is embedded in a larger program,
because the caller may be able to deal with the error somewhat gracefully.

Change the number of requested NOP bytes in the AArch64 and PowerPC
test cases to avoid triggering an unrelated assertion. The compilation
still fails, as verified by the test.

Change-Id: Iafb9ca341002a597b82e59ddc7a1f13c78758e3d

Reviewers: arsenm, MatzeB

Subscribers: qcolombet, nemanjai, wdng, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61489

llvm-svn: 360786
2019-05-15 17:29:58 +00:00
Clement Courbet d9d0665d1c [[DAGCombiner][NFC] Add a comment.
As suggested in D61846.

llvm-svn: 360755
2019-05-15 08:21:18 +00:00
Fangrui Song f4dfd63c74 [IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format
The 3-field form was introduced by D3499 in 2014 and the legacy 2-field
form was planned to be removed in LLVM 4.0

For the textual format, this patch migrates the existing 2-field form to
use the 3-field form and deletes the compatibility code.
test/Verifier/global-ctors-2.ll checks we have a friendly error message.

For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the
2-field form (add i8* null as the third field).

Reviewed By: rnk, dexonsmith

Differential Revision: https://reviews.llvm.org/D61547

llvm-svn: 360742
2019-05-15 02:35:32 +00:00
Fangrui Song 2f6ef2fc92 DWARF v5: emit DW_AT_addr_base if DW_AT_low_pc references .debug_addr
The condition !AddrPool.empty() is tested before attachRangesOrLowHighPC(), which may add an entry to AddrPool. We emit DW_AT_low_pc (DW_FORM_addrx) but may incorrectly omit DW_AT_addr_base for LineTablesOnly. This can be easily reproduced:

clang -gdwarf-5 -gmlt -c a.cc

Fix this by moving !AddrPool.empty() below.

This was discovered while investigating an lld crash (fixed by D61889) on such object files: ld.lld --gdb-index a.o

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D61891

llvm-svn: 360678
2019-05-14 14:37:26 +00:00
Diana Picus a568222ddd [IRTranslator] Don't hardcode GEP index type
When breaking up loads and stores of aggregates, the IRTranslator uses
LLT::scalar(64) for the index type of the G_GEP instructions that
compute the addresses. This is unnecessarily large for 32-bit targets.
Use the int ptr type provided by the DataLayout instead.

Note that we're already doing the right thing when translating
getelementptr instructions from the IR. This is just an oversight when
generating new ones while translating loads/stores.

Both x86 and AArch64 already have tests confirming that the old
behaviour is preserved for 64-bit targets.

Differential Revision: https://reviews.llvm.org/D61852

llvm-svn: 360656
2019-05-14 09:25:17 +00:00
Sanjay Patel 99d6420a82 [SDAG] fix unused variable warning and unneeded indirection; NFC
llvm-svn: 360640
2019-05-14 00:57:31 +00:00
Sanjay Patel 3a13d970aa [SDAG, x86] allow targets to override test for binop opcodes
This follows the pattern of the existing isCommutativeBinOp().

x86 shows improvements from vector narrowing for the min/max opcodes.

llvm-svn: 360639
2019-05-14 00:39:40 +00:00
Nick Desaulniers c33f754e74 [TargetLowering] Handle multi depth GEPs w/ inline asm constraints
Summary:
X86TargetLowering::LowerAsmOperandForConstraint had better support than
TargetLowering::LowerAsmOperandForConstraint for arbitrary depth
getelementpointers for "i", "n", and "s" extended inline assembly
constraints. Hoist its support from the derived class into the base
class.

Link: https://github.com/ClangBuiltLinux/linux/issues/469

Reviewers: echristo, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61560

llvm-svn: 360604
2019-05-13 17:27:44 +00:00
Simon Pilgrim d3cedee3c6 [TargetLowering] Add SimplifyDemandedBits support for ZERO_EXTEND_VECTOR_INREG
More work for PR39709.

llvm-svn: 360592
2019-05-13 15:51:26 +00:00
Sanjay Patel 05dafb1c97 [DAGCombiner] narrow vector binop with inserts/extract
We catch most of these patterns (on x86 at least) by matching
a concat vectors opcode early in combining, but the pattern may
emerge later using insert subvector instead.

The AVX1 diffs for add/sub overflow show another missed narrowing
pattern. That one may be falling though the cracks because of
combine ordering and multiple uses.

llvm-svn: 360585
2019-05-13 14:31:14 +00:00
Kevin P. Neal 5987749e33 Add constrained fptrunc and fpext intrinsics.
The new fptrunc and fpext intrinsics are constrained versions of the
regular fptrunc and fpext instructions.

Reviewed by:	Andrew Kaylor, Craig Topper, Cameron McInally, Conner Abbot
Approved by:	Craig Topper
Differential Revision: https://reviews.llvm.org/D55897

llvm-svn: 360581
2019-05-13 13:23:30 +00:00
Simon Pilgrim d845bc3d0c TargetLowering::SimplifyDemandedBits - early-out for UNDEF ops. NFCI.
llvm-svn: 360579
2019-05-13 12:44:03 +00:00
Clement Courbet 9afc4764dd [DAGCombiner] Fix invalid alias analysis.
Summary:
When we know for sure whether two addresses do or do not alias, we
should immediately return from DAGCombiner::isAlias().

I think this comes from a bad copy/paste, Sorry for not catching that during the
code review.

Fixes PR41855.

Reviewers: niravd, gchatelet, EricWF

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61846

llvm-svn: 360566
2019-05-13 09:07:37 +00:00
Craig Topper 61e556d2bd Recommit r358887 "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"
I've included a new fix in X86RegisterInfo to prevent PR41619 without
reintroducing r359392. We might be able to improve that in the base class
implementation of shouldRewriteCopySrc somehow. But this hopefully enables
forward progress on SimplifyDemandedBits improvements for now.

Original commit message:

This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.

The AMDGPU backend needed an extra  (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGComb
but it caused a lot of noise on other targets - some improvements, some regressions.

The X86 changes are all definite wins.

llvm-svn: 360552
2019-05-13 04:03:35 +00:00
Sanjay Patel a09e686821 [DAGCombiner] try to move bitcast after extract_subvector
I noticed that we were failing to narrow an x86 ymm math op in a case similar
to the 'madd' test diff. That is because a bitcast is sitting between the math
and the extract subvector and thwarting our pattern matching for narrowing:

       t56: v8i32 = add t59, t58
      t68: v4i64 = bitcast t56
    t73: v2i64 = extract_subvector t68, Constant:i64<2>
  t96: v4i32 = bitcast t73

There are a few wins and neutral diffs in the other tests.

Differential Revision: https://reviews.llvm.org/D61806

llvm-svn: 360541
2019-05-12 14:43:20 +00:00
Simon Pilgrim 605a840747 [DAG] Add SimplifyDemandedBits support for BITREVERSE
Pulled out of D58017 while I continue to investigate the BSWAP regression on PPC

llvm-svn: 360534
2019-05-11 20:56:05 +00:00
Simon Pilgrim aeed0a30c0 SelectionDAGISel::CodeGenAndEmitDAG - remove unused variable. NFCI.
llvm-svn: 360514
2019-05-11 11:00:37 +00:00
Jordan Rupprecht 16c7fbd112 Revert [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor
This reverts r360171 (git commit a9d6c32eaf). A repro showing the asan/msan failures is forthcoming.

llvm-svn: 360481
2019-05-10 23:20:02 +00:00
Craig Topper 114f763f37 [LegalizeVectorOps] Remove calls to LegalizeOp on the return value from ExpandLoad/ExpandStore.
We already updated the LegalizedNodes map at the end of the Expand call. This
would have marked the new node as being mapped to itself. So the LegalizeOp
call will find that an immediately return.

llvm-svn: 360472
2019-05-10 21:42:27 +00:00
Nikita Popov 9f7537bd48 [SDAG] Recursively legalize both vector mulo results
Split out from D61692 per RKSimon's suggestion. Vector op
legalization will automatically recursively legalize the returned
SDValue, but we need to take care of the other results ourselves.
Otherwise it will end up getting legalized only during op
legalization, by which point it might be too late (though I'm not
aware of any specific cases right now).

There are codegen differences because expansion occurs earlier now
and we don't get a DAGCombiner run in between.

Differential Revision: https://reviews.llvm.org/D61744

llvm-svn: 360470
2019-05-10 20:42:48 +00:00
Sanjay Patel b37ddeafc0 [DAGCombiner] reduce code duplication; NFC
llvm-svn: 360462
2019-05-10 20:02:30 +00:00
David Blaikie 7598b71488 DebugInfo: Only move types out of type units if they're named or type united
Follow up to r359122, after a bug was reported in it - the original
change too aggressively tried to move related types out of type units,
which included unnamed types (like array types) which can't reasonably
be declared-but-not-defined.

A step beyond that is that some types in type units can be anonymous, if
they are types with a name for linkage purposes (eg: "typedef struct { }
x;"). So ensure those don't get turned into plain declarations (without
signatures) because, lacking names, they can't be resolved to the
definition.

[Also include a fix for llvm-dwarfdump/libDebugInfoDWARF to pretty print
types in type units]

llvm-svn: 360458
2019-05-10 19:15:29 +00:00
Momchil Velikov c396f09ce9 Adjust MachineScheduler to use ProcResource counts
This fix allows the scheduler to take into account the number of instances of
each ProcResource specified. Previously a declaration in a scheduler of
ProcResource<1> would be treated identically to a declaration of
ProcResource<2>. Now the hazard recognizer would report a hazard only after all
of the resource instances are busy.

Patch by Jackson Woodruff and Momchil Velikov.

Differential Revision: https://reviews.llvm.org/D51160

llvm-svn: 360441
2019-05-10 16:54:32 +00:00
Tim Northover 6c1e3f9493 SelectionDAG: accommodate atomic floating stores.
We were applying a pointer truncation to floating types, which crashed LLVM.
That is Not A Good Thing(TM).

llvm-svn: 360421
2019-05-10 11:23:04 +00:00
Cameron McInally 156eb28289 [CodeGen] Add comment about FSUB <-> FNEG xforms
Differential Revision: https://reviews.llvm.org/D61741

llvm-svn: 360366
2019-05-09 19:28:52 +00:00
Florian Hahn be10bc71f9 [DAGCombiner] Limit number of nodes explored as store candidates.
To find the candidates to merge stores we iterate over all nodes in a chain
for each store, which leads to quadratic compile times for large basic blocks
with a large number of stores.

Reviewers: niravd, spatel, craig.topper

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D61511

llvm-svn: 360357
2019-05-09 17:05:52 +00:00
Simon Pilgrim 38ef296265 [CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI.
llvm-svn: 360328
2019-05-09 10:51:26 +00:00
Markus Lavin 92d5db524e Make sub-registers index names case sensitive in the MIRParser
Prior to this change sub-register index names are assumed to be lower
case (but they are printed with original casing). This means that if a
target has some upper case characters in its sub-register names then
mir-export directly followed by mir-import is not possible. This also
means that sub-register indices currently are (and will continue to be)
slightly inconsistent with register names which are printed and assumed
to be lower case.

As the current textual representation of mir has a few inconsistencies
in this area it is a bit arbitrary how to address the matter. This
change is towards the direction that we feel is most correct (i.e. case
sensitivity).

Differential Revision: https://reviews.llvm.org/D61499

llvm-svn: 360318
2019-05-09 08:29:04 +00:00
Pengfei Wang c05aad0532 Bugfix for nullptr check by klocwork
Klocwork static check:
Pointer from call to function `DebugLoc::operator DILocation *() const `
may be NULL and will be dereference in function `printExtendedName```
Patch by Shengchen Kan (skan)
Differential Revision: https://reviews.llvm.org/D61715

llvm-svn: 360317
2019-05-09 08:09:21 +00:00
Bjorn Pettersson 8d19e94f13 [CodeGen] Use "DL.getPointerSizeInBits" instead of "8 * DL.getPointerSize". NFC
llvm-svn: 360315
2019-05-09 08:07:36 +00:00
Leonard Chan 95b7abdcc5 [SelectionDAG] Expand ADD/SUBCARRY
This patch allows for expansion of ADDCARRY and SUBCARRY when the target does not support it.

Differential Revision: https://reviews.llvm.org/D61411

llvm-svn: 360303
2019-05-09 01:17:48 +00:00
Eric Christopher c93f56d39e Temporarily Revert "[DebugInfo] Terminate more location-list ranges at the end of blocks"
as it was causing significant compile time regressions.

This reverts commit r359426 while we come up with testcases and additional ideas.

llvm-svn: 360301
2019-05-08 23:54:03 +00:00
Sanjay Patel 902b3ecdad [SelectionDAG] fold 'fneg undef' to undef
This is extracted from the original draft of D61419 with some additional tests.
We don't currently get this in IR (it's conservatively turned into a NaN),
but presumably that'll get updated as we add real IR support for 'fneg'
rather than 'fsub -0.0, x'.

The x86-32 run shows the following, and I haven't looked further to see why,
but that seems to be independent:
  Legalizing: t1: f32 = undef
  Trying to expand node
  Creating fp constant: t4: f32 = ConstantFP<0.000000e+00>

Differential Revision: https://reviews.llvm.org/D61516

llvm-svn: 360296
2019-05-08 22:19:52 +00:00
Quentin Colombet 157427245a [RegAllocFast] Scan physcial reg definitions before assigning virtual reg definitions
When assigning the definitions of an instruction we were updating
the available registers while walking the definitions. Some of
those definitions may be from physical registers and thus, they are
not available for other definitions to take, but by the time we see
that we may have already assign these registers to another
virtual register.

Fix that by walking through all the definitions and mark as unavailable
the physical register definitions, then do the virtual register assignments.

PR41790

llvm-svn: 360278
2019-05-08 18:30:26 +00:00
Craig Topper 493aec3ef5 [FastISel][X86] Support FNeg instruction in target independent fast isel handling
This patch adds support for calling selectFNeg for FNeg instructions in addition to the fsub idiom

Differential Revision: https://reviews.llvm.org/D61624

llvm-svn: 360273
2019-05-08 17:27:08 +00:00
Simon Pilgrim 2788ad3ee2 [LegalizeDAG] Assert non-power-of-2 load/store op splits are in range. NFCI.
Fixes static analyzer undefined/out-of-range shift warnings.

llvm-svn: 360245
2019-05-08 11:22:10 +00:00
Simon Pilgrim 97a0c54179 Fix cppcheck operator precedence warning. NFCI.
llvm-svn: 360234
2019-05-08 10:07:34 +00:00
QingShan Zhang 0e71a6e755 [CodeGenPrepare] Don't split the store if it is volatile
We shouldn't split the store when it is volatile.

Differential Revision: https://reviews.llvm.org/D61169

llvm-svn: 360228
2019-05-08 07:32:12 +00:00
QingShan Zhang e065af6a42 [NFC] Add a static function to do the endian check
Add a new function to do the endian check, as I will commit another patch later, which will also need the endian check. 

Differential Revision: https://reviews.llvm.org/D61236

llvm-svn: 360226
2019-05-08 07:21:37 +00:00
Austin Kerbow 6e6480e216 [CodeGen] Rename DEBUG_TYPE for default hazard recognizer.
Summary:
The DEBUG_TYPE of the default hazard recognizer should be updated to
match the DEBUG_TYPE of the machine-scheduler pass.

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61359

llvm-svn: 360198
2019-05-07 22:09:04 +00:00
Adrian Prantl e6e8db5e9b Debug Info: Support address space attributes on rvalue references.
DWARF5, 2.12 20ff says that

Any debugging information entry representing a pointer or reference
type [may have a DW_AT_address_class attribute].

The existing code (https://reviews.llvm.org/D29670) seems to take a
quite literal interpretation of that wording. I don't see a reason why
an rvalue reference isn't a reference type in the spirit of that
paragraph. This patch allows rvalue references to also have address
spaces.

rdar://problem/50511483

Differential Revision: https://reviews.llvm.org/D61625

llvm-svn: 360176
2019-05-07 17:42:38 +00:00
Florian Hahn a9d6c32eaf [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor
When simplifying TokenFactors, we potentially iterate over all
operands of a large number of TokenFactors. This causes quadratic
compile times in some cases and the large token factors cause additional
scalability problems elsewhere.

This patch adds some limits to the number of nodes explored for the
cases mentioned above.

Reviewers: niravd, spatel, craig.topper

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D61397

llvm-svn: 360171
2019-05-07 16:47:27 +00:00
Simon Pilgrim 3044ac058b Avoid use-after-move warnings by using swap instead. NFCI.
Swap should be as quick in these cases, and leaves the original variables in a known (empty) state.

llvm-svn: 360164
2019-05-07 15:45:00 +00:00
Craig Topper c6d445f9c1 [FastISel][X86] If selectFNeg fails, fall back to SelectionDAG not treating it as an fsub.
Summary:
If fneg lowering for fsub -0.0, x fails we currently fall back to treating it as an fsub. This has different behavior for nans than the xor with sign bit trick we normally try to do. On X86, the xor trick for double fails fast-isel in 32-bit mode with sse2 due to 64 bit integer types not being available. With -O2 we would always use an xorpd for this case. If we use subsd, this creates an observable behavior difference between -O0 and -O2. So fall back to SelectionDAG if we can't fast-isel it, that way SelectionDAG will use the xorpd.

I believe this patch is restoring the behavior prior to r345295 from last October. This was missed then because our fast isel case in 32-bit mode aborted fast-isel earlier for another reason. But I've added new tests to cover that.

Reviewers: andrew.w.kaylor, cameron.mcinally, spatel, efriedma

Reviewed By: cameron.mcinally

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61622

llvm-svn: 360111
2019-05-07 04:25:24 +00:00
Fangrui Song da82ce99b7 [DebugInfo] Delete TypedDINodeRef
TypedDINodeRef<T> is a redundant wrapper of Metadata * that is actually a T *.

Accordingly, change DI{Node,Scope,Type}Ref uses to DI{Node,Scope,Type} * or their const variants.
This allows us to delete many resolve() calls that clutter the code.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D61369

llvm-svn: 360108
2019-05-07 02:06:37 +00:00