Commit Graph

161421 Commits

Author SHA1 Message Date
Simon Pilgrim 69a4132f63 [X86] Regenerate schedule tests with zero latency comments
llvm-svn: 327628
2018-03-15 14:30:59 +00:00
Sanjay Patel a4f42f2cfd [PatternMatch, InstSimplify] allow undef elements when matching any vector FP zero
This matcher implementation appears to be slightly more efficient than 
the generic constant check that it is replacing because every use was 
for matching FP patterns, but the previous code would check int and 
pointer type nulls too. 

llvm-svn: 327627
2018-03-15 14:29:27 +00:00
Sanjay Patel 8f063d0c70 [InstSimplify] remove 'nsz' requirement for frem 0, X
From the LangRef definition for frem: 
"The value produced is the floating-point remainder of the two operands. 
This is the same output as a libm ‘fmod‘ function, but without any 
possibility of setting errno. The remainder has the same sign as the 
dividend. This instruction is assumed to execute in the default 
floating-point environment."

llvm-svn: 327626
2018-03-15 14:04:31 +00:00
Sjoerd Meijer dfc7eb490a [AArch64] Codegen tests for the Armv8.2-A FP16 intrinsics
This is a follow up of the AArch64 FP16 intrinsics work;
the codegen tests had not been added yet.

Differential Revision: https://reviews.llvm.org/D44510

llvm-svn: 327624
2018-03-15 13:42:28 +00:00
Ulrich Weigand f4ceef8d3f [Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf
When hoisting common code from the "then" and "else" branches of a condition
to before the "if", the HoistThenElseCodeToIf routine will attempt to merge
the debug location associated with the two original copies of the hoisted
instruction.

This is a problem in the special case where the hoisted instruction is a
debug info intrinsic, since for those the debug location is considered
part of the intrinsic and attempting to modify it may resut in invalid
IR.  This is the underlying cause of PR36410.

This patch fixes the problem by handling debug info intrinsics specially:
instead of hoisting one copy and merging the two locations, the code now
simply hoists both copies, each with its original location intact.  Note
that this is still only done in the case where both original copies are
otherwise (i.e. apart from location metadata) identical.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D44312

llvm-svn: 327622
2018-03-15 12:28:48 +00:00
Brock Wyma 3cc5710cec [CodeView] Initial support for emitting S_BLOCK32 symbols for lexical scopes
This patch sorts local variables by lexical scope and emits them inside
an appropriate S_BLOCK32 CodeView symbol.

Differential Revision: https://reviews.llvm.org/D42926

llvm-svn: 327620
2018-03-15 11:52:17 +00:00
Fedor Sergeev 194a407bda [New PM][IRCE] port of Inductive Range Check Elimination pass to the new pass manager
There are two nontrivial details here:
* Loop structure update interface is quite different with new pass manager,
  so the code to add new loops was factored out

* BranchProbabilityInfo is not a loop analysis, so it can not be just getResult'ed from
  within the loop pass. It cant even be queried through getCachedResult as LoopCanonicalization
  sequence (e.g. LoopSimplify) might invalidate BPI results.

  Complete solution for BPI will likely take some time to discuss and figure out,
  so for now this was partially solved by making BPI optional in IRCE
  (skipping a couple of profitability checks if it is absent).

Most of the IRCE tests got their corresponding new-pass-manager variant enabled.
Only two of them depend on BPI, both marked with TODO, to be turned on when BPI
starts being available for loop passes.

Reviewers: chandlerc, mkazantsev, sanjoy, asbirlea
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D43795

llvm-svn: 327619
2018-03-15 11:01:19 +00:00
Andrei Elovikov f9b8035f3c [LoopUnroll] Ignore ephemeral values when checking full unroll profitability.
Summary:
Before this patch call graph is like this in the LoopUnrollPass:

  tryToUnrollLoop
    ApproximateLoopSize
      collectEphemeralValues
      /* Use collected ephemeral values */
    computeUnrollCount
      analyzeLoopUnrollCost
        /* Bail out from the analysis if loop contains CallInst */

This patch moves collection of the ephemeral values to the tryToUnrollLoop
function and passes the collected values into both ApproximateLoopsize (as
before) and additionally starts using them in analyzeLoopUnrollCost:

  tryToUnrollLoop
    collectEphemeralValues
    ApproximateLoopSize(EphValues)
      /* Use EphValues */
    computeUnrollCount(EphValues)
      analyzeLoopUnrollCost(EphValues)
        /* Ignore ephemeral values - they don't contribute to the final cost */
        /* Bail out from the analysis if loop contains CallInst */

Reviewers: mzolotukhin, evstupac, sanjoy

Reviewed By: evstupac

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43931

llvm-svn: 327617
2018-03-15 09:59:15 +00:00
Max Kazantsev 4f9c7c5086 [SCEV][NFC] Remove TBB, FBB parameters from exit limit computations
Methods `computeExitLimitFromCondCached` and `computeExitLimitFromCondImpl` take
true and false branches as parameters and only use them for asserts and for identifying
whether true/false branch belongs to the loop (which can be done once earlier). This fact
complicates generalization of exit limit computation logic on guards because the guards
don't have blocks to which they go in case of failure explicitly.

The motivation of this patch is that currently this part of SCEV knows nothing about guards
and only works with explicit branches. As result, it fails to prove that a loop

  for (i = 0; i < 100; i++)
    guard(i < 10);

exits after 10th iteration, while in the equivalent example

  for (i = 0; i < 100; i++)
    if (i >= 10) break;

SCEV easily proves this fact. We are going to change it in near future, and this is why
we need to make these methods operate on more abstract level.

This patch refactors this code to get rid of these parameters as meaningless and prepare
ground for teaching these methods to work with guards as well as they work with explicit
branching instructions.

Differential Revision: https://reviews.llvm.org/D44419

llvm-svn: 327615
2018-03-15 09:38:00 +00:00
Craig Topper ff6e82c9d0 [X86] Add test cases for 512-bit addsub from build_vector.
There is no 512 bit addsub instruction, but we partially match it handle fmaddsub matching. We explicitly bail out for 512 bit vectors after failing the fmaddsub match, but we had no test coverage for that bail out.

We might want to consider splitting and using 256 bit instructions instead of the long sequence seen here.

llvm-svn: 327605
2018-03-15 06:49:01 +00:00
Craig Topper 26a3a80c87 [X86] Add support for matching FMSUBADD from build_vector.
llvm-svn: 327604
2018-03-15 06:14:55 +00:00
Craig Topper a5e712f402 [X86] Remove old TODO. We have coverage for this now.
Coverage was added in r320950.

llvm-svn: 327603
2018-03-15 06:14:53 +00:00
Craig Topper b9526e9fdb [X86] Use MVT in a couple places where we know the type is legal.
llvm-svn: 327602
2018-03-15 06:14:51 +00:00
Aaron Smith 40198f5905 [DebugInfo] Add a new method IPDBSession::findLineNumbersBySectOffset
Summary:
Some PDB symbols do not have a valid VA or RVA but have Addr by Section and Offset. For example, a variable in thread-local storage has the following properties:

     get_addressOffset: 0
     get_addressSection: 5
     get_lexicalParentId: 2
     get_name: g_tls
     get_symIndexId: 12
     get_typeId: 4
     get_dataKind: 6
     get_symTag: 7
     get_locationType: 2

This change provides a new method to locate line numbers by Section and Offset from those symbols.

Reviewers: zturner, rnk, llvm-commits

Subscribers: asmith, JDevlieghere

Differential Revision: https://reviews.llvm.org/D44407

llvm-svn: 327601
2018-03-15 06:04:51 +00:00
Lei Huang 1f8da3ae19 [PowerPC][NFC] formatting-only fix
llvm-svn: 327599
2018-03-15 03:06:44 +00:00
George Burgess IV cedfa6da81 Remove unused variable; NFC
llvm-svn: 327597
2018-03-15 02:58:36 +00:00
Lang Hames 5721ee48a2 [ORC] Re-apply r327566 with a fix for test-global-ctors.ll.
Also clang-formats the patch, which I should have done the first time around.

llvm-svn: 327594
2018-03-15 00:30:14 +00:00
Matt Davis 9407bb5f54 [CleanUp] Remove NumInstructions field from LoopVectorizer's RegisterUsage struct.
Summary:
This variable is largely going unused; aside from reporting number of instructions for in DEBUG builds.

The only use of NumInstructions is in debug output to represent the LoopSize.  That value can be can be misleading as it also includes metadata instructions (e.g., DBG_VALUE) which have no real impact.  If we do choose to keep this around, we probably should guard it by a DEBUG macro, as it's not used in production builds.



Reviewers: majnemer, congh, rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D44495

llvm-svn: 327589
2018-03-14 23:30:31 +00:00
Simon Pilgrim 48fbf0c69a [X86][Btver2] Add support for multiple pipelines stages for fpu schedules. NFCI.
This allows us to use JWriteResFpuPair for complex schedule classes as well as single pipe instructions.

llvm-svn: 327588
2018-03-14 23:12:09 +00:00
Sanjay Patel e4e3f79b83 [InstSimplify] add tests for frem and vectors with undef; NFC
These should all be folded. The vector tests need to have
m_AnyZero updated to ignore undef elements, but we need to
be careful not to return the existing value in that case
and unintentionally propagate undef.

llvm-svn: 327585
2018-03-14 22:45:58 +00:00
Mark Searles c3c02bde73 [AMDGPU] Waitcnt pass: Modify the waitcnt pass to propagate info in the case of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so it's score bracket is needed.
Differential Revision: https://reviews.llvm.org/D44434

llvm-svn: 327583
2018-03-14 22:04:32 +00:00
Simon Pilgrim dfeebdbed7 [X86][Btver2] Add ResourceCycles and NumMicroOps overrides to scalar instructions. NFCI.
Currently still use default values - this is setup for a future patch.

llvm-svn: 327582
2018-03-14 21:55:54 +00:00
Reid Kleckner 3a7a2e4a0a [FastISel] Sink local value materializations to first use
Summary:
Local values are constants, global addresses, and stack addresses that
can't be folded into the instruction that uses them. For example, when
storing the address of a global variable into memory, we need to
materialize that address into a register.

FastISel doesn't want to materialize any given local value more than
once, so it generates all local value materialization code at
EmitStartPt, which always dominates the current insertion point. This
allows it to maintain a map of local value registers, and it knows that
the local value area will always dominate the current insertion point.

The downside is that local value instructions are always emitted without
a source location. This is done to prevent jumpy line tables, but it
means that the local value area will be considered part of the previous
statement. Consider this C code:
  call1();      // line 1
  ++global;     // line 2
  ++global;     // line 3
  call2(&global, &local); // line 4

Today we end up with assembly and line tables like this:
  .loc 1 1
  callq call1
  leaq global(%rip), %rdi
  leaq local(%rsp), %rsi
  .loc 1 2
  addq $1, global(%rip)
  .loc 1 3
  addq $1, global(%rip)
  .loc 1 4
  callq call2

The LEA instructions in the local value area have no source location and
are treated as being on line 1. Stepping through the code in a debugger
and correlating it with the assembly won't make much sense, because
these materializations are only required for line 4.

This is actually problematic for the VS debugger "set next statement"
feature, which effectively assumes that there are no registers live
across statement boundaries. By sinking the local value code into the
statement and fixing up the source location, we can make that feature
work. This was filed as https://bugs.llvm.org/show_bug.cgi?id=35975 and
https://crbug.com/793819.

This change is obviously not enough to make this feature work reliably
in all cases, but I felt that it was worth doing anyway because it
usually generates smaller, more comprehensible -O0 code. I measured a
0.12% regression in code generation time with LLC on the sqlite3
amalgamation, so I think this is worth doing.

There are some special cases worth calling out in the commit message:
1. local values materialized for phis
2. local values used by no-op casts
3. dead local value code

Local values can be materialized for phis, and this does not show up as
a vreg use in MachineRegisterInfo. In this case, if there are no other
uses, this patch sinks the value to the first terminator, EH label, or
the end of the BB if nothing else exists.

Local values may also be used by no-op casts, which adds the register to
the RegFixups table. Without reversing the RegFixups map direction, we
don't have enough information to sink these instructions.

Lastly, if the local value register has no other uses, we can delete it.
This comes up when fastisel tries two instruction selection approaches
and the first materializes the value but fails and the second succeeds
without using the local value.

Reviewers: aprantl, dblaikie, qcolombet, MatzeB, vsk, echristo

Subscribers: dotdash, chandlerc, hans, sdardis, amccarth, javed.absar, zturner, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D43093

llvm-svn: 327581
2018-03-14 21:54:21 +00:00
Francis Visoiu Mistrih e85b06d65f [CodeGen] Use MIR syntax for MachineMemOperand printing
Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)".

rdar://38163529

Differential Revision: https://reviews.llvm.org/D42377

llvm-svn: 327580
2018-03-14 21:52:13 +00:00
Philip Reames 0adbb19409 [EarlyCSE] Exploit open ended invariant.start scopes
If we have an invariant.start with no corresponding invariant.end, then the memory location becomes invariant indefinitely after the invariant.start. As a result, anything dominated by the start is guaranteed to see the value the memory location had when the invariant.start executed.

This patch adds an AvailableInvariants table which tracks the generation a particular memory location became invariant and then uses that information to allow value forwarding that would otherwise be disallowed by potentially aliasing stores. (Reminder: In EarlyCSE everything clobbers everything by default.)

This should be compatible with the MemorySSA variant, but design is generational. We can and should add first class support for invariant.start within MemorySSA at a later time.  I took a quick look at doing so, but probably need some input from a MemorySSA expert.

Differential Revision: https://reviews.llvm.org/D43716

llvm-svn: 327577
2018-03-14 21:35:06 +00:00
Reid Kleckner c7fd1540b3 Revert "[ORC] Switch from shared_ptr to unique_ptr for addModule methods."
This reverts commit r327566, it breaks
test/ExecutionEngine/OrcMCJIT/test-global-ctors.ll.

The test doesn't crash with a stack trace, unfortunately. It merely
returns 1 as the exit code.

ASan didn't produce a report, and I reproduced this on my Linux machine
and Windows box.

llvm-svn: 327576
2018-03-14 21:32:34 +00:00
Sanjay Patel 11f7f9908b [InstSimplify] fix folds for (0.0 - X) + X --> 0 (PR27151)
As shown in:
https://bugs.llvm.org/show_bug.cgi?id=27151
...the existing fold could miscompile when X is NaN.

The fold was also dependent on 'ninf' but that's not necessary.

From IEEE-754 (with default rounding which we can assume for these opcodes):
"When the sum of two operands with opposite signs (or the difference of two 
operands with like signs) is exactly zero, the sign of that sum (or difference) 
shall be +0...However, x + x = x − (−x) retains the same sign as x even when 
x is zero."

llvm-svn: 327575
2018-03-14 21:23:27 +00:00
Simon Pilgrim adf72e8549 [X86] Add haswell testing for PR35635 as well.
To improve complete model testing for schedulers for instructions with multiple results.

llvm-svn: 327572
2018-03-14 21:03:09 +00:00
Francis Visoiu Mistrih 164560bd74 [AArch64] Emit CSR loads in the same order as stores
Optionally allow the order of restoring the callee-saved registers in the
epilogue to be reversed.

The flag -reverse-csr-restore-seq generates the following code:

```
stp     x26, x25, [sp, #-64]!
stp     x24, x23, [sp, #16]
stp     x22, x21, [sp, #32]
stp     x20, x19, [sp, #48]

; [..]

ldp     x24, x23, [sp, #16]
ldp     x22, x21, [sp, #32]
ldp     x20, x19, [sp, #48]
ldp     x26, x25, [sp], #64
ret
```

Note how the CSRs are restored in the same order as they are saved.

One exception to this rule is the last `ldp`, which allows us to merge
the stack adjustment and the ldp into a post-index ldp. This is done by
first generating:

ldp x26, x27, [sp]
add sp, sp, #64

which gets merged by the arm64 load store optimizer into

ldp x26, x25, [sp], #64

The flag is disabled by default.

llvm-svn: 327569
2018-03-14 20:34:03 +00:00
Lang Hames 7bea03c2bb [ORC] Switch from shared_ptr to unique_ptr for addModule methods.
Layer implementations typically mutate module state, and this is better
reflected by having layers own the Module they are operating on.

llvm-svn: 327566
2018-03-14 20:29:45 +00:00
Alexander Richardson 115b0673b6 [UpdateTestChecks] Handle IR variables with a '-' in the name
Summary:
I noticed that clang will emit variables such as %indirect-arg-temp when
running update_cc1_test_checks.py and therefore update_cc1_test_checks.py
wasn't adding FileCheck captures for those variables.

Reviewers: MaskRay

Reviewed By: MaskRay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44459

llvm-svn: 327564
2018-03-14 20:28:53 +00:00
Reid Kleckner 5a5ac65768 [MC] Always emit relocations for same-section function references
Summary:
We already emit relocations in this case when the "incremental linker
compatible" flag is set, but it turns out these relocations are also
required for /guard:cf. Now that we have two use cases for this
behavior, let's make it unconditional to try to keep things simple.

We never hit this problem in Clang because it always sets the
"incremental linker compatible" flag when targeting MSVC. However, LLD
LTO doesn't set this flag, so we'd get CFG failures at runtime when
using ThinLTO and /guard:cf. We probably don't want LLD LTO to set the
"incremental linker compatible" assembler flag, since this has nothing
to do with incremental linking, and we don't need to timestamp LTO
temporary objects.

Fixes PR36624.

Reviewers: inglorion, espindola, majnemer

Subscribers: mehdi_amini, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D44485

llvm-svn: 327557
2018-03-14 19:24:32 +00:00
Sanjay Patel fd82fd000c [InstSimplify] add tests to show missing/broken fadd folds (PR27151, PR26958); NFC
llvm-svn: 327554
2018-03-14 18:52:40 +00:00
Sanjay Patel e011d7964d [InstSimplify] regenerate checks; NFC
llvm-svn: 327553
2018-03-14 18:49:57 +00:00
Reid Kleckner e66458a841 [LLVM-C] [bindings/go] Add C and Golang bindings for COMDAT
Patch by Ben Clayton

Differential Revision: https://reviews.llvm.org/D44086

llvm-svn: 327551
2018-03-14 18:33:53 +00:00
Roman Lebedev 60d24445dd [InstSimplify] [NFC] cast-unsigned-icmp-cmp-0.ll - don't run instcombine
As disscussed in post-commit review of D44421, there is simply
no reason to run instcombine on this testcase.

llvm-svn: 327541
2018-03-14 17:59:12 +00:00
Craig Topper 9c098ed819 [X86] Add back fast-isel code for handling i8 shifts.
I removed this in r316797 because the coverage report showed no coverage and I thought it should have been handled by the auto generated table. I now see that there is code that bypasses the table if the shift amount is out of bounds.

This adds back the code. We'll codegen out of bounds i8 shifts to effectively (amount & 0x1f). The 0x1f is a strange quirk of x86 that shift amounts are always masked to 5-bits(except 64-bits). So if the masked value is still out bounds the result will be 0.

Fixes PR36731.

llvm-svn: 327540
2018-03-14 17:57:19 +00:00
Fangrui Song 56fb2b2f20 Fix LLVM IR check lines in utils/update_cc_test_checks.py
Reviewers: arichardson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44400

llvm-svn: 327538
2018-03-14 17:47:07 +00:00
Roman Lebedev 978aae7614 [InstSimplify] [NFC] Add tests for peeking through unsigned FP casts for sign compares (PR36682)
Summary:
This pattern came up in PR36682 / D44390
https://bugs.llvm.org/show_bug.cgi?id=36682
https://reviews.llvm.org/D44390
https://godbolt.org/g/oKvT5H

Looking at the IR pattern in question, as per [[ https://github.com/rutgers-apl/alive-nj | alive-nj ]], for all the type combinations i checked
(input: `i16`, `i32`, `i64`; intermediate: `half`/`i16`, `float`/`i32`, `double`/`i64`)
for the following `icmp` comparisons the `uitofp`+`bitcast`+`icmp` can be evaluated to a boolean:
* `slt 0`
* `sgt -1`
I did not check vectors, but i'm guessing it's the same there.
{F5889242}

Thus all these cases are in the testcase (along with the vector variant with additional `undef` element in the middle).
There are no negative patterns here (unless alive-nj lied/is broken), all of these should be optimized.

Reviewers: spatel, majnemer, efriedma, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D44421

llvm-svn: 327535
2018-03-14 17:31:08 +00:00
Roman Lebedev 6ab60358ca [InstCombine] [NFC] Add tests for peeking through unsigned FP casts for zero-equality compares (PR36682)
Summary:
This pattern came up in PR36682 / D44390
https://bugs.llvm.org/show_bug.cgi?id=36682
https://reviews.llvm.org/D44390
https://godbolt.org/g/oKvT5H

Looking at the IR pattern in question, as per [[ https://github.com/rutgers-apl/alive-nj | alive-nj ]], for all the type combinations i checked
(input: `i16`, `i32`, `i64`; intermediate: `half`/`i16`, `float`/`i32`, `double`/`i64`)
for the following `icmp` comparisons the `uitofp`+`bitcast` can be dropped:
* `eq 0`
* `ne 0`
I did not check vectors, but i'm guessing it's the same there.
{F5889189}

Thus all these cases are in the testcase (along with the vector variant with additional `undef` element in the middle).
There are no negative patterns here (unless alive-nj lied/is broken), all of these should be optimized.

Generated with
{F5889196}

Reviewers: spatel, majnemer, efriedma, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D44416

llvm-svn: 327534
2018-03-14 17:31:03 +00:00
Francis Visoiu Mistrih 084e7d8770 [AArch64] Keep track of MIFlags in the LoadStoreOptimizer
Merging:

* $x26, $x25 = frame-setup LDPXi $sp, 0
* $sp = frame-destroy ADDXri $sp, 64, 0

into an LDPXpost should preserve the flags from both instructions as
following:

* frame-setup frame-destroy LDPXpost

Differential Revision: https://reviews.llvm.org/D44446

llvm-svn: 327533
2018-03-14 17:10:58 +00:00
Craig Topper b36cb20ef9 [X86] Teach X86TargetLowering::targetShrinkDemandedConstant to set non-demanded bits if it helps created an and mask that can be matched as a zero extend.
I had to modify the bswap recognition to allow unshrunk masks to make this work.

Fixes PR36689.

Differential Revision: https://reviews.llvm.org/D44442

llvm-svn: 327530
2018-03-14 16:55:15 +00:00
Nicholas Wilson 48d6dbe3cb [WebAssembly] Add DenseMap traits and operator== for Wasm type structs
Differential Revision: https://reviews.llvm.org/D44303

llvm-svn: 327526
2018-03-14 15:58:03 +00:00
Simon Pilgrim d1c3c995c0 [X86][AVX] Use WriteFShuffleLd for broadcast reg-mem instructions
They shouldn't be treated as pure loads.

Found while investigating D44428

llvm-svn: 327524
2018-03-14 15:47:08 +00:00
Nicholas Wilson 027b9357a8 [WebAssembly] Identify COMDATs by index rather than string. NFC
This will enable an optimisation in LLD.

Differential Revision: https://reviews.llvm.org/D44343

llvm-svn: 327522
2018-03-14 15:44:45 +00:00
Arnold Schwaighofer bf1638daa8 SjLjEHPrepare: Don't reg-to-mem swifterror values
swifterror llvm values model the swifterror register as memory at the
LLVM IR level. ISel will perform adhoc mem-to-reg on them. swifterror
values are constraint in how they can be used. Spilling them to memory
is not allowed.

SjLjEHPrepare tried to lower swifterror values to memory which is
unecessary since the back-end will spill and reload the register as
neccessary (as long as clobbering calls are marked as such which is the
case here) and further leads to invalid IR because swifterror values
can't be stored to memory.

rdar://38164004

llvm-svn: 327521
2018-03-14 15:44:07 +00:00
Alexander Ivchenko 86ef9ab28f [GlobalIsel][X86] Support for G_SDIV instruction
Reviewed By: igorb

Differential Revision: https://reviews.llvm.org/D44430

llvm-svn: 327520
2018-03-14 15:41:11 +00:00
Sanjay Patel 5773ac3ee8 [CodeGen] allow printing of zero latency in sched comments
I don't know how to expose this in a test. There are ARM / AArch64 
sched classes that include zero latency instructions, but I'm not 
seeing sched info printed for those targets. X86 will almost 
certainly have these soon (see PR36671), but no model has
'let Latency = 0' currently.

llvm-svn: 327518
2018-03-14 15:28:48 +00:00
Andrea Di Biagio 36e34a99c7 [llvm-mca] Remove unused variable from InstrBuilder.cpp. NFC
This was causing a buildbot failure.

llvm-svn: 327517
2018-03-14 15:19:47 +00:00
Andrea Di Biagio 4732d43cae [llvm-mca] Move the logic that updates the register files from InstrBuilder to DispatchUnit. NFCI
Before this patch, the register file was always updated at instruction creation
time. That means, new read-after-write dependencies, and new temporary registers
were allocated at instruction creation time.

This patch refactors the code in InstrBuilder, and move all the logic that
updates the register file into the dispatch unit. We only want to update the
register file when instructions are effectively dispatched (not before).

This refactoring also helps removing a bad dependency between the InstrBuilder
and the DispatchUnit.

No functional change intended.

llvm-svn: 327514
2018-03-14 14:57:23 +00:00