Commit Graph

107048 Commits

Author SHA1 Message Date
Dylan McKay 8dd702c1cd [AVR] Implement LPMWRdZ pseudo-instruction's expansion.
FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should
refactor a bit and unify the two

Patch by Gerdo Erdi.

llvm-svn: 314898
2017-10-04 10:37:22 +00:00
Dylan McKay 3f71f1c91e [AVR] Factor out mayLoad in tablegen patterns
Patch by Gergo Erdi.

llvm-svn: 314897
2017-10-04 10:36:07 +00:00
Dylan McKay d00f9c1ef1 [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`
Patch by Gergo Erdi.

llvm-svn: 314896
2017-10-04 10:33:36 +00:00
Dylan McKay 39069208d5 [AVR] Insert JMP for long branches
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.

llvm-svn: 314891
2017-10-04 09:51:28 +00:00
Dylan McKay c4b002bf5a [AVR] Fix displacement overflow for LDDW/STDW
In some cases, the code generator attempts to generate instructions such as:

lddw r24, Y+63

which expands to:

ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary

This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.

Patch by Thomas Backman.

llvm-svn: 314890
2017-10-04 09:51:21 +00:00
Oliver Stannard 878216dd05 [ARM] Add diag string for movw/movt immediates in assembly
This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

llvm-svn: 314888
2017-10-04 09:24:54 +00:00
Oliver Stannard 5a7aae3a80 [ARM, Asm] Change grammar of immediate operand diagnostics
Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide an register
  operand, and get a message taking about an operand is if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

llvm-svn: 314887
2017-10-04 09:18:07 +00:00
Jatin Bhateja 3c29bacd43 [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886
2017-10-04 09:02:10 +00:00
George Rimar 099960d322 [MC] - Don't assert when non-english characters are used.
I found that llvm-mc does not like non-english characters even in comments,
which it tries to tokenize.

Problem happens because of functions like isdigit(), isalnum() which takes
int argument and expects it is not negative.
But at the same time MCParser uses char* to store input buffer poiner, char has signed value,
so it is possible to pass negative value to one of functions from above and
that triggers an assert. 
Testcase for demonstration is provided.

To fix the issue helper functions were introduced in StringExtras.h

Differential revision: https://reviews.llvm.org/D38461

llvm-svn: 314883
2017-10-04 08:50:08 +00:00
Mikael Holmen a1a3f5c5e6 Recommit [UnreachableBlockElim] Use COPY if PHI input is undef
This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314882
2017-10-04 07:42:45 +00:00
Max Kazantsev 8aacef6cae [IRCE] Temporarily disable unsigned latch conditions by default
We have found some corner cases connected to range intersection where IRCE makes
a bad thing when the latch condition is unsigned. The fix for that will go as a follow up.
This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed.

The unsigned latch conditions were introduced to IRCE by rL310027.

Differential Revision: https://reviews.llvm.org/D38529

llvm-svn: 314881
2017-10-04 06:53:22 +00:00
Mikael Holmen 75b1992f78 Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"
Build-bots broke on the new testcase. I'll investigate and fix.

llvm-svn: 314880
2017-10-04 06:39:22 +00:00
Mikael Holmen 65eb2f394c [UnreachableBlockElim] Use COPY if PHI input is undef
Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314879
2017-10-04 06:06:31 +00:00
Martin Storsjo e14145dcb0 [X86] Fix using the SJLJ jump table on x86_64
The previous version didn't work if the jump table base address didn't
fit in 32 bit, since it was encoded as an immediate offset. And in case
the jump table is encoded as 32 bit label differences, we need to
load and add them to the table base first.

This solves the first half of the issues mentioned in PR34720.

Also fix some of the errors pointed out by -verify-machineinstrs, by
using GR32_NOSPRegClass.

Differential Revision: https://reviews.llvm.org/D38333

llvm-svn: 314876
2017-10-04 05:12:10 +00:00
Adam Nemet f31b1f310c Move verbosity check for remarks to the diag handler
Test needs some slight adjustment because we no longer check the existence of
BFI but rather that the actual hotness is set on the remark.  If entry_count
is not set getBlockProfileCount returns None.

llvm-svn: 314874
2017-10-04 04:26:23 +00:00
Tim Shen 83fd6a1243 [FuzzerUtil] Partially revert D38481 on FuzzerUtil
This is because lib/Fuzzer doesn't really depend on llvm infrastucture.
It's not easy to access the llvm hardware_concurrency here.

Differential Reivision: https://reviews.llvm.org/D38481

llvm-svn: 314870
2017-10-04 01:05:34 +00:00
Rui Ueyama 15b8327963 Simplify multikey_qsort function.
This function implements the three-way radix quicksort algorithm.
This patch simplifies the implementation by using MutableArrayRef.

llvm-svn: 314858
2017-10-03 23:12:01 +00:00
Balaram Makam e0c43152b5 [AArch64] Use LateSimplifyCFG after expanding atomic operations.
Summary:
After r308422 we defer optimizations that can destroy loop canonical forms to
LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations
can exploit more control-flow opportunities.

Reviewers: mcrosier, t.p.northover, efriedma

Reviewed By: efriedma

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38262

llvm-svn: 314857
2017-10-03 22:39:24 +00:00
Konstantin Zhuravlyov 22bc039c89 AMDGPU: Expand setcc for v2f32 and v4f32
llvm-svn: 314853
2017-10-03 21:45:01 +00:00
Konstantin Zhuravlyov 908fa90b51 AMDGPU: Expand setcc for v2i32 and v4i32
llvm-svn: 314852
2017-10-03 21:31:24 +00:00
Konstantin Zhuravlyov 0aa94d314c AMDGPU: Add ELFOSABI_AMDGPU_MESA3D
Differential Revision: https://reviews.llvm.org/D38387

llvm-svn: 314846
2017-10-03 21:14:14 +00:00
Reid Kleckner 33cbbbc62f [X86] Remove dead declaration convertArgMovsToPushes, NFC
This was dead when it landed in r252578. We have this functionality, if
not for stack probe calls, but for regular calls in
X86CallFrameOptimization.cpp.

llvm-svn: 314845
2017-10-03 21:12:18 +00:00
Rafael Espindola 476a7f9293 Pre-compute the tail of the archive
An archive looks like

<header>
<symbol table>
<tail>

The symbol table refers to offsets in the tail. A complication is that
we would like to support symbol tables that use 64 bit offsets if it
turns out that any of the offsets is too big.

This patch changes the archive writer to first compute the tail. We
cannot just compute one big StringRef since that would require reading
every member upfront, but we can represent it as a series of
StringRefs.

Having done that it is much easier to compute the symbol table and all
offsets are computed before it is written. With this if there is an
accounting problem it will show up with a regular symbol table, not
just when a 64 bit one is needed.

llvm-svn: 314844
2017-10-03 20:59:43 +00:00
Konstantin Zhuravlyov a952b44ed5 AMDGPU: Add ELFOSABI_AMDGPU_PAL
llvm-svn: 314843
2017-10-03 20:54:07 +00:00
Reid Kleckner bc66947433 Refactor DIBuilder dbg intrinsic insertion, NFC
Both dbg.declare and dbg.value insertion had duplicate code for the two
overloads with different insertion point conventions.

llvm-svn: 314839
2017-10-03 20:36:40 +00:00
Jessica Paquette acc15e1265 [MachineOutliner] Fix off-by-one in cost model
This commit does two things. Firstly, it cleans up some of the benefit
calculation wrt outlined functions and candidates. Secondly, it fixes an
off-by-one bug in the cost model which was caused by the benefit value of
an OutlinedFunction and Candidate differing by 1. It updates the remarks test
to reflect this change.

llvm-svn: 314836
2017-10-03 20:32:55 +00:00
Stefan Pintilie e1d7547237 [PowerPC] Revert P9 scheduling model to incomplete
Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026
The previous change caused regressions on Power 9.

llvm-svn: 314835
2017-10-03 20:27:30 +00:00
Craig Topper df63b96811 [InstCombine] Use isSignBitCheck to simplify an if statement. Directly create new sign bit compares instead of manipulating the constant. NFCI
Since we no longer had the direct constant compares, manipulating the constant seemeded less clear.

llvm-svn: 314830
2017-10-03 19:14:23 +00:00
Tim Renouf 72800f0436 [AMDGPU] implemented pal metadata
Summary:
For the amdpal OS type:

We write an AMDGPU_PAL_METADATA record in the .note section in the ELF
(or as an assembler directive). It contains key=value pairs of 32 bit
ints. It is a merge of metadata from codegen of the shaders, and
metadata provided by the frontend as _amdgpu_pal_metadata IR metadata.
Where both sources have a key=value with the same key, the two values
are ORed together.

This .note record is part of the amdpal ABI and will be documented in
docs/AMDGPUUsage.rst in a future commit.

Eventually the amdpal OS type will stop generating the .AMDGPU.config
section once the frontend has safely moved over to using the .note
records above instead of .AMDGPU.config.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37753

llvm-svn: 314829
2017-10-03 19:03:52 +00:00
Alexander Timofeev 4651396584 [AMDGPU] Avoid predicated execution of the basic blocks containing scalar
instructions.

Differential revision: https://reviews.llvm.org/D38293

llvm-svn: 314828
2017-10-03 18:55:36 +00:00
Hans Wennborg dc8d6f2527 Fix -Wcovered-switch-default warnings from r314821
llvm-svn: 314826
2017-10-03 18:44:12 +00:00
Hans Wennborg ab2177edf7 Revert r314817 "[dwarfdump] Add -lookup option"
The test fails on Linux; see follow-up email on the llvm-commits list.

> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409

This also reverts the follow-up r314818:

> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test

llvm-svn: 314825
2017-10-03 18:39:13 +00:00
Hans Wennborg 9a9048e19f Revert r314806 "[SLP] Vectorize jumbled memory loads."
All the buildbots are red, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
>
> Reviewed By: Ayal
>
> Subscribers: hans, mzolotukhin
>
> Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314824
2017-10-03 18:32:29 +00:00
Reid Kleckner b4569de739 Implement David Blaikie's suggestion for comparison operators
llvm-svn: 314822
2017-10-03 18:30:11 +00:00
Hans Wennborg 660531085a CodeView: Provide a .def file with the register ids
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.

X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.

Differential revision: https://reviews.llvm.org/D38480

llvm-svn: 314821
2017-10-03 18:27:22 +00:00
Reid Kleckner 04e25e00b7 [DebugInfo] Correctly coalesce DBG_VALUEs that mix direct and indirect values
Summary:
This should fix a regression introduced by r313786, which switched from
MachineInstr::isIndirectDebugValue() to checking if operand 1 is an
immediate. I didn't have a test case for it until now.

A single UserValue, which approximates a user variable, may have many
DBG_VALUE instructions that disagree about whether the variable is in
memory or in a virtual register. This will become much more common once
we have llvm.dbg.addr, but you can construct such a test case manually
today with llvm.dbg.value.

Before this change, we would get two UserValues: one for direct and one
for indirect DBG_VALUE instructions describing the same variable. If we
build separate interval maps for direct and indirect locations, we will
end up accidentally coalescing identical DBG_VALUE intervals that need
to remain separate because they are broken up by intervals of the
opposite direct-ness.

Reviewers: aprantl

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37932

llvm-svn: 314819
2017-10-03 17:59:02 +00:00
Jonas Devlieghere f998c501b6 [dwarfdump] Add -lookup option
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 314817
2017-10-03 17:10:21 +00:00
Geoff Berry fabedbad11 Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.

Another bug has been encountered in an out-of-tree target reported by Quentin.

llvm-svn: 314814
2017-10-03 16:59:13 +00:00
Rafael Espindola 6e182fbab4 Use sched_getaffinity instead of std:🧵:hardware_concurrency.
The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.

llvm-svn: 314809
2017-10-03 16:25:15 +00:00
Dehao Chen ea523ddb1b Revert the change that accidentally went in r314806.
llvm-svn: 314807
2017-10-03 15:50:42 +00:00
Mohammad Shahid 1d5422f27f [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: hans, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

llvm-svn: 314806
2017-10-03 15:28:48 +00:00
Oliver Stannard 0d5c792223 [ARM] Use table-gen'd assembly operand diags in ARM asm parser
This switches the ARM AsmParser to use assembly operand diagnostics from
tablegen, rather than a switch statement on the ARMMatchResultTy. It
moves the existing diagnostic strings to tablegen, but adds no new ones,
so this is NFC except for one diagnostic string that had an off-by-1 error
in the hand-written switch statement.

Differential revision: https://reviews.llvm.org/D31607

llvm-svn: 314804
2017-10-03 14:38:52 +00:00
Oliver Stannard 55114fd9f0 [ARM, Asm] Use correct source location for register tokens
tryParseRegister advances the lexer, so we need to take copies of the start and
end locations of the register operand before calling it.

Previously, the caret in the diagnostic pointer to the comma after the r0
operand in the test, rather than the start of the operand.

Differential revision: https://reviews.llvm.org/D31537

llvm-svn: 314799
2017-10-03 14:30:58 +00:00
Simon Dardis 055192ccd3 [mips] Enable spilling and reloading of the dsp register set.
The dsp register class is an alias of the gpr register class, so
we have to define instructions for spilling and reloading.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D38038

llvm-svn: 314798
2017-10-03 13:45:49 +00:00
John Brawn 736bf0020f [CGP] Make optimizeMemoryInst capable of handling multiple AddrModes
Currently optimizeMemoryInst requires that all of the AddrModes it sees are
identical. This patch makes it capable of tracking multiple AddrModes, so long
as they differ in at most one field.

This patch does nothing by itself, but later patches will make use of it to
insert or reuse phi or select instructions for the differing fields.

Differential Revision: https://reviews.llvm.org/D38278

llvm-svn: 314795
2017-10-03 13:08:22 +00:00
John Brawn eb83c7554e [CGP] In optimizeMemoryInst handle select similarly to phi
This lets us optimize away selects that perform the same address computation in
two different ways and is also the first step towards being able to handle
selects between two different, but compatible, address computations.

Differential Revision: https://reviews.llvm.org/D38242

llvm-svn: 314794
2017-10-03 13:04:15 +00:00
Oliver Stannard 68aa7de517 [ARM, Asm] Fix ubsan failure caused by out-of-range enum value
In this code, we use ~0U as a sentinel value for any operand class that doesn't
have a user-friendly error message, but this value isn't in range of the
MatchClassKind enum, so we need to ensure it does not get passed to isSubclass.

llvm-svn: 314793
2017-10-03 12:45:18 +00:00
Simon Pilgrim cf99d069c3 [X86][SSE] Add support for decoding PACKSS/PACKUS shuffles masks with UNDEF
llvm-svn: 314792
2017-10-03 12:41:39 +00:00
Oliver Stannard 5daee987fd [ARM, Asm] Remove dead code causing MSan failure.
r314779 caused ErrorInfo to be red uninitialised, but also made this code dead,
so it can just be removed.

llvm-svn: 314791
2017-10-03 12:28:28 +00:00
Simon Pilgrim f5f291d129 [X86][SSE] Add support for lowering shuffles to PACKSS/PACKUS
If the upper bits of a truncation shuffle patterns have at least the minimum number of sign/zero bits on their inputs then we can safely use PACKSS/PACKUS as shuffles.

Partial fix for https://bugs.llvm.org/show_bug.cgi?id=34773

Differential Revision: https://reviews.llvm.org/D38472

llvm-svn: 314788
2017-10-03 12:01:31 +00:00
Evgeny Astigeevich d3558b5d5e [InlineCost, NFC] Extract code dealing with inbounds GEPs from visitGetElementPtr into a function
The code responsible for analysis of inbounds GEPs is extracted into a separate
function: CallAnalyzer::canFoldInboundsGEP. With the patch SROA
enabling/disabling code is localized at one place instead of spreading across
the code of CallAnalyzer::visitGetElementPtr.

Differential Revision: https://reviews.llvm.org/D38233

llvm-svn: 314787
2017-10-03 12:00:40 +00:00
Sam Clegg b2b019f727 [WebAssembly] MC: Support for init_array and fini_array
Differential Revision: https://reviews.llvm.org/D37757

llvm-svn: 314783
2017-10-03 11:20:28 +00:00
Bjorn Pettersson 90cc1b53d0 [DebugInfo] Handle endianness when moving debug info for split integer values (reapplied)
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Original patch was reverted due to using -stop-after=isel in
the test case (but that is only working when AMDGPU target
is included in the llc build). The test case has now been
updated to use -stop-before=expand-isel-pseudos instead.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314781
2017-10-03 11:03:02 +00:00
Oliver Stannard e093bad472 [ARM] Use new assembler diags for ARM
This converts the ARM AsmParser to use the new assembly matcher error
reporting mechanism, which allows errors to be reported for multiple
instruction encodings when it is ambiguous which one the user intended
to use.

By itself this doesn't improve many error messages, because we don't have
diagnostic text for most operand types, but as we add that then this will allow
more of those diagnostic strings to be used when they are relevant.

Differential revision: https://reviews.llvm.org/D31530

llvm-svn: 314779
2017-10-03 10:26:11 +00:00
Simon Pilgrim d87af9a1c0 Remove unused variable. NFCI.
llvm-svn: 314778
2017-10-03 10:01:02 +00:00
Simon Pilgrim 640fbf5132 [X86][SSE] Add support for shuffle combining from PACKSS/PACKUS
Mentioned in D38472

llvm-svn: 314777
2017-10-03 09:54:03 +00:00
Simon Pilgrim 19d535e75b [X86][SSE] Add support for PACKSS/PACKUS constant folding
Pulled out of D38472

llvm-svn: 314776
2017-10-03 09:41:00 +00:00
Javed Absar e485b143ea [MiSched] - Simplify ProcResEntry access
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38447

llvm-svn: 314775
2017-10-03 09:35:04 +00:00
Sjoerd Meijer 7a22a4948f ISel type legalization: add debug messages. NFCI.
This adds some more debug messages to the type legalizer and functions
like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make
the debug messages a little bit more informative and useful.

Differential Revision: https://reviews.llvm.org/D38450

llvm-svn: 314773
2017-10-03 08:54:15 +00:00
Alex Bradbury 1fb1a480b5 [RISCV] Parse RISC-V eflags in ObjectYAML
Differential Revision: https://reviews.llvm.org/D38311
Patch by Chih-Mao Chen.

llvm-svn: 314770
2017-10-03 08:00:47 +00:00
Hiroshi Inoue 224661d94b [trivial] fix format, NFC
llvm-svn: 314769
2017-10-03 07:28:58 +00:00
Shoaib Meenai 4fefcd6dad [ObjectYAML] Handle SHF_COMPRESSED
This was previously being silently dropped by obj2yaml and caused
parsing errors with yaml2obj.

Differential Revision: https://reviews.llvm.org/D38490

llvm-svn: 314768
2017-10-03 06:35:55 +00:00
Martin Storsjo 1e54738676 [X86] Provide the LSDA pointer with RIP relative addressing if necessary
This makes sure the LSDA pointer isn't truncated to 32 bit.

Make LowerINTRINSIC_WO_CHAIN a member function instead of a static
function, so that it can use the getGlobalWrapperKind method.

This solves the second half of the issues mentioned in PR34720.

Differential Revision: https://reviews.llvm.org/D38343

llvm-svn: 314767
2017-10-03 06:29:58 +00:00
Mikael Holmen 6efe507e42 [Lint] Avoid failed assertion by fetching the proper pointer type
Summary:
When checking if a constant expression is a noop cast we fetched the
IntPtrType by doing DL->getIntPtrType(V->getType())). However, there can
be cases where V doesn't return a pointer, and then getIntPtrType()
triggers an assertion.

Now we pass DataLayout to isNoopCast so the method itself can determine
what the IntPtrType is.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D37894

llvm-svn: 314763
2017-10-03 06:03:49 +00:00
Craig Topper 8ed1aa91bd [InstCombine] Change a bunch of methods to take APInts by reference instead of pointer.
This allows us to remove a bunch of dereferences and only have a few dereferences at the call sites.

llvm-svn: 314762
2017-10-03 05:31:07 +00:00
Craig Topper 664c4d0190 [InstCombine] Replace an equality compare of two APInt pointers with a compare of the APInts themselves.
Apparently this works by virtue of the fact that the pointers are pointers to the APInts stored inside of the ConstantInt objects. But I really don't think we should be relying on that.

llvm-svn: 314761
2017-10-03 04:55:04 +00:00
Quentin Colombet c2f3cea608 [Legalizer] Add support for G_OR NarrowScalar.
Legalize bitwise OR:
 A = BinOp<Ty> B, C
into:
 B1, ..., BN = G_UNMERGE_VALUES B
 C1, ..., CN = G_UNMERGE_VALUES C
 A1 = BinOp<Ty/N> B1, C2
 ...
 AN = BinOp<Ty/N> BN, CN
 A = G_MERGE_VALUES A1, ..., AN

llvm-svn: 314760
2017-10-03 04:53:56 +00:00
Rui Ueyama bfd7515408 Rewrite a function so that it doesn't use pointers to pointers. NFC.
Previous code was a bit puzzling because of its use of pointers.
In this patch, we pass a vector and its offsets, instead of pointers to
vector elements.

llvm-svn: 314756
2017-10-03 03:09:05 +00:00
Peter Collingbourne d5fc37cc88 LTO: Improve error reporting when adding a cache entry.
Move error handling code next to the code that returns the error,
and change the error message in order to distinguish it from a similar
error message elsewhere in this file.

llvm-svn: 314745
2017-10-03 00:44:21 +00:00
Daniel Berlin 67fbb351e3 SparseSolver: Rename getOrInitValueState to getValueState, matching what SCCP calls it
llvm-svn: 314744
2017-10-03 00:26:21 +00:00
Matt Arsenault 90c7593a75 AMDGPU: Remove global isGCN predicates
These are problematic because they apply to everything,
and can easily clobber whatever more specific predicate
you are trying to add to a function.

Currently instructions use SubtargetPredicate/PredicateControl
to apply this to patterns applied to an instruction definition,
but not to free standing Pats. Add a wrapper around Pat
so the special PredicateControls requirements can be appended
to the final predicate list like how Mips does it.

llvm-svn: 314742
2017-10-03 00:06:41 +00:00
Haicheng Wu 25f6c196d7 [InstSimplify] teach SimplifySelectInst() to fold more vector selects
Call ConstantFoldSelectInstruction() to fold cases like below

select <2 x i1><i1 true, i1 false>, <2 x i8> <i8 0, i8 1>, <2 x i8> <i8 2, i8 3>

All operands are constants and the condition has mixed true and false conditions.

Differential Revision: https://reviews.llvm.org/D38369

llvm-svn: 314741
2017-10-02 23:43:52 +00:00
Davide Italiano c48d1c8519 [PassManager] Retire cl::opt that have been set for a while. NFCI.
llvm-svn: 314740
2017-10-02 23:39:20 +00:00
Tim Shen 59465d29f8 [PowerPC] Revert r314666.
See https://reviews.llvm.org/D38172.

I tried to XFAIL it, but sometimes XPASS triggers the bot. Simply
revert it.

llvm-svn: 314739
2017-10-02 23:20:06 +00:00
Daniel Berlin 4d825bcf09 Template the sparse propagation solver instead of using void pointers
Summary:
This avoids using void * as the type of the lattice value and ugly casts needed to make that happen.
(If folks want to use references, etc, they can use a reference_wrapper).

Reviewers: davide, mssimpso

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38476

llvm-svn: 314734
2017-10-02 22:49:49 +00:00
Geoff Berry bfc5fb4571 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
  in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
  explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

llvm-svn: 314729
2017-10-02 22:01:37 +00:00
Michael Liao 70356574bb Remove trailing whitespace to trigger re-cmaking
llvm-svn: 314728
2017-10-02 21:54:38 +00:00
Amjad Aboud 8ef85a088e [X86][NFC] Add X86CmovConverterPass to the pass registry.
Differential Revision: https://reviews.llvm.org/D38355

llvm-svn: 314726
2017-10-02 21:46:37 +00:00
Michael Liao c6004d0371 Remove dead file.
llvm-svn: 314720
2017-10-02 21:00:52 +00:00
Matt Arsenault c6baa85fc6 AMDGPU: Fix typos
llvm-svn: 314715
2017-10-02 20:31:18 +00:00
Walter Lee 35b09cbd42 Add support for Myriad ma2x8x series of CPUs
Summary: Also add support for some older Myriad CPUs that were missing.

Reviewers: jyknight

Subscribers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D37552

llvm-svn: 314705
2017-10-02 18:50:48 +00:00
Adrian Prantl a8b2ddbde4 Move the stripping of invalid debug info from the Verifier to AutoUpgrade.
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.

This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.

Differential Revision: https://reviews.llvm.org/D38184

llvm-svn: 314699
2017-10-02 18:31:29 +00:00
Sanjay Patel 6e17c00a88 [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
llvm-svn: 314698
2017-10-02 18:26:44 +00:00
Dehao Chen f464627f28 Update getMergedLocation to check the instruction type and merge properly.
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.

Reviewers: dblaikie, aprantl

Reviewed By: dblaikie, aprantl

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37877

llvm-svn: 314694
2017-10-02 18:13:14 +00:00
Hans Wennborg ea89ff7c25 CodeView symbol dumper: use symbolic names for registers
https://reviews.llvm.org/D38469

llvm-svn: 314690
2017-10-02 17:44:47 +00:00
Stanislav Mekhanoshin 6deb212522 Eliminate ftrunc if source is know to be rounded
Differential Revision: https://reviews.llvm.org/D38421

llvm-svn: 314688
2017-10-02 16:57:07 +00:00
Jonas Devlieghere f91dc28b7b [dwarfdump] Add -show-form
This enables printing of DWARF form types after the DWARF attribute
types.

Differential revision: https://reviews.llvm.org/D38459

llvm-svn: 314685
2017-10-02 16:02:04 +00:00
Sanjay Patel 232669d6b3 use range-for-loops; NFCI
llvm-svn: 314676
2017-10-02 15:02:06 +00:00
Coby Tayree 01e5320c48 [AsmParser] Support GAS's .print directive
Differential Revision: https://reviews.llvm.org/D38448

llvm-svn: 314674
2017-10-02 14:36:31 +00:00
Sanjay Patel 7998d37076 remove duplicate comments, reposition related functions; NFC
llvm-svn: 314669
2017-10-02 14:03:17 +00:00
Bjorn Pettersson 8e978c0151 [X86][SSE] Fix -Wsign-compare problems introduced in r314658
The refactoring in
"[X86][SSE] Add createPackShuffleMask helper function. NFCI."
resulted in warning when compiling the code (seen in build bots).

This patch restores some types from int to unsigned to avoid
those warnings.

llvm-svn: 314667
2017-10-02 12:46:38 +00:00
Bjorn Pettersson 839775a277 [Debug info] Handle endianness when moving debug info for split integer values
Summary:
Take the target's endianness into account when splitting the
debug information in DAGTypeLegalizer::SetExpandedInteger.

This patch fixes so that, for big-endian targets, the fragment
expression corresponding to the high part of a split integer
value is placed at offset 0, in order to correctly represent
the memory address order.

I have attached a PPC32 reproducer where the resulting DWARF
pieces for a 64-bit integer were incorrectly reversed.

Patch by: dstenb

Reviewers: JDevlieghere, aprantl, dblaikie

Reviewed By: JDevlieghere, aprantl, dblaikie

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D38172

llvm-svn: 314666
2017-10-02 12:46:32 +00:00
Simon Pilgrim e2e27aff9b [X86][SSE] Add createPackShuffleMask helper function. NFCI.
llvm-svn: 314658
2017-10-02 10:12:51 +00:00
Simon Pilgrim c04c7443ea [X86][SSE] matchBinaryVectorShuffle - add support for different src/dst value shuffle types
Preparation for support for combining to PACKSS/PACKUS

llvm-svn: 314656
2017-10-02 09:45:08 +00:00
Hiroshi Inoue dcedd66b00 [PowerPC] support ZERO_EXTEND in tryBitPermutation
This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI.
Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction.

For example, we allow these nodes

      t9: i32 = add t7, Constant:i32<1>
    t11: i32 = and t9, Constant:i32<255>
  t12: i64 = zero_extend t11
t14: i64 = shl t12, Constant:i64<2>

to be folded into a rotate-and-mask instruction.
Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF];

Differential Revision: https://reviews.llvm.org/D37514

llvm-svn: 314655
2017-10-02 09:24:00 +00:00
Simon Pilgrim 3bbbf31590 Fix typo in comment. NFCI.
llvm-svn: 314653
2017-10-02 09:10:50 +00:00
Simon Pilgrim e575651370 [X86] Cleanup uses of computeKnownBits by using MaskedValueIsZero helper instead. NFCI.
llvm-svn: 314652
2017-10-02 09:08:45 +00:00
Michael Zuckerman e4084f6bdb [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3-4)
I continue to support different VF interleaved and in this pass for this patch,
I added the vf64 stride3 support for both load and store.
I also added support fot the stride4 store.

Reviewers:
1. zvi
2. dorit
3. igorb
4. guyblank

Differential Revision: https://reviews.llvm.org/D37687

Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b
llvm-svn: 314651
2017-10-02 07:35:25 +00:00
Craig Topper d37625859a [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr.
The 4th operand was not being constrained and the third operand was being constrained twice.

llvm-svn: 314648
2017-10-02 05:46:53 +00:00
Craig Topper bb7866162c [X86] Use a bool flag instead of assigning an unsigned to two different values that we only use in an equality comparison.
llvm-svn: 314647
2017-10-02 05:46:52 +00:00
Craig Topper c05c390a7c [X86] Use _NOREX MOVZX instructions for some patterns even in 32-bit mode.
This unifies the patterns between both modes. This should be effectively NFC since all the available registers in 32-bit mode statisfy this constraint.

llvm-svn: 314643
2017-10-02 00:44:50 +00:00
Ron Lieberman 9bcdd80b66 [Hexagon] Check vector elements for equivalence in the HexagonVectorLoopCarriedReuse pass
If the two instructions being compared for equivalence have corresponding operands
    that are integer constants, then check their values to determine equivalence.
    
    Patch by Suyog Sarda!
    

llvm-svn: 314642
2017-10-02 00:34:07 +00:00
Ron Lieberman f90493d220 [Hexagon] Patch to Extract i1 element from vector of i1
This patch extracts 1 element from vector consisting
of elements of size 1 bit at given index.

llvm-svn: 314641
2017-10-02 00:16:15 +00:00
Craig Topper 6e025a3ecc [InstCombine] Use APInt for all the math in foldICmpDivConstant
Summary: This currently uses ConstantExpr to do its math, but as noted in a TODO it can all be done directly on APInt.

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38440

llvm-svn: 314640
2017-10-01 23:53:54 +00:00
Craig Topper c20b46da2f [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMem
Summary:
Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable.

For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem.

I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995.

Reviewers: aymanmus, RKSimon, zvi

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38120

llvm-svn: 314639
2017-10-01 23:53:53 +00:00
Craig Topper 00230604d3 [X86] Remove a couple unnecessary COPY_TO_REGCLASS from some output patterns where the instruction already produces the correct register class.
llvm-svn: 314638
2017-10-01 23:53:50 +00:00
Simon Pilgrim df23a2700d [X86][SSE] Add faux shuffle combining support for PACKUS
llvm-svn: 314631
2017-10-01 18:43:48 +00:00
Simon Pilgrim 836fa6dcfd [X86][SSE] Improve shuffle combining of PACKSS instructions.
Support unary packing and fix the faux shuffle mask for vectors larger than 128 bits.

llvm-svn: 314629
2017-10-01 17:54:55 +00:00
Sanjay Patel c7076a3ba9 [x86] formatting; NFC
llvm-svn: 314627
2017-10-01 14:39:10 +00:00
Daniel Jasper 3c9c60c727 Revert r314579: "Recommi r314561 after fixing over-debug assertion".
And follow-up r314585.
Leads to segfaults. I'll forward reproduction instructions to the patch
author.

Also, for a recommit, still add the original patch description.
Otherwise, it becomes really tedious to find out what a patch actually
does. The fact that it is a recommit with a fix is somewhat secondary.

llvm-svn: 314622
2017-10-01 09:53:53 +00:00
Dehao Chen d26dae0d34 Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases.
Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D38094

llvm-svn: 314619
2017-10-01 05:24:51 +00:00
Xin Tong c063c3f09d Revert "Fix typo [NFC]"
This reverts commit e60b5028619be1c81bd039d63a0627dac32d38f9.

Incorrectly include changes that are not typo fix.

llvm-svn: 314614
2017-10-01 00:09:53 +00:00
Xin Tong efec219e1b Fix typo [NFC]
llvm-svn: 314613
2017-10-01 00:07:24 +00:00
Daniel Berlin d36c27bedb NewGVN: Fix PR 34473, by not using ExactlyEqualsExpression for finding
phi of ops users.

llvm-svn: 314612
2017-09-30 23:51:55 +00:00
Daniel Berlin c1305af09b NewGVN: Evaluate phi of ops expressions before creating phi node
llvm-svn: 314611
2017-09-30 23:51:54 +00:00
Daniel Berlin 9b926e90d3 NewGVN: Allow dependent PHI of ops
llvm-svn: 314610
2017-09-30 23:51:53 +00:00
Daniel Berlin de6958ee85 NewGVN: Make OpIsSafeForPhiOfOps non-recursive
llvm-svn: 314609
2017-09-30 23:51:04 +00:00
Dehao Chen 4f5d830343 Refactor the SamplePGO profile annotation logic to extract inlineCallInstruction. (NFC)
llvm-svn: 314601
2017-09-30 20:46:15 +00:00
Simon Pilgrim a8dd6f4f30 [X86][SSE] Fold (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1
Remove sign extend in register style pattern if the sign is already extended enough

llvm-svn: 314599
2017-09-30 17:57:34 +00:00
Craig Topper 619569841a [AVX-512] Add patterns to make fp compare instructions commutable during isel.
llvm-svn: 314598
2017-09-30 17:02:39 +00:00
Michael Zuckerman b92b6d424f Code refactoring for the interleaved code <NFC>
Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1
llvm-svn: 314596
2017-09-30 14:55:03 +00:00
Daniel Jasper 0a51ec29c9 Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass"
Causes a segfault on a builtbot (and in our internal bootstrapping of
Clang). See Eli's response on the commit thread.

llvm-svn: 314589
2017-09-30 11:57:19 +00:00
Xinliang David Li b8aac3ac19 Fix buildbot failure -- tighten type check for matching phi
llvm-svn: 314585
2017-09-30 05:27:46 +00:00
Craig Topper d92ade96f4 [X86] Support v64i8 mulhu/mulhs
Implemented by splitting into two v32i8 mulhu/mulhs and concatenating the results.

Differential Revision: https://reviews.llvm.org/D38307

llvm-svn: 314584
2017-09-30 04:21:46 +00:00
Xinliang David Li 3409d9c07f Recommi r314561 after fixing over-debug assertion
llvm-svn: 314579
2017-09-30 00:46:32 +00:00
Stanislav Mekhanoshin 1d8cf2be89 [AMDGPU] Set fast-math flags on functions given the options
We have a single library build without relaxation options.
When inlined library functions remove fast math attributes
from the functions they are integrated into.

This patch sets relaxation attributes on the functions after
linking provided corresponding relaxation options are given.
Math instructions inside the inlined functions remain to have
no fast flags, but inlining does not prevent fast math
transformations of a surrounding caller code anymore.

Differential Revision: https://reviews.llvm.org/D38325

llvm-svn: 314568
2017-09-29 23:40:19 +00:00
Yaxun Liu b33607e5a1 CodeGen: Fix pointer info in expandUnalignedLoad/Store
Currently expandUnalignedLoad/Store uses place holder pointer info for temporary memory operand
in stack, which does not have correct address space. This causes unaligned private double16 load/store to be
lowered to flat_load instead of buffer_load for amdgcn target.

This fixes failures of OpenCL conformance test basic/vload_private/vstore_private on target amdgcn---amdgizcl.

Differential Revision: https://reviews.llvm.org/D35361

llvm-svn: 314566
2017-09-29 23:31:14 +00:00
Xinliang David Li 455dec098b Revert 314561 due to debug build assertion failure
llvm-svn: 314563
2017-09-29 22:30:34 +00:00
Xinliang David Li 5b9d96825b Eliminate PHI (int typed) which has only one use by intptr
This patch will eliminate redundant intptr/ptrtoint that pessimizes
analyses such as SCEV, AA and will make optimization passes such
as auto-vectorization more powerful.

Differential revision: http://reviews.llvm.org/D37832

llvm-svn: 314561
2017-09-29 22:10:15 +00:00
Alex Shlyapnikov e76aa3b0b2 Revert "Use the basic cost if a GEP is not used as addressing mode"
This reverts commit r314517.

This commit crashes sanitizer bots, for example:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167

Stack snippet:
...
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
...

llvm-svn: 314560
2017-09-29 22:04:45 +00:00
Eugene Zelenko 4f81cdd818 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314559
2017-09-29 21:55:49 +00:00
Matthew Simpson f4bb480b62 [LV] Use correct insertion point when type shrinking reductions
When type shrinking reductions, we should insert the truncations and extends at
the end of the loop latch block. Previously, these instructions were inserted
at the end of the loop header block. The difference is only a problem for loops
with predicated instructions (e.g., conditional stores and instructions that
may divide by zero). For these instructions, we create new basic blocks inside
the vectorized loop, which cause the loop header and latch to no longer be the
same block. This should fix PR34687.

Reference: https://bugs.llvm.org/show_bug.cgi?id=34687
llvm-svn: 314542
2017-09-29 18:07:39 +00:00
Sam Clegg 63ebb81386 [WebAssembly] Allow each data segment to specify its own alignment
Also, add a flags field as we will almost certainly
be needing that soon too.

Differential Revision: https://reviews.llvm.org/D38296

llvm-svn: 314534
2017-09-29 16:50:08 +00:00
Hongbin Zheng c8abdf5f25 [SimplifyIndVar] Do not fail when we constant fold an IV user to ConstantPointerNull
The type of a SCEVConstant may not match the corresponding LLVM Value.
In this case, we skip the constant folding for now.

TODO: Replace ConstantInt Zero by ConstantPointerNull
llvm-svn: 314531
2017-09-29 16:32:12 +00:00
Jonas Devlieghere a15f25d325 [dwarfdump][NFC] Consistent printing of address ranges
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).

While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.

Differential revision: https://reviews.llvm.org/D38395

llvm-svn: 314523
2017-09-29 15:41:22 +00:00
Nicolai Haehnle ce4ddd06da AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC
The hardware will only forward EXEC_LO; the high 32 bits will be zero.

Additionally, inline constants do not work. At least,

   v_addc_u32_e64 v0, vcc, v0, v1, -1

which could conceivably be used to combine (v0 + v1 + 1) into a single
instruction, acts as if all carry-in bits are zero.

The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine

   s_mov_b64 s[0:1], exec
   v_cndmask_b32_e64 v0, v1, v2, s[0:1]

into

   v_mov_b32 v0, v3

but it's not particularly high priority.

Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.*

llvm-svn: 314522
2017-09-29 15:37:31 +00:00
Jun Bum Lim 0e16a59e83 Use the basic cost if a GEP is not used as addressing mode
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.

Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng

Reviewed By: hfinkel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38085

llvm-svn: 314517
2017-09-29 14:50:16 +00:00
Jonas Paulsson c9e363ac69 [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Amara Emerson 7d6c55f8aa [X86] Improve codegen for inverted overflow checking intrinsics.
Adds a new combine for: xor(setcc cc, val), 1 --> setcc (invert(cc), val)

Differential Revision: https://reviews.llvm.org/D38161

llvm-svn: 314514
2017-09-29 13:53:44 +00:00
Sam Parker 963da5b119 [ARM] v8.3-a complex number support
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers, where
the complex numbers are packed in a vector register as a pair of
elements. The Imaginary part of the number is placed in the more
significant element, and the Real part of the number is placed in the
less significant element.

This patch adds assembler for the ARM target.

Differential Revision: https://reviews.llvm.org/D36789

llvm-svn: 314511
2017-09-29 13:11:33 +00:00
Michael Zuckerman 0b5db55b96 Small modification <NFC>
Change-Id: I360abccee12cae29bd2ac4f8399c9ecc92eb7f13
llvm-svn: 314510
2017-09-29 12:45:54 +00:00
Aleksandar Beserminji 29341b88ac [mips] Reordering callseq* nodes to be linear
Fix nested callseq* nodes by moving callseq_start after the
arguments calculation to temporary registers, so that callseq* nodes
in resulting DAG are linear.

Recommitting r314497. This version does not contain test which fails
when compiler is not build in debug mode.

Differential Revision: https://reviews.llvm.org/D37328

llvm-svn: 314507
2017-09-29 11:05:02 +00:00
Aleksandar Beserminji a0a01e7172 Revert "[mips] Reordering callseq* nodes to be linear"
Added test relies on the compiler being built in debug mode,
which may not be the case.

This reverts commit r314497.

llvm-svn: 314506
2017-09-29 10:52:03 +00:00
Simon Dardis f21d8d6ad5 [mips] Add missing license info, formatting changes. NFCI
Add missing license information to MicroMipsInstrFPU.td and
fix most of the formatting errors present. Others will be
addressed in a follow up commits.

llvm-svn: 314505
2017-09-29 10:08:06 +00:00
Tim Renouf ef1ae8ffac [AMDGPU] calling conventions for AMDPAL OS type
Summary:
This commit adds comments on how the AMDPAL OS type overloads the
existing AMDGPU_ calling conventions used by Mesa, and adds a couple of
new ones.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37752

llvm-svn: 314502
2017-09-29 09:51:22 +00:00
Tim Renouf 132291589f [AMDGPU] AMDPAL scratch buffer support
Summary:
Added support for scratch (including spilling) for OS type amdpal:
generates code to set up the scratch descriptor if it is needed.

With amdpal, the scratch resource descriptor is loaded from offset 0 of
the global information table. The low 32 bits of the address of the
global information table is passed in s0.

Added amdgpu-git-ptr-high function attribute to hard-wire the high 32
bits of the address of the global information table. If the function
attribute is not specified, or is 0xffffffff, then the backend generates
code to use the high 32 bits of pc.

The documentation for the AMDPAL ABI will be added in a later commit.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye

Differential Revision: https://reviews.llvm.org/D37483

llvm-svn: 314501
2017-09-29 09:49:35 +00:00
Tim Renouf 9f7ead3334 [Triple] Add AMDPAL operating system type
Summary:
This operating system type represents the AMDGPU PAL runtime, and will
be required by the AMDGPU backend in order to generate correct code for
this runtime.

Currently it generates the same code as not specifying an OS at all.
That will change in future commits.

Patch from Tim Corringham.

Subscribers: arsenm, nhaehnle

Differential Revision: https://reviews.llvm.org/D37380

llvm-svn: 314500
2017-09-29 09:48:12 +00:00
Jonas Devlieghere 19fc4d941f [dwarfdump][NFC] Consistent errors and warnings with --verify
This patch introduces 3 helper functions: error(), warn() and note() to
make printing  during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.

Differential revision: https://reviews.llvm.org/D38368

llvm-svn: 314498
2017-09-29 09:33:31 +00:00
Aleksandar Beserminji 502dcb035a [mips] Reordering callseq* nodes to be linear
Fix nested callseq* nodes by moving callseq_start after the
arguments calculation to temporary registers, so that callseq* nodes
in resulting DAG are linear.

Differential Revision: https://reviews.llvm.org/D37328

llvm-svn: 314497
2017-09-29 09:32:14 +00:00
Coby Tayree c3d24118e8 [X86][MS-InlineAsm] Extended support for variables / identifiers on memory / immediate expressions
Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized.
supersedes D33278, D35774

Differential Revision: https://reviews.llvm.org/D37412

llvm-svn: 314493
2017-09-29 07:02:46 +00:00
Sanjoy Das 0ac5ba5ade Revert "[BypassSlowDivision] Improve our handling of divisions by constants"
This reverts commit r314253.  It causes a miscompile on P100 in an internal
benchmark.  Reverting while I investigate.

llvm-svn: 314482
2017-09-29 00:54:16 +00:00
Adrian Prantl f51e78017d llvm-dwarfdump: support .apple-namespaces in --find
llvm-svn: 314481
2017-09-29 00:52:33 +00:00
Adrian Prantl 714ee4d536 llvm-dwarfdump: add support for .apple_types in --find
llvm-svn: 314479
2017-09-29 00:33:22 +00:00
Jessica Paquette 919991690c [MachineOutliner][NFC] Simplify logic in pruneCandidates
This commit yanks out the repeated sections of code in pruneCandidates into
two lambdas: ShouldSkipCandidate and Prune. This simplifies the logic in
pruneCandidates significantly, and reduces the chance of introducing bugs by
folding all of the shared logic into one place.

llvm-svn: 314475
2017-09-28 23:39:36 +00:00
Craig Topper 6255c7b675 [X86] Don't select (cmp (and, imm), 0) to testw
Summary:
X86ISelDAGToDAG tries to analyze ANDs compared with 0 to optimize to narrower immediates using subregisters.

I don't think we should be optimizing to 16-bit test instructions. It goes against our normal behavior of promoting i16 operations to i32. It only saves one byte due to the need to add a 0x66 prefix. I think it would also be subject to a length changing prefix penalty in the decoders on Intel CPUs.

Reviewers: RKSimon, zvi, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38273

llvm-svn: 314474
2017-09-28 23:35:36 +00:00
Matthias Braun 51687912a4 ARM: Fix cases where CSI Restored bit is not cleared
LR is an untypical callee saved register in that it is restored into a
different register (PC) and thus does not live-out of the return block.
This case requires the `Restored` flag in CalleeSavedInfo to be cleared.

This fixes a number of cases where this wasn't handled correctly yet.

llvm-svn: 314471
2017-09-28 23:12:06 +00:00
Yonghong Song ef29a84d48 bpf: fix a bug for disassembling ld_pseudo inst
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 314469
2017-09-28 22:47:34 +00:00
Eugene Zelenko 3b87336a0c [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314467
2017-09-28 22:27:31 +00:00
Ulrich Weigand df86855f61 [SystemZ] Fix fall-out from r314428
The expensive-checks build bot found a problem with the r314428 commit:
if CC is live after a ATOMIC_CMP_SWAPW instruction, it needs to be
marked as live-in to the block after the loop the pseudo gets expanded
to.  This actually fixes a code-gen bug as well, since if the CC isn't
live, the CR and JLH are merged to a CRJLH which doesn't actually set
the condition code any more.

llvm-svn: 314465
2017-09-28 22:08:25 +00:00
Craig Topper ed19350293 [X86] Make use of vpmovwb when possible in LowerMULH
If we have BWI, we can truncate in a much simpler way by using vpmovwb. This even works without VLX by using the wider zmm->ymm truncate with a subvector extract.

Differential Revision: https://reviews.llvm.org/D38375

llvm-svn: 314457
2017-09-28 20:10:34 +00:00
Martin Storsjo d6218cc385 [ARM] Restore the right frame pointer register in Int_eh_sjlj_longjmp
In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp,
we fetch and store the actual frame pointer, but on return via
the longjmp intrinsic, it always was restored into the r7 variable.

On windows, the frame pointer should be restored into r11 instead of r7.

On Darwin (where sjlj exception handling is used by default), the frame
pointer is always r7, both in arm and thumb mode, and likewise, on
windows, the frame pointer always is r11.

On linux however, if sjlj exception handling is enabled (which it isn't
by default), libcxxabi and the user code can be built in differing modes
using different registers as frame pointer. Therefore, when restoring
registers on a platform where we don't always use the same register
depending on code mode, restore both r7 and r11.

Differential Revision: https://reviews.llvm.org/D38253

llvm-svn: 314451
2017-09-28 19:04:30 +00:00
Martin Storsjo adceba59a2 [ARM] Fix SJLJ exception handling when manually chosen on a platform where it isn't default
Differential Revision: https://reviews.llvm.org/D38252

llvm-svn: 314450
2017-09-28 19:04:14 +00:00
Matthias Braun 5c3e8a450e MIR: Serialize CaleeSavedInfo Restored flag
llvm-svn: 314449
2017-09-28 18:52:14 +00:00
Craig Topper 3819be6cf6 [X86] Use target independent ZERO_EXTEND/SIGN_EXTEND nodes were possible in LowerMULH
We aren't do any in register extends here so we should be able to just the target independent nodes directly and allow them to be lowered as necessary.

llvm-svn: 314447
2017-09-28 18:45:28 +00:00
Craig Topper fc104bfbc0 [X86] Move a setOperation action for ISD::TRUNCATE near another one in the same if. Remove one that is redundant with another subtarget features.
llvm-svn: 314446
2017-09-28 18:45:27 +00:00
Adrian Prantl 99fdb9d927 llvm-dwarfdump: implement --find for .apple_names
This patch implements the dwarfdump option --find=<name>.  This option
looks for a DIE in the accelerator tables and dumps it if found.  This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.

Differential Revision: https://reviews.llvm.org/D38282

llvm-svn: 314439
2017-09-28 18:10:52 +00:00
Lang Hames 705db63ce1 [ORC] Fix the type of RTDyldObjectLinkingLayer::NotifyLoadedFtor.
Bug found by Stefan Granitz. Thanks Stefan!

llvm-svn: 314436
2017-09-28 17:43:07 +00:00
Evandro Menezes 3701df55c6 [JumpThreading] Preserve DT and LVI across the pass
JumpThreading now preserves dominance and lazy value information across the
entire pass.  The pass manager is also informed of this preservation with
the goal of DT and LVI being recalculated fewer times overall during
compilation.

This change prepares JumpThreading for enhanced opportunities; particularly
those across loop boundaries.

Patch by: Brian Rzycki <b.rzycki@samsung.com>,
          Sebastian Pop <s.pop@samsung.com>

Differential revision: https://reviews.llvm.org/D37528

llvm-svn: 314435
2017-09-28 17:24:40 +00:00
Craig Topper ceff6da6e9 [X86] Use BWI instructions to improve lowering of v32i8 MULHU/S
Summary: If we have BWI instructions we can widen to v32i16 to do the multiply instead of splitting.

Reviewers: RKSimon, spatel, zvi

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38305

llvm-svn: 314432
2017-09-28 17:00:21 +00:00
Craig Topper fd6b8a67fb [X86] Remove dead code from X86ISelDAGToDAG.cpp multiply handling
Summary:
Lowering never creates X86ISD::UMUL for 8-bit types. X86ISD::UMUL8 is used instead. If X86ISD::UMUL 8-bit were ever used it would crash.

DAGCombiner replaces UMUL_LOHI/SMUL_LOHI with a wider MUL and a shift if the type twice as wide is legal. So we should never see i8 UMUL_LOHI/SMUL_LOHI. In fact I think there was a bug in part of the i8 code. Similar is true for i16 though without the bug.

Reviewers: RKSimon, spatel, zvi

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38276

llvm-svn: 314430
2017-09-28 16:56:36 +00:00
Craig Topper 71a8cf9f99 [X86] Use correct subvector index when combining two insert subvectors featuring zero vectors.
Previously we were using one of the subvector indices twice. The included test case causes an assert without this change.

Thanks to Simon Pilgrim for catching this.

llvm-svn: 314429
2017-09-28 16:53:16 +00:00
Ulrich Weigand 0f1de04979 [SystemZ] Custom-expand ATOMIC_CMP_AND_SWAP_WITH_SUCCESS
The SystemZ compare-and-swap instructions already provide the "success"
indication via a condition-code value, so the default expansion of those
operations generates an unnecessary extra comparsion.

llvm-svn: 314428
2017-09-28 16:22:54 +00:00
Jonas Devlieghere 35fdaa94f7 [dwarfdump] Verify that CUs have a unit DIE.
This patch adds a check to the DWARF verifier to detect CUs without a
unit DIE.

Differential revision: https://reviews.llvm.org/D38363

llvm-svn: 314426
2017-09-28 15:57:50 +00:00
Simon Pilgrim 2ff339303e Use SDValue::getConstantOperandVal helper. NFCI.
llvm-svn: 314425
2017-09-28 15:53:27 +00:00
Simon Dardis c8e33c5ca1 [mips] Remove codegen support for branch likely instructions.
This patch disables codegen support for branch likely instructions to
address a potential bug. These branches were unselectable as
they had the same patterns as the normal branches but came after them
when ISel was concerned.

The branch likely instructions were marked as having no delay
slots when they have annulling delay slots. The delay slot filler
does not currently handle annulling delay slot branches, so this
would lead to wrong codegen if these branches were generated.

Reviewers: atanasyan, nitesh.jain

Differential Revision: https://reviews.llvm.org/D38169

llvm-svn: 314421
2017-09-28 15:24:07 +00:00
Benjamin Kramer c965b30e54 [LoopUnroll] Fix use after poison.
llvm-svn: 314418
2017-09-28 14:47:39 +00:00
Bjorn Pettersson 715a5efaad [DebugInfo] Do not extend range for physreg in LiveDebugVariables
Summary:
A DBG_VALUE that is referring to a physical register is
valid up until the next def of the register, or the end
of the basic block that it belongs to.

LiveDebugVariables is computing live intervals (slot index
ranges) for DBG_VALUE instructions, before regalloc, in order
to be able to re-insert DBG_VALUE instructions again after
regalloc. When the DBG_VALUE is mapping a variable to a
physical register we do not need to compute the range. We
should simply re-insert the DBG_VALUE at the start position.

The problem that was found, resulting in this patch, was a
situation when the DBG_VALUE was the last real use of the
physical register. The computeIntervals/extendDef methods
extended the range to cover the whole basic block, even though
the physical register very well could be allocated to some
virtual register inside the basic block. So the extended
range could not be trusted.

This patch is a preparation for https://reviews.llvm.org/D38229,
where the goal is to insert DBG_VALUE after each new definition
of a variable, even if the virtual registers that the variable
was connected to has been coalesced into using the same physical
register (e.g. due to two address instructions). For more info
see https://bugs.llvm.org/show_bug.cgi?id=34545

Reviewers: aprantl, rnk, echristo

Reviewed By: aprantl

Subscribers: Ka-Ka, llvm-commits

Differential Revision: https://reviews.llvm.org/D38140

llvm-svn: 314414
2017-09-28 13:10:06 +00:00
Florian Hahn 8af01573a3 [LVI] Move LVILatticeVal class to separate header file (NFC).
Summary:
This allows sharing the lattice value code between LVI and SCCP (D36656). 

It also adds a `satisfiesPredicate` function, used by D36656.

Reviewers: davide, sanjoy, efriedma

Reviewed By: sanjoy

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37591

llvm-svn: 314411
2017-09-28 11:09:22 +00:00
Coby Tayree 566348f2a0 [x86][AsmParser] Allow some more MS size directives
MS allows the following size directives: float/double and long as synonymous to dword/qword and dword, respectively.
Differential Revision: https://reviews.llvm.org/D37190

llvm-svn: 314410
2017-09-28 11:04:08 +00:00
Alex Bradbury 5518cbfc41 Teach TargetInstrInfo::getInlineAsmLength to parse .space directives with integer arguments
It's currently quite difficult to test passes like branch relaxation, which
requires branches with large displacement to be generated. The .space assembler
directive makes it easy to create arbitrarily large basic blocks, but
getInlineAsmLength is not able to parse it and so the size of the block is not
correctly estimated. Other backends (AArch64, AMDGPU) introduce options just
for testing that artificially restrict the ranges of branch instructions (e.g.
aarch64-tbz-offset-bits). Although parsing a single form of the .space
directive feels inelegant, it does allow a more direct testing approach.

This patch adapts the .space parsing code from
Mips16InstrInfo::getInlineAsmLength and removes it now the extra functionality
is provided by the base implementation. I want to move this functionality to
the generic getInlineAsmLength as 1) I need the same for RISC-V, and 2) I feel
other backends will benefit from more direct testing of large branch
displacements.

Differential Revision: https://reviews.llvm.org/D37798

llvm-svn: 314393
2017-09-28 09:31:46 +00:00
Hiroshi Inoue 79c0bec06e [PowerPC] eliminate partially redundant compare instruction
This is a follow-on of D37211.
D37211 eliminates a compare instruction if two conditional branches can be made based on the one compare instruction, e.g.
if (a == 0) { ... }
else if (a < 0) { ... }

This patch extends this optimization to support partially redundant cases, which often happen in while loops.
For example, one compare instruction is moved from the loop body into the preheader by this optimization in the following example.
do {
  if (a == 0) dummy1();
  a = func(a);
} while (a > 0);

Differential Revision: https://reviews.llvm.org/D38236

llvm-svn: 314390
2017-09-28 08:38:19 +00:00
Alex Bradbury 9d3f12501a [RISCV] Add common fixups and relocations
%lo(), %hi(), and %pcrel_hi() are supported and test cases have been added to 
ensure the appropriate fixups and relocations are generated. I've added an 
instruction format field which is used in RISCVMCCodeEmitter to, for 
instance, tell whether it should emit a lo12_i fixup or a lo12_s fixup 
(RISC-V has two 12-bit immediate encodings depending on the instruction 
type).

Differential Revision: https://reviews.llvm.org/D23568

llvm-svn: 314389
2017-09-28 08:26:24 +00:00
Mikael Holmen 07f1e2e2b3 [RegAllocGreedy]: Allow recoloring of done register if it's non-tied
Summary:
If we have a non-allocated register, we allow us to try recoloring of an
already allocated and "Done" register, even if they are of the same
register class, if the non-allocated register has at least one tied def
and the allocated one has none.

It should be easier to recolor the non-tied register than the tied one, so
it might be an improvement even if they use the same regclasses.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D38309

llvm-svn: 314388
2017-09-28 08:22:35 +00:00
George Burgess IV f8e11b803d [DAGCombiner] Fix an off-by-one error in vector logic
Without this, we could end up trying to get the Nth (0-indexed) element
from a subvector of size N.

Differential Revision: https://reviews.llvm.org/D37880

llvm-svn: 314380
2017-09-28 06:17:19 +00:00
Yonghong Song e9165f8720 bpf: add new insns for bswap_to_le and negation
This patch adds new insn, "reg = be16/be32/be64 reg",
for bswap to little endian for big-endian target (bpfeb).
It also adds new insn for negation "reg = -reg".

Currently, for source code, e.g.,
  b = -a
LLVM still prefers to generate:
  b = 0 - a
But "reg = -reg" format can be used in assembly code.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 314376
2017-09-28 02:46:11 +00:00
Sanjoy Das def1729dc4 Use a BumpPtrAllocator for Loop objects
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375
2017-09-28 02:45:42 +00:00
Rui Ueyama 5908845a7e Fix a UBsan bot.
If we do not initialize Prefix here, Prefix.data() returns a nullptr.
Later, it is passed to memcpy. memcpy's behavior is undefined if src (or
dst) is a nullptr even if a given size is 0. That's why this code
triggered UBsan.

llvm-svn: 314368
2017-09-28 00:27:39 +00:00
Eugene Zelenko fa57bd0ced [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314363
2017-09-27 23:26:01 +00:00
Galina Kistanova 1c6f0bb63e Reverted r313993.
This patch produces a crash and hexagon_vector_loop_carried_reuse_constant.ll test fails on Windows (llvm-clang-x86_64-expensive-checks-win build bot).

llvm-svn: 314361
2017-09-27 23:09:14 +00:00
Craig Topper 0cd25942f7 Revert r314017 '[InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt.'
This reverts r314017 and similar code added in later commits. It seems to not work for pointer compares and is causing a bot failure for the last several days.

llvm-svn: 314360
2017-09-27 22:57:18 +00:00
Rui Ueyama 0dbb0f107e Fix -Wunused-variable for Release build.
llvm-svn: 314353
2017-09-27 22:03:15 +00:00
Sanjoy Das 4f3ebd537c Return the LoopUnrollResult from tryToUnrollLoop; NFC
I will use this in a later change.

llvm-svn: 314352
2017-09-27 21:45:22 +00:00
Sanjoy Das 8e8c1bc490 LoopDeletion: use return value instead of passing in LPMUpdater; NFC
I will use this refactoring in a later patch.

llvm-svn: 314351
2017-09-27 21:45:21 +00:00
Sanjoy Das 3567d3d2ec Rename LoopUnrollStatus to LoopUnrollResult; NFC
A "Result" suffix is more appropriate here

llvm-svn: 314350
2017-09-27 21:45:19 +00:00
Rui Ueyama 283f56ac03 Fix off-by-one error in TarWriter.
The tar format originally supported up to 99 byte filename. The two
extensions are proposed later: Ustar or PAX.

In the UStar extension, a pathanme is split at a '/' and its "prefix"
and "suffix" are stored in different locations in the tar header. Since
"prefix" can be up to 155 byte, it can represent up to 254 byte
filename (but exact limit depends on the location of '/' character in
a pathname.)

Our TarWriter first attempt to use UStar extension and then fallback to
PAX extension.

But there's a bug in UStar header creation. "Suffix" part must be a NUL-
terminated string, but we didn't handle it correctly. As a result, if
your filename just 100 characters long, the last character was droppped.

This patch fixes the issue.

Differential Revision: https://reviews.llvm.org/D38149

llvm-svn: 314349
2017-09-27 21:38:02 +00:00
Don Hinton 53eb637115 Cleanup some problems with LLVM_ENABLE_DUMP in release builds, and
always set LLVM_ENABLE_DUMP=ON for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D38306

llvm-svn: 314346
2017-09-27 21:19:56 +00:00
Rui Ueyama 23fa4de2db Do not remove a target file in FileOutputBuffer::create().
FileOutputBuffer::create() attempts to remove a target file if the file
is a regular one, which results in an unexpected result in a failure
scenario.

If something goes wrong and the user of FileOutputBuffer decides to not
call commit(), it leaves nothing. An existing file is removed, and no
new file is created.

What we should do is to atomically replace an existing file with a new
file using rename(), so that it wouldn't remove an existing file without
creating a new one.

Differential Revision: https://reviews.llvm.org/D38283

llvm-svn: 314345
2017-09-27 21:19:24 +00:00
Jessica Paquette 4cf187b5b4 [MachineOutliner] AArch64: Avoid saving + restoring LR if possible
This commit allows the outliner to avoid saving and restoring the link register
on AArch64 when it is dead within an entire class of candidates.

This introduces changes to the way the outliner interfaces with the target.
For example, the target now interfaces with the outliner using a
MachineOutlinerInfo struct rather than by using getOutliningCallOverhead and
getOutliningFrameOverhead.

This also improves several comments on the outliner's cost model.

https://reviews.llvm.org/D36721

llvm-svn: 314341
2017-09-27 20:47:39 +00:00
Craig Topper c16a472966 Revert r314249 "Recommit r314151 "[X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like the NOREX version of TEST."""
This caused PR34751

llvm-svn: 314339
2017-09-27 20:34:17 +00:00
Craig Topper e0d8290094 Revert r314248 "[X86] Don't emit X86::MOV8rr_NOREX from X86InstrInfo::copyPhysReg."
This contributed to PR34751

llvm-svn: 314338
2017-09-27 20:34:13 +00:00