Commit Graph

113722 Commits

Author SHA1 Message Date
Amaury Sechet f47d9f30b0 [ARM] Remove code handling ADDC/ADDE/SUBC/SUBE
Summary: This code is now dead as the ARM backend uses ADDCARRY/SUBCARRY/SETCCCARRY .

Reviewers: rogfer01, efriedma, rengolin, javed.absar

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D47413

llvm-svn: 333544
2018-05-30 13:45:43 +00:00
Krzysztof Parzyszek 8987174627 [Hexagon] Use vector align-left when shift amount fits in 3 bits
This saves an instruction because for align-right the shift amount
would need to be put in a register first.

llvm-svn: 333543
2018-05-30 13:45:34 +00:00
Simon Dardis f990bf1cd2 [mips] Correct the definition of CTC2/CFC2
llvm-svn: 333542
2018-05-30 13:21:13 +00:00
Simon Dardis a3aa926c09 [mips] Correct the predicates of microMIPS compact branch instructions
llvm-svn: 333541
2018-05-30 13:16:17 +00:00
Simon Dardis f909058ad4 [mips] Sink PredicateControl further down the class hierarchy.
Previously PredicateControl in some cases was a member of <X>Inst classes
for some X (DSP, EVA) or was in more irregular place in the hierarchry
for any given instruction.

This patch moves PredicateControl down to the root so that it is consistently
available. Then correct the base class of microMIPS instructions as using
EncodingPredicates instead of the general Predicates field of Instruction.

Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D47526

llvm-svn: 333536
2018-05-30 12:40:53 +00:00
Simon Dardis 39710e3555 [mips] Correct the predicates of arithmetic and logic instructions.
As part of this effort, duplicate and correct the predicates of some
aliases. Also disable code generation of some short form instructions
for FastISel, as it would otherwise reject them.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D47075

llvm-svn: 333530
2018-05-30 11:33:35 +00:00
Tim Northover d8949f5002 AArch64: print correct annotation for ADRP addresses.
The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain the
actual PC-offset that will be calculated.

llvm-svn: 333525
2018-05-30 09:54:59 +00:00
Sander de Smalen bdf09fe7a2 [AArch64][AsmParser] Fix segfault on illegal fpimm.
Floating point immediate combining a negative sign and
a hexadecimal number, e.g. #-0x0  caused the compiler to crash.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: javed.absar

Differential Revision: https://reviews.llvm.org/D47483

llvm-svn: 333524
2018-05-30 09:54:19 +00:00
Daniel Cederman 248dae81dc [Sparc] Treat %fxx registers with value type Other as single precision
They get type Other when used in the clobber list in inline assembly.
This fixes tests fp128.ll and float.ll that failed after r333512.

llvm-svn: 333523
2018-05-30 09:52:18 +00:00
Serge Pavlov c4b6d0ebab Revert commit 333506
It looks like this commit is responsible for the fail:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/24382.

llvm-svn: 333518
2018-05-30 09:01:12 +00:00
Daniel Cederman 60e6ce4155 [Sparc] Select correct register class for FP register constraints
Summary: The fX version of floating-point registers only supports
single precision. We need to map the name to dX for doubles and qX
for long doubles if we want getRegForInlineAsmConstraint() to be
able to pick the correct register class.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D47258

llvm-svn: 333512
2018-05-30 06:07:55 +00:00
Craig Topper cc0741e59f [X86] Add unmasked AVX512VNNI instrinsics. Use a select in IR instead.
A future patch will remove the old masked intrinsics.

llvm-svn: 333508
2018-05-30 05:25:59 +00:00
Serge Pavlov 5096d06c10 Use uniform mechanism for OOM errors handling
This is a recommit of r333390, which was reverted in r333395, because it
caused cyclic dependency when building shared library `LLVMDemangle.so`.
In this commit `ItaniumDemangler.cpp` was not changed.

The original commit message is below.

In r325551 many calls of malloc/calloc/realloc were replaces with calls of
their safe counterparts defined in the namespace llvm. There functions
generate crash if memory cannot be allocated, such behavior facilitates
handling of out of memory errors on Windows.

If the result of *alloc function were checked for success, the function was
not replaced with the safe variant. In these cases the calling function made
the error handling, like:

    T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T)));
    if (NewElts == nullptr)
      report_bad_alloc_error("Allocation of SmallVector element failed.");

Actually knowledge about the function where OOM occurred is useless. Moreover
having a single entry point for OOM handling is convenient for investigation
of memory problems. This change removes custom OOM errors handling and
replaces them with calls to functions `llvm::safe_*alloc`.

Declarations of `safe_*alloc` are moved to a separate include file, to avoid
cyclic dependency in SmallVector.h

Differential Revision: https://reviews.llvm.org/D47440

llvm-svn: 333506
2018-05-30 05:13:19 +00:00
Hiroshi Inoue 3872c6c633 [PowerPC] fix broken JIT-compiled code with tail call optimization
The relocation for branch instructions in the dynamic loader of ExecutionEngine assumes branch instructions with R_PPC64_REL24 relocation type are only bl. However, with the tail call optimization, b instructions can be also used to jump into another function.
This patch makes the relocation to keep bits in the branch instruction other than the jump offset to avoid relocation rewrites a b instruction into bl.

Differential Revision: https://reviews.llvm.org/D47456

llvm-svn: 333502
2018-05-30 04:48:29 +00:00
Sam Clegg a81fb84811 MC: Remove redundant substr() call
Differential Revision: https://reviews.llvm.org/D47047

llvm-svn: 333496
2018-05-30 03:37:26 +00:00
Sam Clegg 105bdc2557 [WebAssembly] MC: Add compile-twice test and fix corresponding bug
Differential Revision: https://reviews.llvm.org/D47398

llvm-svn: 333494
2018-05-30 02:57:20 +00:00
Chandler Carruth 71fd27043e [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule
loop-cleanup passes at the beginning of the loop pass pipeline, and
re-enqueue loops after even trivial unswitching.

This will allow us to much more consistently avoid simplifying code
while doing trivial unswitching. I've also added a test case that
specifically shows effective iteration using this technique.

I've unconditionally updated the new PM as that is always using the
SimpleLoopUnswitch pass, and I've made the pipeline changes for the old
PM conditional on using this new unswitch pass. I added a bunch of
comments to the loop pass pipeline in the old PM to make it more clear
what is going on when reviewing.

Hopefully this will unblock doing *partial* unswitching instead of just
full unswitching.

Differential Revision: https://reviews.llvm.org/D47408

llvm-svn: 333493
2018-05-30 02:46:45 +00:00
Lang Hames e9bdfc16d7 [ORC] Fix an ambiguous make_unique call.
llvm-svn: 333492
2018-05-30 02:40:40 +00:00
Lang Hames bd0cb787d0 [ORC] Update JITCompileCallbackManager to support multi-threaded code.
Previously JITCompileCallbackManager only supported single threaded code. This
patch embeds a VSO (see include/llvm/ExecutionEngine/Orc/Core.h) in the callback
manager. The VSO ensures that the compile callback is only executed once and that
the resulting address cached for use by subsequent re-entries.

llvm-svn: 333490
2018-05-30 01:57:45 +00:00
Shiva Chen c3d0e89284 [RISCV] Support resolving fixup_riscv_call and add to MCFixupKindInfo table
Resolving fixup_riscv_call by assembler when the linker relaxation diabled
and the function and callsite within the same compile unit.

And also adding static_assert after Infos array declaration
to avoid missing any new fixup in MCFixupKindInfo in the future.

Differential Revision: https://reviews.llvm.org/D47126

llvm-svn: 333487
2018-05-30 01:16:36 +00:00
Diego Caballero b94b21d441 [VPlan] Replace LLVM_ATTRIBUTE_USED with ifndef NDEBUG
Minor replacement. LLVM_ATTRIBUTE_USED was introduced to silence
a warning but using #ifndef NDEBUG makes more sense in this case.

Reviewers: dblaikie, fhahn, hsaito

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47498

llvm-svn: 333476
2018-05-29 23:10:44 +00:00
Craig Topper 5989db0fb4 [X86] Remove some of the extractelts from the new MOVSS+FMA patterns.
We only need the extractelt that corresponds to the register we're trying to insert back into. We can't guarantee the others haven't been optimized out depending on how those operands were produced.

So instead just look for an FR32/FR64 input and emit a COPY_TO_REGCLASS to VR128 in the output pattern. This matches what we do for ADD/SUB/MUL/DIV.

llvm-svn: 333473
2018-05-29 22:52:09 +00:00
Craig Topper dbd371e931 [X86] Use VR128X instead of VR128 in EVEX instruction patterns.
llvm-svn: 333464
2018-05-29 20:46:27 +00:00
Craig Topper aba57bfebd [X86] Rename the operands in the recently introduced MOVSS+FMA patterns so that the operand names in the output pattern are always in 1, 2, 3 order since those are the operand names in the instruction.
The order should be controlled in the input pattern.

llvm-svn: 333463
2018-05-29 20:46:26 +00:00
Sam Clegg f4f3750949 Fix build error introduced in rL333459
The DEBUG macro was renamed LLVM_DEBUG.

llvm-svn: 333462
2018-05-29 20:16:47 +00:00
Chandler Carruth 4cbcbb0761 [LoopInstSimplify] Re-implement the core logic of loop-instsimplify to
be both simpler and substantially more efficient.

Rather than use a hand-rolled iteration technique that isn't quite the
same as RPO, use the pre-built RPO loop body traversal utility.

Once visiting the loop body in RPO, we can assert that we visit defs
before uses reliably. When this is the case, the only need to iterate is
when simplifying a def that is used by a PHI node along a back-edge.
With this patch, the first pass over the loop body is just a complete
simplification of every instruction across the loop body. When we
encounter a use of a simplified instruction that stems from a PHI node
in the loop body that has already been visited (due to some cyclic CFG,
potentially the loop itself, or a nested loop, or unstructured control
flow), we recall that specific PHI node for the second iteration.
Nothing else needs to be preserved from iteration to iteration.

On the second and later iterations, only instructions known to have
simplified inputs are considered, each time starting from a set of PHIs
that had simplified inputs along the backedges.

Dead instructions are collected along the way, but deleted in a batch at
the end of each iteration making the iterations themselves substantially
simpler. This uses a new batch API for recursively deleting dead
instructions.

This alsa changes the routine to visit subloops. Because simplification
is fundamentally transitive, we may need to visit the entire loop body,
including subloops, to handle knock-on simplification.

I've added a basic test file that helps demonstrate that all of these
changes work. It includes both straight-forward loops with
simplifications as well as interesting PHI-structures, CFG-structures,
and a nested loop case.

Differential Revision: https://reviews.llvm.org/D47407

llvm-svn: 333461
2018-05-29 20:15:38 +00:00
Craig Topper 5439b3d1e5 [X86] Fix a potential crash that occur after r333419.
The code could issue a truncate from a small type to larger type. We need to extend in that case instead.

llvm-svn: 333460
2018-05-29 20:04:10 +00:00
Sam Clegg b7c6239408 [WebAssembly] Add more error checking to object file parsing
This should address some of the assert failures the fuzzer has been
finding such as:
  https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6719

Differential Revision: https://reviews.llvm.org/D47086

llvm-svn: 333459
2018-05-29 19:58:59 +00:00
Matt Arsenault 2e4d338d16 AMDGPU: Fix typo in option description
llvm-svn: 333457
2018-05-29 19:35:46 +00:00
Matt Arsenault 1ea0402e82 AMDGPU: Round up kernel argument allocation size
AFAIK the driver's allocation will actually have to round this
up anyway. It is useful to track the rounded up size, so that
the end of the kernel segment is known to be dereferencable so
a wider s_load_dword can be used for a short argument at the end
of the segment.

llvm-svn: 333456
2018-05-29 19:35:00 +00:00
Sameer AbuAsal 97684419e8 [RISCV] Add peepholes for Global Address lowering patterns
Summary:
  Base and offset are always separated when a GlobalAddress node is lowered
  (rL332641) as an optimization to reduce instruction count. However, this
  optimization is not profitable if the Global Address ends up being used in only
  instruction.

  This patch adds peephole optimizations that merge an offset of
  an address calculation into the LUI %%hi and ADD %lo of the lowering sequence.

  The peephole handles three patterns:

 1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset
     --->
      ADDI (LUI %hi(global + offset)) %lo(global + offset).

   This generates:
   lui a0, hi (global + offset)
   add a0, a0, lo (global + offset)

   Instead of

   lui a0, hi (global)
   addi a0, hi (global)
   addi a0, offset

   This pattern is for cases when the offset is small enough to fit in the
   immediate filed of ADDI (less than 12 bits).

 2) ADD ((ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset))
     --->
      offset = hi_offset << 12
      ADDI (LUI %hi(global + offset)) %lo(global + offset)

   Which generates the ASM:

   lui  a0, hi(global + offset)
   addi a0, lo(global + offset)

   Instead of:

   lui  a0, hi(global)
   addi a0, lo(global)
   lui a1, (offset)
   add a0, a0, a1

   This pattern is for cases when the offset doesn't fit in an immediate field
   of ADDI but the lower 12 bits are all zeros.

 3) ADD ((ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset)))
     --->
        offset = global + offhi20<<12 + offlo12
        ADDI (LUI %hi(global + offset)) %lo(global + offset)

   Which generates the ASM:

   lui  a1, %hi(global + offset)
   addi a1, %lo(global + offset)

   Instead of:

   lui  a0, hi(global)
   addi a0, lo(global)
   lui a1, (offhi20)
   addi a1, (offlo12)
   add a0, a0, a1

   This pattern is for cases when the offset doesn't fit in an immediate field
   of ADDI and both the lower 1 bits and high 20 bits are non zero.

    Reviewers: asb

    Reviewed By: asb

    Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos,
  niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang

llvm-svn: 333455
2018-05-29 19:34:54 +00:00
Daniel Neilson 3a6c50f4e0 [BasicAA] Teach the analysis about atomic memcpy
Summary:
A simple change to derive mod/ref info from the atomic memcpy
intrinsic in the same way as from the regular memcpy intrinsic.

llvm-svn: 333454
2018-05-29 19:23:50 +00:00
Konstantin Zhuravlyov 2ca6b1f2ba AMDGPU: Always set COMPUTE_PGM_RSRC2.ENABLE_TRAP_HANDLER to zero for AMDHSA as
it is set by CP

Differential Revision: https://reviews.llvm.org/D47392

llvm-svn: 333451
2018-05-29 19:09:13 +00:00
Eli Friedman 63fead0f43 [ARM] Enable SETCCCARRY lowering for Thumb1.
We've had Thumb1 support for ARMISD::SUBE for a while now, so this just
works.  Reduces codesize a bit for 64-bit integer comparisons.

Differential Revision: https://reviews.llvm.org/D47387

llvm-svn: 333445
2018-05-29 18:17:16 +00:00
Matt Arsenault 64c6ab445e IRBuilder: Add overload for intrinsics without args
llvm-svn: 333443
2018-05-29 18:06:50 +00:00
Matt Arsenault ceafc55e5a AMDGPU: Pass function directly instead of MachineFunction
These functions just query the underlying IR function,
so pass it directly.

llvm-svn: 333442
2018-05-29 17:42:50 +00:00
Matt Arsenault 2fb9ccf770 AMDGPU: Add nuw to add off of kernarg ptr
llvm-svn: 333441
2018-05-29 17:42:38 +00:00
Matt Arsenault ab2b79cb97 DAG: Remove redundant version of getRegisterTypeForCallingConv
There seems to be no real reason to have these separate copies.
The existing implementations just copy each other for x86.
For Mips there is a subtle difference, which is just a bug
since it changes based on the context where which one was called.
Dropping this version, all tests pass. If I try to merge them
to match the removed version, a test fails.

llvm-svn: 333440
2018-05-29 17:42:26 +00:00
Tom Stellard 57b9342c80 AMDGPU: Split R600 MCInst lowering into its own class
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47307

llvm-svn: 333439
2018-05-29 17:41:59 +00:00
Nicolai Haehnle e7ae0f48f4 TableGen: add some more helpful error messages
Summary: Change-Id: I6f3dacf675a4126134577616e259696bebdade3a

Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D47429

Change-Id: I614de12a4c154c6d53c090f2f3e53ad2d09942c5
llvm-svn: 333436
2018-05-29 17:12:20 +00:00
Cameron McInally b1bb60aec9 [StrictFP] Make getStrictFPOpcodeAction(...) more accessible
NFCI. This function will be reused in upcoming patches.

Differential Revision: https://reviews.llvm.org/D47380

llvm-svn: 333433
2018-05-29 16:49:32 +00:00
Evandro Menezes f8425340e4 [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy
As suggested in https://bugs.llvm.org/show_bug.cgi?id=32384#c1, this change
makes the inlining of `memset()` and `memcpy()` more aggressive when
compiling for speed.  The tuning remains the same when optimizing for size.

Patch by: Sebastian Pop <s.pop@samsung.com>
          Evandro Menezes <e.menezes@samsung.com>

Differential revision: https://reviews.llvm.org/D45098

llvm-svn: 333429
2018-05-29 15:58:50 +00:00
Simon Atanasyan 69301c9eb9 [mips] Process numeric register name in the .set assignment directive
Now LLVM assembler cannot process the following code and generates an
error. GNU tools support .set assignment directive with numeric register
name.

```
.set r4, 4

test.s:1:11: error: invalid token in expression
  .set r4, $4
           ^
```

This patch teach assembler to handle such directives correctly.
Unfortunately a numeric register name cannot be represented as an
expression. That's why we have to maintain a separate `StringMap`
in the `MipsAsmParser` to keep mapping between aliases names and
register numbers.

Differential revision: https://reviews.llvm.org/D47464

llvm-svn: 333428
2018-05-29 15:58:06 +00:00
Amara Emerson d5a9e7bbc9 Revert "[AArch64] added FP16 vcvth intrinsic support"
This reverts commit r333410 due to bot failures.

llvm-svn: 333427
2018-05-29 15:34:22 +00:00
Sander de Smalen 8704b03c4d [AArch64][SVE] Asm: Support for predicated LSL/LSR (vectors)
Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D47365

llvm-svn: 333422
2018-05-29 14:40:24 +00:00
Jonas Devlieghere 43dce3edbe [CodeView] Add prefix to CodeView registers.
Adds CVReg to CodeView register names to prevent a duplicate symbol with
CR3 defined in termios.h, as suggested by Zachary on the mailing list.

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123372.html

Differential revision: https://reviews.llvm.org/D47478

rdar://39863705

llvm-svn: 333421
2018-05-29 14:35:34 +00:00
Alexander Ivchenko 96062eaa8e [X86] Scalar mask and scalar move optimizations
1. Introduction of mask scalar TableGen patterns.
2. Introduction of new scalar move TableGen patterns
   and refactoring of existing ones.
3. Folding of pattern created by introducing scalar
   masking in Clang header files.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D47012

llvm-svn: 333419
2018-05-29 14:27:11 +00:00
Than McIntosh 48bf43df8a StackColoring: better handling of statically unreachable code
Summary:
Avoid assert/crash during liveness calculation in situations
where the incoming machine function has statically unreachable BBs.
Second attempt at submitting; this version of the change includes
a revised testcase.

Fixes PR37130.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47372

llvm-svn: 333416
2018-05-29 13:52:24 +00:00
Lei Huang 716103f1cd [PowerPC] Fix the incorrect iterator inside peephole
Instruction selection can insert nodes into the underlying list after the root
node so iterating will thereby miss it. We should NOT assume that, the root node
is the last element in the DAG nodelist.

Patch by: steven.zhang (Qing Shan Zhang)

Differential Revision: https://reviews.llvm.org/D47437

llvm-svn: 333415
2018-05-29 13:38:56 +00:00
Sander de Smalen 26b9b2a8c3 [AArch64][SVE] Asm: Support for AND, ORR, EOR and BIC instructions.
This patch addresses the following variants:
  - bitmask immediate,         e.g. 'and z0.d, z0.d, #0x6'.
  - unpredicated data vectors, e.g. 'and z0.d, z1.d, z2.d'.
  - predicated data vectors,   e.g. 'and z0.d, p0/m, z0.d, z1.d'.

And also several aliases, such as: 
  - ORN, alias of ORR.
  - EON, alias of EOR.
  - BIC, alias of AND (immediate variant)
  - MOV, alias of ORR (if unpredicated and source register operands are the same)

Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47363

llvm-svn: 333414
2018-05-29 13:08:43 +00:00
Luke Geeson 16092ab3c5 [AArch64] added FP16 vcvth intrinsic support
Summary: Change-Id: I0df845749c7689dfc99150ba7c19c7d0dadbd705

Reviewers: javed.absar, SjoerdMeijer

Reviewed By: SjoerdMeijer

Subscribers: llvm-commits, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46311

llvm-svn: 333410
2018-05-29 11:40:33 +00:00
Simon Atanasyan a1d69f9e53 [mips] Emit R_MICROMIPS_GPREL16/R_MICROMIPS_SUB/R_MICROMIPS_LO16 / HI16 relocations
Emit R_MICROMIPS_GPREL16/R_MICROMIPS_SUB/R_MICROMIPS_LO16 and
R_MICROMIPS_GPREL16/R_MICROMIPS_SUB/R_MICROMIPS_HI16 chains of
relocations for %lo(%neg(%gp_rel())) and %hi(%neg(%gp_rel()))
expressions in case of microMIPS.

Differential Revision: http://reviews.llvm.org/D47220

llvm-svn: 333409
2018-05-29 11:33:54 +00:00
Sander de Smalen 98686c6b15 [AArch64][SVE] Asm: Support for ADD (immediate) instructions.
This patch adds addsub_imm8_opt_lsl_(i8|i16|i32|i64) operands
that are unsigned values in the range 0 to 255. For element widths of
16 bits or higher it may also be a signed multiple of 256 in the
range 0 to 65280.

Note: This also does some refactoring to reuse convenience function
getShiftedVal<shift>(), and now allows AArch64 scalar 'ADD #-4096' to be
accepted to be mapped to SUB #4096.

Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47310

llvm-svn: 333408
2018-05-29 10:39:49 +00:00
Simon Atanasyan 6be87bce29 [mips] Emit R_MICROMIPS_HIGHER / R_MICROMIPS_HIGHEST relocations
Emit R_MICROMIPS_HIGHER / R_MICROMIPS_HIGHEST relocations for %higher()
and %highest() expressions in case of microMIPS. These relocations do
exactly the same things as R_MIPS_HIGHER / R_MIPS_HIGHEST, but for
consistency it's better to write microMIPS variants.

Differential Revision: http://reviews.llvm.org/D47219

llvm-svn: 333407
2018-05-29 10:27:44 +00:00
Simon Dardis 0fad58cbaf [mips] Correct the predicates for a number of instructions.
Previously, their listed predicates were overridden at the scope level.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46947

llvm-svn: 333405
2018-05-29 09:56:19 +00:00
Simon Atanasyan b2d61fa3d8 [mips] Cleanup the code to reduce diff with the upcoming patches. NFC
llvm-svn: 333404
2018-05-29 09:51:33 +00:00
Simon Atanasyan d408ec4cfa [mips] Escape else-after-return. NFC
llvm-svn: 333403
2018-05-29 09:51:28 +00:00
Simon Atanasyan 3535cb1130 [mips] Stop parsing a .set assignment if the first argument is not an identifier
Before this fix the following code triggers two error messages. The
second one is at least useless:

  test.s:1:9: error: expected identifier after .set
    .set  123, $a0
          ^
  test-set.s:1:9: error: unexpected token, expected comma
    .set  123, $a0
          ^

llvm-svn: 333402
2018-05-29 09:51:22 +00:00
Tim Renouf fa213f797b [AMDGPU] Fixed build warning
Summary:
V2: Use cast instead of extra if.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47426

Change-Id: I6ac31da0306f79706960284a7ebd7b9c6237a83a
llvm-svn: 333397
2018-05-29 08:15:37 +00:00
Serge Pavlov 1a095524f2 Reverted commits 333390, 333391 and 333394
Build of shared library LLVMDemangle.so fails due to dependency problem.

llvm-svn: 333395
2018-05-29 07:05:41 +00:00
Serge Pavlov 335fa1eb04 Added library LLVMSupport to dependencies of LLVMDemangle
After r333390 build of LLVMDemangle.so fails due to unresolved
reference `llvm::report_bad_alloc_error`.

llvm-svn: 333394
2018-05-29 06:48:57 +00:00
Craig Topper a34f8731c7 [X86] Disable a DAG combine to allow packed AVX512DQ instructions to be consistently used for i64->float/double conversions.
Summary: We already get this right if the i64 didn't come from a load.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47439

llvm-svn: 333393
2018-05-29 06:22:45 +00:00
Clement Courbet 07c9ec6f2e [X86][Sched] Add InstRW for CLC on Intel after SNB.
Summary:
After SNB, Intel CPUs can rename CF independently of other EFLAGS,
so the renamer can zero it for free. Note that STC still consumes resources.

To reproduce: `$ llvm-exegesis -mode=uops -opcode-name=CLC`

On SNB:
```
---
key:
  opcode_name:     CLC
  mode:            uops
  config:          ''
cpu_name:        sandybridge
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: '3', value: 0.0014, debug_string: SBPort0 }
  - { key: '4', value: 0.0013, debug_string: SBPort1 }
  - { key: '5', value: 0.0003, debug_string: SBPort4 }
  - { key: '6', value: 0.0029, debug_string: SBPort5 }
  - { key: '10', value: 0.0003, debug_string: SBPort23 }
error:           ''
info:            'instruction is serial, repeating a random one.
Snippet:
CLC
'
...
```

On HSW:
```
---
key:
  opcode_name:     CLC
  mode:            uops
  config:          ''
cpu_name:        haswell
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: '3', value: 0.001, debug_string: HWPort0 }
  - { key: '4', value: 0.0009, debug_string: HWPort1 }
  - { key: '5', value: 0.0004, debug_string: HWPort2 }
  - { key: '6', value: 0.0006, debug_string: HWPort3 }
  - { key: '7', value: 0.0002, debug_string: HWPort4 }
  - { key: '8', value: 0.0012, debug_string: HWPort5 }
  - { key: '9', value: 0.0022, debug_string: HWPort6 }
  - { key: '10', value: 0.0001, debug_string: HWPort7 }
error:           ''
info:            'instruction is serial, repeating a random one.
Snippet:
CLC
'
...

```

Reviewers: craig.topper, RKSimon

Subscribers: gchatelet, llvm-commits

Differential Revision: https://reviews.llvm.org/D47362

llvm-svn: 333392
2018-05-29 06:19:39 +00:00
Serge Pavlov 0e31285fe8 Use uniform mechanism for OOM errors handling
In r325551 many calls of malloc/calloc/realloc were replaces with calls of
their safe counterparts defined in the namespace llvm. There functions
generate crash if memory cannot be allocated, such behavior facilitates
handling of out of memory errors on Windows.

If the result of *alloc function were checked for success, the function was
not replaced with the safe variant. In these cases the calling function made
the error handling, like:

    T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T)));
    if (NewElts == nullptr)
      report_bad_alloc_error("Allocation of SmallVector element failed.");

Actually knowledge about the function where OOM occurred is useless. Moreover
having a single entry point for OOM handling is convenient for investigation
of memory problems. This change removes custom OOM errors handling and
replaces them with calls to functions `llvm::safe_*alloc`.

Declarations of `safe_*alloc` are moved to a separate include file, to avoid
cyclic dependency in SmallVector.h

Differential Revision: https://reviews.llvm.org/D47440

llvm-svn: 333390
2018-05-29 05:39:08 +00:00
Craig Topper 21aeddc3dc [X86] Remove masked vpermi2var/vpermt2var intrinsics and autoupgrade.
We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added.

llvm-svn: 333388
2018-05-29 05:22:05 +00:00
Craig Topper 2adc7d956c [X86] Add unmasked vermi2var intrinsics so we can use explicit select instructions for masking in clang.
This will allow us to remove the 3 different flavors of masked intrinsics. I'm leaving the actual intrinsic removal for another patch.

llvm-svn: 333386
2018-05-29 03:26:30 +00:00
Craig Topper dcfcfdb0d1 [X86] Converge X86ISD::VPERMV3 and X86ISD::VPERMIV3 to a single opcode.
These do the same thing with the first and second sources swapped. They previously came from separate intrinsics that specified different masking behavior. But we can cover that with isel patterns and a single node.

This is a step towards reducing the number of intrinsics needed.

A bunch of tests change because we are now biased to choosing VPERMT over VPERMI when there is nothing to signal that commuting is beneficial.

llvm-svn: 333383
2018-05-28 19:33:11 +00:00
Craig Topper 6b545182fb [X86] Fix typo in comment. NFC
llvm-svn: 333382
2018-05-28 19:33:06 +00:00
Farhana Aleen eacb1020aa [AMDGPU] Re-enabled 128bit wide-vector generation for local addr space by default.
Summary: Bug reported here https://bugs.freedesktop.org/show_bug.cgi?id=105464 found
         to be resolved by some other fixes.

Author: FarhanaAleen
llvm-svn: 333380
2018-05-28 18:15:11 +00:00
Fangrui Song afa95ee03d [LLVM-C] [OCaml] Remove LLVMAddBBVectorizePass
Summary: It was fully replaced back in 2014, and the implementation was removed 11 months ago by r306797.

Reviewers: hfinkel, chandlerc, whitequark, deadalnix

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47436

llvm-svn: 333378
2018-05-28 16:58:10 +00:00
Lei Huang 651be44913 [Power9]Legalize and emit code for HW/Byte vector extract and convert to QP
Implemente patterns to extract HWord and Byte vector elements and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46774

llvm-svn: 333377
2018-05-28 16:43:29 +00:00
Zaara Syeda 6f3df02fdc [PowerPC] Set isAsmParserOnly=1 for X-form TLS loads/stores
The X-form TLS load/store instructions added for optimizing the initial-exec
sequence in https://reviews.llvm.org/rL327635 fail to assemble. llvm-mc fails
with the error: invalid operand for instruction. This patch adds these
instructions into a block with isAsmParserOnly, similar to how ADD8TLS_ is
currently handled.

Differential Revision: https://reviews.llvm.org/D47382

llvm-svn: 333374
2018-05-28 15:27:58 +00:00
Daniel Cederman 2e7fe0edaf [Sparc] Add .uahalf and .uaword directives
Summary:
Adding these makes it easier to assemble the output from GCC which
generates a lot of .uahalf and .uaword directives.

GAS treats .uahalf and .half the same unless the --enforce-aligned-data
flag is used. I could not find a similar flag for LLVM so it seems that
.half does not have any alignment requirement and is treated the same as
.uahalf should be. If that would change later on then the tests in
sparc-directives.s would fail due to bad alignment.

Reviewers: jyknight, asb

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D47319

llvm-svn: 333372
2018-05-28 12:42:55 +00:00
Craig Topper 26bc84860a [X86] Stop forcing X86VPermi2X node index operand to match destination type to make masking pattern matching easier. Add extra patterns with bitcasts instead.
This basically reverts r280696 in favor of using extra patterns as mentioned as an alternative in that commit message. For now I've only added the cases we have test cases for, but it should be easy to add more in the future.

This will help to convert VPERMI2PS/VPERMT2PS intrinsics to use a single ISD node opcode. And hopefully allow some intrinsics to be removed.

llvm-svn: 333365
2018-05-28 05:37:25 +00:00
Tim Renouf 364edcd2e5 [AMDGPU] Fixed WWM bug in block otherwise entirely in WQM
Summary:
For a block with WQM on entry and exit and containing no exact mode
code, but containing some WWM code, the WQM pass forgot to process the
block at all and so did not insert code to enter and leave WWM.

This commit fixes that.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47027

Change-Id: I044792eead1293bed4203fb26ce75f47878afeb6
llvm-svn: 333362
2018-05-27 17:26:11 +00:00
Simon Pilgrim 79dae5ba2a [X86] Don't hardcode scheduler class
Also fixes BEXTRI instruction to use WritBEXTR class, which was missed when the class was added.

llvm-svn: 333360
2018-05-27 14:54:18 +00:00
David Green aee7ad0cde Revert 333358 as it's failing on some builders.
I'm guessing the tests reply on the ARM backend being built.

llvm-svn: 333359
2018-05-27 12:54:33 +00:00
David Green 3034281b43 [UnrollAndJam] Add a new Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

for i..
  ForeBlocks(i)
  for j..
    SubLoopBlocks(i, j)
  AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

for i... i+=2
  ForeBlocks(i)
  ForeBlocks(i+1)
  for j..
    SubLoopBlocks(i, j)
    SubLoopBlocks(i+1, j)
  AftBlocks(i)
  AftBlocks(i+1)
Remainder

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953

llvm-svn: 333358
2018-05-27 12:11:21 +00:00
Eric Christopher 958a1f8d87 Remove boolean argument from isSuitableFromBSS.
The argument was used as an additional negative condition and can be
expressed in the if conditional without needing to pass it down.
Update bss commentary around main use.

llvm-svn: 333357
2018-05-27 11:39:34 +00:00
Eric Christopher ed169ec424 Cleanups for getKindForGlobal:
- Clarify block comment
 - Make Function/GlobalVariable split more explicit.
 - Move locals closer to uses.

llvm-svn: 333356
2018-05-27 11:23:20 +00:00
Jonas Devlieghere cb547cbb5c [dwarfdump] Make -c and -p work together
When requesting to dump both the parent chain and children, we used to
print the DIE more than once because we propagated the dump options to
the parent without clearing the respective flags. This commit fixes this
oversight and adds a test.

rdar://39415292

Differential revision: https://reviews.llvm.org/D47263

llvm-svn: 333350
2018-05-26 19:39:56 +00:00
Craig Topper 51eddb8749 [X86] Remove masking from avx512ifma intrinsics. Use a select instead.
This allows us to avoid having mask and maskz variant. Reducing from 12 intrinsics to 6.

llvm-svn: 333346
2018-05-26 18:55:19 +00:00
Teresa Johnson 724e7a19de [ThinLTO] Fix bot failures from r333335
Change value in vector from StringRef to std::string to avoid errors
when trying to initialize from a std::string.

llvm-svn: 333336
2018-05-26 02:53:52 +00:00
Teresa Johnson 08d5b4ef0d [ThinLTO] Print module summary index to assembly
Summary:
Implements AsmWriter support for printing the module summary index to
assembly with the format discussed in the RFC "LLVM Assembly format for
ThinLTO Summary".

Implements just enough of the parsing support to recognize and ignore
the summary entries. As agreed in the RFC thread, this will be the
behavior when assembling the IR. A follow on change will implement
parsing/assembling of the summary entries for use by tools that
currently build the summary index from bitcode.

Reviewers: dexonsmith, pcc

Subscribers: inglorion, eraman, steven_wu, dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D46699

llvm-svn: 333335
2018-05-26 02:34:13 +00:00
George Burgess IV 45f263dd90 [MemorySSA] Reflow comments + clean up control flow; NFC
Style guide says `else`s after returns are iffy, and I agree. I also
don't know what broke the comments here and in CFLAA, but *shrug*.

llvm-svn: 333332
2018-05-26 02:28:55 +00:00
George Burgess IV e81681805c [CFLAA] Reflow comments; NFC
llvm-svn: 333330
2018-05-26 02:17:43 +00:00
Florian Hahn 718af2f817 Revert r333268: [IPSCCP] Use PredicateInfo to propagate facts from...
Reverting this to see if this is causing the failures of the
clang-with-thin-lto-ubuntu bot.

[IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.

This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330

llvm-svn: 333323
2018-05-25 23:32:02 +00:00
George Burgess IV 319be3a4e6 Replace AA's uses of uint64_t with LocationSize; NFC.
The uint64_ts that we pass around AA to represent MemoryLocation sizes
are logically an Optional<uint64_t>. In D44748, we want to add an extra
'imprecise' bit to this Optional<uint64_t> to represent whether a given
MemoryLocation size is an upper-bound or an exact size. For more context
on why, please see D44748.

That patch is quite large, but reviewers seem to be OK with the
approach. In D45581 (my first attempt to split 'noise' out of D44748),
reames asked that I land a precursor that is solely replacing uint64_t
with LocationSize, which starts out as `using LocationSize = uint64_t;`.
He also gave me the OK to submit this rename without further review.

llvm-svn: 333314
2018-05-25 21:16:58 +00:00
Petr Hosek f92ca01e42 [Support] Avoid normalization in sys::getDefaultTargetTriple
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.

Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.

Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.

Differential Revision: https://reviews.llvm.org/D47153

llvm-svn: 333307
2018-05-25 20:39:37 +00:00
Guozhi Wei 03005d3f03 [CodeGenPrepare] Revert r331783
The patch r331783 caused regression in one of our internal application. So revert it now, will investigate it further.

llvm-svn: 333305
2018-05-25 20:30:26 +00:00
Mark Searles 32efedcff3 [AMDGPU][Waitcnt] Remove obsolete waitcnt option
With the removal of the old waitcnt pass, the '-enable-si-insert-waitcnts' option is obsolete. Remove it.

Differential Revision: https://reviews.llvm.org/D47378

llvm-svn: 333303
2018-05-25 20:24:08 +00:00
Craig Topper 8f77dcadff Recommit r333226 "[ValueTracking] Teach computeKnownBits that the result of an absolute value pattern that uses nsw flag is always positive."
Libfuzzer tests have been fixed to prevent being optimized.

Original commit message:

If the nsw flag is used in the absolute value then it is undefined for INT_MIN. For all other value it will produce a positive number. So we can assume the result is positive.

This breaks some InstCombine abs/nabs combining tests because we simplify the second compare from known bits rather than as the whole pattern. Looks like we can probably fix it by adding a neg+abs/nabs combine to just swap the select operands. N

Differential Revision: https://reviews.llvm.org/D47041

llvm-svn: 333300
2018-05-25 19:18:09 +00:00
Stanislav Mekhanoshin 7fc1cee051 [AMDGPU] Fixed test failure with AMDGPUPerfHint
We shall not keep iterator to a map while map is modified,
this leads to a broken map.

llvm-svn: 333298
2018-05-25 18:46:58 +00:00
Reid Kleckner cb48efd585 Fix -Winconsistent-missing-overrides in AMDGPU code
llvm-svn: 333291
2018-05-25 17:46:24 +00:00
Stanislav Mekhanoshin 1c538423dc [AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:

1. Function is memory bound.
2. Kernel can use wave limiter.

Currently these hints are used in the scheduler. If a function is suspected
to be memory bound we allow occupancy to decrease to 4 waves in the course
of scheduling.

Differential Revision: https://reviews.llvm.org/D46992

llvm-svn: 333289
2018-05-25 17:25:12 +00:00
Simon Dardis 6a31992383 [mips] Fix the definitions of lwp, swp
Rather than using a regpair operand of these instructions, use two seperate
operands and a custom converter to handle the implicit second register operand.

Additionally, remove the microMIPS32R6 definition as its redundant.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D47255

llvm-svn: 333288
2018-05-25 16:15:48 +00:00
Teresa Johnson c80c873837 [NFC] Restructure linkage name printing in AsmWriter
This restructuring was suggested in the review for D46699, which
prepares the linkage type printer for use in printing the ThinLTO
summary index (where we want to print "external" and also don't
want a space after the linkage type as it is printed by the caller).

llvm-svn: 333281
2018-05-25 15:15:39 +00:00
Krzysztof Parzyszek 95b073525b [Hexagon] Fix packing source vectors in shufflevector selection
When the shuffle mask selected a subvector of the second input vector,
and aligning of the source was performed, the shuffle mask was updated
incorrectly, resulting in an ICE further in the selection process.

llvm-svn: 333279
2018-05-25 14:53:14 +00:00
David Stenberg 05b6a53340 [MustExecute] Fix a debug invariant issue in isGuaranteedToExecute()
Summary:
Look past debug intrinsics when querying whether an instruction is the
first instruction in the header block. The commit includes a reproducer
for a case where LICM would not hoist an instruction, due to the presence
of the intrinsic.

A caveat with this commit is that the check will not work properly if
the instruction at hand is a debug intrinsic. I assume that no one
depends on isGuaranteedToExecute() to return true for debug intrinsics
for these cases (and that this might be an indication of another debug
invariant issue), so I thought that it was not worth adding that extra
bit of complexity.

Reviewers: reames, anna

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47197

llvm-svn: 333274
2018-05-25 13:02:59 +00:00
Simon Pilgrim 0155bf0da9 [X86][SNB] Fix differences between vex/non-vex XMM vector moves (PR37286)
As confirmed by llvm-exegesis, there is no scheduler difference between MOVDQA/MOVDQU and VMOVDQA/VMOVDQU xmm reg-reg moves

Another chapter in the never ending crusade to remove useless InstRW overrides from the x86 scheduler models......

llvm-svn: 333271
2018-05-25 12:18:11 +00:00