Commit Graph

112391 Commits

Author SHA1 Message Date
Simon Pilgrim 294556d40e [X86] Remove remaining system/special schedule itineraries (PR37093)
llvm-svn: 329906
2018-04-12 12:43:49 +00:00
Simon Dardis a5a3c38c3d [mips] Correct the predicates for special nops, tlb ctrl instrs, software breakpoint and prefx.
Reviewers: atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D44436

llvm-svn: 329905
2018-04-12 12:37:02 +00:00
Simon Pilgrim 0cd0fbd8c5 [X86] Remove system/control schedule itineraries (PR37093)
llvm-svn: 329903
2018-04-12 12:09:24 +00:00
Sander de Smalen 650234ba36 [AArch64][AsmParser] Make parse function for VectorLists generic to other vector types.
Summary:
Added 'RegisterKind' to the VectorListOp structure, so that this operand 
type can be reused for SVE vector lists in a later patch. It also
refactors the 'tryParseVectorList' function so it can be used directly
in the ParserMethod of an operand. The parsing can now parse multiple 
kinds of vectors and recover if there is no match.

This is patch [3/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: rengolin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45429

llvm-svn: 329900
2018-04-12 11:40:52 +00:00
Shiva Chen b48b027d05 [RISCV] Change function alignment to 4 bytes, and 2 bytes for RVC
Summary:

According RISC-V ELF psABI specification, base RV32 and RV64 ISAs only
allow 32-bit instruction alignment, but instruction allow to be aligned
to 16-bit boundaries for C-extension.

So we just align to 4 bytes and 2 bytes for C-extension is enough.

Reviewers: asb, apazos

Differential Revision: https://reviews.llvm.org/D45560

Patch by Kito Cheng.

llvm-svn: 329899
2018-04-12 11:30:59 +00:00
Simon Pilgrim 69e0e8e3d4 [X86] Remove CMOV/SETCC schedule itineraries (PR37093)
llvm-svn: 329898
2018-04-12 11:01:40 +00:00
Simon Pilgrim 10e3bdaaa8 [X86] Remove MMX/3DNow schedule itineraries (PR37093)
llvm-svn: 329896
2018-04-12 10:49:57 +00:00
Simon Pilgrim 32d368147f [X86] Remove X87 schedule itineraries (PR37093)
First of a number of commits to remove x86 schedule itineraries entirely - approved off-line with @craig.topper

llvm-svn: 329893
2018-04-12 10:27:37 +00:00
Jonas Paulsson 319ce96fe4 [SystemZ] Use ResourceCycles=30 for FPd unit (NFC).
This is better than listing FPd 30 times :-)

Review: Ulrich Weigand
llvm-svn: 329887
2018-04-12 08:08:42 +00:00
Jonas Paulsson e3f53e5d14 [SystemZ] Remove FullInstRWOverlapCheck from SchedMachineModels.
This is NFC, even though it caught just a few cases of overlapping regular
expressions.

Review: Ulrich Weigand
llvm-svn: 329886
2018-04-12 08:06:04 +00:00
Jonas Paulsson 26e171f0a7 [HexagonMachineScheduler] Remove local (copied) getWeakLeft().
Since the common code getWeakLeft() is now available, there should not
be a local copy of this function in target.

llvm-svn: 329885
2018-04-12 07:39:33 +00:00
Jonas Paulsson e8f1ac7063 [MachineScheduler] NFC refactoring
This patch makes tryCandidate() virtual and some utility functions like
tryLess(), tryGreater(), ... externally available (used to be static).

This makes it possible for a target to derive a new MachineSchedStrategy from
GenericScheduler and reuse most parts.

It was necessary to wrap functions with the same names in
AMDGPU/SIMachineScheduler in a local namespace.

Review: Andy Trick, Florian Hahn
https://reviews.llvm.org/D43329

llvm-svn: 329884
2018-04-12 07:21:39 +00:00
Craig Topper 46300d1ff6 [LegalizeTypes] Remove unnecessary type action check on the type of operand 0 when promoting shift result type. NFC
Operand 0 should have the same type of the result. So if the result type needs to be promoted, operand 0 needs to be promoted unconditionally.

llvm-svn: 329883
2018-04-12 06:51:58 +00:00
Hiroshi Inoue bcadfee2ad [NFC] fix trivial typos in documents and comments
"is is" -> "is", "if if" -> "if", "or or" -> "or"

llvm-svn: 329878
2018-04-12 05:53:20 +00:00
Alex Bradbury 21d28fe8b8 [RISCV] Codegen support for RV32D floating point comparison operations
Also add double-prevoius-failure.ll which captures a test case that at one
point triggered a compiler crash, while developing calling convention support
for f64 on RV32D with soft-float ABI.

llvm-svn: 329877
2018-04-12 05:50:06 +00:00
Alex Bradbury 60baa2e015 [RISCV] Codegen support for RV32D floating point conversion operations
This also includes support and a test for truncating stores, which are now
possible thanks to the fpround pattern.

llvm-svn: 329876
2018-04-12 05:47:15 +00:00
Alex Bradbury 5d0dfa5e0e [RISCV] Add codegen support for RV32D floating point arithmetic operations
llvm-svn: 329874
2018-04-12 05:42:42 +00:00
Alex Bradbury 0b4175f160 [RISCV] Codegen support for RV32D floating point load/store, fadd.d, calling conv
fadd.d is required in order to force floating point registers to be used in
test code, as parameters are passed in integer registers in the soft float
ABI.

Much of this patch is concerned with support for passing f64 on RV32D with a
soft-float ABI. Similar to Mips, introduce pseudoinstructions to build an f64
out of a pair of i32 and to split an f64 to a pair of i32. BUILD_PAIR and
EXTRACT_ELEMENT can't be used, as a BITCAST to i64 would be necessary, but i64
is not a legal type.

llvm-svn: 329871
2018-04-12 05:34:25 +00:00
Yan Luo bedca0b41b Test commit access
llvm-svn: 329870
2018-04-12 04:26:49 +00:00
George Burgess IV 48ee59b6f0 [DeadArgElim] Remove allocsize attributes on callsites
We're already removing allocsize attributes from Functions that we
remove args from, since removing arguments from a function may make the
allocsize attribute incorrect. It appears we forgot to also remove them
from callsites.

Without this, I get verifier errors on `@Test2`.

It probably wouldn't be too hard to make DAE properly update allocsize
attributes instead of dropping them, but I can't think of a scenario
where that'd be useful in practice.

llvm-svn: 329868
2018-04-12 02:06:01 +00:00
Michael Zolotukhin 815f453f76 Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time.
This reapplies commit r329644.

llvm-svn: 329865
2018-04-11 23:37:53 +00:00
Michael Zolotukhin 4fbb93003b [SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify order of evaluation.
The standard says that the order of evaluation of an expression
  s[x] = foo()
is unspecified. In our case, we first create an empty entry in the map,
then call foo(), then store its return value to the created entry. The
problem is that foo uses the map as a cache, so if it finds that there
is an entry in the map, it stops computation. This change explicitly
sets the order, thus fixing this heisenbug.

llvm-svn: 329864
2018-04-11 23:37:37 +00:00
Simon Pilgrim 7b88d09e75 [X86] Remove unused itinerary argument from FMA3/FMA4/XOP instructions. NFCI.
llvm-svn: 329862
2018-04-11 23:24:38 +00:00
Weiming Zhao 1bd40005ba Add missing vtable anchors
Summary: This patch adds anchor() for MemoryBuffer, raw_fd_ostream, RTDyldMemoryManager, SectionMemoryManager, etc.

Reviewers: jlebar, eli.friedman, dblaikie

Reviewed By: dblaikie

Subscribers: mehdi_amini, mgorny, dblaikie, weimingz, llvm-commits

Differential Revision: https://reviews.llvm.org/D45244

llvm-svn: 329861
2018-04-11 23:09:20 +00:00
whitequark 1ae61a6126 [LLVM-C] Add LLVMGetHostCPU{Name,Features}.
Without these functions it's hard to create a TargetMachine for
Orc JIT that creates efficient native code.

It's not sufficient to just expose LLVMGetHostCPUName(), because
for some CPUs there's fewer features actually available than
the CPU name indicates (e.g. AVX might be missing on some CPUs
identified as Skylake).

Differential Revision: https://reviews.llvm.org/D44861

llvm-svn: 329856
2018-04-11 22:40:42 +00:00
Nemanja Ivanovic c564dc060a [PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+i
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039
The condition only covers one of the two 64-bit rotate instructions. This just
adds the second (RLDICLo).

Patch by Josh Stone.

llvm-svn: 329852
2018-04-11 21:25:44 +00:00
Yonghong Song 149d4d3730 bpf: signal error instead of silent drop for certain invalid asm insn
Currently, an invalid asm insn, either in an asm file or
in an inline asm format, might be silently dropped. This patch
fixed two places where this may happen by
signaling the error so user knows what goes wrong.

The following is an example to demonstrate error messages:

    -bash-4.2$ cat t.c
    int test(void *ctx) {
    #if defined(NO_ERROR)
      asm volatile("r0 = *(u16 *)skb[%0]" : : "i"(2));
    #elif defined(ERROR_1)
      asm volatile("r20 = *(u16 *)skb[%0]" : : "i"(2));
    #elif defined(ERROR_2)
      asm volatile("r0 = *(u16 *)(r1 + ?)" : :);
    #endif
      return 0;
    }
    -bash-4.2$ cat run.sh
    for macro in NO_ERROR ERROR_1 ERROR_2; do
      echo "===== compile for macro" $macro
      clang -D${macro} -O2 -target bpf -emit-llvm -S t.c
      echo "==llc=="
      llc -march=bpf -filetype=obj t.ll
    done
    -bash-4.2$ ./run.sh
    ===== compile for macro NO_ERROR
    ==llc==
    ===== compile for macro ERROR_1
    ==llc==
    <inline asm>:1:2: error: invalid register/token name
            r20 = *(u16 *)skb[2]
            ^
    note: !srcloc = 135
    ===== compile for macro ERROR_2
    ==llc==
    <inline asm>:1:21: error: unexpected token
            r0 = *(u16 *)(r1 + ?)
                               ^
    note: !srcloc = 210
    -bash-4.2$

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 329849
2018-04-11 20:24:52 +00:00
Gabor Buella 2ef36f3571 [X86] Describe wbnoinvd instruction
Similar to the wbinvd instruction, except this
one does not invalidate caches. Ring 0 only.
The encoding matches a wbinvd instruction with
an F3 prefix.

Reviewers: craig.topper, zvi, ashlykov

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D43816

llvm-svn: 329847
2018-04-11 20:01:57 +00:00
Peter Collingbourne cb8a666f4b CodeGen: Don't try to canonicalize Unix-style paths in CodeView debug info.
Most importantly, we should not replace slashes with backslashes
because that would invalidate the path.

Differential Revision: https://reviews.llvm.org/D45473

llvm-svn: 329838
2018-04-11 18:24:03 +00:00
Simon Pilgrim 8fc2b49620 [X86][Atom] Convert Atom scheduler model to SchedRW (PR32431)
Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550.

I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here,

There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers.

There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical.

NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329837
2018-04-11 18:23:01 +00:00
Simon Pilgrim 7f321d8c24 [X86] Generalize X86PadShortFunction to work with TargetSchedModel
Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call.

Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329834
2018-04-11 18:05:17 +00:00
Artem Belevich 2f8efcf3ca [NVPTX] Removed 'satom' feature which is no longer used.
Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329830
2018-04-11 17:51:33 +00:00
Artem Belevich 24e8a680e5 [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829
2018-04-11 17:51:19 +00:00
Tim Renouf fd8d4af3bc [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Re-landed after noticing that the buildbot failure from 329808 seemed to
be unrelated.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329826
2018-04-11 17:18:36 +00:00
Reid Kleckner 0828699488 [FastISel] Disable local value sinking by default
This is causing compilation timeouts on code with long sequences of
local values and calls (i.e. foo(1); foo(2); foo(3); ...).  It turns out
that code coverage instrumentation is a great way to create sequences
like this, which how our users ran into the issue in practice.

Intel has a tool that detects these kinds of non-linear compile time
issues, and Andy Kaylor reported it as PR37010.

The current sinking code scans the whole basic block once per local
value sink, which happens before emitting each call. In theory, local
values should only be introduced to be used by instructions between the
current flush point and the last flush point, so we should only need to
scan those instructions.

llvm-svn: 329822
2018-04-11 16:03:07 +00:00
Sanjay Patel ff98682c9c [InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse()
llvm-svn: 329821
2018-04-11 15:57:18 +00:00
Paul Robinson 0195469a23 [DWARFv5] Fuss with asm syntax for conveying MD5 checksum.
Previously the MD5 option of the .file directive provided the checksum
as a quoted hex string; now it's a normal hex number with 0x prefix,
same as the .octa directive accepts.

Differential Revision: https://reviews.llvm.org/D45459

llvm-svn: 329820
2018-04-11 15:14:05 +00:00
Petar Jovanovic 366857a23a [MIPS GlobalISel] Select add i32, i32
Add the minimal support necessary to lower a function that returns the
sum of two i32 values.
Support argument/return lowering of i32 values through registers only.
Add tablegen for regbankselect and instructionselect.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D44304

llvm-svn: 329819
2018-04-11 15:12:32 +00:00
Yaxun Liu 9381ae9791 [AMDGPU] Fix lowering enqueue_kernel
Two issues were fixed:

runtime has difficulty to allocate memory for an external symbol of a
kernel and set the address of the external symbol, therefore make the runtime
handle of an enqueued kernel an ordinary global variable. Runtime only needs
to store the address of the loaded kernel to the handle and has verified
that this approach works.

handle the situation where __enqueue_kernel* gets inlined therefore
the enqueued kernel may be used through a constant expr instead
of an instruction.

Differential Revision: https://reviews.llvm.org/D45187

llvm-svn: 329815
2018-04-11 14:46:15 +00:00
Tim Renouf 8ca33bfcf3 Revert "[AMDGPU] Ensure there are enough registers for wave dispatch"
This reverts 329808. That change caused a report of a failure in
test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect
it is an expensive-check-only error.

Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0
llvm-svn: 329811
2018-04-11 14:27:41 +00:00
Sander de Smalen c88f9a1a57 [AArch64][AsmParser] Split index parsing from vector list.
Summary:
Place parsing of a vector index into a separate function to reduce
duplication, since the code is duplicated in both the parsing of a
Neon vector register operand and a Neon vector list.

This is patch [2/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: rengolin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45428

llvm-svn: 329809
2018-04-11 14:10:37 +00:00
Tim Renouf f26b723491 [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808
2018-04-11 14:02:41 +00:00
Simon Pilgrim 89c8a10f7c [X86] Add variable shuffle schedule classes
Split variable index shuffles from immediate index shuffles

WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.)
WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.)

WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.)
WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.)

Differential Revision: https://reviews.llvm.org/D45404

llvm-svn: 329806
2018-04-11 13:49:19 +00:00
Dmitry Preobrazhensky fc715551a3 [AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32
See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845

Differential Revision: https://reviews.llvm.org/D45443

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329801
2018-04-11 13:13:30 +00:00
Francis Visoiu Mistrih 6463922e3a [AArch64] Fix regression after r329691
In r329691, we would choose FP even if the offset wouldn't fit, just
because the offset is smaller than the one from BP. This made many
accesses through FP need to scavenge a register, which resulted in
slower and bigger code for no good reason.

This patch now always picks the offset that fits first, even if FP is
preferred.

llvm-svn: 329797
2018-04-11 12:36:55 +00:00
Artur Gainullin d928201ac5 Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max.
Bitwise 'not' of the min/max could be eliminated in the pattern:

%notx = xor i32 %x, -1
%cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y
%smax = select i1 %cmp1, i32 %notx, i32 %y
%res = xor i32 %smax, -1

https://rise4fun.com/Alive/lCN

Reviewers: spatel

Reviewed by: spatel

Subscribers: a.elovikov, llvm-commits

Differential Revision: https://reviews.llvm.org/D45317

llvm-svn: 329791
2018-04-11 10:29:37 +00:00
Sjoerd Meijer ac96d7c4b3 [ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.

More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first LowerConstantFP need to be adjusted for fp16 values.

Differential Revision: https://reviews.llvm.org/D45205

llvm-svn: 329788
2018-04-11 09:28:04 +00:00
Sander de Smalen 73937b7c9d [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors.
Summary:
Merged 'tryMatchVectorRegister' (specific to Neon) and
'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and
created a generic 'parseVectorKind()' function that returns the #Elements
and ElementWidth of a vector suffix. This reduces the duplication of
this functionality between two the vector implementations.

This is patch [1/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: fhahn

Subscribers: tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45427

llvm-svn: 329782
2018-04-11 07:36:10 +00:00
Craig Topper 9507fa358c [X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select.
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.

llvm-svn: 329774
2018-04-11 04:55:04 +00:00
Craig Topper ee2c1dea4d [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction.
Previously the code only knew how to handle setcc to a register.

This should fix a crash in the chromium build.

llvm-svn: 329771
2018-04-11 01:09:10 +00:00
Sriraman Tallam 182f2df7c5 Simplification of libcall like printf->puts must check for RtLibUseGOT metadata.
With -fno-plt, for example, calls to printf when getting converted to puts
still use the PLT. This patch checks for the metadata "RtLibUseGOT" and
annotates the declaration with the right attributes.

Differential Revision: https://reviews.llvm.org/D45180

llvm-svn: 329768
2018-04-10 23:32:36 +00:00
Sriraman Tallam d693093a65 GOTPCREL references must always use RIP.
With -fno-plt, global value references can use GOTPCREL and RIP must be used.

Differential Revision: https://reviews.llvm.org/D45460

llvm-svn: 329765
2018-04-10 22:50:05 +00:00
Marek Olsak a9a58fa236 AMDGPU: enable 128-bit for local addr space under an option
Author: Samuel Pitoiset

ds_read_b128 and ds_write_b128 have been recently enabled
under the amdgpu-ds128 option because the performance benefit
is unclear.

Though, using 128-bit loads/stores for the local address space
appears to introduce regressions in tessellation shaders. Not
sure what is broken, but as ds_read_b128/ds_write_b128 are not
enabled by default, just introduce a global option and enable
128-bit only if requested (until it's fixed/used correctly).

v2: - fix regressions in merge-stores.ll and multiple_tails.ll

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
llvm-svn: 329764
2018-04-10 22:48:23 +00:00
Geoff Berry 5696e075c3 [AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass.
Summary:
When inserting MOVs to avoid Falkor HWPF collisions, the non-base
register operand of load instructions (e.g. a register offset) was not
being considered live, so it could potentially have been used as a
scratch register, clobbering the actual offset value.

Reviewers: mcrosier

Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45502

llvm-svn: 329761
2018-04-10 21:43:03 +00:00
Sanjay Patel 3b6d46761f [CVP] simplify phi with constant incoming values that match common variable edge values
This is based on an example that was recently posted on llvm-dev:

void *propagate_null(void* b, int* g) {
  if (!b) {
    return 0;
  }
  (*g)++;
  return b;
}

https://godbolt.org/g/xYk3qG

The original code or constant propagation in other passes has obscured the fact 
that the phi can be removed completely.

Differential Revision: https://reviews.llvm.org/D45448

llvm-svn: 329755
2018-04-10 20:42:39 +00:00
Daniel Neilson 5e10637a3b [Verifier] Refactor duplicate code for atomic mem intrinsic verification (NFC)
Summary:
The verification rules for the intrinsics for atomic memcpy, atomic memmove,
and atomic memset are basically code clones. This change merges their verification
rules into a single block to remove duplication.

llvm-svn: 329753
2018-04-10 20:23:50 +00:00
Steven Wu d0804aa6dc [MachO] Emit Weak ReadOnlyWithRel to ConstDataSection
Summary:
Darwin dynamic linker can handle weak symbols in ConstDataSection.
ReadonReadOnlyWithRel symbols should be emitted in ConstDataSection
instead of normal DataSection.

rdar://problem/39298457

Reviewers: dexonsmith, kledzik

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45472

llvm-svn: 329752
2018-04-10 20:16:35 +00:00
Jessica Paquette a450ed2352 Recommit r329716 "Add missing nullptr check before getSection() to AArch64MachObjectWriter::recordRelocation"
This commit fixes the bot failures that were coming up before with r329716.

The fix was to move the check for "isInSection()" inside of the if condition
and emit the error there instead of waiting to get past the unreachable statement.

This should work in debug and release builds now.

llvm-svn: 329746
2018-04-10 19:46:43 +00:00
Amara Emerson e27d5016ef [AArch64] Fix isel failure when BUILD_PAIR nodes are left over.
rdar://39175175

llvm-svn: 329743
2018-04-10 19:01:58 +00:00
Gabor Buella 213edc4a15 [X86] Split up -march=icelake to -client & -server
Reviewers: craig.topper, zvi, echristo

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45055

llvm-svn: 329742
2018-04-10 18:59:13 +00:00
Sanjay Patel 5da361a0b0 [InstSimplify] fix formatting; NFC
llvm-svn: 329736
2018-04-10 18:38:19 +00:00
Craig Topper 442428540a [X86] Change the name string for the newly add DF flag register to 'dirflag' to match the clobber name supported by clang for MS inline assembly.
This should fix the failure found by Chromium reported here https://bugs.chromium.org/p/chromium/issues/detail?id=831158

The test case will be added in clang.

llvm-svn: 329734
2018-04-10 18:21:04 +00:00
Robert Widmann 58568254bc [LLVM-C] Add Missing 'break's in InlineAsm bindings
Summary: Noticed by Andrea Di Biagio while reviewing r329369

Reviewers: whitequark, harlanhaskins

Reviewed By: harlanhaskins

Subscribers: llvm-commits, abergmeier-dsfishlabs

Differential Revision: https://reviews.llvm.org/D45496

llvm-svn: 329731
2018-04-10 18:10:10 +00:00
Jessica Paquette c140bbddaf Revert 329716 "Add missing nullptr check before getSection() to AArch64MachObjectWriter::recordRelocation"
This broke a bunch of bots so I'm reverting while I figure it out.

llvm-svn: 329728
2018-04-10 17:53:41 +00:00
Aaron Smith 3dca0bedbb [DebugInfoPDB] Add DIA implementations of findSymbolByRVA and findSymbolByAddr
llvm-svn: 329724
2018-04-10 17:33:18 +00:00
Krzysztof Parzyszek 71a4c0ca07 [CodeGen] Fix printing bundles in MIR output
Delay printing the newline until after the opening bracket was
printed, e.g.
  BUNDLE implicit-def $r1, implicit-def $r21, implicit $r1 {
    renamable $r1 = S2_asr_i_r renamable $r1, 1
    renamable $r21 = A2_tfrsi 0
  }
instead of
  BUNDLE implicit-def $r1, implicit-def $r21, implicit $r1
 {    renamable $r1 = S2_asr_i_r renamable $r1, 1
    renamable $r21 = A2_tfrsi 0
  }

llvm-svn: 329719
2018-04-10 16:46:13 +00:00
Peter Collingbourne a7d936f0c0 Revert r329611, "AArch64: Allow offsets to be folded into addresses with ELF."
Caused a build failure in check-tsan.

llvm-svn: 329718
2018-04-10 16:19:30 +00:00
Jessica Paquette e4b90d82a0 Add missing nullptr check to AArch64MachObjectWriter::recordRelocation
There was missing nullptr check before a call to getSection() in
recordRelocation. This would result in a segfault in code like the attached
test.

This adds the missing check and a test which makes sure we get the expected 
error output.

llvm-svn: 329716
2018-04-10 15:53:28 +00:00
Nicolai Haehnle b1c3b22b4c AMDGPU/MC: Allow disassembling without symbol info
Summary:
We would like the UMR debugging tool[0] to be able to provide
disassembly for currently live waves based on plain memory
dumps, and we want to leverage the LLVM disassembler for this.

This mostly works, except that UMR clearly can't provide real
symbol info, so it wants to set DisInfo == nullptr.

[0] https://cgit.freedesktop.org/amd/umr/

Reviewers: arsenm, rampitec, artem.tamazov, dp

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45477

Change-Id: Ibb2c5af2e66f2e100b4702fd81308e1932bc4ee6
llvm-svn: 329715
2018-04-10 15:46:43 +00:00
Aaron Smith c0a5c01aeb [PDB] Remove dead code and run clang format; NFC
llvm-svn: 329712
2018-04-10 15:25:04 +00:00
Chad Rosier af7519e9af Fix spelling. NFC.
llvm-svn: 329709
2018-04-10 14:57:13 +00:00
Pavel Labath b7243ed2f4 [CodeGen/Dwarf] Rename the "sizetype" synthetic type and add it to the accelerator table
Summary:
This type is created on-demand and used as the base type for array
ranges. Since it is "special", its construction did not go through the
createTypeDIE function and so it was never inserted into the accelerator
table, although it clearly belongs there.

I add an explicit addAccelType call to insert it into the table.

During review, we also decided to rename the type to something more
unique to avoid confusion in case the user has own "sizetype" type. The
new name for the type size __ARRAY_SIZE_TYPE__.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45445

llvm-svn: 329705
2018-04-10 14:23:41 +00:00
Simon Pilgrim 95f941117c Fix whitespace indentation. NFCI.
llvm-svn: 329704
2018-04-10 14:21:33 +00:00
Gabor Buella 3eab22d896 [X86] Disable SGX for Skylake Server
Reviewers: craig.topper, zvi, echristo

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45057

llvm-svn: 329700
2018-04-10 13:58:57 +00:00
David Green 5ef933b02c [DA] Improve alias checking in dependence analysis
Improve the alias analysis to account for cases where we
know that src/dst pairs cannot alias due to things like
TBAA. As we know they are noalias, we know no dependency
can occur. Also fixes issues around the size parameter
to AA being incorrect.

Differential Revision: https://reviews.llvm.org/D42381

llvm-svn: 329692
2018-04-10 11:37:21 +00:00
Francis Visoiu Mistrih f2c22050e8 [AArch64] Use FP to access the emergency spill slot
In the presence of variable-sized stack objects, we always picked the
base pointer when resolving frame indices if it was available.

This makes us hit an assert where we can't reach the emergency spill
slot if it's too far away from the base pointer. Since on AArch64 we
decide to place the emergency spill slot at the top of the frame, it
makes more sense to use FP to access it.

The changes here don't affect only emergency spill slots but all the
frame indices. The goal here is to try to choose between FP, BP and SP
so that we minimize the offset and avoid scavenging, or worse, asserting
when trying to access a slot allocated by the scavenger.

Previously discussed here: https://reviews.llvm.org/D40876.

Differential Revision: https://reviews.llvm.org/D45358

llvm-svn: 329691
2018-04-10 11:29:40 +00:00
Tim Renouf 7190a4692a [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).

This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.

V2: Ensure s0 (s8 for gfx9 merged shader) is marked live-in when loading
scratch descriptor from GIT.

Reviewers: kzhuravl, nhaehnle, timcorringham

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D44468

Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 329690
2018-04-10 11:25:15 +00:00
Tim Northover 6a1c51bf6b AArch64: diagnose unpredictable store-exclusive instructions
Much like any written register in load/store instructions, the status register
is not allowed to overlap with any others. So diagnose it like we already do
with the other cases.

llvm-svn: 329687
2018-04-10 11:04:29 +00:00
Andrea Di Biagio 486358c153 [X86][Broadwell] HWPort5 should not be added to BroadwellModelProcResources.
The BroadwellModelProcResources had an entry for HWPort5, which is a Haswell
resource, and not a Broadwell processor resource. That entry was added to the
Broadwell model because variable blends were consuming it.

This was clearly a typo (the resource name should have been BWPort5), which
unfortunately was never caught before. It was not reported as an error because
HWPort5 is a resource defined by the Haswell model. It has been found when
testing some code with llvm-mca: the list of resources in the resource pressure
view was odd.

This patch fixes the issue; now variable blend instructions consume 2 cycles on
BWPort5 instead of HWPort5. This is enough to get rid of the extra (spurious)
entry in the BroadWellModelProcResources table.

llvm-svn: 329686
2018-04-10 10:49:41 +00:00
Sander de Smalen f974e255fe [AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions.
Reviewers: rengolin, fhahn, javed.absar, SjoerdMeijer, huntergr, t.p.northover, echristo, evandro

Reviewed By: rengolin, fhahn

Subscribers: tschuett, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45371

llvm-svn: 329681
2018-04-10 10:03:13 +00:00
Clement Courbet b449379eae [MC][TableGen] Add optional libpfm counter names for ProcResUnits.
Summary:
Subtargets can define the libpfm counter names that can be used to
measure cycles and uops issued on ProcResUnits.
This allows making llvm-exegesis available on more targets.
Fixes PR36984.

Reviewers: gchatelet, RKSimon, andreadb, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45360

llvm-svn: 329675
2018-04-10 08:16:37 +00:00
Sander de Smalen 30fda45c18 [AArch64][SVE] Asm: Add support for SVE INDEX instructions.
Reviewers: rengolin, fhahn, javed.absar, SjoerdMeijer, huntergr, t.p.northover, echristo, evandro

Reviewed By: rengolin, fhahn

Subscribers: tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45370

llvm-svn: 329674
2018-04-10 07:01:53 +00:00
Chandler Carruth 0ca3bd0729 [x86] Model the direction flag (DF) separately from the rest of EFLAGS.
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting EFLAGS
actually modify DF (other than things like popf) and so this needlessly
creates uses of EFLAGS that aren't really there.

In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD,
and the whole-flags writes (WRFLAGS and POPF) need to model this.

I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.

Adding this extra register also uncovered a failure to use the correct
datatype to hold X86 registers, and I've corrected that as necessary
here.

Differential Revision: https://reviews.llvm.org/D45154

llvm-svn: 329673
2018-04-10 06:40:51 +00:00
Craig Topper 7e42af87a6 [X86] Prevent folding loads with 64-bit ANDs with immediates that fit in 32-bits.
Prefer to use the 32-bit AND with immediate instead.

Primarily I'm doing this to ensure that immediates created by shrinkAndImmediate will always get absorbed into the AND. But I do believe this would be a reduction in the number of uops that need to execute. Ideally we should shrink the 'and' and the 'load' during DAG combine to re-enable the fold.

Fixes PR37063.

llvm-svn: 329667
2018-04-10 03:44:15 +00:00
Michael Zolotukhin d6beefd5d3 Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time.
This reverts r329661. Bots are still unhappy.

llvm-svn: 329666
2018-04-10 03:40:29 +00:00
Michael Zolotukhin 8a13f6d4a7 Revert "Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading.""
This reapplies commit r329644.

llvm-svn: 329661
2018-04-10 02:16:45 +00:00
Michael Zolotukhin aa7868594e [SSAUpdaterBulk] Handle CFG with unreachable from entry blocks.
llvm-svn: 329660
2018-04-10 02:16:29 +00:00
Alexandre Ganea 08df84e4f0 [DebugInfo][COFF] Fix reading variable-length encoded records
While reading Codeview records which contain variable-length encoded integers,
such as LF_BCLASS, LF_ENUMERATE, LF_MEMBER, LF_VBCLASS or LF_IVBCLASS,
the record's size would be improperly calculated in cases where the value was
indeed of a variable length (>= LF_NUMERIC). This caused a bad alignement on
the next record, which would/might crash later on.

Differential Revision: https://reviews.llvm.org/D45104

llvm-svn: 329659
2018-04-10 01:58:45 +00:00
Chandler Carruth 19618fc639 [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues.
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the necessary
state in a GPR. In the vast majority of cases, these uses are cmovCC and
jCC instructions. For such cases, we can very easily save and restore
the necessary information by simply inserting a setCC into a GPR where
the original flags are live, and then testing that GPR directly to feed
the cmov or conditional branch.

However, things are a bit more tricky if arithmetic is using the flags.
This patch handles the vast majority of cases that seem to come up in
practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of
partially preserved EFLAGS as LLVM doesn't currently model that at all.

There are a large number of operations that techinaclly observe EFLAGS
currently but shouldn't in this case -- they typically are using DF.
Currently, they will not be handled by this approach. However, I have
never seen this issue come up in practice. It is already pretty rare to
have these patterns come up in practical code with LLVM. I had to resort
to writing MIR tests to cover most of the logic in this pass already.
I suspect even with its current amount of coverage of arithmetic users
of EFLAGS it will be a significant improvement over the current use of
pushf/popf. It will also produce substantially faster code in most of
the common patterns.

This patch also removes all of the old lowering for EFLAGS copies, and
the hack that forced us to use a frame pointer when EFLAGS copies were
found anywhere in a function so that the dynamic stack adjustment wasn't
a problem. None of this is needed as we now lower all of these copies
directly in MI and without require stack adjustments.

Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things tripping
me up while working on this.

Differential Revision: https://reviews.llvm.org/D45146

llvm-svn: 329657
2018-04-10 01:41:17 +00:00
Vlad Tsyrklevich 0cdc6ec535 ShadowCallStack/x86_64: Ignore pseudo-machine instructions
llvm-svn: 329656
2018-04-10 01:31:01 +00:00
Vitaly Buka 6c05a3bb71 Object: Don't mark alias unconditionally defined
Summary:
Can't remove EmitAssignment override as llvm/test/Object/X86/nm-bitcodeweak.test
expects this behavior.

Reviewers: pcc, espindola

Subscribers: mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D44596

llvm-svn: 329651
2018-04-10 00:53:16 +00:00
Michael Zolotukhin 0274632ee6 Revert "[PR16756] Use SSAUpdaterBulk in JumpThreading."
This reverts commit r329644.

llvm-svn: 329650
2018-04-10 00:42:43 +00:00
Hideki Saito d829973794 Fix for the buildbot failure. Now-unused private field TTI deleted.
llvm-svn: 329649
2018-04-10 00:38:36 +00:00
Alexandre Ganea 3241cec577 Fix line endings (CR/LF -> LF) introduced by rL329613
reviewer: zturner
llvm-svn: 329646
2018-04-10 00:09:15 +00:00
Hideki Saito dfa932b049 [NFC][LV] Move InterleaveInfo from Legal to CostModel
Summary:
Another clean up, following D43208.

Interleaved memory access analysis/optimization has nothing to do with vectorization legality. It doesn't really belong there. On the other hand, cost model certainly has to know about it.

In principle, vectorization should proceed like Legality ==> Optimization ==> CostModel ==> CodeGen, and this change just does that,
by moving the interleaved access analysis/decision out of Legal, and run it just before CostModel object is created.

After this, I can move LoopVectorizationLegality and Hints/Requirements classes into it's own header file, making it shareable within Transform tree. I have the patch already but I don't want to mix with this change. Eventual goal is to move to Analysis tree, but I first need to move RecurrenceDescriptor/InductionDescriptor from Transform/Util/LoopUtil.* to Analysis.

Reviewers: rengolin, hfinkel, mkuper, dcaballe, sguggill, fhahn, aemerson

Reviewed By: rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45072

llvm-svn: 329645
2018-04-09 23:45:40 +00:00
Michael Zolotukhin c6d2d65f37 [PR16756] Use SSAUpdaterBulk in JumpThreading.
Summary:
SSAUpdater is a bottleneck in JumpThreading, and this patch improves the
situation by using SSAUpdaterBulk instead.

Compile time impact: no noticable changes on CTMark, a big improvement
on the test from PR16756.

Reviewers: dberlin, davide, MatzeB

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D44282

llvm-svn: 329644
2018-04-09 23:37:37 +00:00
Michael Zolotukhin 52b064f3d3 [PR16756] Add SSAUpdaterBulk.
Summary:
SSAUpdater is a bottleneck in a number of passes, and one of the reasons
is that it performs a lot of unnecessary computations (DT/IDF) over and
over again. This patch adds a new SSAUpdaterBulk that uses existing DT
and avoids recomputing IDF when possible.

Reviewers: dberlin, davide, MatzeB

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D44282

llvm-svn: 329643
2018-04-09 23:37:20 +00:00
George Burgess IV 0034e393d9 [MemorySSA] remove cruft; NFC.
The caching walker used to hold its own caches, which made its `reset()`
function meaningful. Since caching has been moved out of it, there's no
reason to continue to have these cache-related methods.

Similarly, the EXPENSIVE_CHECKS block that's getting removed used to
rerun the query with caching disabled. Since that's how we always do
queries now, it's redundant.

llvm-svn: 329638
2018-04-09 23:09:27 +00:00
George Burgess IV 2a84e4ab12 [MemorySSA] Remove redundant assert; NFC
The `if (!Def && !Use) return nullptr;` right above this assert sort of
defeats the purpose.

llvm-svn: 329632
2018-04-09 22:45:14 +00:00
Daniel Sanders 5281b02e84 [globalisel][legalizerinfo] Add support for the Lower action in getActionDefinitionsBuilder() and use it in AArch64.
Lower is slightly odd. It often doesn't change the type but the lowerings
do use the new type to decide what code to create. Treat it like a mutation
but provide convenience functions that re-use the existing type.

Re-uses the existing tests:
test/CodeGen/AArch64/GlobalISel/legalize-rem.mir
test/CodeGen/AArch64/GlobalISel//legalize-mul.mir
test/CodeGen/AArch64/GlobalISel//legalize-cmpxchg-with-success.mir

llvm-svn: 329623
2018-04-09 21:10:09 +00:00