Commit Graph

46619 Commits

Author SHA1 Message Date
Craig Topper 5a69a0011b [X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
llvm-svn: 328076
2018-03-21 06:28:42 +00:00
Craig Topper 137a4dd84d [X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads.
llvm-svn: 328071
2018-03-21 03:41:33 +00:00
Craig Topper d25f1acf67 [X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis.
Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server.

Fixes PR36808.

llvm-svn: 328061
2018-03-20 23:39:48 +00:00
Derek Schuff 73a98f5eea [WebAssembly] Update torture compile test expectations
The tests compile after r328049

llvm-svn: 328057
2018-03-20 23:00:13 +00:00
Derek Schuff 39b5367cba [WebAssembly] Strip threadlocal attribute from globals in single thread mode
The default thread model for wasm is single, and in this mode thread-local
global variables can be lowered identically to non-thread-local variables.

Differential Revision: https://reviews.llvm.org/D44703

llvm-svn: 328049
2018-03-20 22:01:32 +00:00
Simon Pilgrim 572bfa562a [X86] Drop unnecessary InstRW overrides for WriteFMA
As noticed on D44687, these already match the WriteFMA def so can be removed.

llvm-svn: 328045
2018-03-20 21:15:23 +00:00
Martin Storsjo 07589fc496 [X86] Don't use the MSVC stack protector names on mingw
Mingw uses the same stack protector functions as GCC provides
on other platforms as well.

Patch by Valentin Churavy!

Differential Revision: https://reviews.llvm.org/D27296

llvm-svn: 328039
2018-03-20 20:37:51 +00:00
Derek Schuff e4825975d8 [WebAssembly] Added initial AsmParser implementation.
It uses the MC framework and the tablegen matcher to do the
heavy lifting. Can handle both explicit and implicit locals
(-disable-wasm-explicit-locals). Comes with a small regression
test.

This is a first basic implementation that can parse most llvm .s
output and round-trips most instructions succesfully, but in order
to keep the commit small, does not address all issues.

There are a fair number of mismatches between what MC / assembly
matcher think a "CPU" should look like and what WASM provides,
some already have workarounds in this commit (e.g. the way it
deals with register operands) and some that require further work.
Some of that further work may involve changing what the
Disassembler outputs (and what s2wasm parses), so are probably
best left to followups.

Some known things missing:
- Many directives are ignored and not emitted.
- Vararg calls are parsed but extra args not emitted.
- Loop signatures are likely incorrect.
- $drop= is not emitted.
- Disassembler does not output SIMD types correctly, so assembler
  can't test them.

Patch by Wouter van Oortmerssen

Differential Revision: https://reviews.llvm.org/D44329

llvm-svn: 328028
2018-03-20 20:06:35 +00:00
Evandro Menezes 36afbee1d8 [AArch64] Adjust the cost model for Exynos M3
Fix typo in the number of integer dividers.

llvm-svn: 328027
2018-03-20 20:00:29 +00:00
Krzysztof Parzyszek 65059ee284 [Hexagon] Add heuristic to exclude critical path cost for scheduling
Patch by Brendon Cahoon.

llvm-svn: 328022
2018-03-20 19:26:27 +00:00
Krzysztof Parzyszek 9315c0de9b [Hexagon] Fix fall-through warnings in HexagonMCDuplexInfo.cpp
llvm-svn: 328021
2018-03-20 19:23:18 +00:00
Nirav Dave ce71989188 [MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI.
llvm-svn: 328019
2018-03-20 19:12:41 +00:00
Craig Topper c2dbd677bd [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code
I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything.

If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all.

I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it.

Differential Revision: https://reviews.llvm.org/D44061

llvm-svn: 328017
2018-03-20 18:49:28 +00:00
Krzysztof Parzyszek eb0c510ecd [X86] Add phony registers for high halves of regs with low halves
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.

This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.

Differential Revision: https://reviews.llvm.org/D43353

llvm-svn: 328016
2018-03-20 18:46:55 +00:00
Artem Belevich 914d4babec [NVPTX] Make tensor load/store intrinsics overloaded.
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.

Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.

Updated tests to use non-default address spaces.

Differential Revision: https://reviews.llvm.org/D43268

llvm-svn: 328006
2018-03-20 17:18:59 +00:00
Krzysztof Parzyszek 4c6b65f685 [Hexagon] Correct the computation of TopReadyCycle and BotReadyCycle of SU
TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either
the first instruction or the last instruction in a packet.

Patch by Ikhlas Ajbar.

llvm-svn: 328000
2018-03-20 17:03:27 +00:00
Krzysztof Parzyszek 73be83dec5 [Hexagon] Check weak dependences when only 1 instruction is available
Patch by Brendon Cahoon.

llvm-svn: 327997
2018-03-20 16:22:06 +00:00
Simon Pilgrim 62690e9d0e [X86][Haswell][Znver1] Fix typo in fldl instregexs
Missing comma was casing 2 instregex entries to be concatenated together by mistake.

Found while investigating PR35548

llvm-svn: 327992
2018-03-20 15:44:47 +00:00
Krzysztof Parzyszek 5ffd808a27 [Hexagon] Improve scheduling heuristic for large basic blocks
This patch changes the isLatencyBound heuristic to look at the
path length based upon the number of packets needed to schedule
a basic block. For small basic blocks, the heuristic uses a small
threshold for isLatencyBound. For large basic blocks, the
heuristic uses a large threshold.

The goal is to increase the priority of an instruction in a small
basic block that has a large height or depth relative to the code
size. For large functions, the height and depth are ignored
because it increases the live range of a register and causes more
spills. That is, for large functions, it is more important to
schedule instructions when available, and attempt to keep the defs
and uses closer together.

Patch by Brendon Cahoon.

llvm-svn: 327987
2018-03-20 14:54:01 +00:00
Geoff Berry 0b64402adb [AArch64][Falkor] Correct load/store increment scheduling details
llvm-svn: 327982
2018-03-20 13:46:35 +00:00
Krzysztof Parzyszek 2c4231d888 [Hexagon] Fix division by zero in machine scheduler
llvm-svn: 327980
2018-03-20 13:28:46 +00:00
Alex Bradbury 80c8eb7696 [RISCV] Add codegen for RV32F floating point load/store
As part of this, add support for load/store from the constant pool. This is
used to materialise f32 constants.

llvm-svn: 327979
2018-03-20 13:26:12 +00:00
Alex Bradbury 76c29ee815 [RISCV] Add codegen for RV32F arithmetic and conversion operations
Currently, only a soft floating point ABI is supported.

llvm-svn: 327976
2018-03-20 12:45:35 +00:00
Krzysztof Parzyszek dca383123f [Hexagon] Improve scheduling based on register pressure
Patch by Brendon Cahoon.

llvm-svn: 327975
2018-03-20 12:28:43 +00:00
Simon Pilgrim 4a83f802cc [X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do.

llvm-svn: 327974
2018-03-20 12:26:55 +00:00
Martin Storsjo 802b434156 [X86] Properly implement the calling convention for f80 for mingw/x86_64
In these cases, both parameters and return values are passed
as a pointer to a stack allocation.

MSVC doesn't use the f80 data type at all, while it is used
for long doubles on mingw.

Normally, this part of the calling convention is handled
within clang, but for intrinsics that are lowered to libcalls,
it may need to be handled within llvm as well.

Differential Revision: https://reviews.llvm.org/D44592

llvm-svn: 327957
2018-03-20 06:19:38 +00:00
Craig Topper ad7c685791 [X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version.
Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8

llvm-svn: 327948
2018-03-20 05:00:20 +00:00
Craig Topper 4778fa7e8a [X86] Fix the SchedRW for memory forms of CMP and TEST.
They were incorrectly marked as RMW operations. Some of the CMP instrucions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW.

TEST used the same tablegen class as some of the CMPs.

llvm-svn: 327947
2018-03-20 03:55:17 +00:00
Craig Topper 3e9462607e [X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models.
Move it from a load+store group on SNB to a load only group, the same group as CMP.

llvm-svn: 327944
2018-03-20 03:02:03 +00:00
Craig Topper 7c90e29cf8 [X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model.
I assume these match the generic immediate version like they do in the other models.

llvm-svn: 327943
2018-03-20 03:01:59 +00:00
Shiva Chen cbd498ac10 [RISCV] Preserve stack space for outgoing arguments when the function contain variable size objects
E.g.

bar (int x)
{
  char p[x];

  push outgoing variables for foo.
  call foo
}

We need to generate stack adjustment instructions for outgoing arguments by
eliminateCallFramePseudoInstr when the function contains variable size
objects to avoid outgoing variables corrupt the variable size object.

Default hasReservedCallFrame will return !hasFP().
We don't want to generate extra sp adjustment instructions when hasFP()
return true, So We override hasReservedCallFrame as !hasVarSizedObjects().

Differential Revision: https://reviews.llvm.org/D43752

llvm-svn: 327938
2018-03-20 01:39:17 +00:00
Craig Topper 2330d6cd55 [X86] Fix the SNB scheduler for BLENDVB.
PBLENDVBrr0 was with the memory version of VBLENDVB and PBLENDVBrm0 was missing.

llvm-svn: 327937
2018-03-20 01:30:21 +00:00
Jessica Paquette 563548d8f3 [MachineOutliner] AArch64: Emit CFI instructions when outlining calls
When outlining calls, the outliner needs to update CFI to ensure that, say,
exception handling works. This commit adds that functionality and adds a test
just for call outlining.

Call outlining stuff in machine-outliner.mir should be moved into
machine-outliner-calls.mir in a later commit.

llvm-svn: 327917
2018-03-19 22:48:40 +00:00
Craig Topper ab6076514d [X86] Simplify the AVX512 code in LowerTruncate a little.
We don't need to create an ISD::TRUNCATE node to return, we started with one and can return it. Also remove the call to getExtendInVec, the result is just going to be a getNode of that value passed in.

llvm-svn: 327914
2018-03-19 21:58:02 +00:00
Craig Topper 3b967466d5 [X86] Replace a couple calls to getExtendInVec with getNode and the appropriate target independent EXTEND_VECTOR_INREG opcode.
llvm-svn: 327899
2018-03-19 20:20:22 +00:00
Nirav Dave 3264c1bdf6 [DAG, X86] Revert r327197 "Revert r327170, r327171, r327172"
Reland ISel cycle checking improvements after simplifying node id
invariant traversal and correcting typo.

llvm-svn: 327898
2018-03-19 20:19:46 +00:00
Martin Storsjo 9a55c1b0dc [ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes
This extends the use of this attribute on ARM and AArch64 from
SVN r325900 (where it was only checked for fixed stack
allocations on ARM/AArch64, but for all stack allocations on X86).

This also adds a testcase for the existing use of disabling the
fixed stack probe with the attribute on ARM and AArch64.

Differential Revision: https://reviews.llvm.org/D44291

llvm-svn: 327897
2018-03-19 20:06:50 +00:00
Lei Huang ecfede94a7 [Power9]Legalize and emit code for quad-precision copySign/abs/nabs/neg/sqrt
Legalize and emit code for quad-precision floating point operations:

  * xscpsgnqp
  * xsabsqp
  * xsnabsqp
  * xsnegqp
  * xssqrtqp

Differential Revision: https://reviews.llvm.org/D44530

llvm-svn: 327889
2018-03-19 19:22:52 +00:00
Craig Topper 9770107b5f [X86] Add JMP16r and JMP32r to Sandybridge scheduler model.
Fixes PR36010

llvm-svn: 327883
2018-03-19 19:00:37 +00:00
Craig Topper 5e65996fac [X86] Remove OUT32rr/OUT8rr/OUT32ri/OUT8ri from Sandybridge scheduler model.
PR35590 was already filed for this information being wrong. It's probably better to default to WriteSystem behavior instead of using something completely wrong.

llvm-svn: 327882
2018-03-19 19:00:35 +00:00
Craig Topper b4c7873f8c [X86] Add JCXZ/JECXZ to Sandybridge/Haswell/Broadwell/Skylake scheduler models.
JRCXZ was already present, but not the others.

We never codegen this instruction so this doesn't affect much just trying to get them all into a single generated scheduler class in the output.

llvm-svn: 327881
2018-03-19 19:00:32 +00:00
Craig Topper afabf36505 [X86] Correct regular expression in Zen scheduler model that was excluding JECXZ instruction.
The regex was looking for JECXZ_32 or JECXZ_64, but their is just one instruction called JECXZ. They used to exist as separate instructions, but were merged over 3 years ago.

llvm-svn: 327880
2018-03-19 19:00:29 +00:00
Craig Topper 591f44df54 [X86] Correct the SchedRW on (V)MOVAPSrr_REV and similar to match their non _REV counterparts.
llvm-svn: 327879
2018-03-19 19:00:26 +00:00
Lei Huang 6d1596a98c [PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/sub
Legalize and emit code for quad-precision floating point operations:

  * xsaddqp
  * xssubqp
  * xsdivqp
  * xsmulqp

Differential Revision: https://reviews.llvm.org/D44506

llvm-svn: 327878
2018-03-19 18:52:20 +00:00
Nemanja Ivanovic d9d5bd3067 [PowerPC] Make AddrSpaceCast noop
PowerPC targets do not use address spaces. As a result, we can get selection
failures with address space casts. This patch makes those casts noops.

Patch by Valentin Churavy.

Differential revision: https://reviews.llvm.org/D43781

llvm-svn: 327877
2018-03-19 18:50:02 +00:00
Craig Topper 836cfb3a4c [X86] Add the rest of the TEST with immediate instructions to the scheduler models to match their 8-bit counterpart.
llvm-svn: 327874
2018-03-19 17:58:41 +00:00
Craig Topper 645e531a69 [X86] Add MOV16ri*/MOV32ri*/MOV64ri* to scheduler models to match MOV8ri. Correct SchedRW and itinerary for MOV32ri64.
llvm-svn: 327872
2018-03-19 17:46:59 +00:00
Craig Topper 259eaa6e7c [X86] Remove sse41 specific code from lowering v16i8 multiply
With the SRAs removed from the SSE2 code in D44267, then there doesn't appear to be any advantage to the sse41 code. The punpcklbw instruction and pmovsx seem to have the same latency and throughput on most CPUs. And the SSE41 code requires moving the upper 64-bits into the lower 64-bit before the sign extend can be done. The unpckhbw in sse2 code can do better than that.

llvm-svn: 327869
2018-03-19 17:31:41 +00:00
Craig Topper 5ccd87233f [X86] Make the multiply and divide itineraries more consistent.
Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage.

We also used the MUL8 itinerary for MULX32/64 which was also weird.

The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result.

llvm-svn: 327866
2018-03-19 16:38:33 +00:00
Zaara Syeda 01f414baaa Revert [MachineLICM] This reverts commit rL327856
Failing build bots. Revert the commit now.

llvm-svn: 327864
2018-03-19 16:19:44 +00:00