Commit Graph

46842 Commits

Author SHA1 Message Date
Krzysztof Parzyszek 4a5a80c370 [Hexagon] Assertion failure in HexagonSubtarget.cpp
In restoreLatency, replace range-for loop with std::find.

Patch by Jyotsna Verma.

llvm-svn: 328574
2018-03-26 19:04:58 +00:00
Simon Pilgrim fcf49df21c [X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

llvm-svn: 328573
2018-03-26 19:01:06 +00:00
Reid Kleckner 41fb2dba9c [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Summary:
Re-lands r328386 and r328443, reverting r328482.

Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).

I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.

I also tested this by self-hosting clang and it passed tests on win64.

Reviewers: mstorsjo, hans

Subscribers: hiraditya, mstorsjo, llvm-commits

Differential Revision: https://reviews.llvm.org/D44900

llvm-svn: 328570
2018-03-26 18:49:48 +00:00
Simon Pilgrim f33d905293 [X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)
Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.

These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).

Differential Revision: https://reviews.llvm.org/D44879

llvm-svn: 328566
2018-03-26 18:19:28 +00:00
Mandeep Singh Grang 1b9ff45157 [XCore] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: dblaikie, RKSimon, robertlytton

Reviewed By: robertlytton

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44875

llvm-svn: 328564
2018-03-26 18:08:26 +00:00
Lei Huang be0afb0870 [Power9]Legalize and emit code for quad-precision convert from double-precision
Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558
2018-03-26 17:46:25 +00:00
Stefan Pintilie 26d4f923c4 [PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.
A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

llvm-svn: 328556
2018-03-26 17:39:18 +00:00
Tim Corringham 7116e8963d [AMDGPU] Improve disassembler error handling
Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.

While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.

The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44685

llvm-svn: 328553
2018-03-26 17:06:33 +00:00
Simon Pilgrim 86ea53123d [X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs
We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....

llvm-svn: 328551
2018-03-26 17:02:02 +00:00
David Blaikie 535ca36e5e Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header
llvm-svn: 328549
2018-03-26 16:57:31 +00:00
David Blaikie a1b2bf4c71 Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header
llvm-svn: 328548
2018-03-26 16:52:10 +00:00
Krzysztof Parzyszek a212204453 [Pipeliner] Use latency to compute RecMII
The patch contains severals changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.

The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a 
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.

The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).

The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This was when the
last instruction is a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.

The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound. 

Patch by Brendon Cahoon.

llvm-svn: 328542
2018-03-26 16:33:16 +00:00
Simon Pilgrim 8815105cd5 [X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs
llvm-svn: 328541
2018-03-26 16:24:13 +00:00
Simon Pilgrim aa40148cae [X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write)
llvm-svn: 328536
2018-03-26 16:10:08 +00:00
Krzysztof Parzyszek 56f0fc4716 [Hexagon] Give priority to post-incremementing memory accesses in LSR
llvm-svn: 328506
2018-03-26 15:32:03 +00:00
Simon Pilgrim 0b73b29388 [X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

This also adds missing vcvttss2si tests 

llvm-svn: 328505
2018-03-26 15:30:47 +00:00
Simon Pilgrim 3aa9344605 [X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs
These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.

llvm-svn: 328501
2018-03-26 14:44:24 +00:00
Simon Pilgrim 67df1cf597 [X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs
The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps

llvm-svn: 328497
2018-03-26 14:03:40 +00:00
Nicolai Haehnle 4f850eabb6 AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes
Differential revision: https://reviews.llvm.org/D44820

Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
2018-03-26 13:56:53 +00:00
Simon Pilgrim caa203aed5 [X86][Btver2] Double the AGU and schedule pipe resources for YMM
Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model.

llvm-svn: 328491
2018-03-26 13:15:20 +00:00
Hans Wennborg 311b63f13b Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32"
This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up
patch at D44876 fixes this, but let's revert back to green for now until that's
ready to land.

(Also reverts r328443.)

> Both GCC and MSVC only look at the low byte of a boolean when it is
> passed.

llvm-svn: 328482
2018-03-26 10:07:51 +00:00
Martin Storsjo 439824622a [ARM] Simplify constructing the ARMArchFeature string. NFC.
Differential Revision: https://reviews.llvm.org/D44819

llvm-svn: 328478
2018-03-26 08:41:10 +00:00
Craig Topper 6f28d3c954 [X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT.
llvm-svn: 328474
2018-03-26 05:05:12 +00:00
Craig Topper cdfcf8ecda [X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models.
I've used Agner's data as best I could to get the values to converge on.

llvm-svn: 328473
2018-03-26 05:05:10 +00:00
Craig Topper fbf2d850e3 [X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions.
llvm-svn: 328472
2018-03-26 04:20:36 +00:00
Craig Topper c049cb7823 [X86] Correct the itineraries for the dot production instructions.
llvm-svn: 328471
2018-03-26 02:17:15 +00:00
Craig Topper 4367874bc5 [X86] Use the same itinerary for VCVTDQ2PD as the SSE version so that the generated scheduler classes will merge.
llvm-svn: 328470
2018-03-26 02:17:14 +00:00
Craig Topper 659f85af14 [X86] Swap the itineraries on the memory and register forms of CVTDQ2PD.
They were backwards.

llvm-svn: 328469
2018-03-26 02:17:13 +00:00
Craig Topper 4bf23eddaf [X86] Give VMOVSX/ZX the same itinerary as the SSE version so they'll reuse the same generated scheduler class.
llvm-svn: 328468
2018-03-26 02:17:12 +00:00
Craig Topper 6e8d99bbea [X86] Give vpmsadbw the same itinerary as the SSE version so they'll be able to share the same generated scheduler class.
llvm-svn: 328466
2018-03-25 23:52:06 +00:00
Craig Topper 15fef89ad9 [X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 for Skylake.
This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX.

llvm-svn: 328465
2018-03-25 23:40:56 +00:00
Simon Pilgrim 68a8fbc102 [X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI.
llvm-svn: 328460
2018-03-25 20:16:53 +00:00
Simon Pilgrim fecb0b7874 [X86][SkylakeClient] Fix missing comma
llvm-svn: 328458
2018-03-25 19:17:17 +00:00
Simon Pilgrim 351e4fa0e2 [ARM] Remove sched model instregex entries that don't match any instructions (D44687)
Reviewed by @javed.absar

llvm-svn: 328457
2018-03-25 19:07:17 +00:00
Simon Pilgrim 854ac7490d [X86] Add missing full stop to comment. NFCI.
llvm-svn: 328456
2018-03-25 18:49:48 +00:00
Craig Topper 972bdbd415 [X86][SkylakeClient] Fix a set of regular expressions that were checking for optionally starting with 'Y' instead of 'V'
These bad regexs were introduced by r328435

llvm-svn: 328454
2018-03-25 17:33:14 +00:00
Simon Pilgrim 562e8b4eae [X86][MMX] MOVQ2DQ/MOVDQ2Q are better described as WriteVecMove than WriteMove
Not that it makes a difference to current cost values, but will when we try to better model GPR-SIMD transfer costs

llvm-svn: 328453
2018-03-25 17:28:06 +00:00
Simon Pilgrim 25acc0a79b [X86][SkylakeServer] Merge multiple instregex. NFCI
llvm-svn: 328452
2018-03-25 17:25:37 +00:00
Craig Topper a985919d3e [X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont
Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44644

llvm-svn: 328451
2018-03-25 15:58:12 +00:00
Simon Pilgrim e3547af7be [X86] Add the ability to override memory folding latency to schedules and add 1uop for memory folds for Intel models
The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up.

Differential Revision: https://reviews.llvm.org/D44840

llvm-svn: 328446
2018-03-25 10:21:19 +00:00
Craig Topper e8f4e747bf [X86] Consistently prefix all defs in X86ScheduleSLM.td with 'SLM'.
llvm-svn: 328444
2018-03-25 01:28:43 +00:00
Martin Storsjo 98720156b9 [X86] Update a partially stale comment, since SVN r328386. NFC.
llvm-svn: 328443
2018-03-24 23:00:00 +00:00
Simon Pilgrim 31a9633724 [X86][SkylakeClient] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328435
2018-03-24 20:40:14 +00:00
Simon Pilgrim c21deec37b [X86][Broadwell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328434
2018-03-24 19:37:28 +00:00
Mandeep Singh Grang 98bc25a0f2 [RISCV] Use init_array instead of ctors for RISCV target, by default
Summary:
LLVM defaults to the newer .init_array/.fini_array scheme for static
constructors rather than the less desirable .ctors/.dtors (the UseCtors
flag defaults to false). This wasn't being respected in the RISC-V
backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the the appropriate
flag for UseInitArray.
This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call
InitializeELF(TM.Options.UseInitArray).

Reviewers: asb, apazos

Reviewed By: asb

Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits

Differential Revision: https://reviews.llvm.org/D44750

llvm-svn: 328433
2018-03-24 18:37:19 +00:00
Simon Pilgrim 2b5967f510 [X86][Haswell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328432
2018-03-24 18:36:01 +00:00
Simon Pilgrim efcf1d85b3 [X86][SandyBridge] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time
llvm-svn: 328431
2018-03-24 18:12:59 +00:00
Mandeep Singh Grang db00e2e20f [Hexagon] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44857

llvm-svn: 328430
2018-03-24 17:34:37 +00:00
Mandeep Singh Grang 860adef9e6 [AMDGPU] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Reviewers: tstellar, RKSimon, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44856

llvm-svn: 328429
2018-03-24 17:15:04 +00:00
Craig Topper 097b47a0fc [X86] Add a new disassembler opcode map for 3DNow. Stop treating 3DNow as an attribute.
This reduces the size of llvm-mc by at least 150k since we no longer have to multiply the attribute across 7 tables.

llvm-svn: 328416
2018-03-24 07:48:54 +00:00
Craig Topper e865641aea [X86] Merge the Has3DNow0F0FOpcode TSFlag into the OpMap encoding. NFC
The 3DNow instructions are encoded a little weird, but we can still represent it as an opcode map.

llvm-svn: 328410
2018-03-24 06:04:12 +00:00
Craig Topper 2c0a62ab9a [X86] Add a DAG combine to simplify PMULDQ/PMULUDQ nodes
These nodes only use the lower 32 bits of their inputs so we can use SimplifyDemandedBits to simplify them.

Differential Revision: https://reviews.llvm.org/D44375

llvm-svn: 328405
2018-03-24 01:52:01 +00:00
Craig Topper bc6d2ec8ce [X86] Correct the value AdSizeX in X86II enum. NFC
Should be NFC since nothing used the enum value. The instruction descriptions are generated from tablegen which had the correct value.

llvm-svn: 328398
2018-03-24 00:02:46 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
David Blaikie bf121cf44a Fix layering by moving Support/CodeGenCWrappers.h to Target
This includes llvm-c/TargetMachine.h which is logically part of
libTarget (since libTarget implements llvm-c/TargetMachine.h's
functions).

llvm-svn: 328394
2018-03-23 23:58:21 +00:00
David Blaikie ab7f17f4ec Fix layering by moving X86DisassemblerDecoderCommon to Support
This is used from llvm tblgen and the X86Disassembler - the only common
library (apart from TableGen, which probably doesn't make sense to have
as a dependency from a release tool (rather than a use-while-building-llvm
tool) of LLVM)

llvm-svn: 328393
2018-03-23 23:58:20 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Reid Kleckner e27b410661 [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Both GCC and MSVC only look at the low byte of a boolean when it is
passed.

llvm-svn: 328386
2018-03-23 23:38:53 +00:00
Krzysztof Parzyszek 998df2ca4f [Hexagon] Make findLoopInstr member of HexagonInstrInfo
llvm-svn: 328367
2018-03-23 20:43:02 +00:00
Krzysztof Parzyszek 8038dad7db [Hexagon] Correct update of instruction offet in HW loop fixup
llvm-svn: 328366
2018-03-23 20:41:44 +00:00
Krzysztof Parzyszek bcf0a96f9e [Hexagon] Boost profit for word-mask immediates, reduce for others
This avoids unnecessary splitting due to uninteresting immediates.

llvm-svn: 328364
2018-03-23 20:11:00 +00:00
Krzysztof Parzyszek ca93f5e605 [Hexagon] Assume all extendable branches to be of size 8 in relaxation
The branch relaxation pass collects sizes of all instructions at the
beginning, before any changes have been made. It then performs one pass
over all branches to see which ones need to be extended. It does not
account for the case when a previously valid branch becomes out-of-range
due to relaxing other branches.
This approach fixes this problem by assuming from the beginning that
all extendable branches have been extended. This may cause unneeded
relaxation in some cases, but avoids iteration and recomputing instruction
sizes.

llvm-svn: 328360
2018-03-23 19:47:13 +00:00
Krzysztof Parzyszek 6f503b96fb [Hexagon] Incorrectly removing dead flag and adding kill flag
The HexagonExpandCondsets pass is incorrectly removing the dead
flag on a definition that is really dead, and adding a kill flag
to a use that is tied to a definition. This causes an assert later
during the machine scheduler when querying the live interval
information.

Patch by Brendon Cahoon.

llvm-svn: 328357
2018-03-23 19:39:37 +00:00
Benjamin Kramer faa9b438ce [Hexagon] Silence unused variable warning in Release builds
llvm-svn: 328356
2018-03-23 19:39:16 +00:00
Krzysztof Parzyszek e247526cc9 [Hexagon] Fold offset in base+immediate loads/stores
Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz  =>  memw(Rx,#n) = Rz.

Patch by Jyotsna Verma.

llvm-svn: 328355
2018-03-23 19:30:34 +00:00
Craig Topper 4529d3abcb [X86] Add itinerary to RCPSS*_Int and similar instructions.
llvm-svn: 328353
2018-03-23 19:15:05 +00:00
Craig Topper 02fb3907f1 [X86] Add itineraries to ADD.*_DB instructions to match their normal counterparts.
llvm-svn: 328352
2018-03-23 19:15:03 +00:00
Tony Tye 7a893d4e34 [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.

Differential Revision: https://reviews.llvm.org/D43736

llvm-svn: 328349
2018-03-23 18:45:18 +00:00
Krzysztof Parzyszek 5f7ba9a74c [Hexagon] Always generate mux out of predicated transfers if possible
HexagonGenMux would collapse pairs of predicated transfers if it assumed
that the predicated .new forms cannot be created. Turns out that generating
mux is preferable in almost all cases.
Introduce an option -hexagon-gen-mux-threshold that controls the minimum
distance between the instruction defining the predicate and the later of
the two transfers. If the distance is closer than the threshold, mux will
not be generated. Set the threshold to 0 by default.

llvm-svn: 328346
2018-03-23 18:43:09 +00:00
Krzysztof Parzyszek 80f10e4fe5 [Hexagon] Avoid early if-conversion for one sided branches
Patch by Anand Kodnani.

llvm-svn: 328344
2018-03-23 18:00:18 +00:00
Simon Pilgrim 6c63e6c222 [X86][Btver2] Cleanup TEST instructions to use JFPA (+JFPX on ymms) function unit
llvm-svn: 328343
2018-03-23 17:59:22 +00:00
Ana Pazos 41573804f2 [ARM] Fix "Constant pool entry out of range!" in Thumb1 mode
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.

In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() is not calculating postOffset correctly by
properly accounting for the padding that is required for the constant pool
that immediately follows the jump table branch  instruction.

Reviewers: t.p.northover, eli.friedman

Reviewed By: t.p.northover

Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D44709

llvm-svn: 328341
2018-03-23 17:53:27 +00:00
Krzysztof Parzyszek 570c6440cd [Hexagon] Two fixes in early if-conversion
- Fix checking for vector predicate registers.
- Avoid speculating llvm.lifetime.end intrinsic.

Patch by Harsha Jagasia and Brendon Cahoon.

llvm-svn: 328339
2018-03-23 17:46:09 +00:00
Simon Pilgrim e5c0a041ff [X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit
Add missing non-VEX and (V)PMOVMSKB instructions to the pattern

llvm-svn: 328338
2018-03-23 17:38:59 +00:00
Krzysztof Parzyszek c98802de09 [Hexagon] Copy subregisters in HexagonStoreWiden
When converting an instruction to the wider version, copy any
subregisters if the original operand has a subregister.

Patch by Brendon Cahoon.

llvm-svn: 328333
2018-03-23 17:22:55 +00:00
Simon Pilgrim 256f149bf0 [X86][Btver2] Vector permutes use a JFPU01 scheduler pipe and JFPX/JVALU function unit
llvm-svn: 328331
2018-03-23 16:17:56 +00:00
Simon Pilgrim ee282b3160 [X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units
llvm-svn: 328328
2018-03-23 15:35:13 +00:00
Zaara Syeda 6535993625 Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant stores
This patch adds functions to allow MachineLICM to hoist invariant stores.
Currently, MachineLICM does not hoist any store instructions, however
when storing the same value to a constant spot on the stack, the store
instruction should be considered invariant and be hoisted. The function
isInvariantStore iterates each operand of the store instruction and checks
that each register operand satisfies isCallerPreservedPhysReg. The store
may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore.
This patch also adds the PowerPC changes needed to consider the stack
register as caller preserved.

Differential Revision: https://reviews.llvm.org/D40196

llvm-svn: 328326
2018-03-23 15:28:15 +00:00
Simon Pilgrim 1335b9c0ca [X86][Btver2] Cleanup DPPS/DPPD instructions to use JFPA/JFPM function units
llvm-svn: 328324
2018-03-23 15:17:50 +00:00
John Brawn e3b44f9de6 [AArch64] Don't reduce the width of loads if it prevents combining a shift
Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.

Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and
make it prevent the width reduction if this is what would happen, though do
allow it if reducing the load width will let us eliminate a later sign or zero
extend.

Differential Revision: https://reviews.llvm.org/D44794

llvm-svn: 328321
2018-03-23 14:47:07 +00:00
Simon Pilgrim 5792e10ffb [X86][Btver2] Fix MicroOps counts for DPPS/YMM memory folded instructions
This was due to a misunderstanding over what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops.

llvm-svn: 328320
2018-03-23 14:45:03 +00:00
Simon Pilgrim 8619962c73 [X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units
Fixes throughput to match Agner/Fam16h-SoG as well.

llvm-svn: 328318
2018-03-23 14:27:26 +00:00
Christof Douma 4a025cc79d [ARM] Support float literals under XO
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.

This patch also restores using fpcmp with immediate 0 under fp-armv8.

Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
2018-03-23 13:02:03 +00:00
Simon Pilgrim 9ea14bbbb0 [X86][Znver1] Fix instregex entries that don't match any instructions (D44687)
Reviewed by @GGanesh and @craig.topper

llvm-svn: 328309
2018-03-23 12:08:23 +00:00
Simon Pilgrim 2755893834 [X86][SandyBridge] Fix missing comma that was causing string concatenation of 2 instregex entries
Found while updating D44687

llvm-svn: 328308
2018-03-23 11:56:38 +00:00
Simon Pilgrim a1e3ea01ef [X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs
llvm-svn: 328304
2018-03-23 11:27:31 +00:00
Florian Hahn 588e640ea1 [AArch64] Clean-up a few over-eager regexps in models.
Patch by Simon Pilgrim <llvm-dev@redking.me.uk>

That is a slightly modified version of the AArch64 changes from
Simon's D44687 .

llvm-svn: 328303
2018-03-23 11:00:42 +00:00
Martin Storsjo e1a64fe95c [ARM] Error out on .arm assembler directives on windows
Windows on arm is thumb only.

Differential Revision: https://reviews.llvm.org/D43005

llvm-svn: 328298
2018-03-23 09:10:03 +00:00
Craig Topper dfeea84d63 [X86] Give VPCMPEQQ the same itinerary as its SSE counterpart.
llvm-svn: 328296
2018-03-23 06:58:55 +00:00
Craig Topper 4787b7f434 [X86] Correct the latencies of SNB integer vector multiplies based on Agner's data. Add missing MMX multiplies.
llvm-svn: 328295
2018-03-23 06:41:43 +00:00
Craig Topper 659c66dfc1 [X86] Match vpblendvb/vblendvps/vblendvpd itineraries to the SSE equivalent. Change pblendvb/blendvps/blendvpd to use WriteFVarBlend
llvm-svn: 328294
2018-03-23 06:41:41 +00:00
Craig Topper 7580a7997d [X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version.
llvm-svn: 328293
2018-03-23 06:41:40 +00:00
Craig Topper d5ac3ae8d3 [X86] Give VLDDQUrm and LDDQUrm the same itinerary.
llvm-svn: 328292
2018-03-23 06:41:39 +00:00
Craig Topper 7f142b8bf1 [X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB sheduler model.
The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agners table so I'm going with that.

llvm-svn: 328291
2018-03-23 06:41:38 +00:00
Craig Topper fae4173b47 [X86] Add VEXTRB/W/D/Q to Zen scheduler model.
The SSE versions were present, but not the VEX version.

llvm-svn: 328290
2018-03-23 06:41:36 +00:00
Craig Topper 6ef55d1887 [X86] Fix the itinerary for vextractps to match extractps.
llvm-svn: 328289
2018-03-23 06:41:35 +00:00
Michael Zolotukhin fab7a676c2 State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'.
That removes some redundant recomputations from the passes pipeline.

llvm-svn: 328272
2018-03-22 23:44:40 +00:00
Craig Topper adb173314d [X86] Correct the VROUND regular expressions in Znver1 scheduler model to account for r328254
llvm-svn: 328260
2018-03-22 22:17:11 +00:00
Craig Topper 40d3b32e12 [X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD
This makes the Y position consistent with other instructions.

This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary.

llvm-svn: 328254
2018-03-22 21:55:20 +00:00
Craig Topper 58afb4ea58 [X86][SkylakeClient] Fix a bunch of instructions that were incorrectly assigned Port015 instead of Port01.
The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient.

llvm-svn: 328241
2018-03-22 21:10:07 +00:00
Jun Bum Lim 2ecb7ba4c6 [CodeGen] Add a new pass for PostRA sink
Summary:
This pass sinks COPY instructions into a successor block, if the COPY is not
used in the current block and the COPY is live-in to a single successor
(i.e., doesn't require the COPY to be duplicated).  This avoids executing the
the copy on paths where their results aren't needed.  This also exposes
additional opportunites for dead copy elimination and shrink wrapping.

These copies were either not handled by or are inserted after the MachineSink
pass. As an example of the former case, the MachineSink pass cannot sink
COPY instructions with allocatable source registers; for AArch64 these type
of copy instructions are frequently used to move function parameters (PhyReg)
into virtual registers in the entry block..

For the machine IR below, this pass will sink %w19 in the entry into its
successor (%bb.1) because %w19 is only live-in in %bb.1.

```
   %bb.0:
      %wzr = SUBSWri %w1, 1
      %w19 = COPY %w0
      Bcc 11, %bb.2
    %bb.1:
      Live Ins: %w19
      BL @fun
      %w0 = ADDWrr %w0, %w19
      RET %w0
    %bb.2:
      %w0 = COPY %wzr
      RET %w0
```
As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be
able to see %bb.0 as a candidate.

With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted  in spec2000/2006/2017 on AArch64.

Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz

Reviewed By: sebpop

Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D41463

llvm-svn: 328237
2018-03-22 20:06:47 +00:00
Nirav Dave 8c5f47ac40 [DAG, X86] Fix ISel-time node insertion ids
As in SystemZ backend, correctly propagate node ids when inserting new
unselected nodes into the DAG during instruction Seleciton for X86
target.

Fixes PR36865.

Reviewers: jyknight, craig.topper

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D44797

llvm-svn: 328233
2018-03-22 19:32:07 +00:00
Craig Topper 4a3be6e578 [X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to as best as I understand how they are implemented.
llvm-svn: 328231
2018-03-22 19:22:51 +00:00
Simon Pilgrim bcb86bb927 [X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe
llvm-svn: 328226
2018-03-22 18:29:16 +00:00
Simon Pilgrim 0e031afa95 [X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe
The ymm instructions are double pumped as well.

llvm-svn: 328222
2018-03-22 17:43:12 +00:00
Simon Pilgrim e5b51f6786 [X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe
llvm-svn: 328217
2018-03-22 17:25:38 +00:00
Craig Topper 7ccb5ebed8 [ARM] Enable the full InstRW overlap check for ARMScheduleR52.td
This fixes a few issues with the R52 instregexs to enable the full overlap checking

Differential Revision: https://reviews.llvm.org/D44767

llvm-svn: 328216
2018-03-22 17:17:47 +00:00
Simon Pilgrim 53b2c3329a [X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI.
Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage.

llvm-svn: 328203
2018-03-22 14:56:18 +00:00
Simon Pilgrim 3b2ff1faa9 [X86][CLMUL] Use the default CLMUL scheduler classes directly. NFCI.
Models were completely overriding all CLMUL instructions when the WriteCLMUL default classes could be used for exactly the same coverage.

llvm-svn: 328194
2018-03-22 13:37:30 +00:00
Simon Pilgrim 6bdd6b32fd [X86][CLMUL] Fix/add missing itinerary tags to (V)PCLMULQDQ instructions
PCLMULQDQrm was using the rr itinerary.

Difference in itineraries between PCLMULQDQ/VPCLMULQDQ variants was causing an unnecessary duplication of scheduler class entries.

llvm-svn: 328193
2018-03-22 13:36:06 +00:00
Simon Pilgrim 7684e055b3 [X86] Use the default AES scheduler classes directly. NFCI.
Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage.

Removes 6 unnecessary scheduler classes from every model.

Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default.
llvm-svn: 328192
2018-03-22 13:18:08 +00:00
Craig Topper df7855fc8d [X86] Remove unused SchedWriteRes classes. NFC
llvm-svn: 328181
2018-03-22 04:52:08 +00:00
Craig Topper fc179c6dd5 [X86][Skylake] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

This reduces the size of the optimized llc binary on my computer by 400K. Presumably because we went from 5000+ scheduler classes per CPU to ~2000.

llvm-svn: 328179
2018-03-22 04:23:41 +00:00
David Blaikie 2be3922807 Fix a couple of layering violations in Transforms
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.

Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.

Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.

llvm-svn: 328165
2018-03-21 22:34:23 +00:00
Artem Belevich 30512869ff [NVPTX] Make tensor shape part of WMMA intrinsic's name.
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.

Differential Revision: https://reviews.llvm.org/D44719

llvm-svn: 328158
2018-03-21 21:55:02 +00:00
Reid Kleckner 440219d53e [WebAssembly] Really disable wasm register name matcher
The "ShouldEmitMatchRegisterName" bit wasn't taking effect because the
WebAssembly target didn't point to the custom WebAssemblyAsmParser
record.

llvm-svn: 328155
2018-03-21 21:46:47 +00:00
Craig Topper 2854dc93e1 [X86] Rewrite getOperandBias in X86BaseInfo.h to be a little more structured and update comments to be more clear about what it does. NFC
llvm-svn: 328136
2018-03-21 19:30:28 +00:00
Krzysztof Parzyszek b4bb75d6ad [Hexagon] Generalize DAG mutation for function calls
Add barrier edges to check for any physical register. The previous code
worked for the function return registers: r0/d0, v0/w0.

Patch by Brendon Cahoon.

llvm-svn: 328120
2018-03-21 17:23:32 +00:00
Reid Kleckner 18574836f9 [WebAssembly] Suppress unused function warning for register name matcher
llvm-svn: 328112
2018-03-21 16:20:58 +00:00
Simon Pilgrim ec2f878779 [X86][Haswell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.

llvm-svn: 328111
2018-03-21 16:19:03 +00:00
Simon Pilgrim 96b605d34c [X86][SandyBridge] Merge more VEX/non-VEX instregex patterns (NFCI) (PR35955)
llvm-svn: 328110
2018-03-21 16:05:58 +00:00
Alex Bradbury 65d6ea5e68 [RISCV] Codegen support for RV32F floating point comparison operations
This patch also includes extensive tests targeted at select and br+fcmp IR
inputs. A sequence of br+fcmp required support for FPR32 registers to be added
to RISCVInstrInfo::storeRegToStackSlot and
RISCVInstrInfo::loadRegFromStackSlot.

llvm-svn: 328104
2018-03-21 15:11:02 +00:00
Craig Topper 5a69a0011b [X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
llvm-svn: 328076
2018-03-21 06:28:42 +00:00
Craig Topper 137a4dd84d [X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads.
llvm-svn: 328071
2018-03-21 03:41:33 +00:00
Craig Topper d25f1acf67 [X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis.
Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server.

Fixes PR36808.

llvm-svn: 328061
2018-03-20 23:39:48 +00:00
Derek Schuff 73a98f5eea [WebAssembly] Update torture compile test expectations
The tests compile after r328049

llvm-svn: 328057
2018-03-20 23:00:13 +00:00
Derek Schuff 39b5367cba [WebAssembly] Strip threadlocal attribute from globals in single thread mode
The default thread model for wasm is single, and in this mode thread-local
global variables can be lowered identically to non-thread-local variables.

Differential Revision: https://reviews.llvm.org/D44703

llvm-svn: 328049
2018-03-20 22:01:32 +00:00
Simon Pilgrim 572bfa562a [X86] Drop unnecessary InstRW overrides for WriteFMA
As noticed on D44687, these already match the WriteFMA def so can be removed.

llvm-svn: 328045
2018-03-20 21:15:23 +00:00
Martin Storsjo 07589fc496 [X86] Don't use the MSVC stack protector names on mingw
Mingw uses the same stack protector functions as GCC provides
on other platforms as well.

Patch by Valentin Churavy!

Differential Revision: https://reviews.llvm.org/D27296

llvm-svn: 328039
2018-03-20 20:37:51 +00:00
Derek Schuff e4825975d8 [WebAssembly] Added initial AsmParser implementation.
It uses the MC framework and the tablegen matcher to do the
heavy lifting. Can handle both explicit and implicit locals
(-disable-wasm-explicit-locals). Comes with a small regression
test.

This is a first basic implementation that can parse most llvm .s
output and round-trips most instructions succesfully, but in order
to keep the commit small, does not address all issues.

There are a fair number of mismatches between what MC / assembly
matcher think a "CPU" should look like and what WASM provides,
some already have workarounds in this commit (e.g. the way it
deals with register operands) and some that require further work.
Some of that further work may involve changing what the
Disassembler outputs (and what s2wasm parses), so are probably
best left to followups.

Some known things missing:
- Many directives are ignored and not emitted.
- Vararg calls are parsed but extra args not emitted.
- Loop signatures are likely incorrect.
- $drop= is not emitted.
- Disassembler does not output SIMD types correctly, so assembler
  can't test them.

Patch by Wouter van Oortmerssen

Differential Revision: https://reviews.llvm.org/D44329

llvm-svn: 328028
2018-03-20 20:06:35 +00:00
Evandro Menezes 36afbee1d8 [AArch64] Adjust the cost model for Exynos M3
Fix typo in the number of integer dividers.

llvm-svn: 328027
2018-03-20 20:00:29 +00:00
Krzysztof Parzyszek 65059ee284 [Hexagon] Add heuristic to exclude critical path cost for scheduling
Patch by Brendon Cahoon.

llvm-svn: 328022
2018-03-20 19:26:27 +00:00
Krzysztof Parzyszek 9315c0de9b [Hexagon] Fix fall-through warnings in HexagonMCDuplexInfo.cpp
llvm-svn: 328021
2018-03-20 19:23:18 +00:00
Nirav Dave ce71989188 [MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI.
llvm-svn: 328019
2018-03-20 19:12:41 +00:00
Craig Topper c2dbd677bd [PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code
I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything.

If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think its even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all.

I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it.

Differential Revision: https://reviews.llvm.org/D44061

llvm-svn: 328017
2018-03-20 18:49:28 +00:00
Krzysztof Parzyszek eb0c510ecd [X86] Add phony registers for high halves of regs with low halves
Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters
that cover the low halves of these registers. This change adds artificial
subregisters for the high halves in order to differentiate (in terms of
register units) between the 32- and the low 16-bit registers.

This patch contains parts that aim to preserve the calculated register
pressure. This is in order to preserve the current codegen (minimize the
impact of this patch). The approach of having artificial subregisters
could be used to fix PR23423, but the pressure calculation would need
to be changed.

Differential Revision: https://reviews.llvm.org/D43353

llvm-svn: 328016
2018-03-20 18:46:55 +00:00
Artem Belevich 914d4babec [NVPTX] Make tensor load/store intrinsics overloaded.
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.

Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.

Updated tests to use non-default address spaces.

Differential Revision: https://reviews.llvm.org/D43268

llvm-svn: 328006
2018-03-20 17:18:59 +00:00
Krzysztof Parzyszek 4c6b65f685 [Hexagon] Correct the computation of TopReadyCycle and BotReadyCycle of SU
TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either
the first instruction or the last instruction in a packet.

Patch by Ikhlas Ajbar.

llvm-svn: 328000
2018-03-20 17:03:27 +00:00
Krzysztof Parzyszek 73be83dec5 [Hexagon] Check weak dependences when only 1 instruction is available
Patch by Brendon Cahoon.

llvm-svn: 327997
2018-03-20 16:22:06 +00:00
Simon Pilgrim 62690e9d0e [X86][Haswell][Znver1] Fix typo in fldl instregexs
Missing comma was casing 2 instregex entries to be concatenated together by mistake.

Found while investigating PR35548

llvm-svn: 327992
2018-03-20 15:44:47 +00:00
Krzysztof Parzyszek 5ffd808a27 [Hexagon] Improve scheduling heuristic for large basic blocks
This patch changes the isLatencyBound heuristic to look at the
path length based upon the number of packets needed to schedule
a basic block. For small basic blocks, the heuristic uses a small
threshold for isLatencyBound. For large basic blocks, the
heuristic uses a large threshold.

The goal is to increase the priority of an instruction in a small
basic block that has a large height or depth relative to the code
size. For large functions, the height and depth are ignored
because it increases the live range of a register and causes more
spills. That is, for large functions, it is more important to
schedule instructions when available, and attempt to keep the defs
and uses closer together.

Patch by Brendon Cahoon.

llvm-svn: 327987
2018-03-20 14:54:01 +00:00
Geoff Berry 0b64402adb [AArch64][Falkor] Correct load/store increment scheduling details
llvm-svn: 327982
2018-03-20 13:46:35 +00:00
Krzysztof Parzyszek 2c4231d888 [Hexagon] Fix division by zero in machine scheduler
llvm-svn: 327980
2018-03-20 13:28:46 +00:00
Alex Bradbury 80c8eb7696 [RISCV] Add codegen for RV32F floating point load/store
As part of this, add support for load/store from the constant pool. This is
used to materialise f32 constants.

llvm-svn: 327979
2018-03-20 13:26:12 +00:00
Alex Bradbury 76c29ee815 [RISCV] Add codegen for RV32F arithmetic and conversion operations
Currently, only a soft floating point ABI is supported.

llvm-svn: 327976
2018-03-20 12:45:35 +00:00
Krzysztof Parzyszek dca383123f [Hexagon] Improve scheduling based on register pressure
Patch by Brendon Cahoon.

llvm-svn: 327975
2018-03-20 12:28:43 +00:00
Simon Pilgrim 4a83f802cc [X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955)
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do.

llvm-svn: 327974
2018-03-20 12:26:55 +00:00
Martin Storsjo 802b434156 [X86] Properly implement the calling convention for f80 for mingw/x86_64
In these cases, both parameters and return values are passed
as a pointer to a stack allocation.

MSVC doesn't use the f80 data type at all, while it is used
for long doubles on mingw.

Normally, this part of the calling convention is handled
within clang, but for intrinsics that are lowered to libcalls,
it may need to be handled within llvm as well.

Differential Revision: https://reviews.llvm.org/D44592

llvm-svn: 327957
2018-03-20 06:19:38 +00:00
Craig Topper ad7c685791 [X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version.
Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8

llvm-svn: 327948
2018-03-20 05:00:20 +00:00