Commit Graph

15577 Commits

Author SHA1 Message Date
Craig Topper 7bce79a539 [X86] Remove LowerEXTRACT_SUBVECTOR handler. All EXTRACT_SUBVECTORs are marked as legal.
llvm-svn: 316182
2017-10-19 20:59:40 +00:00
Simon Pilgrim fdd63d1535 [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering.
x86 has its own copy of integer absolute pattern matching to combine directly to a SUB+CMOV.

This patch removes the x86 combine and adds custom lowering support for ISD::ABS instead, allowing us to use the DAGCombiner version.

Additional test cases are already covered by iabs.ll (rL315706 and rL315711).

Differential Revision: https://reviews.llvm.org/D38895

llvm-svn: 316162
2017-10-19 15:02:24 +00:00
Michael Zuckerman 49293264cc [AVX512][AVX2]Cost calculation for interleave load/store patterns {v8i8,v16i8,v32i8,v64i8}
This patch adds accurate instructions cost.
The formula presents two cases(stride 3 and stride 4) and calculates the cost according to the VF and stride.

Reviewers:
1. delena
2. Farhana
3. zvi
4. dorit
5. Ayal

Differential Revision: https://reviews.llvm.org/D38762

Change-Id: If4cfbd4ac0e63694e8144cb78c7fa34850647ff7
llvm-svn: 316072
2017-10-18 11:41:55 +00:00
Michael Zuckerman 72a6f893cb Fixing bug issue https://bugs.llvm.org/show_bug.cgi?id=34978
Change-Id: I7f13d5bcb181be2860377df7b40e1579a8ad4add
llvm-svn: 316067
2017-10-18 08:04:31 +00:00
Yichao Yu a46eb8e649 Fix `FaultMaps` crash when the out streamer is reused
Summary:
Make sure the map is cleared before processing a new module. Similar to what is done on `StackMaps`.

This issue is similar to D38588, though this time for FaultMaps (on x86) rather than ARM/AArch64. Other than possible mixing of information between modules, the crash is caused by the pointers values in the map that was allocated by the bump pointer allocator that is unwinded when emitting the next file. This issue has been around since 3.8.

This issue is likely much harder to write a test for since AFAICT it requires emitting something much more compilcated (and possibly real code) instead of just some random bytes.

Reviewers: skatkov, sanjoy

Reviewed By: skatkov, sanjoy

Subscribers: sanjoy, aemerson, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38924

llvm-svn: 315990
2017-10-17 11:44:34 +00:00
Gadi Haber 1e0f1f476a [X86][SKL] Updated scheduling information for the SkylakeClient target
Updated the scheduling information for the SkylakeClient target with the following changes:

1. regrouped the instructions after adding load and store latencies.
2. regrouped the instructions after adding identified missing ports in several groups.
The changes were made after revisiting the latencies impact of all the load and store uOps.

Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D38727

Change-Id: I778a308cc11e490e8fa5e27e2047412a1dca029f
llvm-svn: 315978
2017-10-17 06:47:04 +00:00
Craig Topper fbb1985c14 [X86] Fix typo in comment. NFC
llvm-svn: 315969
2017-10-17 04:17:54 +00:00
Krzysztof Parzyszek 72518eaa6f Add iterator range MachineRegisterInfo::liveins(), adopt users, NFC
llvm-svn: 315927
2017-10-16 19:08:41 +00:00
Simon Pilgrim 73bd5aa049 Fix or vs || typo.
llvm-svn: 315903
2017-10-16 14:01:59 +00:00
Andrew V. Tischenko bfc9061593 This patch is a result of D37262: The issues with X86 prefixes. It closes PR7709, PR17697, PR19251, PR32809 and PR21640. There could be other bugs closed by this patch.
llvm-svn: 315899
2017-10-16 11:14:29 +00:00
Craig Topper 2738117326 [X86] Remove the SlowBTMem feature flag entirely
Turns out we have no patterns on the instructions that were using this feature flag for other reasons. These instructions are slow on all modern CPUs so it seems unlikely that we will spend any effort supporting these instructions going forward. So we might as well just kill of the feature flag and just fix up the comments.

llvm-svn: 315862
2017-10-15 16:57:33 +00:00
Craig Topper a5af4a64d0 [AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering.
Summary:
This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes.

There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8.

Reviewers: RKSimon, zvi, delena

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38714

llvm-svn: 315860
2017-10-15 16:41:17 +00:00
Craig Topper a1f9c9dd8b [X86] Add FeatureSlowBTMem to Haswell, Broadwell, Skylake, Cannonlake, and Knights Landing CPUs.
Summary: I see nothing in Agner Fog's tables to indicate that this improved between Ivy Bridge and Haswell. It's also set for all Atom CPUs so I assume KNL should have it too.

Reviewers: RKSimon, zvi, gadi.haber

Reviewed By: gadi.haber

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38890

llvm-svn: 315859
2017-10-15 16:41:15 +00:00
Aaron Ballman 615eb47035 Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.
Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854
2017-10-15 14:32:27 +00:00
Amjad Aboud c8d67979c0 [X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565
Differential Revision: https://reviews.llvm.org/D38359

llvm-svn: 315851
2017-10-15 11:00:56 +00:00
Craig Topper a9cd59fb5d [X86] Lower vselect with constant condition to vector_shuffle even with AVX512 instructions.
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.

It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.

Reviewers: RKSimon, delena, zvi

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38932

llvm-svn: 315849
2017-10-15 06:39:07 +00:00
Simon Pilgrim 36fe00ee17 [X86][SSE] Don't attempt to reduce the imul vector width of odd sized vectors (PR34947)
llvm-svn: 315825
2017-10-14 19:57:19 +00:00
Simon Pilgrim f5b9f353c3 Pull out repeated calls to VT.getVectorNumElements(). NFCI.
llvm-svn: 315818
2017-10-14 17:37:42 +00:00
Simon Pilgrim cded82837d Use DAG::getBitcast() helper. NFCI.
llvm-svn: 315815
2017-10-14 17:14:42 +00:00
Simon Pilgrim f367c27d2d [X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X))
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle.

Fixes the last issue with PR22415

llvm-svn: 315807
2017-10-14 15:01:36 +00:00
Craig Topper f7e777763d [X86] Add patterns for vzmovl+cvtpd2dq/cvttpd2dq with a load.
llvm-svn: 315802
2017-10-14 07:04:48 +00:00
Craig Topper 61010a85b8 [X86] Add AVX512 versions of VCVTPD2PS to load folding tables.
llvm-svn: 315801
2017-10-14 05:55:43 +00:00
Craig Topper ee277e190c [X86] Add patterns for vzmovl+cvtpd2ps with a load.
llvm-svn: 315800
2017-10-14 05:55:42 +00:00
Craig Topper aec05a9303 [X86] Remove some patterns for bitcasted alignednonedtemporalloads.
These select the same instruction as the non-bitcasted pattern. So this provides no additional value.

llvm-svn: 315799
2017-10-14 04:18:11 +00:00
Craig Topper 009f0aaeb0 [X86] Remove unnecessary bitconverts as the root of patterns for zero extended VCVTPD2UDQZ128rr and VCVTTPD2UDQZ128rr.
We don't need a bitconvert as a root pattern in these cases. The types in the other parts of the pattern are sufficient to express the behavior of these instructions.

llvm-svn: 315798
2017-10-14 04:18:10 +00:00
Craig Topper d746747d03 [X86] Add additional patterns for folding loads with 128-bit VCVTDQ2PD and VCVTUDQ2PD.
This matches the patterns we have for the SSE/AVX version.

This is a prerequisite for D38714.

llvm-svn: 315797
2017-10-14 04:18:09 +00:00
Craig Topper 134241e4af [X86] Add AVX512 flavors of VCVTDQ2PD plus VCVTUDQ2PD to the load folding tables.
llvm-svn: 315796
2017-10-14 04:18:08 +00:00
Craig Topper 0b64e67b0d [X86] Remove TB_NO_REVERSE from VCVTDQ2PDYrr and VCVTPS2PDYrr in the load folding tables.
I believe these were added incorrectly under the belief that the load size was smaller than the input register size, but that's not true.

llvm-svn: 315795
2017-10-14 04:18:07 +00:00
Craig Topper 53b0cb7fa9 [X86] Add an additional isel pattern to CVTDQ2PDrm/VCVTDQ2PDrm to enable load folding without the peephole pass.
This pattern is already used in AVX512VL version of these instructions. Though AVX512VL version is missing other patterns.

llvm-svn: 315794
2017-10-14 04:18:06 +00:00
Craig Topper f6c69564e7 [X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available
This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations.

For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP.

We may be able to use this for AVX1 as well which would allow us to remove more isel patterns.

I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP.

Differential Revision: https://reviews.llvm.org/D38836

llvm-svn: 315768
2017-10-13 21:56:48 +00:00
Daniel Sanders 11300cead8 [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Craig Topper 5d692917f4 [X86] Add initial skeleton support for knm cpu
This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it.

Differential Revision: https://reviews.llvm.org/D38811

llvm-svn: 315722
2017-10-13 18:10:17 +00:00
Craig Topper 5805fb3dfc [X86] Fix some inconsistent formatting in the processor feature lists.
llvm-svn: 315696
2017-10-13 16:06:06 +00:00
Craig Topper 54541c4675 [X86] Add ProcIntelBDW to BroadwellProc class not BDWFeatures class.
This isn't a property we want inherited.

llvm-svn: 315695
2017-10-13 16:04:08 +00:00
Craig Topper 0817346aef [X86] Stop creating CMOV nodes with a second MVT::Glue result
Summary: We seem to inconsistently create CMOV nodes some with a Glue result and some without. But I can't find any cases that use the Glue result. So I've tried to remove all the place that did this.

Reviewers: RKSimon, spatel, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38664

llvm-svn: 315686
2017-10-13 15:28:35 +00:00
Craig Topper bf0de9d3b6 [X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. Prefer vbroadcastsd/vpbroadcastq instead.
There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table.

llvm-svn: 315674
2017-10-13 06:07:10 +00:00
Matthias Braun bb8507e63c Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637
2017-10-12 22:57:28 +00:00
Matthias Braun 3a9c114b24 TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633
2017-10-12 22:28:54 +00:00
Craig Topper 060cb43721 [X86] Add CLWB intrinsic. llvm part
llvm-svn: 315613
2017-10-12 20:08:31 +00:00
Reid Kleckner 1a7e387849 [codeview] Don't emit FPO data in funclet prologues
Attempt 3 to work around bugs in FPO data with funclets.

llvm-svn: 315600
2017-10-12 18:20:35 +00:00
Don Hinton 3e0199f7eb [dump] Remove NDEBUG from test to enable dump methods [NFC]
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590
2017-10-12 16:16:06 +00:00
Sanjay Patel 3a72909b7e [x86] replace isEqualTo with == for efficiency
This is a follow-up suggested in D37534.
Patch by Yulia Koval.

llvm-svn: 315589
2017-10-12 16:15:38 +00:00
Simon Pilgrim 0903085ec3 [X86][SSE] Pull out repeated INSERT_VECTOR_ELT code from LowerBUILD_VECTOR v16i8/v8i16 insertion. NFCI.
llvm-svn: 315587
2017-10-12 15:52:01 +00:00
Reid Kleckner d925f98375 Speculative build fix 2
llvm-svn: 315542
2017-10-12 00:28:28 +00:00
Wei Mi 1736efd16a Revert r307036 because of PR34919.
llvm-svn: 315540
2017-10-12 00:24:52 +00:00
Reid Kleckner 9c0126ec0b Speculative build fix, apparently I built llc without my patch applied to test it
llvm-svn: 315539
2017-10-12 00:20:50 +00:00
Reid Kleckner 29cfa6f11f [codeview] Disable FPO in functions using EH funclets
Funclets are emitted by WinException which doesn't have access to
X86TargetStreamer so it's hard to make a quick fix for this.

llvm-svn: 315538
2017-10-12 00:06:57 +00:00
Reid Kleckner ec4ff24f79 [X86] Sink X86AsmPrinter ctor into .cpp file, NFC
I keep adding and removing code here, so let's sink it.

llvm-svn: 315534
2017-10-11 23:53:12 +00:00
Lang Hames 2241ffa43c [MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr.
MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that,
and allows us to remove the last instance of MCObjectStreamer's weird "holding
ownership via someone else's reference" trick.

llvm-svn: 315531
2017-10-11 23:34:47 +00:00
Reid Kleckner 9cdd4df81a [codeview] Implement FPO data assembler directives
Summary:
This adds a set of new directives that describe 32-bit x86 prologues.
The directives are limited and do not expose the full complexity of
codeview FPO data. They are merely a convenience for the compiler to
generate more readable assembly so we don't need to generate tons of
labels in CodeGen. If our prologue emission changes in the future, we
can change the set of available directives to suit our needs. These are
modelled after the .seh_ directives, which use a different format that
interacts with exception handling.

The directives are:
  .cv_fpo_proc _foo
  .cv_fpo_pushreg ebp/ebx/etc
  .cv_fpo_setframe ebp/esi/etc
  .cv_fpo_stackalloc 200
  .cv_fpo_endprologue
  .cv_fpo_endproc
  .cv_fpo_data _foo

I tried to follow the implementation of ARM EHABI CFI directives by
sinking most directives out of MCStreamer and into X86TargetStreamer.
This helps avoid polluting non-X86 code with WinCOFF specific logic.

I used cdb to confirm that this can show locals in parent CSRs in a few
cases, most importantly the one where we use ESI as a frame pointer,
i.e. the one in http://crbug.com/756153#c28

Once we have cdb integration in debuginfo-tests, we can add integration
tests there.

Reviewers: majnemer, hans

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D38776

llvm-svn: 315513
2017-10-11 21:24:33 +00:00