Commit Graph

117058 Commits

Author SHA1 Message Date
Eli Friedman 5ab09a684f [ARM] Fix correctness checks in promoteToConstantPool.
Correctly check for relocations in the constant to promote. And don't
allow promoting a constant multiple times.

This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ;
it's not a complete fix because we also need to prevent
ARMConstantIslands from cloning the constant.

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51472

llvm-svn: 343361
2018-09-28 20:27:31 +00:00
Eli Friedman bb993be56b [ARM] Use preferred alignment for constants in promoteToConstantPool.
This mostly affects IR generated by non-clang frontends because clang
generally sets the alignment of globals explicitly.

Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 .

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51469

llvm-svn: 343359
2018-09-28 20:21:51 +00:00
Lang Hames b62f73420b [ORC] Narrow a cast: the block guarded by the condition only handles
GlobalVariables, not all GlobalValues.

llvm-svn: 343358
2018-09-28 20:16:16 +00:00
Evandro Menezes fc1852ff1c [AArch64] Split zero cycle feature more granularly
Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp`
and `zcz-fp`, respectively, while retaining the original feature option to
mean both.

Differential revision: https://reviews.llvm.org/D52621

llvm-svn: 343354
2018-09-28 19:05:09 +00:00
David Bolvansky 8e90bad63d [DAGCombiner] [NFC] Improve X div/rem 1 fold
Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52661

llvm-svn: 343349
2018-09-28 18:40:30 +00:00
Luke Cheeseman 10981cc884 Revert r343317
- asan buildbots are breaking and I need to investigate the issue

llvm-svn: 343341
2018-09-28 17:01:50 +00:00
whitequark 29b2980159 Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints"
This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.

Broke the bots, and should really be in Transforms/Coroutines
instead.

llvm-svn: 343337
2018-09-28 16:45:18 +00:00
whitequark 937afbc365 [LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints
Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: mehdi_amini, modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D51642

llvm-svn: 343336
2018-09-28 16:38:11 +00:00
Robert Widmann d22ee9461f [LLVM-C] Fix broken build bots
Summary: Fix broken bots caused by the merge of D51522.

Reviewers: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52657

llvm-svn: 343334
2018-09-28 16:02:26 +00:00
Robert Widmann 9cba4eced8 [LLVM-C] Add more debug information accessors to GlobalObject and Instruction
Summary: Adds missing debug information accessors to GlobalObject.  This puts the finishing touches on cloning debug info in the echo tests.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D51522

llvm-svn: 343330
2018-09-28 15:35:18 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Lang Hames 604769943a [ORC] Remove some dead code.
llvm-svn: 343327
2018-09-28 15:13:41 +00:00
Aditya Nandakumar 1cbb057142 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

llvm-svn: 343325
2018-09-28 15:08:49 +00:00
Lang Hames 46f40a7128 [ORC] Improve debugging output for ORC.
(1) Print debugging output under a session lock to avoid garbled messages when
compiling on multiple threads.

(2) Name MaterializationUnits, add an ostream operator for them, and so they can
be easily referenced in debugging output, and have that ostream operator
optionally print code/data/hidden symbols provided by that materialization unit
based on command line options.

llvm-svn: 343323
2018-09-28 15:03:11 +00:00
Simon Pilgrim 428c1196d8 [X86][Btver2] PSUBS/PSUBUS instructions are zero-idioms
Noticed during llvm-exegesis tests, the PSUBS/PSUBUS instructions have the same zero-idiom behaviour to PSUB

llvm-svn: 343321
2018-09-28 14:20:42 +00:00
Luke Cheeseman 21f2955bb2 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Petar Jovanovic ff1bc621a0 [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409

llvm-svn: 343315
2018-09-28 13:28:47 +00:00
Simon Pilgrim 66da1ed29d [X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe
We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe)

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343314
2018-09-28 13:19:22 +00:00
Simon Pilgrim 17e5981ebf [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343311
2018-09-28 10:26:48 +00:00
Florian Hahn 8d72ecc36f Revert r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343310
2018-09-28 10:20:07 +00:00
Florian Hahn 0694c159f7 [LoopInterchange] Turn into a loop pass.
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308
2018-09-28 09:45:50 +00:00
David Spickett ea605913be [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551

llvm-svn: 343302
2018-09-28 08:55:19 +00:00
David Spickett a799fe40dc Remove extra whitespace. NFC. (test commit)
llvm-svn: 343301
2018-09-28 08:45:28 +00:00
Oliver Stannard 5f34e9e265 [ARM][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52484

llvm-svn: 343300
2018-09-28 08:27:56 +00:00
Simon Pilgrim 280af1c7f0 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

llvm-svn: 343299
2018-09-28 08:21:39 +00:00
Hiroshi Inoue 69bfa40200 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Craig Topper bb50c38635 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Aaron Smith 757274f9b2 [pdb] Simplify the code by replacing a few string conversions with calls to invokeBstrMethod()
Reviewers: aleksandr.urakov, zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D52624

llvm-svn: 343291
2018-09-28 02:32:07 +00:00
Lang Hames 4328ea3443 [ORC] clang-format the ThreadSafeModule code.
Evidently I forgot to do this before committing r343055.

llvm-svn: 343288
2018-09-28 01:41:33 +00:00
Craig Topper fdf4c76ca0 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper 8b4f0e1b8c [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
Its possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.

llvm-svn: 343278
2018-09-27 22:31:42 +00:00
Craig Topper 10ec021621 [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
cast will take care of asserting internally.

llvm-svn: 343277
2018-09-27 22:31:40 +00:00
Derek Schuff 70ce1af9fa WebAssembly: Rename GetSignature to GetLibcallSignature [NFC]
llvm-svn: 343275
2018-09-27 22:20:33 +00:00
Craig Topper 6911bfe263 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper 7d234d6628 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper dfc0f289fa [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper dfe460db57 [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
llvm-svn: 343268
2018-09-27 21:28:41 +00:00
Craig Topper 49dad8b8af [ScalarizeMaskedMemIntrin] Cleanup comments. NFC
llvm-svn: 343267
2018-09-27 21:28:39 +00:00
Lang Hames 8b81395db9 [ORC] Add definition for IRLayer::setCloneToNewContextOnEmit, use it to set the
flag to true in LLJIT when running in multithreaded mode.

The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.

llvm-svn: 343266
2018-09-27 21:13:07 +00:00
Konstantin Zhuravlyov 5f1b8181ad AMDGPU: Split HasExt into HasExtDPP/SDWA/SDWA9
llvm-svn: 343264
2018-09-27 20:49:00 +00:00
Konstantin Zhuravlyov 9da26b20da AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwa
llvm-svn: 343259
2018-09-27 19:46:41 +00:00
Lang Hames c5192f751a [ORC] Coalesce all of ORC's symbol renaming / linkage-promotion utilities into
one SymbolLinkagePromoter utility.

SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.

llvm-svn: 343257
2018-09-27 19:27:20 +00:00
Konstantin Zhuravlyov 7d424aae13 AMDGPU/NFC: Simplify VOP_MAC_F16/F32
llvm-svn: 343254
2018-09-27 19:24:05 +00:00
Stanislav Mekhanoshin b080adfc0c [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

llvm-svn: 343249
2018-09-27 18:55:20 +00:00
Craig Topper 0423681d4a [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Simon Pilgrim 2a64d393ea [X86] Remove BT/BTC/BTR/BTS rr/ri overrides
llvm-svn: 343241
2018-09-27 17:29:13 +00:00
Simon Pilgrim 86c7b07ecd [X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
2018-09-27 17:13:57 +00:00
Luke Cheeseman 8e5676b1aa Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00
Simon Pilgrim dd744f158a [X86][Btver2] BTC/BTR/BTS instructions take 2uops not 1
llvm-svn: 343234
2018-09-27 16:39:52 +00:00
Simon Pilgrim 29cf499bca [X86] Split BT and BTC/BTR/BTS scheduler classes
llvm-svn: 343233
2018-09-27 16:24:42 +00:00