Commit Graph

112241 Commits

Author SHA1 Message Date
Matt Arsenault 97b6b1b926 Fix printing of stack id in MachineFrameInfo
uint8_t is printed as a char, so it needs to be
casted to do the right thing.

llvm-svn: 329622
2018-04-09 21:04:30 +00:00
Zhaoshi Zheng 43af17be41 [MemorySSAUpdater] Mark Phi users of a node being moved as non-optimize
Fix PR36484, as suggested:

<quote>
during moves, mark the direct users of the erased things that were phis as "not to be optimized"
<quote>

llvm-svn: 329621
2018-04-09 20:55:37 +00:00
Konstantin Zhuravlyov 6183065b97 AMDGPU: Remove max_scratch_backing_memory_byte_size from kernel header
1. Remove max_scratch_backing_memory_byte_size from kernel header
2. Make it a reserved field
3. Ignore it while parsing assembly for backwards compatibility
4. Bump up minor version of kernel header

Differential Revision: https://reviews.llvm.org/D45452

llvm-svn: 329620
2018-04-09 20:47:22 +00:00
Craig Topper 47b2f9d836 [X86] Don't use Lower512IntUnary to split bitcasts with v32i16/v64i8 types on targets without AVX512BW.
LowerIntUnary as its name says has an assert for integer types. But for the bitcast case one side might be an FP type.

Rather than making sure the function really works for fp types and renaming it. Just do really basic splitting directly. The LowerIntUnary has the advantage that it can peek through BUILD_VECTOR because every other call is during Lowering. But these calls are during legalization and will be followed by a DAG combine round.

Revert some change to LowerVectorIntUnary that were originally made just to make these two calls work even in pure integer cases.

This was found purely by compiling the avx512f-builtins.c test from clang so I've copied over the offending function from that.

llvm-svn: 329616
2018-04-09 20:37:14 +00:00
Alexandre Ganea d9e96741c4 [Debuginfo][COFF] Minimal serialization support for precompiled types records
This change adds support for the LF_PRECOMP and LF_ENDPRECOMP records required
to read/write Microsoft precompiled types .objs.
See https://en.wikipedia.org/wiki/Precompiled_header#Microsoft_Visual_C_and_C++

This also adds handling for the .debug$P section, which is actually a .debug$T
section in disguise, found only in precompiled .objs.

Differential Revision: https://reviews.llvm.org/D45283

llvm-svn: 329613
2018-04-09 20:17:56 +00:00
Peter Collingbourne 5cff2409ae AArch64: Allow offsets to be folded into addresses with ELF.
This is a code size win in code that takes offseted addresses
frequently, such as C++ constructors that typically need to compute
an offseted address of a vtable. It reduces the size of Chromium for
Android's .text section by 46KB, or 56KB with ThinLTO (which exposes
more opportunities to use a direct access rather than a GOT access).

Because the addend range is limited in COFF and Mach-O, this is
enabled for ELF only.

Differential Revision: https://reviews.llvm.org/D45199

llvm-svn: 329611
2018-04-09 19:59:57 +00:00
Alex Shlyapnikov 79f2c720b5 Revert "AMDGPU: enable 128-bit for local addr space under an option"
This reverts commit r329591.

It breaks various bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/16516
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/17374
http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/15992
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/11251
...

llvm-svn: 329610
2018-04-09 19:47:38 +00:00
Mandeep Singh Grang afa3aaf14d [WebAssembly] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: sunfish, RKSimon

Reviewed By: sunfish

Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D44873

llvm-svn: 329607
2018-04-09 19:38:31 +00:00
Daniel Sanders c01efe690f Fix type mismatch between MachineMemOperand constructor and accessors. NFC
This allows MachineMemOperand::getSize()'s result to be fed directly into 
MachineMemOperand::MachineMemOperand() without a narrowing type conversion
warning.

llvm-svn: 329602
2018-04-09 18:42:19 +00:00
Erik Pilkington d43931dcb8 [demangler] Support for fold expressions.
llvm-svn: 329601
2018-04-09 18:33:01 +00:00
Erik Pilkington 452e2ef996 [demangler] Support for <data-member-prefix>.
llvm-svn: 329600
2018-04-09 18:32:25 +00:00
Erik Pilkington 650130ac04 [demangler] Support for partially substituted sizeof....
llvm-svn: 329599
2018-04-09 18:31:50 +00:00
Aditya Nandakumar b1c467dbe7 [GISel] Refactor MachineIRBuilder to allow transformations while
building.

https://reviews.llvm.org/D45067

This change attempts to do two things:
1) It separates out the state that is stored in the
MachineIRBuilder(InsertionPt, MF, MRI, InsertFunction etc) into a
separate object called MachineIRBuilderState.
2) Add the ability to constant fold operations while building instructions
(optionally). MachineIRBuilder is now refactored into a MachineIRBuilderBase
which contains lots of non foldable build methods and their implementation.
Instructions which can be constant folded/transformed are now in a class
called FoldableInstructionBuilder which uses CRTP to use the implementation
of the derived class for buildBinaryOps. Additionally buildInstr in the derived
class can be used to implement other kinds of transformations.

Also because of separation of state, given a MachineIRBuilder in an API,
if one wishes to use another MachineIRBuilder, a new one can be
constructed from the state locally. For eg,

void doFoo(MachineIRBuilder &B) {
  MyCustomBuilder CustomB(B.getState());
  // Use CustomB for building.
}

reviewed by : aemerson

llvm-svn: 329596
2018-04-09 17:30:56 +00:00
Craig Topper 0c2a12cb3e [X86] Revert the SLM part of r328914.
While it appears to be correct information based on Intel's optimization manual and Agner's data, it causes perf regressions on a couple of the benchmarks in our internal list.

llvm-svn: 329593
2018-04-09 17:07:40 +00:00
Marek Olsak 52b033b827 AMDGPU: enable 128-bit for local addr space under an option
Author: Samuel Pitoiset

ds_read_b128 and ds_write_b128 have been recently enabled
under the amdgpu-ds128 option because the performance benefit
is unclear.

Though, using 128-bit loads/stores for the local address space
appears to introduce regressions in tessellation shaders. Not
sure what is broken, but as ds_read_b128/ds_write_b128 are not
enabled by default, just introduce a global option and enable
128-bit only if requested (until it's fixed/used correctly).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
llvm-svn: 329591
2018-04-09 16:56:32 +00:00
Tom Stellard e753c52227 AMDGPU: Initialize GlobalISel passes
Summary:
This fixes AMDGPU GlobalISel test failures when enabling the AMDGPU
target without any other targets that use GlobalISel.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D45353

llvm-svn: 329588
2018-04-09 16:09:13 +00:00
Simon Pilgrim 23c2182c2b Support generic expansion of ordered vector reduction (PR36732)
Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order.

This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1]

Differential Revision: https://reviews.llvm.org/D45366

llvm-svn: 329585
2018-04-09 15:44:20 +00:00
Zaara Syeda 935474fef5 [MachineLICM] Re-enable hoisting of constant stores
This patch fixes an issue exposed on the SystemZ build bots when committing
https://reviews.llvm.org/rL327856. The hoisting was temporarily disabled with
an option. This patch now re-enables hoisting and checks that we only hoist a
store instruction when all its operands are either constant caller preserved
registers or immediates.

Differential Revision: https://reviews.llvm.org/D45286

llvm-svn: 329577
2018-04-09 14:50:02 +00:00
Pavel Labath eadfac8748 [CodeGen/AccelTable] Don't emit zero-CU name indexes
Summary:
If an input DICompileUnit is completely empty (e.g., the result of
running "clang -g" on an empty file), we don't bother emitting an empty
DWARF CU. When we do that, we must make sure we don't also emit a DWARF v5
name index, as DWARF specifies that each index must reference at least
one compilation unit.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45435

llvm-svn: 329575
2018-04-09 14:38:53 +00:00
Xin Tong fdad23bc36 [MergeICmp] Update debug msg.NFC
llvm-svn: 329572
2018-04-09 14:29:13 +00:00
Simon Pilgrim e5ed5e2cba [X86][MMX] Fix missing itinerary for PALIGNR
llvm-svn: 329568
2018-04-09 13:52:33 +00:00
Simon Pilgrim 140fee078f [X86][MMX] Fix missing itinerary for MOVQ2DQ instruction format
llvm-svn: 329567
2018-04-09 13:42:14 +00:00
Simon Pilgrim abf3611332 [X86][MMX] Fix missing itinerary for CVTPI2PS
llvm-svn: 329565
2018-04-09 13:27:47 +00:00
Xin Tong 0efadbbcde [MergeICmp] Split blocks that do other work.
Summary:
We do not try to move the instructions and split the block till we
know the blocks can be split, i.e. BCE-cmp-insts can be separated from
non-BCE-cmp-insts.

Reviewers: davide, courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44443

llvm-svn: 329564
2018-04-09 13:14:06 +00:00
Dmitry Preobrazhensky 2f8e146ad3 [AMDGPU][MC][GFX9] Added instructions s_mul_hi_*32, s_lshl*_add_u32
See bugs
  36841: https://bugs.llvm.org/show_bug.cgi?id=36841
  36842: https://bugs.llvm.org/show_bug.cgi?id=36842

Differential Revision: https://reviews.llvm.org/D45251

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329562
2018-04-09 13:10:33 +00:00
Simon Pilgrim 0047efdd1e [X86][MMX] Fix flipped reg/mem typo in MMX_MISC_FUNC_ITINS
The RR/RM itineraries were the wrong way around

llvm-svn: 329561
2018-04-09 13:02:07 +00:00
Simon Pilgrim 6131286553 [X86][SSE] Fix f32 mul/div itinerary groups typo
The RM folded itineraries were incorrectly using the f64 version.

llvm-svn: 329556
2018-04-09 10:45:53 +00:00
Pavel Labath 889bf9fe00 [CodeGen/AccelTable]: Don't emit accelerator entries for functions with no names
Summary:
We were emitting accelerator entries for functions with no name, which
is contrary to the DWARF v5 spec: "All other (i.e., *not*
DW_TAG_namespace) debugging information entries without a DW_AT_name
attribute are excluded." Besides that, a name table entry with an empty
string as a key is fairly useless.

We can sometimes end up with functions which have a DW_AT_linkage_name but no
DW_AT_name. One such example is the global-constructor-initialization functions,
which C++ compilers synthesize for each compilation unit with global
constructors.
A very strict reading of the DWARF v5 spec would suggest that we should not even
emit the accelerator entry for the linkage name in this case, but I don't think
we should go that far.

I found this when running the dwarf verifier over llvm codebase compiled
with DWARF v5 accelerator tables.

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: vleschuk, clayborg, echristo, probinson, llvm-commits

Differential Revision: https://reviews.llvm.org/D45367

llvm-svn: 329552
2018-04-09 08:41:57 +00:00
Sam Parker 1f4f4d9a08 [DAGCombine] Improve ReduceLoad for SRL
Recommitting r329283, third time lucky...

If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.

Differential Revision: https://reviews.llvm.org/D41350

llvm-svn: 329551
2018-04-09 08:16:11 +00:00
Craig Topper c95e122cc7 [X86] Merge some of the autoupgrade handling for masked intrinsics that just need to upgrade to an unmasked version plus a select. NFCI
These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling.

llvm-svn: 329549
2018-04-09 06:15:09 +00:00
Max Kazantsev 8624a4786a [IRCE] Relax restriction on collected range checks
In IRCE, we have a very old legacy check that works when we collect comparisons that we
treat as range checks. It ensures that the value against which the indvar is compared is
loop invariant and is also positive.

This latter condition remained there since the times when IRCE was only able to handle
signed latch comparison. As the optimization evolved, it now learned how to intersect
signed or unsigned ranges, and this logic has no reliance on the fact that the right border
of each range should be positive.

The old implementation of this non-negativity check was also naive enough and just looked
into ranges (while most of other IRCE logic tries to use power of SCEV implications), so this
check did not allow to deal with the most simple case that looks like follows:

  int size; // not known non-negative
  int length; //known non-negative;
  i = 0;
  if (size != 0) {
    do {
      range_check(i < size);
      range_check(i < length);
    ++i;
    } while (i < size)
  }

In this case, even if from some dominating conditions IRCE could parse loop
structure, it could only remove the range check against `length` and simply
ignored the check against `size`.

In this patch we remove this obsolete check. It will allow IRCE to pick comparison
against `size` as a potential range check and then let Range Intersection logic
decide whether it is OK to eliminate it or not.

Differential Revision: https://reviews.llvm.org/D45362
Reviewed By: samparker

llvm-svn: 329547
2018-04-09 06:01:22 +00:00
Hiroshi Inoue 9ff2380ea6 [NFC] fix trivial typos in comments and error message
"is is" -> "is", "are are" -> "are"

llvm-svn: 329546
2018-04-09 04:37:53 +00:00
Michael Zolotukhin 8d052a0dd2 Remove MachineLoopInfo dependency from AsmPrinter.
Summary:
Currently MachineLoopInfo is used in only two places:
1) for computing IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, and it is never used.
2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true.
Despite that, we currently have a dependency on MachineLoopInfo, which makes
pass manager to compute it and MachineDominator Tree. This patch removes the
use (1) and makes the use (2) lazy, thus avoiding some redundant
recomputations.

Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi

Subscribers: rengolin, javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D44812

llvm-svn: 329542
2018-04-09 00:54:47 +00:00
Sanjay Patel 0d7df36c66 [TargetSchedule] shrink interface for init(); NFCI
The TargetSchedModel is always initialized using the TargetSubtargetInfo's 
MCSchedModel and TargetInstrInfo, so we don't need to extract those and 
pass 3 parameters to init().

Differential Revision: https://reviews.llvm.org/D44789

llvm-svn: 329540
2018-04-08 19:56:04 +00:00
Craig Topper b7baa358f6 [X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs.
Summary:
Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops.

This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs.

The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc.

Reviewers: RKSimon, andreadb, GGanesh

Reviewed By: RKSimon

Subscribers: GGanesh, llvm-commits

Differential Revision: https://reviews.llvm.org/D45380

llvm-svn: 329539
2018-04-08 17:53:18 +00:00
Craig Topper c362f42b6a [X86][Znver1] Remove InstRWs for BLENDVPS/PD
Summary:
This removes the InstRWs for BLENDVPS/PD in favor of WriteFVarBlend. The latency listed was 3 cycles but WriteFVarBlend is defined as 1 cycle latency. The 1 cycle latency matches Agner Fog's data.

The patterns were missing the VEX forms which is why there are no test changes. We don't test "-mcpu=znver1 -mattr=-avx"

Reviewers: RKSimon, GGanesh

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44841

llvm-svn: 329538
2018-04-08 17:53:15 +00:00
Mandeep Singh Grang 68ab401f62 [Support] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: chandlerc, jordan_rose, bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45140

llvm-svn: 329536
2018-04-08 16:46:22 +00:00
Mandeep Singh Grang 327fd5e47c [PowerPC] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: hfinkel, RKSimon

Reviewed By: RKSimon

Subscribers: nemanjai, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D44870

llvm-svn: 329535
2018-04-08 16:45:04 +00:00
Mandeep Singh Grang 68a151a13c [X86] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: chandlerc, craig.topper, RKSimon

Reviewed By: chandlerc, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44874

llvm-svn: 329534
2018-04-08 16:42:52 +00:00
Xin Tong 99c4e2f364 [LIR] Reorder header. NFC
llvm-svn: 329530
2018-04-08 13:19:53 +00:00
Zvi Rackover 7a53f169f1 DAGCombiner: Combine SDIV with non-splat vector pow2 divisor
Summary:
Extend existing SDIV combine for pow2 constant divider to handle
non-splat vectors of pow2 constants.

Reviewers: RKSimon, craig.topper, spatel, hfinkel, efriedma

Reviewed By: RKSimon

Subscribers: magabari, llvm-commits

Differential Revision: https://reviews.llvm.org/D42479

llvm-svn: 329525
2018-04-08 11:35:20 +00:00
Simon Pilgrim 86588fc809 [X86][Btver2] Add vector extract costs
llvm-svn: 329524
2018-04-08 11:26:26 +00:00
Michal Gorny 47671e31cd [LLVMTestingSupport] Add explicit linkage to LLVMSupport
Explicitly link LLVMTestingSupport library against LLVMSupport. This
is necessary to fix linking errors when LLVMTestingSupport is built
as a shared library (with BUILD_SHARED_LIBS=ON) and -Wl,-z,defs is used.

Differential Revision: https://reviews.llvm.org/D45408

llvm-svn: 329522
2018-04-08 06:49:17 +00:00
Guozhi Wei 0eb86c8efc [DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))
In our real world application, we found the following optimization is missed in DAGCombiner

(zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst))

If the user of original zext is an add, it may enable further lea optimization on x86.

This patch add a new function CombineZExtLogicopShiftLoad to do this optimization.

Differential Revision: https://reviews.llvm.org/D44402

llvm-svn: 329516
2018-04-07 23:36:10 +00:00
Craig Topper ef37aebc96 [X86] Combine vXi64 multiplies to MULDQ/MULUDQ during DAG combine instead of lowering.
Previously we used a custom lowering for this because of the AVX1 splitting requirement. But we can do the split during DAG combine if we check the types and subtarget

llvm-svn: 329510
2018-04-07 19:09:52 +00:00
Craig Topper 5b95eae1c3 [DAGCombiner] Add a combine to turn a build vector of zero extends of extract vector elts into a vector zero extend and possibly an extract subvector.
llvm-svn: 329509
2018-04-07 19:09:50 +00:00
Sanjay Patel 2a24958923 [InstCombine] simplify code that propagates FMF; NFC
llvm-svn: 329503
2018-04-07 14:14:23 +00:00
Simon Pilgrim 80ce1dde44 [CostModel][X86] Fix v32i16/v64i8 SETCC costs on AVX512BW targets
llvm-svn: 329498
2018-04-07 13:24:33 +00:00
Tim Northover e25e458d52 Reapply ARM: Do not spill CSR to stack on entry to noreturn functions
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.

A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.

Should fix PR9970.

Patch mostly by myeisha (pmb).

llvm-svn: 329494
2018-04-07 10:57:03 +00:00
Roman Lebedev 41922f1a6d [InstCombine] Get rid of select of bittest (PR36950 / PR17564)
Summary:
See [[ https://bugs.llvm.org/show_bug.cgi?id=36950 | PR36950 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=17564 | PR17564 ]], D45065, D45107
https://godbolt.org/g/iAYRup

Alive proof: https://rise4fun.com/Alive/uiH

Testing: `ninja check-llvm`

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45108

llvm-svn: 329492
2018-04-07 10:37:24 +00:00