Commit Graph

101427 Commits

Author SHA1 Message Date
Rong Xu 48596b6f7a [PGO] Memory intrinsic calls optimization based on profiled size
This patch optimizes two memory intrinsic operations: memset and memcpy based
on the profiled size of the operation. The high level transformation is like:
  mem_op(..., size)
  ==>
  switch (size) {
    case s1:
       mem_op(..., s1);
       goto merge_bb;
    case s2:
       mem_op(..., s2);
       goto merge_bb;
    ...
    default:
       mem_op(..., size);
       goto merge_bb;
    }
  merge_bb:

Differential Revision: http://reviews.llvm.org/D28966

llvm-svn: 299446
2017-04-04 16:42:20 +00:00
Matt Arsenault 3e90f84806 AMDGPU: Remove legacy export intrinsic
llvm-svn: 299444
2017-04-04 16:34:39 +00:00
Matt Arsenault 236da200f1 AMDGPU: Remove legacy image intrinsics
llvm-svn: 299443
2017-04-04 16:34:35 +00:00
Coby Tayree 2cb497afa4 [X86][MS-compatability]Allow named synonymous for MS-assembly operators
This patch enhances X86AsmParser's immediate expression parsing abilities, to include a named synonymous for selected binary/unary bitwise operators: {and,shl,shr,or,xor,not}, ultimately achieving better MS-compatability
MASM reference:
https://msdn.microsoft.com/en-us/library/94b6khh4.aspx

Differential Revision: D31277

llvm-svn: 299439
2017-04-04 14:43:23 +00:00
Simon Pilgrim 448222d8ba Strip trailing whitespace
llvm-svn: 299438
2017-04-04 14:40:53 +00:00
Michael Zuckerman 88fb171015 [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics.
This patch is a part one of two reviews, one for the clang and the other for LLVM. 
The patch deletes the back-end intrinsics and adds support for them in the auto upgrade.

Differential Revision: https://reviews.llvm.org/D31393

llvm-svn: 299432
2017-04-04 13:32:14 +00:00
Daniel Sanders bee5739a7c [tablegen][globalisel] Add support for nested instruction matching.
Summary:
Lift the restrictions that prevented the tree walking introduced in the
previous change and add support for patterns like:
  (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
Also adds support for G_SEXT and G_ZEXT to support these cases.

One particular aspect of this that I should draw attention to is that I've
tried to be overly conservative in determining the safety of matches that
involve non-adjacent instructions and multiple basic blocks. This is intended
to be used as a cheap initial check and we may add a more expensive check in
the future. The current rules are:
* Reject if any instruction may load/store (we'd need to check for intervening
  memory operations.
* Reject if any instruction has implicit operands.
* Reject if any instruction has unmodelled side-effects.
See isObviouslySafeToFold().

Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka

Reviewed By: ab

Subscribers: igorb, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30539

llvm-svn: 299430
2017-04-04 13:25:23 +00:00
Simon Dardis 0a47edb153 [mips] Deal with empty blocks in the mips hazard scheduler
This patch teaches the hazard scheduler how to handle empty blocks
when search for the next real instruction when dealing with forbidden
slots.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D31293

llvm-svn: 299427
2017-04-04 11:28:53 +00:00
Oren Ben Simhon 568fb197da [X86] Add 64 bit pattern matching for PSADBW
PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greather than) comparison.
The patch extends the pattern to catch also 64 bit IR pattern and includes all other comparison types (not only GLT).

Differential Revision: https://reviews.llvm.org/D31577

llvm-svn: 299425
2017-04-04 10:23:18 +00:00
Craig Topper e06b6bcfa1 [InstCombine] Use setAllBits in place of getAllOnesValue since we know the bitwidths are the same. NFCI
llvm-svn: 299413
2017-04-04 05:03:02 +00:00
Zvi Rackover 82bf48d8b9 InstCombine: Use the InstSimplify hook for shufflevector
Summary: Start using the recently added InstSimplify hook for shuffles in the respective InstCombine visitor.

Reviewers: spatel, RKSimon, craig.topper, majnemer

Reviewed By: majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D31526

llvm-svn: 299412
2017-04-04 04:47:57 +00:00
Reid Kleckner 13fc411e39 [PDB] Save one type record copy
Summary:
The TypeTableBuilder provides stable storage for type records. We don't
need to copy all of the bytes into a flat vector before adding it to the
TpiStreamBuilder.

This makes addTypeRecord take an ArrayRef<uint8_t> and a hash code to go
with it, which seems like a simplification.

Reviewers: ruiu, zturner, inglorion

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31634

llvm-svn: 299406
2017-04-04 00:56:34 +00:00
Reid Kleckner c4b5d794f1 [codeview] Cope with unsorted streams in type merging
Summary:
MASM can produce type streams that are not topologically sorted. It can
even produce type streams with circular references, but those are not
common in practice.

Reviewers: inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31629

llvm-svn: 299403
2017-04-03 23:58:15 +00:00
Reid Kleckner 67cecd1e1c [Fuzzer] Flush std::cout before aborting in CxxStringEqTest
On Windows, abort() does not appear to flush std::cout. Should fix red
sanitizer-windows bot.

llvm-svn: 299398
2017-04-03 23:00:25 +00:00
Zvi Rackover 8f460655a2 InstSimplify: Add a hook for shufflevector
Summary:
Add a hook for simplification of shufflevector's with the following rules:
- Constant folding - NFC, as it was already being done by the default handler.
-  If only one of the operands is constant, constant fold the shuffle if the
    mask does not select elements from the variable operand -  to show the hook is firing and affecting the test-cases.

Reviewers: RKSimon, craig.topper, spatel, sanjoy, nlopes, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31525

llvm-svn: 299393
2017-04-03 22:05:30 +00:00
Weiming Zhao 74a7fa0594 Reland r298901 with modifications (reverted in r298932)
Dont emit Mapping symbols for sections that contain only data.

Summary:
Dont emit mapping symbols for sections that contain only data.

Reviewers: rengolin, weimingz, kparzysz, t.p.northover, peter.smith

Reviewed By: t.p.northover

Patched by Shankar Easwaran <shankare@codeaurora.org>

Subscribers: alekseyshl, t.p.northover, llvm-commits

Differential Revision: https://reviews.llvm.org/D30724

llvm-svn: 299392
2017-04-03 21:50:04 +00:00
Matt Arsenault b600e138cc AMDGPU: Remove llvm.SI.vs.load.input
llvm-svn: 299391
2017-04-03 21:45:13 +00:00
Matt Arsenault c82768290d DAG: Fix missing legalization for any_extend_vector_inreg operands
llvm-svn: 299389
2017-04-03 21:28:13 +00:00
Reid Kleckner 1c3b5087b7 [codeview] Add support for label type records
MASM can produce these type records.

llvm-svn: 299388
2017-04-03 21:25:20 +00:00
Simon Pilgrim af33757b5d [X86][SSE]] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE
It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging.

This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values.

There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.

Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.

Differential Revision: https://reviews.llvm.org/D31373

llvm-svn: 299387
2017-04-03 21:06:51 +00:00
Craig Topper 1604f0773b [InstCombine] Remove canonicalization for (X & C1) | C2 --> (X | C2) & (C1|C2) when C1 & C2 have common bits.
It turns out that SimplifyDemandedInstructionBits will get called earlier and remove bits from C1 first. Effectively doing (X & (C1&C2)) | C2. So by the time it got to this check there could be no common bits.

I think the DAGCombiner has the same check but its check can be executed because it handles demanded bits later. I'll look at it next.

llvm-svn: 299384
2017-04-03 20:41:47 +00:00
Amjad Aboud 0389f62879 x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed
The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated.

This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue.

Fixes Bug 26413

Patch by Philipp Oppermann.

Differential Revision: https://reviews.llvm.org/D30049

llvm-svn: 299383
2017-04-03 20:28:45 +00:00
Jun Bum Lim dee5565869 [CodeGenPrep] move aarch64-type-promotion to CGP
Summary:
Move the aarch64-type-promotion pass within the existing type promotion framework in CGP.
This change also support forking sexts when a new sext is required for promotion.
Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853.

Reviewers: jmolloy, mcrosier, javed.absar, qcolombet

Reviewed By: qcolombet

Subscribers: llvm-commits, aemerson, rengolin, mcrosier

Differential Revision: https://reviews.llvm.org/D28680

llvm-svn: 299379
2017-04-03 19:20:07 +00:00
Craig Topper 3882613956 [DAGCombine][InstCombine] Fix inverted if condition in equivalent comments in DAGCombine and InstCombine. NFC
llvm-svn: 299378
2017-04-03 19:18:48 +00:00
Matt Arsenault 754dd3eaef AMDGPU: Remove legacy bfe intrinsics
llvm-svn: 299372
2017-04-03 18:08:08 +00:00
Peter Collingbourne f5af778389 Bitcode: Remove reader support for MODULE_CODE_PURGEVALS.
Support for writing this module code was removed in r73220, which was well
before the LLVM 3.0 release, so we do not need to be able to understand it
for backwards compatibility.

Differential Revision: https://reviews.llvm.org/D31563

llvm-svn: 299370
2017-04-03 17:58:48 +00:00
Zvi Rackover d76a4d0ac6 Revert "[DAGCombine] A shuffle of a splat is always the splat itself"
This reverts commit r299047 which is incorrect because the
simplification may result in incorrect propogation of undefs to users of
the folded shuffle.

Thanks to Andrea Di Biagio for pointing this out.

llvm-svn: 299368
2017-04-03 17:41:19 +00:00
Krzysztof Parzyszek 44173f7d02 [Hexagon] Factor out some common code in HexagonEarlyIfConv.cpp, NFC
llvm-svn: 299367
2017-04-03 17:26:40 +00:00
Craig Topper 79120e80b8 Revert r299337 "[InstCombine] Remove redundant combine from visitAnd"
One of the tsan bots started failing at this commit. I don't see anything obviously wrong with the commit so trying this to see if it recovers.

Failing log: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/6792

llvm-svn: 299366
2017-04-03 17:22:23 +00:00
Sanjay Patel 77bf622db6 [InstCombine] fix formatting for foldLogOpOfMaskedICmps and related bits; NFCI
1. Improve enum, function, and variable names.
2. Improve comments.
3. Fix variable capitalization.
4. Run clang-format.

As an existing code comment suggests, this should work with vector types / splat constants too,
so making this look right first will reduce the diffs needed for that change.

llvm-svn: 299365
2017-04-03 16:53:12 +00:00
Craig Topper d33ee1b960 [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.

Differential Revision: https://reviews.llvm.org/D31565

llvm-svn: 299362
2017-04-03 16:34:59 +00:00
Simon Pilgrim 9daf9c047d [DAGCombiner] Check limits before accessing array element (PR32502)
llvm-svn: 299361
2017-04-03 15:27:49 +00:00
Sjoerd Meijer 1179470ff8 ARMAsmParser: clean up of isImmediate functions
- we are now using immediate AsmOperands so that the range check functions are
  tablegen'ed.
- Big bonus is that error messages become much more accurate, i.e. instead of a
  useless "invalid operand" error message it will not say that the immediate
  operand must in range [x,y], which is why regression tests needed updating.

More tablegen operand descriptions could probably benefit from using
immediateAsmOperand, but this is a first good step to get rid of most of the
nearly identical range check functions. I will address the remaining immediate
operands in next clean ups.

Differential Revision: https://reviews.llvm.org/D31333

llvm-svn: 299358
2017-04-03 14:50:04 +00:00
Craig Topper d0b053d229 [InstCombine] Make foldOpWithConstantIntoOperand take a BinaryOperator instead of a generic Instruction.
It blindly assumes there are two operands so make it explicit.

llvm-svn: 299351
2017-04-03 07:08:08 +00:00
Craig Topper 07944f891c [InstCombine] Remove a And transform that should be handled by SimplifyDemandedInstructionBits. NFCI
llvm-svn: 299349
2017-04-03 06:02:09 +00:00
Craig Topper 00b47eec0f [APInt] Make use of whichWord and maskBit to simplify some code. NFC
llvm-svn: 299342
2017-04-02 19:35:18 +00:00
Craig Topper 55229b780d [APInt] Add a public typedef for the internal type of APInt use it instead of integerPart. Make APINT_BITS_PER_WORD and APINT_WORD_SIZE public.
This patch is one step to attempt to unify the main APInt interface and the tc functions used by APFloat.

This patch adds a WordType to APInt and uses that in all the tc functions. I've added temporary typedefs to APFloat to alias it to integerPart to keep the patch size down. I'll work on removing that in a future patch.

In future patches I hope to reuse the tc functions to implement some of the main APInt functionality.

I may remove APINT_ from BITS_PER_WORD and WORD_SIZE constants so that we don't have the repetitive APInt::APINT_ externally.

Differential Revision: https://reviews.llvm.org/D31523

llvm-svn: 299341
2017-04-02 19:17:22 +00:00
Craig Topper 70e4f434ae [InstCombine] Make InstCombiner::OptAndOp take a BinaryOperator instead of an Instruction.
The callers have already performed the necessary cast before calling. This allows us to remove a comment that says the instruction must be a BinaryOperator and make it explicit in the argument type.

Had to add a default case to the switch because BinaryOperator::getOpcode() returns a BinaryOps enum.

llvm-svn: 299339
2017-04-02 17:57:30 +00:00
Simon Pilgrim 0e2f8cd875 [X86][MMX] Improve support for folding fptosi from XMM to MMX
llvm-svn: 299338
2017-04-02 17:45:41 +00:00
Craig Topper d133591a7e [InstCombine] Remove redundant combine from visitAnd
As far as I can tell this combine is fully handled by SimplifyDemandedInstructionBits.

I was only looking at this because it is the only user of APIntOps::isShiftedMask which is itself broken. As demonstrated by r299187. I was going to fix isShiftedMask and needed to make sure we had coverage for the new cases it would expose to this combine. But looks like we can nuke it instead.

Differential Revision: https://reviews.llvm.org/D31543

llvm-svn: 299337
2017-04-02 17:34:30 +00:00
Simon Pilgrim ba28263b03 [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64
llvm-svn: 299336
2017-04-02 16:20:34 +00:00
Simon Pilgrim e56a2d7b4c [X86][MMX] Added support for subvector extraction to MMX register
llvm-svn: 299335
2017-04-02 15:52:28 +00:00
Daniel Berlin 07daac8a36 NewGVN: Handle coercion of constant stores, loads, memory insts.
Summary:
Depends on D30928.

This adds support for coercion of stores and memory instructions that do not require insertion to process.
Another few tests down.
I added the relevant tests from rle.ll

Reviewers: davide

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30929

llvm-svn: 299330
2017-04-02 13:23:44 +00:00
Nikolai Bozhenov fca527af5c [BypassSlowDivision] Do not bypass division of hash-like values
Disable bypassing if one of the operands looks like a hash value. Slow
division often occurs in hashtable implementations and fast division is
never taken there because a hash value is extremely unlikely to have
enough upper bits set to zero.

A value is considered to be hash-like if it is produced by

1) XOR operation
2) Multiplication by a constant wider than the shorter type
3) PHI node with all incoming values being hash-like

Differential Revision: https://reviews.llvm.org/D28200

llvm-svn: 299329
2017-04-02 13:14:30 +00:00
Craig Topper 15e484aa2b [X86] Use tcAdd/tcSubtract to implement the slow case of operator+=/operator-=.
llvm-svn: 299326
2017-04-02 06:59:43 +00:00
Craig Topper b8f1068765 [APInt] Combine declaration and initialization. NFC
llvm-svn: 299325
2017-04-02 06:59:41 +00:00
Craig Topper b7d8faa231 [APInt] Simplify some code by using operator+=(uint64_t) instead of doing a more complex assignment into a temporary APInt just to use the APInt operator+=.
llvm-svn: 299324
2017-04-02 06:59:38 +00:00
Craig Topper d7ed50de26 [APInt] Fix typo in comment. NFC
llvm-svn: 299323
2017-04-02 06:59:36 +00:00
Daniel Berlin 8a00270838 MemorySSA: Add support for caching clobbering access in stores
Summary:
This enables us to cache the clobbering access for stores, despite the
fact that we can't rewrite the use-def chains themselves.

Early testing shows that, after this change, for larger testcases, it will be a significant net positive (memory and time) to remove the walker caching.

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31567

llvm-svn: 299322
2017-04-02 05:09:15 +00:00
Craig Topper 68a3ed2e1b [APInt] Use conditional operator to simplify some code. NFC
llvm-svn: 299320
2017-04-01 21:50:10 +00:00
Craig Topper a742cb5fc8 [APInt] Implement flipAllBitsSlowCase with tcComplement. NFCI
llvm-svn: 299319
2017-04-01 21:50:08 +00:00
Craig Topper 99cfe4f99d [APInt] Fix indentation. NFC
llvm-svn: 299318
2017-04-01 21:50:06 +00:00
Craig Topper b2aaa5da42 [APInt] Implement AndAssignSlowCase using tcAnd. Do the same for Or and Xor. NFCI
llvm-svn: 299317
2017-04-01 21:50:03 +00:00
Craig Topper 278ebd2f98 [APInt] Allow GreatestCommonDivisor to take rvalue inputs efficiently. Use moves instead of copies in the loop.
Summary:
GreatestComonDivisor currently makes a copy of both its inputs. Then in the loop we do one move and two copies, plus any allocation the urem call does.

This patch changes it to take its inputs by value so that we can do a move of any rvalue inputs instead of copying. Then in the loop we do 3 move assignments and no copies. This way the only possible allocations we have in the loop is from the urem call.

Reviewers: dblaikie, RKSimon, hans

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31572

llvm-svn: 299314
2017-04-01 20:30:57 +00:00
Davide Italiano 16fe5822f2 [WASM] Remove other comparison of unsigned expression >= 0.
This should finally fix the GCC 7 build with -Werror.

llvm-svn: 299313
2017-04-01 19:47:52 +00:00
Davide Italiano deede839f2 [WASM] Remove a set but never used variable.
llvm-svn: 299312
2017-04-01 19:40:51 +00:00
Davide Italiano 543760218a [WASM] Remove an assertion that can never fire.
uint* is by definition always >=0.

llvm-svn: 299311
2017-04-01 19:37:15 +00:00
Davide Italiano c88169e61b [AMDGPU] Garbage collect now unused dead code. NFCI.
llvm-svn: 299310
2017-04-01 19:30:17 +00:00
Sanjay Patel 8b5ad3f00e [InstSimplify] add constant folding for fdiv/frem
Also, add a helper function so we don't have to repeat this code for each binop.

llvm-svn: 299309
2017-04-01 19:05:11 +00:00
Sanjay Patel 1fd16f073d fix formatting; NFC
llvm-svn: 299307
2017-04-01 18:40:30 +00:00
Sanjay Patel 665021e7ee [DAGCombiner] enable vector transforms for any/all {sign} bits set/clear
The code already allowed vector types in via "isInteger" (which might want
a more specific name), so use splat-friendly constant predicates to match
those types.

llvm-svn: 299304
2017-04-01 15:05:54 +00:00
Daniel Berlin 9a9c9ff260 NewGVN: Don't try to kill off the stored value of stores when
processing the congruence class of the store.
Because we use the stored value of a store as the def, it isn't dead
just because it appears as a def when it comes from a store.

Note: I have not hit any cases with the memory code as it is where
this breaks anything, just because of what memory congruences we
actually allow.  In a followup that improves memory congruence,
this bug actually breaks real stuff (but the verifier catches it).

llvm-svn: 299300
2017-04-01 09:44:33 +00:00
Daniel Berlin 9b4984926c NewGVN: Clean up GVNExpression memory hierarchy, restructure hash computation a bit so we don't have to redefine it for loads, stores, and calls
llvm-svn: 299299
2017-04-01 09:44:29 +00:00
Daniel Berlin 871ecd90ca NewGVN: Use def_chain iterator in singleReachablePhiPath instead of recursion
llvm-svn: 299298
2017-04-01 09:44:24 +00:00
Daniel Berlin 07275c3065 Move def_chain iterator to MemorySSA.h so it can be reused
llvm-svn: 299297
2017-04-01 09:44:19 +00:00
Daniel Berlin d042031f0f MemorySSA: Push const correctness further.
llvm-svn: 299295
2017-04-01 09:01:12 +00:00
Daniel Berlin 7500c5641e MemorySSA: Kill the WalkTargetCache now that we have getBlockDefs.
llvm-svn: 299294
2017-04-01 08:59:45 +00:00
Craig Topper 9ab8d7f9c3 [APInt] Remove the mul/urem/srem/udiv/sdiv functions from the APIntOps namespace. Replace the few usages with calls to the class methods. NFC
llvm-svn: 299292
2017-04-01 05:08:57 +00:00
Craig Topper 73250168e7 [DAGCombiner] Fix fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask) to explicitly ensure that only one of the inputs of each shuffle is a zero vector.
This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros.

Fixes PR32484.

llvm-svn: 299291
2017-04-01 04:26:20 +00:00
Quentin Colombet 49d70d0529 Revert "Feature generic option to setup start/stop-after/before"
This reverts commit r299282.

Didn't intend to commit this :(

llvm-svn: 299288
2017-04-01 01:26:24 +00:00
Quentin Colombet fc8f048c13 Revert "Localizer fun"
This reverts commit r299283.

Didn't intend to commit this :(

llvm-svn: 299287
2017-04-01 01:26:21 +00:00
Quentin Colombet 35a47010b1 Revert "Instrument SDISel C++ patterns"
This reverts commit r299284.

Didn't intend to commit this :(

llvm-svn: 299286
2017-04-01 01:26:17 +00:00
Quentin Colombet 7f64318938 [RegBankSelect] Support REG_SEQUENCE for generic mapping
REG_SEQUENCE falls into the same category as COPY for operands mapping:
- They don't have MCInstrDesc with register constraints
- The input variable could use whatever register classes
- It is possible to have register class already assigned to the operands

In particular, given REG_SEQUENCE are always target specific because of
the subreg indices. Those indices must apply to the register class of
the definition of the REG_SEQUENCE and therefore, the target must set a
register class to that definition. As a result, the generic code can
always use that register class to derive a valid mapping for a
REG_SEQUENCE.

llvm-svn: 299285
2017-04-01 01:26:14 +00:00
Quentin Colombet b43da15602 Instrument SDISel C++ patterns
llvm-svn: 299284
2017-04-01 01:21:32 +00:00
Quentin Colombet 3c40b366c5 Localizer fun
WIP

llvm-svn: 299283
2017-04-01 01:21:28 +00:00
Quentin Colombet ffe3053a66 Feature generic option to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

Previously each user would have needed to duplicate this logic and set
up its own options.

NFC

llvm-svn: 299282
2017-04-01 01:21:24 +00:00
Eric Christopher 60a245e0ff Reduce the number of times we query the subtarget for the same information.
llvm-svn: 299278
2017-03-31 23:12:27 +00:00
Eric Christopher cf965f2f03 Small cleanup to remove extraneous cast.
llvm-svn: 299277
2017-03-31 23:12:24 +00:00
Craig Topper 47fd2de304 [APInt] Fix bugs in isShiftedMask to match behavior of the similar function in MathExtras.h
This removes a parameter from the routine that was responsible for a lot of the issue. It was a bit count that had to be set to the BitWidth of the APInt and would get passed to getLowBitsSet. This guaranteed the call to getLowBitsSet would create an all ones value. This was then compared to (V | (V-1)). So the only shifted masks we detected had to have the MSB set.

The one in tree user is a transform in InstCombine that never fires due to earlier transforms covering the case better. I've submitted a patch to remove it completely, but for now I've just adapted it to the new interface for isShiftedMask.

llvm-svn: 299273
2017-03-31 22:23:42 +00:00
Derek Schuff c5b472f440 Add virtual destructor to WasmYAML::Section or avoid memory leak
Tested locally with -DLLVM_USE_SANITIZER=Address

Differential Revision: https://reviews.llvm.org/D31551

Patch by Sam Clegg

llvm-svn: 299270
2017-03-31 22:14:14 +00:00
Bob Haarman d6aea71957 LTO: call getRealLinkageName on IRNames before feeding to getGUID
Summary: GlobalValue has two getGUID methods: an instance method and a static method. The static method takes a string, which is expected to be what GlobalValue::getRealLinkageName() would return. In LTO.cpp, we were not doing this consistently, sometimes passing an IR name instead. This change makes it so that we call getRealLinkageName() first, making the static getGUID return value consistent with the instance method. Without this change, compiling FileCheck with ThinLTO on Windows fails with numerous undefined symbol errors. With the change, it builds successfully.

Reviewers: pcc, rnk

Reviewed By: pcc

Subscribers: tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31444

llvm-svn: 299268
2017-03-31 21:56:30 +00:00
Craig Topper e625d74271 [InstCombine] When adding an Instruction and its Users to the worklist at the same time, make sure we put the Users in first. Then put in the instruction.
This way we ensure we immediately revisit the instruction and do any additional optimizations before visiting the users. Otherwise we might visit the users, then the instruction, then users again, then instruction again.

llvm-svn: 299267
2017-03-31 21:35:30 +00:00
Sanjay Patel 16d458ea0d [DAGCombiner] refactor and/or-of-setcc to get rid of duplicated code; NFCI
llvm-svn: 299266
2017-03-31 21:30:50 +00:00
Krzysztof Parzyszek d04c9b999c [Hexagon] Remove unused variables
Found by PVS-Studio. Fixes llvm.org/PR31676.

llvm-svn: 299262
2017-03-31 21:03:59 +00:00
Krzysztof Parzyszek b326411fdc [Hexagon] Fix typo in HexagonEarlyIfCConv.cpp
Found by PVS-Studio. Fixes llvm.org/PR32480.

llvm-svn: 299258
2017-03-31 20:36:00 +00:00
Stephen Canon 157c86913a Fix APFloat mod (committing for simonbyrne)
The previous version was prone to intermediate rounding or overflow.

Differential Revision: https://reviews.llvm.org/D29346

llvm-svn: 299256
2017-03-31 20:31:33 +00:00
Sanjay Patel 34da36e74f [DAGCombiner] add fold for 'All sign bits set?'
(and (setlt X,  0), (setlt Y,  0)) --> (setlt (and X, Y),  0)

We have 7 similar folds, but this one got away. The fact that the
x86 test with a branch didn't change is probably a separate bug. We
may also be missing this and the related folds in instcombine.

llvm-svn: 299252
2017-03-31 20:28:06 +00:00
Stanislav Mekhanoshin 12aa5b733e [AMDGPU] Remove assumption that vector and scalar types do not alias
Differential Revision: https://reviews.llvm.org/D31547

llvm-svn: 299250
2017-03-31 20:16:54 +00:00
Craig Topper 885fa12e8a [APInt] Remove shift functions from APIntOps namespace. Replace the few users with the APInt class methods. NFCI
llvm-svn: 299248
2017-03-31 20:01:16 +00:00
Joerg Sonnenberger 28bed106e0 Do not translate rint into nearbyint, but truncate it like nearbyint.
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.

This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.

llvm-svn: 299247
2017-03-31 19:58:07 +00:00
Matt Arsenault 8edfaee7be AMDGPU: Remove unnecessary ands when f16 is legal
Add a new node to act as a fancy bitcast from f16 operations to
i32 that implicitly zero the high 16-bits of the result.

Alternatively could try making v2f16 legal and canonicalizing
on build_vectors.

llvm-svn: 299246
2017-03-31 19:53:03 +00:00
Jan Vesely 3c99441ef4 AMDGPU/R600: Fix amdgpu alias analysis pass.
R600 uses higher AS number to access kernel parameters

Fixes: r298846
Differential Revision: https://reviews.llvm.org/D31520

llvm-svn: 299245
2017-03-31 19:26:23 +00:00
Craig Topper e7e3560288 [APInt] Rewrite getLoBits in a way that will do one less memory allocation in the multiword case. Rewrite getHiBits to use the class method version of lshr instead of the one in APIntOps. NFCI
llvm-svn: 299243
2017-03-31 18:48:14 +00:00
Sanjay Patel 61d3409535 [DAGCombiner] remove redundant code and add comments; NFCI
llvm-svn: 299241
2017-03-31 18:18:58 +00:00
Balaram Makam 2aba753e84 [AArch64] Add new subtarget feature to fold LSL into address mode.
Summary:
This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL.

Reviewers: mcrosier, rengolin, t.p.northover

Subscribers: junbuml, gberry, llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D31113

llvm-svn: 299240
2017-03-31 18:16:53 +00:00
Craig Topper 9601168670 [AVX-512] Update lowering for gather/scatter prefetch intrinsics to match the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers.
Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.

Fixes PR32411.

llvm-svn: 299234
2017-03-31 17:24:29 +00:00
Dehao Chen fed890ea3a Fix the InstCombine to reserve the VP metadata and sets correct call count.
Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded.

Reviewers: eraman, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31344

llvm-svn: 299228
2017-03-31 15:59:52 +00:00
Jan Sjodin f1a30f1800 Refactor code to create getFallThrough method in MachineBasicBlock.
Differential Revision: https://reviews.llvm.org/D27264

llvm-svn: 299227
2017-03-31 15:55:37 +00:00
Kristof Beyls 7adf8c52a8 Remove name space pollution from Signals.cpp
llvm-svn: 299224
2017-03-31 14:58:52 +00:00
Petar Jovanovic 9bff3b7818 [mips][msa] Prevent output operand from commuting for dpadd_[su].df ins
Implementation of TargetInstrInfo::findCommutedOpIndices for MIPS target,
restricting commutativity to second and third operand only for
dpaadd_[su].df instructions therein.

Prior to this change, there were cases where the vector that is to be added
to the dot product of the other two could take a position other than the
first one in the instruction, generating false output in the destination
vector.

Such behavior has been noticed in the two functions generating v2i64 output
values so far. Other ones may exhibit such behavior as well, just not for
the vector operands which are present in the test at the moment.

Tests altered so that the function's first operand is a constant splat so
that it can be loaded with a ldi instruction, since that is the case in
which the erroneous instruction operand placement has occurred. We check
that the register which is present in the ldi instruction is placed as the
first operand in the corresponding dpadd instruction.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D30827

llvm-svn: 299223
2017-03-31 14:31:55 +00:00
Kristof Beyls a11dbf2c90 Remove more name space pollution from .inc files
llvm-svn: 299222
2017-03-31 14:26:44 +00:00
Simon Pilgrim 1cdbfe44b1 [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT
Followup to D31311

llvm-svn: 299221
2017-03-31 14:21:50 +00:00
Jonas Paulsson c7bb22e75f [SystemZ] Make sure of correct regclasses in insertSelect()
Since LOCR only accepts GR32 virtual registers, its operands must be copied
into this regclass in insertSelect(), when an LOCR is built. Otherwise, the
case where the source operand was GRX32 will produce invalid IR.

Review: Ulrich Weigand
llvm-svn: 299220
2017-03-31 14:06:59 +00:00
Simon Pilgrim 3c81c34d8d [DAGCombiner] Add vector demanded elements support to ComputeNumSignBits
Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.

Followup to D25691.

Differential Revision: https://reviews.llvm.org/D31311

llvm-svn: 299219
2017-03-31 13:54:09 +00:00
Kristof Beyls 60088c3ff6 Do not pollute the namespace in a header file.
llvm-svn: 299218
2017-03-31 13:48:21 +00:00
Jonas Paulsson 56bb0857e9 [SystemZ] Skip DAGCombining of vector node for older subtargets.
Even on older subtargets that lack vector support, there may be vector values
with just one element in the input program. These are converted during DAG
legalization to scalar values.

The pre-legalize SystemZ DAGCombiner methods should in this circumstance not
touch these nodes. This patch adds a check for this in
SystemZTargetLowering::combineEXTRACT_VECTOR_ELT().

Review: Ulrich Weigand
llvm-svn: 299213
2017-03-31 13:22:59 +00:00
Kristof Beyls 77ce4f6e37 Make naming in Host.h in line with coding standards.
Based on post-commit review comments by Chandler Carruth on
https://reviews.llvm.org/D31236. Thanks!

llvm-svn: 299211
2017-03-31 13:06:40 +00:00
Yaron Keren 80745c52bc Update comment for r299098 per feedback from James Henderson.
llvm-svn: 299207
2017-03-31 12:08:45 +00:00
Max Kazantsev 2e44d2969a [ScalarEvolution] Re-enable Predicate implication from operations
The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build.
The reason of the crash was type mismatch between either a or b and RHS in the following situation:

  LHS = sext(a +nsw b) > RHS.

This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type.
But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this
situation we don't need to create any non-constant SCEVs.

This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not
go further into range analysis etc (because in some situations these analyzes succeed even when the passed
arguments have wrong types, what should not normally happen).

The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong
usage of predicates in recursive invocations.

The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll

Reviewers: reames, apilipenko, anna, sanjoy

Reviewed By: sanjoy

Subscribers: mzolotukhin, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31238

llvm-svn: 299205
2017-03-31 12:05:30 +00:00
Kristof Beyls f698a69107 Do not pollute the namespace in a header file.
llvm-svn: 299203
2017-03-31 12:00:24 +00:00
Sam Kolton 27e0f8bc72 [AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns
Previously compiler often extracted common immediates into specific register, e.g.:
```
%vreg0 = S_MOV_B32 0xff;
%vreg2 = V_AND_B32_e32 %vreg0, %vreg1
%vreg4 = V_AND_B32_e32 %vreg0, %vreg3
```
Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands:
```
SDWA src: %vreg2 src_sel:BYTE_0
SDWA src: %vreg4 src_sel:BYTE_0
```
With this change peephole check if operand is either immediate or register that is copy of immediate.

llvm-svn: 299202
2017-03-31 11:42:43 +00:00
Simon Pilgrim 37b536e4b3 [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.

Differential Revision: https://reviews.llvm.org/D31249

llvm-svn: 299201
2017-03-31 11:24:16 +00:00
Simon Pilgrim 6bdc755519 Spelling mistakes in comments. NFCI.
llvm-svn: 299197
2017-03-31 10:59:37 +00:00
Simon Pilgrim e32a833a5c Fix MSVC 'not all control paths return a value' warning
llvm-svn: 299195
2017-03-31 10:46:47 +00:00
Simon Pilgrim c8da0c0bdf Fix signed/unsigned warning
llvm-svn: 299194
2017-03-31 10:45:35 +00:00
Mikael Holmen 79235bd4d8 [Scalarizer] Handle scalar arguments in vector GEP
Summary:
Triggered by commit r298620: "[LV] Vectorize GEPs".

If we encounter a vector GEP with scalar arguments, we splat the scalar
into a vector of appropriate size before we scatter the argument.

Reviewers: arsenm, mehdi_amini, bkramer

Reviewed By: arsenm

Subscribers: bjope, mssimpso, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D31416

llvm-svn: 299186
2017-03-31 06:29:49 +00:00
Peter Collingbourne 7b30f16c9f Re-apply r299168 and r299169 now that the libdeps are fixed.
llvm-svn: 299184
2017-03-31 04:47:07 +00:00
Peter Collingbourne c66018e247 Move llvm::emitLinkerFlagsForGlobalCOFF() to Mangler.
llvm-svn: 299183
2017-03-31 04:46:50 +00:00
Peter Collingbourne 2e0ffe9858 Move llvm::canBeOmittedFromSymbolTable() to Analysis.
llvm-svn: 299182
2017-03-31 04:46:31 +00:00
Kostya Serebryany a617e16ff1 [libFuzzer] simplify the code a bit
llvm-svn: 299180
2017-03-31 04:17:45 +00:00
Kostya Serebryany 7de1f1a826 [libFuzzer] tests: don't test 64-bit comparison on 32-bit builds
llvm-svn: 299179
2017-03-31 03:51:40 +00:00
Kostya Serebryany b1f802cf80 [libFuzzer] ensure that strncmp is not inlined in a test
llvm-svn: 299177
2017-03-31 03:34:33 +00:00
Peter Collingbourne f10698b940 Revert r299168 and r299169 due to library dependency issues.
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/25073/steps/build_llvmclang/logs/stdio

llvm-svn: 299171
2017-03-31 02:44:50 +00:00
Peter Collingbourne d9717aa0e4 LTO: Reduce memory consumption by creating an in-memory symbol table for InputFiles. NFCI.
Introduce symbol table data structures that can be potentially written to
disk, have the LTO library build those data structures using temporarily
constructed modules and redirect the LTO library implementation to go through
those data structures. This allows us to remove the LLVMContext and Modules
owned by InputFile.

With this change I measured a peak memory consumption decrease from 5.4GB to
2.8GB in a no-op incremental ThinLTO link of Chromium on Linux. The impact on
memory consumption is larger in COFF linkers where we are currently forced
to materialize all metadata in order to read linker options. Peak memory
consumption linking a large piece of Chromium for Windows with full LTO and
debug info decreases from >64GB (OOM) to 15GB.

Part of PR27551.

Differential Revision: https://reviews.llvm.org/D31364

llvm-svn: 299168
2017-03-31 02:28:30 +00:00
Kostya Serebryany af2dfce683 [libFuzzer] make sure we don't execute libFuzzer's mem* and str* hooks while calling mem*/str* inside libFuzzer itself
llvm-svn: 299167
2017-03-31 02:21:28 +00:00
Eric Christopher 9fd267c221 Temporarily revert "[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64" as it's causing test failures, I've given Carrot a testcase offline.
This reverts commit r298955.

llvm-svn: 299153
2017-03-31 02:16:54 +00:00
Eric Christopher 276989f54c Fix typo, defind -> defined.
llvm-svn: 299149
2017-03-31 01:46:30 +00:00
Kostya Serebryany 3033065df9 [libFuzzer] try to fix value-profile-strncmp on the Mac bot
llvm-svn: 299145
2017-03-31 00:52:39 +00:00
Peter Collingbourne 61781ac26e ModuleSummaryAnalysis: Use a more precise #include. NFC.
llvm-svn: 299142
2017-03-31 00:08:24 +00:00
Dan Gohman 970d02c42d [WebAssembly] Initial linking metadata support
Add support for the new relocations and linking metadata section support in
https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md. In
particular, this allows LLVM to indicate which variable is the stack pointer,
so that it can be linked with other objects.

This also adds support for emitting type relocations for call_indirect
instructions.

Right now, this is mainly tested by using wabt and hexdump to examine the
output on selected testcases. We'll add more tests as the design stablizes
and more of the pieces are in place.

llvm-svn: 299141
2017-03-30 23:58:19 +00:00
Matt Arsenault 1074cb5420 AMDGPU: Rename isKernel
What we really want to do is distinguish functions that may
be called by other functions, and graphics shaders are not
called kernels.

llvm-svn: 299140
2017-03-30 23:58:04 +00:00
Peter Collingbourne 6b193966ac ThinLTOBitcodeWriter: Use Module::global_values(). NFCI.
llvm-svn: 299132
2017-03-30 23:43:08 +00:00
Eric Christopher b9c56d1235 getPristineRegs is not accurately considering shrink wrapping puts
registers not saved in certain blocks. Use explicit getCalleeSavedInfo
and isLiveIn instead.

This fixes pr32292.

Patch by Tim Shen!

llvm-svn: 299124
2017-03-30 22:34:20 +00:00
Craig Topper 79e5bc528d [InstCombine] Fix typo last->least. NFC
llvm-svn: 299123
2017-03-30 22:28:55 +00:00
Matt Arsenault 79f837c254 AMDGPU: Add all atomicrmw fields to atomic.inc/dec
Add scope, order, isVolatile

llvm-svn: 299122
2017-03-30 22:21:40 +00:00
Craig Topper 3a40a397c3 [InstSimplify] Use m_SignBit instead of calling getSignBit and using m_Specific. NFCI
llvm-svn: 299121
2017-03-30 22:21:16 +00:00
Craig Topper 6856d341a8 [InstSimplify] Use APInt::isMaxSignedValue() instead of comparing with ~APInt::getSignBit. NFC
llvm-svn: 299120
2017-03-30 22:10:54 +00:00
Hongbin Zheng bfd7c38de7 [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative
Since there is no sdiv in SCEV, an 'udiv' is a better canonical form than an 'sdiv' as the user of induction variable

Differential Revision: https://reviews.llvm.org/D31488

llvm-svn: 299118
2017-03-30 21:56:56 +00:00
Craig Topper 3001b35189 [AVX-512] Fix bad comment from r299112. NFC
llvm-svn: 299114
2017-03-30 21:05:33 +00:00
Craig Topper 533b1bde1b [AVX-512] Fix another case where fastisel was generating a GR8 to VK1 copy. This time after calls returning i1.
Fixes PR32472.

llvm-svn: 299112
2017-03-30 21:02:52 +00:00
Stanislav Mekhanoshin 89653dfd2a [AMDGPU] Add GlobalOpt parameter to Always Inliner pass
If set to false it does not remove global aliases. With this parameter
set to false it should be safe to run the pass before link.

Differential Revision: https://reviews.llvm.org/D31489

llvm-svn: 299108
2017-03-30 20:16:02 +00:00
Adrian Prantl 346dcaf1fa Teach stripNonLineTableDebugInfo() to remap DILocations in !llvm.loop nodes.
llvm-svn: 299107
2017-03-30 20:10:56 +00:00
Juergen Ributzka cad1249106 [Object] Remove check for BIND_OPCODE_DONE/REBASE_OPCODE_DONE.
BIND_OPCODE_DONE/REBASE_OPCODE_DONE may appear at the end of the opcode array,
but they are not required to. The linker only adds them as padding to align the
opcodes to pointer size.

This fixes rdar://problem/31285560.

llvm-svn: 299104
2017-03-30 19:56:50 +00:00
Davide Italiano a0bd28c4d9 [AArch64ISelLowering] Remove `else` after `return` in LowerGlobalTLSAddress.
llvm-svn: 299103
2017-03-30 19:52:31 +00:00
Davide Italiano de05686ec6 [AArch64] Simplify isSingExtended()/isZeroExtended(). NFCI.
llvm-svn: 299102
2017-03-30 19:46:18 +00:00
Derek Schuff d3d84fdda1 [WebAssembly] Improve support for WebAssembly binary format
Mostly this change adds support converting to and from
YAML which will allow us to write more test cases for
the WebAssembly MC and lld ports.

Better support for objdump, readelf, and nm will be in
followup CLs.

I had to update the two wasm test binaries because they
used the old style 'name' section which is no longer
supported.

Differential Revision: https://reviews.llvm.org/D31099

Patch by Sam Clegg

llvm-svn: 299101
2017-03-30 19:44:09 +00:00
Yaron Keren 9d27c47b46 Following r297661, disable dup workaround to disable duplicate STDOUT fd closing and instead directly prevent closing of STD* file descriptors.
We do not want to close STDOUT as there may have been several uses of it
such as the case: llc %s -o=- -pass-remarks-output=- -filetype=asm
which cause multiple closes of STDOUT_FILENO and/or use-after-close of it.
Using dup() in getFD doesn't work as we end up with original STDOUT_FILENO
open anyhow.

reviewed by Rafael Espindola

Differential Revision: https://reviews.llvm.org/D31505

llvm-svn: 299098
2017-03-30 19:30:51 +00:00
Adam Nemet edaec6de73 [DAGCombiner] Initial support for the fast-math flag contract
Now alternatively to the TargetOption.AllowFPOpFusion global flag, FMUL->FADD
can also use the per operation FMF to allow fusion.

The idea here is not to port everything to the new scheme (e.g. fused
multiply-and-sub will be ported later) but that this work all the way from
clang.

The transformation is conditionalized on *both* the FADD and the FMUL having
the FMF contract flag.

Differential Revision: https://reviews.llvm.org/D31169

llvm-svn: 299096
2017-03-30 18:53:04 +00:00
Ahmed Bougacha 6dd6082472 [CodeGen] Pass SDAG an ORE, and replace FastISel stats with remarks.
In the long-term, we want to replace statistics with something
finer-grained that lets us gather per-function data.
Remarks are that replacement.

Create an ORE instance in SelectionDAGISel, and pass it to
SelectionDAG.

SelectionDAG was used so that we can emit remarks from all
SelectionDAG-related code, including TargetLowering and DAGCombiner.
This isn't used in the current patch but Adam tells me he's interested
for the fp-contract combines.

Use the ORE instance to emit FastISel failures as remarks (instead of
the mix of dbgs() dumps and statistics that we currently have).

Eventually, we want to have an API that tells us whether remarks are
enabled (http://llvm.org/PR32352) so that we don't emit expensive
remarks (in this case, dumping IR) when it's not needed.  For now, use
'isEnabled' as a crude replacement.

This does mean that the replacement for '-fast-isel-verbose' is now
'-pass-remarks-missed=isel'.  Additionally, clang users also need to
enable remark diagnostics, using '-Rpass-missed=isel'.

This also removes '-fast-isel-verbose2': there are no static statistics
that we want to only enable in asserts builds, so we can always use
the remarks regardless of the build type.

Differential Revision: https://reviews.llvm.org/D31405

llvm-svn: 299093
2017-03-30 17:49:58 +00:00
Sanjay Patel 6d5ba061f8 [DAGCombiner] add helper function for visitORLike; NFCI
This combines all of the equivalent clean-ups for foldAndOfSetCCs:
https://reviews.llvm.org/rL298938
https://reviews.llvm.org/rL298940
https://reviews.llvm.org/rL298944
https://reviews.llvm.org/rL298949
https://reviews.llvm.org/rL298950
https://reviews.llvm.org/rL299002
https://reviews.llvm.org/rL299013

The sins of code duplication are on full display here:
each function is missing a fold that wasn't copied over from its logical sibling. 

llvm-svn: 299091
2017-03-30 17:32:42 +00:00
Simon Pilgrim 68168d17b9 Spelling mistakes in comments. NFCI.
Based on corrections mentioned in patch for clang for PR27635

llvm-svn: 299072
2017-03-30 12:59:53 +00:00
Simon Pilgrim ef4509b36e Spelling mistakes in comments. NFCI.
llvm-svn: 299069
2017-03-30 12:30:15 +00:00
Kristof Beyls 7a76b315d6 Revert "Make naming in Host.h in line with coding standards."
This reverts r299062, which caused build failures on Windows.
It also reverts the attempts to fix the windows builds in r299064 and r299065.
The introduction of namespace llvm::sys::detail makes MSVC, and seemingly also
mingw, complain about ambiguity with the existing namespace llvm::detail.
E.g.:
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/MathExtras.h(184): error C2872: 'detail': ambiguous symbol
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/PointerLikeTypeTraits.h(31): note: could be 'llvm::detail'
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/Host.h(80): note: or       'llvm::sys::detail'

In r299064 and r299065 I tried to fix these ambiguities, based on the errors
reported in the log files. It seems however that the build stops early when
this kind of error is encountered, and many build-then-fix-iterations on
Windows may be needed to fix this. Therefore reverting r299062 for now to
get the build working again on Windows.

llvm-svn: 299066
2017-03-30 11:06:25 +00:00
Kristof Beyls ca878c943b Make naming in Host.h in line with coding standards.
Based on post-commit review comments by Chandler Carruth on
https://reviews.llvm.org/D31236. Thanks!

llvm-svn: 299062
2017-03-30 09:31:59 +00:00
Kristof Beyls 9e46396ecc Refactor getHostCPUName to allow testing on non-native hardware.
This refactors getHostCPUName so that for the architectures that get the
host cpu info on linux from /proc/cpuinfo, the /proc/cpuinfo parsing
logic is present in the build, even if it wasn't built on a linux system
for that architecture.

Since the code is present in the build, we can then test that code also
on other systems, i.e. we don't need to have buildbots setup for all
architectures on linux to be able to test this. Instead, developers will
test this as part of the regression test run.

As an example, a few unit tests are added to test getHostCPUName for ARM
running linux. A unit test is preferred over a lit-based test, since the
expectation is that in the future, the functionality here will grow over
what can be tested with "llc -mcpu=native".

This is a preparation step to enable implementing the range of
improvements discussed on PR30516, such as adding AArch64 support,
support for big.LITTLE systems, reducing code duplication.

Differential Revision: https://reviews.llvm.org/D31236

llvm-svn: 299060
2017-03-30 07:24:49 +00:00
Craig Topper eafcbe2d10 [APInt] Remove references to integerPartWidth outside of APFloat implentation.
Turns out integerPartWidth only explicitly defines the width of the tc functions in the APInt class. Functions that aren't used by APInt implementation itself. Many places in the code base already assume APInt is made up of 64-bit pieces. Explicitly assuming 64-bit here doesn't make that situation much worse. A full audit would need to be done if it ever changes.

llvm-svn: 299059
2017-03-30 05:49:03 +00:00
Kostya Serebryany 01ddc1cfd5 [libFuzzer] remove a stale flag from tests, run value-profile-strncmp.test longer (hopefully, will fix the OSX bot)
llvm-svn: 299051
2017-03-30 04:22:20 +00:00
Zvi Rackover 7569436f81 [DAGCombine] A shuffle of a splat is always the splat itself
Summary:
Add a simplification:
shuffle (splat-shuffle), undef, M --> splat-shuffle

Fixes pr32449

Patch by Sanjay Patel

Reviewers: eli.friedman, RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31426

llvm-svn: 299047
2017-03-30 01:42:57 +00:00
Kostya Serebryany d7d1d517ee [libFuzzer] best effort support for -fsanitize-coverage=trace-pc instrumentation. It is less efficient and precise than -fsanitize-coverage=trace-pc-guard, but still works
llvm-svn: 299046
2017-03-30 01:27:20 +00:00
Eric Christopher 9ea300f08d If the DIUnit has flags passed on it then have DW_AT_producer be a combination of DICompileUnit::Producer and Flags.
The darwin behavior is unchanged and will continue to use DW_AT_APPLE_flags.

Patch by Zhizhou Yang

llvm-svn: 299038
2017-03-29 23:34:27 +00:00
Reid Kleckner acd9a6f09d [codeview] Fix buggy BeginIndexMapSize assertion
This assert is just trying to test that processing each record adds
exactly one entry to the index map. The assert logic was wrong when the
first record in the type stream was a field list.

I've simplified the code by moving the LF_FIELDLIST-specific logic into
the callback for that record type.

llvm-svn: 299035
2017-03-29 22:51:22 +00:00
Davide Italiano e13920f407 [X86IselLowering] Remove extraneous semicolon. NFCI.
Unbreaks the build with GCC -Werror.

llvm-svn: 299030
2017-03-29 21:34:58 +00:00
Davide Italiano cdcdc97879 [DAGCombiner] Remove else after return. NFCI.
llvm-svn: 299022
2017-03-29 19:39:46 +00:00
Adrian McCarthy 4d93d66ddd Re-land: "Make NativeExeSymbol a concrete subclass of NativeRawSymbol [PDB]"
This should work on all platforms now that r299006 has landed.  Tested locally
on Windows and Linux.

This moves exe symbol-specific method implementations out of NativeRawSymbol
into a concrete subclass. Also adds implementations for hasCTypes and
hasPrivateSymbols and a simple test to ensure the native reader can access the
summary information for the executable from the PDB.

Original Differential Revision: https://reviews.llvm.org/D31059

llvm-svn: 299019
2017-03-29 19:27:08 +00:00
Rafael Espindola b26bc7fddc Add ifunc support to ModuleSymbolTable.
Do that by creating a global_values, which is similar to
global_objects, but also iterates over aliases and ifuncs.

llvm-svn: 299018
2017-03-29 19:26:26 +00:00
Matthew Simpson c8f0aeccda [InstCombine] Correct the check for vector GEPs
Some of the GEP combines (e.g., descaling) can't handle vector GEPs. We have an
existing check that attempts to bail out if given a vector GEP. However, the
check only tests the GEP's pointer operand. A GEP results in a vector of
pointers if at least one of its operands is vector-typed (e.g., its pointer
operand could be a scalar, but its index could be a vector). We should just
check the type of the GEP itself. This should fix PR32414.

Reference: https://bugs.llvm.org/show_bug.cgi?id=32414
Differential Revision: https://reviews.llvm.org/D31470

llvm-svn: 299017
2017-03-29 18:23:08 +00:00
Sanjay Patel ff211bb5a6 [DAGCombiner] unify type checks and add asserts; NFCI
We had a mix of type checks and usage that wasn't very clear.

llvm-svn: 299013
2017-03-29 18:08:01 +00:00
Simon Pilgrim 8362c95257 [X86] Tidied up comment - we don't custom lower add/sub i64 on i686 anymore. NFCI.
llvm-svn: 299004
2017-03-29 15:41:58 +00:00
Sanjay Patel 087e922328 [DAGCombiner] reduce code duplication by rearranging checks; NFCI
llvm-svn: 299002
2017-03-29 15:37:33 +00:00
Simon Pilgrim fc97d5049f Spelling mistakes in comments. NFCI.
llvm-svn: 299000
2017-03-29 15:27:24 +00:00
Sven van Haastregt 04bfa87f12 [MachineVerifier] Drop a spurious const
As of r298987 the argument is a value that we std::move, so it
shouldn't be const anymore.

llvm-svn: 298999
2017-03-29 15:25:06 +00:00
Filipe Cabecinhas 8b94273fe6 Cleanup in preparation for D30703. NFCI
Make the enumerators follow the coding convention and start with OW_...

llvm-svn: 298996
2017-03-29 14:42:27 +00:00
Simon Pilgrim 2845189bd1 [X86][AVX2] Prevent unary interleaving patterns from calling lowerVectorShuffleAsSplitOrBlend (PR32453)
llvm-svn: 298993
2017-03-29 13:00:00 +00:00
Simon Pilgrim b670ba4e87 [AMDGPU] Tidy up computeKnownBitsForTargetNode/ComputeNumSignBitsForTargetNode arguments. NFCI.
Based on comment in D31249.

llvm-svn: 298991
2017-03-29 12:09:25 +00:00
Simon Pilgrim ebd433d9fc [X86] Removed old comment. NFCI.
No longer makes sense as the previous opcode mnemonic it was referring to is long gone.

llvm-svn: 298988
2017-03-29 10:44:51 +00:00
Sven van Haastregt 039a6d9f9f [MachineVerifier] Avoid reference to nullptr
Instantiation of the MachineVerifierPass through
PassInfo::getNormalCtor would yield a segfault since the default
constructor of the MachineVerifierPass takes a reference to nullptr.

Patch by Simone Pellegrini.

Differential Revision: https://reviews.llvm.org/D31387

llvm-svn: 298987
2017-03-29 09:08:25 +00:00
Eric Christopher 5829741c46 Move the x86 cpu feature rtm from Haswell to Skylake matching clang commit r298956.
llvm-svn: 298986
2017-03-29 07:40:44 +00:00
Craig Topper d9f51350b8 [AVX-512] Remove explicit KMOVWrk from isel patterns. COPY_TO_REGCLASS to GR32 is enough.
llvm-svn: 298985
2017-03-29 07:31:56 +00:00
Craig Topper d284606327 [AVX-512] Remove explicit KMOVWrk/KMOVWKr instructions from patterns where we can just use COPY_TO_REGCLASS instead.
This will result in a KMOVW or KMOVD being emitted during register allocation. And in at least some cases this might allow the register coalescer to remove the copy all together.

llvm-svn: 298984
2017-03-29 06:55:28 +00:00
Dean Michael Berris 60c2487874 [XRay] Update FDR log reader to be aware of buffer sizes per thread.
Summary:
It is problematic for this reader that it expects to read data from
several threads, but the header or message format does not define
framing. Since the buffers are reused, we can't rely on skipping
zeroed out data as a synchronization method either.

There is an argument that this is not version compatible with the format
the reader expected previously. I argue that since the writer wrote garbage
past the end of buffer record, there is no currently working reader to
compromise.

The corresponding writer change is posted to D31384.

Reviewers: dberris, pelikan

Reviewed By: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31385

llvm-svn: 298983
2017-03-29 06:10:12 +00:00
Adam Nemet 92a5cf4366 [SDAG] Remove -enable-fmf-dag
This is no longer needed as spotted by Sanjay in
https://reviews.llvm.org/D31165.

llvm-svn: 298963
2017-03-28 23:46:14 +00:00
Adam Nemet 6820f391eb [SDAG] Add AllowContract to SNodeFlags
Properly propagate the FMF from the LLVM IR to this flag.

This is toward moving fp-contraction=fast from an LLVM TargetOption to a
FastMathFlag in order to fix PR25721.

Differential Revision: https://reviews.llvm.org/D31165

llvm-svn: 298961
2017-03-28 23:46:08 +00:00
Peter Collingbourne 192d8520de More accurate header inclusions. NFC.
llvm-svn: 298960
2017-03-28 23:35:34 +00:00
Craig Topper 331297c62e [AVX-512] Punt on fast-isel of truncates to i1 when AVX512 is enabled.
We should be masking the value and emitting a register copy like we do in non-fast isel. Instead we were just updating the value map and emitting nothing.

After r298928 we started seeing cases where we would create a copy from GR8 to GR32 because the source register in a VK1 to GR32 copy was replaced by the GR8 going into a truncate.

This fixes PR32451.

llvm-svn: 298957
2017-03-28 23:20:37 +00:00
Guozhi Wei f8d40181c9 [PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.

This patch fixed PR32442.

Differential Revision: https://reviews.llvm.org/D31407

llvm-svn: 298955
2017-03-28 22:55:01 +00:00
Sanjay Patel a41a5c29f0 [DAGCombiner] reduce code duplication with local variables; NFCI
llvm-svn: 298954
2017-03-28 22:45:53 +00:00
Peter Collingbourne 0d56b959ad LTO: Replace InputFile::Symbol::getFlags() with predicate accessors. NFC.
This makes the predicates independent of the flag representation
and makes the code a little easier to read.

llvm-svn: 298951
2017-03-28 22:31:35 +00:00
Sanjay Patel 9747d8070b [DAG] fix formatting; NFC
llvm-svn: 298950
2017-03-28 22:25:25 +00:00
Sanjay Patel d832eddde5 [DAGCombiner] remove redundant conditions and duplicated code; NFCI
llvm-svn: 298949
2017-03-28 22:22:50 +00:00
Stanislav Mekhanoshin baf31ac7c8 [AMDGPU] Boost unroll threshold for loops reading local memory
This is less important than increase threshold for private memory,
but still brings performance improvements in a wide range of tests.
Unrolling more for local memory serves three purposes: it allows
to combine ds operations if offset becomes static, saves registers
used for offsets in case of static offsets, and allows better lds
latency hiding.

Differential Revision: https://reviews.llvm.org/D31412

llvm-svn: 298948
2017-03-28 22:13:51 +00:00
Stanislav Mekhanoshin b933c3f554 [AMDGPU] Fix recorded region boundaries in max-occupancy scheduler
This is incorrect to record region boundaries before scheduling,
it may change after scheduling. As a result second pass may see less
instructions to schedule than it should.

Differential Revision: https://reviews.llvm.org/D31434

llvm-svn: 298945
2017-03-28 21:48:54 +00:00
Sanjay Patel d2a26db991 [DAGCombiner] rename variables in foldAndOfSetCCs for easier reading; NFCI
llvm-svn: 298944
2017-03-28 21:40:41 +00:00
Simon Pilgrim c7c5aa47cf [X86][MMX] Match MMX fp_to_sint conversions from XMM registers
We currently perform the various fp_to_sint XMM conversion and then transfer to the MMX register (on 32-bit via the stack).

This patch improves support for MOVDQ2Q XMM to MMX transfers and adds the XMM->MMX fp_to_sint direct conversion patterns. The SSE2 specifications are the same as for XMM->XMM and XMM->MMX rounding/exceptions/etc.

Differential Revision: https://reviews.llvm.org/D30868

llvm-svn: 298943
2017-03-28 21:32:11 +00:00
Matt Arsenault 323b021b5e Fix crashing on TargetCustom PseudoSourceValues
Default to something more reasonable if printCustom isn't implemented.

llvm-svn: 298941
2017-03-28 20:33:12 +00:00
Sanjay Patel 3230e4be11 [DAGCombiner] clean up foldAndOfSetCCs; NFCI
1. Fix bogus comment.
2. Early exit to reduce indent.
3. Change node pointer param to what it really is: an SDLoc.

llvm-svn: 298940
2017-03-28 20:28:16 +00:00
Adam Nemet cd847a8f30 [IR] Add AllowContract to FastMathFlags
-ffp-contract=fast does not currently work with LTO because it's passed as a
TargetOption to the backend rather than in the IR. This adds it to
FastMathFlags.

This is toward fixing PR25721

Differential Revision: https://reviews.llvm.org/D31164

llvm-svn: 298939
2017-03-28 20:11:52 +00:00
Sanjay Patel 16af53a395 [DAGCombiner] add helper function for and-of-setcc folds; NFC
This is just a cut and paste followed by clang-format. Clean up to follow.

llvm-svn: 298938
2017-03-28 19:58:46 +00:00
Mehdi Amini b5a46c1f45 Add support for -fno-builtin to LTO and ThinLTO to libLTO
Reviewers: tejohnson, pcc

Subscribers: Prazek, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D30791

llvm-svn: 298936
2017-03-28 18:55:44 +00:00
Stanislav Mekhanoshin 9053f22eeb [AMDGPU] Split -amdgpu-early-inline-all option
Previously it was covered by the internalization. It turns out we cannot
run internalizer in FE, it break separate compilation tests. Thus early
inliner gets its own option.

Differential Revision: https://reviews.llvm.org/D31429

llvm-svn: 298935
2017-03-28 18:23:24 +00:00
Sanjay Patel f01a1dad7f [x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality
Follow-up to:
https://reviews.llvm.org/rL298775

llvm-svn: 298933
2017-03-28 17:23:49 +00:00