Commit Graph

48692 Commits

Author SHA1 Message Date
Max Kazantsev b6d40067af [SCEV] Enhance SCEVFindUnsafe for division
This patch allows SCEVFindUnsafe algorithm to tread division by any non-positive
value as safe. Previously, it could only recognize non-zero constants.

Differential Revision: https://reviews.llvm.org/D39228

llvm-svn: 316568
2017-10-25 11:07:43 +00:00
Clement Courbet 0c7cd071f7 Re-land "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)"
Compute the actual decomposition only after deciding whether to expand
of not. Else, it's easy to make the compiler OOM with:
`memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if
someone mistakenly passes a negative value. Add a test.

This reverts commit f8fc02fbd4ab33383c010d33675acf9763d0bd44.

llvm-svn: 316567
2017-10-25 11:02:09 +00:00
George Rimar 0be860f695 [llvm-dwarfdump] - Fix array out of bounds access crash.
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.

Differential revision: https://reviews.llvm.org/D39185

llvm-svn: 316566
2017-10-25 10:23:49 +00:00
Sam Parker ccb209bb97 [ARM] Swap cmp operands for automatic shifts
Swap the compare operands if the lhs is a shift and the rhs isn't,
as in arm and T2 the shift can be performed by the compare for its
second operand.

Differential Revision: https://reviews.llvm.org/D39004

llvm-svn: 316562
2017-10-25 08:33:06 +00:00
Martin Storsjo 373c8efa1e [AArch64] Add support for dllimport of values and functions
Previously, the dllimport attribute did the right thing in terms
of treating it as a pointer to a value, but this makes sure the
names get mangled properly, and calls to such functions load the
function from the __imp_ pointer.

This is based on SVN r212431 and r212430 where the same was
implemented for ARM.

Differential Revision: https://reviews.llvm.org/D38530

llvm-svn: 316555
2017-10-25 07:25:18 +00:00
Matt Arsenault 8a752b77a2 DAG: Fix creating select with wrong condition type
This code added in r297930 assumed that it could create
a select with a condition type that is just an integer
bitcast of the selected type. For AMDGPU any vselect is
going to be scalarized (although the vector types are legal),
and all select conditions must be i1 (the same as getSetCCResultType).

This logic doesn't really make sense to me, but there's
never really been a consistent policy in what the select
condition mask type is supposed to be. Try to extend
the logic for skipping the transform for condition types
that aren't setccs. It doesn't seem quite right to me though,
but checking conditions that seem more sensible (like whether the
vselect is going to be expanded) doesn't work since this
seems to depend on that also.

llvm-svn: 316554
2017-10-25 07:14:07 +00:00
Max Kazantsev 9ac7021a25 [IRCE] Fix intersection between signed and unsigned ranges
IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating
example contained an unsigned latch condition and a signed range check. One of the safe
iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative
value in `IntersectRange` function, this lead to a miscompile under which we deleted a range check
without inserting a postloop where it was needed.

This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more
carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if:
1. The range check is also unsigned, or
2. Safe iteration range of the range check lies within `[0, SINT_MAX]`.
The same is done for signed latch.

Values from `[0, SINT_MAX]` are unambiguous, these values are non-negative under any interpretation,
and all values of a range intersected with such range are also non-negative.

We also use signed/unsigned min/max functions for range intersection depending on type of the
latch condition.

Differential Revision: https://reviews.llvm.org/D38581

llvm-svn: 316552
2017-10-25 06:47:39 +00:00
Mikael Holmen 279790b674 [MemDep] DBG intrinsics don't impact abort limit for call site dependence analysis
Summary:
Memory dependence analysis no longer counts DbgInfoIntrinsics towards the
limit where to abort the analysis. Before, a bunch of calls to dbg.value
could affect the generated code, meaning that with -g we could generate
different code than without.

Reviewers: chandlerc, Prazek, davide, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39181

llvm-svn: 316551
2017-10-25 06:15:32 +00:00
Max Kazantsev 4332a943bc [IRCE] Smarter detection of empty ranges using SCEV
For a SCEV range, this patch replaces the naive emptiness check for SCEV ranges
which looks like `Begin == End` with a SCEV check. The range is guaranteed to be
empty of `Begin >= End`. We should filter such ranges out and do not try to perform
IRCE for them.

For example, we can get such range when intersecting range `[A, B)` and `[C, D)`
where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`.
This range is empty, but its `Begin` does not match with `End`.

Making IRCE for an empty range is basically safe but unprofitable because we
never actually get into the main loop where the range checks are supposed to
be eliminated. This patch uses SCEV mechanisms to treat loops with proved
`Begin >= End` as empty.

Differential Revision: https://reviews.llvm.org/D39082

llvm-svn: 316550
2017-10-25 06:10:02 +00:00
Peter Collingbourne e0900acbd4 Assembly tests require x86 target.
llvm-svn: 316546
2017-10-25 04:24:20 +00:00
Teresa Johnson f99c84d548 [ThinLTO] Make test for promoted names more specific
With r314527, promoted values get a suffix that is a decimal value of
the module hash instead of hex. Change the regex to match only decimal
suffix values.

llvm-svn: 316544
2017-10-25 03:41:31 +00:00
Peter Collingbourne 689e6c052e llvm-readobj: Add support for reading relocations in the Android packed format.
This is in preparation for testing lld's upcoming relocation packing
feature (D39152). I have verified that this implementation correctly
unpacks the relocations from a Chromium DSO built with gold and the
Android relocation packer for ARM32 and ARM64.

Differential Revision: https://reviews.llvm.org/D39272

llvm-svn: 316543
2017-10-25 03:37:12 +00:00
Adrian Prantl 2eb7cbf987 Implement salavageDebugInfo functionality for SelectionDAG.
Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.

The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.

rdar://problem/32121503

llvm-svn: 316525
2017-10-24 22:55:12 +00:00
Artem Belevich cb8f6328dc [NVPTX] allow address space inference for volatile loads/stores.
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.

Differential Revision: https://reviews.llvm.org/D39026

llvm-svn: 316495
2017-10-24 20:31:44 +00:00
Gadi Haber 323f2e1715 [X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU.
Adding the scheduling information for the Browadwell (BDW) CPU target.

This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target.
We used the scheduling information retrieved from the Broadwell architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction.

The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175.

Performance fluctuations may be expected due to code alignment effects.

Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D39054

Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01
llvm-svn: 316492
2017-10-24 20:19:47 +00:00
Justin Bogner 6c452834a1 MIR: Print the register class or bank in vreg defs
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,

  %1(s64) = COPY %0(s64)

would now be written as

  %1:gpr(s64) = COPY %0(s64)

While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.

Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.

llvm-svn: 316479
2017-10-24 18:04:54 +00:00
Stefan Pintilie 8f0c783095 [PowerPC] Try to simplify a Swap if it feeds a Splat
If we have the situation where a Swap feeds a Splat we can sometimes change the
  index on the Splat and then remove the Swap instruction.

Fixed the test case that was failing and recommit after pulling the original
  commit.

  Original revision is here: https://reviews.llvm.org/D39009

llvm-svn: 316478
2017-10-24 17:44:27 +00:00
Simon Pilgrim 5e8c3f328f [X86][AVX] ComputeNumSignBitsForTargetNode - add support for X86ISD::VTRUNC
llvm-svn: 316462
2017-10-24 17:04:57 +00:00
Simon Pilgrim 1bc62f03a5 [SelectionDAG] Add VSELECT support to ComputeNumSignBits
llvm-svn: 316457
2017-10-24 16:38:38 +00:00
Saleem Abdulrasool fb490a0bcc PowerPC: support the separator character in the IAS
PowerPC uses ; as a comment leader and the @ as a separator character.
Support this properly.

llvm-svn: 316454
2017-10-24 16:19:56 +00:00
Simon Pilgrim 0a12c239b6 [X86] truncateVectorCompareWithPACKSS - use PACKSSDW/PACKSSWB instead of just PACKSSWB.
By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts and improves other combines driven by ComputeNumSignBits.

llvm-svn: 316448
2017-10-24 15:38:16 +00:00
Sanjay Patel f762c7b32f [x86] add more vector ISA variants for memcmp expansion; NFC
...because every swiss cheese has different holes.

llvm-svn: 316446
2017-10-24 15:27:47 +00:00
Oliver Stannard 103cca1af7 [ARM] Tighten up CHECK lines in a test
These tests checked for the line number without a leading ":", so for example,
a missed diagnostic on line 123 could match one on line 1123, 2123, etc,
desynchronising the test for hundreds of lines.

This couldn't cause it to incorrectly pass or fail, but made it hard to track
down test failures.

Differential revision: https://reviews.llvm.org/D39238

llvm-svn: 316442
2017-10-24 14:20:13 +00:00
Oliver Stannard 03ded27bbc [ARM] Error for invalid shift in memory operand
Report a diagnostic when we fail to parse a shift in a memory operand because
the shift type is not an identifier. Without this, we were silently ignoring
the whole instruction.

Differential revision: https://reviews.llvm.org/D39237

llvm-svn: 316441
2017-10-24 14:19:08 +00:00
Andrew V. Tischenko f4fbe4a51b Update f16c instruction scheduling on btver2.
Differential Revision: https://reviews.llvm.org/D39051

llvm-svn: 316435
2017-10-24 13:38:30 +00:00
Zvi Rackover 31b101a186 X86CallFrameOptimization: Recognize 'store 0/-1 using and/or' idioms
Summary:
r264440 added or/and patterns for storing -1 or 0 with the intention of decreasing code size. However,
X86CallFrameOptimization does not recognize these memory accesses so it will not replace them with push's when profitable.

This patch fixes this problem by teaching X86CallFrameOptimization these store 0/-1 idioms.

An alternative fix would be to prevent the 'store 0/1 idioms' patterns from firing when accessing the stack. This would save
the need to teach the pass about these idioms. However, because X86CallFrameOptimization does not always fire we may result
in cases where neither X86CallFrameOptimization not the patterns for 'store 0/1 idioms' fire.

Fixes pr34863

Reviewers: DavidKreitzer, guyblank, aymanmus

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38738

llvm-svn: 316431
2017-10-24 12:13:05 +00:00
Bjorn Pettersson 1c043a9f28 [ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP
Summary:
Got asserts in llvm::CastInst::getCastOpcode saying:
`DestBits == SrcBits && "Illegal cast to vector (wrong type or size)"' failed.

Problem seemed to be that llvm::ConstantFoldCastInstruction did
not handle ptrtoint cast of a getelementptr returning a vector
correctly. I assume such situations are quite rare, since the
GEP needs to be considered as a constant value (base pointer
being null).
The solution used here is to simply avoid the constant fold
of ptrtoint when the value is a vector. It is not supported,
and by bailing out we do not fail on assertions later on.

Reviewers: craig.topper, majnemer, davide, filcab, efriedma

Reviewed By: efriedma

Subscribers: efriedma, filcab, llvm-commits

Differential Revision: https://reviews.llvm.org/D38546

llvm-svn: 316430
2017-10-24 12:08:11 +00:00
George Rimar a17480d602 [llvm-dwarfdump] - Cleanup of gnu_call_site.s. NFC.
This change fixes values of test so that it passes 
-verify without errors and also adds comments.
Test was introduced in D39119 and intention was to check
that tool is able to dump few
DW_*GNU_call_site* tags and attributes, so that
change is NFC cleanup.

llvm-svn: 316428
2017-10-24 11:44:19 +00:00
Marek Olsak ce76ea0394 AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

llvm-svn: 316427
2017-10-24 10:27:13 +00:00
Marek Olsak 2114fc3bcb AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

llvm-svn: 316426
2017-10-24 10:26:59 +00:00
Oliver Stannard c507b370a1 [ARM] Remove tCPS alias which just crashed
This alias caused a crash when trying to print the "cps #0" instruction in a
diagnostic for thumbv6 (which doesn't have that instruction).
	    
The comment was incorrect, this instruction is UNPREDICTABLE if no flag bits
are set, so I don't think it's worth keeping.

Differential Revision: https://reviews.llvm.org/D39191

llvm-svn: 316420
2017-10-24 08:55:36 +00:00
Zvi Rackover 3c0d385598 X86: Fix X86CallFrameOptimization to search for the COPY StackPointer
SelectionDAG inserts a copy of ESP into a virtual register.
X86CallFrameOptimization assumed that the COPY, if present, is always
right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a
wrong assumption as the COPY can be located anywhere between the call-frame setup
instruction and its first use. If the COPY happened to be located in a different
location than what X86CallFrameOptimization assumed, visiting it while
processing the call chain would lead to a conservative bail-out.

The fix is quite straightfoward, scan ahead for the stack-pointer copy and make note
of it so it can be ignored while processing the call chain.

Fixes pr34903

Differential Revision: https://reviews.llvm.org/D38730

llvm-svn: 316416
2017-10-24 07:38:29 +00:00
Omer Paparo Bivas 2251c79aba [MC] Adding code padding for performance stability - infrastructure. NFC.
Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding.

Reviewers:
aaboud
zvi
craig.topper
gadi.haber

Differential revision: https://reviews.llvm.org/D34393

Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
2017-10-24 06:16:03 +00:00
Zvi Rackover c6d0b6c103 X86: Register the X86CallFrameOptimization pass
Summary:
The motivation of this change is to enable .mir testing for this pass.
Added one test case to cover the functionality, this same case will be improved by
a future patch.

Reviewers: igorb, guyblank, DavidKreitzer

Reviewed By: guyblank, DavidKreitzer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38729

llvm-svn: 316412
2017-10-24 05:47:07 +00:00
Saleem Abdulrasool 619b3269fd ObjCARC: do not increment past the end of the BB
The `BasicBlock::getFirstInsertionPt` call may return `std::end` for the
BB.  Dereferencing the end iterator results in an assertion failure
"(!NodePtr->isKnownSentinel()), function operator*".  Ensure that the
returned iterator is valid before dereferencing it.  If the end is
returned, move one position backward to get a valid insertion point.

llvm-svn: 316401
2017-10-24 00:09:10 +00:00
Reid Kleckner 0e88118dd7 [codeview] Add support for inlinee lists
This adds type index discovery and dumper support for symbol record kind
0x1168, which is a list of inlined function ids. This symbol kind is
undocumented, but S_INLINEES is consistent with the existing
nomenclature.

Fixes PR34222

llvm-svn: 316398
2017-10-23 23:43:40 +00:00
Jessica Paquette 9df7fde269 [MachineOutliner] Add optimisation remarks for successful outlining
This commit adds optimisation remarks for outlining which fire when a function
is successfully outlined.

To do this, OutlinedFunctions must now contain references to their Candidates.
Since the Candidates must still be sorted and worked on separately, this is
done by working on everything in terms of shared_ptrs to Candidates. This is
good; it means that we can easily move everything to outlining in terms of
the OutlinedFunctions rather than the individual Candidates. This is far more
intuitive than what's currently there!

(Remarks are output when a function is created for some group of Candidates.
In a later commit, all of the outlining logic should be rewritten so that we
loop over OutlinedFunctions rather than over Candidates.)
 

llvm-svn: 316396
2017-10-23 23:36:46 +00:00
Aditya Nandakumar 921f24cef1 [GISel][ARM]: Fix illegal Generic copies in tests
This is in preparation for a verifier check that makes sure
copies are of the same size (when generic virtual registers are involved).

llvm-svn: 316388
2017-10-23 22:53:08 +00:00
Aditya Nandakumar 4dfd2590dc [GISel][AArch64]: Fix illegal Generic copies in tests
This is in preparation for a verifier check that makes sure copies are
of the same size (when generic virtual registers are involved).

llvm-svn: 316387
2017-10-23 22:53:04 +00:00
Rong Xu e1f4245f8d [PM] Add pgo-memop-opt pass to the new pass manager
This pass adds pgo-memop-opt pass to the new pass manager.
It is in the old pass manager but somehow left out in the new pass manager.

Differential Revision: http://reviews.llvm.org/D39145

llvm-svn: 316384
2017-10-23 22:21:29 +00:00
Simon Pilgrim 321e54f72d [X86][SSE] combineBitcastvxi1 - use PACKSSWB directly to pack v8i16 to v16i8
Avoid difficulties determining the number of sign bits later on in shuffle lowering to lower to PACKSS

llvm-svn: 316383
2017-10-23 22:05:02 +00:00
George Burgess IV 8a0e4bc972 Don't crash when we see unallocatable registers in clobbers
This fixes a bug where we'd crash given code like the test-case from
https://bugs.llvm.org/show_bug.cgi?id=30792 . Instead, we let the
offending clobber silently slide through.

This doesn't fully fix said bug, since the assembler will still complain
the moment it sees a crypto/fp/vector op, and we still don't diagnose
calls that require vector regs.

Differential Revision: https://reviews.llvm.org/D39030

llvm-svn: 316374
2017-10-23 20:46:36 +00:00
Stefan Pintilie 52bbd587ac Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"
Revert commit r316366.
Previous commit causes p8-scalar_vector_conversions.ll to fail.

This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92.

llvm-svn: 316371
2017-10-23 20:22:23 +00:00
Krzysztof Parzyszek 6f06b6edff [Hexagon] Return the correct chain edge for i1 function calls
In HexagonISelLowering, there is code to handle the case when
a function returns an i1 type. In this case, we need to generate
extra nodes to copy the result from R0 to a predicate register.

The code was returning the wrong value for the chain edge which
caused an assert "Wrong topological sorting" when converting the
instructions to MIs.

This patch fixes the problem by returning the chain for the final
copy.

Patch by Brendon Cahoon.

llvm-svn: 316367
2017-10-23 19:35:25 +00:00
Stefan Pintilie feafa1d7f0 [PowerPC] Try to simplify a Swap if it feeds a Splat
If we have the situation where a Swap feeds a Splat we can sometimes change the
index on the Splat and then remove the Swap instruction.

Differential Revision: https://reviews.llvm.org/D39009

llvm-svn: 316366
2017-10-23 19:33:31 +00:00
Krzysztof Parzyszek 273678823b [Hexagon] Add extra pattern for S4_addaddi
One combination was missing: add(add(x,y),c).

llvm-svn: 316363
2017-10-23 19:07:50 +00:00
Daniel Sanders d66e0901ae [globalisel][tablegen] Import stores and allow GISel to automatically substitute zero regs like WZR/XZR/$zero.
This patch enables the import of stores. Unfortunately, doing so by itself,
loses an optimization where storing 0 to memory makes use of WZR/XZR.

To mitigate this, this patch also introduces a new feature that allows register
operands to nominate a zero register. When this is done, GlobalISel will
substitute (G_CONSTANT 0) with the nominated register automatically. This
is currently configured to only apply to the stores.

Applying it to GPR32/GPR64 register classes in general will be done after
review see (https://reviews.llvm.org/D39150).

llvm-svn: 316360
2017-10-23 18:19:24 +00:00
Vedant Kumar 35b50a83ab [wasm] readSection: Avoid reading past eof (fixes oss-fuzz #3219)
A wasm file crafted with a bogus section size can trigger an ASan issue
in the DWARFObjInMemory constructor. Nip the problem in the bud when we
read the wasm section.

Found by OSS-Fuzz:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3219

Differential Revision: https://reviews.llvm.org/D38777

llvm-svn: 316357
2017-10-23 18:04:34 +00:00
Simon Pilgrim 6bda996cab [X86][SSE] Regenerate PACKSS tests on 32 + 64-bit targets
llvm-svn: 316354
2017-10-23 17:50:40 +00:00
Sanjay Patel bdc0bf7ba9 [PassManager] add test to show the new PM uses -latesimplifycfg early; NFC
llvm-svn: 316351
2017-10-23 17:30:17 +00:00
Matt Arsenault b791802aef AMDGPU: Fix default range in non-kernel functions
The range should be assumed to be the hardware maximum
if a workitem intrinsic is used in a callable function
which does not know the restricted limit of the calling
kernel.

llvm-svn: 316346
2017-10-23 17:09:35 +00:00
Andrew V. Tischenko 777308b548 Update DPPD/DPPS instruction scheduling on btver2.
Differential Revision: https://reviews.llvm.org/D39046

llvm-svn: 316334
2017-10-23 15:53:30 +00:00
Craig Topper 8f182fdd8b [X86] Add PTWRITE instruction for assembler and disassembler.
llvm-svn: 316333
2017-10-23 15:53:21 +00:00
Craig Topper 5f0339d2f3 [X86] Add RDPID instruction for assembler and disassembler.
llvm-svn: 316332
2017-10-23 15:53:16 +00:00
Simon Pilgrim 32da2f9245 [DAGCombine] Permit combining of shuffles of equivalent splat BUILD_VECTORs
combineShuffleOfScalars is very conservative about shuffled BUILD_VECTORs that can be combined together.

This patch adds one additional case - if both BUILD_VECTORs represent splats of the same scalar value but with different UNDEF elements, then we should create a single splat BUILD_VECTOR, sharing only the UNDEF elements defined by the shuffle mask.

Differential Revision: https://reviews.llvm.org/D38696

llvm-svn: 316331
2017-10-23 15:48:08 +00:00
Simon Pilgrim 03c8753924 [X86][SSE] Regenerate bitcast-and-setcc tests
Avoid the retl/retq changes in an upcoming patch

llvm-svn: 316328
2017-10-23 14:47:49 +00:00
Simon Pilgrim e131cb0bd5 [X86][AVX2] Regenerate AVX2 intrinsics tests on 32 + 64-bit targets
llvm-svn: 316326
2017-10-23 14:19:46 +00:00
Simon Pilgrim c680c4742b [X86][AVX] Regenerate AVX intrinsics tests on 32 + 64-bit targets
llvm-svn: 316325
2017-10-23 14:17:59 +00:00
Simon Pilgrim eae6e9dbc5 [X86][F16C] Regenerate F16C schedule tests
llvm-svn: 316324
2017-10-23 14:15:24 +00:00
Artur Gainullin 610df9c890 Test commit.
llvm-svn: 316322
2017-10-23 13:25:49 +00:00
George Rimar 7fc298afe4 [llvm-dwarfdump] - Teach tool about few GNU call_sites constants.
This teaches tool about following consants: 
DW_TAG_GNU_call_site,
DW_TAG_GNU_call_site_parameter,
DW_AT_GNU_call_site_value,
DW_AT_GNU_all_call_sites.

Constants documented here: https://sourceware.org/elfutils/DwarfExtensions

Differential revision: https://reviews.llvm.org/D39119

llvm-svn: 316321
2017-10-23 11:24:14 +00:00
Ayman Musa 4b2bd5ff5e [X86] Add test for opportunity to use bzhi X86 instruction instead of load+and instructions.
Transformation uploaded for CR in https://reviews.llvm.org/D34141.

llvm-svn: 316320
2017-10-23 10:24:19 +00:00
Andrew V. Tischenko eff4fc0d41 Fix for Bug 30718 - Failure to disassemble certain MOV with rex.R. The issue was in illegal segment register index.
Differential Revision: https://reviews.llvm.org/D38786

llvm-svn: 316319
2017-10-23 09:36:33 +00:00
Sam Parker 487ab86942 [ARM] Allow unrolling of multi-block loops.
Before, loop unrolling was only enabled for loops with a single
block. This restriction has been removed and replaced by:
- allow a maximum of two exiting blocks,
- a four basic block limit for cores with a branch predictor.

Differential Revision: https://reviews.llvm.org/D38952

llvm-svn: 316313
2017-10-23 08:05:14 +00:00
Craig Topper 326008c615 [X86] Fix disassembly of EVEX rounding control and SAE instructions.
Fixes PR31955.

llvm-svn: 316308
2017-10-23 02:26:24 +00:00
Yichao Yu 92c11ee352 Fix invalid ptrtoint in InstCombine
Summary:
It's unclear if this is the only thing we can do but at least this is consistent with the check
of address space agreement in `isBitCastable`.

The code is used at least in both instcombine and jumpthreading though
I could only find a way to trigger the invalid cast in instcombine.

Reviewers: loladiro, sanjoy, majnemer

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34335

llvm-svn: 316302
2017-10-22 20:28:17 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Marina Yatsina f9371d821f Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Sanjay Patel 24226504a7 [SimplifyCFG] try harder to forward switch condition to phi (PR34471)
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:

  int switcher(int x) {
      switch(x) {
      case 17: return 17;
      case 19: return 19;
      case 42: return 42;
      default: break;
      }
      return 0;
    }

  int comparator(int x) {
    if (x == 17) return 17;
    if (x == 19) return 19;
    if (x == 42) return 42;
    return 0;
  }

For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw

Differential Revision: https://reviews.llvm.org/D39011

llvm-svn: 316293
2017-10-22 16:51:03 +00:00
Momchil Velikov d6a4ab3d49 [ARM] Dynamic stack alignment for 16-bit Thumb
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.

Differential revision: https://reviews.llvm.org/D38143

llvm-svn: 316289
2017-10-22 11:56:35 +00:00
Guy Blank 92d5ce3bd4 [X86] Add a pass to convert instruction chains between domains.
The pass scans the function to find instruction chains that define
registers in the same domain (closures).
It then calculates the cost of converting the closure to another domain.
If found profitable, the instructions are converted to instructions in
the other domain and the register classes are changed accordingly.

This commit adds the pass infrastructure and a simple conversion from
the GPR domain to the Mask domain.

Differential Revision:
https://reviews.llvm.org/D37251

Change-Id: Ic2cf1d76598110401168326d411128ae2580a604
llvm-svn: 316288
2017-10-22 11:43:08 +00:00
Nitesh Jain 757f74c2d3 [mips] Adds support for R_MIPS_26, HIGHER, HIGHEST relocations in RuntimeDyld.
Reviewers: sdardis

Subscribers: jaydeep, bhushan, llvm-commits

Differential Revision: https://reviews.llvm.org/D38314

llvm-svn: 316287
2017-10-22 09:47:41 +00:00
Craig Topper e975127db6 [X86] Teach the disassembler that some instructions use VEX.W==0 without a corresponding VEX.W==1 instruction and we shouldn't treat them as if VEX.W is ignored.
Fixes PR11304.

llvm-svn: 316285
2017-10-22 06:18:26 +00:00
Craig Topper 158bc6474a [X86] Don't allow gather/scatter to disassembler if memory operand does not use a SIB byte.
Fixes PR34998.

llvm-svn: 316282
2017-10-22 04:32:30 +00:00
Aaron Ballman fc02869c96 Reverting r316270 due to failing build bots.
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951

llvm-svn: 316276
2017-10-21 20:38:15 +00:00
Simon Pilgrim 3cb024490a [X86][SSE] Add extractps/pextrd equivalence to domain tables
Differential Revision: https://reviews.llvm.org/D39135

llvm-svn: 316274
2017-10-21 20:19:48 +00:00
Craig Topper ca2382d809 [X86] Fix disassembling of EVEX instructions to stop accidentally decoding the SIB index register as an XMM/YMM/ZMM register.
This introduces a new operand type to encode the whether the index register should be XMM/YMM/ZMM. And new code to fixup the results created by readSIB.

This has the nice effect of removing a bunch of code that hard coded the name of every GATHER and SCATTER instruction to map the index type.

This fixes PR32807.

llvm-svn: 316273
2017-10-21 20:03:20 +00:00
Fangrui Song c7b749bd06 [PPC CodeGen] Fix the bitreverse.i64 intrinsic.
Summary: The two 32-bit words were swapped.

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D38705

llvm-svn: 316270
2017-10-21 16:59:40 +00:00
Simon Pilgrim 7025b07828 [X86][SSE] Add missing extractps scheduling test
llvm-svn: 316262
2017-10-21 14:35:09 +00:00
David Green 907b60fbba [LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.

Differential Revision: https://reviews.llvm.org/D38682

llvm-svn: 316261
2017-10-21 13:58:37 +00:00
Craig Topper fcf27188d7 [X86] Do not generate __multi3 for mul i128 on X86
Summary: __multi3 is not available on x86 (32-bit). Setting lib call name for MULI_128 to nullptr forces DAGTypeLegalizer::ExpandIntRes_MUL to generate instructions for 128-bit multiply instead of a call to an undefined function.  This fixes PR20871 though it may be worth looking at why licm and indvars combine to generate 65-bit multiplies in that test.

Patch by Riyaz V Puthiyapurayil

Reviewers: craig.topper, schweitz

Reviewed By: craig.topper, schweitz

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D38668

llvm-svn: 316254
2017-10-21 02:26:00 +00:00
Krzysztof Parzyszek 9d19c8cac9 [Packetizer] Add function to check for aliasing between instructions
llvm-svn: 316243
2017-10-20 22:08:40 +00:00
Sam Clegg 12fd3da9d1 [WebAssembly] MC: Fix crash when -g specified.
At this point we don't output any debug sections or thier
relocations.

Differential Revision: https://reviews.llvm.org/D39076

llvm-svn: 316240
2017-10-20 21:28:38 +00:00
Daniel Sanders 1e4569fdc1 [globalisel][tablegen] Fix small spelling nits. NFC
ComplexRendererFn -> ComplexRendererFns
Corrected a couple lingering references to tied operands that were missed.

llvm-svn: 316237
2017-10-20 20:55:29 +00:00
Krzysztof Parzyszek 022922b31a [Hexagon] Report error instead of crashing on wrong inline-asm constraints
llvm-svn: 316236
2017-10-20 20:24:44 +00:00
Krzysztof Parzyszek 64e5d7d3ae [Hexagon] Reorganize and update instruction patterns
llvm-svn: 316228
2017-10-20 19:33:12 +00:00
Simon Pilgrim 1311ff1340 [X86][SSE] Add missing _mm_extract_ps fast-isel test
llvm-svn: 316226
2017-10-20 19:29:01 +00:00
Sanjay Patel bb94161fb7 [x86] avoid FileCheck assert duplication with retl/retq regex; NFC
This was suggested in PR35003:
https://bugs.llvm.org/show_bug.cgi?id=35003

32-bit checks may be identical to 64-bit (if we avoid those pesky scalar params!).

I'll check in the script change shortly assuming this doesn't anger any bots.

llvm-svn: 316223
2017-10-20 18:35:32 +00:00
Dave Lee f9b72327b0 Make x86 __ehhandler comdat if parent function is
Summary:
This change comes from using lld for i686-windows-msvc. Before this change, lld
emits an error of:

    error: relocation against symbol in discarded section: .xdata

It's possible that this could be addressed in lld, but I think this change is
reasonable on its own.

At a high level, this is being generated:

    A (.text comdat) -> B (.text) -> C (.xdata comdat)

Where A is a C++ inline function, which references B, an exception handler
thunk, which references C, the exception handling info.

With this structure, lld will error when applying relocations to B if the C it
references has been discarded (some other C has been selected).

This change checks if A is comdat, and if so places the exception registration
thunk (B) in the comdata group of A (and B).

It appears that MSVC makes the __ehhandler function comdat.

Is it possible that duplicate thunks are being emitted into the final binary
with other linkers, or are they stripping the unused thunks?

Reviewers: rnk, majnemer, compnerd, smeenai

Reviewed By: rnk, compnerd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38940

llvm-svn: 316219
2017-10-20 17:04:43 +00:00
Krzysztof Parzyszek 3818aeaeb9 [Hexagon] Allow redefinition with immediates for hw loop conversion
Normally, if the registers holding the induction variable's bounds
are redefined inside of the loop's body, the loop cannot be converted
to a hardware loop. However, if the redefining instruction is actually
loading an immediate value into the register, this conversion is both
possible and legal (since the immediate itself will be used in the
loop setup in the preheader).

llvm-svn: 316218
2017-10-20 16:56:33 +00:00
Simon Pilgrim b6b617b7d8 [X86] Check all CPU target names.
We ignore the 32-bit/64-bit triple but I've tried to use i686 triples for CPUs that don't support x86_64

llvm-svn: 316217
2017-10-20 16:55:51 +00:00
Zvi Rackover e95709d54a X86 Tests: Add tests for vector permutes with variable indices. NFC.
Basic tests which are the equivalent of single-source shufflevector with variable mask.

llvm-svn: 316216
2017-10-20 15:32:14 +00:00
Aleksandar Beserminji 143572984d Revert "[mips] Reordering callseq* nodes to be linear"
This reverts commit r314507, because the original patch is causing test
failures.

llvm-svn: 316215
2017-10-20 14:35:41 +00:00
Eugene Leviant 27b226fb65 [ARM] Use post-RA MI scheduler when +use-misched is set
Differential revision: https://reviews.llvm.org/D39100

llvm-svn: 316214
2017-10-20 14:29:17 +00:00
Simon Pilgrim 46b791921f [X86][AVX512] Regenerate regcall tests.
As part of tracking down machine verifier issues (PR27481)

llvm-svn: 316213
2017-10-20 14:13:02 +00:00
Nikolai Bozhenov fa8c5514c5 [ValueTracking] Enabling ValueTracking patch by default
(recommit #2 after checking for timeout issue). 

The original patch was an improvement to IR ValueTracking on
non-negative integers. It has been checked in to trunk (D18777,
r284022). But was disabled by default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.

Reviewers: reames, hfinkel
 
Differential Revision: https://reviews.llvm.org/D34101
 
Patch by: Olga Chupina <olga.chupina@intel.com>

llvm-svn: 316208
2017-10-20 10:08:47 +00:00
Max Kazantsev 1c839629aa Add test case for LoopSink pass
This test checks that load from constant memory will be sunk regardless of
aliasing stores in the loop.

Patch by Daniil Suchkov!

Differential Revision: https://reviews.llvm.org/D39113

llvm-svn: 316207
2017-10-20 06:40:48 +00:00
Dylan McKay 6670e42402 [AVR] Fix the select-mbb-placement-bug.ll
llvm-svn: 316205
2017-10-20 04:17:14 +00:00
Lang Hames 716a142940 [ExecutionEngine] Temporarily remove the ExecutionEngine tls tests.
Will re-enable once I figure out why the necessary runtime functions are
missing on some bots.

llvm-svn: 316203
2017-10-20 01:18:00 +00:00
Lang Hames 8eec91e96d [ExecutionEngine] After a heroic dev-meeting hack session, the JIT supports TLS.
Turns on EmulatedTLS support by default in EngineBuilder. ;)

llvm-svn: 316200
2017-10-20 00:53:16 +00:00
Nemanja Ivanovic 0026c06e11 Disabling the transformation introduced in r315888
The commit at https://reviews.llvm.org/rL315888 is causing some failures
with internal testing. Disabling this code until we can resolve the issues.

llvm-svn: 316199
2017-10-20 00:36:46 +00:00
Alex Bradbury 8971842f43 [RISCV] Initial codegen support for ALU operations
This adds the minimum necessary to support codegen for simple ALU operations
on RV32. Prolog and epilog insertion, support for memory operations etc etc 
follow in future patches.

Leave guessInstructionProperties=1 until https://reviews.llvm.org/D37065 is 
reviewed and lands.

Differential Revision: https://reviews.llvm.org/D29933

llvm-svn: 316188
2017-10-19 21:37:38 +00:00
Simon Pilgrim e8e2c4c0cf [X86][AES] Test AES intrinsics on 32/64-bit targets with/without VEX encoding
Don't just test on 32-bit

llvm-svn: 316176
2017-10-19 19:05:04 +00:00
Graham Yiu 488782efa3 The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost.
Committing on behalf on Brad Nemanich (brad.nemanich@ibm.com)

Differential Revision: https://reviews.llvm.org/D38961

llvm-svn: 316174
2017-10-19 18:16:31 +00:00
Krzysztof Parzyszek e4d0e199bf [Hexagon] Fix store conversion from rr to io in optimize addressing modes
llvm-svn: 316170
2017-10-19 16:59:22 +00:00
Saleem Abdulrasool 1261151912 ExecutionEngine: adjust COFF i386 tautological asserts
Modify static_casts to not be tautological in some COFF i386
relocations.

Patch by Alex Langford!

llvm-svn: 316169
2017-10-19 16:57:40 +00:00
Alex Bradbury 3c941e7ed9 [RISCV] RISCVAsmParser: early exit if RISCVOperand isn't immediate as expected
This is necessary to avoid an assertion in the included test case and similar 
assembler inputs.

llvm-svn: 316168
2017-10-19 16:22:51 +00:00
Nikolai Bozhenov 8dcab54cb4 Revert r315992 because of a found miscompilation failure
llvm-svn: 316164
2017-10-19 15:36:18 +00:00
Simon Pilgrim fdd63d1535 [X86] Replace custom scalar integer absolute matching with ISD::ABS lowering.
x86 has its own copy of integer absolute pattern matching to combine directly to a SUB+CMOV.

This patch removes the x86 combine and adds custom lowering support for ISD::ABS instead, allowing us to use the DAGCombiner version.

Additional test cases are already covered by iabs.ll (rL315706 and rL315711).

Differential Revision: https://reviews.llvm.org/D38895

llvm-svn: 316162
2017-10-19 15:02:24 +00:00
Simon Pilgrim d0649f978f [X86] Add scalar (abs (abs x)) -> (abs x) combine test.
Before landing D38895

llvm-svn: 316160
2017-10-19 14:59:26 +00:00
Diana Picus 7bf71008aa [ARM GlobalISel] Fix liveins in test. NFC
llvm-svn: 316155
2017-10-19 09:28:19 +00:00
Diana Picus a993859335 [ARM GlobalISel] Remove redundant tests
These test cases don't really add anything that isn't covered by other
tests as well, so we can safely remove them.

llvm-svn: 316154
2017-10-19 08:50:28 +00:00
Rafael Espindola 55680d0add Fix buffer overflow.
We were reading past the end of the buffer.

llvm-svn: 316143
2017-10-19 01:25:48 +00:00
Justin Bogner 876ad287d1 GISel: Canonicalize select tests using update_mir_test_checks
This runs `udpate_mir_test_checks --add-vreg-checks` on the tests taht
are already more or less in the format that generates, so that there
will be less churn in some upcoming changes.

llvm-svn: 316139
2017-10-18 23:33:31 +00:00
Justin Bogner f8dc015bd1 AArch64/GISel: Modernize the localizer test
llvm-svn: 316138
2017-10-18 23:26:24 +00:00
Justin Bogner d45849f703 Canonicalize a large number of mir tests using update_mir_test_checks
This converts a large and somewhat arbitrary set of tests to use
update_mir_test_checks. I ran the script on all of the tests I expect
to need to modify for an upcoming mir syntax change and kept the ones
that obviously didn't change the tests in ways that might make it
harder to understand.

llvm-svn: 316137
2017-10-18 23:18:12 +00:00
Sanjoy Das 2f27456c82 Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain."
This reverts commit r316054.  There was some confusion over the review process:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html

llvm-svn: 316129
2017-10-18 22:00:57 +00:00
Sam Clegg 799f55cb32 Fix lit.site.cfg.py.in after rL316123
llvm-svn: 316126
2017-10-18 20:46:05 +00:00
Dylan McKay 443695f80a [AVR] Fix the select_mbb_placement_bug.ll test
llvm-svn: 316124
2017-10-18 20:04:57 +00:00
Sam Clegg 0be459e066 Don't set static-libs test feature when using LLVM_LINK_LLVM_DYLIB
This was causing execname-options.ll to fail on the wasm
waterfall.

Differential Revision: https://reviews.llvm.org/D39022

llvm-svn: 316123
2017-10-18 19:37:30 +00:00
Vedant Kumar 9cbd33fec9 [llvm-cov] Suppress sub-line highlights in simple cases
llvm-cov tends to highlight too many regions because its policy is to
highlight all region entry segments. This can look confusing to users:
not all region entry segments are interesting and deserve highlighting.
Emitting these highlights only when the region count differs from the
line count is a more user-friendly policy.

llvm-svn: 316109
2017-10-18 18:52:29 +00:00
Vedant Kumar 988faf87f8 [llvm-cov] Highlight gaps in consecutive uncovered regions
llvm-cov typically doesn't highlight gap segments, but it should if the
gap occurs after an uncovered region in order to preserve continuity.

llvm-svn: 316107
2017-10-18 18:52:27 +00:00
Sumanth Gundapaneni e1983bcf55 [Hexagon] New HVX target features.
This patch lets the llvm tools handle the new HVX target features that
are added by frontend (clang). The target-features are of the form
"hvx-length64b" for 64 Byte HVX mode, "hvx-length128b" for 128 Byte mode HVX.
"hvx-double" is an alias to "hvx-length128b" and is soon will be deprecated.
The hvx version target feature is upgated form "+hvx" to "+hvxv{version_number}.
Eg: "+hvxv62"

For the correct HVX code generation, the user must use the following
target features.
For 64B mode: "+hvxv62" "+hvx-length64b"
For 128B mode: "+hvxv62" "+hvx-length128b"

Clang picks a default length if none is specified. If for some reason,
no hvx-length is specified to llvm, the compilation will bail out.
There is a corresponding clang patch.

Differential Revision: https://reviews.llvm.org/D38851

llvm-svn: 316101
2017-10-18 18:07:07 +00:00
Konstantin Zhuravlyov 8d5e9e110c AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency
Differential Revision: https://reviews.llvm.org/D38957

llvm-svn: 316097
2017-10-18 17:31:09 +00:00
Alex Bradbury 13ce95b77f [RISCV] Bugfix createRISCVELFObjectWriter
r315275 set the IsLittleEndian parameter incorrectly. This patch corrects 
this, and adds a test to ensure such mistakes will be caught in the future.

llvm-svn: 316091
2017-10-18 16:11:31 +00:00
Justin Bogner 2ac32cc9ce AArch64/GISel: Fix a couple of tests that were testing the wrong thing
Fix a couple of tests that were extending the wrong vreg, and
regenerate their checks with update_mir_test_checks. This looks like
it was a copy-paste or test update error.

llvm-svn: 316087
2017-10-18 15:34:33 +00:00
Andre Vieira d4a25707f0 [ARM] Fix disassembly for conditional VMRS and VMSR instructions in ARM mode
Differential Revision: https://reviews.llvm.org/D38347

llvm-svn: 316085
2017-10-18 14:47:37 +00:00
Simon Dardis 03c2c65b2d [mips] Fix analyzeBranch to handle debug data
In the case where there was a conditional branch followed by a unconditional
branch with debug instruction separating them, MipsInstrInfo::analyzeBranch
would not skip past debug instruction when searching for the second branch
which give erroneous results about the control flow of the block.

This could lead to the branch folder to merge the non-fall through case
into it's predecessor, leaving the conditional branch with a dangling
basic block operand.

This resolves PR34975.

Thanks to Alexander Richardson for reporting the issue!

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D39003

llvm-svn: 316084
2017-10-18 14:35:29 +00:00
Simon Dardis 77bf0fd59c [mips] Move test to correct directory. NFCI
llvm-svn: 316081
2017-10-18 13:59:48 +00:00
Michael Zuckerman 7ba046c784 Adding new test for
bug fix 316067 https://bugs.llvm.org/show_bug.cgi?id=34978

This test checks that the x86-interleaved ends without any
assertion.

Change-Id: I1e970482a4d0404516cbc85517fc091bb21c35a8
llvm-svn: 316080
2017-10-18 13:51:31 +00:00
Michael Zuckerman 49293264cc [AVX512][AVX2]Cost calculation for interleave load/store patterns {v8i8,v16i8,v32i8,v64i8}
This patch adds accurate instructions cost.
The formula presents two cases(stride 3 and stride 4) and calculates the cost according to the VF and stride.

Reviewers:
1. delena
2. Farhana
3. zvi
4. dorit
5. Ayal

Differential Revision: https://reviews.llvm.org/D38762

Change-Id: If4cfbd4ac0e63694e8144cb78c7fa34850647ff7
llvm-svn: 316072
2017-10-18 11:41:55 +00:00
Hiroshi Inoue 5388e66d3a [PowerPC] Use helper functions to check sign-/zero-extended value
Helper functions to identify sign- and zero-extending machine instruction is introduced in rL315888.
This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also makes possible more optimizations since the helper can do more analysis than the original check code; I observed about 5000 more compare instructions are eliminated while building LLVM.

Also, this patch fixes a bug in helpers on ANDIo instruction handling due to the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr.

Differential Revision: https://reviews.llvm.org/D38988

llvm-svn: 316071
2017-10-18 10:31:19 +00:00
Nikolai Bozhenov 74c047eabb Improve lookThroughCast function.
Summary:
When we have the following case:

  %cond = cmp iN %x, CmpConst
  %tr = trunc iN %x to iK
  %narrowsel = select i1 %cond, iK %t, iK C

We could possibly match only min/max pattern after looking through cast.
So it is more profitable if widened C constant will be equal CmpConst.
That is why just set widened C constant equal to CmpConst, because there
is a further check in this function that trunc CmpConst == C.

Also description for lookTroughCast function was added.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38536

Patch by: Artur Gainullin <artur.gainullin@intel.com>

llvm-svn: 316070
2017-10-18 09:28:09 +00:00
Jatin Bhateja 1fc49627e4 [ScalarEvolution] Handling for ICmp occuring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 Currently scope of evaluation is limited to SCEV computation for
 PHI nodes.

 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

llvm-svn: 316054
2017-10-18 01:36:16 +00:00
Adrian Prantl 5a82f0a470 Verifier: Ignore CUs pulled in by ODR-uniqued types.
When more than one Module is imported into the same context, such as during
an LTO build before linking the modules, ODR type uniquing may cause types
to point to a different CU. This check does not make sense in this case.

This fixes the error reported in PR34944.

https://bugs.llvm.org/show_bug.cgi?id=34944
rdar://problem/34940685

This reapplies a cleaner implementation of r316049.

llvm-svn: 316052
2017-10-18 01:11:01 +00:00
Adrian Prantl fe8226fd94 Revert "Verifier: Ignore CUs pulled in by ODR-uniqued types."
This reverts commit r316049.

llvm-svn: 316050
2017-10-18 00:54:31 +00:00
Adrian Prantl f9a1cf6dcc Verifier: Ignore CUs pulled in by ODR-uniqued types.
When more than one Module is imported into the same context, such as during
an LTO build before linking the modules, ODR type uniquing may cause types
to point to a different CU. This check does not make sense in this case.

This fixes the error reported in PR34944.

https://bugs.llvm.org/show_bug.cgi?id=34944
rdar://problem/34940685

llvm-svn: 316049
2017-10-18 00:49:31 +00:00
Wei Ding 7ab1f7a421 AMDGPU : Fix an error for the llvm.cttz implementation.
Differential Revision: http://reviews.llvm.org/D39014

llvm-svn: 316037
2017-10-17 21:49:52 +00:00
Tim Northover 350a87eaf1 AArch64: account for possible frame index operand in compares.
If the address of a local is used in a comparison, AArch64 can fold the
address-calculation into the comparison via "adds". Unfortunately, a couple of
places (both hit in this one test) are not ready to deal with that yet and just
assume the first source operand is a register.

llvm-svn: 316035
2017-10-17 21:43:52 +00:00
Simon Pilgrim 7cd4e2c96f [X86][SSE] Tests packuswb/truncation codegen from PR34773
llvm-svn: 316033
2017-10-17 21:14:53 +00:00
Konstantin Zhuravlyov 7dabe9ced7 AMDGPU: Start generating metadata for MaxFlatWorkGroupSize
Differential Revision: https://reviews.llvm.org/D38958

llvm-svn: 316024
2017-10-17 20:03:21 +00:00
Sanjay Patel 94c0eb031c [ARM, AArch64] adjust tests trying to maintain their objective; NFC
A smarter compiler will see that these might be better without a jump table
if we're just using the constant values of the switch.

llvm-svn: 316012
2017-10-17 16:54:56 +00:00
Sanjay Patel 6d172f2d72 [SimplifyCFG] add test for part of PR34471 (switch squashing); NFC
llvm-svn: 316008
2017-10-17 15:56:42 +00:00
Sanjay Patel 6ed5c91422 [SimplifyCFG] update test to use auto-generated FileCheck asserts; NFC
llvm-svn: 316006
2017-10-17 15:50:47 +00:00
Gadi Haber 85d99b4310 [X86][Broadwell] Added the broadwell cpu to the scheduling regression tests.<NFC>
NFC.
Added the Broadwell cpu and the BROADWELL prefix to all the scheduling regression tests, as part of prepartion for a larger commit of adding all Broadwell scheduiling.

Reviewers: RKSimon, zvi, aaboud
Differential Revision: https://reviews.llvm.org/D38994

Change-Id: I54bc9065168844c107b1729fcdc1d311ce3ea0a9
llvm-svn: 315998
2017-10-17 13:45:39 +00:00
Nikolai Bozhenov 346f4329c4 Improve clamp recognition in ValueTracking.
Summary:
ValueTracking was recognizing not all variations of clamp. Swapping of
true value and false value of select was added to fix this problem. This
change breaks the canonical form of cmp inside the matchMinMax function,
that is why additional checks for compare predicates is needed. Added
corresponding test cases.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38531

Patch by: Artur Gainullin <artur.gainullin@intel.com>

llvm-svn: 315992
2017-10-17 11:50:48 +00:00
Yichao Yu a18b0b1817 Fix implicit null check with negative offset
Summary:
It seems that negative offset was accidentally allowed in D17967.
AFAICT small negative offset should be valid (always raise segfault) on all archs that I'm aware of (especially x86, which is the only one with this optimization enabled) and such case can be useful when loading hiden metadata from an object.

However, like the positive side, it should only be done within a certain limit.
For now, use the same limit on the positive side for the negative side.
A separate option can be added if needs appear.

Reviewers: mcrosier, skatkov

Reviewed By: skatkov

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D38925

llvm-svn: 315991
2017-10-17 11:47:36 +00:00
Gadi Haber 3020490aac [X86][Skylake] fixed/updated regression test mmx-schedule.ll which failed after r315978.
Change-Id: I60cd7e03ea6c3d9a3dc661a882458e83feca66e3
llvm-svn: 315985
2017-10-17 10:00:08 +00:00
Andrew V. Tischenko 5bfbfb48b5 More tests with x86 prefixes which work after rL315899 commit
llvm-svn: 315983
2017-10-17 08:49:47 +00:00
Gadi Haber 1e0f1f476a [X86][SKL] Updated scheduling information for the SkylakeClient target
Updated the scheduling information for the SkylakeClient target with the following changes:

1. regrouped the instructions after adding load and store latencies.
2. regrouped the instructions after adding identified missing ports in several groups.
The changes were made after revisiting the latencies impact of all the load and store uOps.

Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D38727

Change-Id: I778a308cc11e490e8fa5e27e2047412a1dca029f
llvm-svn: 315978
2017-10-17 06:47:04 +00:00
Max Kazantsev 4faa509bb1 Remove a test after revert of rL315440
llvm-svn: 315977
2017-10-17 06:43:31 +00:00
Max Kazantsev 20fc63351d [NFC] Add test from bug 34937
llvm-svn: 315976
2017-10-17 06:37:58 +00:00
Philip Reames 6a7bbfb2e2 Revert 315440 on behalf of mkazantsev
This patch reverts rL315440 because of the bug described at
https://bugs.llvm.org/show_bug.cgi?id=34937

The fix for the bug is on review as D38944, but not yet ready.  Given this is a regression reverting until a fix is ready is called for.

Max would have done the revert himself, but is having trouble doing a build of fresh LLVM for some reason.  I did the build and test to ensure the revert worked as expected on his behalf.

llvm-svn: 315974
2017-10-17 06:21:07 +00:00
Daniel Sanders 3229217620 [globalisel][tablegen] Add a GIM_CheckIsSameOperand test where OtherInsnID and OtherOpIdx differ
llvm-svn: 315972
2017-10-17 05:24:44 +00:00
Craig Topper 341f2ab444 [X86] Add masked palignr tests to vector-shuffle-masked.ll
llvm-svn: 315971
2017-10-17 04:17:56 +00:00
Craig Topper 19f2f49ef1 [X86] Add AVX512BW to the vector-shuffle-masked test to prepare for an upcoming commit.
llvm-svn: 315970
2017-10-17 04:17:55 +00:00
Shoaib Meenai a42f60e7f7 [ExecutionEngine] Correct the size of a write in a COFF i386 relocation
We want to be writing a 32bit value, so we should be writing 4 bytes
instead of 2.

Patch by Alex Langford <apl@fb.com>.

Differential Revision: https://reviews.llvm.org/D38872

llvm-svn: 315964
2017-10-17 01:41:14 +00:00
Vedant Kumar 4d1969f22b [llvm-cov] Add one correction to r315960 (PR34962)
In r315960, I accidentally assumed that the first line segment is
guaranteed to be the non-gap region entry segment (given that one is
present). It can actually be any segment on the line, and the test I
checked in demonstrates that.

llvm-svn: 315963
2017-10-17 01:34:41 +00:00
Reid Kleckner 57b7d4fad7 Try to make crlf portable to other printf implementations
llvm-svn: 315961
2017-10-17 00:27:31 +00:00
Vedant Kumar 58548c30da [llvm-cov] Remove workaround in line execution count calculation (PR34962)
Gap areas make it possible to correctly determine when to use counts
from deferred regions. Before gap areas were introduced, llvm-cov needed
to use a heuristic to do this: it ignored counts from segments that
start, but do not end, on a line. This heuristic breaks down on a simple
example (see PR34962).

This patch removes the heuristic and picks counts from any region entry
segment which isn't a gap area.

llvm-svn: 315960
2017-10-16 23:47:10 +00:00
Mark Searles 4e3d6160db Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats).
Differential Revision: https://reviews.llvm.org/D38466

llvm-svn: 315957
2017-10-16 23:38:53 +00:00
Simon Pilgrim a590c74549 [X86][AVX] Add v4x64 vector shuffle test for <0,2,1,3> mask
llvm-svn: 315955
2017-10-16 23:20:16 +00:00
Quentin Colombet 0bd2825517 Re-apply [AArch64][RegisterBankInfo] Use the statically computed mappings for COPY
This reverts commit r315823, thus re-applying r315781.

Also make sure we don't use G_BITCAST mapping for non-generic registers.
Non-generic registers don't have a type but do have a reg bank.
Something the COPY mapping now how to deal with but the G_BITCAST
mapping don't.

-- Original Commit Message --
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.

Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.

Improve the compile time of RegBankSelect by up to 20%.

Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.

NFC.

llvm-svn: 315947
2017-10-16 22:28:40 +00:00
Quentin Colombet 9f20af6135 [AArch64][RegisterBankInfo] Add mapping support for G_BITCAST of s128
Anything bigger than 64-bit just map to FPR.

llvm-svn: 315946
2017-10-16 22:28:38 +00:00
Quentin Colombet 7c114d3d70 [AArch64][LegalizerInfo] Mark s128 G_BITCAST legal
We used to mark all G_BITCAST of 128-bit legal but only for vector
types. Scalars of this size are just fine as well.

llvm-svn: 315945
2017-10-16 22:28:27 +00:00
Matthew Simpson 36bbc8ce98 Add !callees metadata
This patch adds a new kind of metadata that indicates the possible callees of
indirect calls.

Differential Revision: https://reviews.llvm.org/D37354

llvm-svn: 315944
2017-10-16 22:22:11 +00:00
Reid Kleckner b0c9e0d647 [MC] Lex CRLF as one token
This will prevent doubling of line endings when parsing assembly and
emitting assembly.

Otherwise we'd parse the directive, consume the end of statement, hit
the next end of statement, and emit a fresh newline.

llvm-svn: 315943
2017-10-16 22:20:03 +00:00
Simon Pilgrim 03c89a840a [X86][3DNow] Add scheduling latency/throughput tests for 3DNow! instructions
llvm-svn: 315942
2017-10-16 21:55:09 +00:00
Simon Pilgrim 608e1b57cf [X86][MMX] Add scheduling latency/throughput tests for MMX instructions
llvm-svn: 315939
2017-10-16 21:29:29 +00:00
Tony Tye d288430c3e Add base relative relocation record that can be used for the following case (OpenCL example):
static __global int Var = 0; 
__global int* Ptr[] = {&Var};
...

In this case Var is a non premptable symbol and so its address can be used as the value of Ptr, with a base relative relocation that will add the delta between the ELF address and the actual load address. Such relocations do not require a symbol.

Differential Revision: https://reviews.llvm.org/D38909

llvm-svn: 315935
2017-10-16 20:44:29 +00:00
Alexander Timofeev 9dff31c769 [AMDGPU] : revert r315908
llvm-svn: 315916
2017-10-16 16:57:37 +00:00
Akira Hatanaka e8c1a54c07 [ObjCARC] Do not move a release that has the clang.imprecise_release tag
above PHI instructions.

ARC optimizer has an optimization that moves a call to an ObjC runtime
function above a phi instruction when the phi has a null operand and is
an argument passed to the function call. This optimization should not
kick in when the runtime function is an objc_release that releases an
object with precise lifetime semantics.

rdar://problem/34959669

llvm-svn: 315914
2017-10-16 16:46:59 +00:00
Sanjay Patel a4b89ed0b7 [x86] add minmax tests with more predicate coverage; NFC
llvm-svn: 315913
2017-10-16 15:20:00 +00:00
Alexander Timofeev 3828242c7e [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

llvm-svn: 315908
2017-10-16 14:35:29 +00:00
Simon Pilgrim 259b190f0d Fix test name typo.
llvm-svn: 315907
2017-10-16 14:33:51 +00:00
Simon Pilgrim 664f2f697a [X86][SSE] Added additional PACKUS shuffle tests
Mainly inspired by PR34773

llvm-svn: 315906
2017-10-16 14:32:41 +00:00
Simon Dardis 0d378a9eed [mips][micromips] Fix (dis)assembly of bc1(t|f)
Previously these instructions were marked codegen only and had
an under-specified instruction description that did not record the
fcc register.

Reviewers: atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D38847

llvm-svn: 315905
2017-10-16 14:20:22 +00:00
Stefan Maksimovic ee6b5a79dc [mips] Provide alternate predicates for constant synthesis
Ordering of patterns should not be of importance anymore
since the predicates used are mutually exclusive now.

llvm-svn: 315901
2017-10-16 13:18:21 +00:00
Andrew V. Tischenko bfc9061593 This patch is a result of D37262: The issues with X86 prefixes. It closes PR7709, PR17697, PR19251, PR32809 and PR21640. There could be other bugs closed by this patch.
llvm-svn: 315899
2017-10-16 11:14:29 +00:00
George Rimar 68b285f69e [llvm-dwarfdump] - Teach tool to parse DW_CFA_GNU_args_size.
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.

Differential revision: https://reviews.llvm.org/D38879

llvm-svn: 315897
2017-10-16 10:26:17 +00:00
NAKAMURA Takumi 414151a47e Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)"
llvm-svn: 315896
2017-10-16 09:50:01 +00:00
Nikolai Bozhenov 0e7ebbccc7 Move folding of icmp with zero after checking for min/max idioms.
Summary:
The following transformation for cmp instruction:

  icmp smin(x, PositiveValue), 0 -> icmp x, 0

should only be done after checking for min/max to prevent infinite
looping caused by a reverse canonicalization. That is why this
transformation was moved to place after the mentioned check.

Reviewers: spatel, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38934

Patch by: Artur Gainullin <artur.gainullin@intel.com>

llvm-svn: 315895
2017-10-16 09:19:21 +00:00
NAKAMURA Takumi 4543affa98 SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)
llvm-svn: 315894
2017-10-16 09:15:23 +00:00
Yonghong Song 6621cf67cf bpf: fix bug on silently truncating 64-bit immediate
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding.  This caused comparison with wrong value.

This bug looks to be introduced by r308080.  The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.

Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 315889
2017-10-16 04:14:53 +00:00
Hiroshi Inoue e3a3e3c9e9 [PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended
This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI. 
For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.

void int_func(int);
void ii_test(int a) {
    if (a & 1) return int_func(a);
}

Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.

Differential Revision: https://reviews.llvm.org/D31319

llvm-svn: 315888
2017-10-16 04:12:57 +00:00
Daniel Sanders ea8711b88e Re-commit r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315887
2017-10-16 03:36:29 +00:00
Daniel Sanders ce72d611af Revert r315885: [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
MSVC doesn't like one of the constructors.

llvm-svn: 315886
2017-10-16 02:15:39 +00:00
Daniel Sanders 6735ea86cd [globalisel][tblgen] Add support for iPTR and implement am_unscaled* and am_indexed*
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.

At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.

Depends on D37457

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37458

llvm-svn: 315885
2017-10-16 01:16:35 +00:00
Daniel Sanders a71f454765 [globalisel][tablegen] Implement unindexed load, non-extending load, and MemVT checks
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
       special-case. I've yet to find a reasonable way to implement it.

select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.

Depends on D37456

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37457

llvm-svn: 315884
2017-10-16 00:56:30 +00:00
Daniel Sanders df39cbae2f Re-commit r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Hopefully fixed the ambiguous constructor that a large number of bots reported.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315869
2017-10-15 18:22:54 +00:00
Daniel Sanders bb082a36d3 Revert r315863: [globalisel][tablegen] Import ComplexPattern when used as an operator
A large number of bots are failing on an ambiguous constructor call.

llvm-svn: 315866
2017-10-15 17:51:07 +00:00
Daniel Sanders b95b867dd8 [globalisel][tablegen] Import ComplexPattern when used as an operator
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.

This patch adds support for this in order to enable the import of load/store
patterns.

Depends on D37445

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37456

llvm-svn: 315863
2017-10-15 17:03:36 +00:00
Craig Topper a5af4a64d0 [AVX512] Don't mark EXTLOAD as legal with AVX512. Continue using custom lowering.
Summary:
This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes.

There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8.

Reviewers: RKSimon, zvi, delena

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38714

llvm-svn: 315860
2017-10-15 16:41:17 +00:00
Sanjay Patel 934738a3da revert r314984: revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
Recommitting r314698. The bug exposed by this change should be fixed with:
https://reviews.llvm.org/rL315579 

llvm-svn: 315857
2017-10-15 15:39:15 +00:00
whitequark ae12efab20 [MergeFunctions] Merge small functions if possible without a thunk.
This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.

Differential Revision: https://reviews.llvm.org/D34806

llvm-svn: 315853
2017-10-15 12:29:09 +00:00
whitequark b2ce9ffede [MergeFunctions] Replace all uses of unnamed_addr functions.
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.

Differential Revision: https://reviews.llvm.org/D34805

llvm-svn: 315852
2017-10-15 12:29:01 +00:00
Amjad Aboud c8d67979c0 [X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565
Differential Revision: https://reviews.llvm.org/D38359

llvm-svn: 315851
2017-10-15 11:00:56 +00:00
Craig Topper a9cd59fb5d [X86] Lower vselect with constant condition to vector_shuffle even with AVX512 instructions.
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.

It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.

Reviewers: RKSimon, delena, zvi

Reviewed By: delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38932

llvm-svn: 315849
2017-10-15 06:39:07 +00:00
Craig Topper f02e97859b [X86] Don't use constant condition for select instruction when testing masking ops.
We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects.

llvm-svn: 315848
2017-10-15 06:05:50 +00:00
Konstantin Zhuravlyov 263f7f6676 AMDGPU: Temporary disable pal metadata check line in llvm-readobj test
It fails on mips

llvm-svn: 315837
2017-10-14 23:42:11 +00:00
Craig Topper dfb443e88c [X86] Remove a bunch of dead FileCheck lines with the wrong prefix.
llvm-svn: 315828
2017-10-14 21:46:55 +00:00
Simon Pilgrim 36fe00ee17 [X86][SSE] Don't attempt to reduce the imul vector width of odd sized vectors (PR34947)
llvm-svn: 315825
2017-10-14 19:57:19 +00:00
Simon Pilgrim 3f49b988e0 [X86][SSE] Test vector imul reduction on 32 and 64-bit targets
llvm-svn: 315824
2017-10-14 19:46:08 +00:00
Konstantin Zhuravlyov a01d8b0b63 AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753

llvm-svn: 315821
2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov b3c605d680 llvm-readobj: Print AMDGPU note contents
Differential Revision: https://reviews.llvm.org/D38752

llvm-svn: 315819
2017-10-14 18:21:42 +00:00
Simon Pilgrim 5bd4431aec Cleanup update_llc_test_checks.py notes.
llvm-svn: 315817
2017-10-14 17:37:03 +00:00
Konstantin Zhuravlyov 7b4be1ed89 AMDGPU: Cleanup elf-notes.ll test
llvm-svn: 315816
2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov 716af741e9 llvm-readobj: Print AMDGPU note type names
Differential Revision: https://reviews.llvm.org/D38751

llvm-svn: 315813
2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov 219066bab8 AMDGPU: Improve note directive verification in assembler
- Do not allow amd_amdgpu_isa directives on non-amdgcn architectures
  - Do not allow amd_amdgpu_hsa_metadata on non-amdhsa OSes
  - Do not allow amd_amdgpu_pal_metadata on non-amdpal OSes

Differential Revision: https://reviews.llvm.org/D38750

llvm-svn: 315812
2017-10-14 16:15:28 +00:00
Konstantin Zhuravlyov eda425edd4 AMDGPU: Do not emit deprecated notes for code object v3
Differential Revision: https://reviews.llvm.org/D38749

llvm-svn: 315810
2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov 9c05b2bc3b AMDGPU: Add support for isa version note
- Emit NT_AMD_AMDGPU_ISA
  - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

Differential Revision: https://reviews.llvm.org/D38748

llvm-svn: 315808
2017-10-14 15:40:33 +00:00
Simon Pilgrim f367c27d2d [X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X))
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle.

Fixes the last issue with PR22415

llvm-svn: 315807
2017-10-14 15:01:36 +00:00
Craig Topper f7e777763d [X86] Add patterns for vzmovl+cvtpd2dq/cvttpd2dq with a load.
llvm-svn: 315802
2017-10-14 07:04:48 +00:00
Craig Topper 61010a85b8 [X86] Add AVX512 versions of VCVTPD2PS to load folding tables.
llvm-svn: 315801
2017-10-14 05:55:43 +00:00
Craig Topper ee277e190c [X86] Add patterns for vzmovl+cvtpd2ps with a load.
llvm-svn: 315800
2017-10-14 05:55:42 +00:00
Craig Topper 134241e4af [X86] Add AVX512 flavors of VCVTDQ2PD plus VCVTUDQ2PD to the load folding tables.
llvm-svn: 315796
2017-10-14 04:18:08 +00:00
Yaxun Liu adde4e4c01 Fix assembler for alloca of multiple elements in non-zero addr space
Currently llvm assembler emits parsing error for valid IR assembly

alloca i32, i32 9, addrspace(5)
when alloca addr space is 5.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D38713

llvm-svn: 315791
2017-10-14 03:23:18 +00:00
Daniel Sanders bfa9e2cae7 [globalisel][tablegen] Simplify named operand/operator lookups and fix a wrong-code bug this revealed.
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.

Depends on D36569

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: aemerson, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36618

llvm-svn: 315780
2017-10-14 00:31:58 +00:00
Craig Topper f6c69564e7 [X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is available
This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations.

For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP.

We may be able to use this for AVX1 as well which would allow us to remove more isel patterns.

I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP.

Differential Revision: https://reviews.llvm.org/D38836

llvm-svn: 315768
2017-10-13 21:56:48 +00:00
Craig Topper 526b70a089 [X86] Use fsub in the movddup scheduling tests to prevent a future patch from folding movddup as a broadcast load.
llvm-svn: 315767
2017-10-13 21:56:45 +00:00
Daniel Sanders 11300cead8 [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat based ImmLeaf.
Summary:
There's only a tablegen testcase for IntImmLeaf and not a CodeGen one
because the relevant rules are rejected for other reasons at the moment.
On AArch64, it's because there's an SDNodeXForm attached to the operand.
On X86, it's because the rule either emits multiple instructions or has
another predicate using PatFrag which cannot easily be supported at the
same time.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: qcolombet

Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36569

llvm-svn: 315761
2017-10-13 21:28:03 +00:00
Matt Arsenault e11d8aca77 AMDGPU: Implement hasBitPreservingFPLogic
llvm-svn: 315754
2017-10-13 21:10:22 +00:00
Peter Collingbourne 868783e855 LowerTypeTests: Give imported symbols a type with size 0 so that they are not assumed not to alias.
It is possible for both a base and a derived class to be satisfied
with a unique vtable. If a program contains casts of the same pointer
to both of those types, the CFI checks will be lowered to this
(with ThinLTO):

if (p != &__typeid_base_global_addr)
  trap();
if (p != &__typeid_derived_global_addr)
  trap();

The optimizer may then use the first condition combined
with the assumption that __typeid_base_global_addr and
__typeid_derived_global_addr may not alias to optimize away the second
comparison, resulting in an unconditional trap.

This patch fixes the bug by giving imported globals the type [0 x i8]*,
which prevents the optimizer from assuming that they do not alias.

Differential Revision: https://reviews.llvm.org/D38873

llvm-svn: 315753
2017-10-13 21:02:16 +00:00
Sanjay Patel 505e071dc7 [Reassociate] auto-generate better checks; NFC
These would fail if the created variable names changed.

llvm-svn: 315752
2017-10-13 20:56:35 +00:00
Matt Arsenault 550c66d10f AMDGPU: Look for src mods before fp_extend
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.

llvm-svn: 315748
2017-10-13 20:45:49 +00:00
Sanjay Patel f0242de143 [InstCombine] move code to remove repeated constant check; NFCI
Also, consolidate tests for this fold in one place.

llvm-svn: 315745
2017-10-13 20:29:11 +00:00
Matt Arsenault 4d70754e3c AMDGPU: Implement isFPExtFoldable
This helps match v_mad_mix* in some cases.

llvm-svn: 315744
2017-10-13 20:18:59 +00:00
Krzysztof Parzyszek 7c9c05888c [Hexagon] Minimize number of repeated constant extenders
Each constant extender requires an extra instruction, which adds to the
code size and also reduces the number of available slots in an instruction
packet. In most cases, the value of a repeated constant extender could be
loaded into a register, and the instructions using the extender could be
replaced with their counterparts that use that register instead.

This patch adds a pass that tries to reduce the number of constant
extenders, including extenders which differ only in an immediate offset
known at compile time, e.g. @global and @global+12.

llvm-svn: 315735
2017-10-13 19:02:59 +00:00
Craig Topper 5d692917f4 [X86] Add initial skeleton support for knm cpu
This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it.

Differential Revision: https://reviews.llvm.org/D38811

llvm-svn: 315722
2017-10-13 18:10:17 +00:00
Sanjay Patel c419c9f640 [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions
llvm-svn: 315718
2017-10-13 17:47:25 +00:00
Sanjay Patel 399fcbea37 [InstCombine] add tests for add (zext (add nuw X, C2)), C --> zext (add nuw X, C2 + C); NFC
llvm-svn: 315717
2017-10-13 17:42:12 +00:00
Max Moroz 43df793f5c [llvm-cov] Reland sources-specified.test with addition of "-path-equivalence".
Summary: This version of tests should be working properly.

Reviewers: vsk

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D38889

llvm-svn: 315714
2017-10-13 17:27:39 +00:00
Simon Pilgrim c4977fa9a1 [X86] Test scalar integer absolutes on 32-bit targets with/without CMOV
llvm-svn: 315711
2017-10-13 17:09:20 +00:00
Reid Kleckner be3724b5e1 Not all buildbots seem to dump the nuw flag in SDAG
llvm-svn: 315710
2017-10-13 17:00:49 +00:00
Simon Pilgrim df9611e178 [X86] Updated scalar integer absolute tests to cover i8/i16/i32/i64
llvm-svn: 315706
2017-10-13 16:53:07 +00:00
Sanjay Patel 2150651ac3 [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types
The backend should be prepared for this transform after:
https://reviews.llvm.org/rL311731

llvm-svn: 315701
2017-10-13 16:29:38 +00:00
Reid Kleckner c687a34870 Update test to expect nuw flag in SDAG dump, fixes test after r315690
llvm-svn: 315698
2017-10-13 16:13:23 +00:00
Daniel Neilson fa14ebd138 [RS4GC] Look through vector bitcasts when looking for base pointer
Summary:
 In RS4GC it is possible that a base pointer is contained in a vector that
has undergone a bitcast from one element-pointertype to another. We teach
RS4GC how to look through bitcasts of vector types when looking for a base
pointer.

Reviewers: anna

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38849

llvm-svn: 315694
2017-10-13 15:59:13 +00:00
Max Moroz 8ff311b54a [llvm-cov] Temporary delete sources-specified.test, it is failing on some bots.
Summary: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/5950/steps/test-stage1-compiler/logs/stdio

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Subscribers: mehdi_amini

Differential Revision: https://reviews.llvm.org/D38888

llvm-svn: 315693
2017-10-13 15:58:58 +00:00
Krzysztof Parzyszek a0f2f7c413 [Hexagon] Add patterns for cmpb/cmph with immediate arguments
Patch by Sumanth Gundapaneni.

llvm-svn: 315692
2017-10-13 15:43:12 +00:00
Max Moroz 8bc53fd031 [llvm-cov] Fix sources-specified.test so it ignores the order of files printed.
Summary: https://reviews.llvm.org/D38884#896964

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D38887

llvm-svn: 315691
2017-10-13 15:41:51 +00:00
Max Moroz c5834e5e88 [llvm-cov] An attempt to fix sources_specified.test failing on some buildbots.
Summary: https://reviews.llvm.org/rL315685#115380

Reviewers: vsk, Dor1s

Reviewed By: Dor1s

Differential Revision: https://reviews.llvm.org/D38884

llvm-svn: 315687
2017-10-13 15:30:24 +00:00
Max Moroz 4a4bfa4e27 [llvm-cov] Generate "report" for given source paths if sources are specified.
Summary:
Documentation says that user can specify sources for both "show" and
"report" commands. "Show" command respects specified sources, but "report" does
not. It is useful to have both "show" and "report" generated for specified
sources. Also added tests to for both commands with sources specified.

Reviewers: vsk, kcc

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D38860

llvm-svn: 315685
2017-10-13 14:44:51 +00:00
Jonas Devlieghere 614fab4bd8 Re-land "[dsymutil] Timestmap verification for __swift_ast"
This patch adds timestamp verification for swiftmodule files. A new flag
is provided to allows us to disable this check in order to allow testing
of this feature.

Differential revision: https://reviews.llvm.org/D38686

llvm-svn: 315684
2017-10-13 14:41:23 +00:00
Anna Thomas a2ca902033 [SCEV] Teach SCEV to find maxBECount when loop endbound is variant
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.

This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.

This provides more information to users of SCEV and can be used to
improve identification of finite loops.

Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38825

llvm-svn: 315683
2017-10-13 14:30:43 +00:00
Sanjay Patel 45d5568010 [InstCombine] add tests for boolean extend + add; NFC
llvm-svn: 315681
2017-10-13 14:09:45 +00:00
Daniel Jasper 3344a21236 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

llvm-svn: 315680
2017-10-13 14:04:21 +00:00
Craig Topper bf0de9d3b6 [X86] Remove patterns that select unmasked vbroadcastf2x32/vbroadcasti2x32. Prefer vbroadcastsd/vpbroadcastq instead.
There's no advantage to using these instructions when they aren't masked. This enables some additional execution domain switching without needing to update the table.

llvm-svn: 315674
2017-10-13 06:07:10 +00:00
Craig Topper 11655b22dc [X86] Add the test case for r315613 that I forgot to 'git add'.
llvm-svn: 315649
2017-10-13 00:20:47 +00:00
Matt Morehouse 8bc23ab658 [llvm-isel-fuzzer] Use "--" as separator rather than '='.
Summary: OSS-Fuzz doesn't support '=' in filenames.

Reviewers: bogner, kcc

Reviewed By: kcc

Subscribers: javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38866

llvm-svn: 315647
2017-10-13 00:18:32 +00:00
Justin Bogner 9c03fd5f64 llvm-isel-fuzzer: Use the right REQUIRES line for r315599
I'd mixed up ENABLE_SHARED and BUILD_SHARED_LIBS before, so these
tests were being disabled in too many places.

llvm-svn: 315646
2017-10-13 00:17:54 +00:00
Anna Thomas 61aec18d46 [CVP] Process binary operations even when def is local
Summary:
This patch adds processing of binary operations when the def of operands are in
the same block (i.e. local processing).

Earlier we bailed out in such cases (the bail out was introduced in rL252032)
because LVI at that time was more precise about context at the end of basic
blocks, which implied local def and use analysis didn't benefit CVP.

Since then we've added support for LVI in presence of assumes and guards. The
test cases added show how local def processing in CVP helps adding more
information to the ashr, sdiv, srem and add operators.

Note: processCmp which suffers from the same problem will
be handled in a later patch.

Reviewers: philip, apilipenko, SjoerdMeijer, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38766

llvm-svn: 315634
2017-10-12 22:39:52 +00:00
Artur Pilipenko ead69ee4bd [LoopPredication] Check whether the loop is already guarded by the first iteration check condition
llvm-svn: 315623
2017-10-12 21:21:17 +00:00
Bruno Cardoso Lopes 993d2e67d8 Revert "Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP.""
This reverts commit r315593: still affect two bots:

http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5308
http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21751/

llvm-svn: 315618
2017-10-12 20:52:34 +00:00
Artur Pilipenko b4527e1ce2 [LoopPredication] Support ule, sle latch predicates
This is a follow up for the loop predication change 313981 to support ule, sle latch predicates.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D38177

llvm-svn: 315616
2017-10-12 20:40:27 +00:00
Wei Ding 5676acad9e Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

llvm-svn: 315610
2017-10-12 19:37:14 +00:00
Craig Topper 3dc37cc592 [X86] Add a bunch of -mcpu strings to the cpus.ll test.
We were missing most of the "core" aliases as well as skylake, cannonlake, and knights landing.

llvm-svn: 315606
2017-10-12 18:55:57 +00:00
Artem Belevich 3bafc2f0d9 [NVPTX] Implemented wmma intrinsics and instructions.
WMMA = "Warp Level Matrix Multiply-Accumulate".
These are the new instructions introduced in PTX6.0 and available
on sm_70 GPUs.

Differential Revision: https://reviews.llvm.org/D38645

llvm-svn: 315601
2017-10-12 18:27:55 +00:00
Reid Kleckner 1a7e387849 [codeview] Don't emit FPO data in funclet prologues
Attempt 3 to work around bugs in FPO data with funclets.

llvm-svn: 315600
2017-10-12 18:20:35 +00:00
Justin Bogner 754a1a8a6f llvm-isel-fuzzer: Work around BUILD_SHARED_LIBS testing issues
Building with BUILD_SHARED_LIBS makes it tricky to copy around
executables at will, since they won't be able to find the LLVM
libraries any more. This makes testing a feature that's based on the
executable name problematic, so we'll just disable these two tests in
that configuration.

We could potentially fix this by symlinking the lib directory into the
test directory, but that wouldn't work on windows, and losing testing
on windows would be far worse than losing testing on a configuration
that's barely even supported.

llvm-svn: 315599
2017-10-12 18:10:22 +00:00
Artem Belevich 786ca6a166 [TableGen] Allow intrinsics to have up to 8 return values.
Differential Revision: https://reviews.llvm.org/D38633

llvm-svn: 315598
2017-10-12 17:40:00 +00:00
Sanjay Patel e272be7c9a [ValueTracking] return zero when there's conflict in known bits of a shift (PR34838)
Poison allows us to return a better result than undef.

llvm-svn: 315595
2017-10-12 17:31:46 +00:00
Bruno Cardoso Lopes 326fdcbff8 Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."
This is r315288 & r315294, which were reverted due to stage2 bot
failures.

Summary:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"

  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }

  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
  entry:
    %c1 = icmp sle i32 %x, 100

    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }

  attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 315593
2017-10-12 16:54:11 +00:00
Lei Huang 0724fea2da [PowerPC] Add profitablilty check for conversion to mtctr loops
Add profitability checks for modifying counted loops to use the mtctr instruction.

The latency of mtctr is only justified if there are more than 4 comparisons that
will be removed as a result.  Usually counted loops are formed relatively early
and before unrolling, so most low trip count loops often don't survive.  However
we want to ensure that if they do, we do not mistakenly update them to mtctr loops.

Use CodeMetrics to ensure we are only doing this for small loops with small trip counts.

Differential Revision: https://reviews.llvm.org/D38212

llvm-svn: 315592
2017-10-12 16:43:33 +00:00
Tim Renouf c8ffffe462 [AMDGPU] For amdpal, widen interpolation mode workaround
Summary:
The interpolation mode workaround ensures that at least one
interpolation mode is enabled in PSInputAddr. It does not also check
PSInputEna on the basis that the user might enable bits in that
depending on run-time state.

However, for amdpal os type, the user does not enable some bits after
compilation based on run-time states; the register values being
generated here are the final ones set in the hardware. Therefore, apply
the workaround to PSInputAddr and PSInputEnable together. (The case
where a bit is set in PSInputAddr but not in PSInputEnable is where the
frontend set up an input arg for a particular interpolation mode, but
nothing uses that input arg. Really we should have an earlier pass that
removes such an arg.)

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37758

llvm-svn: 315591
2017-10-12 16:16:41 +00:00
Mikael Holmen a079ef68e3 [RegisterCoalescer] Don't set read-undef in pruneValues, only clear
Summary:
The comments in the code said

 // Remove <def,read-undef> flags. This def is now a partial redef.

but the code didn't just remove read-undef, it could introduce new ones which
could cause errors.

E.g. if we have something like

%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7

and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.

Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38616

llvm-svn: 315564
2017-10-12 06:21:28 +00:00
Justin Bogner 9ea7fbd1e8 Re-commit "llvm-isel-fuzzer: Handle a subset of backend flags in the exec name"
Here we add a secondary option parser to llvm-isel-fuzzer (and provide
it for use with other fuzzers). With this, you can copy the fuzzer to
a name like llvm-isel-fuzzer=aarch64-gisel for a fuzzer that fuzzer
AArch64 with GlobalISel enabled, or fuzzer=x86_64 to fuzz x86, with no
flags required. This should be useful for running these in OSS-Fuzz.

Note that this handrolls a subset of cl::opts to recognize, rather
than embedding a complete command parser for argv[0]. If we find we
really need the flexibility of handling arbitrary options at some
point we can rethink this.

This re-applies 315545 using "=" instead of ":" as a separator for
arguments.

llvm-svn: 315557
2017-10-12 04:35:32 +00:00
Hans Wennborg 022829d84c Revert r315545 "llvm-isel-fuzzer: Handle a subset of backend flags in the executable name"
It broke some tests on Windows:

Failing Tests (4):
    LLVM :: tools/llvm-isel-fuzzer/execname-options.ll
    LLVM :: tools/llvm-isel-fuzzer/missing-triple.ll
    LLVM :: tools/llvm-isel-fuzzer/x86-empty-bc.ll
    LLVM :: tools/llvm-isel-fuzzer/x86-empty.ll

> llvm-isel-fuzzer: Handle a subset of backend flags in the executable name
>
> Here we add a secondary option parser to llvm-isel-fuzzer (and provide
> it for use with other fuzzers). With this, you can copy the fuzzer to
> a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer
> AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no
> flags required. This should be useful for running these in OSS-Fuzz.
>
> Note that this handrolls a subset of cl::opts to recognize, rather
> than embedding a complete command parser for argv[0]. If we find we
> really need the flexibility of handling arbitrary options at some
> point we can rethink this.

llvm-svn: 315554
2017-10-12 03:32:09 +00:00
Hongbin Zheng d36f2030e2 [SimplifyIndVar] Replace IVUsers with loop invariant whenever possible
Differential Revision: https://reviews.llvm.org/D38415

llvm-svn: 315551
2017-10-12 02:54:11 +00:00
Justin Bogner a5969ce15f llvm-isel-fuzzer: Handle a subset of backend flags in the executable name
Here we add a secondary option parser to llvm-isel-fuzzer (and provide
it for use with other fuzzers). With this, you can copy the fuzzer to
a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer
AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no
flags required. This should be useful for running these in OSS-Fuzz.

Note that this handrolls a subset of cl::opts to recognize, rather
than embedding a complete command parser for argv[0]. If we find we
really need the flexibility of handling arbitrary options at some
point we can rethink this.

llvm-svn: 315545
2017-10-12 01:57:49 +00:00
Wei Mi 1736efd16a Revert r307036 because of PR34919.
llvm-svn: 315540
2017-10-12 00:24:52 +00:00
Konstantin Zhuravlyov c3beb6a075 AMDGPU/NFC: Minor clean ups in PAL metadata
- Move PAL metadata definitions to AMDGPUMetadata
  - Make naming consistent with HSA metadata

Differential Revision: https://reviews.llvm.org/D38745

llvm-svn: 315523
2017-10-11 22:41:09 +00:00
Konstantin Zhuravlyov a63b0f9d20 AMDGPU/NFC: Rename code object metadata as HSA metadata
- Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change)
  - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer
  - Introduce HSAMD namespace
  - Other minor name changes in function and test names

llvm-svn: 315522
2017-10-11 22:18:53 +00:00
Reid Kleckner ddf413f3e1 Really fix llvm-rc include-paths.test
llvm-svn: 315515
2017-10-11 21:27:54 +00:00
Reid Kleckner ade90cbd79 Attempt to fix failing llvm-rc include-paths.text
llvm-svn: 315514
2017-10-11 21:25:03 +00:00
Reid Kleckner 9cdd4df81a [codeview] Implement FPO data assembler directives
Summary:
This adds a set of new directives that describe 32-bit x86 prologues.
The directives are limited and do not expose the full complexity of
codeview FPO data. They are merely a convenience for the compiler to
generate more readable assembly so we don't need to generate tons of
labels in CodeGen. If our prologue emission changes in the future, we
can change the set of available directives to suit our needs. These are
modelled after the .seh_ directives, which use a different format that
interacts with exception handling.

The directives are:
  .cv_fpo_proc _foo
  .cv_fpo_pushreg ebp/ebx/etc
  .cv_fpo_setframe ebp/esi/etc
  .cv_fpo_stackalloc 200
  .cv_fpo_endprologue
  .cv_fpo_endproc
  .cv_fpo_data _foo

I tried to follow the implementation of ARM EHABI CFI directives by
sinking most directives out of MCStreamer and into X86TargetStreamer.
This helps avoid polluting non-X86 code with WinCOFF specific logic.

I used cdb to confirm that this can show locals in parent CSRs in a few
cases, most importantly the one where we use ESI as a frame pointer,
i.e. the one in http://crbug.com/756153#c28

Once we have cdb integration in debuginfo-tests, we can add integration
tests there.

Reviewers: majnemer, hans

Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D38776

llvm-svn: 315513
2017-10-11 21:24:33 +00:00
Krzysztof Parzyszek c4a9a8d8e0 [Hexagon] Make sure that new-value jump is packetized with producer
llvm-svn: 315510
2017-10-11 21:20:43 +00:00
Florian Hahn e52abba277 [MachineCombiner] Fix initialisation of LastUpdate for incremental update.
Summary:
Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled.

Patch by Paul Walker.

Reviewers: fhahn, Gerolf, efriedma, MatzeB

Reviewed By: fhahn

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38734

llvm-svn: 315502
2017-10-11 20:25:58 +00:00
Lei Huang 263dc4ef3a [PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex elimination to only use `lis/addi` if necessary.
Currently we produce a bunch of unnecessary code when emitting the
prologue/epilogue for spills/restores.  Namely, if the load from stack
slot/store to stack slot instruction is an X-Form instruction, we will
always produce an LIS/ORI sequence for the stack offset.

Furthermore, we have not exploited the P9 vector D-Form loads/stores for this
purpose.

This patch address both issues.

Specifying the D-Form load as the instruction to use for stack spills/reloads
should be safe because:

1. The stack should be aligned according to the ABI
2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will
   check for the offset being a multiple of 16 and will convert it to an
   X-Form instruction if it isn't.

Differential Revision : https://reviews.llvm.org/D38758

llvm-svn: 315500
2017-10-11 20:20:58 +00:00
Zachary Turner fa0ca6cbd0 [llvm-rc] Use proper search algorithm for finding resources.
Previously we would only look in the current directory for a
resource, which might not be the same as the directory of the
rc file.  Furthermore, MSVC rc supports a /I option, and can
also look in the system environment.  This patch adds support
for this search algorithm.

Differential Revision: https://reviews.llvm.org/D38740

llvm-svn: 315499
2017-10-11 20:12:09 +00:00
Sanjay Patel 6c0aef77aa [x86] avoid infinite loop from SoftenFloatOperand (PR34866)
Legalization of fp128 assumes things that we should have asserts for,
so that's another potential improvement.

Differential Revision: https://reviews.llvm.org/D38771

llvm-svn: 315485
2017-10-11 18:24:21 +00:00
Jake Ehrlich f03384dce7 Reland "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data"
ubsan caught an issue I made where I was converting a null pointer to a
reference.

elf utils implements a particularly extreme form of stripping that I'd
like to support. eu-strip has an option called "strip-sections" that
removes all section headers and leaves only program headers and the
segment data. I have implemented this option partly as a test but mainly
because in Fuchsia we would like to use this option to minimize the size
of our executables. The other strip options that are on my list include
--strip-all and --strip-debug. This is a preliminary implementation that
I'd like to start using in Fuchsia builds if possible. This change
implements such a stripping option for llvm-objcopy

Differential Revision: https://reviews.llvm.org/D38335

llvm-svn: 315484
2017-10-11 18:09:18 +00:00
Lei Huang f9c7f7fed4 [NFC] update test case so checks are not order dependent when not needed
llvm-svn: 315482
2017-10-11 18:04:41 +00:00
Rafael Espindola 1a0e5a1933 Convert an ErrorOr to Expected.
getRelocationAddend should never be called on non SHT_RELA sections,
but changing that requires changing RelocVisitor.h.

llvm-svn: 315473
2017-10-11 16:56:33 +00:00
Krzysztof Parzyszek 8f174dde92 [Pipeliner] Improve serialization order for post-increments
The pipeliner is generating a serial sequence that causes poor
register allocation when a post-increment instruction appears
prior to the use of the post-increment register. This occurs when
there is a circular set of dependences involved with a sequence
of instructions in the same cycle. In this case, there is no
serialization of the parallel semantics that will not cause an
additional register to be allocated.

This patch fixes the problem by changing the instructions so that
the post-increment instruction is used by the subsequent
instruction, which enables the register allocator to make a
better decision and not require another register.

Patch by Brendon Cahoon.

llvm-svn: 315466
2017-10-11 15:51:44 +00:00
Sanjay Patel 8d565a233d [InstCombine] add baseline tests for D38531; NFC
llvm-svn: 315461
2017-10-11 14:29:17 +00:00
Sanjay Patel 34fd5eaaf0 [DAGCombiner] convert insertelement of bitcasted vector into shuffle
Eg:
insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7}

This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. 
We may want to abandon that one if we can't find value in squashing the more specific pattern sooner.

We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles.

There may be room for improvement in the shuffle lowering here, but that would be follow-up work.

Differential Revision: https://reviews.llvm.org/D38388

llvm-svn: 315460
2017-10-11 14:12:16 +00:00
Jonas Devlieghere ec053332cf Revert "[dsymutil] Timestmap verification for __swift_ast"
This reverts commit r315456.

llvm-svn: 315458
2017-10-11 13:51:30 +00:00
Jonas Devlieghere 8acb2e3ac4 [dsymutil] Timestmap verification for __swift_ast
This patch adds timestamp verification for swiftmodule files.

 - A new flag is provided to allows us to continue testing of the code
   for embedding the__swift_ast. (git doesn't maintain timestamps)
 - Adds a new test for fat (arm) binaries.

Differential revision: https://reviews.llvm.org/D38686

llvm-svn: 315456
2017-10-11 13:34:52 +00:00
Simon Dardis 442ee63468 [mips] Add missing tests from rL315451
llvm-svn: 315454
2017-10-11 11:45:06 +00:00
Uriel Korach 782f28bf2f [X86] Added tests for TESTM and TESTNM (NFC)
Adding this test files now so after another commit that will add a new pattern for
TESTM and TESTNM instructions will show the improvemnts that have been done.

Change-Id: If3908b7f91897d764053312365a2bc1de78b291d
llvm-svn: 315443
2017-10-11 08:39:25 +00:00
Max Kazantsev 3b81809e06 [GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
  int array[LEN];
  ...
  guard(0 <= index && index < LEN);
  use(array[index]);

Differential Revision: https://reviews.llvm.org/D37460

llvm-svn: 315440
2017-10-11 08:10:43 +00:00
Max Kazantsev 0c8dd052b8 [LICM] Disallow sinking of unordered atomic loads into loops
Sinking of unordered atomic load into loop must be disallowed because it turns
a single load into multiple loads. The relevant section of the documentation
is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for
Optimizers section. Here is the full text of this section:

> Notes for optimizers
> In terms of the optimizer, this **prohibits any transformation that
> transforms a single load into multiple loads**, transforms a store into
> multiple stores, narrows a store, or stores a value which would not be
> stored otherwise. Some examples of unsafe optimizations are narrowing
> an assignment into a bitfield, rematerializing a load, and turning loads
> and stores into a memcpy call. Reordering unordered operations is safe,
> though, and optimizers should take advantage of that because unordered
> operations are common in languages that need them.

Patch by Daniil Suchkov!

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D38392

llvm-svn: 315438
2017-10-11 07:26:45 +00:00
Max Kazantsev 25d8655dc2 [IRCE] Do not process empty safe ranges
IRCE should not apply when the safe iteration range is proved to be empty.
In this case we do unneeded job creating pre/post loops and then never
go to the main loop.

This patch makes IRCE not apply to empty safe ranges, adds test for this
situation and also modifies one of existing tests where it used to happen
slightly.

Reviewed By: anna
Differential Revision: https://reviews.llvm.org/D38577

llvm-svn: 315437
2017-10-11 06:53:07 +00:00
Davide Italiano e2138fe41b [GVN] Don't replace constants with constants.
This fixes PR34908. Patch by Alex Crichton!

Differential Revision:  https://reviews.llvm.org/D38765

llvm-svn: 315429
2017-10-11 04:21:51 +00:00
Jake Ehrlich d9a283463a Revert "[llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data"
This reverts commit rL315412

llvm-svn: 315417
2017-10-11 02:42:29 +00:00
Jake Ehrlich b5152447ba [llvm-objcopy] Add support for --strip-sections to remove all section headers leaving only program headers and loadable segment data
elf utils implements a particularly extreme form of stripping that I'd
like to support. eu-strip has an option called "strip-sections" that
removes all section headers and leaves only program headers and the
segment data. I have implemented this option partly as a test but mainly
because in Fuchsia we would like to use this option to minimize the size
of our executables. The other strip options that are on my list include
--strip-all and --strip-debug. This is a preliminary implementation that
I'd like to start using in Fuchsia builds if possible. This change
implements such a stripping option for llvm-objcopy

Differential Revision: https://reviews.llvm.org/D38335

llvm-svn: 315412
2017-10-11 01:59:06 +00:00
Craig Topper 6ce20bd184 [X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding.
llvm-svn: 315395
2017-10-11 00:11:53 +00:00
Jake Ehrlich fcc05627d4 [llvm-objcopy] Add ability to remove multiple sections by name
This change adds the ability to use the "-R"/"-remove-section" option
multiple times.

Differential Revision: https://reviews.llvm.org/D38332

llvm-svn: 315385
2017-10-10 23:02:43 +00:00
Craig Topper bb0e316dc7 [X86] Add broadcast patterns that allow a scalar_to_vector between the broadcast and the load.
We already have these patterns for AVX512VL, but not AVX1 or 2.

llvm-svn: 315382
2017-10-10 22:40:31 +00:00
Rafael Espindola 8f1f7b1442 Make the ELFFile constructor private.
With this all clients have to use the new create method which returns
an Expected.

Fixes a crash on invalid input.

llvm-svn: 315376
2017-10-10 22:17:49 +00:00
Rafael Espindola ef421f9c18 Make the ELFObjectFile constructor private.
This forces every user to use the new create method that returns an
Expected. This in turn propagates better error messages.

llvm-svn: 315371
2017-10-10 21:21:16 +00:00
Dehao Chen 3f56a05ae5 Use the first instruction's count to estimate the funciton's entry frequency.
Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D38478

llvm-svn: 315369
2017-10-10 21:13:50 +00:00
Sanjay Patel b74063d21f [x86] fix prefix typos for CHECK lines; NFC
llvm-svn: 315368
2017-10-10 21:12:47 +00:00
Simon Dardis b994128d14 [mips] Correct the instruction predicates for microMIPSr3
Rather than using the AdditionalPredicates mechanism to guard
the microMIPS instructions, use the existing predicates to properly
guard those instructions.

This also resolves a case where an instruction pattern was incorrectly
available for microMIPS32R6, which caused a register allocation failure
as the registers specified in the pattern were not available.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38451

llvm-svn: 315362
2017-10-10 20:52:53 +00:00
Matt Arsenault f42074b699 AMDGPU: Fix missing skipFunction calls
llvm-svn: 315361
2017-10-10 20:48:36 +00:00
Matt Arsenault d674e0ac0d AMDGPU: Fix failure to select branch with optnone
opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass.
The heuristic in the custom selector for brcond deferred the
branch uniformity check to the pattern, which would fail.

llvm-svn: 315360
2017-10-10 20:34:49 +00:00
Rafael Espindola 12db383e20 Convert two uses of ErrorOr to Expected.
llvm-svn: 315354
2017-10-10 20:00:07 +00:00
Yaxun Liu de4b88d9a1 [AMDGPU] Lower enqueued blocks and generate runtime metadata
This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle) and adds
runtime-handle attribute to the enqueued block kernel.

In LLVM CodeGen the runtime-handle metadata will be translated to
RuntimeHandle metadata in code object. Runtime allocates a global buffer
for each kernel with RuntimeHandel metadata and saves the kernel address
required for the AQL packet into the buffer. __enqueue_kernel function
in device library knows that the invoke function pointer in the block
literal is actually runtime handle and loads the kernel address from it
and puts it into AQL packet for dispatching.

This cannot be done in FE since FE cannot create a unique global variable
with external linkage across LLVM modules. The global variable with internal
linkage does not work since optimization passes will try to replace loads
of the global variable with its initialization value.

Differential Revision: https://reviews.llvm.org/D38610

llvm-svn: 315352
2017-10-10 19:39:48 +00:00
Jake Ehrlich 36a2eb34ed [llvm-objcopy] Add support for removing sections
This change adds support for removing sections using the -R field (as
GNU objcopy does as well). This change should let us add many helpful
tests and is a proper stepping stone for adding more general kinds of
stripping.

Differential Revision: https://reviews.llvm.org/D38260

llvm-svn: 315346
2017-10-10 18:47:09 +00:00
Jake Ehrlich c5ff72708d Revert "temporary"
I forgot to add a proper commit message. I'm reverting this
to fix that.

This reverts commit r315344.

llvm-svn: 315345
2017-10-10 18:32:22 +00:00
Jake Ehrlich 77ec1ffe5c temporary
llvm-svn: 315344
2017-10-10 18:28:15 +00:00
Adrian Prantl 16b8b47152 Debug Info: Fix the SDLoc propagation for a DAGCombiner rule
This patch ensures that the rule:
  fold (zext (load x)) -> (zext (truncate (zextload x)))
propagates the SDLoc of the load to the zextload.

<rdar://problem/33755881>

llvm-svn: 315340
2017-10-10 18:08:32 +00:00
Francis Ricci 5776f26fa1 [llvm-objdump] Disable leak checking on an llvm-objdump test
Summary:
This leak doesn't reproduce locally on macOS 10.12, but is causing
buildbot failures. Disable leak checking until it can be fixed.

Reviewers: sqlbyme, qcolombet, enderby, bruno

Reviewed By: bruno

Subscribers: bruno, llvm-commits

Differential Revision: https://reviews.llvm.org/D38699

llvm-svn: 315337
2017-10-10 17:50:57 +00:00
Bruno Cardoso Lopes 57304923ca Revert "[SCCP] Propagate integer range info for parameters in IPSCCP."
This reverts commit r315288. This is part of fixing segfault introduced
in:

http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/

llvm-svn: 315329
2017-10-10 16:37:57 +00:00
Jacob Gravelle 37af00e7d0 [WebAssembly] Narrow the scope of WebAssemblyFixFunctionBitcasts
Summary:
The pass to fix function bitcasts generates thunks for functions that
are called directly with a mismatching signature. It was also generating
thunks in cases where the function was address-taken, causing aliasing
problems in otherwise valid cases.
This patch tightens the restrictions for when the pass runs.

Reviewers: sunfish, dschuff

Subscribers: jfb, sbc100, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D38640

llvm-svn: 315326
2017-10-10 16:20:18 +00:00
Simon Pilgrim 053a299a9b [X86][AVX512] Regenerate element insertion/extraction tests
llvm-svn: 315322
2017-10-10 15:58:54 +00:00
Simon Dardis 96d35fe06a [mips] Duplicate the reciprocal instruction definitions for FP32
Add instruction definitions for FP32 mode for recip.d and rsqrt.d.

Previously these instructions were only defined when targeting the
full 64-bit FPU model but were not guarded properly.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38400

llvm-svn: 315318
2017-10-10 14:41:11 +00:00
Jonas Devlieghere aa6be823a4 Re-land "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.

Recommit after being reverted in r315299.

Differential revision: https://reviews.llvm.org/D36993

llvm-svn: 315316
2017-10-10 14:15:25 +00:00
Sanjay Patel 7d52c7ca74 [x86] add tests for insertelement; NFC
llvm-svn: 315312
2017-10-10 13:45:25 +00:00
Simon Dardis a17a7b619a [mips] Partially fix PR34391
Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser
which also rendered the operand to the instruction. In some cases the
general parser could construct an MCExpr which was not a MCConstantExpr
which MipsAsmParser was expecting.

Address this by altering the special handling to cope with unexpected inputs
and fine-tune the handling of cases where an register name that is not
available in the current ABI is regarded as not a match for the custom parser
but also not as an outright error.

Also enforces the binutils restriction that only constants are accepted.

This partially resolves PR34391.

Thanks to Ed Maste for reporting the issue!

Reviewers: nitesh.jain, arichardson

Differential Revision: https://reviews.llvm.org/D37476

llvm-svn: 315310
2017-10-10 13:34:45 +00:00
David Stuttard 51c1b22806 [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors
Summary:
See https://llvm.org/PR33743 for more details

It seems that for non-power of 2 vector sizes, the algorithm can produce
non-matching sizes for input and result causing an assert.

This usually isn't a problem as the isAnyExtend check will weed these out, but
in some cases (most often with lots of undefined values for the mask indices) it
can pass this check for non power of 2 vectors.

Adding in an extra check that ensures that bit size will match for the result
and input (as required)

Subscribers: nhaehnle

Differential Revision: https://reviews.llvm.org/D35241

llvm-svn: 315307
2017-10-10 12:45:45 +00:00
Oliver Stannard 30b732c942 [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs
Previously, the code that implemented the GNU assembler aliases for the
LDRD and STRD instructions (where the second register is omitted)
assumed that the input was a valid instruction. This caused assertion
failures for every example in ldrd-strd-gnu-bad-inst.s.

This improves this code so that it bails out if the instruction is not
in the expected format, the check bails out, and the asm parser is run
on the unmodified instruction.

It also relaxes the alias on thumb targets, so that unaligned pairs of
registers can be used. The restriction that Rt must be even-numbered
only applies to the ARM versions of these instructions.

Differential revision: https://reviews.llvm.org/D36732

llvm-svn: 315305
2017-10-10 12:38:22 +00:00
Oliver Stannard cd3306f62f [ARM, Asm] Add diagnostics for floating-point register operands
This adds diagnostic strings for the ARM floating-point register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, DPR, requires C++ code to select the correct error
message, as that class contains different registers depending on the
FPU. The rest can all have their diagnostic strings stored in the
tablegen decription of them.

Differential revision: https://reviews.llvm.org/D36693

llvm-svn: 315304
2017-10-10 12:35:09 +00:00
Oliver Stannard bbad419e94 [ARM, Asm] Add diagnostics for general-purpose register operands
This adds diagnostic strings for the ARM general-purpose register
classes, which will be used when these classes are expected by the
assembler, but the provided operand is not valid.

One of these, rGPR, requires C++ code to select the correct error
message, as that class contains different registers in pre-v8 and v8
targets. The rest can all have their diagnostic strings stored in the
tablegen description of them.

Differential revision: https://reviews.llvm.org/D36692

llvm-svn: 315303
2017-10-10 12:31:53 +00:00
Nicolai Haehnle 312b64f4d7 AMDGPU: Split MUBUF offset into aligned components
Summary:
Atomic buffer operations do not work (and trap on gfx9) when the
components are unaligned, even if their sum is aligned.

Previously, we generated an offset of 4156 without an SGPR by
splitting it as 4095 + 61 (immediate + inline constant). The
highest offset for which we can do this correctly is 4156 = 4092 + 64.

Fixes dEQP-GLES31.functional.ssbo.atomic.*

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37850

llvm-svn: 315302
2017-10-10 12:22:23 +00:00
Jonas Devlieghere 5b0f885691 Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This reverts commit r315297.

llvm-svn: 315299
2017-10-10 11:49:56 +00:00
Jonas Devlieghere 2eb95c33f6 [llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.

Differential revision: https://reviews.llvm.org/D36993

llvm-svn: 315297
2017-10-10 11:24:41 +00:00
Oliver Stannard 29ffd3f1d9 [AsmParser] Add DiagnosticString to register classes in tablegen
This allows a DiagnosticType and/or DiagnosticString to be associated
with a RegisterClass in tablegen, so that we can emit diagnostics in the
assembler when a register operand is incorrect.

DiagnosticType creates a predictable enum value, which gets returned as
the error code when an operand does not match, and can be used by the
assembly parser to map to a user-facing diagnostic. DiagnosticString
creates an anonymous enum value (currently based on the tablegen class
name), and a function to map from enum values to strings will be
generated. Both of these work the same was as they do for AsmOperand.

This isn't used by any targets yet, but has one (positive) side-effect.
It improves the diagnostic codes returned by validateOperandClass - we
always want to emit the diagnostic that relates to the expected operand
class, but this wasn't always being done when the expected and actual
classes were completely different (token/register/custom). This causes a
few AArch64 diagnostics to be improved, as Match_InvalidOperand was
being returned instead of a specific diagnostic type.

Differential revision: https://reviews.llvm.org/D36691

llvm-svn: 315295
2017-10-10 11:00:40 +00:00
Gadi Haber 2b132eb4f8 [X86][SKYLAKE] Update regression test to differentiate between HASWELL and SKYLAKE scheduling.<NFC>
NFC.
Updated 6 regression tests to differentiate between HASWELL and SKYLAKE scheduling information.

The fix is in preparation of a patch to update the information of the Skylake Client scheduling to include the appropriate load and store latencies.

Reviewers: zvi, RKSimon
Differential Revision: https://reviews.llvm.org/D38685

Change-Id: Ifc6b98d9eaf266913698f24c766fd994fc977555
llvm-svn: 315291
2017-10-10 09:53:18 +00:00
Florian Hahn 22a44bca40 [SCCP] Propagate integer range info for parameters in IPSCCP.
Summary:
This updates the SCCP solver to use of the ValueElement lattice for 
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to `ret i32 2` with
this change

  source_filename = "sccp.c"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
  target triple = "x86_64-unknown-linux-gnu"
  
  ; Function Attrs: norecurse nounwind readnone uwtable
  define i32 @main() local_unnamed_addr #0 {
  entry:
    %call = tail call fastcc i32 @f(i32 1)
    %call1 = tail call fastcc i32 @f(i32 47)
    %add3 = add nsw i32 %call, %call1
    ret i32 %add3
  }
  
  ; Function Attrs: noinline norecurse nounwind readnone uwtable
  define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
  entry:
    %c1 = icmp sle i32 %x, 100
  
    %cmp = icmp sgt i32 %x, 300
    %. = select i1 %cmp, i32 1, i32 2
    ret i32 %.
  }
  
  attributes #1 = { noinline }



Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 315288
2017-10-10 09:32:38 +00:00
Nemanja Ivanovic 7bf866eb10 Fix for PR34888.
The issue is that we assume operand zero of the input to the add instruction
is a register. In this case, the input comes from inline assembly and
operand zero is not a register thereby causing a crash.
The code will bail anyway if the input instruction doesn't have the right
opcode. So do that check first and let short-circuiting prevent the crash.

llvm-svn: 315285
2017-10-10 08:46:10 +00:00
Clement Courbet e2e8a5c496 Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
(fixed stability issues)

This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc.

llvm-svn: 315281
2017-10-10 08:00:45 +00:00
Craig Topper a88306e6fb [AVX512] Add patterns to commute integer comparison instructions during isel.
This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass.

llvm-svn: 315274
2017-10-10 06:36:46 +00:00
Xinliang David Li 4cdc9dab0a Renable r314928
Eliminate inttype phi with inttoptr/ptrtoint.

 This version fixed a bug in finding the matching
 phi -- the order of the incoming blocks may be 
 different (triggered in self build on Windows).
 A new test case is added.

llvm-svn: 315272
2017-10-10 05:07:54 +00:00
Reid Kleckner 97a2d5c42f [MC] Properly diagnose badly scoped .cfi_ directives
Removes two report_fatal_errors.

Implement this by removing EmitCFICommon, and do the checking in
getCurrentDwarfFrameInfo. Have the callers check for null before
dereferencing it.

llvm-svn: 315264
2017-10-10 01:49:21 +00:00
Reid Kleckner 78eb8b912f Give a test a triple
llvm-svn: 315263
2017-10-10 01:34:31 +00:00
Reid Kleckner e52d1e6787 [SEH] Use reportError instead of report_fatal_error for bad directives
This makes the .seh_ directives slightly more usable from standalone
assembly files.

This removes a large number of report_fatal_errors and recovers from the
error by ignoring the directive.

llvm-svn: 315262
2017-10-10 01:26:25 +00:00
Reid Kleckner ab23dace56 [MC] Suppress .Lcfi labels when emitting textual assembly
Summary:
This suppresses the generation of .Lcfi labels in our textual assembler.
It was annoying that this generated cascading .Lcfi labels:
  llc foo.ll -o - | llvm-mc | llvm-mc

After three trips through MCAsmStreamer, we'd have three labels in the
output when none are necessary. We should only bother creating the
labels and frame data when making a real object file.

This supercedes D38605, which moved the entire .seh_ implementation into
MCObjectStreamer.

This has the advantage that we do more checking when emitting textual
assembly, as a minor efficiency cost. Outputting textual assembly is not
performance critical, so this shouldn't matter.

Reviewers: majnemer, MatzeB

Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D38638

llvm-svn: 315259
2017-10-10 00:57:36 +00:00
Aditya Nandakumar c3bfc81a1f [GISel]: Fix generation of illegal COPYs during CallLowering
We end up creating COPY's that are either truncating/extending and this
should be illegal.

https://reviews.llvm.org/D37640

Patch for X86 and ARM by igorb, rovka

llvm-svn: 315240
2017-10-09 20:07:43 +00:00
Zvi Rackover c1d5955684 [X86] Unsigned saturation subtraction canonicalization [the backend part]
Summary:
On behalf of julia.koval@intel.com

The patch transforms canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)) to special psubus insturuction on targets, which support it(8bit and 16bit uints).
umax(a,b) - b -> subus(a,b)
a - umin(a,b) -> subus(a,b)

There is also extra case handled, when right part of sub is 32 bit and can be truncated, using UMIN(this transformation was discussed in https://reviews.llvm.org/D25987).

The example of special case code:

```
void foo(unsigned short *p, int max, int n) {

  int i;
  unsigned m;
  for (i = 0; i < n; i++) {
    m = *--p;
    *p = (unsigned short)(m >= max ? m-max : 0);
  }
}
```
Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is vaid transformation, because if max > max_short, result of the expression will be zero.

Here is the table of types, I try to support, special case items are bold:

| Size | 128 | 256 | 512
| -----  | -----  | -----   | -----
| i8 | v16i8 | v32i8 | v64i8
| i16 | v8i16 | v16i16 | v32i16
| i32 | | **v8i32** | **v16i32**
| i64 | | | **v8i64**

Reviewers: zvi, spatel, DavidKreitzer, RKSimon

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37534

llvm-svn: 315237
2017-10-09 20:01:10 +00:00
Alexey Bataev 6ab4d075ff [SLP] Add test for reversed load, NFC.
llvm-svn: 315232
2017-10-09 19:08:15 +00:00
Daniel Sanders 4d4e7650dc [globalisel] Add support for ValueType operands in patterns.
It's rare but there are a small number of patterns like this:
    (set i64:$dst, (add i64:$src1, i64:$src2))
These should be equivalent to register classes except they shouldn't check for
a specific register bank.

This doesn't occur in AArch64/ARM/X86 but does occasionally come up in other
in-tree targets such as BPF.

llvm-svn: 315226
2017-10-09 18:14:53 +00:00
Francis Ricci 01ab402463 [dsymutil] Emit valid debug locations when no symbol flags are set
Summary:
swiftc emits symbols without flags set, which led dsymutil to ignore
them when searching for global symbols, causing dwarf location data
to be omitted. Xcode's dsymutil handles this case correctly, and emits
valid location data. Add this functionality to llvm-dsymutil by
allowing parsing of symbols with no flags set.

Reviewers: aprantl, friss, JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38587

llvm-svn: 315218
2017-10-09 17:27:47 +00:00
Alexey Bataev aadc2331e4 [SLP] Test for wrongly vectorized set of extractelements, NFC.
llvm-svn: 315217
2017-10-09 17:14:03 +00:00
Zachary Turner bd3a9dbabb [llvm-rc] Have the tokenizer discard single & block comments.
This allows rc files to have comments.  Eventually we should
just use clang's c preprocessor, but that's a bit larger
effort for minimal gain, and this is straightforward.

Differential Revision: https://reviews.llvm.org/D38651

llvm-svn: 315207
2017-10-09 15:46:13 +00:00
Sanjay Patel 2a61a821a0 [DAG] combine assertsexts around a trunc
This was a suggested follow-up to:
D37017 / https://reviews.llvm.org/rL313577

llvm-svn: 315206
2017-10-09 15:22:20 +00:00
Amara Emerson 24ca39ce71 [AArch64] Improve codegen for inverted overflow checking intrinsics
E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an
intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the
inverted condition code for the CSEL.

rdar://28495949

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D38160

llvm-svn: 315205
2017-10-09 15:15:09 +00:00
Sanjay Patel 8557e29408 [x86] regenerate test checks; NFC
llvm-svn: 315204
2017-10-09 15:01:58 +00:00
Sanjay Patel be37ab864c [AArch64] fix typos in test assertions
llvm-svn: 315203
2017-10-09 01:29:54 +00:00