Commit Graph

224349 Commits

Author SHA1 Message Date
Rui Ueyama 6df27e7f21 Remove dead code.
llvm-svn: 262662
2016-03-03 21:56:21 +00:00
Simon Pilgrim f33cb61471 [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.
PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

llvm-svn: 262661
2016-03-03 21:55:01 +00:00
Davide Italiano 05920b14c8 [ELF] Be slightly more consistent, use uint8_t instead of unsigned char.
llvm-svn: 262660
2016-03-03 21:54:03 +00:00
Devin Coughlin 578a20a82e [analyzer] ObjCDeallocChecker: Only check for nil-out when type is retainable.
This fixes a crash when setting a property of struct type in -dealloc.

llvm-svn: 262659
2016-03-03 21:38:39 +00:00
Jonathan Peyton c1a7c97c1b [STATS] fix output formatting when sample count is 0
Force 0.0 to be displayed for all statistics which have sample count equal to 0

llvm-svn: 262658
2016-03-03 21:24:13 +00:00
Lang Hames 3b514554a2 [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.
The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT looking concerns don't apply.

M    unittests/ExecutionEngine/ExecutionEngineTest.cpp
M    lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

llvm-svn: 262657
2016-03-03 21:23:15 +00:00
Jonathan Peyton 30138256fa [STATS] fix master and single timers
Only the thread which executes the single/master section will update its statistics.

llvm-svn: 262656
2016-03-03 21:21:05 +00:00
Sylvestre Ledru 0b027eab7a fix some minor typos in the doc
llvm-svn: 262655
2016-03-03 20:57:16 +00:00
Sylvestre Ledru 4098de4062 Fix two minor syntax issues in the documentation
llvm-svn: 262654
2016-03-03 20:54:26 +00:00
Rafael Espindola ec9957bd0b Delete dead code.
llvm-svn: 262653
2016-03-03 20:45:26 +00:00
Carlo Bertolli 430d8ecc55 Add code generation for teams directive inside target region
llvm-svn: 262652
2016-03-03 20:34:23 +00:00
George Rimar e5960ceb6b [ELF] - Do not allow .bss to occupy the file space when producing relocatable output
When generating relocatable output SHT_NOBITS sections
were still occupy the file space.

Differential revision: http://reviews.llvm.org/D17857

llvm-svn: 262650
2016-03-03 20:24:14 +00:00
Tobias Grosser 2880f10aa4 tests: Fix some spelling mistakes
llvm-svn: 262649
2016-03-03 19:51:03 +00:00
Philip Reames b7270446cf [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Tobias Grosser 4e442d75e5 docs: Fix some spelling mistakes
llvm-svn: 262647
2016-03-03 19:48:30 +00:00
Philip Reames 146307eb52 [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Sanjay Patel 9bba75084b [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Xinliang David Li dd12e9a8c0 [PGO] Add API for profile merge from buffer
Differential Revision: http://reviews.llvm.org/D17831

llvm-svn: 262644
2016-03-03 18:54:46 +00:00
Easwaran Raman fd6557e368 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Rafael Espindola 005d84430d Fix PR26818.
The hack of using a plt address as the address of an undefined function
only works in executables. Don't try it with shared libraries.

llvm-svn: 262642
2016-03-03 18:44:38 +00:00
Anastasia Stulova 782d5f43ca [OpenCL] Improve diagnostics of address spaces for variables in function
- Prevent local variables to be declared in global AS
 - Diagnose AS of local variables with an extern storage class
   as if they would be in a program scope

Review: http://reviews.llvm.org/D17345
llvm-svn: 262641
2016-03-03 18:38:40 +00:00
Sanjoy Das 3928910fe6 [ConstantRange] Rename test; NFC
llvm-svn: 262640
2016-03-03 18:31:33 +00:00
Sanjoy Das 724f5cf278 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das 11ef606f1d [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Sanjoy Das f3867e64a8 [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges
This will be used in a later patch to ScalarEvolution.  Right now only
the unit tests exercise the newly added code.

llvm-svn: 262637
2016-03-03 18:31:16 +00:00
Easwaran Raman 3035719c86 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Simon Pilgrim abcee45b7a [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS
The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic.

This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.

Differential Revision: http://reviews.llvm.org/D17681

llvm-svn: 262635
2016-03-03 18:13:53 +00:00
Dehao Chen 57d1dda558 Use LineLocation instead of CallsiteLocation to index callsite profile.
Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specifying callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples).

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17827

llvm-svn: 262634
2016-03-03 18:09:32 +00:00
Simon Pilgrim 022afe2538 [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.
lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.

llvm-svn: 262633
2016-03-03 17:54:35 +00:00
Filipe Cabecinhas 3fb319c3cf [test/ubsan/coverage-levels] Fix file references in UBSAN_OPTIONS
llvm-svn: 262632
2016-03-03 17:37:35 +00:00
Simon Pilgrim 0107d24810 [X86] Pulled out repeated code testing for constant vector shift amount. NFCI.
llvm-svn: 262631
2016-03-03 17:35:43 +00:00
Daniel Jasper 94a96fcebe clang-format: Use stable_sort when sorting #includes.
Otherwise, clang-format can output useless replacements in the presence
of identical #includes

llvm-svn: 262630
2016-03-03 17:34:14 +00:00
Michael Kruse faedfcbf6d [BlockGenerator] Fix PHI merges for MK_Arrays.
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.

This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.

Also see llvm.org/PR26718.

llvm-svn: 262629
2016-03-03 17:20:43 +00:00
Amjad Aboud 0ce261d052 MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI.
Fixed the ordering to check first for X86 interrupt handler then for MCU target.

Differential Revision: http://reviews.llvm.org/D17801

llvm-svn: 262628
2016-03-03 17:17:54 +00:00
Ahmed Bougacha 671795a985 [X86] Don't assume that shuffle non-mask operands starts at #0.
That's not the case for VPERMV/VPERMV3, which cover all possible
combinations (the C intrinsics use a different order; the AVX vs
AVX512 intrinsics are different still).

Since:
  r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.

This breaks assumptions in most callers, as they expect
the non-mask operands to start at index 0.
VPERMV has the mask as operand #0; VPERMV3 has it in the middle.

Instead of the faulty assumption, have getTargetShuffleMask return
its operands as well.

One alternative we considered was to change the operand order of
VPERMV, but we agreed to stick to the instruction order, as there
are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).

Differential Revision: http://reviews.llvm.org/D17041

llvm-svn: 262627
2016-03-03 16:53:50 +00:00
Rafael Espindola 1130935c4a Simplify error handling.
This makes fatal return T when there is no error. This avoids the need
for quite a few temporaries.

llvm-svn: 262626
2016-03-03 16:21:44 +00:00
Samuel Antao b68e2db8f9 [OpenMP] Code generation for teams - kernel launching
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.

The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.

Reviewers: hfinkel, arpith-jacob, kkwli0, carlo.bertolli, ABataev

Subscribers: cfe-commits, caomhin, fraggamuffin

Differential Revision: http://reviews.llvm.org/D17019

llvm-svn: 262625
2016-03-03 16:20:23 +00:00
Matthew Simpson b840a6d6f4 [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Sanjay Patel d6cb4ec2a2 [AArch64] fold 'isPositive' vector integer operations (PR26819)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

llvm-svn: 262623
2016-03-03 15:56:08 +00:00
Pavel Labath c6ba6ae209 Revert "Fetch remote log files from LLGS tests"
Even after the last fixup, there still seems to be one failure left. Revert until I figure out
what is going on.

llvm-svn: 262622
2016-03-03 15:19:14 +00:00
Igor Breger 639fde79b0 AVX512: Combine AND + TESTM instructions .
Differential Revision: http://reviews.llvm.org/D17844

llvm-svn: 262621
2016-03-03 14:18:38 +00:00
Renato Golin f824ced6a1 Making rem_crash.ll target-specific
This test failed in some ARM bots after a divmod change because it was
running on a native llc, instead of targeted one. This makes sure the test
is target-specific (as intended), and also copies to ARM and AArch64
directories. If it is also supposed to work on other architectures, I'll
leave as an exercise to the respective maintainers.

llvm-svn: 262620
2016-03-03 14:01:10 +00:00
Bradley Smith f4affc13c5 [ARM] Add Clang targeting for ARMv8-M Baseline/Mainline
llvm-svn: 262619
2016-03-03 13:52:22 +00:00
Gabor Horvath 65c02ec837 [clang-tidy] Improve the robustness of a test.
llvm-svn: 262618
2016-03-03 13:43:23 +00:00
Michael Zuckerman 0d67e4b5d6 [CLANG][AVX512][BUILTIN] movddup{128|256|512}
Differential Revision: http://reviews.llvm.org/D17826

llvm-svn: 262617
2016-03-03 13:43:05 +00:00
Anastasia Stulova 1f95cc097c [OpenCL] Apply missing restrictions for Blocks in OpenCL v2.0
Applying the following restrictions for block types in OpenCL (v2.0 s6.12.5):
 - __block storage class is disallowed
 - every block declaration must be const qualified and initialized
 - a block can't be used as a return type of a function
 - a blocks can't be used to declare a structure or union field
 - extern speficier is disallowed

Corrected image and sampler types diagnostics with struct and unions.

Review: http://reviews.llvm.org/D16928
llvm-svn: 262616
2016-03-03 13:33:19 +00:00
Gabor Horvath f4336ac9eb [clang-tidy] Do not emit warnings from misc-suspicious-semicolon when the compilation fails.
llvm-svn: 262615
2016-03-03 13:08:11 +00:00
Johannes Doerfert ac37c565b5 Fix typo [NFC]
llvm-svn: 262613
2016-03-03 12:30:19 +00:00
Johannes Doerfert df88023d2b [FIX] Consolidation of loads with same pointer but different access relation
This should fix PR19422.

  Thanks to Jeremy Huddleston Sequoia for reporting this.
  Thanks to Roman Gareev for his investigation and the reduced test case.

llvm-svn: 262612
2016-03-03 12:26:58 +00:00
Michael Zuckerman 56de012b41 Fixing a checkfile error in avx512vlbw-builtins.c
Differential Revision: http://reviews.llvm.org/D17814

llvm-svn: 262611
2016-03-03 12:17:50 +00:00