Commit Graph

130980 Commits

Author SHA1 Message Date
Simon Pilgrim ca140b17cb [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
llvm-svn: 268206
2016-05-01 20:43:02 +00:00
Simon Pilgrim c590492075 Dropped FIXME comment
llvm-svn: 268205
2016-05-01 20:33:25 +00:00
Simon Pilgrim eeacc40e27 [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
llvm-svn: 268204
2016-05-01 20:22:42 +00:00
Simon Pilgrim cc7f567b6a [InstCombine][AVX] Fixed PERMILVAR identity tests and added additional decode tests
llvm-svn: 268203
2016-05-01 20:06:47 +00:00
Simon Pilgrim e5e8c2fde0 [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
llvm-svn: 268202
2016-05-01 19:26:21 +00:00
Simon Pilgrim cae3e70707 [InstCombine][SSE] Regenerate MOVSX/MOVZX tests
llvm-svn: 268201
2016-05-01 18:28:45 +00:00
Craig Topper b6da65403a [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW.
llvm-svn: 268200
2016-05-01 17:38:32 +00:00
Simon Pilgrim 8cddf8b3c6 [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
llvm-svn: 268199
2016-05-01 16:41:22 +00:00
Simon Pilgrim 33ae13d3c3 Fixed MSVC 'not all control paths return a value' warning
llvm-svn: 268198
2016-05-01 15:52:31 +00:00
Simon Pilgrim 0aa27cd29b Document the LLVM_ENABLE_EXPENSIVE_CHECKS cmake option introduced in r268050
llvm-svn: 268197
2016-05-01 15:27:47 +00:00
Igor Breger 110af565c7 getelementptr instruction, support index vector of EVT.
Differential Revision: http://reviews.llvm.org/D19775

llvm-svn: 268195
2016-05-01 13:29:12 +00:00
Igor Breger 131008fbcb Change AVX512 braodcastsd/ss patterns interaction with spilling . New implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
Differential Revision: http://reviews.llvm.org/D19579

llvm-svn: 268190
2016-05-01 08:40:00 +00:00
Craig Topper e430de8be6 [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported.
llvm-svn: 268189
2016-05-01 06:52:19 +00:00
Craig Topper 5acb5a1caf [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB and VPMADDUBSW/VPMADDWD.
llvm-svn: 268188
2016-05-01 06:24:57 +00:00
Craig Topper db290664f6 [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also marked as requiring VLX.
llvm-svn: 268186
2016-05-01 05:57:06 +00:00
Craig Topper f77ca947ce [X86] Add an AddedComplexity to another pattern to put it near similar in the output file.
llvm-svn: 268184
2016-05-01 05:22:15 +00:00
Craig Topper 742977ede8 [X86] Remove a seemlingly unused pattern. The same pattern appears elsewhere with an AddedComplexity that made this unreachable.
llvm-svn: 268183
2016-05-01 05:22:13 +00:00
Craig Topper eb9a87918b [X86] Add AddedComplexity to keep some similar patterns near each other in the output file.
llvm-svn: 268181
2016-05-01 04:59:49 +00:00
Craig Topper 7ed84d826e [X86] Remove some redundant selection patterns.
llvm-svn: 268180
2016-05-01 04:59:46 +00:00
Craig Topper c9b1923358 [AVX512] Replace vector_extract with extractelt in some patterns. They mean the same thing but vector_extract is deprecated. NFC
llvm-svn: 268179
2016-05-01 04:59:44 +00:00
Sanjoy Das f2f00fb11a [SCEV] When printing via -analysis, dump loop disposition
There are currently some bugs in tree around SCEV caching an incorrect
loop disposition.  Printing out loop dispositions will let us write
whitebox tests as those are fixed.

The dispositions are printed as a list in "inside out" order,
i.e. innermost loop first.

llvm-svn: 268177
2016-05-01 04:51:05 +00:00
Amaury Sechet 8a367d404f Properly name LLVMSetIsInBounds's argument. NFC
llvm-svn: 268176
2016-05-01 02:23:14 +00:00
Amaury Sechet 81243a73ef Capitalize align argument in the C API as per convention. NFC
llvm-svn: 268175
2016-05-01 01:42:34 +00:00
Craig Topper 99f6b620cc [AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.
llvm-svn: 268174
2016-05-01 01:03:56 +00:00
Lang Hames 2307f405cc [ORC] Save AArch64 NEON state in the JIT reentry block.
The earlier version of the resolver code did not save NEON state, so it would
have broken any callees that used floating point.

llvm-svn: 268173
2016-05-01 00:14:45 +00:00
Rui Ueyama 53aa9f2475 [lit] Add %:[STpst] to represent paths without colons on Windows.
Summary:
We need these variables to concatenate two absolute paths to construct
a valid path. Currently, %t\%t is, for example, expanded to C:\foo\C:\foo,
which is not a valid path because ":" is not a valid path character
on Windows. With this patch, %t will be expanded to C\foo.

Differential Revision: http://reviews.llvm.org/D19757

llvm-svn: 268168
2016-04-30 21:32:12 +00:00
Simon Pilgrim c179435055 [InstCombine][AVX2] Added VPERMD/VPERMPS shuffle combining placeholder tests.
For future support for VPERMD/VPERMPS to generic shuffles combines

llvm-svn: 268166
2016-04-30 20:41:52 +00:00
Saleem Abdulrasool e0f0c0e247 CodeGen: convert to range based loops
Convert to using some range based loops, avoid unnecessary variables for
unchecked casts.  NFC.

llvm-svn: 268165
2016-04-30 18:15:34 +00:00
Craig Topper e012ede137 [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.
llvm-svn: 268164
2016-04-30 17:59:49 +00:00
Rafael Espindola 92dd7b82be Add missing override.
llvm-svn: 268163
2016-04-30 15:18:21 +00:00
Marcin Koscielnicki 57290f934a [ASan] Add shadow offset for SystemZ.
SystemZ on Linux currently has 53-bit address space.  In theory, the hardware
could support a full 64-bit address space, but that's not supported due to
kernel limitations (it'd require 5-level page tables), and there are no plans
for that.  The default process layout stays within first 4TB of address space
(to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine.
Let's use 1 << 52 here, ie. exactly half the address space.

I've originally used 7 << 50 (uses top 1/8th of the address space), but ASan
runtime assumes there's some space after the shadow area.  While this is
fixable, it's simpler to avoid the issue entirely.

Also, I've originally wanted to have the shadow aligned to 1/8th the address
space, so that we can use OR like X86 to assemble the offset.  I no longer
think it's a good idea, since using ADD enables us to load the constant just
once and use it with register + register indexed addressing.

Differential Revision: http://reviews.llvm.org/D19650

llvm-svn: 268161
2016-04-30 09:57:34 +00:00
Simon Pilgrim 8e38a5439b [InstCombine][AVX] Split off VPERMILVAR tests and added additional tests for UNDEF mask elements
llvm-svn: 268159
2016-04-30 07:32:19 +00:00
Simon Pilgrim 640f9964c7 [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

llvm-svn: 268158
2016-04-30 07:23:30 +00:00
Sriraman Tallam c198d3344e Differential Revision: http://reviews.llvm.org/D19753
Delete Target Option PositionIndependentExecutable as PIE is now part of module flags.

llvm-svn: 268155
2016-04-30 04:18:52 +00:00
Tom Stellard c51e4468b7 AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits
This was supposed to be part of r268143.

llvm-svn: 268154
2016-04-30 04:04:48 +00:00
Hal Finkel 17e9754dd4 [PowerPC/QPX] Fix the load/splat peephole with overlapping reads
If, in between the splat and the load (which does an implicit splat), there is
a read of the splat register, then that register must have another earlier
definition. In that case, we can't replace the load's destination register with
the splat's destination register.

Unfortunately, I don't have a small or non-fragile test case.

llvm-svn: 268152
2016-04-30 01:59:28 +00:00
Amjad Aboud 72da9391f0 Reverting 268054 & 268063 as they caused PR27579.
llvm-svn: 268150
2016-04-30 01:44:07 +00:00
Sanjoy Das 47cf2affbd [LowerGuardIntrinsics] Keep track of !make.implicit metadata
If a guard call being lowered by LowerGuardIntrinsics has the
`!make.implicit` metadata attached, then reattach the metadata to the
branch in the resulting expanded form of the intrinsic.  This allows us
to implement null checks as guards and still get the benefit of implicit
null checks.

llvm-svn: 268148
2016-04-30 00:55:59 +00:00
Lawrence Hu 1befea2bdc Reroll loops with multiple IV and negative step part 3
support multiple induction variables

    This patch enable loop reroll for the following case:
        for(int i=0;  i<N; i += 2) {
           S += *a++;
           S += *a++;
        };

Differential Revision: http://reviews.llvm.org/D16550

llvm-svn: 268147
2016-04-30 00:51:22 +00:00
Lang Hames df29078dc8 [Orc] Fix the AArch64 resolver size.
llvm-svn: 268146
2016-04-30 00:50:26 +00:00
Vedant Kumar 62db78449f Fix a typo (NFC)
llvm-svn: 268144
2016-04-30 00:32:54 +00:00
Tom Stellard cb6ba62d6f AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

llvm-svn: 268143
2016-04-30 00:23:06 +00:00
Sanjoy Das 52c68bb0f5 [LowerGuardIntrinsics] Preserve calling conv when lowering
llvm-svn: 268142
2016-04-30 00:17:47 +00:00
Sanjay Patel bc6fad0bdf add minimal test to show dropped metadata
llvm-svn: 268141
2016-04-30 00:12:54 +00:00
Sanjay Patel 6748ec49e9 remove the metadata added with r267827
We can demonstrate the 'select' bug and fix with a simpler test case.
The merged weight values are already tested in another test.

llvm-svn: 268139
2016-04-30 00:02:36 +00:00
Xinliang David Li 4b2fdccad9 Reapply r268107 after fixing a bug breaks debug build.
Makes the new method to set data needed by debug dump.

llvm-svn: 268130
2016-04-29 22:59:36 +00:00
Sanjoy Das 107aefc2fc Mark guards on true as "trivially dead"
This moves some logic added to EarlyCSE in rL268120 into
`llvm::isInstructionTriviallyDead`.  Adds a test case for DCE to
demonstrate that passes other than EarlyCSE can now pick up on the new
information.

llvm-svn: 268126
2016-04-29 22:23:16 +00:00
Chris Bieneman aa8dfe9fb3 [CMake] [Xcode] Improving Xcode toolchain generation to support distribution targets
This adds a new target `install-distribution-toolchain` which will install an Xcode toolchain featuring just the LLVM components specified in LLVM_DISTRIBUTION_COMPONENTS.

llvm-svn: 268125
2016-04-29 22:19:35 +00:00
Sanjay Patel 1d0ac7c5b8 clean up documentation comments; NFC
llvm-svn: 268122
2016-04-29 22:03:27 +00:00
Haicheng Wu 4afe0425db [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.
Fix a FIXME.  Disable loop alignment if compiled with -Oz now.

llvm-svn: 268121
2016-04-29 22:01:10 +00:00