Commit Graph

145226 Commits

Author SHA1 Message Date
Daniel Berlin d46dfb3d0e Which, in turn, causes build bots to fail that have it unexpectedly passing. So remove debugcounter.ll for now
llvm-svn: 295597
2017-02-19 04:56:07 +00:00
Daniel Berlin 1ad7f5cab0 XFAIL this test until we figure out what to do here, since it will fail if NDEBUG defined
llvm-svn: 295596
2017-02-19 04:55:02 +00:00
Daniel Berlin ce3d348a5c Add two files lost in rebase, causing build break
llvm-svn: 295595
2017-02-19 04:29:50 +00:00
Daniel Berlin a4b5c01dd2 Add a DebugCounter for PredicateInfo renaming, and an associated test
llvm-svn: 295594
2017-02-19 04:29:01 +00:00
Daniel Berlin 25f1db1111 Add initial support for debug counting
Summary:

We have support for bisection, and bugpoint can reduce testcases
often to a single pass. But that doesn't help reduce it to a single
transform by a single pass.  Which debug counting lets us do.

Debug counting lets you instrument a pass so that it only executes a
certain thing (rwhatever you want) after skipping it a certain time of
times, and then only does a certain number of executions before saying
"skip" again.

To make it concrete, for predicateinfo, if i instrument use renaming,
i can make it so it skips renaming the first N uses, renames the next
N, and then skips the rest.

This lets you narrow down a miscompilation to, often, a single
transformation, and then also debug it (by using the same command line
parameters).

Reviewers: chandlerc, davide, mehdi_amini

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D29998

llvm-svn: 295593
2017-02-19 04:28:56 +00:00
NAKAMURA Takumi 486dfe11af llvm/test/CodeGen/AMDGPU/r600.alu-limits.ll should require +Asserts. This would run into infinite loop anyways with -Asserts.
llvm-svn: 295591
2017-02-19 02:31:06 +00:00
Craig Topper 007c93b2b9 [X86] Remove patterns for MOVSD with v4i32 types. We don't appear to really need them and if we do we should just use a bitcast to a 64-bit element type.
llvm-svn: 295589
2017-02-19 02:08:48 +00:00
Craig Topper 06ae5e821c [X86] Tighten up some of the SDNode type constraints.
llvm-svn: 295588
2017-02-19 01:54:47 +00:00
Simon Pilgrim dba9011942 Fix unused variable warning when assertions are disabled.
llvm-svn: 295587
2017-02-19 00:33:37 +00:00
Simon Pilgrim 599b872ca2 [X86] Fix enumeral/non-enumeral conditional expression warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295586
2017-02-19 00:04:30 +00:00
Simon Pilgrim ee6ef4d6dd Fix 'variable set but not used' warning when assertions are disabled.
llvm-svn: 295585
2017-02-19 00:03:46 +00:00
Daniel Berlin f7d9580a08 NewGVN: Start making use of predicateinfo pass.
Summary: This begins using the predicateinfo pass in NewGVN.

Reviewers: davide

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D29682

llvm-svn: 295583
2017-02-18 23:06:50 +00:00
Daniel Berlin b355c4ff5f NewGVN: Make ranking prefer undef to constants. Fix direction of
shouldSwapOperands to be correct.

llvm-svn: 295582
2017-02-18 23:06:47 +00:00
Daniel Berlin 588e0be39d PredicateInfo: Clean up predicate info a little, using insertion
helpers, and fixing support for the renaming the comparison.

llvm-svn: 295581
2017-02-18 23:06:38 +00:00
Simon Pilgrim 2f2d8dc630 Fix signed/unsigned comparison warning.
llvm-svn: 295580
2017-02-18 22:56:17 +00:00
Craig Topper 811756b4dc [X86][XOP] Reduce the size of a multiclass by moving more stuff to parameters instead of doing 128-bit and 256-bit simultaneously.
This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions.

llvm-svn: 295579
2017-02-18 22:53:43 +00:00
Craig Topper f6564c991b [TableGen] Make sure EnforceSameSize populates the type sets if necessary.
This was found by another commit I'm working on.

llvm-svn: 295578
2017-02-18 22:53:38 +00:00
Simon Pilgrim b092166a76 [AArch64] Fix enumeral/non-enumeral conditional expression warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295577
2017-02-18 22:50:28 +00:00
Simon Pilgrim 7a87eebcad [X86] Fix enumeral/non-enumeral comparison warning.
gcc only allows you to mix enums / ints if they have the same signedness.

llvm-svn: 295576
2017-02-18 22:40:58 +00:00
Simon Pilgrim 2e78c94ea5 [X86][SSE] Avoid repeated calls to SDValue::getValueType.
Added assertion to check input type of X86ISD::VZEXT during target known bits calculation.

llvm-svn: 295575
2017-02-18 22:25:27 +00:00
Sanjay Patel 53c5c3d65d [InstCombine] add nsw/nuw X, signbit --> or X, signbit
Changing to 'or' (rather than 'xor' when no wrapping flags are set)
allows icmp simplifies to happen as expected.

Differential Revision: https://reviews.llvm.org/D29729

llvm-svn: 295574
2017-02-18 22:20:09 +00:00
Sanjay Patel fe67255961 [InstSimplify] add nsw/nuw (xor X, signbit), signbit --> X
The change to InstCombine in:
https://reviews.llvm.org/D29729
...exposes this missing fold in InstSimplify, so adding this
first to avoid a regression.

llvm-svn: 295573
2017-02-18 21:59:09 +00:00
Sanjay Patel 308eb22118 [InstSimplify] add tests for add nsw/nuw (xor X, signbit), signbit --> X; NFC
llvm-svn: 295572
2017-02-18 21:51:14 +00:00
Craig Topper de10312bea Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
Clang has now been fixed to not use these intrinsics.

llvm-svn: 295571
2017-02-18 21:50:58 +00:00
Sanjay Patel dc8a24ea4c [x86] remove stale comments from tests; NFC
llvm-svn: 295569
2017-02-18 21:07:37 +00:00
Sanjay Patel 12c2093e1e [x86] fold sext (xor Bool, -1) --> sub (zext Bool), 1
This is the same transform that is current used for:
select Bool, 0, -1

llvm-svn: 295568
2017-02-18 21:03:28 +00:00
Piotr Padlewski cc5868c186 [MemorySSA] NFC small fixes
Summary:
2 small fixes extracted from
https://reviews.llvm.org/D29064

Reviewers: kuhar, davide, dberlin, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30109

llvm-svn: 295566
2017-02-18 20:34:36 +00:00
Craig Topper ba2a726cc6 Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support.

llvm-svn: 295565
2017-02-18 20:14:20 +00:00
Craig Topper 884db3f85d [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.
It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses.

llvm-svn: 295564
2017-02-18 19:51:25 +00:00
Craig Topper 03a9adc2ba [X86][IR] Simplify the XOP vpcmov autoupgrade code. NFC
llvm-svn: 295563
2017-02-18 19:51:19 +00:00
Craig Topper aa49f14496 [X86][IR] Merge together some very similar AutoUpgrade handling. NFC
llvm-svn: 295562
2017-02-18 19:51:14 +00:00
Matt Arsenault 2021f08080 AMDGPU: Fix assembler subtarget predicate for gfx9
This was accepting GFX9 instructions on VI.

llvm-svn: 295557
2017-02-18 19:12:26 +00:00
Matt Arsenault a3b3b489fb AMDGPU: Fix disassembly of aperture registers
llvm-svn: 295555
2017-02-18 18:41:41 +00:00
Matt Arsenault e823d92f7f AMDGPU: Merge initial gfx9 support
llvm-svn: 295554
2017-02-18 18:29:53 +00:00
Sanjay Patel 6d5dddb85f [InstCombine] add tests for trunc(insertelement); NFC
llvm-svn: 295553
2017-02-18 18:27:04 +00:00
Easwaran Raman 617f63640b Refactor instruction simplification code in visitors. NFC.
Several visitors check if operands to the instruction are constants,
either as it is or after looking up SimplifiedValues, check if the
result is a constant and update the SimplifiedValues map. This
refactoring splits it into a common function that does the checking of
whether the operands are constants and updating of the SimplifiedValues
table, and an instruction specific part that is implemented by each
instruction visitor as a lambda and passed to the common function.

Differential revision: https://reviews.llvm.org/D30104

llvm-svn: 295552
2017-02-18 17:22:52 +00:00
Sanjay Patel 86554de2bd [InstCombine] update trunc(shuffle) tests to reflect IR reality; NFC
We're ok shrinking splats, but not shuffles in general.

See https://reviews.llvm.org/D30123 for discussion.

llvm-svn: 295547
2017-02-18 15:24:31 +00:00
Brian Cain cc24826e1e opt-viewer: Fix syntax highlighting
Syntax highlighting has been done line-at-a-time. Done this way, the lexer
resets the context at each line, distorting the formatting.

This change will render the whole file at once and feed the highlighted text
line-at-a-time to be wrapped by the SourceFileRenderer.

Leading/trailing newlines were being ignored by Pygments but since each line
was rendered in its own row, it didn't matter. This bug was masked by the
line-at-a-time algorithm. So now we need to add "stripnl=False" to the 
CppLexer to change its behavior to match the expectation.

llvm-svn: 295546
2017-02-18 15:13:58 +00:00
Craig Topper a505169ca5 [AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions.
llvm-svn: 295543
2017-02-18 07:07:50 +00:00
Dehao Chen b70afdb5e8 Add default OptLevel value for createSimpleLoopUnrollPass to fix the build break introduced by r295538. (NFC)
llvm-svn: 295542
2017-02-18 06:42:16 +00:00
Jan Vesely 4b1243facb AMDGPU/R600: Assert on infinite loop in EmitClauseMarkers
Differential Revision: https://reviews.llvm.org/D29792

llvm-svn: 295539
2017-02-18 04:24:10 +00:00
Dehao Chen 7d230325ef Increases full-unroll threshold.
Summary:
The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold

This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge):

Performance:

403	0.11%
433	0.51%
445	0.48%
447	3.50%
453	1.49%
464	0.75%

Code size:

403	0.56%
433	0.96%
445	2.16%
447	2.96%
453	0.94%
464	8.02%

The compiler time overhead is similar with code size.

Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc

Reviewed By: hfinkel, chandlerc

Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D28368

llvm-svn: 295538
2017-02-18 03:46:51 +00:00
Davide Italiano 982bf827b5 [IR/Verifier] Don't visit DISubprograms more than needed.
Before this patch we happened to visit twice, one when scanning
MDNodes and the other one while visiting the function. Remove
the explicit call to visitDISubprogram there, so we don't emit
the same error twice in case the verifier fail and we save some
time when running it.
Thanks to Justin Bogner for the report and Adrian for the quick
review!

PR: 31995
llvm-svn: 295537
2017-02-18 03:02:44 +00:00
Dylan McKay 83788ff349 [AVR] Set UseIntegratedAssembler
llvm-svn: 295535
2017-02-18 02:26:11 +00:00
Justin Bogner 7bc978b543 OptDiag: Allow constructing DiagnosticLocation from DISubprograms
This avoids creating a DILocation just to represent a line number,
since creating Metadata is expensive. Creating a DiagnosticLocation
directly is much cheaper.

llvm-svn: 295531
2017-02-18 02:00:27 +00:00
Zachary Turner 607418771e Remove the is_trivially_copyable check entirely.
This is still breaking builds because some compilers think
this type is not trivially copyable even when it should be.

Reverting this static_assert until I have time to investigate.

llvm-svn: 295529
2017-02-18 01:51:00 +00:00
Zachary Turner 3720124545 Use llvm workaround for missing is_trivially_copyable.
some versions of GCC don't have this, so LLVM provides a
workaround.

llvm-svn: 295526
2017-02-18 01:46:01 +00:00
Zachary Turner 181fe17b6f Don't assume little endian in StreamReader / StreamWriter.
In an effort to generalize this so it can be used by more than
just PDB code, we shouldn't assume little endian.

llvm-svn: 295525
2017-02-18 01:35:33 +00:00
Matthias Braun 2a707a3d3d machine-region-info.mir: Slightly simplify test, -mtriple
llvm-svn: 295520
2017-02-18 00:48:43 +00:00
Justin Bogner d890f95bf6 OptDiag: Decouple backend diagnostics from debug info metadata
This creates and uses a DiagnosticLocation type rather than using
DebugLoc for this purpose in the backend diagnostics. This is NFC for
now, but will allow us to create locations for diagnostics without
having to create new metadata nodes when we don't have a DILocation.

llvm-svn: 295519
2017-02-18 00:42:23 +00:00
Matthias Braun 431305927f MachineRegionInfo: Fix pass initialization
- Adapt MachineBasicBlock::getName() to have the same behavior as the IR
  BasicBlock (Value::getName()).
- Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in
  the CodeGen library.
- MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region").
- MachineRegionInfo should depend on MachineDominatorTree,
  MachinePostDominatorTree and MachineDominanceFrontier instead of their
  respective IR versions.
- Since there were no tests for this, add a X86 MIR test.

Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com>

llvm-svn: 295518
2017-02-18 00:41:16 +00:00
Justin Bogner efc3fbf6a2 Verifier: Disallow a line number without a file in DISubprogram
A line number doesn't make much sense if you don't say where it's
from. Add a verifier check for this and update some tests that had
bogus debug info.

llvm-svn: 295516
2017-02-17 23:57:42 +00:00
Sanjay Patel f8346550bf [InstCombine] add tests for trunc(shuffle X, C, M); NFC
llvm-svn: 295513
2017-02-17 23:16:54 +00:00
Matthias Braun d9a59a8df8 AArch64LoadStoreOptimizer: Correctly clear kill flags
When promoting the Load of a Store-Load pair to a COPY all kill flags
between the store and the load need to be cleared.

rdar://30402435

Differential Revision: https://reviews.llvm.org/D30110

llvm-svn: 295512
2017-02-17 23:15:03 +00:00
Simon Pilgrim 8670993dc1 [X86] Add MOVBE targets to load combine tests
Test folded endian swap tests with MOVBE instructions.

llvm-svn: 295508
2017-02-17 23:00:21 +00:00
Guozhi Wei 7ec2c72095 [PPC] Give unaligned memory access lower cost on processor that supports it
Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost.

This patch fixes pr31492.

Differential Revision: https://reviews.llvm.org/D28630

This is resubmit of r292680, which was reverted by r293092. The internal application failures were actually caused by a source code bug.

llvm-svn: 295506
2017-02-17 22:29:39 +00:00
Eugene Zelenko be37db1882 [CodeGen] Revert changes in LowLevelType to pre-r295499 to fix broken buildbots.
llvm-svn: 295505
2017-02-17 22:23:34 +00:00
Krzysztof Parzyszek 1aaf41af54 [Hexagon] Start using regmasks on calls
Reapply r295371 with a fix for the Windows bot failures.

llvm-svn: 295504
2017-02-17 22:14:51 +00:00
Davide Italiano bca05df38b [NewGVN] isOnlyReachableViaThisEdge() is dead now. NFCI.
llvm-svn: 295503
2017-02-17 22:12:30 +00:00
Simon Pilgrim 7db8f42fe3 [X86] Simplify by pulling out valuetype. NFCI.
llvm-svn: 295502
2017-02-17 22:10:10 +00:00
Eugene Zelenko 9676ebec15 [CodeGen] Attempt to fix buildbots broken in r295499.
llvm-svn: 295501
2017-02-17 22:07:26 +00:00
Davide Italiano 6db0ecafd7 [NewGVN] createVariableOrConstant is not required anymore. NFCI.
llvm-svn: 295500
2017-02-17 21:55:47 +00:00
Eugene Zelenko 5db84df728 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295499
2017-02-17 21:43:25 +00:00
Simon Pilgrim 996f9b4cad [X86] Add subborrow stack folding tests
llvm-svn: 295496
2017-02-17 21:16:24 +00:00
Sanjay Patel 00872c3dfe [x86] add tests for sext (not bool); NFC
llvm-svn: 295495
2017-02-17 21:10:40 +00:00
Matthew Simpson a899f86054 [LAA] Remove unused code (NFC)
llvm-svn: 295493
2017-02-17 20:46:52 +00:00
Simon Pilgrim a4c350ff17 [X86][SSE] Add (V)MOVD folding pattern with zextloadi64i32 load node.
Fixes PRPR31309

llvm-svn: 295492
2017-02-17 20:43:32 +00:00
Adrian Prantl e6c6a945ca Fix windows bots by locking down the target triple on this testcase.
llvm-svn: 295490
2017-02-17 20:02:26 +00:00
Matt Arsenault f6cf1032fd AMDGPU: Fix crashes on invalid icmp/fcmp intrinsics
llvm-svn: 295489
2017-02-17 19:49:10 +00:00
Peter Collingbourne 184773d81f WholeProgramDevirt: For VCP use a 32-bit ConstantInt for the byte offset.
A future change will cause this byte offset to be inttoptr'd and then exported
via an absolute symbol. On the importing end we will expect the symbol to be
in range [0,2^32) so that it will fit into a 32-bit relocation. The problem
is that on 64-bit architectures if the offset is negative it will not be in
the correct range once we inttoptr it.

This change causes us to use a 32-bit integer so that it can be inttoptr'd
(which zero extends) into the correct range.

Differential Revision: https://reviews.llvm.org/D30016

llvm-svn: 295487
2017-02-17 19:43:45 +00:00
Adrian Prantl 67c2442210 Debug Info: Sort frame index expressions before emitting them.
This fixes PR31381, which caused an assertion and/or invalid debug info.

This affects debug variables that have multiple fragments in the MMI
side (i.e.: in the stack frame) table.
rdar://problem/30571676

llvm-svn: 295486
2017-02-17 19:42:32 +00:00
Petr Hosek 7acbeaf1bc [CMake] Support externalizing debug info on non-Darwin platforms
On other platorms, we use objcopy to export the debug info.

Differential Revision: https://reviews.llvm.org/D28575

llvm-svn: 295481
2017-02-17 19:29:12 +00:00
Simon Pilgrim a3f2803905 [X86][SHA] Add SHA stack folding tests
llvm-svn: 295479
2017-02-17 19:24:55 +00:00
Artyom Skrobov 4592f6206c In Thumb1 mode, the custom lowering for ARMISD::CMPZ could never emit tADDi3
Reviewers: jmolloy, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30097

llvm-svn: 295478
2017-02-17 18:59:16 +00:00
Simon Pilgrim f4f5cd5d19 [X86][TBM] Add TBM stack folding tests
llvm-svn: 295477
2017-02-17 18:51:53 +00:00
Tim Northover 88634996c7 GlobalISel: verify that generic loads & stores have a mem operand.
The mem operand is used by GlobalISel to convey atomic constraints so dropping
it is invalid.

llvm-svn: 295476
2017-02-17 18:50:15 +00:00
Joel Jones ab0f3b43e3 [AArch64] Add Cavium ThunderX support
This set of patches adds support for Cavium ThunderX ARM64 processors:

  * ThunderX
  * ThunderX T81
  * ThunderX T83
  * ThunderX T88

Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D28891

llvm-svn: 295475
2017-02-17 18:34:24 +00:00
Peter Collingbourne 37317f1207 WholeProgramDevirt: Examine the function body when deciding whether functions are readnone.
The goal is to get an analysis result even for de-refineable functions.

Differential Revision: https://reviews.llvm.org/D29803

llvm-svn: 295472
2017-02-17 18:17:04 +00:00
Simon Pilgrim 99193de8ab [X86][BMI] Add BMI2 stack folding tests
llvm-svn: 295470
2017-02-17 18:00:43 +00:00
Peter Collingbourne 10c500ddc0 opt: Rename -default-data-layout flag to -data-layout and make it always override the layout.
There isn't much point in a flag that only works if the data layout is empty.

Differential Revision: https://reviews.llvm.org/D30014

llvm-svn: 295468
2017-02-17 17:36:52 +00:00
Justin Bogner 073f56dc1a OptDiag: Rename DiagnosticInfoWithDebugLoc to WithLocation. NFC
This generalizes the name in preparation for decoupling the concept
from DebugLoc.

llvm-svn: 295465
2017-02-17 17:34:37 +00:00
Rui Ueyama ac20c17962 MC/COFF: Do not emit forward associative section referenceds.
MSVC link.exe cannot handle associative sections that refer later
sections in the section header. Technically, such COFF object doesn't
violate the Microsoft COFF spec, as the spec doesn't say anything
about that, but still we should avoid doing that to make it compatible
with MS tools.

This patch assigns smaller section numbers to non-associative sections
and larger numbers to associative sections. This should resolve the
compatibility issue.

Differential Revision: https://reviews.llvm.org/D30080

llvm-svn: 295464
2017-02-17 17:32:54 +00:00
Sanjay Patel 7f2e58972c [DAGCombiner] split i1 select-of-constants from non-i1 case; NFCI
I can't find any tests of the non-i1 code path, so it may be unnecessary at this point.

llvm-svn: 295463
2017-02-17 17:13:27 +00:00
Simon Pilgrim 09dde435ab [X86][BMI] Add BMI stack folding tests
llvm-svn: 295462
2017-02-17 17:11:00 +00:00
Sanjay Patel f2a345c8ee [PowerPC] add tests for select-of-constants; NFC
llvm-svn: 295460
2017-02-17 16:43:43 +00:00
Sanjay Patel 9b6cfaa7b1 [ARM] add tests for select-of-constants; NFC
llvm-svn: 295459
2017-02-17 16:34:13 +00:00
Matthew Simpson f68e183f91 [LV] Remove constant restriction for vector phi creation
We previously only created a vector phi node for an induction variable if its
step had a constant integer type. However, the step actually only needs to be
loop-invariant. We only handle inductions having loop-invariant steps, so this
patch should enable vector phi node creation for all integer induction
variables that will be vectorized.

Differential Revision: https://reviews.llvm.org/D29956

llvm-svn: 295456
2017-02-17 16:09:07 +00:00
Simon Pilgrim 0429c0cf8b Fix signed/unsigned comparison warning.
llvm-svn: 295453
2017-02-17 16:01:16 +00:00
Sam Parker 58af0c55d2 [ARM] Replace HasT2ExtractPack with HasDSP
Removed the HasT2ExtractPack feature and replaced its references
with HasDSP. This then allows the Thumb2 extend instructions to be
selected for ARMv8M +dsp. These instruction descriptions have also
been refactored and more target tests have been added for their isel.

Differential Revision: https://reviews.llvm.org/D29623

llvm-svn: 295452
2017-02-17 15:42:44 +00:00
Simon Pilgrim 511d788a95 [DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle masks
During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing).

This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types.

The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712).

Differential Revision: https://reviews.llvm.org/D29454

llvm-svn: 295451
2017-02-17 15:14:48 +00:00
Sanjay Patel 5573042035 [DAGCombiner] improve readability; NFCI
llvm-svn: 295447
2017-02-17 14:21:59 +00:00
Diana Picus e836878bf1 [ARM] GlobalISel: Clean up some helpers
Return invalid opcodes when some of the helpers in the instruction selection
pass can't handle a given combination.

llvm-svn: 295446
2017-02-17 13:44:19 +00:00
Diana Picus 38699dbac5 [ARM] GlobalISel: Check mappings used by reg bank select
Add some asserts to make sure we're using the mappings that we think we're
using. This is to keep us from accidentally breaking functionality while moving
to TableGen'erated mappings.

llvm-svn: 295441
2017-02-17 13:14:25 +00:00
Diana Picus 7cab0786bd [ARM] GlobalISel: Use Subtarget in Legalizer
Start using the Subtarget to make decisions about what's legal. In particular,
we only mark floating point operations as legal if we have VFP2, which is
something we should've done from the very start.

llvm-svn: 295439
2017-02-17 11:25:17 +00:00
Diana Picus d2f3ba71c9 [ARM] GlobalISel: Add end-to-end tests for double
Test some really basic functionality through the whole GlobalISel pipeline.

llvm-svn: 295438
2017-02-17 11:25:11 +00:00
Ismail Donmez c7ff81435d Update Bugzilla URLs in docs
llvm-svn: 295432
2017-02-17 08:26:11 +00:00
Eugene Leviant 958fcd7502 InstCombine: fix extraction when performing vector/array punning
Differential revision: https://reviews.llvm.org/D29491

llvm-svn: 295429
2017-02-17 07:36:03 +00:00
Craig Topper cbd1b60e42 [IR][X86] Simplify some AutoUpgrade code slightly. NFC
llvm-svn: 295426
2017-02-17 07:07:24 +00:00
Craig Topper 905cc75f97 [IR][X86] Rename an AutoUpgrade helper function to more accurately match what intrinsics it handles. NFC
llvm-svn: 295425
2017-02-17 07:07:21 +00:00
Craig Topper b9b9cb0ce6 [IR][X86] Move X86 specific portions of UpgradeIntrinsicFunction1 to a couple helper functions. NFC
This enables some early outs to avoid repeatedly using IsX86 check to qualify. I hope to continue to improve this to shorten the lengths of some of the string comparisons.

llvm-svn: 295424
2017-02-17 07:07:19 +00:00
Andrew Wilkins a256889fb2 Go binding: Add methods for missing PassManagerBuilder C APIs
Patch by Ryuichi Hayashida!

Differential Revision: http://reviews.llvm.org/D30042

llvm-svn: 295420
2017-02-17 05:41:05 +00:00
Sanjoy Das 8b859c26ec [JumpThreading] Re-enable JumpThreading for guards
Summary:
JumpThreading for guards feature has been reverted at https://reviews.llvm.org/rL295200
due to the following problem: the feature used the following algorithm for detection of
diamond patters:

1. Find a block with 2 predecessors;
2. Check that these blocks have a common single parent;
3. Check that the parent's terminator is a branch instruction.

The problem is that these checks are insufficient. They may pass for a non-diamond
construction in case if those two predecessors are actually the same block. This may
happen if parent's terminator is a br (either conditional or unconditional) to a block
that ends with "switch" instruction with exactly two branches going to one block.

This patch re-enables the JumpThreading for guards and fixes this issue by adding the
check that those found predecessors are actually different blocks. This guarantees that
parent's terminator is a conditional branch with exactly 2 different successors, which
is now ensured by assertions. It also adds two more tests for this situation (with parent's
terminator being a conditional and an unconditional branch).

Patch by Max Kazantsev!

Reviewers: anna, sanjoy, reames

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30036

llvm-svn: 295410
2017-02-17 04:21:14 +00:00
Rafael Espindola 6eab4044b9 Revert "[Hexagon] Start using regmasks on calls"
This reverts commit r295371.

It broke windows bots:

http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/11402/steps/test-llvm/logs/stdio

llvm-svn: 295402
2017-02-17 02:08:58 +00:00
Dean Michael Berris 4f83c4d1a6 [XRAY] [x86_64] Adding a Flight Data filetype reader to the llvm-xray Trace implementation.
Summary:
The file type packs function trace data onto disk from potentially multiple
threads that are aggregated and flushed during the course of an instrumented
program's runtime.

It is named FDR mode or Flight Data recorder as an analogy to plane
blackboxes, which instrument a running system without access to IO.

The writer code is defined in compiler-rt in xray_fdr_logging.h/cc

Reviewers: rSerge, kcc, dberris

Reviewed By: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29697

llvm-svn: 295397
2017-02-17 01:47:16 +00:00
Teresa Johnson 95ed51dcfe Move test to X86 subdirectory for bot failures
Second attempt at fixing bot failures from r295384.

llvm-svn: 295395
2017-02-17 01:23:28 +00:00
Chandler Carruth 96d86a7f9c [x86] Give this test a triple so that we don't have to cope with two
different asm comment syntaxes.

llvm-svn: 295394
2017-02-17 01:18:38 +00:00
Chris Bieneman e43bffa718 [CMake] Add variable IOS to iOS toolchain
This is useful for some edge cases where detecting things gets tricky. Specifically LLDB needs this to support iOS because CMake doesn't support running tests using obj-c code.

llvm-svn: 295392
2017-02-17 01:11:41 +00:00
Teresa Johnson 1ede03d5d2 Attempt to fix bot failures by adding -mtriple to llc invocation
Failures on hexagon from test added with r295384, e.g.:
http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/3793

llvm-svn: 295389
2017-02-17 00:52:09 +00:00
Matt Arsenault c18b67745b Bug 31948: Fix assertion when bitcasting constantexpr pointers
llvm-svn: 295387
2017-02-17 00:32:19 +00:00
Chandler Carruth 8960686927 FileCheck-ize some tests in test/CodeGen/X86/
Patch by Jorge Gorbe!

Differential Revision: https://reviews.llvm.org/D29807

llvm-svn: 295386
2017-02-17 00:29:59 +00:00
Teresa Johnson 76b5b7493c Handle link of NoDebug CU with a CU that has debug emission enabled
Summary:
This is an issue both with regular and Thin LTO. When we link together
a DICompileUnit that is marked NoDebug (e.g when compiling with -g0
but applying an AutoFDO profile, which requires location tracking
in the compiler) and a DICompileUnit with debug emission enabled,
we can have failures during dwarf debug generation. Specifically,
when we have inlined from the NoDebug compile unit into the debug
compile unit, we can fail during construction of the abstract and
inlined scope DIEs. This is because the SPMap does not include NoDebug
CUs (they are skipped in the debug_compile_units_iterator).

This patch fixes the failures by skipping locations from NoDebug CUs
when extracting lexical scopes.

Reviewers: dblaikie, aprantl

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D29765

llvm-svn: 295384
2017-02-17 00:21:19 +00:00
Eugene Zelenko deaf695138 [IR] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295383
2017-02-17 00:00:09 +00:00
Zachary Turner 7b327d051b [pdb] Add the ability to resolve TypeServer PDBs.
Some PDBs or object files can contain references to other PDBs
where the real type information lives.  When this happens,
all type indices in the original PDB are meaningless because
their records are not there.

With this patch we add the ability to pull type info from those
secondary PDBs.

Differential Revision: https://reviews.llvm.org/D29973

llvm-svn: 295382
2017-02-16 23:35:45 +00:00
Wei Mi 493fb266ed [LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loops
In rL294814, we allow formula with SCEVAddRecExpr type of Reg from loops
other than current loop. This is good for the case when induction variable
of outerloop being used in expr in innerloop. But it is very bad to allow
such Reg from sibling loop because we may need to add lsr.iv in other sibling
loops when scev expanding those SCEVAddRecExpr type exprs. For the testcase
below, one loop can be inserted with a bunch of lsr.iv because of LSR for
other loops. 

// The induction variable j from a loop in the middle will have initial
// value generated from previous sibling loop and exit value used by its
// next sibling loop.
void goo(long i, long j); 
long cond; 

void foo(long N) { 
long i = 0; 
long j = 0; 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
i = 0; do { goo(i, j); i++; j++; } while (cond); 
} 

The fix is to only allow formula with SCEVAddRecExpr type of Reg from current
loop or its parents.

Differential Revision: https://reviews.llvm.org/D30021

llvm-svn: 295378
2017-02-16 21:27:31 +00:00
David Blaikie fc4857f80b Fix -Wunused-lambda-capture by removing some unused lambda captures
llvm-svn: 295373
2017-02-16 20:55:48 +00:00
Benjamin Kramer 3f6260cab4 [MachinePipeliner] Remove redundant destructor. NFC.
llvm-svn: 295372
2017-02-16 20:26:51 +00:00
Krzysztof Parzyszek fb9503c080 [Hexagon] Start using regmasks on calls
All the cool targets are doing it...

llvm-svn: 295371
2017-02-16 20:25:23 +00:00
Erich Keane c4c31e2020 Change default TimerGroup singleton to use magic statics
TimerGroup was showing up on a leak in valigrind, and 
used some pretty complex code to implement a singleton.
This patch replaces the implementation with a vastly simpler
one.

Differential Revision: https://reviews.llvm.org/D28367

llvm-svn: 295370
2017-02-16 20:19:49 +00:00
Krzysztof Parzyszek cac10f9768 [RDF] Aggregate shadow phi uses into one cluster when propagating live info
llvm-svn: 295366
2017-02-16 19:28:06 +00:00
Simon Pilgrim e5215751ff [X86][SSE] Add PR31309 test case (load-extend i32 to i128).
llvm-svn: 295363
2017-02-16 19:17:36 +00:00
Matt Arsenault b95ddd7cea AMDGPU: Remove llvm.AMDGPU.cube intrinsic
llvm-svn: 295359
2017-02-16 19:09:04 +00:00
Matt Arsenault eb65cda986 AMDGPU: Remove llvm.AMDGPU.rsq intrinsic
llvm-svn: 295358
2017-02-16 19:08:58 +00:00
Hans Wennborg 35905d6a67 Re-apply r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)"
The original commit was reverted in r283329 due to a miscompile in
Chromium. That turned out to be the same issue as PR31257, which was
fixed in r295262.

llvm-svn: 295357
2017-02-16 19:04:42 +00:00
Krzysztof Parzyszek 84cd4ea301 [RDF] Differentiate between defining and clobbering nodes
Defining nodes should not alias with one another, while clobbering
nodes can. When pushing defs on stacks, push clobbers first, link
non-clobbering defs, then push the defs.

The data flow in a statement is now: uses -> clobbers -> defs.

llvm-svn: 295356
2017-02-16 18:53:04 +00:00
David Blaikie b2fbb4b276 Refactor DebugHandlerBase a bit to common non-debug-having-function filtering
llvm-svn: 295354
2017-02-16 18:48:33 +00:00
Matt Arsenault 920576042d InstCombine: Canonicalize fast fmuladd to fmul + fadd
llvm-svn: 295353
2017-02-16 18:46:24 +00:00
Krzysztof Parzyszek 5226ba8daa [RDF] Move normalize(RegisterRef) to PhysicalRegisterInfo
Remove the duplicate from DFG and make some members of PRI private.

llvm-svn: 295351
2017-02-16 18:45:23 +00:00
Andrea Di Biagio 42f7712e23 x86 interrupt calling convention: only save xmm registers if the target supports SSE
The existing code always saves the xmm registers for 64-bit targets even if the
target doesn't support SSE (which is common for kernels). Thus, the compiler
inserts movaps instructions which lead to CPU exceptions when an interrupt
handler is invoked.

This commit fixes this bug by returning a register set without xmm registers
from getCalleeSavedRegs and getCallPreservedMask for such targets.

Patch by Philipp Oppermann.

Differential Revision: https://reviews.llvm.org/D29959

llvm-svn: 295347
2017-02-16 18:25:37 +00:00
Sanjay Patel 8e55b685c2 [x86] add more tests of select of constants; NFC
llvm-svn: 295346
2017-02-16 18:15:16 +00:00
Artur Pilipenko 85d758299e [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Resubmit -r295314 with PowerPC and AMDGPU tests updated.

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 295336
2017-02-16 17:07:27 +00:00
Sjoerd Meijer cb2d950214 [AArch64] AArch64AsmParser clean up of isImmediate functions. NFC
Regression test neon-diagnostics.s needed changing because it now
produces a more specific diagnostic about the immediate ranges. One
change in the expected error message is not obvious, but there multiple
candidate and it happens to pick the immediate diagnostic.

Differential Revision: https://reviews.llvm.org/D29939

llvm-svn: 295331
2017-02-16 15:52:22 +00:00
Dan Gohman 4a5496902c [WebAssembly] Add a cast to void to fix an unused private member warning, for now.
llvm-svn: 295327
2017-02-16 15:21:37 +00:00
Simon Pilgrim 2fe568c95e [X86] Remove local areOnlyUsersOf helper and use SDNode::areOnlyUsersOf instead.
llvm-svn: 295326
2017-02-16 15:11:49 +00:00
Marshall Clow e9110d71dd Remove uses of deprecated std::random_shuffle in the LLVM code base. Reviewed as https://reviews.llvm.org/D29780.
llvm-svn: 295325
2017-02-16 14:37:03 +00:00
Diana Picus 1540b06ef8 [ARM] GlobalISel: Select floating point loads
llvm-svn: 295321
2017-02-16 14:10:50 +00:00
Artur Pilipenko a1b384c4ce Rever -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine"
This change causes some of AMDGPU and PowerPC tests to fail.

llvm-svn: 295316
2017-02-16 13:04:46 +00:00
Artur Pilipenko daaa0c0f7d [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 295314
2017-02-16 12:53:26 +00:00
Diana Picus b1701e0b05 [ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT
Since they're only used for passing around double precision floating point
values into the general purpose registers, we'll lower them to VMOVDRR and
VMOVRRD.

llvm-svn: 295310
2017-02-16 12:19:57 +00:00
Diana Picus 6beef3c087 [ARM] GlobalISel: Select double G_FADD and copies
Just use VADDD if available, bail out if not.

llvm-svn: 295309
2017-02-16 12:19:52 +00:00
Diana Picus 9b32faa821 [ARM] GlobalISel: Assert that we don't use the FPR bank if we don't have VFP
llvm-svn: 295308
2017-02-16 11:25:09 +00:00
Diana Picus a93803b9fe [ARM] GlobalISel: Add reg bank mappings for G_SEQUENCE and G_EXTRACT
Support G_SEQUENCE and G_EXTRACT as needed for passing double precision floating
point values in the soft-fp float mode.

llvm-svn: 295306
2017-02-16 11:00:31 +00:00
Diana Picus 7f82c87022 [ARM] GlobalISel: Make the FPR bank 64-bit wide
Also add mappings for single and double precision FP, and use them for G_FADD
and G_LOAD.

llvm-svn: 295302
2017-02-16 10:12:49 +00:00
Diana Picus 21c3d8e0fc [ARM] GlobalISel: Legalize 64-bit G_FADD and G_LOAD
For now we just mark them as legal all the time and let the other passes bail
out if they can't handle it. In the future, we'll want to move more of the
brains into the legalizer.

llvm-svn: 295300
2017-02-16 09:09:49 +00:00
NAKAMURA Takumi 14246c937d RWMutex.h: Use llvm-config.h instead of config.h in installed headers.
llvm-svn: 295297
2017-02-16 08:22:08 +00:00
Diana Picus ca6a890d7f [ARM] GlobalISel: Lower double precision FP args
For the hard float calling convention, we just use the D registers.

For the soft-fp calling convention, we use the R registers and move values
to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we
make sure to honor the endianness of the target, since the CCAssignFn doesn't do
that for us.

For pure soft float targets, we still bail out because we don't support the
libcalls yet.

llvm-svn: 295295
2017-02-16 07:53:07 +00:00
Craig Topper 3731f4d173 [AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit.
llvm-svn: 295294
2017-02-16 07:35:23 +00:00
Craig Topper 715873ead3 [AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics.
The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO.

llvm-svn: 295290
2017-02-16 06:31:54 +00:00
Rui Ueyama 26ca0bddf0 Split WinCOFFObjectWriter::writeSection.
llvm-svn: 295276
2017-02-16 02:56:06 +00:00
Rui Ueyama af20f10d81 Split WinCOFFObjectWriter::writeObject function.
llvm-svn: 295273
2017-02-16 02:35:48 +00:00
Matt Arsenault d3e5cb77e4 AMDGPU: Remove llvm.SI.sendmsg
llvm-svn: 295270
2017-02-16 02:01:17 +00:00
Matt Arsenault d2c8a337aa AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics
Update test uses with expansion in terms of new intrinsics.

llvm-svn: 295269
2017-02-16 02:01:13 +00:00
Rui Ueyama 1473e5429e Remove useless local variable.
llvm-svn: 295268
2017-02-16 01:41:04 +00:00
Rui Ueyama 6237678b14 Rename variables to match the LLVM style.
llvm-svn: 295265
2017-02-16 01:06:45 +00:00
Hans Wennborg a468601e0e [X86] Re-enable conditional tail calls and fix PR31257.
This reverts r294348, which removed support for conditional tail calls
due to the PR above. It fixes the PR by marking live registers as
implicitly used and defined by the now predicated tailcall. This is
similar to how IfConversion predicates instructions.

Differential Revision: https://reviews.llvm.org/D29856

llvm-svn: 295262
2017-02-16 00:04:05 +00:00
Peter Collingbourne 08eb081ac3 PMB: Add an importing WPD pass to the start of the ThinLTO backend pipeline.
Differential Revision: https://reviews.llvm.org/D30008

llvm-svn: 295260
2017-02-15 23:48:38 +00:00
Teresa Johnson 3963ba3e48 Collapse my two entries in CODE_OWNERS.txt
llvm-svn: 295259
2017-02-15 23:45:21 +00:00
Tim Northover 9136617a3f GlobalISel: legalize va_arg on AArch64.
Uses a Custom implementation because the slot sizes being a multiple of the
pointer size isn't really universal, even for the architectures that do have a
simple "void *" va_list.

llvm-svn: 295255
2017-02-15 23:22:50 +00:00
Tim Northover 4a652227dd GlobalISel: support translating va_arg
Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also
need to attach the required alignment info.

llvm-svn: 295254
2017-02-15 23:22:33 +00:00
Daniel Berlin 3c1432fecf Implement intrinsic mangling for literal struct types.
Fixes PR 31921

Summary:
Predicateinfo requires an ugly workaround to try to avoid literal
struct types due to the intrinsic mangling not being implemented.
This workaround actually does not work in all cases (you can hit the
assert by bootstrapping with -print-predicateinfo), and can't be made
to work without DFS'ing the type (IE copying getMangledStr and using a
version that detects if it would crash).

Rather than do that, i just implemented the mangling.  It seems
simple, since they are unified structurally.

Looking at the overloaded-mangling testcase we have, it actually turns
out the gc intrinsics will *also* crash if you try to use a literal
struct.  Thus, the testcase added fails before this patch, and works
after, without needing to resort to predicateinfo.

Reviewers: chandlerc, davide

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D29925

llvm-svn: 295253
2017-02-15 23:16:20 +00:00
Matt Arsenault 824de226a1 AMDGPU: Remove dead node definitions
llvm-svn: 295247
2017-02-15 22:23:04 +00:00
Matt Arsenault 900b21c350 Fix typos
llvm-svn: 295246
2017-02-15 22:19:06 +00:00
Matt Arsenault a78ca62c64 AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests
llvm-svn: 295244
2017-02-15 22:17:09 +00:00
Eugene Zelenko 454d0cea6a [Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295243
2017-02-15 22:17:02 +00:00
Matt Arsenault 5de8dc9cf5 DAG: Do not scalarize fsub if fneg is legal
Tests will be included with future commit.

llvm-svn: 295242
2017-02-15 22:02:42 +00:00
Peter Collingbourne 50cbd7cc90 Re-apply r295110 and r295144 with a fix for the ASan issue.
llvm-svn: 295241
2017-02-15 21:56:51 +00:00
Matt Arsenault d122abead4 AMDGPU: Replace assert with report_fatal_error
Also use a more refined condition.

llvm-svn: 295239
2017-02-15 21:50:34 +00:00
Keno Fischer 5e1e59180e [GlobalObject] Fix setSection("")
Summary:
In rL291613, the section name was interned in LLVMContext. However,
this broke the ability to remove the section from a GlobalObject,
because it tried to intern empty strings, which is not allowed.
Fix that and add an appropriate regression test.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D29795

llvm-svn: 295238
2017-02-15 21:42:42 +00:00
Sanjay Patel 845ea963aa [InstCombine] improve formatting; NFC
llvm-svn: 295237
2017-02-15 21:31:34 +00:00
Peter Collingbourne 9421c2dc54 AssumptionCache: Disable the verifier by default, move it behind a hidden cl::opt and verify from releaseMemory().
This is a short term solution to the problem that many passes currently fail
to update the assumption cache. In the long term the verifier should not
be controllable with a flag. We should either fix all passes to correctly
update the assumption cache and enable the verifier unconditionally or
somehow arrange for the assumption list to be updated automatically by passes.

Differential Revision: https://reviews.llvm.org/D30003

llvm-svn: 295236
2017-02-15 21:10:09 +00:00
Simon Pilgrim 5b4c30fb32 [X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing.
Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't both calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow.

llvm-svn: 295235
2017-02-15 21:09:00 +00:00
Arnold Schwaighofer 8d61e0030a AddressSanitizer: don't track swifterror memory addresses
They are register promoted by ISel and so it makes no sense to treat them as
memory.

Inserting calls to the thread sanitizer would also generate invalid IR.

You would hit:

"swifterror value can only be loaded and stored from, or as a swifterror
argument!"

llvm-svn: 295230
2017-02-15 20:43:43 +00:00
Ahmed Bougacha f8acf568f1 [AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish.
am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT
operand type.  It made sense for branch targets, as those are
represented as MVT::Other in SDAG.  But loads operate on pointers.

This shouldn't have an observable effect on any in-tree code, but helps
make the patterns consistent for external users.

llvm-svn: 295229
2017-02-15 20:38:31 +00:00
Ahmed Bougacha 360260066e [OptDiag] Pass const Values/Types to Argument. NFC.
llvm-svn: 295228
2017-02-15 20:38:28 +00:00
Ahmed Bougacha f9e5a1dd88 [IR] Accept 'const Type &' in the Type operator<<. NFC.
Type::print is const; there's no reason for the operator not to be.

llvm-svn: 295227
2017-02-15 20:38:22 +00:00
Tobias Edler von Koch f454b9eadf [LTO] Add ability to emit assembly to new LTO API
Summary:
Add a field to LTO::Config, CGFileType, to select the file type to emit (object
or assembly). This is useful for testing and to implement -save-temps.

Reviewers: tejohnson, mehdi_amini, pcc

Reviewed By: mehdi_amini

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D29475

llvm-svn: 295226
2017-02-15 20:36:36 +00:00
Kyle Butt 7fbec9bdf1 Codegen: Make chains from trellis-shaped CFGs
Lay out trellis-shaped CFGs optimally.
A trellis of the shape below:

  A     B
  |\   /|
  | \ / |
  |  X  |
  | / \ |
  |/   \|
  C     D

would be laid out A; B->C ; D by the current layout algorithm. Now we identify
trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an
increasing number of predecessors. A trellis is a a group of 2 or more
predecessor blocks that all have the same successors.

because of this we can tail duplicate to extend existing trellises.

As an example consider the following CFG:

    B   D   F   H
   / \ / \ / \ / \
  A---C---E---G---Ret

Where A,C,E,G are all small (Currently 2 instructions).

The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret.

The current code will copy C into B, E into D and G into F and yield the layout
A,C,B(C),E,D(E),F(G),G,H,ret

define void @straight_test(i32 %tag) {
entry:
  br label %test1
test1: ; A
  %tagbit1 = and i32 %tag, 1
  %tagbit1eq0 = icmp eq i32 %tagbit1, 0
  br i1 %tagbit1eq0, label %test2, label %optional1
optional1: ; B
  call void @a()
  br label %test2
test2: ; C
  %tagbit2 = and i32 %tag, 2
  %tagbit2eq0 = icmp eq i32 %tagbit2, 0
  br i1 %tagbit2eq0, label %test3, label %optional2
optional2: ; D
  call void @b()
  br label %test3
test3: ; E
  %tagbit3 = and i32 %tag, 4
  %tagbit3eq0 = icmp eq i32 %tagbit3, 0
  br i1 %tagbit3eq0, label %test4, label %optional3
optional3: ; F
  call void @c()
  br label %test4
test4: ; G
  %tagbit4 = and i32 %tag, 8
  %tagbit4eq0 = icmp eq i32 %tagbit4, 0
  br i1 %tagbit4eq0, label %exit, label %optional4
optional4: ; H
  call void @d()
  br label %exit
exit:
  ret void
}

here is the layout after D27742:
straight_test:                          # @straight_test
; ... Prologue elided
; BB#0:                                 # %entry ; A (merged with test1)
; ... More prologue elided
	mr 30, 3
	andi. 3, 30, 1
	bc 12, 1, .LBB0_2
; BB#1:                                 # %test2 ; C
	rlwinm. 3, 30, 0, 30, 30
	beq	 0, .LBB0_3
	b .LBB0_4
.LBB0_2:                                # %optional1 ; B (copy of C)
	bl a
	nop
	rlwinm. 3, 30, 0, 30, 30
	bne	 0, .LBB0_4
.LBB0_3:                                # %test3 ; E
	rlwinm. 3, 30, 0, 29, 29
	beq	 0, .LBB0_5
	b .LBB0_6
.LBB0_4:                                # %optional2 ; D (copy of E)
	bl b
	nop
	rlwinm. 3, 30, 0, 29, 29
	bne	 0, .LBB0_6
.LBB0_5:                                # %test4 ; G
	rlwinm. 3, 30, 0, 28, 28
	beq	 0, .LBB0_8
	b .LBB0_7
.LBB0_6:                                # %optional3 ; F (copy of G)
	bl c
	nop
	rlwinm. 3, 30, 0, 28, 28
	beq	 0, .LBB0_8
.LBB0_7:                                # %optional4 ; H
	bl d
	nop
.LBB0_8:                                # %exit ; Ret
	ld 30, 96(1)                    # 8-byte Folded Reload
	addi 1, 1, 112
	ld 0, 16(1)
	mtlr 0
	blr

The tail-duplication has produced some benefit, but it has also produced a
trellis which is not laid out optimally. With this patch, we improve the layouts
of such trellises, and decrease the cost calculation for tail-duplication
accordingly.

This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have
back edges, which is a negative, but it has a bigger compensating
positive, which is that it handles the case where there are long strings
of skipped blocks much better than the original layout. Both layouts
handle runs of executed blocks equally well. Branch prediction also
improves if there is any correlation between subsequent optional blocks.

Here is the resulting concrete layout:

straight_test:                          # @straight_test
; BB#0:                                 # %entry ; A (merged with test1)
	mr 30, 3
	andi. 3, 30, 1
	bc 12, 1, .LBB0_4
; BB#1:                                 # %test2 ; C
	rlwinm. 3, 30, 0, 30, 30
	bne	 0, .LBB0_5
.LBB0_2:                                # %test3 ; E
	rlwinm. 3, 30, 0, 29, 29
	bne	 0, .LBB0_6
.LBB0_3:                                # %test4 ; G
	rlwinm. 3, 30, 0, 28, 28
	bne	 0, .LBB0_7
	b .LBB0_8
.LBB0_4:                                # %optional1 ; B (Copy of C)
	bl a
	nop
	rlwinm. 3, 30, 0, 30, 30
	beq	 0, .LBB0_2
.LBB0_5:                                # %optional2 ; D (Copy of E)
	bl b
	nop
	rlwinm. 3, 30, 0, 29, 29
	beq	 0, .LBB0_3
.LBB0_6:                                # %optional3 ; F (Copy of G)
	bl c
	nop
	rlwinm. 3, 30, 0, 28, 28
	beq	 0, .LBB0_8
.LBB0_7:                                # %optional4 ; H
	bl d
	nop
.LBB0_8:                                # %exit

Differential Revision: https://reviews.llvm.org/D28522

llvm-svn: 295223
2017-02-15 19:49:14 +00:00
Xinliang David Li 538d666814 include function name in dot filename
Differential Revision: http://reviews.llvm.org/D29975

llvm-svn: 295220
2017-02-15 19:21:04 +00:00
Arnold Schwaighofer 8eb1a48540 ThreadSanitizer: don't track swifterror memory addresses
They are register promoted by ISel and so it makes no sense to treat them as
memory.

Inserting calls to the thread sanitizer would also generate invalid IR.

You would hit:

"swifterror value can only be loaded and stored from, or as a swifterror
argument!"

llvm-svn: 295215
2017-02-15 18:57:06 +00:00
Michael Kuperstein ba80db39d7 [DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source
We currently can't legalize those, but we should really not be creating
them in the first place, since legalization would probably look similar to the
way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD.

This fixes PR311956.

Differential Revision: https://reviews.llvm.org/D29961

llvm-svn: 295213
2017-02-15 18:37:26 +00:00
Dehao Chen 726da628e8 Expose getBaseDiscriminatorFromDiscriminator, getDuplicationFactorFromDiscriminator and getCopyIdentifierFromDiscriminator API so that downstream tools can use them to get the correct encoding.
Summary: Discriminators are now encoded with rich information. This patch exposes the encoding API to downstream tools.

Reviewers: davidxl, hfinkel

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29852

llvm-svn: 295210
2017-02-15 17:54:39 +00:00
Sanjay Patel 056218644b [Inline] add tests to show attribute information loss; NFC
llvm-svn: 295209
2017-02-15 17:42:58 +00:00
Simon Pilgrim da25d5c7b6 [X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining
Only do this for integer types currently - floats types (in particular insertps) load folding often fails with this.

llvm-svn: 295208
2017-02-15 17:41:33 +00:00
Stanislav Mekhanoshin 582a5237f9 [AMDGPU] Revert failed scheduling
This patch reverts region's scheduling to the original untouched state
in case if we have have decreased occupancy.

In addition it switches to use TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits which were
just passed by. We are going to stay with the best schedule so we do
not need to tolerate worsened scheduling anymore.

Differential Revision: https://reviews.llvm.org/D29971

llvm-svn: 295206
2017-02-15 17:19:50 +00:00
Anna Thomas 94c8d4976c Revert "[JumpThreading] Thread through guards"
This reverts commit r294617.

We fail on an assert while trying to get a condition from an
unconditional branch.

llvm-svn: 295200
2017-02-15 17:08:29 +00:00
Simon Pilgrim d811bdd61a [X86] Regenerate scalar stack reload test
llvm-svn: 295195
2017-02-15 16:48:45 +00:00
David Bozier 4b21d022b7 Fix unittest for buildbot with mips host (32bit big endian) from r295174
llvm-svn: 295188
2017-02-15 16:03:22 +00:00
Sanjay Patel 288f075f8e [InlineFunction] use getFunction(); NFC
llvm-svn: 295185
2017-02-15 15:22:18 +00:00
Simon Pilgrim 1746e2152c Fix spelling mistake - paramater -> parameter. NFCI.
llvm-svn: 295182
2017-02-15 15:11:36 +00:00
Sanjay Patel 32d753cae3 [InlineFunction] use getCaller(); NFCI
llvm-svn: 295181
2017-02-15 15:08:38 +00:00
Sanjay Patel ada717e25b [InlineFunction] use range-for loop; NFCI
llvm-svn: 295179
2017-02-15 14:56:11 +00:00
Simon Pilgrim a0e56d2d68 [X86] Regenerate i64 ext-load on 32-bit target tests
llvm-svn: 295177
2017-02-15 14:06:17 +00:00
David Bozier 5c8e5f3722 Attempt to fix buildbots after commit of r295173.
Unit tests needed to check on the endianness of the host platform. (Test was failing for big endian hosts).

llvm-svn: 295174
2017-02-15 13:40:05 +00:00
David Bozier 4ab9a06f6a Fix incorrect formatting of DataRefImpl members in operator<< function
Changed format specifiers to use format macro constant for pointer type. 
Moved width part of format specifier in the correct place for formatting members a and b.

Added a unit test to confirm the output.

Differential Revision: https://reviews.llvm.org/D28957

llvm-svn: 295173
2017-02-15 12:58:41 +00:00
Simon Pilgrim 0f0e5bd3c6 [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs
Add support for specifying an UNPCK input as ZERO, particularly improves ZEXT cases with non-zero offsets

llvm-svn: 295169
2017-02-15 11:46:15 +00:00
Sagar Thakur ec65792910 [LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el
Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit.

Reviewed by sdardis, dberris
Differential: D27697

llvm-svn: 295164
2017-02-15 10:48:11 +00:00
Daniel Jasper eef9b03395 Revert r295110 and r295144.
This fails under ASAN:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/798/steps/check-llvm%20asan/logs/stdio

llvm-svn: 295162
2017-02-15 09:56:08 +00:00
Ayman Musa b8a4f255dd [X86][AVX] Remove REX_W from AVX instructions.
There is no meaning for REX_W in VEX encoded AVX instruction.

Differential Revision: https://reviews.llvm.org/D29894

llvm-svn: 295157
2017-02-15 08:12:16 +00:00
Craig Topper fbc7805e25 [X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types
Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.

As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.

I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.

This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused.

Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases were we don't remove a useless insert into element 1 before broadcasting element 0.

Reviewers: delena, RKSimon, zvi

Reviewed By: zvi

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D28747

llvm-svn: 295155
2017-02-15 06:58:47 +00:00
Craig Topper ec5df5f4aa [AVX-512] Add PACKSS/PACKUS instructions to load folding tables.
llvm-svn: 295154
2017-02-15 06:51:39 +00:00
Craig Topper 96ec7a23e3 [SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask
Summary:
The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in a extract and creates a properly aligned start index for the extract.

This range finding is unnecessary, we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each indice we can't use an extract.

Reviewers: zvi, RKSimon

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29926

llvm-svn: 295152
2017-02-15 05:57:16 +00:00