Commit Graph

107839 Commits

Author SHA1 Message Date
Anna Thomas 729dafc16b Strip off invariant.start because memory locations arent invariant
The original change was reverted in rL317217 because of the failure in
the RS4GC testcase. I couldn't reproduce the failure on my local machine
(macbook) but could reproduce it on a linux box.

The failure was around removing the uses of invariant.start. The fix
here is to just RAUW undef (which was the first implementation in D39388).
This is perfectly valid IR as discussed in the review.

llvm-svn: 317225
2017-11-02 18:24:04 +00:00
Anna Thomas ebe429d99f Revert "[RS4GC] Strip off invariant.start because memory locations arent invariant"
This reverts commit r317215, investigating the test failure.

llvm-svn: 317217
2017-11-02 16:45:51 +00:00
Anna Thomas 486a7aaa31 [RS4GC] Strip off invariant.start because memory locations arent invariant
Summary:
Invariant.start on memory locations has the property that the memory
location is unchanging. However, this is not true in the face of
rewriting statepoints for GC.
Teach RS4GC about removing invariant.start so that optimizations after
RS4GC does not incorrect sink a load from the memory location past a
statepoint.

Added test showcasing the issue.

Reviewers: reames, apilipenko, dneilson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39388

llvm-svn: 317215
2017-11-02 16:23:31 +00:00
Clement Courbet 82bade615b Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet 1dc37b9c3b [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00
Ayman Musa a37d1130d7 [X86] Fix bug in legalize vector types - Split large loads
When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again.
The code currently pads the last load to reach the size of the first load (instead of the previous).

Differential Revision: https://reviews.llvm.org/D38495

Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f
llvm-svn: 317206
2017-11-02 13:07:06 +00:00
Simon Dardis 725acb2d91 [mips] Use register scavenging with MSA.
MSA stores and loads to the stack are more likely to require an
emergency GPR spill slot due to the smaller offsets available
with those instructions.

Handle this by overestimating the size of the stack by determining
the largest offset presuming that all callee save registers are
spilled and accounting of incoming arguments when determining
whether an emergency spill slot is required.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D39056

llvm-svn: 317204
2017-11-02 12:47:22 +00:00
Sam McCall 0e142499a9 Temporary workaround for msan false positive.
llvm-svn: 317203
2017-11-02 12:29:47 +00:00
Yichao Yu 6fefc0d65e Allow inaccessiblememonly and inaccessiblemem_or_argmemonly to be overwriten on call site with operand bundle
Summary:
Similar to argmemonly, readonly and readnone.

Fix PR35128

Reviewers: andrew.w.kaylor, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D39434

llvm-svn: 317201
2017-11-02 12:18:33 +00:00
Francis Visoiu Mistrih 66d2c269dc [AsmPrinterDwarf] Add support for .cfi_restore directive
As of today we only use .cfi_offset to specify the offset of a CSR, but
we never use .cfi_restore when the CSR is restored.

If we want to perform a more advanced type of shrink-wrapping, we need
to use .cfi_restore in order to switch the CFI state between blocks.

This patch only aims at adding support for the directive.

Differential Revision: https://reviews.llvm.org/D36114

llvm-svn: 317199
2017-11-02 12:00:58 +00:00
Bjorn Pettersson e73b85d1ab [SimplifyCFG] Discard speculated dbg intrinsics
Summary:
SpeculativelyExecuteBB can flatten the CFG by doing
speculative execution followed by a select instruction.
When the speculatively executed BB contained dbg intrinsics
the result could be a little bit weird, since those dbg
intrinsics were inserted before the select in the flattened
CFG. So when single stepping in the debugger, printing the
value of the variable referenced in the dbg intrinsic, it
could happen that it looked like the variable had values
that never actually were assigned to the variable.

This patch simply discards all dbg intrinsics that were found
in the speculatively executed BB.

Reviewers: aprantl, chandlerc, craig.topper

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39494

llvm-svn: 317198
2017-11-02 11:55:14 +00:00
Sam Parker 242052c6b4 [ARM] and, or, xor and add with shl combine
The generic dag combiner will fold:

(shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
(shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)

This can create constants which are too large to use as an immediate.
Many ALU operations are also able of performing the shl, so we can
unfold the transformation to prevent a mov imm instruction from being
generated.

Other patterns, such as b + ((a << 1) | 510), can also be simplified
in the same manner.

Differential Revision: https://reviews.llvm.org/D38084

llvm-svn: 317197
2017-11-02 10:43:10 +00:00
Andrew V. Tischenko 3c8bf5ec37 The patch updates sched numbers for YMM AVX instrs such as VMOVx, VORx, VXOR, VPERMILx, VBROADCASTx, etc.
PR32857 should be closed.
Differential Revision: https://reviews.llvm.org/D39227

llvm-svn: 317196
2017-11-02 10:33:41 +00:00
Craig Topper 3bbe24c3ca [X86] Remove the model checks from the 486 detection code in Host.cpp
This just provided a bunch of comments to read and not much else.

llvm-svn: 317185
2017-11-02 03:32:50 +00:00
Craig Topper 094d7914ae [X86] Simplify the detection of pentium-mmx in Host.cpp.
Rather than looking at model numbers just check for the mmx feature flag. While there promote INTEL_PENTIUM_MMX to a CPU type instead of a subtype so that we don't have weird type with only one subtype.

llvm-svn: 317184
2017-11-02 03:32:49 +00:00
Jake Ehrlich 03aeeb09c5 [yaml2obj][ELF] Add support for setting alignment in program headers
Sometimes program headers have larger alignments than any of the
sections they contain. Currently yaml2obj can't produce such files. A
bug recently appeared in llvm-objcopy that failed in such a case. I'd
like to be able to add tests to llvm-objcopy for such cases.

This change adds an optional alignment parameter to program headers that
will be used instead of calculating the alignment.

Differential Revision: https://reviews.llvm.org/D39130

llvm-svn: 317139
2017-11-01 23:14:48 +00:00
Adrian Prantl bfa77c4c85 loop-unroll: teach remapInstruction to update dbg.value intrinsics.
Fixes PR35112.

https://bugs.llvm.org/show_bug.cgi?id=35112

llvm-svn: 317138
2017-11-01 23:12:35 +00:00
Petar Jovanovic bb5c84fb57 Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).

llvm-svn: 317136
2017-11-01 23:05:52 +00:00
whitequark 789164d426 [LLVM-C] Expose functions to create debug locations via DIBuilder.
These include:
  * Several functions for creating an LLVMDIBuilder,
  * LLVMDIBuilderCreateCompileUnit,
  * LLVMDIBuilderCreateFile,
  * LLVMDIBuilderCreateDebugLocation.

Patch by Harlan Haskins.

Differential Revision: https://reviews.llvm.org/D32368

llvm-svn: 317135
2017-11-01 22:18:52 +00:00
Craig Topper 3837322a6b [X86] Use foreach in X86.td to combine some of the CPU names that are obviously aliases. NFC
llvm-svn: 317134
2017-11-01 22:15:49 +00:00
Craig Topper 7a754c4622 [X86] Add CMOV feature to 'i686' processor, making it a proper alias of pentiumpro which I believe it should be.
This is consistent with current gcc behavior.

llvm-svn: 317133
2017-11-01 22:15:40 +00:00
Daniel Sanders 466fe399b8 [globalisel][regbank] Warn about MIR ambiguities when register bank/class names clash.
llvm-svn: 317132
2017-11-01 22:13:05 +00:00
Simon Pilgrim e152c2c447 [X86][SSE] Add PACKUS support to LowerTruncate
Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value.

We have to account for pre-SSE41 targets not supporting PACKUSDW

llvm-svn: 317128
2017-11-01 21:52:29 +00:00
Rui Ueyama a16fe65b72 Rewrite FileOutputBuffer as two separate classes.
This patch is to rewrite FileOutputBuffer as two separate classes;
one for file-backed output buffer and the other for memory-backed
output buffer. I think the new code is easier to follow because two
different implementations are now actually separated as different
classes.

Unlike the previous implementation, the class that does not replace the
final output file using rename(2) does not create a temporary file at
all. Instead, it allocates memory using mmap(2) and use it. I think
this is an improvement because it is now guaranteed that the temporary
memory region doesn't trigger any I/O and there's now zero chance to
leave a temporary file behind. Also, it shouldn't impose new restrictions
because were using mmap IO too.

Differential Revision: https://reviews.llvm.org/D39449

llvm-svn: 317127
2017-11-01 21:38:14 +00:00
Craig Topper 4e56ba271e [X86] Add custom code to EVEX to VEX pass to turn unmasked 128-bit VPALIGND/Q into VPALIGNR if the extended registers aren't being used.
This will enable us to prefer VALIGND/Q during shuffle lowering in order to get the extended register encoding space when BWI isn't available. But if we end up not using the extended registers we can switch VPALIGNR for the shorter VEX encoding.

Differential Revision: https://reviews.llvm.org/D39401

llvm-svn: 317122
2017-11-01 21:00:59 +00:00
Adrian Prantl 98c6549e4a loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.
This fixes the second half of PR35113.

This reapplies r317106 without modifications.

llvm-svn: 317121
2017-11-01 20:53:22 +00:00
Adrian Prantl d60f34c20a loop-rotate: eliminate duplicate debug intrinsics after splicing.
Fixes part of PR35113.

This reapplies r317105 with an additional check for isa<Instruction>
as found by the bots.

llvm-svn: 317120
2017-11-01 20:43:30 +00:00
Dehao Chen c6c051f2ea Include GUIDs from the same module when computing GUIDs that needs to be imported.
Summary: In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to buidl the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D39480

llvm-svn: 317118
2017-11-01 20:26:47 +00:00
Philip Reames 7b861f08cd Revert 317016 and 317048
The former appears to have introduced a miscompile in a stage2 clang build.  Revert so I can investigate offline.

llvm-svn: 317116
2017-11-01 19:49:20 +00:00
Konstantin Zhuravlyov 435151ad75 AMDGPU: Fix set but not used warnings related to AMDGPUAS
Differential Revision: https://reviews.llvm.org/D39499

llvm-svn: 317114
2017-11-01 19:12:38 +00:00
Craig Topper ca1aa83cbe [X86] Prevent fast isel from folding loads into the instructions listed in hasPartialRegUpdate.
This patch moves the check for opt size and hasPartialRegUpdate into the lower level implementation of foldMemoryOperandImpl to catch the entry point that fast isel uses.

We're still folding undef register instructions in AVX that we should also probably disable, but that's a problem for another patch.

Unfortunately, this requires reordering a bunch of functions which is why the diff is so large. I can do the function reordering separately if we want.

Differential Revision: https://reviews.llvm.org/D39402

llvm-svn: 317112
2017-11-01 18:10:06 +00:00
Graham Yiu 671526148c Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm.
Differential Revision: https://reviews.llvm.org/D34160

llvm-svn: 317111
2017-11-01 18:06:56 +00:00
Adrian Prantl c8516346e4 Revert r317105 to investigate bot breakage.
llvm-svn: 317110
2017-11-01 18:06:38 +00:00
Adrian Prantl 40a0ea5f29 Revert r317106 to facilitate reverting r317105.
llvm-svn: 317109
2017-11-01 18:06:35 +00:00
Peter Collingbourne 9fb6e1a037 LTO: Apply global DCE to ThinLTO modules at LTO opt level 0.
This is necessary because DCE is applied to full LTO modules. Without
this change, a reference from a dead ThinLTO global to a dead full
LTO global will result in an undefined reference at link time.

This problem is only observable when --gc-sections is disabled, or
when targeting COFF, as the COFF port of lld requires all symbols to
have a definition even if all references are dead (this is consistent
with link.exe).

This change also adds an EliminateAvailableExternally pass at -O0. This
is necessary to handle the situation on Windows where a non-prevailing
copy of a linkonce_odr function has an SEH filter function; any
such filters must be DCE'd because they will contain a call to the
llvm.localrecover intrinsic, passing as an argument the address of the
function that the filter belongs to, and llvm.localrecover requires
this function to be defined locally.

Fixes PR35142.

Differential Revision: https://reviews.llvm.org/D39484

llvm-svn: 317108
2017-11-01 17:58:39 +00:00
Adrian Prantl 9259f21604 loop-rotate: avoid duplicating dbg.value intrinsics in the entry block.
This fixes the second half of PR35113.

llvm-svn: 317106
2017-11-01 17:28:50 +00:00
Adrian Prantl b627acd0ce loop-rotate: eliminate duplicate debug intrinsics after splicing.
Fixes part of PR35113.

llvm-svn: 317105
2017-11-01 17:28:47 +00:00
Craig Topper 5ae677e102 [X86] Add 64-bit int to float/double conversion with AVX to X86FastISel::X86SelectSIToFP
Summary:
[X86] Teach fast isel to handle i64 sitofp with AVX.

For some reason we only handled i32 sitofp with AVX. But with SSE only we support i64 so we should do the same with AVX.

Also add i686 command lines for the 32-bit tests. 64-bit tests are in a separate file to avoid a fast-isel abort failure in 32-bit mode.

Reviewers: RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39450

llvm-svn: 317102
2017-11-01 16:23:06 +00:00
Andrew V. Tischenko 3d971e39f8 Update VCVTx, VMOVNTPx and VROUNDYPx instructions scheduling on btver2.
Differential Revision: https://reviews.llvm.org/D39059

llvm-svn: 317101
2017-11-01 16:10:20 +00:00
Petar Jovanovic f2faee92aa Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844

llvm-svn: 317100
2017-11-01 16:04:11 +00:00
Simon Pilgrim 778810eb42 [X86][SSE] Begun generalizing truncateVectorWithPACKSS to work with PACKSS/PACKUS functions
Renamed to truncateVectorWithPACK

llvm-svn: 317098
2017-11-01 15:31:51 +00:00
Geoff Berry eed6531ea2 [BranchProbabilityInfo] Handle irreducible loops.
Summary:
Compute the strongly connected components of the CFG and fall back to
use these for blocks that are in loops that are not detected by
LoopInfo when computing loop back-edge and exit branch probabilities.

Reviewers: dexonsmith, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39385

llvm-svn: 317094
2017-11-01 15:16:50 +00:00
Roger Ferrer Ibanez 9dfbc10522 Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY"
That change causes PR35103, so reverting until I figure it out.

llvm-svn: 317092
2017-11-01 14:06:57 +00:00
NAKAMURA Takumi 1657f2ad99 Fix warnings discovered by rL317076. [-Wunused-private-field]
llvm-svn: 317091
2017-11-01 13:47:55 +00:00
NAKAMURA Takumi f7d7a59b9e Suppress a warning discovered by rL317076. [-Wunused-private-field]
llvm-svn: 317090
2017-11-01 13:47:51 +00:00
Max Kazantsev 6f5229d7da Revert rL311205 "[IRCE] Fix buggy behavior in Clamp"
This patch reverts rL311205 that was initially a wrong fix. The real problem
was in intersection of signed and unsigned ranges (see rL316552), and the
patch being reverted masked the problem instead of fixing it.

By now, the test against which rL311205 was made works OK even without this
code. This revert patch also contains a test case that demonstrates incorrect
behavior caused by rL311205: it is caused by incorrect choise of signed max
instead of unsigned.

llvm-svn: 317088
2017-11-01 13:21:56 +00:00
Simon Pilgrim 687982c181 [SelectionDAG] computeKnownBits - use ashrInPlace on known bits of ISD::SRA input. NFCI.
llvm-svn: 317087
2017-11-01 13:16:48 +00:00
Simon Pilgrim f657ba0cb6 [X86][SSE] Truncate with PACKSS any input with sufficient sign-bits
So far we've only been using PACKSS truncations with 'all-bits or zero-bits' patterns (vector comparison results etc.). When really we can safely use it for any case as long as the number of sign bits reach down to the last 16-bits (or 8-bits if we're truncating to bytes).

The next steps after this is add the equivalent support for PACKUS and to support packing to sub-128 bit vectors for truncating stores etc.

Differential Revision: https://reviews.llvm.org/D39476

llvm-svn: 317086
2017-11-01 11:47:44 +00:00
Florian Hahn b93c06331e [CodeExtractor] Fix iterator invalidation in findOrCreateBlockForHoisting.
Summary:
By replacing branches to CommonExitBlock, we remove the node from
CommonExitBlock's predecessors, invalidating the iterator. The problem
is exposed when the common exit block has multiple predecessors and
needs to sink lifetime info. The modification in the test case trigger
the issue.

Reviewers: davidxl, davide, wmi

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39112

llvm-svn: 317084
2017-11-01 09:48:12 +00:00
Serguei Katkov f2c2851efe Fix APFloat mod sign
fmod specification requires the sign of the remainder is
the same as numerator in case remainder is zero.

Reviewers: gottesmm, scanon, arsenm, davide, craig.topper
Reviewed By: scanon
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D39225

llvm-svn: 317081
2017-11-01 07:56:55 +00:00
Craig Topper 688f0ca6a7 [X86] Add more type qualifiers to INSERT_SUBREG operations in rotate patterns so they don't get created with a v64i8 type.
Not sure why tablegen didn't error on this.

Fixes PR35158.

llvm-svn: 317079
2017-11-01 07:11:32 +00:00
Craig Topper c51aac675d [DAGCombiner] Fix typos in comments. NFC
llvm-svn: 317072
2017-11-01 03:30:52 +00:00
Craig Topper a827f84dcc [X86] Add AVX512 support to X86FastISel::fastMaterializeFloatZero.
llvm-svn: 317059
2017-11-01 00:47:45 +00:00
Benjamin Kramer f9ab3ddb8f [AMDGPU] Clean up symbols in the global namespace.
llvm-svn: 317051
2017-10-31 23:21:30 +00:00
Philip Reames 357cd3289e [SimplifyIndVar] Inline makIVComparisonInvariant to eleminate code duplication [NFC]
This formulation might be slightly slower since I eagerly compute the cheap replacements.  If anyone sees this having a compile time impact, let me know and I'll use lazy population instead.

llvm-svn: 317048
2017-10-31 22:56:16 +00:00
Peter Collingbourne aedb4bf37f Object: Move some code from ELF.h into ELF.cpp.
Differential Revision: https://reviews.llvm.org/D39271

llvm-svn: 317046
2017-10-31 22:49:23 +00:00
Reid Kleckner bc6f52da82 [codeview] Merge file checksum entries for DIFiles with the same absolute path
Change the map key from DIFile* to the absolute path string. Computing
the absolute path isn't expensive because we already have a map that
caches the full path keyed on DIFile*.

llvm-svn: 317041
2017-10-31 21:52:15 +00:00
Marek Olsak 5914ece6aa AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset
Summary:
Apps that benefit:
- alien isolation
- bioshock infinite
- civilization: beyond earth
- company of heroes 2
- dirt showdown
- dota 2
- F1 2015
- grid autosport
- hitman
- legend of grimrock
- serious sam 3: bfe
- shadow warrior
- talos principle
- total war: warhammer
- UE4 demos: effects cave, elemental, sun temple

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38914

llvm-svn: 317038
2017-10-31 21:06:42 +00:00
Adrian Prantl deb437b038 loop-rotate: simplify code by using llvm::findDbgValues(). (NFC)
llvm-svn: 317037
2017-10-31 21:03:22 +00:00
Benjamin Kramer 0fad6dd3c4 Revert "[DWARF] Now that Optional is standard layout, put it into an union instead of splatting it."
GCC doesn't like it. This reverts commit r317028.

llvm-svn: 317030
2017-10-31 19:55:08 +00:00
Benjamin Kramer 8732bbec1e [DWARF] Now that Optional is standard layout, put it into an union instead of splatting it.
No functionality change intended.

llvm-svn: 317028
2017-10-31 19:40:03 +00:00
Benjamin Kramer 992fc4ea2d [coro] Make Spill a proper struct instead of deriving from pair.
No functionality change.

llvm-svn: 317027
2017-10-31 19:22:55 +00:00
Craig Topper 7c7fcabd3f [SimplifyCFG] Use a more generic name for the selects created by SpeculativelyExecuteBB to prevent long names from being created
Currently the selects are created with the names of their inputs concatenated together. It's possible to get cases that chain these selects together resulting in long names due to multiple levels of concatenation. Our internal branch of llvm managed to generate names over 100000 characters in length on a particular test due to an extreme compounding of the names.

This patch changes the name to a generic name that is not dependent on its inputs.

Differential Revision: https://reviews.llvm.org/D39440

llvm-svn: 317024
2017-10-31 19:03:51 +00:00
Philip Reames dc417a9819 [IndVarSimplify] Extract wrapper around SE-.isLoopInvariantPredicate [NFC]
This an intermediate state, the next patch will re-inline the markLoopInvariantPredicate function to reduce code duplication.

llvm-svn: 317016
2017-10-31 18:04:57 +00:00
Rui Ueyama 412b29e4ed [Support] Make the default chunk size of raw_fd_ostream to 1 GiB.
Previously, we call write(2) for each 32767 byte chunk. That is not
efficient because Linux can handle much larger write requests.
This patch changes the chunk size on Linux to 1 GiB.

This patch also changes the default chunks size to SSIZE_MAX. I think
that doesn't in practice change this function's behavior on any operating
system because SSIZE_MAX on 64-bit machine is unrealistically large,
and writing 2 GiB (SSIZE_MAX on 32-bit) on a 32-bit machine by a single
call of write(2) is also unrealistic, as the userspace is usually
limited to 2 GiB. That said, it is in general a good thing to do because
a write larger than SSIZE_MAX is implementation-defined in POSIX.

Differential Revision: https://reviews.llvm.org/D39444

llvm-svn: 317015
2017-10-31 17:37:20 +00:00
Philip Reames cd0a5bb96c [IndVarSimplify] Simplify code using a dictionary
Possibly very slightly slower, but this code is not performance critical and the readability benefit alone is huge.

llvm-svn: 317012
2017-10-31 17:06:32 +00:00
Reid Kleckner 39970069b1 [X86][AsmParser] Treat '%' as the modulo operator under Intel syntax
It can't be a register prefix, anyway. This is consistent with the masm
docs on MSDN: https://msdn.microsoft.com/en-us/library/t4ax90d2.aspx

This is a straight-forward extension of our support for "MOD"
implemented in https://reviews.llvm.org/D33876 / r306425

llvm-svn: 317011
2017-10-31 16:47:38 +00:00
Nico Weber 05c988473f LTOModule::isBitcodeFile() shouldn't assert when returning false.
Fixes a bunch of assert-on-invalid-bitcode regressions after 315483.
Expected<> calls assertIsChecked() in its dtor, and operator bool() only calls
setChecked() if there's no error. So for functions that don't return an error
itself, the Expected<> version needs explicit code to disarm the error that the
ErrorOr<> code didn't need.

https://reviews.llvm.org/D39437

llvm-svn: 317010
2017-10-31 16:39:47 +00:00
Reid Kleckner c212cc88e2 [asan] Upgrade private linkage globals to internal linkage on COFF
COFF comdats require symbol table entries, which means the comdat leader
cannot have private linkage.

llvm-svn: 317009
2017-10-31 16:16:08 +00:00
Simon Pilgrim f3c33ca83e [X86][SSE] Add VSRLI/VSRAI/VSLLI demanded elts support to computeKnownBits/ComputeNumSignBits
Mainly a perf improvements as most combines will have occurred before we lower to these instructions

llvm-svn: 317005
2017-10-31 16:06:21 +00:00
Benjamin Kramer 3f3d5be759 [LoopVectorize] Replace manual VPlan memory management with unique_ptr.
No functionality change intended.

llvm-svn: 317003
2017-10-31 14:58:22 +00:00
Matthew Simpson b6915fbfa2 [InstCombine] Simplify selects that test cmpxchg instructions
If a select instruction tests the returned flag of a cmpxchg instruction and
selects between the returned value of the cmpxchg instruction and its compare
operand, the result of the select will always be equal to its false value.

Differential Revision: https://reviews.llvm.org/D39383

llvm-svn: 316994
2017-10-31 12:34:02 +00:00
David Green 64f53b4214 [LoopUnroll] Clean up remarks for unroll remainder
The optimisation remarks for loop unrolling with an unrolled remainder looks something like:

test.c:7:18: remark: completely unrolled loop with 3 iterations [-Rpass=loop-unroll]
            C[i] += A[i*N+j];
                 ^
test.c:6:9: remark: unrolled loop by a factor of 4 with run-time trip count [-Rpass=loop-unroll]
        for(int j = 0; j < N; j++)
        ^
This removes the first of the two messages.

Differential revision: https://reviews.llvm.org/D38725

llvm-svn: 316986
2017-10-31 10:47:46 +00:00
Michael Zuckerman 9e58831cb8 [AVX512] Adding new patterns for extract_subvector of vXi1
extract subvector of vXi1 from vYi1 is poorly supported by LLVM and most of the time end with an assertion.
This patch fixes this issue by adding new patterns to the TD file.

Reviewers:
1. guyblank
2. igorb
3. zvi
4. ayman
5. craig.topper

Differential Revision: https://reviews.llvm.org/D39292

Change-Id: Ideb4d7e946c8d40cfce2920891f2d89fe64c58f8
llvm-svn: 316981
2017-10-31 10:00:19 +00:00
Serguei Katkov f66a59ee88 [CGP] Fix the detection of trivial case for addressing mode
The address can be presented as a bitcast of baseReg.
In this case it is still trivial but OriginalValue != baseReg.

llvm-svn: 316980
2017-10-31 07:01:35 +00:00
Max Kazantsev 84286ce5dd [IRCE][NFC] Rename fields of InductiveRangeCheck
Rename `Offset`, `Scale`, `Length` into `Begin`, `Step`, `End` respectively
to make naming of similar entities for Ranges and Range Checks more
consistent.

Differential Revision: https://reviews.llvm.org/D39414

llvm-svn: 316979
2017-10-31 06:19:05 +00:00
Craig Topper beed653135 [X86] Make AVX512_512_SET0 XMM16-31 lower to 128-bit XOR when AVX512VL is enabled. Use 128-bit VLX instruction when VLX is enabled.
Unfortunately, this weakens our ability to do domain fixing when AVX512DQ is not enabled, but it is consistent with our 256-bit behavior.

Maybe we should add custom handling to domain fixing to allow EVEX integer XOR/AND/OR/ANDN to switch to VEX encoded fp instructions if the high registers aren't being used?

llvm-svn: 316978
2017-10-31 06:01:04 +00:00
Max Kazantsev 21e7b53490 [NFC] Get rid of variables used in assert only
llvm-svn: 316977
2017-10-31 05:33:58 +00:00
Philip Reames 59bf1e0548 [IndVarSimplify] Simplify code using preheader assumption
As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly, it just assumed SCEV didn't analyze such loops.  Given SCEV has comments to the contrary, that seems a bit suspect.  More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available.  Remove the excessive generaility and shorten the code greatly.

Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops.  See the added test case.

llvm-svn: 316976
2017-10-31 05:16:46 +00:00
Max Kazantsev 488ec975bb Reapply "[GVN] Prevent LoadPRE from hoisting across instructions that don't pass control flow to successors"
This patch fixes the miscompile that happens when PRE hoists loads across guards and
other instructions that don't always pass control flow to their successors. PRE is now prohibited
to hoist across such instructions because there is no guarantee that the load standing after such
instruction is still valid before such instruction. For example, a load from under a guard may be
invalid before the guard in the following case:
  int array[LEN];
  ...
  guard(0 <= index && index < LEN);
  use(array[index]);

Differential Revision: https://reviews.llvm.org/D37460

llvm-svn: 316975
2017-10-31 05:07:56 +00:00
Philip Reames 39a8dbff87 [SimplifyIndVar] Extract out invariant expression handling
Previously, the code returned early from the *function* when it couldn't find a free expansion, it should be returning from the *transform*.  I don't have a test case, noticed this via inspection.

As a follow up, I'm going to revisit the logic in the extract function.  I think that essentially the whole helper routine can be replaced with SCEVExpander, but I wanted to do that in a series of separate commits.

llvm-svn: 316974
2017-10-31 04:19:06 +00:00
Craig Topper 668b1ab6f1 [X86] Clang-format some code. NFC
llvm-svn: 316973
2017-10-31 02:34:29 +00:00
Philip Reames 5552f503d5 Undo accidental commit
These files shouldn't have been submitted in 316967

llvm-svn: 316968
2017-10-31 00:04:09 +00:00
Philip Reames 9c3cbeea39 [CGP] Fix crash on i96 bit multiply
Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725

If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area.  There's a bunch of obviously wrong code in the same function.  I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like.

llvm-svn: 316967
2017-10-30 23:59:51 +00:00
Simon Pilgrim 9cd7abbcff Fix unused variable warnings. NFCI.
llvm-svn: 316964
2017-10-30 22:38:07 +00:00
Simon Pilgrim 80b371361c [SelectionDAG] Tidyup computeKnownBits extension/truncation cases. NFCI.
We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function.

llvm-svn: 316962
2017-10-30 22:23:57 +00:00
Javed Absar d13d419d4a [AArch64]: range loopify frame-lowering
llvm-svn: 316960
2017-10-30 22:00:06 +00:00
Yaxun Liu d23f23d81c InferAddressSpaces: Fix bug about replacing addrspacecast
InferAddressSpaces assumes the pointee type of addrspacecast
is the same as the operand, which is not always true and causes
invalid IR.

This bug cause build failure in HCC.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39432

llvm-svn: 316957
2017-10-30 21:19:41 +00:00
Craig Topper 9f01f6093c [X86] Add AVX512 support to fast isel's X86ChooseCmpOpcode.
llvm-svn: 316955
2017-10-30 21:09:19 +00:00
Davide Italiano 834b45129b [NewGVN] Stop assuming PHI args ordering when looking at phi-of-ops.
It's not guaranteed. There's a bug open to sort them in predecessor
order, but it won't happen anytime soon. In the meanwhile, passes
will have to do an O(#preds) scan. Such is life.

llvm-svn: 316953
2017-10-30 20:20:16 +00:00
Stefan Pintilie 6262fd4b0a Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"
Revert r316478.
A test case has failed.
Will recommit this change once we find and fix the failure.

This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34.

llvm-svn: 316952
2017-10-30 19:55:38 +00:00
Daniel Neilson f9c7d29c77 Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst

This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).

---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done

Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev

Reviewed By: sanjoy

Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38419

llvm-svn: 316950
2017-10-30 19:51:48 +00:00
Mandeep Singh Grang f83268bd9e [GVNHoist] Fix non-deterministic sort order of PHIs for identical instructions
Summary: This fixes failure in Transforms/GVNHoist/hoist.ll uncovered by D39245.

Reviewers: hiraditya, spop, dberlin

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39410

llvm-svn: 316949
2017-10-30 19:42:41 +00:00
Simon Pilgrim 017f896adb [SelectionDAG] Add VSELECT demanded elts support to computeKnownBits
llvm-svn: 316947
2017-10-30 19:31:08 +00:00
Simon Pilgrim 96a0b9ef54 [SelectionDAG] Add VSELECT support to computeKnownBits
llvm-svn: 316944
2017-10-30 19:08:21 +00:00
Simon Pilgrim 5da11dfd24 [SelectionDAG] Add SELECT demanded elts support to ComputeNumSignBits
llvm-svn: 316933
2017-10-30 17:53:51 +00:00
Simon Pilgrim 194693e996 [MC] Split out register def/use idx calls to make debugging simpler. NFCI.
llvm-svn: 316927
2017-10-30 17:24:40 +00:00
Jina Nahias 5bf6620b15 [X86][AVX512] Adding a pattern for broadcastm intrinsic.
Differential Revision: https://reviews.llvm.org/D38312

Change-Id: I71c8605a8e4c98013ef25289694afc5cfd46bb0b
llvm-svn: 316921
2017-10-30 16:37:28 +00:00
Rafael Espindola 6f36637be0 Move isDSOLocal check and add a comment.
llvm-svn: 316920
2017-10-30 16:32:31 +00:00
Fangrui Song 2696db90d1 [PPC CodeGen] Fix the bitreverse.i64 intrinsic.
Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270.

Reviewers: jtony, aaron.ballman

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D39163

llvm-svn: 316916
2017-10-30 16:03:44 +00:00
Craig Topper 4e13d4de52 [X86] Make sure we don't create locked inc/dec instructions when the carry flag is being used.
Summary:
INC/DEC don't update the carry flag so we need to make sure we don't try to use it.

This patch introduces new X86ISD opcodes for locked INC/DEC. Teaches lowerAtomicArithWithLOCK to emit these nodes if INC/DEC is not slow or the function is being optimized for size. An additional flag is added that allows the INC/DEC to be disabled if the caller determines that the carry flag is being requested.

The test_sub_1_cmp_1_setcc_ugt test is currently showing this bug. The other test case changes are recovering cases that were regressed in r316860.

This should fully fix PR35068 finishing the fix started in r316860.

Reviewers: RKSimon, zvi, spatel

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39411

llvm-svn: 316913
2017-10-30 14:51:37 +00:00
Craig Topper 367cc12fa9 [X86] Remove AVX512 early out from X86FastISel::X86SelectCmp.
This shouldn't be needed anymore since i1 isn't a legal type.

llvm-svn: 316912
2017-10-30 14:50:11 +00:00
Yaxun Liu c928f2a6d4 [AMDGPU] Emit metadata for hidden arguments for kernel enqueue
Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.

Differential Revision: https://reviews.llvm.org/D39255

llvm-svn: 316907
2017-10-30 14:30:28 +00:00
Clement Courbet b2c3eb8cf1 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Krzysztof Parzyszek bef1c56724 [Hexagon] Allow the RDF optimizations to be run in .mir testcases
llvm-svn: 316904
2017-10-30 14:11:52 +00:00
Javed Absar 5cde1ccb29 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261

llvm-svn: 316902
2017-10-30 13:51:56 +00:00
Andrew V. Tischenko f94da596a7 Invalid used of 'w' suffix on push and pop using 64-bit register.
Differential Revision: https://reviews.llvm.org/D38626

llvm-svn: 316898
2017-10-30 12:02:06 +00:00
Jina Nahias e63db55c67 Revert "[X86][AVX512] Adding a pattern for broadcastm intrinsic."
This reverts commit r316890.

Change-Id: I683cceee9848ef309b452293086b1f26a941950d
llvm-svn: 316894
2017-10-30 10:35:53 +00:00
Florian Hahn d0208b4b1c Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316891
2017-10-30 10:07:42 +00:00
Jina Nahias 70280f9a0d [X86][AVX512] Adding a pattern for broadcastm intrinsic.
Differential Revision: https://reviews.llvm.org/D38312

Change-Id: I6551fb13879e098aed74de410e29815cf37d9ab5
llvm-svn: 316890
2017-10-30 09:59:52 +00:00
Max Kazantsev 390fc57771 [IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value
llvm-svn: 316889
2017-10-30 09:35:16 +00:00
Florian Hahn d18443edad Revert r316887 to fix buildbot failures.
llvm-svn: 316888
2017-10-30 09:21:50 +00:00
Florian Hahn 925d3e4a98 Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316887
2017-10-30 09:04:18 +00:00
Max Kazantsev 1d7c0439b9 [GVN][NFC] Mark instruction for deletion instead of immediate erasing in LoadPRE
It is done to uniformly handle instructions removal.

Differential Revision: https://reviews.llvm.org/D39369

llvm-svn: 316884
2017-10-30 04:48:34 +00:00
Craig Topper 85bcc297c3 [X86] Rearrange code in X86InstrInfo.cpp to put all the foldMemoryOperandImpl methods together without partial/undef register handling in the middle. NFC
I have a future patch that wants to make use of the one of the partial functions in one of the earlier memory folding methods and the current ordering prevents that.

llvm-svn: 316883
2017-10-30 04:39:18 +00:00
Craig Topper c848355335 [X86] Simplify code by removing an unnecessary temporary variable. NFC
llvm-svn: 316882
2017-10-30 03:35:44 +00:00
Craig Topper 730414b0ca [X86] Move some EVEX->VEX code to a helper function to prepare for a future patch. NFC
llvm-svn: 316881
2017-10-30 03:35:43 +00:00
Simon Pilgrim 601ae238b7 [SelectionDAG] Add SEXT/AND/XOR/Or demanded elts support to ComputeNumSignBits
llvm-svn: 316875
2017-10-29 22:03:37 +00:00
Sanjay Patel adf38911d8 [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
The old PM sets the options of what used to be known as "latesimplifycfg" on the 
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not 
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 316869
2017-10-29 20:49:31 +00:00
Simon Pilgrim 7613a7b564 [SelectionDAG] Add SRA/SHL demanded elts support to ComputeNumSignBits
Introduce a isConstOrDemandedConstSplat helper function that can recognise a constant splat build vector for at least the demanded elts we care about.

llvm-svn: 316866
2017-10-29 18:19:37 +00:00
Craig Topper 495a1bc893 [X86] Remove combine that turns X86ISD::LSUB into X86ISD::LADD. Update patterns that depended on this.
If the carry flag is being used, this transformation isn't safe.

This does prevent some test cases from using DEC now, but I'll try to look into that separately.

Fixes PR35068.

llvm-svn: 316860
2017-10-29 06:51:04 +00:00
Craig Topper 7a60e29185 [X86] Fix typo in comment. NFC
llvm-svn: 316859
2017-10-29 06:51:02 +00:00
Craig Topper 912f3b8e4b [X86] Use the extended vector register classes in fast isel with AVX512F/VL.
llvm-svn: 316857
2017-10-29 05:14:26 +00:00
Craig Topper 5f2289a13c [X86] Add AVX512 support to X86FastISel::X86SelectFPExt and X86FastISel::X86SelectFPTrunc.
llvm-svn: 316856
2017-10-29 02:50:31 +00:00
Craig Topper 1e30d783dd [X86] Add AVX512 support to X86FastISel::X86MaterializeFP
llvm-svn: 316853
2017-10-29 02:18:41 +00:00
Craig Topper 0692ca4bd2 [X86] Remove invalid code from LowerVSELECT.
This code attempted to say that v8i16/v16i16 VSELECT is legal if BWI and VLX are enabled, but the only way we could reach this point is if the condition was not a vXi1 type. Which means it really wasn't legal.

We don't have any tests that exercise this code. So I'm hoping it wasn't really reachable.

llvm-svn: 316851
2017-10-28 23:10:13 +00:00
Simon Pilgrim b37a24e82f [SelectionDAG] Add support for INSERT_SUBVECTOR to computeKnownBits
llvm-svn: 316847
2017-10-28 22:10:40 +00:00
Simon Pilgrim 294f88dfa0 [X86][SSE] Combine 128-bit target shuffles to PACKSS/PACKUS.
llvm-svn: 316845
2017-10-28 20:51:27 +00:00
Simon Pilgrim bd3852aa5e [X86][SSE] Split off matchVectorShuffleWithPACK. NFCI.
Split matchVectorShuffleWithPACK from lowerVectorShuffleWithPACK so that we can reuse it for target shuffle combines

llvm-svn: 316844
2017-10-28 20:27:22 +00:00
Craig Topper 40f0584f08 [X86] Fix a mistake in the X86ISelDAGToDAG.cpp code for MUL8r/IMUL8r.
I think this code is unreachable due to some promotions that occur elsewhere. I'll look into that to be sure, but for now I thought I should at least fix the obvious typo.

llvm-svn: 316840
2017-10-28 19:56:57 +00:00
Craig Topper 202b559ae0 [X86] Replace some default cases in X86SelectShift with llvm_unreachable.
llvm-svn: 316839
2017-10-28 19:56:56 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Simon Pilgrim 25808c303f [X86][SSE] Rename truncateVectorCompareWithPACKSS to truncateVectorWithPACKSS. NFC.
We no longer rely on the vector source being a comparison result, just have sufficient sign bits.

llvm-svn: 316834
2017-10-28 17:59:56 +00:00
Simon Pilgrim d09c1ac20f [SelectionDAG] Support 'bit preserving' floating points bitcasts on computeKnownBits/ComputeNumSignBits
For cases where we know the floating point representations match the bitcasted integer equivalent, allow bitcasting to these types.

This is especially useful for the X86 floating point compare results which return all/zero bits but as a floating point type.

Differential Revision: https://reviews.llvm.org/D39289

llvm-svn: 316831
2017-10-28 14:27:53 +00:00
Craig Topper f8b92661b8 [X86] Remove unneeded MVT::i1 related code from fast isel.
llvm-svn: 316825
2017-10-28 05:52:23 +00:00
Haicheng Wu eb92e569de [ConstantFold] Fix a crash when folding a GEP that has vector index
LLVM crashes when factoring out an out-of-bound index into preceding dimension
and the preceding dimension uses vector index.  Simply bail out now when this
case happens.

Differential Revision: https://reviews.llvm.org/D38677

llvm-svn: 316824
2017-10-28 02:27:14 +00:00
Craig Topper 49687104d6 [PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has
Summary:
We shouldn't do this transformation if the function is marked nobuitlin.

We were only checking that the return type is floating point, we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name.

We should also be checking TLI::has since sqrtf is a macro on Windows.

Fixes PR32559.

Reviewers: hfinkel, spatel, davide, efriedma

Reviewed By: davide, efriedma

Subscribers: efriedma, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39381

llvm-svn: 316819
2017-10-28 00:36:58 +00:00
Tom Stellard d0c6cf2e8c AMDGPU/GlobalISel: Mark 32-bit G_FADD as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38439

llvm-svn: 316815
2017-10-27 23:57:41 +00:00
Bob Haarman d4e75f84e5 [support] remove tautological comparison in Support/Windows/Path.inc
Summary:
The removed code checks that we are able to handle a 64-bit number, but
the code we're calling takes two dwords (for a total of 64 bits), so this
is always true.

Reviewers: zturner, rnk, majnemer, compnerd

Reviewed By: zturner

Subscribers: amccarth, hiraditya, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D39263

llvm-svn: 316814
2017-10-27 23:41:17 +00:00
Jake Ehrlich f22728e636 Revert "Add support for writing 64-bit symbol tables for archives when offsets become too large for 32-bit"
This reverts commit r316805.

llvm-svn: 316813
2017-10-27 23:39:31 +00:00
Jake Ehrlich 9d5a7c3b8c Add support for writing 64-bit symbol tables for archives when offsets become too large for 32-bit
This should fix https://bugs.llvm.org//show_bug.cgi?id=34189

This change makes it so that if writing a K_GNU style archive, you need
to output a > 32-bit offset it should output in K_GNU64 style instead.

Differential Revision: https://reviews.llvm.org/D36812

llvm-svn: 316805
2017-10-27 22:26:37 +00:00
Krzysztof Parzyszek 4dc04e6a70 [Hexagon] Adjust patterns to reflect instruction selection preferences
llvm-svn: 316804
2017-10-27 22:24:49 +00:00
David Blaikie 8699f71310 Add a few missing headers for modularization/IWYU/etc
Several cases where class definitions are required for DenseMap pointer
traits handling.

llvm-svn: 316803
2017-10-27 22:12:46 +00:00
Guozhi Wei 7c67009fe5 [DAGCombine] Don't combine sext with extload if sextload is not supported and extload has multi users
In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then

  if sext is the only user of extload, there is no big difference, no harm no benefit.
  if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case.

This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload.

Differential Revision: https://reviews.llvm.org/D39108

llvm-svn: 316802
2017-10-27 21:54:24 +00:00
Jake Ehrlich de370414e3 Make 32-bit member offset in Archive::Symbol::getMember 64-bit
When accessing a member for a symbol with an offset greater than 2^32 -
1 the current Archive::Symbol::getMember implementation will overflow
and cause unexpected behavior. This change simply fixes that. In
particular if you call "llvm-nm --print-armap" on an archive that has
this behavior you'll get an error.

Differential Revision: https://reviews.llvm.org/D39379

llvm-svn: 316801
2017-10-27 21:47:38 +00:00
Rafael Espindola 2393c3b4e1 Handle undefined weak hidden symbols on all architectures.
We were handling the non-hidden case in lib/Target/TargetMachine.cpp,
but the hidden case was handled in architecture dependent code and
only X86_64 and AArch64 were covered.

While it is true that some code sequences in some ABIs might be able
to produce the correct value at runtime, that doesn't seem to be the
common case.

I left the AArch64 code in place since it also forces a got access for
non-pic code. It is not clear if that is needed, but it is probably
better to change that in another commit.

llvm-svn: 316799
2017-10-27 21:18:48 +00:00
Zachary Turner 94f5032aed Force #define GTEST_LANG_CXX11.
gtest depends on this #define to determine whether it can
use various classes like std::tuple, or whether it has to fall
back to experimental classes in the std::tr1 namespace.  The
check in the current version of gtest relies on the value of
the `__cplusplus` macro, but MSVC provides a non-conformant
value of this macro, making it effectively impossible to detect
C++11.  In short, LLVM compiled with MSVC has been silently
using the tr1 versions of several classes since the beginning of
time.

This would normally be pretty benign, except that in the latest
preview of MSVC they have marked all of the tr1 classes
deprecated, so it spews thousands of warnings.

llvm-svn: 316798
2017-10-27 21:12:28 +00:00
Craig Topper d69453290e [X86] Remove fast-isel code for handling i8 shifts. This is handled by auto generated code.
llvm-svn: 316797
2017-10-27 21:00:59 +00:00
Artur Gainullin af7ba8ff6b Improve clamp recognition in ValueTracking.
Summary:
ValueTracking was recognizing not all variations of clamp. Swapping of
true value and false value of select was added to fix this problem. The
first patch was reverted because it caused miscompile in NVPTX target. 
Added corresponding test cases.

Reviewers: spatel, majnemer, efriedma, reames

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D39240

llvm-svn: 316795
2017-10-27 20:53:41 +00:00
Craig Topper 728fa7b4e2 [X86] Teach fastisel to use VLX VMOVNTDQA for v4f64 and 256-bit integers when available.
This looks to have been missed from r280682.

llvm-svn: 316790
2017-10-27 20:13:10 +00:00
Vlad Tsyrklevich b42db1567c Fix llvm-special-case-list-fuzzer regexp exception
Summary:
Original oss-fuzz report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3727#c2

The minimized test case that causes this failure:
5b 5b 5b 3d 47 53 00 5b  3d 5d 5b 5d 0a     [[[=GS.[=][].

Note the string "=GS\x00". The failure happens because the code is
searching the string against an array of known collated names. "GS\x00"
is a hit, but since len takes into account an extra NUL byte, indexing
into cp->name[len] goes one byte past it's allocated memory. Fix this to
use a strlen(cp->name) comparison to account for NUL bytes in the input.

Reviewers: pcc

Reviewed By: pcc

Subscribers: hctim, kcc

Differential Revision: https://reviews.llvm.org/D39380

llvm-svn: 316786
2017-10-27 19:15:13 +00:00
Krzysztof Parzyszek 92a2635bbd [Hexagon] Fix an incorrect assertion in HexagonConstExtenders.cpp
Making sure that an instruction has fewer operands than required, then
attempting to access one out of range is going to fail.

llvm-svn: 316785
2017-10-27 18:52:28 +00:00
Peter Collingbourne 5c54f15c55 ELF: Add support for emitting dynamic relocations in the Android relocation packing format.
The Android relocation packing format is a more compact
format for dynamic relocations in executables and DSOs
that is based on delta encoding and SLEBs. An overview
of the format can be found in the Android source code:
https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h

This patch implements relocation packing using that format.

This implementation uses a more intelligent algorithm for compressing
relative relocations than Android's own relocation packer. As a
result it can generally create smaller relocation sections than
that packer. If I link Chromium for Android targeting ARM32 I get a
.rel.dyn of size 174693 bytes, as compared to 371832 bytes with gold
and the Android packer.

Differential Revision: https://reviews.llvm.org/D39152

llvm-svn: 316775
2017-10-27 17:49:40 +00:00
Simon Pilgrim 5e3808afa2 [X86][F16C] Fix btver2 AGU pipe scheduling
Use the store AGU for stores, and the load AGU needs to be the first pipe for loads

llvm-svn: 316771
2017-10-27 16:34:58 +00:00
Artur Pilipenko 8aadc643cf [LoopPredication] Handle the case when the guard and the latch IV have different offsets
This is a follow up change for D37569.

Currently the transformation is limited to the case when:
 * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.
 * The step of the IV used in the latch condition is 1.
 * The IV of the latch condition is the same as the post increment IV of the guard condition.
 * The guard condition is of the form i u< guardLimit.

This patch enables the transform in the case when the latch is

 latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.

And the guard is

 guardStart + i u< guardLimit

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D39097

llvm-svn: 316768
2017-10-27 14:46:17 +00:00
Clement Courbet e1eafe0a54 [CodeGen] Fix -Wunused-private-field warning on lld-x86_64-darwin13.
llvm-svn: 316765
2017-10-27 13:34:41 +00:00
Clement Courbet be684eee82 [CodeGen][ExpandMemCmp][NFC] Simplify load sequence generation.
llvm-svn: 316763
2017-10-27 12:34:18 +00:00
whitequark 131f98f054 [LLVM-C] Publicly expose getters of MetadataType, TokenType
Patch by Robert Widmann.

Expose getters for MetadataType and TokenType publicly in the C API.
Discovered a need for these while trying to wrap the intrinsics API.

Differential Revision: https://reviews.llvm.org/D38809

llvm-svn: 316762
2017-10-27 11:51:40 +00:00
George Rimar 3d07f6004e Fix BB after r316756 "[llvm-dwarfdump] - Teach verifier to report broken DWARF expressions."
Bot:
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/6255

Changed format of this message by mistake.

llvm-svn: 316757
2017-10-27 10:58:04 +00:00
George Rimar 144e4c5a32 [llvm-dwarfdump] - Teach verifier to report broken DWARF expressions.
Patch improves next things:

* Fixes assert/crash in getOpDesc when giving it a invalid expression op code.
* DWARFExpression::print() called DWARFExpression::Operation::getEndOffset() which
  returned and used uninitialized field EndOffset. Patch fixes that.
* Teaches verifier to verify DW_AT_location and error out on broken expressions.

Differential revision: https://reviews.llvm.org/D39294

llvm-svn: 316756
2017-10-27 10:42:04 +00:00
Matt Arsenault 878827d93a DAG: Fold fma (fneg x), K, y -> fma x, -K, y
llvm-svn: 316753
2017-10-27 09:06:07 +00:00
Max Kazantsev 665907c3c2 [GVN][NFC] Refactor loop iteration with foreach
llvm-svn: 316748
2017-10-27 08:19:35 +00:00
Max Kazantsev 52d0a49046 Revert rL316568 because of sudden performance drop on ARM
llvm-svn: 316739
2017-10-27 04:17:44 +00:00
Sean Fertile 57d46b8436 Add subclass data to the FoldingSetNode for MemIntrinsicSDNodes.
Not having the subclass data on an MemIntrinsicSDNodes means it was possible
to try to fold 2 nodes with the same operands but differing MMO flags. This
would trip an assertion when trying to refine the alignment between the 2
MachineMemOperands.

Differential Revision: https://reviews.llvm.org/D38898

llvm-svn: 316737
2017-10-27 04:02:51 +00:00
Eugene Zelenko 57bd5a0274 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316724
2017-10-27 01:09:08 +00:00
Reid Kleckner 145090f124 [PDB] Handle an empty globals hash table with no buckets
llvm-svn: 316722
2017-10-27 00:45:51 +00:00
Balaram Makam 32bcb5d7fb Revert "[CGP] Merge empty case blocks if no extra moves are added."
This reverts commit r316711. The domtree isn't getting updated correctly.

llvm-svn: 316721
2017-10-27 00:35:18 +00:00
Sam Clegg c55d13f461 [WebAssembly] MC: Don't allow zero sized data segments
This ensures that each segment has a unique address.
Without this, consecutive zero sized symbols would
end up with the same address and the linker cannot
map symbols to unique data segments.

Differential Revision: https://reviews.llvm.org/D39107

llvm-svn: 316717
2017-10-27 00:08:55 +00:00
David Blaikie 6265130054 InstructionSelectorImpl.h: Modularize/remove ODR violations by using a static member function to expose the debug name
llvm-svn: 316715
2017-10-26 23:39:54 +00:00
Balaram Makam cddf3c5e1c [CGP] Merge empty case blocks if no extra moves are added.
Summary:
Currently we skip merging when extra moves may be added in the header of switch instead of the case block, if the case block is used as an incoming
block of a PHI. If all the incoming values of the PHIs are non-constants and the destination block is dominated by the switch block then extra moves are likely not added by ISel, so there is no need to skip merging in this case.

Reviewers: efriedma, junbuml, davidxl, hfinkel, qcolombet

Reviewed By: efriedma

Subscribers: dberlin, kuhar, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37343

llvm-svn: 316711
2017-10-26 22:34:01 +00:00
Philip Reames 29dd40b38e [SimplifyIndVars] Shorten code by using SCEV helper [NFC]
llvm-svn: 316709
2017-10-26 22:02:16 +00:00
Eli Friedman d5dfb62de7 [ARM] Honor -mfloat-abi for libcall calling convention
As far as I can tell, this matches gcc: -mfloat-abi determines the
calling convention for all functions except those explicitly defined as
soft-float in the ARM RTABI.

This change only affects cases where the user specifies -mfloat-abi to
override the default calling convention derived from the target triple.

Fixes https://bugs.llvm.org//show_bug.cgi?id=34530.

Differential Revision: https://reviews.llvm.org/D38299

llvm-svn: 316708
2017-10-26 21:42:32 +00:00
David Blaikie 1f13b3c243 Support/reg*: Roll some non-modular headers into their singular uses
These headers have static variables in them, which would easily create
ODR violations if the header was included in another header, and the
constants were used by an inline function, for example.

llvm-svn: 316706
2017-10-26 21:32:58 +00:00
Dehao Chen ed2d5402cb Do not add discriminator encoding for debug intrinsics.
Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics.

Reviewers: dblaikie, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D39343

llvm-svn: 316703
2017-10-26 21:20:52 +00:00
Craig Topper b8d7d4d683 [X86] Improve handling of UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG to support 64-bit extensions.
If the extend type is 64-bits, emit a 32-bit -> 64-bit extend after the UDIVREM8_ZEXT_HREG/UDIVREM8_SEXT_HREG operation.

This gives a shorter encoding for the second extend in the sext case, and allows us to completely remove the second extend in the zext case.

This also adds known bit and num sign bits support for UDIVREM8_ZEXT_HREG/SDIVREM8_SEXT_HREG.

Differential Revision: https://reviews.llvm.org/D38275

llvm-svn: 316702
2017-10-26 21:12:03 +00:00
Craig Topper 8a2a104129 [X86] Teach the assembly parser to warn on duplicate registers in gather instructions.
Fixes PR32238.

Differential Revision: https://reviews.llvm.org/D39077

llvm-svn: 316700
2017-10-26 21:03:54 +00:00
Philip Reames 21cc2fa3f6 [LICM] Restructure implicit exit handling to be more clear [NFCI]
When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion.  Extract a helper routine and clarify naming/scopes to make this a bit more obvious.

llvm-svn: 316699
2017-10-26 21:00:15 +00:00
David Blaikie d97112e370 Support/reg*.h: Make headers include their dependencies
llvm-svn: 316696
2017-10-26 20:23:11 +00:00
Martin Storsjo 6c1fd2992a [COFF] Support ordinals in def files with space between @ and the number
Both GNU ld and MS link.exe support declaring ordinals this way.

A test will be added in lld.

Differential Revision: https://reviews.llvm.org/D39327

llvm-svn: 316690
2017-10-26 20:11:32 +00:00
Sanjay Patel ac50f3e907 [x86] use an insert op to put one variable element into a constant of vectors
Instead of loading (a potential ton of) scalar constants, load those as a vector and then insert into it.

Differential Revision: https://reviews.llvm.org/D38756

llvm-svn: 316685
2017-10-26 18:27:55 +00:00
Yichao Yu 221dae31a5 Clear LastMappingSymbols and LastEMS(Info) when resetting the ARM(AArch64)ELFStreamer
Summary:
This causes a segfault on ARM when (I think) the pass manager is used multiple times.

Reset set the (last) current section to NULL without saving the corresponding LastEMSInfo back into the map. The next use of the streamer then save the LastEMSInfo for the NULL section leaving the LastEMSInfo mapping for the last current section (the one that was there before the reset) NULL which cause the LastEMSInfo to be set to NULL when the section is being used again.

The reuse of the section (pointer) might mean that the map was holding dangling pointers previously which is why I went for clearing the map and resetting the info, making it as similar to the state right after the constructor run as possible. The AArch64 one doesn't have segfault (since LastEMS isn't a pointer) but it seems to have the same issue.

The segfault is likely caused by https://reviews.llvm.org/D30724 which turns LastEMSInfo into a pointer. As mentioned above, it seems that the actual issue was older though.

No test is included since the test is believed to be too complicated for such an obvious fix and not worth doing.

Reviewers: llvm-commits, shankare, t.p.northover, peter.smith, rengolin

Reviewed By: rengolin

Subscribers: mgorny, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38588

llvm-svn: 316679
2017-10-26 17:36:43 +00:00
Keno Fischer 1c43ad0965 [DynamicLibrary] Fix build on musl libc
Summary:
On musl libc, stdin/out/err are defined as `FILE* const` globals,
and their address is not implicitly convertible to void *,
or at least gcc 6 doesn't allow it, giving errors like:

```
error: cannot initialize return object of type 'void *' with an rvalue of type 'FILE *const *' (aka '_IO_FILE *const *')
    EXPLICIT_SYMBOL(stderr);
    ^~~~~~~~~~~~~~~~~~~~~~~
```

Add an explicit cast to fix that problem.

Reviewers: marsupial, krytarowski, dim
Reviewed By: dim
Differential Revision: https://reviews.llvm.org/D39297

llvm-svn: 316672
2017-10-26 16:44:13 +00:00
Mandeep Singh Grang 049ed12df7 [MachineModuleInfoImpls] Replace qsort with array_pod_sort
Summary:
This seems to be the only place in llvm we directly call qsort. We can replace
this with a call to array_pod_sort. Also minor cleanup of the sorting function.

Reviewers: bkramer, Eugene.Zelenko, rafael

Reviewed By: bkramer

Subscribers: efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D39214

llvm-svn: 316671
2017-10-26 16:07:20 +00:00
Balaram Makam 9ee942f481 Reapply r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit.

Reviewers:

Subscribers:

llvm-svn: 316669
2017-10-26 15:04:53 +00:00
Sean Fertile c70d28bff5 Represent runtime preemption in the IR.
Currently we do not represent runtime preemption in the IR, which has several
drawbacks:

  1) The semantics of GlobalValues differ depending on the object file format
     you are targeting (as well as the relocation-model and -fPIE value).
  2) We have no way of disabling inlining of run time interposable functions,
     since in the IR we only know if a function is link-time interposable.
     Because of this llvm cannot support elf-interposition semantics.
  3) In LTO builds of executables we will have extra knowledge that a symbol
     resolved to a local definition and can't be preemptable, but have no way to
     propagate that knowledge through the compiler.

This patch adds preemptability specifiers to the IR with the following meaning:

dso_local --> means the compiler may assume the symbol will resolve to a
 definition within the current linkage unit and the symbol may be accessed
 directly even if the definition is not within this compilation unit.

dso_preemptable --> means that the compiler must assume the GlobalValue may be
replaced with a definition from outside the current linkage unit at runtime.

To ease transitioning dso_preemptable is treated as a 'default' in that
low-level codegen will still do the same checks it did previously to see if a
symbol should be accessed indirectly. Eventually when IR producers emit the
specifiers on all Globalvalues we can change dso_preemptable to mean 'always
access indirectly', and remove the current logic.

Differential Revision: https://reviews.llvm.org/D20217

llvm-svn: 316668
2017-10-26 15:00:26 +00:00
Marek Olsak 2232243863 AMDGPU: Handle s_buffer_load_dword hazard on SI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D39171

llvm-svn: 316666
2017-10-26 14:43:02 +00:00
Bjorn Pettersson 86db068e39 [LSV] Avoid adding vectors of pointers as candidates
Summary:
We no longer add vectors of pointers as candidates for
load/store vectorization. It does not seem to work anyway,
but without this patch we can end up in asserts when trying
to create casts between an integer type and the pointer of
vectors type.

The test case I've added used to assert like this when trying to
cast between i64 and <2 x i16*>:
opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
#1 SignalHandler(int)
#2 __restore_rt
#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D39296

llvm-svn: 316665
2017-10-26 13:59:15 +00:00
Bjorn Pettersson 22a2282da1 [LSV] Skip all non-byte sizes, not only less than eight bits
Summary:
The code comments indicate that no effort has been spent on
handling load/stores when the size isn't a multiple of the
byte size correctly. However, the code only avoided types
smaller than 8 bits. So for example a load of an i28 could
still be considered as a candidate for vectorization.

This patch adjusts the code to behave according to the code
comment.

The test case used to hit the following assert when
trying to use "cast" an i32 to i28 using CreateBitOrPointerCast:

opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
#0 PrintStackTraceSignalHandler(void*)
#1 SignalHandler(int)
#2 __restore_rt
#3 __GI_raise
#4 __GI_abort
#5 __GI___assert_fail
#6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*)
#7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&)
#8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*)

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39295

llvm-svn: 316663
2017-10-26 13:42:55 +00:00
Simon Dardis b633acac9f [mips] Fix (dis)assembly of abs.fmt for micromips
These instructions were previously marked as codegen only preventing
them from being assembled as microMIPS or disassembled.

Reviewers: atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D39123

llvm-svn: 316656
2017-10-26 11:36:54 +00:00
Simon Dardis 13452383cd [mips] Fix PR35071
PR35071 exposed the fact that MipsInstrInfo::removeBranch did not walk past
debug instructions when removing branches for the control flow optimizer, which
lead to duplicated conditional branches. If the target of the branch was a
removable block, only the conditional branch in the terminating position would
have it's MBB operands updated, leaving the first branch with a dangling MBB
operand. The MIPS long branch pass would then trigger an assertion when
attempting to examine the instruction with dangling MBB operand.

This resolves PR35071.

Thanks to Alex Richardson for reporting the issue!

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D39288

llvm-svn: 316654
2017-10-26 10:58:36 +00:00
Hiroshi Inoue b72b1fb0de [PowerPC] Use record-form instruction for Less-or-Equal -1 and Greater-or-Equal 1
Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0.
This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities.

Differential Revision: https://reviews.llvm.org/D38941

llvm-svn: 316647
2017-10-26 09:01:51 +00:00
Hans Wennborg caceb64067 Tidy up CountingFunctionInserter a little. NFC.
Use StringRef for CountingFunctionName, remove erroneous comment
copied from InstructionNamer, and drop some trailing whitespace.

llvm-svn: 316644
2017-10-26 08:29:08 +00:00
Craig Topper 0551556ed2 [AsmParser][TableGen] Add VariantID argument to the generated mnemonic spell check function so it can use the correct table based on variant.
I'm considering implementing the mnemonic spell checker for x86, and that would require the separate intel and att variants.

llvm-svn: 316641
2017-10-26 06:46:41 +00:00
Craig Topper 2a06028c0a [AsmParser][TableGen] Make the generated mnemonic spell checker function a file local static function.
Also only emit in targets that specificially request it. This is required so we don't get an unused static function error.

llvm-svn: 316640
2017-10-26 06:46:40 +00:00
Craig Topper 619b15283d [X86] Use correct type for return value of ComputeAvailableFeatures in the AsmParser. NFC
There aren't enough used bits to make this a functional change, but we should fix it for consistency.

llvm-svn: 316639
2017-10-26 06:46:38 +00:00
Eugene Zelenko 5c2aecef78 [Transforms] Revert r316630 changes in Scalar/MergeICmps.cpp to fix broken build bots (NFC).
llvm-svn: 316634
2017-10-26 01:25:14 +00:00
Eugene Zelenko 5adb96cc92 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Matthew Simpson 99f57933ba Attempt to unbreak the expensive-checks-win bot
llvm-svn: 316625
2017-10-25 22:46:34 +00:00
Jonas Devlieghere f63ee64c4b Re-land "[dwarfdump] Add -lookup option"
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.

Differential revision: https://reviews.llvm.org/D38409

llvm-svn: 316619
2017-10-25 21:56:41 +00:00
Sanjoy Das 8499ebf2e9 [SCEV] Fix an assertion failure in the max backedge taken count
Max backedge taken count is always expected to be a constant; and this is
usually true by construction -- it is a SCEV expression with constant inputs.
However, if the max backedge expression ends up being computed to be a udiv with
a constant zero denominator[0], SCEV does not fold the result to a constant
since there is no constant it can fold it to (SCEV has no representation for
"infinity" or "undef").

However, in computeMaxBECountForLT we already know the denominator is positive,
and thus at least 1; and we can use this fact to avoid dividing by zero.

[0]: We can end up with a constant zero denominator if the signed range of the
stride is more precise than the unsigned range.

llvm-svn: 316615
2017-10-25 21:41:00 +00:00