Commit Graph

95580 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith c82c11428e GlobalStatus: Don't walk use-lists of ConstantData
Return early from llvm::isSafeToDestroyConstant() whenever the value
`isa<ConstantData>()`.  These constants are shared across the
LLVMContext.  We never really want to delete them here, and walking
their use-lists can be very expensive.

(This is motivated by an eventual goal of removing use-lists entirely
from ConstantData.)

llvm-svn: 282320
2016-09-24 02:30:11 +00:00
Kostya Serebryany 0800b81a21 [libFuzzer] simplify HandleTrace again, start re-running interesting units and collecting their features.
llvm-svn: 282316
2016-09-23 23:51:58 +00:00
Peter Collingbourne a638fe057c Add qualification to fix MSVC build.
llvm-svn: 282313
2016-09-23 23:23:23 +00:00
Sanjay Patel 0b36337d61 [x86] fix FCOPYSIGN lowering to create constants instead of ConstantPool loads
This is similar to:
https://reviews.llvm.org/rL279958

By not prematurely lowering to loads, we should be able to more easily eliminate
the 'or' with zero instructions seen in copysign-constant-magnitude.ll.

We should also be able to extend this code to handle vectors.

llvm-svn: 282312
2016-09-23 23:17:29 +00:00
Petr Hosek 85b2f67613 [MC] Support .ds directives in assembler parser
These directives are already supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24740

llvm-svn: 282303
2016-09-23 21:53:36 +00:00
Matthias Braun 729c989083 llc: Add -start-before/-stop-before options
Differential Revision: https://reviews.llvm.org/D23089

llvm-svn: 282302
2016-09-23 21:46:02 +00:00
Peter Collingbourne 80186a57d6 LTO: Simplify caching interface.
The NativeObjectOutput class has a design problem: it mixes up the caching
policy with the interface for output streams, which makes the client-side
code hard to follow and would for example make it harder to replace the
cache implementation in an arbitrary client.

This change separates the two aspects by moving the caching policy
to a separate field in Config, replacing NativeObjectOutput with a
NativeObjectStream class which only deals with streams and does not need to
be overridden by most clients and introducing an AddFile callback for adding
files (e.g. from the cache) to the link.

Differential Revision: https://reviews.llvm.org/D24622

llvm-svn: 282299
2016-09-23 21:33:43 +00:00
Valery Pykhtin fbf2d93f73 [AMDGPU] Fix for bz30427: wrong MTBUF encoding on VI
Differential revision: https://reviews.llvm.org/D24875

llvm-svn: 282296
2016-09-23 21:21:21 +00:00
Kostya Serebryany 2d1d944f7e [libFuzzer] first steps in adding a proper automated test suite based on real-life code: add a script to build RE2 at a revision that has known bugs
llvm-svn: 282292
2016-09-23 20:43:22 +00:00
Kostya Serebryany 0d26de3922 [libFuzzer] reset Counters (trace-pc-guard) before every run
llvm-svn: 282284
2016-09-23 20:04:13 +00:00
Petr Hosek 4cb08ce3bc [MC] Support .dcb directives in assembler parser
These directives are already supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24741

llvm-svn: 282283
2016-09-23 19:25:15 +00:00
Sanjay Patel 04949faf69 [TLI] isdigit / isascii / toascii param type should match return type (PR30484)
We crash in LibCallSimplifier if we don't check the validity of the function signature properly.

llvm-svn: 282278
2016-09-23 18:44:09 +00:00
Quentin Colombet 380cd88cfd [ResetMachineFunction] Populate the comments in the header of the file.
NFC

llvm-svn: 282276
2016-09-23 18:38:15 +00:00
Quentin Colombet 0f4c20a1e7 [ResetMachineFunction] Add statistic on the number of reset functions.
As the development of GlobalISel move forward, this statistic should
strictly decrease until it reaches zero. At this point, it would mean
GlobalISel can replace SDISel (at least on the tested inputs :P).

llvm-svn: 282275
2016-09-23 18:38:13 +00:00
Quentin Colombet d4c02437c1 [RegisterBankInfo] Add statistics for dynamic partial mappings.
Collect statistics about the number of partial mappings dynamically
allocated and accessed. Ultimately, when the whole TableGen
infrastructure is set, those numbers should be zero.

llvm-svn: 282274
2016-09-23 18:38:06 +00:00
Matthias Braun 1acb55e67c ScheduleDAG: Match enum names when printing sdep kinds
It is less confusing to have the same names in the debug print as the
enum members.

llvm-svn: 282273
2016-09-23 18:28:31 +00:00
Peter Collingbourne 6e8607527c BitcodeReader: Deduplicate code. NFC.
Differential Revision: https://reviews.llvm.org/D24852

llvm-svn: 282272
2016-09-23 18:27:42 +00:00
Quentin Colombet c13ea88a14 [RegBankSelect] Use DEBUG_TYPE instead of repeating the name of the pass
NFC

llvm-svn: 282267
2016-09-23 17:50:06 +00:00
Quentin Colombet 09a9edcc26 [RegisterBank] Mark the dump method with LLVM_DUMP_METHOD.
NFC

llvm-svn: 282266
2016-09-23 17:50:03 +00:00
Jun Bum Lim 3822939ba7 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
James Molloy 85124c76fc Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r282241. It caused http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19882.

llvm-svn: 282249
2016-09-23 13:35:43 +00:00
Nemanja Ivanovic d2c3c51a70 [Power9] Exploit move and splat instructions for build_vector improvement
This patch corresponds to review:
https://reviews.llvm.org/D21135

This patch exploits the following instructions:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld

In order to improve some build_vector and extractelement patterns.

llvm-svn: 282246
2016-09-23 13:25:31 +00:00
James Molloy 1ce54d6be2 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

llvm-svn: 282241
2016-09-23 12:15:58 +00:00
George Rimar 4f82df52ae Revert r282238 "Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.""
Build bot issues (http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dump-gdbindex.test)
should be fixed in that version. Issue was that MSVS does not support "%zu". Though it works fine on MSCS 2015,
Bot looks running MSVS 2013 that does not like it. MSDN also says that "z" prefix is not supported: https://msdn.microsoft.com/en-us/library/tcxf1dw6.aspx
I had to use PRId64 instead.

Original commit message:

[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.

gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them,
this helps reduce the total size of the object files processed by the linker.

More info about that:
https://gcc.gnu.org/wiki/DebugFission
https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html

Patch teaches dwarfdump tool to dump this section.

Differential revision: https://reviews.llvm.org/D21503

llvm-svn: 282239
2016-09-23 11:01:53 +00:00
George Rimar a348527186 Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section."
It broke BB:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856

llvm-svn: 282238
2016-09-23 10:12:56 +00:00
Alexey Bataev fee9078dcd [InstCombine] Fix for PR29124: reduce insertelements to shufflevector
If inserting more than one constant into a vector:

define <4 x float> @foo(<4 x float> %x) {
  %ins1 = insertelement <4 x float> %x, float 1.0, i32 1
  %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
  ret <4 x float> %ins2
}

InstCombine could reduce that to a shufflevector:

define <4 x float> @goo(<4 x float> %x) {
 %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3>
 ret <4 x float> %shuf
}
Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e.
shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float
undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> ->
insertelement <4 x float> %v, float 1.0, 1

Differential Revision: https://reviews.llvm.org/D24182

llvm-svn: 282237
2016-09-23 09:14:08 +00:00
George Rimar a77bcf5e42 [llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.
gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them,
this helps reduce the total size of the object files processed by the linker.

More info about that:
https://gcc.gnu.org/wiki/DebugFission
https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html

Patch teaches dwarfdump tool to dump this section.

Differential revision: https://reviews.llvm.org/D21503

llvm-svn: 282235
2016-09-23 09:09:26 +00:00
Valery Pykhtin 355103f6c0 [AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions
Differential revision: https://reviews.llvm.org/D24738

llvm-svn: 282234
2016-09-23 09:08:07 +00:00
Craig Topper a02e394872 [AVX-512] Split X86ISD::VFPROUND and X86ISD::VFPEXT into separate opcodes for each type constraint.
This revealed that scalar intrinsics could create nodes with a rounding mode of FROUND_CUR_DIRECTION, but the patterns didn't check for it. It just worked because isel doesn't check operand count and we had a pattern without the rounding mode argument at all.

llvm-svn: 282231
2016-09-23 06:24:43 +00:00
Craig Topper 3174b6e467 [AVX-512] Add separate ISD opcodes for each form of CVT instructions. Don't reuse non-X86 ISD opcodes with extra X86 specific arguments.
llvm-svn: 282230
2016-09-23 06:24:39 +00:00
Craig Topper d70ec9b25e [AVX-512] Use different ISD opcodes for some of the scalar intrinsic lowering. Isel is not very robust against using the same ISD opcode with different number of operands so its better to separate.
llvm-svn: 282229
2016-09-23 06:24:35 +00:00
Kostya Serebryany ce1cab169f [libFuzzer] be more precise about what we reset in TracePC
llvm-svn: 282225
2016-09-23 02:18:59 +00:00
Kostya Serebryany 16a145fd0f [libFuzzer] fix merging with trace-pc-guard
llvm-svn: 282224
2016-09-23 01:58:51 +00:00
Tom Stellard e88bbc34c6 AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size
Reviewers: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D24835

llvm-svn: 282223
2016-09-23 01:33:26 +00:00
Kostya Serebryany 87a598e19f [libFuzzer] simplify the TracePC logic
llvm-svn: 282222
2016-09-23 01:20:07 +00:00
Quentin Colombet ceb71c2f43 [RegisterBankInfo] Mark the dump methods with LLVM_DUMP_METHOD.
NFC

llvm-svn: 282221
2016-09-23 00:59:12 +00:00
Quentin Colombet fd0ab5c660 [AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs.
Make sure the entries written to mimic the behavior of TableGen are
sane.

llvm-svn: 282220
2016-09-23 00:59:07 +00:00
Kostya Serebryany ab73c6924f [libFuzzer] move value profiling logic into TracePC
llvm-svn: 282219
2016-09-23 00:46:18 +00:00
Tom Stellard e190cd2834 Triple: Add opencl environment type
Summary:
For AMDGPU, we have been using the operating system component of the triple
for specifying the low-level runtime that is being used.  The rationale for
this is that the host operating system (e.g. Linux) is irrelevant for GPU code,
since its execution enviroment will be mostly controled by the low-level runtime
being used to execute the code.

In most cases, higher level languages have their own runtime which is
implemented on top of the low-level runtime.  The kernel ABIs of each
language mostly depend on the low-level runtime, but there may be some
slight differences between languages.  OpenCL for example, may append
additional arguments to the kernel in order to pass values like global
offsets or buffers for printf.  OpenMP, HCC, or other languages may want
to add their own values which differ from OpenCL.

The reason for adding a new opencl environment type is to make it possible for the backend
to distinguish between the ABIs of the higher-level languages and handle them correctly.
It seems cleaner to use the enviroment component for this rather than creating a new
OS type for every combination of low-level runtime / high-level language.

Reviewers: Anastasia, chandlerc

Subscribers: whchung, pekka.jaaskelainen, wdng, yaxunl, llvm-commits

Differential Revision: https://reviews.llvm.org/D24735

llvm-svn: 282218
2016-09-23 00:42:56 +00:00
Petr Hosek 2f4ac44dc3 [MC] Support skip and count for .incbin directive
These optional arguments are supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24714

llvm-svn: 282217
2016-09-23 00:41:06 +00:00
Kostya Serebryany d28099de5d [libFuzzer] change ValueBitMap to remember the number of bits in it
llvm-svn: 282216
2016-09-23 00:22:46 +00:00
Quentin Colombet 5b16d931dc [AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping.
Statically instanciate the most common PartialMappings. This should
be closer to what the code would look like when TableGen support is
added for GlobalISel. As a side effect, this should improve compile
time.

llvm-svn: 282215
2016-09-23 00:14:36 +00:00
Quentin Colombet 1946580cf2 [RegisterBankInfo] Check that the mapping covers the interesting bits.
In the verify method of the ValueMapping class we used to check that the
mapping exactly matches the bits of the input value. This is problematic
for statically allocated mappings because we would need a different
mapping for each different size of the value that maps on one
instruction. For instance, with such scheme, we would need a different
mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit
wide instruction.

Therefore, change the verifier to check that the meaningful bits are
covered by the mapping instead of matching them.

llvm-svn: 282214
2016-09-23 00:14:34 +00:00
Quentin Colombet 0afa7d6b82 [RegisterBankInfo] Use array instead of SmallVector for BreakDown.
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.

We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.

llvm-svn: 282213
2016-09-23 00:14:30 +00:00
Kostya Serebryany be0ed59cdc [libFuzzer] simplify the crash minimizer; split MaxLen into two: MaxInputLen and MaxMutationLen, allow MaxMutationLen to be less than MaxInputLen
llvm-svn: 282211
2016-09-22 23:16:36 +00:00
Sanjay Patel 30ef70b090 [InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672)
We already have the udiv variant of this transform, so I think this is ok for 
InstCombine too even though there is an increase in IR instructions. As the 
tests and TODO comments show, the transform can lead to follow-on combines.

This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672

Differential Revision: https://reviews.llvm.org/D24527

llvm-svn: 282209
2016-09-22 22:36:26 +00:00
Davide Italiano fcee2d8001 [AsmParser] Remove unused partial template specialization.
llvm-svn: 282206
2016-09-22 22:02:59 +00:00
Matthias Braun 5f8492e2ce MachineScheduler: Slightly simplify release node
llvm-svn: 282201
2016-09-22 21:39:56 +00:00
Matthias Braun 46533e614b MachineScheduler: Remove ineffective heuristic; NFC
Currently all nodes get added to the NextSU list when they are released,
so any candidate must be in that list, making the heuristic ineffective.
Remove it for now, we can add it back later in a working fashion if
necessary.

llvm-svn: 282200
2016-09-22 21:39:52 +00:00
Hans Wennborg c7957ef86c Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)"
and also the dependent r282175 "GVN-hoist: do not dereference null pointers"

It's causing compiler crashes building Harfbuzz (PR30499).

llvm-svn: 282199
2016-09-22 21:20:53 +00:00
Krzysztof Parzyszek 29e93f3880 [RDF] Add initial support for lane masks in the DFG
Use lane masks for calculating covering and aliasing of register
references.

llvm-svn: 282194
2016-09-22 21:01:24 +00:00
Krzysztof Parzyszek de5fcbaf92 [Hexagon] Remove USR_OVF from CtrRegs register class
USR_OVF is a subregister of USR, which is a member of CtrRegs. Having both
a register and its proper subregister in the same register class has bad
consequences for lane mask calculation: based solely on the lane mask info,
USR_OVF would not appear to be a subregister of USR.

llvm-svn: 282192
2016-09-22 20:59:41 +00:00
Krzysztof Parzyszek 670e0ca24f [RDF] Print the function name for calls in dumps
llvm-svn: 282191
2016-09-22 20:58:19 +00:00
Krzysztof Parzyszek c51f2394a6 [RDF] Use uint32_t for register numbers instead of unsigned
llvm-svn: 282190
2016-09-22 20:56:39 +00:00
Arnold Schwaighofer 0fd32c005b i386 does not support optimized swifterror handling
rdar://28432565

llvm-svn: 282186
2016-09-22 20:06:25 +00:00
Hans Wennborg c4b1d20ba2 Win64: Don't emit unwind info for "leaf" functions (PR30337)
According to MSDN (see the PR), functions which don't touch any callee-saved
registers (including %rsp) don't need any unwind info.

This patch makes LLVM not emit unwind info for such functions, to save
binary size.

Differential Revision: https://reviews.llvm.org/D24748

llvm-svn: 282185
2016-09-22 19:50:05 +00:00
Nemanja Ivanovic 8dacca943a [PowerPC] Sign extend sub-word values for atomic comparisons
Atomic comparison instructions use the sub-word load instruction on
Power8 and up but the value is not sign extended prior to the signed word
compare instruction. This patch adds that sign extension.

llvm-svn: 282182
2016-09-22 19:06:38 +00:00
Nirav Dave 9011da3d44 [DAG] Fix incorrect alignment of ext load.
Correctly use alignment size from loaded size not output value size.

Reviewers: jyknight, tstellarAMD, arsenm

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23356

llvm-svn: 282177
2016-09-22 17:28:43 +00:00
Sebastian Pop 1531f30ccc GVN-hoist: do not dereference null pointers
there may be basic blocks without memory accesses, in which case the
list of accesses is a null pointer.

llvm-svn: 282175
2016-09-22 17:22:58 +00:00
Krzysztof Parzyszek b66efb855c [PPC] Set SP after loading data from stack frame, if no red zone is present
Follow-up to r280705: Make sure that the SP is only restored after all data
is loaded from the stack frame, if there is no red zone.

This completes the fix for https://llvm.org/bugs/show_bug.cgi?id=26519.

Differential Revision: https://reviews.llvm.org/D24466

llvm-svn: 282174
2016-09-22 17:22:43 +00:00
Zachary Turner d5d57635ba Speculative fix for build failures due to consumeInteger.
A recent patch added support for consumeInteger() and made
getAsInteger delegate to this function.  A few buildbots are
failing as a result with an assertion failure.  On a hunch,
I tested what happens if I call getAsInteger() on an empty
string, and sure enough it crashes the same way that the
buildbots are crashing.

I confirmed that getAsInteger() on an empty string did not
crash before my patch, so I suspect this to be the cause.

I also added a unit test for the empty string.

llvm-svn: 282170
2016-09-22 15:55:05 +00:00
Sebastian Pop 8e6e3318c2 GVN-hoist: fix store past load dependence analysis (PR30216)
To hoist stores past loads, we used to search for potential
conflicting loads on the hoisting path by following a MemorySSA
def-def link from the store to be hoisted to the previous
defining memory access, and from there we followed the def-use
chains to all the uses that occur on the hoisting path. The
problem is that the def-def link may point to a store that does
not alias with the store to be hoisted, and so the loads that are
walked may not alias with the store to be hoisted, and even as in
the testcase of PR30216, the loads that may alias with the store
to be hoisted are not visited.

The current patch visits all loads on the path from the store to
be hoisted to the hoisting position and uses the alias analysis
to ask whether the store may alias the load. I was not able to
use the MemorySSA functionality to ask for whether load and
store are clobbered: I'm not sure which function to call, so I
used a call to AA->isNoAlias().

Store past store is still working as before using a MemorySSA
query: I added an extra test to pr30216.ll to make sure store
past store does not regress.

Differential Revision: https://reviews.llvm.org/D24517

llvm-svn: 282168
2016-09-22 15:33:51 +00:00
Sebastian Pop 5d68aa7913 GVN-hoist: fix typo
llvm-svn: 282165
2016-09-22 15:08:09 +00:00
Zachary Turner 65fd2fc7b4 [Support] Add StringRef::consumeInteger.
StringRef::getInteger() exists and treats the entire string as
an integer of the specified radix, failing if any invalid characters
are encountered or the number overflows.

Sometimes you might have something like "123456foo" and you want
to get the number 123456 and leave the string "foo" remaining.
This is similar to what would be possible by using the standard
runtime library functions strtoul et al and specifying an end
pointer.

This patch adds consumeInteger(), which does exactly that.  It
consumes as much as possible until an invalid character is found,
and modifies the StringRef in place so that upon return only
the portion of the StringRef after the number remains.

Differential Revision: https://reviews.llvm.org/D24778

llvm-svn: 282164
2016-09-22 15:05:19 +00:00
Etienne Bergeron 7f0e315327 [compiler-rt] fix typo in option description [NFC]
llvm-svn: 282163
2016-09-22 14:57:24 +00:00
Sebastian Pop 440f15b7fc GVN-hoist: only hoist relevant scalar instructions
Without this patch, GVN-hoist would think that a branch instruction is a scalar instruction
and would try to value number it. The patch filters out all such kind of irrelevant instructions.

A bit frustrating is that there is no easy way to discard all those very infrequent instructions,
a bit like isa<TerminatorInst> that stands for a large family of instructions. I'm thinking that
checking for those very infrequent other instructions would cost us more in compilation time
than just letting those instructions getting numbered, so I'm still thinking that a simpler check:

  if (isa<TerminatorInst>(I))
    return false;

is better than listing all the other less frequent instructions.

Differential Revision: https://reviews.llvm.org/D23929

llvm-svn: 282160
2016-09-22 14:45:40 +00:00
Keith Walker ba1598975f Reapplying r281895 (and follow-up r281964) after fixing pr30468.
The additional fix is:

When adding debug information to a lowered phi node in mem2reg
check that we have a valid insertion point after the phi for adding
the debug information.

This change addresses the issue in pr30468 where a lowered phi was
added before a catchswitch and no debug information should be added
after the phi in this case.

Differential Revision: https://reviews.llvm.org/D24797

llvm-svn: 282155
2016-09-22 14:13:25 +00:00
Tim Northover a5e38fa00d GlobalISel: handle stack-based parameters on AArch64.
llvm-svn: 282153
2016-09-22 13:49:25 +00:00
Anna Thomas 82c3717f54 [RS4GC] Remat in presence of phi and use live value
Summary:

Reviewers:

Subscribers:

llvm-svn: 282150
2016-09-22 13:13:06 +00:00
Artem Tamazov 2146a0a90e [AMDGPU][mc] Add support for absolute expressions in DPP modifiers.
Also added range checking for DPP attributes.
Assembler tests added as well.

Differential Revision: https://reviews.llvm.org/D24755

llvm-svn: 282145
2016-09-22 11:47:21 +00:00
Nemanja Ivanovic e78ffede6f [PowerPC] Remove LE patterns matching generic stores/loads to VSX permuting ops
This patch corresponds to:
https://reviews.llvm.org/D21409

The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords
in the vector register when in little-endian mode. Custom code ensures that the
necessary swaps are inserted for these. This patch simply removes the possibilty
that a load/store node will match one of these instructions in the SDAG as that
would not insert the necessary swaps.

llvm-svn: 282144
2016-09-22 10:32:03 +00:00
Nemanja Ivanovic 6e7879c5e6 [Power9] Add exploitation of non-permuting memory ops
This patch corresponds to review:
https://reviews.llvm.org/D19825

The new lxvx/stxvx instructions do not require the swaps to line the elements
up correctly. In order to select them over the lxvd2x/lxvw4x instructions which
require swaps, the patterns for the old instruction have a predicate that
ensures they won't be selected on Power9 and newer CPUs.

llvm-svn: 282143
2016-09-22 09:52:19 +00:00
Sagar Thakur e74eb4e7b9 [EfficiencySanitizer] Using '$' instead of '#' for struct counter name
For MIPS '#' is the start of comment line. Therefore we get assembler errors if # is used in the structure names.

Differential: D24334
Reviewed by: zhaoqin

llvm-svn: 282141
2016-09-22 08:33:06 +00:00
Dorit Nuzman d1247a684e Fix revision 281960
llvm-svn: 282139
2016-09-22 07:56:23 +00:00
Craig Topper 202b453a8a [AVX-512] Add support for commuting VPTERNLOG instructions.
VPTERNLOG is a ternary instruction with an immediate specifying the logical operation to perform. For each bit position in the 3 source vectors the bit from each source is concatenated together and the resulting 3-bit value is used to select a bit in the immediate. This bit value is written to the result vector.

We can commute this by swapping operands and modifying the immediate. To modify the immediate we need to swap two pairs of bits. The pairs correspond to the locations in the immediate where the commuted operands bits have opposite values and the uncommuted operand has the same value. Bits 0 and 7 will never be swapped since the relevant bits from all sources are the same value.

This refactors and reuses parts of the FMA3 commuting code which is also a three operand instruction.

llvm-svn: 282132
2016-09-22 03:00:50 +00:00
Quentin Colombet 6a76323c64 [RegisterBankInfo] Move to statically allocated RegisterBank.
This commit is basically the first step toward what will
RegisterBankInfo look when it gets TableGen'ed.

It introduces a XXXGenRegisterBankInfo.def file that is what TableGen
will issue at some point. Moreover, the RegBanks field in
RegisterBankInfo changed to reflect the static (compile time) aspect of
the information.

llvm-svn: 282131
2016-09-22 02:10:37 +00:00
Quentin Colombet b202eb88aa [RegisterBankInfo] Take advantage of the extra argument of SmallVector::resize.
When initializing an instance of OperandsMapper, instead of using
SmallVector::resize followed by std::fill, use the function that
directly does that in SmallVector.

llvm-svn: 282130
2016-09-22 02:10:32 +00:00
Kostya Serebryany 624f59f4d8 [libFuzzer] add 'features' to the corpus elements, allow mutations with Size > MaxSize, fix sha1 in corpus stats; various refactorings
llvm-svn: 282129
2016-09-22 01:34:58 +00:00
Kostya Serebryany c9e3de35ed [libFuzzer] one more test
llvm-svn: 282127
2016-09-22 00:57:29 +00:00
Kostya Serebryany 29bb664075 [libFuzzer] add stats to the corpus; more refactoring
llvm-svn: 282121
2016-09-21 22:42:17 +00:00
Kostya Serebryany 20801e1b8a [libFuzzer] more refactoring; don't compute sha1sum every time we mutate a unit from the corpus, use the stored one.
llvm-svn: 282115
2016-09-21 21:41:48 +00:00
Kostya Serebryany 8658618ea0 [libFuzzer] more refactoring
llvm-svn: 282113
2016-09-21 21:17:23 +00:00
Kevin Enderby e71e13c7d6 Next set of additional error checks for invalid Mach-O files for bad LC_UUID
load commands.  Added a missing check and made the check for more than
one like other other “more than one” checks.  And of course added test cases.

llvm-svn: 282104
2016-09-21 20:03:09 +00:00
Chad Rosier 00eb8db3a1 [LoopInterchange] Track all dependencies, not just anti dependencies.
Currently, we give up on loop interchange if we encounter a flow dependency
anywhere in the loop list. Worse yet, we don't even track output dependencies.

This patch updates the dependency matrix computation to track flow and output
dependencies in the same way we track anti dependencies.

This improves an internal workload by 2.2x.

Note the loop interchange pass is off by default and it can be enabled with
'-mllvm -enable-loopinterchange'

Differential Revision: https://reviews.llvm.org/D24564

llvm-svn: 282101
2016-09-21 19:16:47 +00:00
Teresa Johnson 3f212b8908 [ThinLTO] Emit files for distributed builds for all modules
With the new LTO API in r278338, we stopped emitting the individual
index files and imports files for some modules in the distributed backend
case (thinlto-index-only plugin option).

Specifically, this is when the linker decides not to include a module in the
link, because it was in an archive library and did not have a strong
reference to it. Not creating the expected output files makes the
distributed build system implementation more difficult, in terms of
checking for the expected outputs of the thin link, and scheduling the
backend jobs. To address this, the gold-plugin will write dummy empty
.thinlto.bc and .imports files for modules not included in the link
(which LTO never sees).

Augmented a gold v1.12+ test, since that version of gold has the handling
for notifying on modules not being included in the link.

llvm-svn: 282100
2016-09-21 19:12:05 +00:00
Davide Italiano 0f8b5d65f6 [MIRParser] Delete dead code. NFCI.
llvm-svn: 282098
2016-09-21 18:26:08 +00:00
Nico Weber a489438849 revert 281908 because 281909 got reverted
llvm-svn: 282097
2016-09-21 18:25:43 +00:00
Arnold Schwaighofer de2490d0dc Disable tail calls if there is an swifterror argument
ISel does not handle them correctly yet i.e we crash trying to emit tail call
code.

radar://28407842

llvm-svn: 282088
2016-09-21 16:53:36 +00:00
Matthew Simpson 15869f86d8 [LV] Don't emit unused scalars for uniform instructions
If we identify an instruction as uniform after vectorization, we know that we
should only use the value corresponding to the first vector lane of each unroll
iteration. However, when scalarizing such instructions, we still produce values
for the other vector lanes. This patch prevents us from generating the unused
scalars.

Differential Revision: https://reviews.llvm.org/D24275

llvm-svn: 282087
2016-09-21 16:50:24 +00:00
Artem Tamazov 2e217b87cb [AMDGPU][mc] Add support for ds_add_[rtn_]f32.
Lit tests added.
Resolves https://github.com/RadeonOpenCompute/hcc/issues/122.

Differential Revision: https://reviews.llvm.org/D24765

llvm-svn: 282086
2016-09-21 16:35:44 +00:00
Dehao Chen 160fbc3f95 Change the basic block weight calculation algorithm to use max instead of voting.
Summary: Now that we have more precise debug info, we should change back to use maximum to get basic block weight.

Reviewers: dnovillo

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D24788

llvm-svn: 282084
2016-09-21 16:26:51 +00:00
Matthew Simpson a95e8bb7ed [LV] Rename "Width" to "Lane" (NFC)
llvm-svn: 282083
2016-09-21 16:09:23 +00:00
Hans Wennborg 1049085c78 Revert r281895 "Add @llvm.dbg.value entries for the phi node created by -mem2reg"
(And follow-up r281964.)

It caused PR30468.

llvm-svn: 282077
2016-09-21 15:55:53 +00:00
Nico Weber 903859c0e4 Revert r281715, it caused PR30475
llvm-svn: 282076
2016-09-21 15:33:24 +00:00
Arnold Schwaighofer f62ba1031f DeadArgElim: Don't mark swifterror arguments as unused
Replacing swifterror arguments with undef creates invalid IR.

rdar://28300490

llvm-svn: 282075
2016-09-21 15:29:08 +00:00
Chad Rosier f7c76f91e0 [LoopInterchange] Various cleanup. NFC.
llvm-svn: 282071
2016-09-21 13:28:41 +00:00
Tim Northover 9a46718378 GlobalISel: produce correct code for signext/zeroext ABI flags.
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.

llvm-svn: 282069
2016-09-21 12:57:45 +00:00
Tim Northover 862758ec14 GlobalISel: pass Function to lowerFormalArguments directly (NFC).
The only implementation that exists immediately looks it up anyway, and the
information is needed to handle various parameter attributes (stored on the
function itself).

llvm-svn: 282068
2016-09-21 12:57:35 +00:00
Sam Kolton 12b633beda [AMDGPU] Assembler: remove unused AMDGPUMCObjectWriter.
Summary: It is replaced by AMDGPUELFObjectWriter

Reviewers: tstellarAMD, vpykhtin, artem.tamazov

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl

Differential Revision: https://reviews.llvm.org/D24654

llvm-svn: 282065
2016-09-21 10:33:32 +00:00
Simon Dardis 9a66bbecae [mips] LLVM PR/30197 - Tail call incorrectly clobbers arguments for mips
The postRA scheduler performs alias analysis to determine if stores and loads
can moved past each other. When a function has more arguments than argument
registers for the calling convention used, excess arguments are spilled onto the
stack. LLVM by default assumes that argument slots are immutable, unless the
function contains a tail call. Without the knowledge of that a function contains
a tail call site, stores and loads to fixed stack slots may be re-ordered
causing the out-going arguments to clobber the incoming arguments before the
incoming arguments are supposed to be dead.

Reviewers: vkalintiris

Differential Review: https://reviews.llvm.org/D24077

llvm-svn: 282063
2016-09-21 09:43:40 +00:00
Diana Picus 2a3f066349 Revert "AArch64: Set shift bit of TLSLE HI12 add instruction"
This reverts commit r282057 because it broke the buildbots - see e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-42vma/builds/12063

llvm-svn: 282058
2016-09-21 08:24:41 +00:00
Lei Liu 6c87f23526 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282057
2016-09-21 07:41:41 +00:00
Craig Topper 29f1a1f834 [AVX-512] Split the 3 different usages of the X86ISD::FSETCC opcode into 3 different opcodes.
It turns out isel is really not robust against having different type profiles for the same opcode. It turns out that if you put an illegal rounding mode(i.e. not CUR_DIRECTION or NO_EXC) on a comiss intrinsic we would generate the FSETCC form with the rounding mode added, but then pattern match to an instruction with ROUND_CUR_DIRECTION.

We can probably get away with just one FSETCCM opcode that always contains the rounding mode and explicitly put ROUND_CUR_DIRECTION in the pattern, but I'll leave that for future work.

With this change the clang tests for the comiss intrinsics that used an incorrect rounding mode of 3 properly fail isel instead of silently doing the wrong thing. Those clang tests will be fixed in a follow up commit and I also plan to add rounding mode checking to clang.

llvm-svn: 282055
2016-09-21 06:37:54 +00:00
Craig Topper d868870f17 [AVX-512] Don't add an additional rounding mode operand to the avx512 vcvtps2ph intrinsic lowering.
There was no way to control its value so it was always FROUND_CURRENT making it unnecessary. The true rounding mode is encoded in the immediate operand of the instruction.

This also removes the pattern from the rb form of the instructions since there is no way to specify the FROUND_NO_EXC rounding mode it required.

llvm-svn: 282052
2016-09-21 03:58:44 +00:00
Craig Topper a27f54b4d9 [AVX-512] Simplify handling of INTR_TYPE_1OP_MASK_RM to remove support for the second opcode since its never used. This makes it consistent with INTR_TYPE_2OP_MASK_RM and INTR_TYPE_3OP_MASK_RM.
And even if it was used we were passing the same operands to both so it wouldn't make sense to have two opcodes.

llvm-svn: 282051
2016-09-21 03:58:41 +00:00
Kostya Serebryany 225d8e45d4 [libFuzzer] fix libc++ build
llvm-svn: 282050
2016-09-21 03:50:37 +00:00
Adam Nemet e3cef93727 [LV] When reporting about a specific instruction without debug location use loop's
This can occur for example if some optimization drops the debug location.

llvm-svn: 282048
2016-09-21 03:14:20 +00:00
Kostya Serebryany 556894fb10 [libFuzzer] more refactoring; NFC
llvm-svn: 282047
2016-09-21 02:05:39 +00:00
Craig Topper e18258dc1c [AVX-512] Don't lower avx512 vcvtps2ph/vcvtph2ps nodes to ISD::FP16_TO_FP/ISD::FP_TO_FP16 with an extra x86 specific rounding mode operand. We should use a target specific ISD opcode.
llvm-svn: 282046
2016-09-21 02:05:22 +00:00
Jacques Pienaar 98345fc0a1 [NVPTX] Check if callsite is defined when computing argument allignment
Summary: In getArgumentAlignment check if the ImmutableCallSite pointer CS is non-null before dereferencing. If CS is 0x0 fall back to the ABI type alignment else compute the alignment as before.

Reviewers: eliben, jpienaar

Subscribers: jlebar, vchuravy, cfe-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D9168

llvm-svn: 282045
2016-09-21 01:57:57 +00:00
Kostya Serebryany 6f5a804cdb [libFuzzer] refactoring: split the large header into many; NFC
llvm-svn: 282044
2016-09-21 01:50:50 +00:00
Kostya Serebryany 09aa01a6f8 [libFuzzer] refactoring: move the Corpus into a separate class; delete two unused experimental features
llvm-svn: 282042
2016-09-21 01:04:43 +00:00
Michael Kuperstein 79dcc274b4 [InferAttributes] Don't access parameters that don't exist.
Check for the correct number of parameters before querying their type.
This fixes PR30455.

llvm-svn: 282038
2016-09-20 23:10:31 +00:00
Teresa Johnson 620c140a9b [ThinLTO] Always emit a summary when compiling in ThinLTO mode
Summary:
Emit an empty summary section, instead of no summary section, when
there are no global variables in the index. This ensures that LTO
will treat these files as ThinLTO inputs, instead of as regular
LTO inputs.

In addition to not being what the user likely intended when
compiling with -flto=thin, the current behavior is problematic for
distributed build systems that expect to get ThinLTO index and imports
files back for each input compiled with -flto=thin. Combining into
a single regular LTO module also reduces the backend parallelism.
And in the case where the index was suppressed due to uses in
inline assembly, combining into a single LTO module could provoke
renaming of duplicates that we were trying to prevent by suppressing
the index.

This change required a couple of fixes to handle the empty summary
section.

Reviewers: mehdi_amini

Subscribers: mehdi_amini, llvm-commits, pcc

Differential Revision: https://reviews.llvm.org/D24779

llvm-svn: 282037
2016-09-20 23:07:17 +00:00
Xinliang David Li 9780fc1451 code cleanup -- commoning IR travsersals
llvm-svn: 282034
2016-09-20 22:39:47 +00:00
Eric Christopher 5653e5dffc Remove the default subtarget from the x86 port as it isn't necessary (or
correct) anymore.

llvm-svn: 282031
2016-09-20 22:19:33 +00:00
Eric Christopher c4636b3002 Revert "Remove extra argument used once on
TargetMachine::getNameWithPrefix and inline the result into the singular
caller." and "Remove more guts of TargetMachine::getNameWithPrefix and
migrate one check to the TLOF mach-o version." temporarily until I can
get the whole call migrated out of the TargetMachine as we could hit
places where TLOF isn't valid.

This reverts commits r281981 and r281983.

llvm-svn: 282028
2016-09-20 22:03:28 +00:00
Anna Thomas 8cd7de1d18 [RS4GC] Refactor code for Rematerializing in presence of phi. NFC
Summary:
This is an NFC refactoring change as a precursor to the actual fix for rematerializing in
presence of phi.
https://reviews.llvm.org/D24399

Pasted from review:
findRematerializableChainToBasePointer changed to return the root of the
chain. instead of true or false.
move the PHI matching logic into the caller by inspecting the root return value.
This includes an assertion that the alternate root is in the liveset for the
call.

Tested with current RS4GC tests.

Reviewers: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24780

llvm-svn: 282023
2016-09-20 21:36:02 +00:00
George Burgess IV b44b6d98b3 [CodeGen] stop short-circuiting the SSP code for sspstrong.
This check caused us to skip adding layout information for calls to
alloca in sspreq/sspstrong mode. We check properly for sspstrong later
on (and add the correct layout info when doing so), so removing this
shouldn't hurt.

No test is included, since testing this using lit seems to require
checking for exact offsets in asm, which is something that the lit tests
for this avoid. If someone cares deeply, I'm happy to write a unittest
or something to cover this, but that feels like overkill.

Patch by Daniel Micay.

Differential Revision: https://reviews.llvm.org/D22714

llvm-svn: 282022
2016-09-20 21:30:01 +00:00
Petr Hosek 1290355f5a Mark ELF sections whose name start with .note as note
Previously, such section would be marked as SHT_PROGBITS which
makes it impossible to use an initialized C variable declaration
to emit an (allocated) ELF note. The new behavior is also consistent
with ELF assembly parser.

Differential Revision: https://reviews.llvm.org/D24692

llvm-svn: 282010
2016-09-20 20:21:13 +00:00
Xinliang David Li c7368287b7 [Profile] Do not annotate select insts not covered in profile.
Fixed PR/30466

llvm-svn: 282009
2016-09-20 20:20:01 +00:00
Kevin Enderby fc0929adb8 Next set of additional error checks for invalid Mach-O files for bad load commands
that use the Mach::dylib_command type for the load commands that are
currently used in the MachOObjectFile constructor.

This contains the missing checks for LC_ID_DYLIB, LC_ID_DYLIB, etc.
load commands and the fields for the Mach::dylib_command type.

Also checks that only an MH_DYLIB or MH_STUB_DYLIB has an
LC_ID_DYLIB load command (and others filetype don’t) and there
is not more than one of these load commands.

llvm-svn: 282008
2016-09-20 20:14:14 +00:00
Xinliang David Li a754c47ac5 [Profile] code refactoring: make getStep a method in base class
llvm-svn: 282002
2016-09-20 19:07:22 +00:00
Evandro Menezes 9b5d89513b Revert part of "AArch64: Do not test for CPUs, use SubtargetFeatures"
This reverts part of commit 119e358d9635c8d1f3e7aee67e3ea3b8a62f8db6 by
removing FeatureUseRSqrt et al per request by Eric Christopher
<echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282001
2016-09-20 19:02:09 +00:00
Evandro Menezes ba4926efde Revert "[AArch64] Use the reciprocal estimation machinery"
This reverts commit b7d42b0048f65346e9fa37fb65defeea7ce8c337 per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 282000
2016-09-20 19:02:06 +00:00
Evandro Menezes 61a1273d27 Revert "[AArch64] Properly validate the reciprocal estimation."
This reverts commit ad8ca1528242e2a4cb363e3779309e70eb7a430e per request by
Eric Christopher <echristo@gmail.com> (v. http://bit.ly/2cmz6kW).

llvm-svn: 281999
2016-09-20 19:02:02 +00:00
Adrian Prantl 12fa3b3911 ASAN: Don't drop debug info attachements for global variables.
This is a follow-up to r281284. Global Variables now can have
!dbg attachements, so ASAN should clone these when generating a
sanitized copy of a global variable.

<rdar://problem/24899262>

llvm-svn: 281994
2016-09-20 18:28:42 +00:00
Adrian McCarthy ad8ac54d10 Fix syntactical nit from r281990.
llvm-svn: 281991
2016-09-20 17:42:13 +00:00
Adrian McCarthy c64acfd4c2 Emit S_COMPILE3 CodeView record
CodeView has an S_COMPILE3 record to identify the compiler and source language of the compiland.  This record comes first in the debug$S section for the compiland. The debuggers rely on this record to know the source language of the code.

There was a little test fallout from introducing a new record into the symbols subsection.

Differential Revision: https://reviews.llvm.org/D24317

llvm-svn: 281990
2016-09-20 17:20:51 +00:00
Saleem Abdulrasool 03ffa797ad X86: loosen an overly aggressive MachO assertion
We would assert that the FP setup CFI used esp/rsp always.  This held up in
practice when the code was generated from IR.  However, with the integrated
assembler, it is possible to have the input be user specified assembly.  In such
a case, we cannot assume that the function implementation has a compact unwind
representation.  Loosen the assertion into a check and bail if we cannot
represent the frame pointer in the compact unwinding.

Addresses PR30453!

llvm-svn: 281986
2016-09-20 17:05:04 +00:00
Eric Christopher a1ccdc3433 Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version.
NFC intended.

llvm-svn: 281983
2016-09-20 16:05:02 +00:00
Eric Christopher ef579d2195 Remove a use of subtarget initialization in the X86 backend so we can get rid of the default subtarget.
NFC intended.

llvm-svn: 281982
2016-09-20 16:04:59 +00:00
Eric Christopher 0be7793d75 Remove extra argument used once on TargetMachine::getNameWithPrefix and inline the result into the singular caller.
llvm-svn: 281981
2016-09-20 16:04:50 +00:00
Keith Walker f83a19ffc2 Improve the -debug output for Debug Range Extension (NFC)
Include header messages and remove unnecessary blank lines.

llvm-svn: 281980
2016-09-20 16:04:31 +00:00
Tim Northover b18ea162df GlobalISel: split aggregates for PCS lowering
This should match the existing behaviour for passing complicated struct and
array types, in particular HFAs come through like that from Clang.

For C & C++ we still need to somehow support all the weird ABI flags, or at
least those that are present in the IR (signext, byval, ...), and stack-based
parameter passing.

llvm-svn: 281977
2016-09-20 15:20:36 +00:00
Sanjay Patel b2332e1931 move variables closer to their uses; add FIXMEs; NFC
llvm-svn: 281972
2016-09-20 14:36:14 +00:00
Elena Demikhovsky d3ff7c288b AVX-512: Fixed a bug in lowering saturated operations on KNL.
The generated code is still not optimal.

Differential Revision: https://reviews.llvm.org/D24723

llvm-svn: 281966
2016-09-20 11:02:26 +00:00
Valery Pykhtin e330cfa294 [AMDGPU] Refactor VOP3 instruction TD definitions
Differential revision: https://reviews.llvm.org/D24664

llvm-svn: 281965
2016-09-20 10:41:16 +00:00
Keith Walker 22b5dbc8bf Make llvm::ConvertDebugDeclareToDebugValue() be a void function (NFC)
The routines llvm::ConvertDebugDeclareToDebugValue() always returned
a true value which was never checked at the call site; change the
function return type to void.

This NFC cleanup was approved in the review https://reviews.llvm.org/D23715

llvm-svn: 281964
2016-09-20 10:36:17 +00:00
Dorit Nuzman 02efef0525 Reverting revision 281960 due to test failures.
llvm-svn: 281961
2016-09-20 08:27:48 +00:00
Dorit Nuzman d3686e5269 [SROA] Preserve llvm.mem.parallel_loop_access metadata.
SROA doesn't preserve the llvm.mem.parallel_loop_access metadata when it
transforms loads/stores. This patch fixes a couple occurences of this
issue.

(Partially addresses PR28981).

Differential Revision: https://reviews.llvm.org/D23549

llvm-svn: 281960
2016-09-20 07:50:49 +00:00
Craig Topper 67882bd94e [AVX-512] Teach X86InstrInfo::copyPhysReg to use a 512-bit move if XMM16-XMM31 or YMM16-YMM31 are the source or dest of the copy and VLX is not supported.
This can happen with SUBREG_TO_REG of ZMM16-ZMM31. Fixes PR30430.

llvm-svn: 281959
2016-09-20 06:49:17 +00:00
Craig Topper 9820e341f9 [AVX-512] Use 512-bit vcvtps2ph/vcvtph2ps to implement fp_to_f16/f16_to_fp when F16C and VLX are not supported.
Fixes PR23941.

llvm-svn: 281958
2016-09-20 05:44:47 +00:00
Matthias Braun 4040c0f4ec BranchFolder: Fix invalid undef flags after merge.
It is legal to merge instructions with different undef flags; However we
must drop the undef flag from the merged instruction if it isn't present
everywhere.

This fixes http://llvm.org/PR30199

llvm-svn: 281957
2016-09-20 01:14:42 +00:00
Sanjay Patel e97f7947b1 [x86] fix variable names; NFC
llvm-svn: 281953
2016-09-20 00:27:22 +00:00
Kostya Serebryany 06694d0a2f [sanitizer-coverage] add comdat to coverage guards if needed
llvm-svn: 281952
2016-09-20 00:16:54 +00:00
Philip Reames b1472ffed7 [LCSSA] Cache LoopExits to avoid wasted work
When looking at the scribus_1.3 example from https://llvm.org/bugs/show_bug.cgi?id=10584, I noticed that we were spending a large amount of time computing loop exits in LCSSA. This code appears to be written with the assumption that LoopExits are stored in the Loop and thus cheap to query. This is not true, so we should cache the result across the potentially long running loop which tends to visit a small handful of Loops.

On the particular example from 10584, this change drops the time spent in LCSSA computation by about 80%.

Differential Revision: https://reviews.llvm.org/D24509

llvm-svn: 281949
2016-09-19 23:30:23 +00:00
Quentin Colombet ae273b6ba2 [RegisterBankInfo] Adapt call to std::fill due to use of SmallVector.
This was meant to be commited with my previous commit.

llvm-svn: 281948
2016-09-19 23:18:47 +00:00
David Callahan c165a4e215 Merge branch 'ADCE5'
llvm-svn: 281947
2016-09-19 23:17:58 +00:00
Sanjay Patel 0fa3365923 [x86] use getSignBit() to simplify code; NFCI
llvm-svn: 281944
2016-09-19 22:07:27 +00:00
Mehdi Amini 7cf6382657 BitcodeWriter: fix emission of invoke when calling a var-arg function with operand bundles
llvm-svn: 281940
2016-09-19 21:27:04 +00:00
Kostya Serebryany 3750c04f7e [libFuzzer] use sleep() instead of std::this_thread::sleep_for to avoid coverage from instrumented libc++
llvm-svn: 281933
2016-09-19 20:32:34 +00:00
Dehao Chen 20866ed57e Handle early inline for hot callsites that reside in the same basic block.
Summary: Callsites in the same basic block should share the same hotness. This patch checks for the hottest callsite in the same basic block, and use the hotness for all callsites in that basic block for early inline decisions. It also fixes the test to add "-S" so theat the "CHECK-NOT" is actually checking the content.

Reviewers: dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24734

llvm-svn: 281927
2016-09-19 18:38:14 +00:00
Quentin Colombet a7330ba5ac [RegisterBankInfo] Avoid heap allocation in most cases.
The OperandsMapper class is used heavy in RegBankSelect and each
instantiation triggered a heap allocation for the array of operands.
Instead, use a SmallVector with a big enough size such that most of the
cases do not have to use dynamically allocated memory.

This improves the compile time of the RegBankSelect pass.

llvm-svn: 281916
2016-09-19 17:33:55 +00:00
Matthias Braun e1d9628bba LiveRangeCalc: Fix reporting of invalid vreg usage in liveness calculation
Machine programs need a definition of each vreg before reaching a use
(the definition may come from an IMPLICIT_DEF instruction). This class
of errors is not detected by the MachineVerifier because of efficiency
concerns. LiveRangeCalc used to report these problems, make it do that
again (followup to r279625).

Also use report_fatal_error() instead of llvm_unreachable() as the error
reporting is only present in asserts build anyway.

llvm-svn: 281914
2016-09-19 16:49:45 +00:00
Dehao Chen 82667d04c5 Only set branch weight during sample pgo annotation when max_weight of the branch is non-zero. Otherwise use default static profile to set branch probability.
Summary: It does not make sense to set equal weights for all unkown branches as we have static branch prediction available.

Reviewers: dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24732

llvm-svn: 281912
2016-09-19 16:33:41 +00:00
Dehao Chen 38e3731c47 Use call target count to derive the call instruction weight
Summary: The call target count profile is directly derived from LBR branch->target data. This is more reliable than instruction frequency profiles that could be moved across basic block boundaries. This patches uses call target count profile to annotate call instructions.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24410

llvm-svn: 281911
2016-09-19 16:06:37 +00:00
Etienne Bergeron 6ba5176862 [asan] Support dynamic shadow address instrumentation
Summary:
This patch is adding the support for a shadow memory with
dynamically allocated address range.

The compiler-rt needs to export a symbol containing the shadow
memory range.

This is required to support ASAN on windows 64-bits.

Reviewers: kcc, rnk, vitalybuka

Subscribers: kubabrecka, dberris, llvm-commits, chrisha

Differential Revision: https://reviews.llvm.org/D23354

llvm-svn: 281908
2016-09-19 15:58:38 +00:00
Valery Pykhtin 2828b9be1e [AMDGPU] Refactor VOPC instruction TD definitions
Differential Revision: https://reviews.llvm.org/D24546

llvm-svn: 281903
2016-09-19 14:39:49 +00:00
Diana Picus a53660e4a3 [AArch64] Fix encoding for lsl #12 in add/sub immediates
Whenever an add/sub immediate needs a fixup, we set that immediate field to zero,
which is correct, but we also set the shift bits to zero, which is not true for
instructions that use lsl #12. This patch makes sure that if lsl #12 was used,
it will appear in the encoding of the instruction.

Differential Revision: https://reviews.llvm.org/D23930

llvm-svn: 281898
2016-09-19 11:10:18 +00:00
Sam Kolton be7ffb90bf [AMDGPU] Fix s_branch with -1 offset
Summary:
In case s_branch instruction target is itself backend should emit offset -1 but instead it emit 0.
'''
label:
    s_branch label  // should emit [0xff,0xff,0x82,0xbf]
'''

Tom, Matt: why are we adjusting fixup values in applyFixup() method instead of processFixup()? processFixup() is calling adjustFixupValue() but does nothing with its result.

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl

Differential Revision: https://reviews.llvm.org/D24671

llvm-svn: 281896
2016-09-19 10:20:55 +00:00
Keith Walker c941252374 Add @llvm.dbg.value entries for the phi node created by -mem2reg
When phi nodes are created in the -mem2reg phase, the @llvm.dbg.declare
entries are converted to @llvm.dbg.value entries at the place where the
store instructions existed. However no entry is created to describe
the resulting value of the phi node.

The effect of this is especially noticeable in for loops which have a
constant for the intial value; the loop control variable's location
would be described as the intial constant value in the loop body once
the -mem2reg optimization phase was run.

This change adds the creation of the @llvm.dbg.value entries to describe
variables whose location is the result of a phi node created in -mem2reg.

Also when the phi node is finally lowered to a machine instruction it
is important that the lowered "load" instruction is placed before the
associated DEBUG_VALUE entry describing the value loaded.

Differential Revision: https://reviews.llvm.org/D23715

llvm-svn: 281895
2016-09-19 09:49:30 +00:00
Oliver Stannard e1f6dc59ce [Thumb] Set correct initial mapping symbol for big-endian thumb
The initial mapping symbol state is set from the triple, but we only checked
for the little-endian thumb triple, so could end up with an ARM mapping symbol
for big-endian thumb.

Differential Revision: https://reviews.llvm.org/D24553

llvm-svn: 281894
2016-09-19 09:21:45 +00:00
Tim Northover eaee28b5ca ARM: check alignment before transforming ldr -> ldm (or similar).
ldm and stm instructions always require 4-byte alignment on the pointer, but we
weren't checking this before trying to reduce code-size by replacing a
post-indexed load/store with them. Unfortunately, we were also dropping this
incormation in DAG ISel too, but that's easy enough to fix.

llvm-svn: 281893
2016-09-19 09:11:09 +00:00
James Molloy 0efb96a8ee [SimplifyCFG] Update (AND) IR flags when CSE'ing instructions
We were updating metadata but not IR flags. Because we pick an arbitrary instruction to be the CSE candidate, it comes down to luck (50% or less chance) if this results in broken codegen or not, which is why PR30373 which is actually not the fault of the commit it was bisected down to.

Fixes PR30373.

llvm-svn: 281889
2016-09-19 08:23:08 +00:00
Craig Topper 61403201ea [X86,AVX-512] Use INSERT_SUBREG instead of SUBREG_TO_REG when the input is not the output of an instruction.
SUBREG_TO_REG is supposed to indicate that the super register has been zeroed, but we can't prove that if we don't know where it came from.

llvm-svn: 281885
2016-09-19 02:53:43 +00:00
Craig Topper b3b5033179 [AVX-512] Add support for lowering fp_to_f16 and f16_to_fp when VLX is supported regardless of whether F16C is also supported.
Still need to add support for lowering using AVX512F when neither VLX or F16C is supported.

llvm-svn: 281884
2016-09-19 02:53:37 +00:00
Dean Michael Berris 4640154446 [XRay] ARM 32-bit no-Thumb support in LLVM
This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter.
This is one of 3 commits to different repositories of XRay ARM port. The other 2 are:

https://reviews.llvm.org/D23932 (Clang test)
https://reviews.llvm.org/D23933 (compiler-rt)

Differential Revision: https://reviews.llvm.org/D23931

llvm-svn: 281878
2016-09-19 00:54:35 +00:00
Dehao Chen 41cde0b986 Handle Invoke during sample profiler annotation: make it inlinable.
Summary: Previously we reline on inst-combine to remove inlinable invoke instructions. This causes trouble because a few extra optimizations are schedule early that could introduce too much CFG change (e.g. simplifycfg removes too much control flow). This patch handles invoke instruction in-place during sample profile annotation, so that we do not rely on instcombine to remove those invoke instructions.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24409

llvm-svn: 281870
2016-09-18 23:11:37 +00:00
Craig Topper af5ee86bc9 [AVX-512] Don't lower CVTPD2PS intrinsics to ISD::FP_ROUND with an X86 rounding mode encoding in the second operand. This immediate should only be 0 or 1 and indicates if the truncation loses precision.
Also enhance an assert in SelectionDAG::getNode to flag this sort of problem in the future.

llvm-svn: 281868
2016-09-18 21:49:32 +00:00
Craig Topper c26cd68422 [AVX-512] Stop lowering avx512_mask_sqrt intrinsics to ISD:FSQRT with a second operand containing an X86 specific rounding mode encoding that doesn't belong.
llvm-svn: 281867
2016-09-18 21:49:28 +00:00
Kostya Serebryany b706b481ba [libFuzzer] add -print_coverage=1 flag to print coverage directly from libFuzzer
llvm-svn: 281866
2016-09-18 21:47:08 +00:00
Simon Pilgrim f33a6b713b Fix covered-switch-default warning
llvm-svn: 281865
2016-09-18 21:08:35 +00:00
Craig Topper cc03165d3f [X86] Fix typo in comment. NFC
llvm-svn: 281862
2016-09-18 18:59:38 +00:00
Craig Topper 8542041bb2 [AVX-512] Add memory load patterns for the legacy SSE scalar fp to integer conversion intrinsics to be consistent across all intruction sets.
llvm-svn: 281861
2016-09-18 18:59:36 +00:00
Craig Topper 8c252bc4dd [AVX-512] Remove COPY_TO_REGCLASS from a few patterns that already had the correct register class.
llvm-svn: 281860
2016-09-18 18:59:33 +00:00
Xinliang David Li f0630d3d96 Fix built bot failure
llvm-svn: 281859
2016-09-18 18:52:08 +00:00
Xinliang David Li 4ca1733a06 [Profile] Implement select instruction instrumentation in IR PGO
Differential Revision: http://reviews.llvm.org/D23727

llvm-svn: 281858
2016-09-18 18:34:07 +00:00
Elena Demikhovsky 5f8cc0c346 [Loop Vectorizer] Consecutive memory access - fixed and simplified
Amended consecutive memory access detection in Loop Vectorizer.
Load/Store were not handled properly without preceding GEP instruction.

Differential Revision: https://reviews.llvm.org/D20789

llvm-svn: 281853
2016-09-18 13:56:08 +00:00
Simon Pilgrim 6c21e6a54e [X86][SSE] Improve recognition of uitofp conversions that can be performed as sitofp
With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations.

This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly support on pre-AVX512 hardware).

While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions.

Differential Revision: https://reviews.llvm.org/D24343

llvm-svn: 281852
2016-09-18 12:45:23 +00:00
Elena Demikhovsky a1a0e7ddbe [Loop vectorizer] Simplified GEP cloning. NFC.
Simplified GEP cloning in vectorizeMemoryInstruction().
Added an assertion that checks consecutive GEP, which should have only one loop-variant operand.

Differential Revision: https://reviews.llvm.org/D24557

llvm-svn: 281851
2016-09-18 09:22:54 +00:00
Wei Mi ab24cd189f Change the order of the splitted store from high - low to low - high.
It is a trivial change which could make the testcase easier to be reused
for the store splitting in CodeGenPrepare.

llvm-svn: 281846
2016-09-18 06:10:32 +00:00
Kostya Serebryany 8e781a888a [libFuzzer] use 'if guard' instead of 'if guard >= 0' with trace-pc; change the guard type to intptr_t; use separate array for 8-bit counters
llvm-svn: 281845
2016-09-18 04:52:23 +00:00
Davide Italiano a416d11357 [lib/LTO] Try harder to reduce code duplication. NFCI.
llvm-svn: 281843
2016-09-17 22:32:42 +00:00
Teresa Johnson fbb431b292 [ThinLTO] Ensure anonymous globals renamed even at -O0
Summary:
This fixes an issue when files are compiled with -flto=thin
at default -O0. We need to rename anonymous globals before attempting
to write the module summary because all values need names for
the summary. This was happening at -O1 and above, but not before
the early exit when constructing the pipeline for -O0.

Also add an internal -prepare-for-thinlto option to enable this
to be tested via opt.

Fixes PR30419.

Reviewers: mehdi_amini

Subscribers: probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24701

llvm-svn: 281840
2016-09-17 20:40:16 +00:00
Simon Pilgrim 6736096ac3 [X86][SSE] Improve target shuffle mask extraction
Add ability to extract vXi64 'vzext_movl' masks on 32-bit targets

llvm-svn: 281834
2016-09-17 18:50:54 +00:00
Ron Lieberman da5df7c99e [Hexagon] segv while processing SUnit with nullNodePtr
Added BoundaryNode check to isBestZeroLatency function.

llvm-svn: 281825
2016-09-17 16:21:09 +00:00
Matt Arsenault ac0fc849cf AMDGPU: Fix broken FrameIndex handling
We were trying to avoid using a FrameIndex operand in non-pointer
operands in a convoluted way, and would break because of
using TargetFrameIndex. The TargetFrameIndex should only be used
in the case where it makes sense to fold it as part of the addressing
mode, otherwise it requires materialization like a normal constant.
This wasn't working reliably and failed in the added testcase, hitting
the assert when processing the frame index.

The TargetFrameIndex was coming from trying to produce an AssertZext
limiting the maximum stack size. I'm not sure this was correct to begin
with, because it is apparently possible to have a single workitem
dispatch that requires all 4G of private memory.

llvm-svn: 281824
2016-09-17 16:09:55 +00:00
Matt Arsenault bcfd94c298 AMDGPU: Rename spill operands to match real instruction
llvm-svn: 281823
2016-09-17 15:52:37 +00:00
Matt Arsenault d99ef1144b AMDGPU: Push bitcasts through build_vector
This reduces the number of copies and reg_sequences
when using fp constant vectors. This significantly
reduces the code size in local-stack-alloc-bug.ll

llvm-svn: 281822
2016-09-17 15:44:16 +00:00
Kostya Serebryany bc3789a919 [libFuzzer] properly reset the guards when reseting the coverage. Also try to fix check-fuzzer on the bot
llvm-svn: 281814
2016-09-17 06:01:55 +00:00
Mehdi Amini a53d49e1b5 Don't create a SymbolTable in Function when the LLVMContext discards value names (NFC)
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.

This is a recommit of r281806 after fixing the accessor to return
a pointer instead of a reference and updating all the call-sites.

llvm-svn: 281813
2016-09-17 06:00:02 +00:00
Mehdi Amini 05188a646d [MIR Parser] Fix Build!
Last-second refactoring before push was bad idea...

llvm-svn: 281812
2016-09-17 05:41:02 +00:00
Mehdi Amini 8d904e9534 MIR Parser: issue an error when the Context discard value names.
This is in line with the LLParser behavior

llvm-svn: 281811
2016-09-17 05:33:58 +00:00
Kostya Serebryany 3e36ec1d18 [libFuzzer] change trace-pc to use 8-byte guards
llvm-svn: 281810
2016-09-17 05:04:47 +00:00
Kostya Serebryany 8ad4155745 [sanitizer-coverage] change trace-pc to use 8-byte guards
llvm-svn: 281809
2016-09-17 05:03:05 +00:00
Mehdi Amini 952ed8e364 Revert "Don't create a SymbolTable in Function when the LLVMContext discards value names (NFC)"
This reverts commit r281806. It introduces undefined behavior as an
API is returning a reference to the Symtab

llvm-svn: 281808
2016-09-17 04:36:46 +00:00
Mehdi Amini b877aeb7c8 Don't create a SymbolTable in Function when the LLVMContext discards value names (NFC)
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.

llvm-svn: 281806
2016-09-17 03:39:01 +00:00
Matt Arsenault 7b1dc2c983 AMDGPU: Use i64 scalar compare instructions
VI added eq/ne for i64, so use them.

llvm-svn: 281800
2016-09-17 02:02:19 +00:00
Tom Stellard 7998db634c AMDGPU/SI: Fix kernel argument ABI for HSA
Summary: i8, i16, and f16 values are not extended to 32-bit in the HSA kernel ABI.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl

Differential Revision: https://reviews.llvm.org/D24621

llvm-svn: 281789
2016-09-16 22:20:24 +00:00
Sanjay Patel f26710d97d [InstCombine] canonicalize vector select with constant vector condition to shuffle
As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): 
turn a vector select with constant condition operand into a shuffle as a canonicalization step.
Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract.

Possible known (minor?) regressions from this change are filed as:
https://llvm.org/bugs/show_bug.cgi?id=28530 
https://llvm.org/bugs/show_bug.cgi?id=28531 
https://llvm.org/bugs/show_bug.cgi?id=30371

If something terrible happens to perf after this commit, feel free to revert until a backend
fix is in place.

Differential Revision: https://reviews.llvm.org/D24279

llvm-svn: 281787
2016-09-16 22:16:18 +00:00
Matt Arsenault 6408c9135c AMDGPU: Allow some control flow intrinsics to be CSEd
These clean up some unnecessary or instructions in
cases with complex loops.

In the original testcase I noticed this, the same
or with exec was repeated 5 or 6 times in a row. With
this only one is emitted or sometimes a copy.

llvm-svn: 281786
2016-09-16 22:11:18 +00:00
Evgeniy Stepanov aa84f050fc [safestack] Fix assertion failure in stack coloring.
This is a fix for PR30318.

Clang may generate IR where an alloca is already live when entering a
BB with lifetime.start. In this case, conservatively extend the
alloca lifetime all the way back to the block entry.

llvm-svn: 281784
2016-09-16 22:04:10 +00:00
Quentin Colombet 318582f9f9 [RegAllocGreedy] Fix the list of NewVRegs for last chance recoloring.
When trying to recolor a register we may split live-ranges in the
process. When we create new live-ranges we will have to process them,
but when we move a register from Assign to Split, the allocation is not
changed until the whole recoloring session is successful.
Therefore, only push the live-ranges that changed from Assign to
Split when the recoloring is successful.

Same as the previous commit, I was not able to produce a test case that
reproduce the problem with in-tree targets.

Note: The bug has been here since the recoloring scheme has been added
back in r200883 (Feb 2014).

llvm-svn: 281783
2016-09-16 22:00:50 +00:00
Quentin Colombet 631768659d [RegAllocGreedy] Fix an assertion and condition when last chance recoloring is used.
When last chance recoloring is used, the list of NewVRegs may not be
empty when calling selectOrSplitImpl. Indeed, another coloring may have
taken place with splitting/spilling in the same recoloring session.

Relax an assertion to take this into account and adapt a condition to
act as if the NewVRegs were local to this selectOrSplitImpl instance.

Unfortunately I am unable to produce a test case for this, I was only
able to reproduce the conditions on an out-of-tree target.

llvm-svn: 281782
2016-09-16 22:00:42 +00:00
Tom Stellard bbeb45aff6 AMDGPU: Refactor kernel argument lowering
Summary:
The main challenge in lowering kernel arguments for AMDGPU is determing the
memory type of the argument.  The generic calling convention code assumes
that only legal register types can be stored in memory, but this is not the
case for AMDGPU.

This consolidates all the logic AMDGPU uses for deducing memory types into a single
function.  This will make it much easier to support different ABIs in the future.

Reviewers: arsenm

Subscribers: arsenm, wdng, nhaehnle, llvm-commits, yaxunl

Differential Revision: https://reviews.llvm.org/D24614

llvm-svn: 281781
2016-09-16 21:53:00 +00:00
Matt Arsenault 7ccf6cd104 AMDGPU: Use SOPK compare instructions
llvm-svn: 281780
2016-09-16 21:41:16 +00:00
Tom Stellard 0b76fc4c77 AMDGPU/SI: Add support for triples with the mesa3d operating system
Summary:
mesa3d will use the same kernel calling convention as amdhsa, but it will
handle everything else like the default 'unknown' OS type.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22783

llvm-svn: 281779
2016-09-16 21:34:26 +00:00
Sanjay Patel c96f6db246 [InstCombine] allow vector types for constant folding / computeKnownBits (PR24942)
computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine.

I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for 
ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements.

This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942:
https://llvm.org/bugs/show_bug.cgi?id=24942

Differential Revision: https://reviews.llvm.org/D24677

llvm-svn: 281777
2016-09-16 21:20:36 +00:00
Davide Italiano 14e9e8af35 [LTO] Add ability to parse AA pipelines.
This is supposed to be a drop in replacement for what lld
provides via --lto-newpm-aa-pipeline.

llvm-svn: 281774
2016-09-16 21:03:21 +00:00
Nirav Dave 2364748a49 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser initialization and X86 inline asm
error cleanup.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281762
2016-09-16 18:30:20 +00:00
Dehao Chen e0e0ed13f4 Change extractProfMetadata and extractProfTotalWeight to const member function.
llvm-svn: 281760
2016-09-16 18:27:20 +00:00
Eli Friedman 66fdba8799 LoopDistribute should preserve GlobalsAA.
Differential Revision: https://reviews.llvm.org/D24204

llvm-svn: 281758
2016-09-16 18:01:48 +00:00
Eli Friedman 02d48be5c0 LoopLoadElimination should preserve GlobalsAA.
Avoids losing GlobalsAA in the standard pass pipeline.

Differential Revision: https://reviews.llvm.org/D24094

llvm-svn: 281757
2016-09-16 17:58:07 +00:00
Mehdi Amini 55b06538b5 Fix test after renaming -name-anon-functions pass to -name-anon-globals
llvm-svn: 281752
2016-09-16 17:18:16 +00:00
Eric Christopher b0ee4e04b3 Actually remove the Mangler from the AsmPrinter and clean up the places it was "used" but not used.
llvm-svn: 281749
2016-09-16 17:07:23 +00:00
Eric Christopher dd7d68da58 Fix a hidden use of grabbing the Mangler from the AsmPrinter and update
accordingly.

llvm-svn: 281748
2016-09-16 17:07:13 +00:00
Mehdi Amini 27d2379b4e Rename NameAnonFunctions to NameAnonGlobals to match what it is doing (NFC)
llvm-svn: 281745
2016-09-16 16:56:30 +00:00
Mehdi Amini 2cac787919 Fix NameAnonFunctions pass: for ThinLTO we need to rename global variables as well
A follow-up patch will rename this pass and the source file accordingly,
but I figured the non-NFC change will be easier to spot in isolation.

Differential Revision: https://reviews.llvm.org/D24641

llvm-svn: 281744
2016-09-16 16:56:25 +00:00
Sanjay Patel 10494b2682 [InstCombine] add helper functions for visitICmpInst(); NFCI
llvm-svn: 281743
2016-09-16 16:10:22 +00:00
Davide Italiano cc1aa05350 [IRObjectFile] Turn llvm_unreachable("foo") into something more explicative.
llvm-svn: 281742
2016-09-16 16:07:19 +00:00
Davide Italiano dc8e07b98e [LTO] Prevent asm references to be dropped from the output.
Differential Revision:  https://reviews.llvm.org/D24617

llvm-svn: 281741
2016-09-16 16:05:25 +00:00
Ahmed Bougacha 85ef4a1c47 [AArch64][GlobalISel] Add default regbank mapping for int<>FP.
llvm-svn: 281739
2016-09-16 15:12:46 +00:00
Ahmed Bougacha 7b3b2e7f65 [AArch64][GlobalISel] Add default regbank mapping for G_FCMP.
llvm-svn: 281738
2016-09-16 15:12:43 +00:00
Ahmed Bougacha 90637f6196 [AArch64][GlobalISel] Add default regbank mapping for FP ops.
These should have all their operands - even scalars - go on FPR.

llvm-svn: 281737
2016-09-16 15:12:40 +00:00
Ahmed Bougacha 74db8faa71 [AArch64][GlobalISel] Test default regbank mapping for G_ICMP.
Also relax a RegisterBankInfo verifier check that's incompatible with
1-bit mappings.

llvm-svn: 281735
2016-09-16 14:44:54 +00:00
Ahmed Bougacha 7306313e6d [AArch64][GlobalISel] Add default regbank mappings for mixed-type ops.
We used to only support instructions with same-type operands.
Instead, use the per-register type information to map each
operand more accurately.

llvm-svn: 281734
2016-09-16 14:44:51 +00:00
David L Kreitzer 8bbabee21a Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Simon Dardis 1d56e888c9 [mips] Fix previous revert r281726.
llvm-svn: 281729
2016-09-16 14:16:23 +00:00
Keith Walker 830a8c1fbd Place the lowered phi instruction(s) before the DEBUG_VALUE entry
When a phi node is finally lowered to a machine instruction it is
important that the lowered "load" instruction is placed before the
associated DEBUG_VALUE entry describing the value loaded.

Renamed the existing SkipPHIsAndLabels to SkipPHIsLabelsAndDebug to
more fully describe that it also skips debug entries. Then used the
"new" function SkipPHIsAndLabels when the debug information should not
be skipped when placing the lowered "load" instructions so that it is
placed before the debug entries.

Differential Revision: https://reviews.llvm.org/D23760 

llvm-svn: 281727
2016-09-16 14:07:29 +00:00
Simon Dardis e53cfa73e4 Revert "[mips] Fix aui/daui/dahi/dati for MIPSR6"
This reverts r281724. Still need dsanders to accept this.

llvm-svn: 281726
2016-09-16 13:56:05 +00:00
Teresa Johnson 8dd61aee30 [LTO] Fix handling of mixed (regular and thin) mode LTO
Summary:
In runThinLTO we start the task numbering for ThinLTO backend
tasks depending on whether there was also a regular LTO object
(CombinedModule). However, the CombinedModule is moved at
the end of runRegularLTO, so we need to save this information and
pass it into runThinLTO. Otherwise the AddOutput callback to the client
will use the same task number for both the regular LTO object
and the first ThinLTO object, which in gold-plugin caused only
one to be end up in the output filename array and therefore passed
back to gold for the final native link.

Reviewers: pcc, mehdi_amini

Subscribers: mehdi_amini, kromanova

Differential Revision: https://reviews.llvm.org/D24643

llvm-svn: 281725
2016-09-16 13:54:19 +00:00
Simon Dardis cf060794cd [mips] Fix aui/daui/dahi/dati for MIPSR6
For compatiblity with binutils, define these instructions to take
two registers with a 16bit unsigned immediate. Both of the registers
have to be same for dahi and dati.

Reviewers: vkalintiris, dsanders, zoran.jovanovic
 
Differential Review: https://reviews.llvm.org/D21473

llvm-svn: 281724
2016-09-16 13:50:43 +00:00
Sjoerd Meijer 227825346e Reverting r281719, this is causing buildbot failures and timeouts again.
llvm-svn: 281722
2016-09-16 13:16:52 +00:00
Ahmed Bougacha b532360dd6 [AArch64][GlobalISel] Use the generic DefaultMapping as the default.
This lets generic logic handle the common case, instead of having to
implement applyMappingImpl for each instruction.

llvm-svn: 281720
2016-09-16 12:33:34 +00:00
Sjoerd Meijer 23385c87a4 This is an attempt to reapply r280808: [ARM] Lower UDIV+UREM to UDIV+MLS
(and the same for SREM)

This was causing buildbot failures earlier (time outs in the LNT suite).
However, we haven't been able to reproduce this and are suspecting this
was caused by another (reverted) patch.

llvm-svn: 281719
2016-09-16 12:10:09 +00:00
Eric Liu d07ad5196a Trying to fix Mangler memory leak in TargetLoweringObjectFile.
Summary:
`TargetLoweringObjectFile` can be re-used and thus `TargetLoweringObjectFile::Initialize()`
can be called multiple times causing `Mang` pointer memory leak.

Reviewers: echristo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24659

llvm-svn: 281718
2016-09-16 11:50:57 +00:00
Chandler Carruth 49d728ad21 [LCG] Redesign the lazy post-order iteration mechanism for the
LazyCallGraph to support repeated, stable iterations, even in the face
of graph updates.

This is particularly important to allow the CGSCC pass manager to walk
the RefSCCs (and thus everything else) in a module more than once. Lots
of unittests and other tests were hard or impossible to write because
repeated CGSCC pass managers which didn't invalidate the LazyCallGraph
would conclude the module was empty after the first one. =[ Really,
really bad.

The interesting thing is that in many ways this simplifies the code. We
can now re-use the same code for handling reference edge insertion
updates of the RefSCC graph as we use for handling call edge insertion
updates of the SCC graph. Outside of adapting to the shared logic for
this (which isn't trivial, but is *much* simpler than the DFS it
replaces!), the new code involves putting newly created RefSCCs when
deleting a reference edge into the cached list in the correct way, and
to re-formulate the iterator to be stable and effective even in the face
of these kinds of updates.

I've updated the unittests for the LazyCallGraph to re-iterate the
postorder sequence and verify that this all works. We even check for
using alternating iterators to trigger the lazy formation of RefSCCs
after mutation has occured.

It's worth noting that there are a reasonable number of likely
simplifications we can make past this. It isn't clear that we need to
keep the "LeafRefSCCs" around any more. But I've not removed that mostly
because I want this to be a more isolated change.

Differential Revision: https://reviews.llvm.org/D24219

llvm-svn: 281716
2016-09-16 10:20:17 +00:00
James Molloy 0dc4708fca [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

llvm-svn: 281715
2016-09-16 10:17:04 +00:00
Eric Christopher 4367c7fb9a Move the Mangler from the AsmPrinter down to TLOF and clean up the
TLOF API accordingly.

llvm-svn: 281708
2016-09-16 07:33:15 +00:00
Eric Christopher a808f2981e Remove unused function getMang().
llvm-svn: 281707
2016-09-16 07:32:58 +00:00
Vitaly Buka 6c7a0bc3d9 Revert "[asan] Avoid lifetime analysis for allocas with can be in ambiguous state"
This approach is not good enough. Working on the new solution.

This reverts commit r280907.

llvm-svn: 281689
2016-09-16 01:38:46 +00:00
Vitaly Buka 4670ae5f61 Revert "[asan] Add flag to allow lifetime analysis of problematic allocas"
This approach is not good enough. Working on the new solution.

This reverts commit r281126.

llvm-svn: 281688
2016-09-16 01:38:43 +00:00
Mehdi Amini b53b62eb69 Fix autoupgrade logic for Objective-C class properties module flag
Previous we were issuing an error when linking a module containing
the new Objective-C metadata structure for class properties with an
"old" one.
Now instead we downgrade the module flag so that the Objective-C
runtime does not expect the new metadata structure.

This is consistent with what ld64 is doing on binary files.

Differential Revision: https://reviews.llvm.org/D24620

llvm-svn: 281685
2016-09-16 00:38:18 +00:00
Sanjay Patel 8da42cc5d3 [InstCombine] move folds for icmp (sh C2, Y), C1 in with other icmp+sh folds; NFCI
llvm-svn: 281672
2016-09-15 22:26:31 +00:00
Kostya Serebryany 0984517021 [libFuzzer] make caller-callee feedback work with trace-pc-guard
llvm-svn: 281667
2016-09-15 22:16:15 +00:00
Kostya Serebryany 66a9c175bf [sanitizer-coverage] make trace-pc-guard and indirect-call work together
llvm-svn: 281665
2016-09-15 22:11:08 +00:00
Reid Kleckner be82d3ec0c [codeview] Optimize the size of defranges with gaps
For small, discontiguous local variable regions, CodeView can use a
single defrange record with a gap, rather than having two defrange
records. I expect that this optimization will only have a minor impact
on debug info size.

llvm-svn: 281664
2016-09-15 22:05:08 +00:00
Sanjay Patel af91d1f81e [InstCombine] allow icmp (shr/shl) folds for vectors
These 2 helper functions were already using APInt internally, so just
change the API and caller to allow folds for splats. The scalar
regression tests look quite thorough, so I just added a couple of
tests to prove that vectors are handled too.

These folds should be grouped with the other cmp+shift folds though.
That can be an NFC follow-up.

llvm-svn: 281663
2016-09-15 21:35:30 +00:00
Mehdi Amini d880309835 [GlobalOpt] Dead Eliminate declarations
GlobalOpt is already dead-code-eliminating global definitions. With
this change it also takes care of declarations.
Hopefully this should make it now a strict superset of GlobalDCE.
This is important for LTO/ThinLTO as we don't want the linker to see
"undefined reference" when it processes the input files: it could
prevent proper internalization (or even load an extra file from a
static archive, changing the behavior of the program!).

llvm-svn: 281653
2016-09-15 20:26:27 +00:00
David Majnemer 8b16da8744 [InstCombine] Do not RAUW a constant GEP
canRewriteGEPAsOffset expects to process instructions, not constants.

This fixes PR30342.

llvm-svn: 281650
2016-09-15 20:10:09 +00:00
Evandro Menezes 19b2aed308 [AArch64] Support for FP FMA when -ffp-contract=fast
Currently, the machine combiner can proceed matching when -ffast-math is on.
It should also match when only -ffp-contract=fast is specified as was the
case before when DAGCombiner was doing the job.

Patch by: Abderrazek Zaafrani <a.zaafrani@samsung.com>.

Differential Revision: https://reviews.llvm.org/D24366

llvm-svn: 281649
2016-09-15 19:55:23 +00:00
Evgeniy Stepanov a0601a40f7 Revert "[ARM] Promote small global constants to constant pools"
This reverts r281604, which adds text relocations to ARM binaries.

llvm-svn: 281645
2016-09-15 19:13:32 +00:00
Sanjay Patel 524fcdf041 [InstCombine] simplify code; NFCI
llvm-svn: 281644
2016-09-15 19:04:55 +00:00
Sriraman Tallam 06a67ba57d [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Zachary Turner de9ba15511 [pdb] Write the IPI stream.
The IPI stream is structurally identical to the TPI stream, but it
contains different record types.  So we just re-use the TPI writing
code.

llvm-svn: 281638
2016-09-15 18:22:31 +00:00
Sanjay Patel d93c4c0137 fix function names; NFC
llvm-svn: 281637
2016-09-15 18:22:25 +00:00
Zachary Turner a6cbfb53c2 [pdb] Fix the TPI stream size computation.
We were inadvertently adding the size of the hash value stream to
the size of the TPI stream, even though the hash value stream is
an entirely separate stream.

llvm-svn: 281636
2016-09-15 18:22:21 +00:00
Kostya Serebryany 21c3573733 [libFuzzer] fix the build for AFLDriverTest
llvm-svn: 281633
2016-09-15 18:10:38 +00:00
Sanjay Patel 886a542e23 [InstCombine] allow icmp (sub nsw) folds for vectors
Also, clean up the code and comments for the existing folds in foldICmpSubConstant().

llvm-svn: 281631
2016-09-15 18:05:17 +00:00
Davide Italiano f7518498ff [IRObjectFile] Handle undefined weak symbols in RecordStreamer.
Differential Revision:  https://reviews.llvm.org/D24594

llvm-svn: 281629
2016-09-15 17:54:22 +00:00
Sanjay Patel 362ff5c0a5 [InstCombine] remove duplicated fold ; NFCI
This pattern is matched in foldICmpBinOpEqualityWithConstant() and already works
with vectors too. I changed some comments over there to point out the current 
location. The tests for this transform are currently in 'sub.ll'.

Note that the remaining folds in this block all require a sub too, so they should
get grouped with the other icmp(sub) patterns.

llvm-svn: 281627
2016-09-15 17:01:17 +00:00
Sanjay Patel 40c53ea933 [InstCombine] allow (icmp sgt smin(PosA, B), 0) fold for vectors
llvm-svn: 281624
2016-09-15 16:23:20 +00:00
Etienne Bergeron 78582b2ada [compiler-rt] Changing function prototype returning unused value
Summary: The return value of `maybeInsertAsanInitAtFunctionEntry` is ignored.

Reviewers: rnk

Subscribers: llvm-commits, chrisha, dberris

Differential Revision: https://reviews.llvm.org/D24568

llvm-svn: 281620
2016-09-15 15:45:05 +00:00
Etienne Bergeron 52e4743e24 Fix silly mistake introduced here : https://reviews.llvm.org/D24566
Asan bots are currently broken without this patch.

llvm-svn: 281618
2016-09-15 15:35:59 +00:00
Etienne Bergeron c0669ce984 address comments from: https://reviews.llvm.org/D24566
using startswith instead of find.

llvm-svn: 281617
2016-09-15 15:19:19 +00:00
Sanjay Patel 9745983a4d [InstCombine] clean up foldICmpWithConstant(); NFC
1. Early exit to reduce indent
2. Rename variables
3. Add local 'Pred' variable

llvm-svn: 281615
2016-09-15 15:11:12 +00:00
Sanjay Patel 06b127a771 [InstCombine] add helper function for foldICmpWithConstant; NFC
This is a big glob of transforms that probably should work for vectors,
but currently they are disallowed because of ConstantInt guards.

llvm-svn: 281614
2016-09-15 14:37:50 +00:00
Sanjay Patel 7577a3d799 [InstCombine] use m_APInt to allow icmp folds using known bits for splat constant vectors
llvm-svn: 281613
2016-09-15 14:15:47 +00:00
Simon Dardis 2f9bb1627a [mips][ias] Enable IAS by default for N64 on Debian mips64el.
Unfortunately we can't enable it for all N64 because it is not yet possible to
distinguish N32 from N64.

N64 has been confirmed to produce identical (within reason) objects to GAS
during stage 2 of compiler recursion on N64-abit Fedora. Unfortunately,
Fedora's triples do not distinguish N32 from N64 so I can't enable it by
default there. I'm currently repeating this testing for Debian mips64el but
it's very unlikely to produce a different result.

Patch by: Daniel Sanders

Reviewers: sdardis

Differential Review: https://reviews.llvm.org/D22678

llvm-svn: 281607
2016-09-15 13:13:01 +00:00
James Molloy fe7fd879d7 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

llvm-svn: 281604
2016-09-15 12:30:27 +00:00
Tim Northover 22d82cf179 GlobalISel: legalize GEP instructions with small offsets.
llvm-svn: 281602
2016-09-15 11:02:19 +00:00
Tim Northover 4cf0a482bc GlobalISel: relax type constraints on G_ICMP to allow pointers.
llvm-svn: 281600
2016-09-15 10:40:38 +00:00
Tim Northover 32a078ad1a GlobalISel: remove "unsized" LLT
It was only really there as a sentinel when instructions had to have precisely
one type. Now that registers are typed, each register really has to have a type
that is sized.

llvm-svn: 281599
2016-09-15 10:09:59 +00:00
Tim Northover 5ae8350af6 GlobalISel: cache pointer sizes in LLT
Otherwise everything that needs to work out what size they are has to keep a
DataLayout handy, which is a bit silly and very annoying.

llvm-svn: 281597
2016-09-15 09:20:34 +00:00
Wei Mi f160e345be Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation queries LVI
to check non-null for pointer params of each callsite. If we know the def of
param is an alloca instruction, we know it is non-null and can return early from
LVI. Similarly, CVP queries LVI to check whether pointer for each mem access is
constant. If the def of the pointer is an alloca instruction, we know it is not
a constant pointer. These shortcuts can reduce the cost of CVP significantly.

Differential Revision: https://reviews.llvm.org/D18066

llvm-svn: 281586
2016-09-15 06:28:34 +00:00
Kostya Serebryany 09e416615e [libFuzzer] disable test that requires debug info -- it fails on the bot
llvm-svn: 281584
2016-09-15 05:46:58 +00:00
Kostya Serebryany 0b47fbcb30 [libFuzzer] move the AFL driver build rule test into the uninstrumented dir
llvm-svn: 281583
2016-09-15 05:17:39 +00:00
Kostya Serebryany 33a497abf4 [libFuzzer] fix print_pcs test
llvm-svn: 281580
2016-09-15 04:43:06 +00:00
Kostya Serebryany 5350178487 [libFuzzer] implement print_pcs with trace-pc-guard. Change the trace-pc-guard heuristic for 8-bit counters to look more like in AFL (not that it's provable better, but the existin test preferes this heuristic)
llvm-svn: 281577
2016-09-15 04:36:45 +00:00
Kostya Serebryany a5277d59d0 [libFuzzer] add 8-bit counters to trace-pc-guard handler
llvm-svn: 281568
2016-09-15 01:30:18 +00:00
Sanjay Patel 9efb1bdcc4 [InstCombine] refactor eq/ne cases in foldICmpUsingKnownBits() ; NFCI
The pattern matching and transforms are identical; the cmp predicate just changes.

llvm-svn: 281561
2016-09-14 23:38:56 +00:00
Zachary Turner c67b00c695 [pdb] Get rid of Data and RawData in CVType.
The `CVType` had two redundant fields which were confusing and
error-prone to fill out.  By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24432

llvm-svn: 281556
2016-09-14 23:00:16 +00:00
Zachary Turner 620961deb9 [pdb] Write TPI hash values to the TPI stream.
This completes being able to write all the interesting
values of a PDB TPI stream.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24370

llvm-svn: 281555
2016-09-14 23:00:02 +00:00
Reid Kleckner 31263731da [MC] Handle discardable COFF sections in assembly
Summary:
This fixes a dumpbin warning on objects produced by the MC assembler
when starting from text. All .debug$S sections are supposed to be marked
IMAGE_SCN_MEM_DISCARDABLE. The main, non-COMDAT .debug$S section had
this set, but any comdat ones were not being marked discardable because
there was no .section flag for it.

This change does two things:

- If the section name starts with .debug, implicitly mark the section as
  discardable. This means we do the same thing as gas on .s files with
  DWARF from GCC, which is important.

- Adds the 'D' flag to the .section directive on COFF to explicitly say
  a section is discardable. We only emit this flag if the section name
  does not start with .debug. This allows the user to explicitly tweak
  their section flags without worrying about magic section names.

The only thing you can't do in this scheme is to create a
non-discardable section with a name starting with ".debug", but
hopefully users don't need that.

Reviewers: majnemer, rafael

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24582

llvm-svn: 281554
2016-09-14 22:41:50 +00:00
Mehdi Amini e470927187 Fix auto-upgrade of TBAA tags in Bitcode Reader
If TBAA is on an intrinsic and it gets upgraded, it'll delete the call
instruction that we collected in a vector. Even if we were to use
WeakVH, it'll drop the TBAA and we'll hit the assert on the upgrade
path.

r263673 gave a shot to make sure the TBAA upgrade happens before
intrinsics upgrade, but failed to account for all cases.

Instead of collecting instructions in a vector, this patch makes it
just upgrade the TBAA on the fly, because metadata are always
already loaded at this point.

Differential Revision: https://reviews.llvm.org/D24533

llvm-svn: 281549
2016-09-14 22:29:59 +00:00
Reid Kleckner 4ad127f5fc Fix indentation in codeview code
llvm-svn: 281542
2016-09-14 21:49:21 +00:00
Mehdi Amini b2f46d1dc7 [LTO] Fix commons handling
Previously the prevailing information was not honored, and commons
symbols could override a strong definition. This patch fixes it and
propose the following semantic for commons: the client should mark
as prevailing the commons that it expects the LTO implementation to
merge (i.e. take the maximum size and alignment).
It implies that commons are allowed to have multiple prevailing
definitions.

Differential Revision: https://reviews.llvm.org/D24545

llvm-svn: 281538
2016-09-14 21:05:04 +00:00
Matt Arsenault 1b9fc8ed65 Finish renaming remaining analyzeBranch functions
llvm-svn: 281535
2016-09-14 20:43:16 +00:00
Sanjoy Das 23f06e53d8 [Stackmap] Added callsite counts to emitted function information.
Summary:
It was previously not possible for tools to use solely the stackmap
information emitted to reconstruct the return addresses of callsites in
the map, which is necessary to use the information to walk a stack. This
patch adds per-function callsite counts when emitting the stackmap
section in order to resolve the problem. Note that this slightly alters
the stackmap format, so external tools parsing these maps will need to
be updated.

**Problem Details:**
Records only store their offset from the beginning of the function they
belong to. While these records and the functions are output in program
order, it is not possible to determine where the end of one function's
records are without the callsite count when processing the records to
compute return addresses.

Patch by Kavon Farvardin!

Reviewers: atrick, ributzka, sanjoy

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D23487

llvm-svn: 281532
2016-09-14 20:22:03 +00:00
Evgeniy Stepanov e97d3b90b9 Revert "[ARM] Promote small global constants to constant pools"
Breaks Android tests by introducing text relocations to ARM binaries.

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/25362/steps/run%20asan%20lit%20tests%20%5Barm%2Fbullhead-userdebug%2FMTC20F%5D/logs/stdio

llvm-svn: 281526
2016-09-14 20:02:30 +00:00
Davide Italiano 63e8f444c7 [lib/LTO] Fix a typo. NFC.
llvm-svn: 281517
2016-09-14 18:48:43 +00:00
Matt Arsenault f40b70fa75 Revert "AMDGPU: Use SOPK compare instructions"
Accidentally committed

llvm-svn: 281514
2016-09-14 18:04:42 +00:00
Matt Arsenault f757c87959 AMDGPU: Use SOPK compare instructions
llvm-svn: 281513
2016-09-14 18:03:53 +00:00
Adrian Prantl a2ef047bd9 Verifier: Mark orphaned DICompileUnits as a debug info failure.
This is a follow-up to r268778 that adds a couple of missing cases,
most notably orphaned compile units.

rdar://problem/28193346

llvm-svn: 281508
2016-09-14 17:30:37 +00:00
Matt Arsenault e8e0f5cac6 Make analyzeBranch family of instruction names consistent
analyzeBranch was renamed to use lowercase first, rename
the related set to match.

llvm-svn: 281506
2016-09-14 17:24:15 +00:00
Matt Arsenault a2b036e88b AArch64: Use TTI branch functions in branch relaxation
The main change is to return the code size from
InsertBranch/RemoveBranch.

Patch mostly by Tim Northover

llvm-svn: 281505
2016-09-14 17:23:48 +00:00
Sanjay Patel c531c9ebf5 [x86] fix formatting; NFC
llvm-svn: 281504
2016-09-14 17:23:18 +00:00
Etienne Bergeron 752f8839a4 [compiler-rt] Avoid instrumenting sanitizer functions
Summary:
Function __asan_default_options is called by __asan_init before the
shadow memory got initialized. Instrumenting that function may lead
to flaky execution.

As the __asan_default_options is provided by users, we cannot expect
them to add the appropriate function atttributes to avoid
instrumentation.

Reviewers: kcc, rnk

Subscribers: dberris, chrisha, llvm-commits

Differential Revision: https://reviews.llvm.org/D24566

llvm-svn: 281503
2016-09-14 17:18:37 +00:00
Simon Pilgrim a369219ce6 [X86][SSE] Improve recognition of i64 sitofp conversions that can be performed as i32 (PR29078)
Until AVX512DQ we only support i64/vXi64 sitofp conversion as scalars.

This patch sees if the sign bit extends far enough that we can truncate to a i32 type and then perform sitofp without loss of precision.

Differential Revision: https://reviews.llvm.org/D24345

llvm-svn: 281502
2016-09-14 17:15:26 +00:00
Chad Rosier e6b3a63a3d [LoopInterchange] Typo. NFC.
llvm-svn: 281501
2016-09-14 17:12:30 +00:00
Chad Rosier 72431890b1 [LoopInterchange] Add CL option to override cost threshold.
Mostly useful for getting consistent lit testing.

llvm-svn: 281500
2016-09-14 17:07:13 +00:00
Simon Pilgrim fbbb28ebb3 [X86][SSE] Don't use PSHUFD directly - lower with generic shuffle
Remove the last user of the old getTargetShuffleNode helpers

llvm-svn: 281499
2016-09-14 17:04:22 +00:00
Sanjay Patel 284582b6d4 getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(), round 2 ; NFCI
llvm-svn: 281498
2016-09-14 16:54:10 +00:00
Chad Rosier 58ede270a7 [LoopInterchange] Cleanup debug whitespace. NFC.
llvm-svn: 281497
2016-09-14 16:43:19 +00:00
Sanjay Patel 1ed771f5d7 getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281495
2016-09-14 16:37:15 +00:00
Sanjay Patel b1f0a0f4a8 getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
llvm-svn: 281493
2016-09-14 16:05:51 +00:00
Etienne Bergeron 9bd4281006 Fix typo in comment [NFC]
llvm-svn: 281492
2016-09-14 15:59:32 +00:00
Matt Arsenault 2bc198a333 AMDGPU: Support folding FrameIndex operands
This avoids test regressions in a future commit.

llvm-svn: 281491
2016-09-14 15:51:33 +00:00
Sanjay Patel 5f6bb6cd24 getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI
llvm-svn: 281490
2016-09-14 15:43:44 +00:00
Sanjay Patel bd6fca1419 getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI
llvm-svn: 281489
2016-09-14 15:21:00 +00:00
Matt Arsenault fa5f767a38 AMDGPU: Improve splitting 64-bit bit ops by constants
This addresses a TODO to handle operations besides and. This
also starts eliminating no-op operations with a constant that
can emerge later.

llvm-svn: 281488
2016-09-14 15:19:03 +00:00
Matthew Simpson b25e87fca5 [LV] Process pointer IVs with PHINodes in collectLoopUniforms
This patch moves the processing of pointer induction variables in
collectLoopUniforms from the consecutive pointer phase of the analysis to the
phi node phase. Previously, if a pointer induction variable was used by both a
scalarized non-memory instruction as well as a vectorized memory instruction,
we would incorrectly identify the pointer as uniform. Pointer induction
variables should be treated the same as other phi nodes. That is, they are
uniform if all users of the induction variable and induction variable update
are uniform.

Differential Revision: https://reviews.llvm.org/D24511

llvm-svn: 281485
2016-09-14 14:47:40 +00:00
James Molloy 13065b00ba [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

llvm-svn: 281484
2016-09-14 14:47:27 +00:00
Simon Pilgrim ec2d206669 [X86][SSE] Removed unused getTargetShuffleNode function
llvm-svn: 281481
2016-09-14 14:30:00 +00:00
Nemanja Ivanovic d5deb4896c Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189)
This patch corresponds to review:
https://reviews.llvm.org/D24021

In the initial implementation of this instruction, I forgot to account for
variable indices. This patch fixes PR30189 and should probably be merged into
3.9.1 (I'll open a bug according to the new instructions).

llvm-svn: 281479
2016-09-14 14:19:09 +00:00
Silviu Baranga 0a020f0fb0 [StackProtector] Use INITIALIZE_TM_PASS instead of INITIALIZE_PASS
in order to make sure that its TargetMachine constructor is
registered.

This allows us to run the PEI machine pass with MIR input
(see PR30324).

llvm-svn: 281474
2016-09-14 14:09:43 +00:00
Nemanja Ivanovic a103d104e1 Adding missing directive for Power9.
There is currently no codegen for Power9 that depends on the directive
so this is NFC for now but will be important in the future. This was
missed in r268950 so I'm adding it now.

llvm-svn: 281473
2016-09-14 14:09:39 +00:00
Simon Pilgrim ba325e3a73 [X86][SSE] Don't blend vector shifts with MOVSS/MOVSD directly, lower from generic shuffle
Shuffle lowering will correctly lower to MOVSS/MOVSD/PBLEND, improving commutation opportunities

llvm-svn: 281471
2016-09-14 14:08:18 +00:00
Kuba Brecka a1ea64a044 [asan] Enable -asan-use-private-alias on Darwin/Mach-O, add test for ODR false positive with LTO (llvm part)
The '-asan-use-private-alias’ option (disabled by default) option is currently only enabled for Linux and ELF, but it also works on Darwin and Mach-O. This option also fixes a known problem with LTO on Darwin (https://github.com/google/sanitizers/issues/647). This patch enables the support for Darwin (but still keeps it off by default) and adds the LTO test case.

Differential Revision: https://reviews.llvm.org/D24292

llvm-svn: 281470
2016-09-14 14:06:33 +00:00
James Molloy 9790d8f81d Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"
This reverts commit r281323. It caused chromium test failures and a selfhost failure.

llvm-svn: 281451
2016-09-14 09:45:28 +00:00
Vassil Vassilev 2ec8b1506a Missing includes.
llvm-svn: 281450
2016-09-14 08:55:18 +00:00
Tim Northover 1c7825fd79 GlobalISel: mark pointer stores as legal on AArch64.
llvm-svn: 281448
2016-09-14 08:28:54 +00:00
Sjoerd Meijer 724023a1ec This reapplies r281304. The issue was that I had missed
to copy the new isAdd field in the tablegen data structure.

llvm-svn: 281447
2016-09-14 08:20:03 +00:00
Elena Demikhovsky 0569d9d588 AVX-512: Fixed a bug in kortest.z intrinsic
Lowering was wrong - X86ISD::SETCC node should return i8 type.

llvm-svn: 281446
2016-09-14 08:06:54 +00:00
Igor Breger 74813fc19c [AVX512BW] Change truncStore action (v16i16->v16i18). It can be legal only with AVX512VL.
Differential Revision: http://reviews.llvm.org/D24547

llvm-svn: 281445
2016-09-14 08:04:28 +00:00
Craig Topper 4e2d5a43cf [X86] Remove the VCVTSI2SD32 with rounding intrinsic. It's not used by clang and not needed since 32-bit integer to double is always exact.
llvm-svn: 281442
2016-09-14 06:27:46 +00:00
Wei Mi 24662395df Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Kostya Serebryany a00b243c75 [libFuzzer] start using trace-pc-guard as an alternative source of coverage
llvm-svn: 281435
2016-09-14 02:13:06 +00:00
Kostya Serebryany da718e55cf [sanitizer-coverage] add yet another flavour of coverage instrumentation: trace-pc-guard. The intent is to eventually replace all of {bool coverage, 8bit-counters, trace-pc} with just this one. LLVM part
llvm-svn: 281431
2016-09-14 01:39:35 +00:00
Akira Hatanaka 6d5a29489a Address Pete's review comment and define OrigArg on its own line.
This is a follow-up to r281419.

llvm-svn: 281421
2016-09-13 23:53:43 +00:00
Akira Hatanaka dea090e6b2 [ObjCARC] Traverse chain downwards to replace uses of argument passed to
ObjC library call with call return.

ARC contraction tries to replace uses of an argument passed to an
objective-c library call with the call return value. For example, in the
following IR, it replaces uses of argument %9 and uses of the values
discovered traversing the chain upwards (%7 and %8) with the call return
%10, if they are dominated by the call to @objc_autoreleaseReturnValue.
This transformation enables code-gen to tail-call the call to
@objc_autoreleaseReturnValue, which is necessary to enable auto release
return value optimization.

%7 = tail call i8* @objc_loadWeakRetained(i8** %6)
%8 = bitcast i8* %7 to %0*
%9 = bitcast %0* %8 to i8*
%10 = tail call i8* @objc_autoreleaseReturnValue(i8* %9)
ret %0* %8

Since r276727, llvm started removing redundant bitcasts and as a result
started feeding the following IR to ARC contraction:

%7 = tail call i8* @objc_loadWeakRetained(i8** %6)
%8 = bitcast i8* %7 to %0*
%9 = tail call i8* @objc_autoreleaseReturnValue(i8* %7)
ret %0* %8

ARC contraction no longer does the optimization described above since it
only traverses the chain upwards and fails to recognize that the
function return can be replaced by the call return. This commit changes
ARC contraction to traverse the chain downwards too and replace uses of
bitcasts with the call return.

rdar://problem/28011339

Differential Revision: https://reviews.llvm.org/D24523

llvm-svn: 281419
2016-09-13 23:43:11 +00:00
Pawel Bylica c397f0b272 [CodeGen] Fix invalid shift in mul expansion
Summary: When expanding mul in type legalization make sure the type for shift amount can actually fit the value. This fixes PR30354 https://llvm.org/bugs/show_bug.cgi?id=30354.

Reviewers: hfinkel, majnemer, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D24478

llvm-svn: 281403
2016-09-13 21:55:41 +00:00
Michael Kuperstein 59f8305305 [DAG] Allow build-to-shuffle combine to combine builds from two wide vectors.
This allows us to, in some cases, create a vector_shuffle out of a build_vector, when
the inputs to the build are extract_elements from two different vectors, at least one
of which is wider than the output. (E.g. a <8 x i16> being constructed out of
elements from a <16 x i16> and a <8 x i16>).

Differential Revision: https://reviews.llvm.org/D24491

llvm-svn: 281402
2016-09-13 21:53:32 +00:00
Kevin Enderby f76b56cb9c Next set of additional error checks for invalid Mach-O files for bad load commands
that use the Mach::dyld_info_command type for the load commands that are
currently use in the MachOObjectFile constructor.

This contains the missing checks for LC_DYLD_INFO and
LC_DYLD_INFO_ONLY load commands and the fields for the
Mach::dyld_info_command type.

llvm-svn: 281400
2016-09-13 21:42:28 +00:00
Krzysztof Parzyszek d19d0507c8 [Hexagon] Better handling of HVX vector lowering
- Expand SELECT_CC and BR_CC for vector types.
- Implement TLI::isShuffleMaskLegal.

llvm-svn: 281397
2016-09-13 21:16:07 +00:00
Matt Arsenault e2e6cfee61 Reapply "InstCombine: Reduce trunc (shl x, K) width."
This reapplies r272987 with a fix for infinitely looping
when the truncated value is another shift of a constant.

llvm-svn: 281379
2016-09-13 19:43:57 +00:00
Matthias Braun 1af1414d4d AArch64: Cleanup tailcall CC check, enable swiftcc.
Cleanup/change the code that checks for possible tailcall conventions to
look the same as the one in the X86 target. This makes the distinction
between calling conventions that can guarnatee tailcalls and the ones
that may tailcall more obvious.

- Add Swift to the mayTailCall list
- PreserveMost seemed to be incorrectly part of the guarnteed tail call
  list, move it to the mayTailCall list.

llvm-svn: 281376
2016-09-13 19:27:38 +00:00
Matt Arsenault a992f71bef AMDGPU: Remove code I think is dead
As far as I can tell, resolveFrameIndex is supposed to be
called with a legal offset, so inserting an add shouldn't be
necessary.

llvm-svn: 281372
2016-09-13 19:15:25 +00:00
Matt Arsenault 25dba30017 AMDGPU: Support commuting a FrameIndex operand
llvm-svn: 281369
2016-09-13 19:03:12 +00:00
Matthew Simpson 81335bec96 [LV] Clean up uniform induction variable analysis (NFC)
llvm-svn: 281368
2016-09-13 19:01:45 +00:00
Davide Italiano 39ccd24126 [LTO] Don't pass SF_Undefined symbols to the IRmover.
This should fix PR 30363.

llvm-svn: 281366
2016-09-13 18:45:13 +00:00
Simon Pilgrim 4a8eba3e96 [DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range test
To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range.

Followup to D23007

llvm-svn: 281362
2016-09-13 18:33:29 +00:00
Nico Weber e204c48d16 Revert r281336 (and r281337), it caused PR30372.
llvm-svn: 281361
2016-09-13 18:17:00 +00:00
Douglas Katzman 8ea02f4e1c [Myriad]: set LeonCASA processor feature
llvm-svn: 281359
2016-09-13 17:51:41 +00:00
Simon Pilgrim bd28a85d14 [DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine
Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue()

Followup to D23007

llvm-svn: 281354
2016-09-13 17:15:28 +00:00
Matt Arsenault 30bccade0b Fix misleading comment for getOrEnforceKnownAlignment
It does not return 0 to indicate failure, and returns the known
alignment.

llvm-svn: 281350
2016-09-13 16:39:43 +00:00
Andrea Di Biagio 7277afeec1 [ConstantFold] Improve the bitcast folding logic for constant vectors.
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.

Example:
  %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>

On a little endian target, %cast could have been folded to:
  <4 x i32><i32 undef, i32 undef, i32 2, i32 0>

This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.

Differential Revision: https://reviews.llvm.org/D24301

llvm-svn: 281343
2016-09-13 14:50:47 +00:00
Krzysztof Parzyszek b558ae2125 [Hexagon] Clear the flow queue after visiting a single instruction
llvm-svn: 281339
2016-09-13 14:36:55 +00:00
Nirav Dave fbd38cadf1 Apply Clang-format to MCAsmParser.cpp NFC.
llvm-svn: 281337
2016-09-13 13:57:16 +00:00
Nirav Dave 9fa8af2180 Defer asm errors to post-statement failure
Recommitting after fixing AsmParser Initialization.

Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281336
2016-09-13 13:55:06 +00:00
Chad Rosier 7ea0d3947a [LoopInterchange] Minor refactor. NFC.
llvm-svn: 281334
2016-09-13 13:30:30 +00:00
Chad Rosier 61683a22cb Don't use else if after return. Tidy comments. NFC.
llvm-svn: 281331
2016-09-13 13:08:53 +00:00
Chad Rosier d18ea0654b Typo. NFC.
llvm-svn: 281330
2016-09-13 13:00:29 +00:00
Chad Rosier 09c1109b12 [LoopInterchange] Tidy up and remove unnecessary dyn_casts. NFC.
llvm-svn: 281328
2016-09-13 12:56:04 +00:00
James Molloy 043d613791 Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656

llvm-svn: 281327
2016-09-13 12:45:51 +00:00
Pablo Barrio bb6984d401 [ARM] Add ".code 32" to functions in the ARM instruction set
Before, only Thumb functions were marked as ".code 16". These
".code x" directives are effective until the next directive of its
kind is encountered. Therefore, in code with interleaved ARM and
Thumb functions, it was possible to declare a function as ARM and
end up with a Thumb function after assembly. A test has been added.

An existing test has also been fixed to take this change into
account.

Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D24337

llvm-svn: 281324
2016-09-13 12:18:15 +00:00
James Molloy d246c598de [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.

llvm-svn: 281323
2016-09-13 12:12:32 +00:00
Sam Parker 214f7bf5cc Enable simplify libcalls for ARM PCS
Teach SimplifyLibcalls that in can treat functions annotated with
apcs, aapcs or aapcs_vfp like normal C functions if they only take
and return integer or pointer values, and the target is not iOS.

Differential Revision: https://reviews.llvm.org/D24453

llvm-svn: 281322
2016-09-13 12:10:14 +00:00
Peter Smith 85bbda191d [ARM] Support ldr.w in pseudo instruction ldr rd,=immediate
The changes made in r269352, r269353 and r269354 to support the 
transformation of the ldr rd,=immediate to mov introduced a regression
from 3.8 (ldr.w rd, =immediate) not supported.

This change puts support back in for ldr.w by means of a t2InstAlias for
the .w form. The .w is ignored in ARM state and propagated to the ldr in
Thumb2.

llvm-svn: 281319
2016-09-13 11:15:51 +00:00
James Molloy 3e4bc66134 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

llvm-svn: 281314
2016-09-13 10:28:11 +00:00
Ayman Musa 0c2da88f82 Remove MVT:i1 xor instruction before SELECT. (Performance improvement).
Differential Revision: https://reviews.llvm.org/D23764

llvm-svn: 281308
2016-09-13 09:12:45 +00:00
Sjoerd Meijer 520a18df9c Revert of r281304 as it is causing build bot failures in hexagon
hwloop regression tests. These tests pass locally; will be investigating
where these differences come from.

llvm-svn: 281306
2016-09-13 08:51:59 +00:00
Sjoerd Meijer 05453991fe This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instruction
descriptions now tag add instructions, and the Hexagon backend is using this to
identify loop induction statements.

Patch by Sam Parker and Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D23601

llvm-svn: 281304
2016-09-13 08:08:06 +00:00
Elena Demikhovsky b906df9fe5 AVX-512: Fix for PR28175 - Scalar code optimization.
Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32.
Optimization of this patterns is a one more step towards i1 optimization on AVX-512.

Differential Revision: https://reviews.llvm.org/D24456

llvm-svn: 281302
2016-09-13 07:57:00 +00:00
Diana Picus 4b97288184 [AArch64] Support stackmap/patchpoint in getInstSizeInBytes
We currently return 4 for stackmaps and patchpoints, which is very optimistic
and can in rare cases cause the branch relaxation pass to fail to relax certain
branches.

This patch causes getInstSizeInBytes to return a pessimistic estimate of the
size as the number of bytes requested in the stackmap/patchpoint. In the future,
we could provide a more accurate estimate by sharing some of the logic in
AArch64::LowerSTACKMAP/PATCHPOINT.

Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750

Differential Revision: https://reviews.llvm.org/D24073

llvm-svn: 281301
2016-09-13 07:45:17 +00:00
Craig Topper 4619c9e6a8 [X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native vector shuffles. They were removed from clang previously but accidentally left in the backend.
llvm-svn: 281300
2016-09-13 07:40:53 +00:00
Zachary Turner d97d5a2cee Revert "[Support][CommandLine] Add cl::getRegisteredSubcommands()"
This reverts r281290, as it breaks unit tests.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/303

llvm-svn: 281292
2016-09-13 04:11:57 +00:00
Dean Michael Berris d9d290c0c6 [Support][CommandLine] Add cl::getRegisteredSubcommands()
This should allow users of the library to get a range to iterate through
all the subcommands that are registered to the global parser. This
allows users to define subcommands in libraries that self-register to
have dispatch done at a different stage (like main). It allows for
writing code like the following:

    for (auto *S : cl::getRegisteredSubcommands()) {
      if (*S) {
	// Dispatch on S->getName().
      }
    }

This change also contains tests that show this usage pattern.

Reviewers: zturner, dblaikie, echristo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24489

llvm-svn: 281290
2016-09-13 02:35:00 +00:00
Peter Collingbourne d4135bbc30 DebugInfo: New metadata representation for global variables.
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.

Fixes PR30362. A program for upgrading test cases is attached to that
bug.

Differential Revision: http://reviews.llvm.org/D20147

llvm-svn: 281284
2016-09-13 01:12:59 +00:00
Michael Kuperstein efc0667583 [DAG] Refactor BUILD_VECTOR combine to make it easier to extend. NFCI.
This should make it easier to add cases that we currently don't cover,
like supporting more kinds of type mismatches and more than 2 input vectors.

llvm-svn: 281283
2016-09-13 00:57:43 +00:00
Hans Wennborg 8a42d4b9cc X86: Conditional tail calls should not have isBarrier = 1
That confuses e.g. machine basic block placement, which then doesn't
realize that control can fall through a block that ends with a conditional
tail call. Instead, isBranch=1 should be set.

Also, mark EFLAGS as used by these instructions.

llvm-svn: 281281
2016-09-13 00:21:32 +00:00
Eric Christopher 04c7db31e8 Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots.
This reverts commit r281249.

llvm-svn: 281280
2016-09-13 00:19:29 +00:00
Philip Reames 9db7948e90 [LVI] Complete the abstract of the cache layer [NFCI]
Convert the previous introduced is-a relationship between the LVICache and LVIImple clases into a has-a relationship and hide all the implementation details of the cache from the lazy query layer.

The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue.  As far as I can tell, this was effectively dead code.  I think it *used* to be the case that getCachedValueInfo wasn't const and might end up inserting elements in the cache during lookup.  That's no longer true and hasn't been for a while.  I did fixup the const usage to make that more obvious.

llvm-svn: 281272
2016-09-12 22:38:44 +00:00
Philip Reames b627aec407 [LVI] Sink a couple more cache manipulation routines into the cache itself [NFCI]
The only interesting bit here is the refactor of the handle callback and even that's pretty straight-forward.

llvm-svn: 281267
2016-09-12 22:03:36 +00:00
Philip Reames 92e5e1b92d [LVI] Abstract out the actual cache logic [NFCI]
Seperate the caching logic from the implementation of the lazy analysis.  For the moment, the lazy analysis impl has a is-a relationship with the cache; this will change to a has-a relationship shortly.  This was done as two steps merely to keep the changes simple and the diff understandable.

llvm-svn: 281266
2016-09-12 21:46:58 +00:00
Nico Weber 7c31d0ebc0 Revert r281215, it caused PR30358.
llvm-svn: 281263
2016-09-12 21:40:50 +00:00
Dehao Chen c32d71253c Fix the bug introduced in r281252.
llvm-svn: 281253
2016-09-12 20:29:54 +00:00
Dehao Chen 9bbb941acf Lower consecutive select instructions correctly.
Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095

Reviewers: davidxl

Subscribers: vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D24147

llvm-svn: 281252
2016-09-12 20:23:28 +00:00
Nirav Dave c0c0f7a196 [MC] Defer asm errors to post-statement failure
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.

As part of this many minor cleanups to the Parser:

* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
  and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
  now fixed.

These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.

Reviewers: rnk, majnemer

Subscribers: aemerson, jyknight, llvm-commits

Differential Revision: https://reviews.llvm.org/D24047

llvm-svn: 281249
2016-09-12 20:03:02 +00:00
Lang Hames 8d4be3aacf [MCJIT] Fix some inconsistent handling of name mangling inside MCJIT.
This patch moves symbol mangling from findSymbol to getSymbolAddress. The
findSymbol, findExistingSymbol and findModuleForSymbol methods now always take
a mangled name, allowing the 'demangle-and-retry' cruft to be removed from
findSymbol. See http://llvm.org/PR28699 for details.

Patch by James Holderness. Thanks very much James!

llvm-svn: 281238
2016-09-12 17:19:24 +00:00
Sanjay Patel f5887f1fbd [InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors
isSignBitCheck could be changed to take a pointer param to avoid the 'UnusedBit' ugliness.

llvm-svn: 281231
2016-09-12 16:25:41 +00:00
Nicolai Haehnle e58e0e3fe3 AMDGPU: Do not clobber SCC in SIWholeQuadMode
Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D22198

llvm-svn: 281230
2016-09-12 16:25:20 +00:00
Ahmed Bougacha 925961b20c [GlobalISel] Fix mismatched "<..)" in intrinsic MO printing. NFC.
llvm-svn: 281229
2016-09-12 16:21:49 +00:00
James Molloy 3d06ff22b7 Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625

llvm-svn: 281228
2016-09-12 16:18:23 +00:00
Ahmed Bougacha b678219aa6 [BranchFolding] Unique added live-ins after hoisting code.
We're not supposed to have duplicate live-ins.

llvm-svn: 281224
2016-09-12 16:05:31 +00:00
Ahmed Bougacha 45bfa8772f [X86] Copy imp-uses when folding tailcall into conditional branch.
r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers.  This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.

Don't drop the imp-used parameters.

llvm-svn: 281223
2016-09-12 16:05:27 +00:00
David Majnemer c83044d9bb [FunctionAttrs] Don't try to infer returned if it is already on an argument
Trying to infer the 'returned' attribute if an argument is already
'returned' can lead to verification failure: inference might determine
that a different argument is passed through which would result in two
different arguments marked as 'returned'.

This fixes PR30350.

llvm-svn: 281221
2016-09-12 16:04:59 +00:00
Sanjay Patel 0531f0a5bb fix formatting; NFC
llvm-svn: 281220
2016-09-12 15:52:28 +00:00
Sanjay Patel 3151dec7f1 [InstCombine] add helper function for foldICmpUsingKnownBits; NFCI
llvm-svn: 281217
2016-09-12 15:24:31 +00:00
Sam Kolton fb0d9d9c13 [AMDGPU] Assembler: Move disabled SDWA and DPP instruction into Disable asm variant
Summary: This removes disabled instructions from match tables so we will not match them at all.

Reviewers: tstellarAMD, vpykhtin, artem.tamazov

Subscribers: wdng, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D24452

llvm-svn: 281216
2016-09-12 14:42:43 +00:00
James Molloy 1e1b56bd48 [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.

llvm-svn: 281215
2016-09-12 14:30:48 +00:00
Sanjay Patel 5352331716 fix formatting/typos; NFC
llvm-svn: 281214
2016-09-12 14:25:46 +00:00
James Molloy 8f82d45ff4 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

llvm-svn: 281213
2016-09-12 13:42:16 +00:00
Chad Rosier a4c424654e [LoopInterchange] Improve debug output. NFC.
llvm-svn: 281212
2016-09-12 13:24:47 +00:00
Rafael Espindola 74941239d8 Define a dummy zlib::uncompress when zlib is not available.
Should fix link errors in some bots when it is used.

llvm-svn: 281208
2016-09-12 13:00:51 +00:00
Tim Northover 032548fc5e GlobalISel: support translation of global addresses.
llvm-svn: 281207
2016-09-12 12:10:41 +00:00
Tim Northover a7653b3919 GlobalISel: translate GEP instructions.
Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking
a single byte offset) to preserve the pointer type information through
selection.

llvm-svn: 281205
2016-09-12 11:20:22 +00:00
Tim Northover d28d3cc079 GlobalISel: disambiguate types when printing MIR
Some generic instructions have multiple types. While in theory these always be
discovered by inspecting the single definition of each generic vreg, in
practice those definitions won't always be local and traipsing through a big
function to find them will not be fun.

So this changes MIRPrinter to print out the type of uses as well as defs, if
they're known to be different or not known to be the same.

On the parsing side, we're a little more flexible: provided each register is
given a type in at least one place it's mentioned (and all types are
consistent) we accept the MIR. This doesn't introduce ambiguity but makes
writing tests manually a bit less painful.

llvm-svn: 281204
2016-09-12 11:20:10 +00:00
Eric Liu c7e5a9ce17 Fix WebAssembly broken build related to interface change in r281172.
Reviewers: bkramer

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: https://reviews.llvm.org/D24449

llvm-svn: 281201
2016-09-12 09:35:59 +00:00
Duncan P. N. Exon Smith cd0fffb6e1 MC: Move MCSection::begin/end to header, NFC
llvm-svn: 281188
2016-09-12 00:17:09 +00:00
Sanjay Patel 60312bc45f [InstCombine] add helper function for folding {and,or,xor} (cast X), C ; NFCI
llvm-svn: 281187
2016-09-12 00:16:23 +00:00
Duncan P. N. Exon Smith 23d8306d13 ADT: Add AllocatorList, and use it for yaml::Token
- Add AllocatorList, a non-intrusive list that owns an LLVM-style
  allocator and provides a std::list-like interface (trivially built on
  top of simple_ilist),
- add a typedef (and unit tests) for BumpPtrList, and
- use BumpPtrList for the list of llvm::yaml::Token (i.e., TokenQueueT).

TokenQueueT has no need for the complexity of an intrusive list.  The
only reason to inherit from ilist was to customize the allocator.
TokenQueueT was the only example in-tree of using ilist<> in a truly
non-intrusive way.

Moreover, this removes the final use of the non-intrusive
ilist_traits<>::createNode (after r280573, r281177, and r281181).  I
have a WIP patch that removes this customization point (and the API that
relies on it) that I plan to commit soon.

Note: AllocatorList owns the allocator, which limits the viable API
(e.g., splicing must be on the same list).  For now I've left out
any problematic API.  It wouldn't be hard to split AllocatorList into
two layers: an Impl class that calls DerivedT::getAlloc (via CRTP), and
derived classes that handle Allocator ownership/reference/etc semantics;
and then implement splice with appropriate assertions; but TBH we should
probably just customize the std::list allocators at that point.

llvm-svn: 281182
2016-09-11 22:40:40 +00:00
Craig Topper 7600794dde [TwoAddressInstruction] When commuting an instruction don't assume that the destination register is operand 0. Pass it from the caller.
In practice it probably is 0 so this may not be a functional change.

llvm-svn: 281180
2016-09-11 22:10:42 +00:00
Duncan P. N. Exon Smith 8b4e4af5ed ScalarOpts: Use std::list for Candidates, NFC
There is nothing intrusive about the Candidate list; use std::list over
llvm::ilist for simplicity.

llvm-svn: 281177
2016-09-11 21:29:34 +00:00
Duncan P. N. Exon Smith 077f5b41e4 ScalarOpts: Sort includes, NFC
llvm-svn: 281176
2016-09-11 21:04:36 +00:00
Duncan P. N. Exon Smith 1872096f1e CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI
Now that MachineBasicBlock::reverse_instr_iterator knows when it's at
the end (since r281168 and r281170), implement
MachineBasicBlock::reverse_iterator directly on top of an
ilist::reverse_iterator by adding an IsReverse template parameter to
MachineInstrBundleIterator.  This replaces another hard-to-reason-about
use of std::reverse_iterator on list iterators, matching the changes for
ilist::reverse_iterator from r280032 (see the "out of scope" section at
the end of that commit message).  MachineBasicBlock::reverse_iterator
now has a handle to the current node and has obvious invalidation
semantics.

r280032 has a more detailed explanation of how list-style reverse
iterators (invalidated when the pointed-at node is deleted) are
different from vector-style reverse iterators like std::reverse_iterator
(invalidated on every operation).  A great motivating example is this
commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.

Note: If your out-of-tree backend deletes instructions while iterating
on a MachineBasicBlock::reverse_iterator or converts between
MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator,
you'll need to update your code in similar ways to r280032.  The
following table might help:

                  [Old]              ==>             [New]
        delete &*RI, RE = end()                   delete &*RI++
        RI->erase(), RE = end()                   RI++->erase()
      reverse_iterator(I)                 std::prev(I).getReverse()
      reverse_iterator(I)                          ++I.getReverse()
    --reverse_iterator(I)                            I.getReverse()
      reverse_iterator(std::next(I))                 I.getReverse()
                RI.base()                std::prev(RI).getReverse()
                RI.base()                         ++RI.getReverse()
              --RI.base()                           RI.getReverse()
     std::next(RI).base()                           RI.getReverse()

(For more details, have a look at r280032.)

llvm-svn: 281172
2016-09-11 18:51:28 +00:00
Duncan P. N. Exon Smith cc9edace0c CodeGen: Turn on sentinel tracking for MachineInstr iterators
This is a prep commit before fixing MachineBasicBlock::reverse_iterator
invalidation semantics, ala r281167 for ilist::reverse_iterator.  This
changes MachineBasicBlock::Instructions to track which node is the
sentinel regardless of LLVM_ENABLE_ABI_BREAKING_CHECKS.

There's almost no functionality change (aside from ABI).  However, in
the rare configuration:

    #if !defined(NDEBUG) && !defined(LLVM_ENABLE_ABI_BREAKING_CHECKS)

the isKnownSentinel() assertions in ilist_iterator<>::operator* suddenly
have teeth for MachineInstr.  If these assertions start firing for your
out-of-tree backend, have a look at the suggestions in the commit
message for r279314, and at some of the commits leading up to it that
avoid dereferencing the end() iterator.

llvm-svn: 281168
2016-09-11 16:38:18 +00:00
Igor Breger e73ef85c6f [AVX512] Fix pattern for vgetmantsd and all other instructions that use same class. Fix memory operand size, remove unnecessary pattern.
Differential Revision: http://reviews.llvm.org/D24443

llvm-svn: 281164
2016-09-11 12:38:46 +00:00
James Molloy 104370ab37 [SimplifyCFG] Be even more conservative in SinkThenElseCodeToEnd
This should *actually* fix PR30244. This cranks up the workaround for PR30188 so that we never sink loads or stores of allocas.

The idea is that these should be removed by SROA/Mem2Reg, and any movement of them may well confuse SROA or just cause unwanted code churn. It's not ideal that the midend should be crippled like this, but that unwanted churn can really cause significant regressions in important workloads (tsan).

llvm-svn: 281162
2016-09-11 09:00:03 +00:00
James Molloy 18d96e8fa5 [SimplifyCFG] Harden up the profitability heuristic for block splitting during sinking
Exposed by PR30244, we will split a block currently if we think we can sink at least one instruction. However this isn't right - the reason we split predecessors is so that we can sink instructions that otherwise couldn't be sunk because it isn't safe to do so - stores, for example.

So, change the heuristic to only split if it thinks it can sink at least one non-speculatable instruction.

Should fix PR30244.

llvm-svn: 281160
2016-09-11 08:07:30 +00:00
Craig Topper 1f81deee1f [CodeGen] Make the TwoAddressInstructionPass check if the instruction is commutable before calling findCommutedOpIndices for every operand. Also make sure the operand is a register before each call to save some work on commutable instructions that might have an operand.
llvm-svn: 281158
2016-09-11 06:00:15 +00:00
Craig Topper fb4564cf21 [AVX-512] Add VPTERNLOG to load folding tables.
llvm-svn: 281156
2016-09-11 05:33:40 +00:00
Craig Topper 69be1bd352 [X86] Make a helper method into a static function local to the cpp file.
llvm-svn: 281154
2016-09-11 05:33:35 +00:00
Justin Lebar 11a3204355 Add handling of !invariant.load to PropagateMetadata.
Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.

Reviewers: arsenm

Subscribers: tra, jholewinski, hfinkel, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23479

llvm-svn: 281153
2016-09-11 01:39:08 +00:00
Justin Lebar 6d6b11a4a6 [NVPTX] Use ldg for explicitly invariant loads.
Summary:
With this change (plus some changes to prevent !invariant from being
clobbered within llvm), clang will be able to model the __ldg CUDA
builtin as an invariant load, rather than as a target-specific llvm
intrinsic.  This will let the optimizer play with these loads --
specifically, we should be able to vectorize them in the load-store
vectorizer.

Reviewers: tra

Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc

Differential Revision: https://reviews.llvm.org/D23477

llvm-svn: 281152
2016-09-11 01:39:04 +00:00
Justin Lebar adbf09e8cf [CodeGen] Split out the notions of MI invariance and MI dereferenceability.
Summary:
An IR load can be invariant, dereferenceable, neither, or both.  But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.

This patch splits up the notions of invariance and dereferenceability at
the MI level.  It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23371

llvm-svn: 281151
2016-09-11 01:38:58 +00:00
Arnold Schwaighofer 6c57f4f56d It should also be legal to pass a swifterror parameter to a call as a swifterror
argument.

rdar://28233388

llvm-svn: 281147
2016-09-10 19:42:53 +00:00
Arnold Schwaighofer 5d335559b9 InstCombine: Don't combine loads/stores from swifterror to a new type
This generates invalid IR: the only users of swifterror can be call
arguments, loads, and stores.

rdar://28242257

llvm-svn: 281144
2016-09-10 18:14:57 +00:00
Arnold Schwaighofer ad83002243 Add an isSwiftError predicate to Value
llvm-svn: 281143
2016-09-10 18:14:54 +00:00
Sanjay Patel 0a3d72bb93 [InstCombine] clean up foldICmpBinOpEqualityWithConstant / foldICmpIntrinsicWithConstant ; NFC
1. Rename variables to be consistent with related/preceding code (may want to reorganize).
2. Fix comments/formatting.

llvm-svn: 281140
2016-09-10 15:33:39 +00:00
Sanjay Patel f58f68c891 [InstCombine] rename and reorganize some icmp folding functions; NFC
Everything under foldICmpInstWithConstant() should now be working for
splat vectors via m_APInt matchers. Ie, I've removed all of the FIXMEs
that I added while cleaning that section up. Note that not all of the
associated FIXMEs in the regression tests are gone though, because some
of the tests require earlier folds that are still scalar-only. 

llvm-svn: 281139
2016-09-10 15:03:44 +00:00
Arnold Schwaighofer 112ff66505 We also need to pass swifterror in R12 under swiftcc not only under ccc
rdar://28190687

llvm-svn: 281138
2016-09-10 14:16:55 +00:00
Valery Pykhtin b66e5eb612 [AMDGPU] Refactor MUBUF/MTBUF instructions
Differential revision: https://reviews.llvm.org/D24295

llvm-svn: 281137
2016-09-10 13:09:16 +00:00
Heejin Ahn 99bd16b34b [WebAssembly] Fix typos in comments
llvm-svn: 281131
2016-09-10 02:33:47 +00:00
Kostya Serebryany 8c537c556a [libFuzzer] print a failed-merge warning only in the merge mode
llvm-svn: 281130
2016-09-10 02:17:22 +00:00
Matt Arsenault 3354f42ae7 AMDGPU: Implement is{LoadFrom|StoreTo}FrameIndex
llvm-svn: 281128
2016-09-10 01:20:33 +00:00
Matt Arsenault 7348a7eadd AMDGPU: Fix scheduling info for spill pseudos
These defaulted to Write32Bit. I don't think this actually matters
since these don't exist during scheduling.

llvm-svn: 281127
2016-09-10 01:20:28 +00:00
Vitaly Buka 3ac3aa50f6 [asan] Add flag to allow lifetime analysis of problematic allocas
Summary:
Could be useful for comparison when we suspect that alloca was skipped
because of this.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24437

llvm-svn: 281126
2016-09-10 01:06:11 +00:00
Justin Lebar d98cf00c95 [CodeGen] Rename MachineInstr::isInvariantLoad to isDereferenceableInvariantLoad. NFC
Summary:
I want to separate out the notions of invariance and dereferenceability
at the MI level, so that they correspond to the equivalent concepts at
the IR level.  (Currently an MI load is MI-invariant iff it's
IR-invariant and IR-dereferenceable.)

First step is renaming this function.

Reviewers: chandlerc

Subscribers: MatzeB, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D23370

llvm-svn: 281125
2016-09-10 01:03:20 +00:00
Kostya Serebryany 4529960a3b [libFuzzer] don't print help for internal flags
llvm-svn: 281124
2016-09-10 00:35:30 +00:00
Kostya Serebryany b991cc1f0e [libFuzzer] print a visible message if merge fails due to a crash
llvm-svn: 281122
2016-09-10 00:15:41 +00:00
Matt Arsenault 124384f08d AMDGPU: Fix immediate folding logic when shrinking instructions
If the literal is being folded into src0, it doesn't matter
if it's an SGPR because it's being replaced with the literal.

Also fixes initially selecting 32-bit versions of some instructions
which also confused commuting.

llvm-svn: 281117
2016-09-09 23:32:53 +00:00
Arnold Schwaighofer c9277f40fd Inliner: Don't mark swifterror allocas with lifetime markers
This would create a bitcast use which fails the verifier: swifterror values may
only be used by loads, stores, and as function arguments.

rdar://28233244

llvm-svn: 281114
2016-09-09 22:40:27 +00:00
Hans Wennborg 6ecf619be9 X86: Fold tail calls into conditional branches also for 64-bit (PR26302)
This extends the optimization in r280832 to also work for 64-bit. The only
quirk is that we can't do this for 64-bit Windows (yet).

Differential Revision: https://reviews.llvm.org/D24423

llvm-svn: 281113
2016-09-09 22:37:27 +00:00
Matt Arsenault 0efdd06b22 AMDGPU: Run LoadStoreVectorizer pass by default
llvm-svn: 281112
2016-09-09 22:29:28 +00:00
Kostya Serebryany 1837152a34 [libFuzzer] use sizeof() in tests instead of 4 and 8
llvm-svn: 281111
2016-09-09 22:21:16 +00:00
Matt Arsenault 950a82047b LSV: Fix incorrectly increasing alignment
If the unaligned access has a dynamic offset, it may be odd which
would make the adjusted alignment incorrect to use.

llvm-svn: 281110
2016-09-09 22:20:14 +00:00
Sanjay Patel 58109abe91 [InstCombine] use m_APInt to allow icmp ult X, C folds for splat constant vectors
llvm-svn: 281107
2016-09-09 21:59:37 +00:00
Kostya Serebryany 4b17a331ae [libFuzzer] one more puzzle for value profile
llvm-svn: 281106
2016-09-09 21:58:42 +00:00
Simon Pilgrim a3d1e03cd7 [X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets
Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets

llvm-svn: 281105
2016-09-09 21:47:21 +00:00
Krzysztof Parzyszek 73e0ad8220 [Hexagon] Fix disassembler crash after r279255
When p0 was added as an explicit operand to the duplex subinstructions,
the disassembler was not updated to reflect this.

llvm-svn: 281104
2016-09-09 21:45:00 +00:00
Arnold Schwaighofer 7d7b4b4014 Create phi nodes for swifterror values at the end of the phi instructions list
ISel makes assumption about the order of phi nodes.

rdar://28190150

llvm-svn: 281095
2016-09-09 21:18:47 +00:00
Justin Lebar b5e884976b [NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc.
Summary:
Previously these only worked via NVPTX-specific intrinsics.

This change will allow us to convert these target-specific intrinsics
into the general LLVM versions, allowing existing LLVM passes to reason
about their behavior.

It also gets us some minor codegen improvements as-is, from situations
where we canonicalize code into one of these llvm intrinsics.

Reviewers: majnemer

Subscribers: llvm-commits, jholewinski, tra

Differential Revision: https://reviews.llvm.org/D24300

llvm-svn: 281092
2016-09-09 21:07:26 +00:00
Saleem Abdulrasool 92e33a3ebc ARM: move the builtins libcall CC setup
Move the target specific setup into the target specific lowering setup.  As
pointed out by Anton, the initial change was moving this too high up the stack
resulting in a violation of the layering (the target generic code path setup
target specific bits).  Sink this into the ARM specific setup.  NFC.

llvm-svn: 281088
2016-09-09 20:11:31 +00:00
Rafael Espindola 46cdcb03ed Add a lower level zlib::uncompress.
SmallVectors are convenient, but they don't cover every use case.

In particular, they are fairly large (3 pointers + one element) and
there is no way to take ownership of the buffer to put it somewhere
else.  This patch then adds a lower lever interface that works with
any buffer.

llvm-svn: 281082
2016-09-09 19:32:36 +00:00
Wei Ding 06f8d39424 AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type.
Differential Revision: http://reviews.llvm.org/D23700

llvm-svn: 281081
2016-09-09 19:31:51 +00:00
Tom Stellard b2869eb6e9 AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA
Reviewers: arsenm

Subscribers: arsenm, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D24405

llvm-svn: 281080
2016-09-09 19:28:00 +00:00
Zachary Turner 36efbfa6d8 [pdb] Print out some more info when dumping a raw stream.
We have various command line options that print the type of a
stream, the size of a stream, etc but nowhere that it can all be
viewed together.

Since a previous patch introduced the ability to dump the bytes
of a stream, this seems like a good place to present a full view
of the stream's properties including its size, what kind of data
it represents, and the blocks it occupies.  So I added the
ability to print that information to the -stream-data command
line option.

llvm-svn: 281077
2016-09-09 19:00:49 +00:00
Dehao Chen 22ce5eb051 Do not widen load for different variable in GVN.
Summary:
Widening load in GVN is too early because it will block other optimizations like PRE, LICM.

https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

           Benchmark             Base:Reference   (1)  
-------------------------------------------------------
spec/2006/fp/C++/444.namd                  25.2  -0.08%
spec/2006/fp/C++/447.dealII               45.92  +1.05%
spec/2006/fp/C++/450.soplex                41.7  -0.26%
spec/2006/fp/C++/453.povray               35.65  +1.68%
spec/2006/fp/C/433.milc                   23.79  +0.42%
spec/2006/fp/C/470.lbm                    41.88  -1.12%
spec/2006/fp/C/482.sphinx3                47.94  +1.67%
spec/2006/int/C++/471.omnetpp             22.46  -0.36%
spec/2006/int/C++/473.astar               21.19  +0.24%
spec/2006/int/C++/483.xalancbmk           36.09  -0.11%
spec/2006/int/C/400.perlbench             33.28  +1.35%
spec/2006/int/C/401.bzip2                 22.76  -0.04%
spec/2006/int/C/403.gcc                   32.36  +0.12%
spec/2006/int/C/429.mcf                   41.04  -0.41%
spec/2006/int/C/445.gobmk                 26.94  +0.04%
spec/2006/int/C/456.hmmer                  24.5  -0.20%
spec/2006/int/C/458.sjeng                    28  -0.46%
spec/2006/int/C/462.libquantum            55.25  +0.27%
spec/2006/int/C/464.h264ref               45.87  +0.72%

geometric mean                                   +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.

Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames

Subscribers: gberry, junbuml

Differential Revision: https://reviews.llvm.org/D24096

llvm-svn: 281074
2016-09-09 18:42:35 +00:00