Commit Graph

37452 Commits

Author SHA1 Message Date
Krzysztof Parzyszek f228c95f87 [Hexagon] Handle expansion of cmpxchg
llvm-svn: 273432
2016-06-22 16:07:10 +00:00
Artur Pilipenko 1cec4fdddf Upgrade old memset/memcpy signatures (without isVolatile argument) in tests
We no longer have corresponding code in autoupgrade and the vast majority of the tests were fixed long time ago. Fix the remaining few. One of the verifier test cases is marked as XFAIL because it was passing only because the signature was incorrect.

llvm-svn: 273428
2016-06-22 15:16:06 +00:00
Sanjay Patel c6cacd6067 [InstSimplify] add ashr tests including vector types
llvm-svn: 273421
2016-06-22 14:18:04 +00:00
Simon Pilgrim bc35f9f702 [SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization tests
llvm-svn: 273420
2016-06-22 14:07:46 +00:00
Sanjay Patel 21579bb39a [InstSimplify] regenerate checks
llvm-svn: 273419
2016-06-22 14:00:16 +00:00
George Rimar ff8b539f7b [llvm-readobj] - Teach llvm-readobj to print dependencies of SHT_GNU_verdef and refactor dumping method.
This patch changes single method of llvm-readobj.
It teaches SHT_GNU_verdef dumper to print version dependencies,
also it removes few fields from output that can be dumped with other keys
and slightly refactors code.
Testcase was also modified to match the changes.
Change is required for testcases of upcoming lld patches.

Differential revision: http://reviews.llvm.org/D21552

llvm-svn: 273417
2016-06-22 13:43:38 +00:00
Simon Pilgrim 1536c19642 Regenerated test
llvm-svn: 273404
2016-06-22 12:58:15 +00:00
Peter Zotov ee462ca194 [OCaml] Add functions for accessing metadata nodes.
Patch by Xinyu Zhuang.

Differential Revision: http://reviews.llvm.org/D19309

llvm-svn: 273370
2016-06-22 03:30:24 +00:00
Reid Kleckner 0c5d874bea [codeview] Improve names of types in scopes and member function ids
We now include namespace scope info in LF_FUNC_ID records and we emit
LF_MFUNC_ID records for member functions as we should.

Class names are now fully qualified, which is what MSVC does.

Add a little bit of scaffolding to handle ThisAdjustment when it arrives
in DISubprogram.

llvm-svn: 273358
2016-06-22 01:32:56 +00:00
Reid Kleckner fd3b35ad84 [codeview] Add missing test from r273294
llvm-svn: 273355
2016-06-22 01:17:05 +00:00
Anna Zaks 644d9d3a44 [asan] Do not instrument pointers with address space attributes
Do not instrument pointers with address space attributes since we cannot track
them anyway. Instrumenting them results in false positives in ASan and a
compiler crash in TSan. (The compiler should not crash in any case, but that's
a different problem.)

llvm-svn: 273339
2016-06-22 00:15:52 +00:00
Peter Collingbourne 21521891a2 IR: Allow metadata attachments on declarations, and fix lazy loaded metadata issue with globals.
This change is motivated by an upcoming change to the metadata representation
used for CFI. The indirect function call checker needs type information for
external function declarations in order to correctly generate jump table
entries for such declarations. We currently associate such type information
with declarations using a global metadata node, but I plan [1] to move all
such metadata to global object attachments.

In bitcode, metadata attachments for function declarations appear in the
global metadata block. This seems reasonable to me because I expect metadata
attachments on declarations to be uncommon. In the long term I'd also expect
this to be the case for CFI, because we'd want to use some specialized bitcode
format for this metadata that could be read as part of the ThinLTO thin-link
phase, which would mean that it would not appear in the global metadata block.

To solve the lazy loaded metadata issue I was seeing with D20147, I use the
same bitcode representation for metadata attachments for global variables as I
do for function declarations. Since there's a use case for metadata attachments
in the global metadata block, we might as well use that representation for
global variables as well, at least until we have a mechanism for lazy loading
global variables.

In the assembly format, the metadata attachments appear after the "declare"
keyword in order to avoid a parsing ambiguity.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html

Differential Revision: http://reviews.llvm.org/D21052

llvm-svn: 273336
2016-06-21 23:42:48 +00:00
Haicheng Wu a783bac50b [Kryo] Enable loop prefetcher.
Differential Revision: http://reviews.llvm.org/D21535

llvm-svn: 273329
2016-06-21 22:47:56 +00:00
Kevin Enderby 606a338db9 Update llvm-obdump(1) to print FAT_MAGIC_64 for Darwin’s 64-bit universal files
with the -macho and -universal-headers flags.

Just a follow on to r273207, I missed updating the printing of the fat magic
number when the universal file is a 64-bit universal file.

rdar://26899493

llvm-svn: 273324
2016-06-21 21:55:01 +00:00
Jan Vesely fea814d531 AMDGPU: Add implicitarg.ptr intrinsic.
Points to the start of implicit arguments (appended after explicit arguments)

Differential Revision: http://reviews.llvm.org/D20297

llvm-svn: 273317
2016-06-21 20:46:20 +00:00
Michael Kuperstein 78028b84d2 [X86] Make arithmetic operations cost model test saner. NFC.
llvm-svn: 273316
2016-06-21 20:41:40 +00:00
Artem Belevich d7ebcfb291 [NVPTX] Improve lowering of byval args of device functions.
Avoid unnecessary spills of such vars to local space on SASS level and
pointer space conversion.

Instead, make a local copy with appropriate addrspacecasts and let
LLVM optimize them away when possible.

This allows loading value of the argument using [symbol+offset]
instead of converting argument to general space pointer and using it
for indexing (which also implicitly converts param space pointer to
local space one on SASS level and triggers copying of argument into
local space in the process).

This reduces call overhead, uses less registers and reduces overall
SASS size by 2-4%.

Differential Review: http://reviews.llvm.org/D21421

llvm-svn: 273313
2016-06-21 20:30:26 +00:00
Easwaran Raman 8bceb9d210 Fix PR28219: Use profile summary from reader and not compute it
Differentiaal revision: http://reviews.llvm.org/D21546

llvm-svn: 273301
2016-06-21 19:29:49 +00:00
Reid Kleckner 5b335b864b [codeview] Add support for splitting field list records over 64KB
The basic structure is that once a list record goes over 64K, the last
subrecord of the list is an LF_INDEX record that refers to the next
record. Because the type record graph must be toplogically sorted, this
means we have to emit them in reverse order. We build the type record in
order of declaration, so this means that if we don't want extra copies,
we need to detect when we were about to split a record, and leave space
for a continuation subrecord that will point to the eventual split
top-level record.

Also adds dumping support for these records.

Next we should make sure that large method overload lists work properly.

llvm-svn: 273294
2016-06-21 18:33:01 +00:00
Silviu Baranga 03b6a4fc88 [AArch64] Fix merge-store.ll regression test after r273271
r273271 changed the RUN line of the regression test to use
-march=cyclone instead of -mtriple=aarch64-none-none.

This caused a change in the output syntax for the ext
instruction, causing the test to fail. Change this test
back to using -mtriple=aarch64-none-none.

llvm-svn: 273286
2016-06-21 17:15:49 +00:00
Etienne Bergeron f6be62f2c8 [StackProtector] Fix computation of GSCookieOffset and EHCookieOffset with SEH4
Summary:
Fix the computation of the offsets present in the scopetable when using the
SEH (__except_handler4).

This patch added an intrinsic to track the position of the allocation on the
stack of the EHGuard. This position is needed when producing the ScopeTable.

```
    struct _EH4_SCOPETABLE {
        DWORD GSCookieOffset;
        DWORD GSCookieXOROffset;
        DWORD EHCookieOffset;
        DWORD EHCookieXOROffset;
        _EH4_SCOPETABLE_RECORD ScopeRecord[1];
    };

    struct _EH4_SCOPETABLE_RECORD {
        DWORD EnclosingLevel;
        long (*FilterFunc)();
            union {
            void (*HandlerAddress)();
            void (*FinallyFunc)();
        };
    };
```

The code to generate the EHCookie is added in `X86WinEHState.cpp`.
Which is adding these instructions when using SEH4.

```
Lfunc_begin0:
# BB#0:                                 # %entry
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	subl	$28, %esp
	movl	%ebp, %eax                <<-- Loading FramePtr
	movl	%esp, -36(%ebp)
	movl	$-2, -16(%ebp)
	movl	$L__ehtable$use_except_handler4_ssp, %ecx
	xorl	___security_cookie, %ecx
	movl	%ecx, -20(%ebp)
	xorl	___security_cookie, %eax  <<-- XOR FramePtr and Cookie
	movl	%eax, -40(%ebp)           <<-- Storing EHGuard
	leal	-28(%ebp), %eax
	movl	$__except_handler4, -24(%ebp)
	movl	%fs:0, %ecx
	movl	%ecx, -28(%ebp)
	movl	%eax, %fs:0
	movl	$0, -16(%ebp)
	calll	_may_throw_or_crash
LBB1_1:                                 # %cont
	movl	-28(%ebp), %eax
	movl	%eax, %fs:0
	addl	$28, %esp
	popl	%esi
	popl	%edi
	popl	%ebx
	popl	%ebp
	retl

```

And the corresponding offset is computed:
```
Luse_except_handler4_ssp$parent_frame_offset = -36
	.p2align	2
L__ehtable$use_except_handler4_ssp:
	.long	-2                      # GSCookieOffset
	.long	0                       # GSCookieXOROffset
	.long	-40                     # EHCookieOffset    <<----
	.long	0                       # EHCookieXOROffset
	.long	-2                      # ToState
	.long	_catchall_filt          # FilterFunction
	.long	LBB1_2                  # ExceptionHandler

```

Clang is not yet producing function using SEH4, but it's a work in progress.
This patch is a step toward having a valid implementation of SEH4.
Unfortunately, it is not yet fully working. The EH registration block is not
allocated at the right offset on the stack.

Reviewers: rnk, majnemer

Subscribers: llvm-commits, chrisha

Differential Revision: http://reviews.llvm.org/D21231

llvm-svn: 273281
2016-06-21 15:58:55 +00:00
Evandro Menezes 230083ff9d [AArch64] Change the preferred alignment for char and short to word alignment
Differential Revision: http://reviews.llvm.org/D21414

llvm-svn: 273279
2016-06-21 15:55:18 +00:00
Silviu Baranga dc43d61a25 [AArch64] Switch regression tests to test features not CPUs
Summary:
We have switched to using features for all heuristics, but
the tests for these are still using -mcpu, which means we
are not directly testing the features.

This converts at least some of the existing regression tests
to use the new features.

This still leaves the following features untested:

merge-narrow-ld
predictable-select-expensive
alternate-sextload-cvt-f32-pattern
disable-latency-sched-heuristic

Reviewers: mcrosier, t.p.northover, rengolin

Subscribers: MatzeB, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D21288

llvm-svn: 273271
2016-06-21 15:16:34 +00:00
Daniel Sanders bf2c03ee69 [arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos.
Summary:
canCombineSinCosLibcall() would previously combine sin+cos into sincos for
GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not.
However, GNU would only combine them for UnsafeFPMath because sincos does not
set errno like sin and cos do. It seems likely that this is an oversight.

Reviewers: t.p.northover

Subscribers: t.p.northover, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D21431

llvm-svn: 273259
2016-06-21 12:29:03 +00:00
Elena Demikhovsky a266cf0518 reverted the prev commit due to assertion failure
llvm-svn: 273258
2016-06-21 12:10:11 +00:00
Elena Demikhovsky 9823c995bc Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Differential revision: http://reviews.llvm.org/D20789

llvm-svn: 273257
2016-06-21 11:32:01 +00:00
Craig Topper 283418fbb6 [AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
llvm-svn: 273253
2016-06-21 07:37:32 +00:00
Craig Topper 9038aa3001 [AVX512] Use update_llc_test_checks.py to regenerate a test in preparation for a future commit.
llvm-svn: 273252
2016-06-21 07:37:27 +00:00
James Y Knight 03c1415b8f Revert "Change RelaxELFRelocations for llc."
This reverts commit r273019.

From email I sent to list:
> I don't think this makes sense. Either the linker you're using supports
> this feature, or it doesn't. Having it enabled for llc if your linker
> doesn't support it is not fun.
>
> Further note that this also affects basically all other code using llvm
> libraries -- other than Clang, which explicitly sets it back to false by
> default, unless you set the ENABLE_X86_RELAX_RELOCATIONS cmake flag to
> true.
>
> If you want to enable the relax mode across all llvm tools in some
> circumstances, I think it should be via moving the cmake flag from clang
> down into llvm.
>
> I'm going to revert this commit, since I both think it intrinsically
> doesn't make sense to do this, and because it's breaking some of our
> tools.

llvm-svn: 273245
2016-06-21 05:40:41 +00:00
Craig Topper 0a0fb0fda1 [AVX512] Remove the masked vpcmpeq/vcmpgt intrinsics and autoupgrade them to native icmps.
llvm-svn: 273240
2016-06-21 03:53:24 +00:00
George Burgess IV 9fdbfe17a8 [CFLAA] Be more aggressive with interprocedural analysis.
This patch makes us perform interprocedural analysis on functions that
don't have internal linkage. It also removes a test that should've been
deleted in an earlier commit (since other tests now cover everything
that the newly-removed test covers).

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21513

llvm-svn: 273229
2016-06-21 01:42:47 +00:00
George Burgess IV 87b2e41416 [CFLAA] Add interprocedural function summaries.
This patch adds function summaries, so that we don't need to recompute
various properties about function parameters/return values at each
callsite of a function. It also adds many interprocedural tests for
CFLAA.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21475#inline-182390

llvm-svn: 273219
2016-06-20 23:10:56 +00:00
Simon Pilgrim 356e823b51 [X86][SSE] Add cost model for BSWAP of vectors
The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization.

Differential Revision: http://reviews.llvm.org/D21521

llvm-svn: 273217
2016-06-20 23:08:21 +00:00
Simon Pilgrim 225b2e37a0 [X86][X87] Fix issue with sitofp i64 -> fp128 on 32-bit targets
Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128.

Added 32-bit tests for fp128 cast/conversions.

llvm-svn: 273210
2016-06-20 22:41:17 +00:00
Kevin Enderby d0a814ccec Forgot to svn add one of my test files for the change in r273207.
llvm-svn: 273208
2016-06-20 22:27:49 +00:00
Kevin Enderby eb6d110c1d Add support for Darwin’s 64-bit universal files with 64-bit offsets and sizes for the objects.
Darwin added support in its Xcode 8.0 tools (released in the beta) for universal
files where offsets and sizes for the objects are 64-bits to allow support for
objects contained in universal files to be larger then 4gb.  The change is very
straight forward.  There is a new magic number that differs by one bit, much
like the 64-bit Mach-O files.  Then there is a new structure that follow the
fat_header that has the same layout but with the offset and size fields using
64-bit values instead of 32-bit values.

rdar://26899493

llvm-svn: 273207
2016-06-20 22:16:18 +00:00
Vedant Kumar 0222adbcd2 [tsan] Do not instrument accesses to the gcov counters array
There is a known intended race here. This is a follow-up to r264805,
which disabled tsan instrumentation for updates to instrprof counters.
For more background on this please see the discussion in D18164.

llvm-svn: 273202
2016-06-20 21:24:26 +00:00
Sanjay Patel 9ad8fb68f7 [InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869)
By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question
raised by PR27869:
https://llvm.org/bugs/show_bug.cgi?id=27869
...where InstCombine turns an icmp+zext into a shift causing us to miss the fold.

Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp.

Differential Revision: http://reviews.llvm.org/D21512

llvm-svn: 273200
2016-06-20 20:59:59 +00:00
Dehao Chen 071bb9d7af Pass AssumptionCacheTracker from SampleProfileLoader to Inliner
Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader

Reviewers: davidxl

Subscribers: eraman, vsk, danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D21205

llvm-svn: 273199
2016-06-20 20:53:40 +00:00
Matt Arsenault 802ebcb4bb InstCombine: Don't strip convergent from intrinsic callsites
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.

llvm-svn: 273188
2016-06-20 19:04:44 +00:00
Sanjay Patel 445d7ecf89 [InstCombine] consolidate some icmp+logic tests and improve checks
llvm-svn: 273186
2016-06-20 18:40:37 +00:00
Matt Arsenault 2209625387 AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32
The implicit operand is added by the initial instruction construction,
so this was adding an additional vcc use. The original one
was missing the undef flag the original condition had,
so the verifier would complain.

llvm-svn: 273182
2016-06-20 18:34:00 +00:00
Matt Arsenault b6d8c37e1a AMDGPU: Fold more custom nodes to undef
This will help sneak undefs past GVN into the DAG for
some tests.

Also add missing intrinsic for rsq_legacy, even though the node
was already selected to the instruction. Also start passing
the debug location to intrinsic errors.

llvm-svn: 273181
2016-06-20 18:33:56 +00:00
Sanjay Patel 14dcb042bc [InstCombine] update to use FileCheck with autogenerated exact checking
llvm-svn: 273180
2016-06-20 18:23:40 +00:00
Matt Arsenault ff98241f37 Generalize DiagnosticInfoStackSize to support other limits
Backends may want to report errors on resources other than
stack size.

llvm-svn: 273177
2016-06-20 18:13:04 +00:00
Sanjay Patel 06918ad79e [InstCombine] update to use FileCheck with autogenerated exact checking
llvm-svn: 273173
2016-06-20 17:56:13 +00:00
Matt Arsenault a9720c67f1 AMDGPU: Use correct method for determining instruction size
llvm-svn: 273172
2016-06-20 17:51:32 +00:00
Sanjay Patel a038240660 [InstCombine] regenerate checks
llvm-svn: 273170
2016-06-20 17:48:48 +00:00
Rafael Espindola 959e9c8d01 Use shouldAssumeDSOLocal.
With this ARM fast isel knows that PIE variable are not preemptable.

llvm-svn: 273169
2016-06-20 17:45:33 +00:00
Tom Stellard 5350894265 AMDGPU: Add support for R_AMDGPU_REL32 relocations
Reviewers: arsenm, kzhuravl, rafael

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21401

llvm-svn: 273168
2016-06-20 17:33:43 +00:00
Tom Stellard 1c89eb7db0 AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations
Reviewers: arsenm, rafael, kzhuravl

Subscribers: rafael, arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21400

llvm-svn: 273166
2016-06-20 16:59:44 +00:00
Sam Parker d616cf07b2 [ARM] Enable isel of UMAAL
TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL
dags into UMAAL. Selection is split into the two phases because it
is easier to match the two patterns at those different times.

Differential Revision: http://http://reviews.llvm.org/D21461

llvm-svn: 273165
2016-06-20 16:47:09 +00:00
David Majnemer c5601df9fd Reapply "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273160, reapplying r273132.
RecursivelyDeleteTriviallyDeadInstructions cannot be called on a
parentless Instruction.

llvm-svn: 273162
2016-06-20 16:03:25 +00:00
Cong Liu 1c28b6d733 Revert "[LoopIdiom] Don't remove dead operands manually"
This reverts commit r273132.
Breaks multiple test under /llvm/test:Transforms (e.g.
llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan.

llvm-svn: 273160
2016-06-20 15:22:15 +00:00
Simon Pilgrim 0a81b13f31 [X86][F16C] Added half <-> double conversion tests
llvm-svn: 273153
2016-06-20 12:51:55 +00:00
Pankaj Gode 0aab2e398a [AARCH64] Add support for Broadcom Vulcan
Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A).

Differential Revision: http://reviews.llvm.org/D21500

llvm-svn: 273148
2016-06-20 11:13:31 +00:00
Patrik Hagglund 7205215591 Fix for PR27940
After a store has been eliminated, when making sure that the
instruction iterator points to a valid instruction, dbg intrinsics are
now ignored as a new instruction.

Patch by Henric Karlsson.

Reviewed by Daniel Berlin.

Differential Revision: http://reviews.llvm.org/D21076

llvm-svn: 273141
2016-06-20 09:10:10 +00:00
Igor Breger e59165ca63 [AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.
Differential Revision: http://reviews.llvm.org/D20897

llvm-svn: 273138
2016-06-20 07:05:43 +00:00
David Majnemer a705843f23 [LoopIdiom] Don't remove dead operands manually
Removing dead instructions requires remembering which operands have
already been removed.  RecursivelyDeleteTriviallyDeadInstructions has
this logic, don't partially reimplement it in LoopIdiomRecognize.

This fixes PR28196.

llvm-svn: 273132
2016-06-20 02:33:29 +00:00
Sanjay Patel a4b052c7d1 [InstSimplify] add tests for PR27689; regenerate checks
llvm-svn: 273128
2016-06-19 21:40:12 +00:00
Simon Pilgrim 0887d5b02e [X86][AVX512] Added 512-bit BITREVERSE tests and enabled AVX512BW lowering support
llvm-svn: 273125
2016-06-19 20:59:19 +00:00
Simon Pilgrim 3d881a0230 [X86][SSE] Allow target shuffle combining to match masks with SM_Sentinel values
We currently only allow exact matches of shuffle mask patterns during target shuffle combining.

This patch relaxes this to permit SM_SentinelUndef in the combined shuffle to always be accepted as well as allowing exact matching of the SM_SentinelZero value.

I've adjusted some tests that were requiring exact shuffle masks to now include undef values.

Differential Revision: http://reviews.llvm.org/D21495

llvm-svn: 273119
2016-06-19 18:03:52 +00:00
Chris Dewhurst a294541c05 [SPARC[ Correcting out-of-date unit tests checked in as part of r273108
llvm-svn: 273110
2016-06-19 12:52:39 +00:00
Chris Dewhurst 0c1e0026aa [SPARC] Fixes for hardware errata on LEON processor.
Passes to fix three hardware errata that appear on some LEON processor variants.

The instructions FSMULD, FMULS and FDIVS do not work as expected on some LEON processors. This change allows those instructions to be substituted for alternatives instruction sequences that are known to work.

These passes only run when selected individually, or as part of a processor defintion. They are not included in general SPARC processor compilations for non-LEON processors or for those LEON processors that do not have these hardware errata.

llvm-svn: 273108
2016-06-19 11:03:28 +00:00
David Majnemer 3119599475 [LoadCombine] Combine Loads formed from GEPS with negative indexes
Change the underlying offset and comparisons to use int64_t instead of
uint64_t.

Patch by River Riddle!

Differential Revision: http://reviews.llvm.org/D21499

llvm-svn: 273105
2016-06-19 06:14:56 +00:00
Simon Pilgrim 9a09652a3a [X86][AVX] Added test case for PR28136
llvm-svn: 273098
2016-06-18 22:59:08 +00:00
Simon Pilgrim cd6d4352bc [X86][SSSE3] Added examples of target shuffle combining failing to match undefs in shuffle masks
llvm-svn: 273097
2016-06-18 21:18:21 +00:00
Simon Pilgrim ab009e9f41 [X86][XOP] Added fast-isel tests matching tools/clang/test/CodeGen/xop-builtins.c
llvm-svn: 273096
2016-06-18 21:07:31 +00:00
Simon Pilgrim b201678763 [X86][TBM] Added fast-isel tests matching tools/clang/test/CodeGen/tbm-builtins.c
llvm-svn: 273087
2016-06-18 17:20:52 +00:00
Vasileios Kalintiris 0cf68df6cc [mips] Emit a JALR with $rd equal to $zero, instead of a JR in MIPS32R6.
Summary:
JR is an alias of JALR with $rd=0 in the R6 ISA. Also, this fixes recursive
builds in MIPS32R6.

Reviewers: dsanders, sdardis

Subscribers: jfb, dschuff, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21370

llvm-svn: 273085
2016-06-18 15:39:43 +00:00
Amjad Aboud 76c9eb99a7 [codeview] Emit non-virtual method type.
Differential Revision: http://reviews.llvm.org/D21011

llvm-svn: 273084
2016-06-18 10:25:07 +00:00
Marcin Koscielnicki 3feda222c6 [sanitizers] Disable target-specific lowering of string functions.
CodeGen has hooks that allow targets to emit specialized code instead
of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen.
When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting
in uninstrumented memory accesses.  To avoid that, make these sanitizers
mark the calls as nobuiltin.

Differential Revision: http://reviews.llvm.org/D19781

llvm-svn: 273083
2016-06-18 10:10:37 +00:00
Matt Arsenault a466a7cf62 Add looping testcase that broke in r272987
llvm-svn: 273081
2016-06-18 05:15:58 +00:00
Matt Arsenault e935f05a94 AMDGPU: Fix kernel argument alignment impacting stack size
Don't use AllocateStack because kernel arguments have nothing
to do with the stack. The ensureMaxAlignment call was still
changing the stack alignment.

llvm-svn: 273080
2016-06-18 05:15:53 +00:00
Sanjoy Das e8fd9561cb [SCEV] Fix incorrect trip count computation
The way we elide max expressions when computing trip counts is incorrect
-- it breaks cases like this:

```
static int wrapping_add(int a, int b) {
  return (int)((unsigned)a + (unsigned)b);
}

void test() {
  volatile int end_buf = 2147483548; // INT_MIN - 100
  int end = end_buf;

  unsigned counter = 0;
  for (int start = wrapping_add(end,  200); start < end; start++)
    counter++;

  print(counter);
}
```

Note: the `NoWrap` variable that was being tested has little to do with
the values flowing into the max expression; it is a property of the
induction variable.

test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test
functionality I'm reverting in this change, so I've deleted the test
fully.

llvm-svn: 273079
2016-06-18 04:38:31 +00:00
Simon Pilgrim f4b2af1b9f [X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics
Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern.

llvm-svn: 273077
2016-06-18 02:38:26 +00:00
Rafael Espindola 88bb8ce821 Add a test for r273022.
llvm-svn: 273073
2016-06-18 00:24:49 +00:00
Matt Arsenault 8fd5978811 Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."""
This seems to be causing an infinite loop / crash in instcombine
on some bots.

llvm-svn: 273069
2016-06-17 23:36:38 +00:00
Tom Stellard f8db61c5f0 Support/ELF: Add AMDGPU relocation definitions to match documentation
Reviewers: arsenm, kzhuravl, rafael

Subscribers: llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21443

llvm-svn: 273066
2016-06-17 22:38:08 +00:00
Adam Nemet a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00
Matt Arsenault 0bb294b224 AMDGPU: Temporarily select trap to s_endpgm
This should select to s_trap, but that requires
additonal work to setup and enable the trap handler.
For now emit s_endpgm so bugpoint stops getting stuck
on the unsupported call to abort.

Emit a warning that this will only terminate the wave and
not really trap.

llvm-svn: 273062
2016-06-17 22:27:03 +00:00
Kevin Enderby ae108ffb9a Add support for Darwin’s static library table of contents with 64-bit offsets to the archive members.
Darwin added support in its Xcode 8.0 tools (released in the beta) for static
library table of contents with 64-bit offsets to the archive members.  The
change is very straight forward.  The table of contents member is named
___.SYMDEF_64 or "___.SYMDEF_64 SORTED" and same layout is used but with
fields using 64 bit values instead of 32 bit values.

rdar://26869808

llvm-svn: 273058
2016-06-17 22:16:06 +00:00
Reid Kleckner 6fa1546ad9 [codeview] Emit incomplete member pointer types with the unknown model
An incomplete member pointer type will always have a size of zero, so we
don't need an extra flag. Credit to David Majnemer for the idea.

llvm-svn: 273057
2016-06-17 22:14:39 +00:00
Reid Kleckner 604105bb90 [codeview] Add DIFlags for pointer to member representations
Summary:
This seems like the least intrusive way to pass this information
through.

Fixes PR28151

Reviewers: majnemer, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21444

llvm-svn: 273053
2016-06-17 21:31:33 +00:00
Matt Arsenault 8885910f8e AMDGPU: Remove llvm.SI.tid intrinsic
Mesa doesn't emit this for llvm >= 3.8 anymore.

llvm-svn: 273050
2016-06-17 21:18:41 +00:00
Matt Arsenault d76efc14b9 Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""
Reapply r272987. Condition should be in terms of the destination type,
and the flags should not be copied.

llvm-svn: 273045
2016-06-17 20:33:53 +00:00
Marcin Koscielnicki fd4b6b9e51 [SelectionDAG] Don't treat library calls specially if marked with nobuiltin.
To be used by D19781.

Differential Revision: http://reviews.llvm.org/D19801

llvm-svn: 273039
2016-06-17 20:24:07 +00:00
Michael Kuperstein 18d6d3d95e [X86] Add missing AVX512 anyext patterns.
Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and
i32 patterns.

llvm-svn: 273038
2016-06-17 20:21:17 +00:00
Davide Italiano b49aa5c0c4 [PM] Port MergedLoadStoreMotion to the new pass manager, take two.
This is indeed a much cleaner approach (thanks to Daniel Berlin
for pointing out), and also David/Sean for review.

Differential Revision:  http://reviews.llvm.org/D21454

llvm-svn: 273032
2016-06-17 19:10:09 +00:00
Tim Northover 28a9e7f4ba ARM: take account of possible bundle when erasing an instruction.
Fortunately this appears to be the only ARM-specific pass that runs while
bundles might be in play, so no other cases need modifying.

llvm-svn: 273029
2016-06-17 18:40:46 +00:00
Davide Italiano 16bfa13a77 [IRObjectFile] Handle .weak in RecordStreamer.
Differential Revision:  http://reviews.llvm.org/D21476

llvm-svn: 273027
2016-06-17 18:20:14 +00:00
James Y Knight 148a6469dc Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass.
Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1
or 2-byte. For those, you need to mask and shift the 1 or 2 byte values
appropriately to use the 4-byte instruction.

This change adds support for cmpxchg-based instruction sets (only SPARC,
in LLVM). The support can be extended for LL/SC-based PPC and MIPS in
the future, supplanting the ISel expansions those architectures
currently use.

Tests added for the IR transform and SPARCv9.

Differential Revision: http://reviews.llvm.org/D21029

llvm-svn: 273025
2016-06-17 18:11:48 +00:00
Rafael Espindola 9f86baebe0 Change RelaxELFRelocations for llc.
As a developer tool it makes sense for it to use the new relocations.

llvm-svn: 273019
2016-06-17 17:43:41 +00:00
Rafael Espindola e021e85166 Change the default of -relax-relocations.
llvm-mc is a developer tool, as such it make sense for it to use new
features by default.

This doesn't change the user facing clang, which still defaults to non
relaxable relocations.

llvm-svn: 273014
2016-06-17 17:04:56 +00:00
Sanjay Patel 216d8cf720 [InstCombine] allow more than one use for vector bitcast folding with selects
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to 
produce min and max ops:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %bc1 = bitcast <4 x float> %a to <4 x i32>
  %bc2 = bitcast <4 x float> %b to <4 x i32>
  %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
  %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
  %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
  store <4 x i32> %sel1, <4 x i32>* %bc3
  %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
  store <4 x i32> %sel2, <4 x i32>* %bc4
  ret void
}

With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
  %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
  store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
  store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
  ret void
}

The asm for x86 SSE then improves from:

movaps  %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps  %xmm2, %xmm3
andnps  %xmm1, %xmm3
movaps  %xmm2, %xmm4
andnps  %xmm0, %xmm4
andps %xmm2, %xmm0
orps  %xmm3, %xmm0
andps %xmm1, %xmm2
orps  %xmm4, %xmm2
movaps  %xmm0, (%rdi)
movaps  %xmm2, (%rsi)

To:

movaps  %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps  %xmm2, (%rdi)
movaps  %xmm1, (%rsi)

The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.

Differential Revision: http://reviews.llvm.org/D21190

llvm-svn: 273011
2016-06-17 16:46:50 +00:00
David Majnemer da9548f949 [CodeView] Refactor enumerator emission
This addresses Amjad's review comments on D21442.

llvm-svn: 273010
2016-06-17 16:13:21 +00:00
Reid Kleckner ac945e27dd [codeview] Make function names more consistent with MSVC
Names in function id records don't include nested name specifiers or
template arguments, but names in the symbol stream include both.

For the symbol stream, instead of having Clang put the fully qualified
name in the subprogram display name, recreate it from the subprogram
scope chain. For the type stream, take the unqualified name and chop of
any template arguments.

This makes it so that CodeView DI metadata is more similar to DWARF DI
metadata.

llvm-svn: 273009
2016-06-17 16:11:20 +00:00
Nirav Dave fd91041ce1 Refactor and cleanup Assembly Parsing / Lexing
Recommiting after fixing non-atomic insert to front of SmallVector in
MCAsmLexer.h

Add explicit Comment Token in Assembly Lexing for future support for
outputting explicit comments from inline assembly. As part of this,
CPPHash Directives are now explicitly distinguished from Hash line
comments in Lexer.

Line comments are recorded as EndOfStatement tokens, not Comment tokens
to simplify compatibility with current TargetParsers. This slightly
complicates comment output.

This remove all lexing tasks out of the parser, does minor cleanup
to remove extraneous newlines Asm Output, and some improvements white
space handling.

Reviewers: rtrieu, dwmw2, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20009

llvm-svn: 273007
2016-06-17 16:06:17 +00:00
Simon Pilgrim 6a35e5ab97 [X86][SSE4A] Remove the GCCBuiltins from the movntsd/movntss intrinsic defs so we can emit native IR from clang.
Clang-side sibling commit to follow.

llvm-svn: 273002
2016-06-17 14:27:38 +00:00
Adam Nemet e7709d92ba [LLE] Don't hard-code the name of the preheader in test
Turns out a didn't get this right because symbolic stride versioning
changes the name.  Relax the matching.

llvm-svn: 272992
2016-06-17 09:13:15 +00:00