Commit Graph

28542 Commits

Author SHA1 Message Date
Reed Kotler aa150ed780 Add bulk of returning of values to Mips fast-isel
Summary:
Implement the bulk of returning values in Mips fast-isel



Test Plan:
reatabi.ll

Passes test-suite at -O0,-O2 and with mips32r2 and mips32r1.





Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D5920

llvm-svn: 228958
2015-02-12 21:05:12 +00:00
Bjorn Steinbrink 6f972a13f6 Fix a crash in the assumption cache when inlining indirect function calls
Summary:
Instances of the AssumptionCache are per function, so we can't re-use
the same AssumptionCache instance when recursing in the CallAnalyzer to
analyze a different function. Instead we have to pass the
AssumptionCacheTracker to the CallAnalyzer so it can get the right
AssumptionCache on demand.

Reviewers: hfinkel

Subscribers: llvm-commits, hans

Differential Revision: http://reviews.llvm.org/D7533

llvm-svn: 228957
2015-02-12 21:04:22 +00:00
Benjamin Kramer e8cb17f282 Update test case.
llvm-svn: 228956
2015-02-12 20:40:19 +00:00
Benjamin Kramer 443c7967ea InstCombine: Allow folding of xor into icmp by changing the predicate for vectors
The loop vectorizer can create this pattern.

llvm-svn: 228954
2015-02-12 20:26:46 +00:00
Michael Zolotukhin 56c918bcce Add a testcase for r228432.
llvm-svn: 228951
2015-02-12 19:57:24 +00:00
Rafael Espindola 203c5b9f39 On ELF, put PIC jump tables in a non executable section.
Fixes PR22558.

llvm-svn: 228939
2015-02-12 17:46:49 +00:00
Rafael Espindola 29786d4c16 Put each jump table in an independent section if the function is too.
This allows the linker to GC both, fixing pr22557.

llvm-svn: 228937
2015-02-12 17:16:46 +00:00
James Molloy e805ad95dc [LoopRerolling] Be more forgiving with instruction order.
We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.

llvm-svn: 228931
2015-02-12 15:54:14 +00:00
Michael Kuperstein f4d1aca568 [X86] Call frame optimization - allow stack-relative movs to be folded into a push
Since we track esp precisely, there's no reason not to allow this.

llvm-svn: 228924
2015-02-12 14:17:35 +00:00
Andrea Di Biagio b08862c4f0 [TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is 'free' for the target.
Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.

Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.

This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.

Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.

Differential Revision: http://reviews.llvm.org/D7585

llvm-svn: 228923
2015-02-12 14:17:24 +00:00
Asiri Rathnayake e045e378ad ARM: Fix another regression introduced in r223113
The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
  mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
\#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.

Also noted that the original range check was note tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.

Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599
llvm-svn: 228920
2015-02-12 13:37:28 +00:00
Dmitry Vyukov 2e8d82e607 tsan: do not instrument not captured values
I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example:

$ ls -l old/modules_unittests new/modules_unittests
-rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests
-rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests
$ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
239871
$ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
148365

http://reviews.llvm.org/D7069

llvm-svn: 228917
2015-02-12 09:55:28 +00:00
Elena Demikhovsky d2cb3c8876 AVX-512: Fixed the "test" operation for i1 type
Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits.
KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits.
I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB.

There are some cases where i1 is in the mask register and the upper bits are already zeroed.
Then KORTESTW is the better solution, but it is subject for optimization.
Meanwhile, I'm fixing the correctness issue.

llvm-svn: 228916
2015-02-12 08:40:34 +00:00
Michael Kuperstein db95d04be4 [X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes
This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size.
We go over all calls in the MachineFunction and compute:
a) For each callsite that can not use pushes, the penalty of not having a reserved call frame.
b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).

Differential Revision: http://reviews.llvm.org/D7561

llvm-svn: 228915
2015-02-12 08:36:35 +00:00
Ahmed Bougacha 24433a7005 [CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x).
We used to do this DAG combine, but it's not always correct:
If the first fp_round isn't a value preserving truncation, it might
introduce a tie in the second fp_round, that wouldn't occur in the
single-step fp_round we want to fold to.
In other words, double rounding isn't the same as rounding.

Differential Revision: http://reviews.llvm.org/D7571

llvm-svn: 228911
2015-02-12 06:15:29 +00:00
George Burgess IV 33305e7280 Fixed a bug where CFLAA would crash the compiler.
We would crash if we couldn't locate a Function that either Location's
Value belonged to. Now we just print out a debug message and return 
conservatively.

llvm-svn: 228901
2015-02-12 03:07:07 +00:00
Chandler Carruth 63aaa98d94 [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

llvm-svn: 228899
2015-02-12 02:30:56 +00:00
Hal Finkel 7a0516ea66 [PowerPC] Mark jumps as expensive (using using CR bits)
On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break of complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".

P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
	-18.2821% +/- 8.06481%

llvm-svn: 228895
2015-02-12 01:02:52 +00:00
Tim Northover 02438033e8 DeadArgElim: aggregate Return assessment properly.
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.

llvm-svn: 228885
2015-02-11 23:13:11 +00:00
David Majnemer 3df3c61e91 MC, COFF: Align section contents to a four byte boundary
llvm-svn: 228879
2015-02-11 22:22:30 +00:00
Mehdi Amini 9730116bd6 Reassociate: cannot negate a INT_MIN value
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7286

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 228872
2015-02-11 19:54:44 +00:00
Tom Stellard 0648588e7d R600/SI: Disable subreg liveness
This is temporary while we try to fix a crash in the register coalescer.

llvm-svn: 228861
2015-02-11 18:24:53 +00:00
Simon Pilgrim 2a9a745328 [X86][SSE] Added dual vector truncation tests.
llvm-svn: 228857
2015-02-11 18:14:35 +00:00
Tom Stellard 502ef4e791 R600/SI: Fix -march in test
llvm-svn: 228848
2015-02-11 17:11:48 +00:00
Jan Wen Voung c11b45a2ea Gold-plugin: Broaden scope of get/release_input_file to scope of Module.
Summary:
Move calls to get_input_file and release_input_file out of
getModuleForFile(). Otherwise release_input_file may end up
unmapping a view of the file while the view is still being
used by the Module (on 32-bit hosts).

Fix for PR22482.

Test Plan: Add test using --no-map-whole-files.

Reviewers: rafael, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7539

llvm-svn: 228842
2015-02-11 16:12:50 +00:00
Sanjay Patel afe251649b fixed to test features, not CPUs
llvm-svn: 228836
2015-02-11 15:00:41 +00:00
Sanjay Patel b53d82cbc5 fixed to test features, not CPUs
llvm-svn: 228835
2015-02-11 15:00:19 +00:00
Sanjay Patel 8b88bc91bd fixed to test features, not CPUs
llvm-svn: 228834
2015-02-11 14:58:25 +00:00
Marek Olsak fa6607d0b6 R600/SI: Enable a lot of existing tests for VI (squashed commits)
This is a union of these commits:

* R600/SI: Enable more tests for VI which need no changes

* R600/SI: Enable V_BCNT tests for VI
    Differences:
    - v_bcnt_..._e32 -> _e64
    - s_load_dword* inline offset is in bytes instead of dwords

* R600/SI: Enable all tests for VI which use S_LOAD_DWORD
    The inline offset is changed from dwords to bytes.

* R600/SI: Enable LDS tests for VI
    Differences:
    - the s_load_dword inline offset changed from dwords to bytes
    - the tests checked very little on CI, so they have been fixed to check all
      instructions that "SI" checked

* R600/SI: Enable lshr tests for VI

* R600/SI: Fix divrem64 tests
    - "v_lshl_64" was missing "b" before "64"
    - added VI-NOT checks

* R600/SI: Enable the SI.tid test for VI

* R600/SI: Enable the frem test for VI
    Also, the frem_f64 checking is added for CI-VI.

* R600/SI: Add VI tests for rsq.clamped

llvm-svn: 228830
2015-02-11 14:26:46 +00:00
Andrea Di Biagio 2a0e435db1 [TTI] Improved cost heuristic for cttz/ctlz calls.
This patch is a follow-up of r228826 (see code-review: D7506).

Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we 
have to fix the cost heuristic for intrinsic calls to cttz/ctlz.

This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl
queries TLI to check if a call to cttz/ctlz is cheap for the target.

Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86,
SimplifyCFG only speculates a call to cttz/ctlz if it is cheap.

Differential Revision: http://reviews.llvm.org/D7554

llvm-svn: 228829
2015-02-11 14:22:18 +00:00
James Molloy 99f06df8ac Make buildbots better.
This testcase change was associated incorrectly to a followup commit in my git tree, not the base commit. Sorry!

llvm-svn: 228827
2015-02-11 12:24:09 +00:00
James Molloy 7c336576a5 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
Daniel Sanders a19216c8f4 [mips] Merge disassemblers into a single implementation.
Summary:
Currently we have Mips32 and Mips64 disassemblers and this causes the target
triple to affect the disassembly despite all the relevant information being in
the ELF header. These implementations do not need to be separate.

This patch merges them together such that the appropriate tables are checked
for the subtarget (e.g. Mips64 is checked when GP64 is enabled).

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7498

llvm-svn: 228825
2015-02-11 11:28:56 +00:00
James Molloy f147359376 [LoopReroll] Introduce the concept of DAGRootSets.
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:

   x[i*3+0] = y1
   x[i*3+1] = y2
   x[i*3+2] = y3

   Base instruction -> i*3
                    +---+----+
                   /    |     \
               ST[y1]  +1     +2  <-- Roots
                        |      |
                      ST[y2] ST[y3]

There may be multiple DAGRootSets, for example:

   x[i*2+0] = ...   (1)
   x[i*2+1] = ...   (1)
   x[i*2+4] = ...   (2)
   x[i*2+5] = ...   (2)
   x[(i+1234)*2+5678] = ... (3)
   x[(i+1234)*2+5679] = ... (3)

This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.

llvm-svn: 228821
2015-02-11 09:19:47 +00:00
David Majnemer fad5a31160 AsmParser: Validate alloca's type
An alloca's type should be weird things like metadata.

llvm-svn: 228820
2015-02-11 09:13:11 +00:00
David Majnemer 04578fcfa5 DataLayout: Report when the preferred alignment is less than the ABI
llvm-svn: 228819
2015-02-11 09:13:09 +00:00
David Majnemer d7677e7a8d Verifier: Check for null operands in !llvm.module.flags
llvm-svn: 228818
2015-02-11 09:13:06 +00:00
David Majnemer 9fd8cdc009 Verifier: Make sure !llvm.ident's operand isn't null
llvm-svn: 228815
2015-02-11 08:23:20 +00:00
David Majnemer 300745351f AsmParser: Don't crash when insertvalue has bad operands
llvm-svn: 228813
2015-02-11 07:43:58 +00:00
Reid Kleckner b3775df32e Fix invalid LLVM IR in PruneEH tests
llvm-svn: 228786
2015-02-11 02:06:47 +00:00
Reid Kleckner 96d011315a Don't promote asynch EH invokes of nounwind functions to calls
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.

Also add some landingpads to invalid LLVM IR test cases that lack them.

Over-the-shoulder reviewed by David Majnemer.

llvm-svn: 228782
2015-02-11 01:23:16 +00:00
Tom Stellard 94b7231740 R600/SI: Store immediate offsets > 12-bits in soffset
This will save us from having to extend these offsets to 64-bits
and storing them in a pair of vgprs.

llvm-svn: 228776
2015-02-11 00:34:35 +00:00
Adrian Prantl eb94727098 Add the missing testcase for r228764.
llvm-svn: 228766
2015-02-10 23:32:56 +00:00
Petar Jovanovic d9f52043b1 Fix makeLibCall argument (signed) in SoftenFloatRes_XINT_TO_FP function
The isSigned argument of makeLibCall function was hard-coded to false
(unsigned). This caused zero extension on MIPS64 soft float.
As the result SingleSource/Benchmarks/Stanford/FloatMM test and
SingleSource/UnitTests/2005-07-17-INT-To-FP test failed. 
The solution was to use the proper argument.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D7292

llvm-svn: 228765
2015-02-10 23:30:14 +00:00
David Majnemer 44a5f22fbc EarlyCSE: Add check lines for test added in r228760
llvm-svn: 228761
2015-02-10 23:11:02 +00:00
David Majnemer 7679300d93 EarlyCSE: It isn't safe to CSE across synchronization boundaries
This fixes PR22514.

llvm-svn: 228760
2015-02-10 23:09:43 +00:00
David Majnemer ca19485f08 X86: @llvm.frameaddress should defer to SelectionDAG for Win CFI
llvm-svn: 228754
2015-02-10 22:00:34 +00:00
David Majnemer 13d0b11d7b X86: Make @llvm.frameaddress work correctly with Windows unwind codes
Simply loading or storing the frame pointer is not sufficient for
Windows targets.  Instead, create a synthetic frame object that we will
lower later.  References to this synthetic object will be replaced with
the correct reference to the frame address.

llvm-svn: 228748
2015-02-10 21:22:05 +00:00
Daniel Jasper 1d966eff08 Fix overly prescriptive test that broken on Mac after r228725.
llvm-svn: 228742
2015-02-10 20:49:05 +00:00
Andrew Kaylor 78b53dbcc1 Adding support for llvm.eh.begincatch and llvm.eh.endcatch intrinsics and beginning the documentation of native Windows exception handling.
Differential Revision: http://reviews.llvm.org/D7398

llvm-svn: 228733
2015-02-10 19:52:43 +00:00
Tim Northover 43c0d2db50 DeadArgElim: arguments affect all returned sub-values by default.
Unless we meet an insertvalue on a path from some value to a return, that value
will be live if *any* of the return's components are live, so all of those
components must be added to the MaybeLiveUses.

Previously we were deleting arguments if sub-value 0 turned out to be dead.

llvm-svn: 228731
2015-02-10 19:49:18 +00:00
Bill Schmidt 82f1c775a0 [PowerPC] Fix reverted patch r227976 to avoid register assignment issues
See full discussion in http://reviews.llvm.org/D7491.

We now hide the add-immediate and call instructions together in a
separate pseudo-op, which is tagged to define GPR3 and clobber the
call-killed registers.  The PPCTLSDynamicCall pass prior to RA now
expands this op into the two separate addi and call ops, with explicit
definitions of GPR3 on both instructions, and explicit clobbers on the
call instruction.  The pass is now marked as requiring and preserving
the LiveIntervals and SlotIndexes analyses, and fixes these up after
the replacement sequences are introduced.

Self-hosting has been verified on LE P8 and BE P7 with various
optimization levels, etc.  It has also been verified with the
--no-tls-optimize flag workaround removed.

llvm-svn: 228725
2015-02-10 19:09:05 +00:00
David Majnemer a7d908eb2b X86: Emit Win64 SaveXMM opcodes at the right offset in the right order
Walk the instructions marked FrameSetup and consider any stores of XMM
registers to the stack as needing a SaveXMM opcode.

This fixes PR22521.

Differential Revision: http://reviews.llvm.org/D7527

llvm-svn: 228724
2015-02-10 19:01:47 +00:00
Hal Finkel 57c6ac5e41 [PowerPC] Support the (old) cntlz instruction alias
Some old assembly code uses the cntlz alias for cntlzw, binutils supports this,
and we should too. Fixes PR22519.

llvm-svn: 228719
2015-02-10 18:45:02 +00:00
Michael Zolotukhin 03e3518c91 Add a test case for new unrolling heuristics.
THe heuristics were added in r228265 and r228434.

llvm-svn: 228713
2015-02-10 17:54:54 +00:00
Zoran Jovanovic 416886793f [mips][microMIPS] Implement movep instruction
Differential Revision: http://reviews.llvm.org/D7465

llvm-svn: 228703
2015-02-10 16:36:20 +00:00
Paul Robinson 848cf6aa3a Explicitly initialize a flag in a default constructor.
Works around a Visual C++ issue.

Patch by Douglas Yung!

llvm-svn: 228699
2015-02-10 15:30:02 +00:00
Bradley Smith e997b45076 [ARM] Add armv6s[-]m as an alias to armv6[-]m
llvm-svn: 228696
2015-02-10 15:15:08 +00:00
Simon Pilgrim d142ab7d08 [X86][AVX2] Missing AVX2 memory folding instructions
Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and verpmq patterns)

Differential Revision: http://reviews.llvm.org/D7492

llvm-svn: 228688
2015-02-10 13:22:57 +00:00
Jozef Kolek e76eb41c21 [mips][microMIPS] Add disassembler tests for 16-bit instructions BREAK16 and SDBBP16
Differential Revision: http://reviews.llvm.org/D7443

llvm-svn: 228687
2015-02-10 13:20:51 +00:00
Simon Pilgrim cd32254a35 [X86][XOP] Added XOP memory folding patterns + tests
This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc.

Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources.

Differential Revision: http://reviews.llvm.org/D7484

llvm-svn: 228685
2015-02-10 12:57:17 +00:00
Jozef Kolek d68d424abf [mips][microMIPS] Fix disassembling of 16-bit microMIPS instructions LWM16 and SWM16
Differential Revision: http://reviews.llvm.org/D7436

llvm-svn: 228683
2015-02-10 12:41:13 +00:00
Andrea Di Biagio 62622d2396 [X86][FastIsel] Avoid introducing legacy SSE instructions if the target has AVX.
This patch teaches X86FastISel how to select AVX instructions for scalar
float/double convert operations.

Before this patch, X86FastISel always selected legacy SSE instructions
for FPExt (from float to double) and FPTrunc (from double to float).

For example:
\code
  define double @foo(float %f) {
    %conv = fpext float %f to double
    ret double %conv
  }
\end code

Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is
legacy SSE:
  cvtss2sd %xmm0, %xmm0

With this patch, X86FastIsel selects a VCVTSS2SDrr instead:
  vcvtss2sd %xmm0, %xmm0, %xmm0

Added test fast-isel-fptrunc-fpext.ll to check both the register-register and
the register-memory float/double conversion variants.

Differential Revision: http://reviews.llvm.org/D7438

llvm-svn: 228682
2015-02-10 12:04:41 +00:00
Chandler Carruth 2496910325 Revert r228556: InstCombine: propagate nonNull through assume
This commit isn't using the correct context, and is transfoming calls
that are operands to loads rather than calls that are operands to an
icmp feeding into an assume. I've replied on the original review thread
with a very reduced test case and some thoughts on how to rework this.

llvm-svn: 228677
2015-02-10 08:07:32 +00:00
Nick Lewycky 1cbc13a928 Remove non-test files that appear to have been accidentally committed in r228641.
llvm-svn: 228657
2015-02-10 02:39:17 +00:00
Chandler Carruth b65d61a2e8 [x86] Fix PR22524: the DAG combiner was incorrectly handling illegal
nodes when folding bitcasts of constants.

We can't fold things and then check after-the-fact whether it was legal.
Once we have formed the DAG node, arbitrary other nodes may have been
collapsed to it. There is no easy way to go back. Instead, we need to
test for the specific folding cases we're interested in and ensure those
are legal first.

This could in theory make this less powerful for bitcasting from an
integer to some vector type, but AFAICT, that can't actually happen in
the SDAG so its fine. Now, we *only* whitelist specific int->fp and
fp->int bitcasts for post-legalization folding. I've added the test case
from the PR.

(Also as a note, this does not appear to be in 3.6, no backport needed)

llvm-svn: 228656
2015-02-10 02:25:56 +00:00
David Majnemer 93c22a45be X86: Emit an ABI compliant prologue and epilogue for Win64
Win64 has specific contraints on what valid prologues and epilogues look
like.  This constraint is born from the flexibility and descriptiveness
of Win64's unwind opcodes.

Prologues previously emitted by LLVM could not be represented by the
unwind opcodes, preventing operations powered by stack unwinding to
successfully work.

Differential Revision: http://reviews.llvm.org/D7520

llvm-svn: 228641
2015-02-10 00:57:42 +00:00
Adrian Prantl 34e7590e0d Debug info: When updating debug info during SROA, do not emit debug info
for any padding introduced by SROA. In particular, do not emit debug info
for an alloca that represents only the padding introduced by a previous
iteration.

Fixes PR22495.

llvm-svn: 228632
2015-02-09 23:57:22 +00:00
Adrian Prantl 27bd01f71c Debug info: Use DW_OP_bit_piece instead of DW_OP_piece in the
intermediate representation. This
- increases consistency by using the same granularity everywhere
- allows for pieces < 1 byte
- DW_OP_piece didn't actually allow storing an offset.

Part of PR22495.

llvm-svn: 228631
2015-02-09 23:57:15 +00:00
Ramkumar Ramachandra 2e4b9e0a37 PlaceSafepoints: modernize gc.result.* -> gc.result
Differential Revision: http://reviews.llvm.org/D7516

llvm-svn: 228625
2015-02-09 23:00:40 +00:00
Duncan P. N. Exon Smith b407bb2789 DebugInfo: Remove DW_TAG_constant
Remove handling for DW_TAG_constant.  We started producing it in
r110656, but reverted that in r110876 without dropping the support.
Finish the job.

llvm-svn: 228623
2015-02-09 22:48:04 +00:00
Philip Reames 0edbf2e407 Introduce more tests for PlaceSafepoints
These tests the two optimizations for backedge insertion currently implemented and the split backedge flag which is currently off by default.

llvm-svn: 228617
2015-02-09 22:10:15 +00:00
Philip Reames 5fc82fd6b5 Minor test cleanup
a) add gc attribute
b) remove unused param

llvm-svn: 228612
2015-02-09 21:50:31 +00:00
Ramkumar Ramachandra 82ab65c7cd MemDerefPrinter: Require DataLayoutPass for higher accuracy
Without a valid data layout, deferenceable(N) doesn't get parsed or
propagated. Since this is the key item we are testing, add a dependency
on the pass.

Differential Revision: http://reviews.llvm.org/D7508

llvm-svn: 228611
2015-02-09 21:50:03 +00:00
Philip Reames b1ed02f728 Add basic tests for PlaceSafepoints
This is just adding really simple tests which should have been part of the original submission.  When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission.  I fixed two such bugs in this submission.

llvm-svn: 228610
2015-02-09 21:48:05 +00:00
Ramkumar Ramachandra a7343d65f4 isDereferenceablePointer: look through gc.relocate calls
While a theoretical GC might change dereferenceability on collection,
there is no such known collector and no need to account for the case
with a flag yet.

Differential Revision: http://reviews.llvm.org/D7454

llvm-svn: 228606
2015-02-09 21:08:03 +00:00
Colin LeMahieu 955c4ff9c3 [Hexagon] Factoring classes out of store patterns.
llvm-svn: 228602
2015-02-09 20:33:46 +00:00
Sanjoy Das bf5d870dfa Bugfix: SCEV incorrectly marks certain add recurrences as nsw
When creating a scev for sext({X,+,Y}), scev checks if the expression
is equivalent to {sext X,+,zext Y}.  If it can prove that, it also
tags the original {X,+,Y} as <nsw>, which is not correct.

In the test case I run `-scalar-evolution` twice because the bug
manifests only once SCEV has run through and seen the `sext`
expressions (and then does a in-place mutation on {X,+,Y}).

Differential Revision: http://reviews.llvm.org/D7495

llvm-svn: 228586
2015-02-09 18:34:55 +00:00
Sanjay Patel 546f26acf3 fixed to test features, not CPUs
llvm-svn: 228581
2015-02-09 17:17:09 +00:00
Kit Barton 0b0cdb1cd4 This change implements the following three logical vector operations:
veqv (vector equivalence)
vnand
vorc
I increased the AddedComplexity for these instructions to 500 to ensure they are generated instead of issuing other VSX instructions.


Phabricator review: http://reviews.llvm.org/D7469

llvm-svn: 228580
2015-02-09 17:03:18 +00:00
Johannes Doerfert 2683e5676c Allow ScalarEvolution to catch more min/max cases
For the attached test case different types are used in the ICmpInst
  and SelectInst that represent the min/max expressions. However, if the
  ICmpInst type is smaller a comparison with the sign/zero extended
  operands would have yielded the same result. This situation might
  arise after the instruction combination pass was applied.

  Differential Revision: http://reviews.llvm.org/D7338

llvm-svn: 228572
2015-02-09 12:34:23 +00:00
Akira Hatanaka 8d3cb829ce Fix a bug in DemoteRegToStack where a reload instruction was inserted into the
wrong basic block.

This would happen when the result of an invoke was used by a phi instruction
in the invoke's normal destination block. An instruction to reload the invoke's
value would get inserted before the critical edge was split and a new basic
block (which is the correct insertion point for the reload) was created. This
commit fixes the bug by splitting the critical edge before all the reload
instructions are inserted.

Also, hoist up the code which computes the insertion point to the only place
that need that computation.

rdar://problem/15978721

llvm-svn: 228566
2015-02-09 06:38:23 +00:00
David Majnemer 1de3094d78 MC: Calculate intra-section symbol differences correctly for COFF
This fixes PR22060.

llvm-svn: 228565
2015-02-09 06:31:31 +00:00
Tim Northover 705d2af9e1 DeadArgElim: fix mismatch in accounting of array return types.
Some parts of DeadArgElim were only considering the individual fields
of StructTypes separately, but others (where insertvalue &
extractvalue instructions occur) also looked into ArrayTypes.

This one is an actual bug; the mismatch can lead to an argument being
considered used by a return sub-value that isn't being tracked (and
hence is dead by default). It then gets incorrectly eliminated.

llvm-svn: 228559
2015-02-09 01:21:00 +00:00
Tim Northover 854c927de5 DeadArgElim: assess uses of entire return value aggregate.
Previously, a non-extractvalue use of an aggregate return value meant
the entire return was considered live (the algorithm gave up
entirely). This was correct, but conservative. It's better to actually
look at that Use, making the analysis results apply to all sub-values
under consideration.

E.g.

  %val = call { i32, i32 } @whatever()
  [...]
  ret { i32, i32 } %val

The return is using the entire aggregate (sub-values 0 and 1). We can
still simplify @whatever if we can prove that this return is itself
unused.

Also unifies the logic slightly between aggregate and non-aggregate
cases..

llvm-svn: 228558
2015-02-09 01:20:53 +00:00
Ramkumar Ramachandra a021ee62ca InstCombine: propagate nonNull through assume
Make assume (load (call|invoke) != null) set nonNull return attribute
for the call and invoke. Also include tests.

Differential Revision: http://reviews.llvm.org/D7107

llvm-svn: 228556
2015-02-09 01:13:13 +00:00
Sanjoy Das f2e931cae9 Bugfix: ScalarEvolution incorrectly assumes that the start of certain
add recurrences don't overflow.

This change makes the optimization more restrictive.  It still assumes
that an overflowing `add nsw` is undefined behavior; and this change
will need revisiting once we have a consistent semantics for poison
values.

Differential Revision: http://reviews.llvm.org/D7331

llvm-svn: 228552
2015-02-08 22:52:17 +00:00
Sanjay Patel baf0a2415c fix test attributes; this is an SSE2 test, not a Nehalem test
llvm-svn: 228546
2015-02-08 21:14:27 +00:00
Sanjay Patel e6eed52325 fix test attributes; this is an x86-64 test, not a Nehalem test
llvm-svn: 228545
2015-02-08 21:10:40 +00:00
Sanjay Patel 5cf03374a1 fix test attributes; these are SSE2 tests, not Nehalem tests
llvm-svn: 228544
2015-02-08 21:05:03 +00:00
Sanjay Patel 9be09a3617 fix test attributes; these are SSE2 tests, not Nehalem tests
llvm-svn: 228541
2015-02-08 20:50:58 +00:00
Sanjay Patel ff9dec22ba fix test attributes; these are x86-64 tests, not Nehalem tests
llvm-svn: 228536
2015-02-08 20:05:53 +00:00
Sanjay Patel 972425d221 fix test attributes; these are MMX tests, not Nehalem tests
llvm-svn: 228535
2015-02-08 20:01:12 +00:00
Sanjay Patel c871fe746b fix test attributes; these are SSE2 tests, not Nehalem tests
llvm-svn: 228534
2015-02-08 19:50:55 +00:00
Sanjay Patel 3fa03da6f8 generalize test; nothing Nehalem-specific here
llvm-svn: 228532
2015-02-08 19:38:25 +00:00
Simon Pilgrim e490385843 [X86][AVX2] AVX2 broadcast + permute memory folding tests.
llvm-svn: 228528
2015-02-08 18:33:13 +00:00
Bjorn Steinbrink 5ec7522771 Correctly combine alias.scope metadata by a union instead of intersecting
Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7490

llvm-svn: 228525
2015-02-08 17:07:14 +00:00
Tim Northover 45aa89c925 ARM & AArch64: teach LowerVSETCC that output type size may differ from input.
While various DAG combines try to guarantee that a vector SETCC
operation will have the same output size as input, there's nothing
intrinsic to either creation or LegalizeTypes that actually guarantees
it, so the function needs to be ready to handle a mismatch.

Fortunately this is easy enough, just extend or truncate the naturally
compared result.

I couldn't reproduce the failure in other backends that I know have
SIMD, so it's probably only an issue for these two due to shared
heritage.

Should fix PR21645.

llvm-svn: 228518
2015-02-08 00:50:47 +00:00
Craig Topper 1d472db8cc [X86] Add GETSEC instruction.
llvm-svn: 228514
2015-02-07 23:36:36 +00:00
Simon Pilgrim 7440699267 [X86][AVX2] AVX2 integer stack folding tests.
This adds tests for the remaining AVX2 instructions that currently support memory folding.

llvm-svn: 228513
2015-02-07 23:28:16 +00:00