Commit Graph

28365 Commits

Author SHA1 Message Date
Michael Kuperstein e86aa9a8a4 Revert r227728 due to bad line endings.
llvm-svn: 227746
2015-02-01 16:15:07 +00:00
Hal Finkel e6698d5305 [PowerPC] Make r2 allocatable on PPC64/ELF for some leaf functions
The TOC base pointer is passed in r2, and we normally reserve this register so
that we can depend on it being there. However, for leaf functions, and
specifically those leaf functions that don't do any TOC access of their own
(which is generally due to accessing the constant pool, using TLS, etc.),
we can treat r2 as an ordinary callee-saved register (it must be callee-saved
because, for local direct calls, the linker will not insert any save/restore
code).

The allocation order has been changed slightly for PPC64/ELF systems to put r2
at the end of the list (while leaving it near the beginning for Darwin systems
to prevent unnecessary output changes). While r2 is allocatable, using it still
requires spill/restore traffic, and thus comes at the end of the list.

llvm-svn: 227745
2015-02-01 15:03:28 +00:00
Michael Kuperstein bd57186c76 [X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a 
reserved call frame), and perform rudimentary call folding. It still doesn't 
have a heuristic, so it is enabled only for optsize/minsize, with stack 
alignment <= 8, where it ought to be a fairly clear win.

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227728
2015-02-01 11:44:44 +00:00
Chandler Carruth fdffd87d68 [PM] Port SimplifyCFG to the new pass manager.
This should be sufficient to replace the initial (minor) function pass
pipeline in Clang with the new pass manager. I'll probably add an (off
by default) flag to do that just to ensure we can get extra testing.

llvm-svn: 227726
2015-02-01 11:34:21 +00:00
Chandler Carruth e8c686aa86 [PM] Port EarlyCSE to the new pass manager.
I've added RUN lines both to the basic test for EarlyCSE and the
target-specific test, as this serves as a nice test that the TTI layer
in the new pass manager is in fact working well.

llvm-svn: 227725
2015-02-01 10:51:23 +00:00
Chandler Carruth 9f8d9b613c [PM] Teach the module-to-function adaptor to not run function passes
over declarations.

This is both quite unproductive and causes things to crash, for example
domtree would just assert.

I've added a declaration and a domtree run to the basic high-level tests
for the new pass manager.

llvm-svn: 227724
2015-02-01 10:47:25 +00:00
Chandler Carruth e038552c8a [PM] Port TTI to the new pass manager, introducing a TargetIRAnalysis to
produce it.

This adds a function to the TargetMachine that produces this analysis
via a callback for each function. This in turn faves the way to produce
a *different* TTI per-function with the correct subtarget cached.

I've also done the necessary wiring in the opt tool to thread the target
machine down and make it available to the pass registry so that we can
construct this analysis from a target machine when available.

llvm-svn: 227721
2015-02-01 10:11:22 +00:00
Elena Demikhovsky 534a99d878 AVX2: Added 2 more tests for gather intrinsics.
llvm-svn: 227718
2015-02-01 08:52:15 +00:00
Jingyue Wu 0220df0dfd [NVPTX] Emit .pragma "nounroll" for loops marked with nounroll
Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable is not
unrolled by CUDA driver, we need to emit .pragma "nounroll" at the
header of that loop.

This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.

Test Plan: test/CodeGen/NVPTX/nounroll.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7041

llvm-svn: 227703
2015-02-01 02:27:45 +00:00
Adrian Prantl 152ac396db Fix PR22393. When recursively replacing an aggregate with a smaller
aggregate or scalar, the debug info needs to refer to the absolute offset
(relative to the entire variable) instead of storing the offset inside
the smaller aggregate.

llvm-svn: 227702
2015-02-01 00:58:04 +00:00
Adrian Prantl 02d6f22c93 Add missing tags.
llvm-svn: 227701
2015-02-01 00:57:31 +00:00
Matt Arsenault 08ad328ae2 R600/SI: Only select cvt_flr/cvt_rpi with no NaNs.
These have different behavior from cvt_i32_f32 on NaN.

llvm-svn: 227693
2015-01-31 21:28:13 +00:00
Simon Pilgrim 9c76b47469 [X86][SSE] Shuffle mask decode support for zero extend, scalar float/double moves and integer load instructions
This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd).

Also adds shuffle mask decodes for integer loads (movd/movq).

Differential Revision: http://reviews.llvm.org/D7228

llvm-svn: 227688
2015-01-31 14:09:36 +00:00
Saleem Abdulrasool 9fd41316ad llvm-readobj: add a test case for ARM_MOV32(T) base relocation
Add a trivial binary (int main() { return 0; }) built for Windows on ARM to
ensure that we can correctly identify ARM_MOV32(T) base relocations.  Addresses
post-commit review comments.

llvm-svn: 227673
2015-01-31 04:46:50 +00:00
Chandler Carruth 705b185f90 [PM] Change the core design of the TTI analysis to use a polymorphic
type erased interface and a single analysis pass rather than an
extremely complex analysis group.

The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.

I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.

There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.

The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.

Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.

The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]

Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:

1) Improving the TargetMachine interface by having it directly return
   a TTI object. Because we have a non-pass object with value semantics
   and an internal type erasure mechanism, we can narrow the interface
   of the TargetMachine to *just* do what we need: build and return
   a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
   This will include splitting off a minimal form of it which is
   sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
   target machine for each function. This may actually be done as part
   of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
   easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
   easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
   just a bit messy and exacerbating the complexity of implementing
   the TTI in each target.

Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.

Differential Revision: http://reviews.llvm.org/D7293

llvm-svn: 227669
2015-01-31 03:43:40 +00:00
Saleem Abdulrasool fb8a66fbc5 ARM: support stack probe size on Windows on ARM
Now that -mstack-probe-size is piped through to the backend via the function
attribute as on Windows x86, honour the value to permit handling of non-default
values for stack probes.  This is needed /Gs with the clang-cl driver or
-mstack-probe-size with the clang driver when targeting Windows on ARM.

llvm-svn: 227667
2015-01-31 02:26:37 +00:00
Kevin Enderby f6d258537d Add the -section option to llvm-objdump used with -macho that takes the argument
segname,sectname to specify a Mach-O section to print.  The printing is based on
the section type or section attributes.

The printing of the module initialization and termination section types is printed
with this change.  Printing of other section types will be added next.

llvm-svn: 227649
2015-01-31 00:37:11 +00:00
David Blaikie c52b4944ad Add PPC test for r227481, but XFAIL because this is actually more work than it appeared to be.
Same sort of bug as on ARM where the cmp+branch are lowered to br_cc
(choosing the branch's debugloc for the br_cc's debugloc) then expanded
out to a cmp and a br, but both using the debug loc of the br_cc, thus
losing fidelity.

llvm-svn: 227645
2015-01-30 23:52:19 +00:00
Ahmed Bougacha aab9677f65 [AArch64] Add a few more DUP testcases. NFC.
Also, don't lie about testing index 0.

llvm-svn: 227642
2015-01-30 23:41:15 +00:00
Philip Reames c2f99b421b Fix statepoint verifier tests to actually test verifier.
Patch by: Igor Laevsky

"Statepoint verifier tests were using wrong names for the statepoint and gc.relocate intrinsics. This change renames them to use correct names and fixes all uncovered issues."

Differential Revision: http://reviews.llvm.org/D7266

llvm-svn: 227636
2015-01-30 23:18:42 +00:00
Ahmed Bougacha 77b76542eb [AArch64] Robustize neon-scalar-copy.ll tests. NFC.
Some of those didn't even have run lines: they were removed
inadvertently during the Great Merge of 2014.

They used to check for DUPs, but now we go through W-regs?
Filed PR22418 for that potential regression.

For now, just make the tests explicit, so we now where we stand.

llvm-svn: 227635
2015-01-30 23:13:57 +00:00
David Blaikie 3ef249c9c0 Add ARM test for r227489, but XFAIL because this is actually more work than it appeared to be.
Also revert r227489 since it didn't actually fix the thing I thought I
was fixing (since the test case was targeting the wrong architecture
initially). The change might be correct & demonstrated by other test
cases, but it's not a priority for me to find those test cases right
now.

Filed PR22417 for the failure.

llvm-svn: 227632
2015-01-30 23:04:39 +00:00
Colin LeMahieu cefca69d72 [Hexagon] Adding vector shift instructions and tests.
llvm-svn: 227619
2015-01-30 21:58:46 +00:00
Ahmed Bougacha 2d80ea1939 [X86] Cleanup tabs in test vector-zext.ll. NFC.
Some tests have tabs, some don't.
In vector-[sz]ext.ll, space wins (well duh!).

llvm-svn: 227615
2015-01-30 21:41:28 +00:00
Colin LeMahieu cc4329b836 [Hexagon] Adding vector predicate instructions.
llvm-svn: 227613
2015-01-30 21:24:06 +00:00
Colin LeMahieu 26a537c743 [Hexagon] Adding vector permutation instructions and tests.
llvm-svn: 227612
2015-01-30 21:14:00 +00:00
Reid Kleckner a580b6ec67 Win64: Put a REX_W prefix on all TAILJMP* instructions
MSDN's x64 software conventions page says that this is one of the fixed
list of legal epilogues:
https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx

Presumably this is how the unwinder distinguishes epilogue jumps from
in-function control flow.

Also normalize the way we place "## TAILCALL" comments on such jumps.

llvm-svn: 227611
2015-01-30 21:03:31 +00:00
Colin LeMahieu 16f5e56703 [Hexagon] Adding vector multiplies. Cleaning up tests.
llvm-svn: 227609
2015-01-30 20:56:54 +00:00
Colin LeMahieu b84ec02296 [Hexagon] Adding XTYPE/COMPLEX instructions and cleaning up tests.
llvm-svn: 227607
2015-01-30 20:08:37 +00:00
Adrian Prantl 3e2659eb92 Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the
instruction and generalize it to optionally dereference the variable.
Follow-up to r227544.

llvm-svn: 227604
2015-01-30 19:37:48 +00:00
Saleem Abdulrasool 70fe588c88 ARM: further correct .fpu directive handling
If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type.  In
particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has
32 double precision registers), we would fail to reset the D16 feature, and
treat it as being equivalent to vfpv3-d16.

llvm-svn: 227603
2015-01-30 19:35:18 +00:00
Renato Golin e7c2b386f1 Revert "Add missing test from r227488"
This reverts commit r227489, since this is the real one failing the bots.

llvm-svn: 227602
2015-01-30 19:25:23 +00:00
Colin LeMahieu 21fbc94777 [Hexagon] Adding XTYPE/ALU vector instructions. Organizing test files.
llvm-svn: 227598
2015-01-30 19:13:26 +00:00
Saleem Abdulrasool 07b7c03805 ARM: improve caret diagnostics for invalid FPU name
In the case of an invalid FPU name, place the caret at the name rather than FPU
directive.

llvm-svn: 227595
2015-01-30 18:42:10 +00:00
Filipe Cabecinhas fcd044b692 Check bit widths before trying to get a type.
Added a test case for it.
Also added run lines for the test case in r227566.

Bugs found with afl-fuzz

llvm-svn: 227589
2015-01-30 18:13:50 +00:00
Colin LeMahieu 709c0a16bb [Hexagon] Adding a number of vector load variants and organizing tests.
llvm-svn: 227588
2015-01-30 18:09:44 +00:00
Saleem Abdulrasool 206d1160ce ARM: correct handling of .fpu directive
The FPU directive permits the user to switch the target FPU, enabling
instructions that would be otherwise unavailable.  However, when configuring the
new subtarget features, we would not enable the implied functions for newer
FPUs.  This would result in invalid rejection of valid input.  Ensure that we
inherit the implied FPU functionality when enabling newer versions of the FPU.
Fortunately, these are mostly hierarchical, unlike the CPUs.

Addresses PR22395.

llvm-svn: 227584
2015-01-30 17:58:25 +00:00
Toma Tabacu 8f6603a2dc [mips] Manually replace JAL pseudo-instructions with their JALR equivalent, instead of using InstAlias.
Summary:
This is needed by the .cprestore assembler directive.

This directive needs to be able to insert an LW instruction after every JALR replacement of a JAL pseudo-instruction
(and never after a JALR which has NOT been a result of a pseudo-instruction replacement).

The problem with using InstAlias for these is that after it replaces the pseudo-instruction, we can't find out if the resulting JALR instruction
was generated by an InstAlias or not, so we don't know whether or not to insert our LW instruction.

By replacing it manually, we know when the pseudo-instruction replacement happens and we can insert the LW instruction correctly.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D5601

llvm-svn: 227568
2015-01-30 11:18:50 +00:00
Filipe Cabecinhas d0858e1037 [bitcode reader] Fix an assert on invalid type tables
Bug found with afl-fuzz

llvm-svn: 227566
2015-01-30 10:57:58 +00:00
NAKAMURA Takumi 9469a592ba Introduce llvm/test/LTO/X86. LTO tests may be assumed as target-specific.
llvm-svn: 227564
2015-01-30 10:09:26 +00:00
NAKAMURA Takumi c05ebb424b Introduce llvm/test/LTO/ARM for arm-specific LTO test(s).
llvm-svn: 227563
2015-01-30 09:53:37 +00:00
Hao Liu 6bd67c08fa Move the target specific test case arbitrary-induction-step.ll to test/Transforms/LoopVectorize/AArch64 folder.
llvm-svn: 227561
2015-01-30 07:33:31 +00:00
Hao Liu 8de4f8b1b5 [LoopVectorize] Induction variables: support arbitrary constant step.
Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support
arbitrary constant steps.
Initial patch by Alexey Volkov. Some bug fixes are added in the following version.

Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193

llvm-svn: 227557
2015-01-30 05:02:21 +00:00
Hao Liu e0335d77c3 [AArch64]Fix PR21675, a bug about lowering llvm.ctpop.i32. We should noot use "DAG.getUNDEF(MVT::v8i8)" to get all zero vector.
Patch by Wei-cheng Wang.

llvm-svn: 227550
2015-01-30 02:13:53 +00:00
Adrian Prantl 4d365250ec Fix PR22386. The inliner moves static allocas to the entry basic block
so we need to move the dbg.declare intrinsics that describe them, too.

llvm-svn: 227544
2015-01-30 01:55:25 +00:00
Akira Hatanaka 8fba18e958 [LTO] Scan all per-function subtargets when collecting runtime library names.
accumulateAndSortLibcalls in LTOCodeGenerator.cpp collects names of runtime
library functions which are used to identify user-defined functions that should
be protected. Previously, this function would only scan the TargetLowering
object belonging to the "main" subtarget for the library function names. This
commit changes it to scan all per-function subtargets.

Differential Revision: http://reviews.llvm.org/D7275

llvm-svn: 227533
2015-01-30 01:16:24 +00:00
Reid Kleckner ca9b9feb2c x86: Fix large model calls to __chkstk for dynamic allocas
In the large code model, we now put __chkstk in %r11 before calling it.

Refactor the code so that we only do this once. Simplify things by using
__chkstk_ms instead of __chkstk on cygming. We already use that symbol
in the prolog emission, and it simplifies our logic.

Second half of PR18582.

llvm-svn: 227519
2015-01-29 23:58:04 +00:00
Reid Kleckner dafc2ae1ad Update comments to use unreachable instead of llvm.trap, as implemented now
win64: Call __chkstk through a register with the large code model

Fixes half of PR18582. True dynamic allocas will still have a
CALL64pcrel32 which will fail.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D7267

llvm-svn: 227503
2015-01-29 22:33:00 +00:00
Colin LeMahieu 3c740a3614 [Hexagon] Organizing tests and adding a few missing jump instruction encodings.
llvm-svn: 227498
2015-01-29 21:47:15 +00:00
Colin LeMahieu bc63f42e0d [Hexagon] Adding missing instruction encodings and tests.
llvm-svn: 227495
2015-01-29 21:30:22 +00:00
Colin LeMahieu bd4770f915 [Hexagon] Adding alu vector instructions
llvm-svn: 227493
2015-01-29 21:09:30 +00:00
Sanjay Patel 4f07a56958 [GVN] don't propagate equality comparisons of FP zero (PR22376)
In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to allow some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.

This patch disallows propagating zero constants in equality comparisons. 
Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376

Differential Revision: http://reviews.llvm.org/D7257

llvm-svn: 227491
2015-01-29 20:51:49 +00:00
David Blaikie 0652066c8d Add missing test from r227488
llvm-svn: 227489
2015-01-29 20:25:46 +00:00
David Blaikie 24adfbc20c Refactor test to be reused across architectures
llvm-svn: 227487
2015-01-29 20:21:24 +00:00
David Blaikie 939ce9c79e Remove erroneous REQUIRES: object-emission for asm test.
llvm-svn: 227486
2015-01-29 20:17:15 +00:00
David Blaikie 06a3b92868 Missing test case for r227481
llvm-svn: 227485
2015-01-29 19:40:02 +00:00
Matt Arsenault 423bf3f64a R600/SI: Implement enableAggressiveFMAFusion
Add tests for the various combines. This should
always be at least cycle neutral on all subtargets for f64,
and faster on some. For f32 we should prefer selecting
v_mad_f32 over v_fma_f32.

llvm-svn: 227484
2015-01-29 19:34:32 +00:00
Colin LeMahieu 1610730faf [Hexagon] Deleting old variants of intrinsics and adding missing tests.
llvm-svn: 227474
2015-01-29 17:26:56 +00:00
Colin LeMahieu 860210bc49 [Hexagon] Adding CR intrinsic tests.
llvm-svn: 227463
2015-01-29 16:55:37 +00:00
Tom Stellard 83f0bcef7a R600/SI: Define a schedule model and enable the generic machine scheduler
The schedule model is not complete yet, and could be improved.

llvm-svn: 227461
2015-01-29 16:55:25 +00:00
Robert Lougher c69cfeeafa [X86] Use single add/sub for large stack offsets
For large stack offsets the compiler generates multiple immediate mode
sub/add instructions in the prologue/epilogue.  This patch makes the
compiler place the final amount to be added/subtracted into a register,
which is then added/substracted with a single operation.

Differential Revision: http://reviews.llvm.org/D7226

llvm-svn: 227458
2015-01-29 16:18:29 +00:00
Colin LeMahieu a749b3ee6a [Hexagon] Adding XTYPE/PRED intrinsic tests. Converting predicate types to i32 instead of i1.
llvm-svn: 227457
2015-01-29 16:08:43 +00:00
Bill Schmidt 8cf15ced8c [PowerPC] Complete setting the baseline for ppc64le
Patch by Nemanja Ivanovic.

As was uncovered by the failing test case (when run on non-PPC
platforms), the feature set when compiling with -march=ppc64le was not
being picked up. This change ensures that if the -mcpu option is not
specified, the correct feature set is picked up regardless of whether
we are on PPC or not.

llvm-svn: 227455
2015-01-29 15:59:09 +00:00
Alex Rosenberg 96e833a6a6 Make the test actually test what it's supposed to test. Add a test for the from memory variant of vcvtph2ps for 256-bit.
llvm-svn: 227446
2015-01-29 15:19:54 +00:00
Alex Rosenberg c235500295 Cleanup a few tests on sse4a machines and FileCheckize along the way.
llvm-svn: 227437
2015-01-29 13:31:32 +00:00
Rafael Espindola fad3901095 Don't create multiple mergeable sections with -fdata-sections.
ELF has support for sections that can be split into fixed size or
null terminated entities.

Since these sections can be split by the linker, it is not necessary
to split them in codegen.

This reduces the combined .o size in a llvm+clang build from
202,394,570 to 173,819,098 bytes.

The time for linking clang with gold (on a VM, on a laptop) goes
from 2.250089985 to 1.383001792 seconds.

The flip side is the size of rodata in clang goes from 10,926,785
to 10,929,345 bytes.

The increase seems to be because of http://sourceware.org/bugzilla/show_bug.cgi?id=17902.

llvm-svn: 227431
2015-01-29 12:43:28 +00:00
Vladimir Medic df464ae224 [Mips][Disassembler] When disassembler meets cache/pref instructions for r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method for R6 CACHE_HINT_DESC class that properly handles decoding of these instructions.
llvm-svn: 227430
2015-01-29 11:33:41 +00:00
Charlie Turner ed76a05c4d Add a missing Tag_DIV_use test for Cortex-M7.
llvm-svn: 227429
2015-01-29 11:19:54 +00:00
Simon Atanasyan 99cd1fb012 [ELFYAML] Provide default value 0 for YAML relocation addendum field
Follow up to r227318.

llvm-svn: 227422
2015-01-29 06:56:24 +00:00
Reid Kleckner 1185fced3d Add a Windows EH preparation pass that zaps resumes
If the personality is not a recognized MSVC personality function, this
pass delegates to the dwarf EH preparation pass. This chaining supports
people on *-windows-itanium or *-windows-gnu targets.

Currently this recognizes some personalities used by MSVC and turns
resume instructions into traps to avoid link errors.  Even if cleanups
are not used in the source program, LLVM requires the frontend to emit a
code path that resumes unwinding after an exception.  Clang does this,
and we get unreachable resume instructions. PR20300 covers cleaning up
these unreachable calls to resume.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D7216

llvm-svn: 227405
2015-01-29 00:41:44 +00:00
Philip Reames 9198b33b48 Teach SplitBlockPredecessors how to handle landingpad blocks.
Patch by: Igor Laevsky <igor@azulsystems.com>

"Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors."

Differential Revision: http://reviews.llvm.org/D7157

llvm-svn: 227390
2015-01-28 23:06:47 +00:00
Colin LeMahieu 4379d10273 [Hexagon] Updating several V5 intrinsics and adding FP tests.
llvm-svn: 227379
2015-01-28 22:08:16 +00:00
Zoran Jovanovic 14c567be90 [mips][microMIPS] Implement SWM and LWM aliases
Differential Revision: http://reviews.llvm.org/D5820

llvm-svn: 227373
2015-01-28 21:52:27 +00:00
Colin LeMahieu 1de7e0d923 [Hexagon] Updating many V4 intrinsic patterns. Adding missing instruction and deleting unused classes.
llvm-svn: 227353
2015-01-28 19:39:09 +00:00
Colin LeMahieu 94c33218e3 [Hexagon] Adding XTYPE/MPY intrinsic tests and some missing multiply instructions.
llvm-svn: 227347
2015-01-28 19:16:17 +00:00
Colin LeMahieu 19ed07c75a [Hexagon] Deleting a lot of old variants of intrinsics and updating references.
llvm-svn: 227338
2015-01-28 18:29:11 +00:00
Frederic Riss d34551833f [dsymutil] Add DwarfLinker class.
It's an empty shell for now. It's main method just opens the debug
map objects and parses their Dwarf info. Test that we at least do
that correctly.

llvm-svn: 227337
2015-01-28 18:27:01 +00:00
Colin LeMahieu 39b846ce0f [Hexagon] Converting XTYPE/BIT intrinsic patterns and adding tests.
llvm-svn: 227335
2015-01-28 18:06:23 +00:00
Colin LeMahieu fe03c9a678 [Hexagon] Replacing XTYPE/SHIFT intrinsic patternss. Adding tests and missing instructions with tests.
llvm-svn: 227330
2015-01-28 17:37:59 +00:00
Jozef Kolek e10a02ecf0 [mips][microMIPS] Implement LWGP instruction
Differential Revision: http://reviews.llvm.org/D6650

llvm-svn: 227325
2015-01-28 17:27:26 +00:00
Colin LeMahieu c17902b89b [Hexagon] Replacing old intrinsic tests with organized versions that match the reference manual.
llvm-svn: 227321
2015-01-28 16:58:05 +00:00
Bjorn Steinbrink a09ac0085d Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes
Summary:
MetadataAsValue uses a canonical format that strips the MDNode if it
contains only a single constant value. This triggers an assertion when
trying to cast the value to a MDNode.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7165

llvm-svn: 227319
2015-01-28 16:35:59 +00:00
Simon Atanasyan e13a9624c2 [ELFYAML] Provide explicit value for relocation addendums in the test
The `Addend` is an optional field of the `Relocation` YAML record. But
we do not provide its default value while reading it from a YAML file
and so it might keep uninitialized.

I am going to fix the code by a separate commit. We might either make
this field mandatory (at least for .rela sections) or specify 0 as
a default value explicitly.

llvm-svn: 227318
2015-01-28 16:22:50 +00:00
Tom Stellard 40ce8af4a5 R600: Move DataLayout to AMDGPUTargetMachine
This is a follow up to r227113.

It is now required to use the amdgcn target for SI and newer GPUs.

llvm-svn: 227316
2015-01-28 16:04:26 +00:00
Michael Kuperstein 951995821a [X86] Reduce some 32-bit imuls into lea + shl
Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well.

Differential Revision: http://reviews.llvm.org/D7196

llvm-svn: 227308
2015-01-28 14:08:22 +00:00
Michael Kuperstein f387611ac2 [x32] Enable sibcall optimization on x32.
This includes two things:
1) Fix TCRETURNdi and TCRETURN64di patterns to check the right thing (LP64 as opposed to target bitness).
2) Allow LEA64_32 in MatchingStackOffset.

llvm-svn: 227307
2015-01-28 13:38:48 +00:00
Elena Demikhovsky 7b0dd39db6 AVX-512: Added FMA intrinsics with rounding mode
By Asaf Badouh and Elena Demikhovsky

Added special nodes for rounding: FMADD_RND, FMSUB_RND..
It will prevent merge between nodes with rounding and other standard nodes.

llvm-svn: 227303
2015-01-28 10:21:27 +00:00
Craig Topper 7d3c6d307a [X86] Teach disassembler to handle illegal immediates on AVX512 integer compare instructions.
llvm-svn: 227302
2015-01-28 10:09:56 +00:00
Elena Demikhovsky 45f0448081 Fold fcmp in cases where value is provably non-negative. By Arch Robison.
This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where:
 - the predicate is olt or uge
 - the first operand is provably not less than zero
 - the second operand is zero
The motivation for handling these cases optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y)..

http://reviews.llvm.org/D6972

llvm-svn: 227298
2015-01-28 08:03:58 +00:00
David Blaikie fa1a3c7cf5 PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration.
llvm-svn: 227290
2015-01-28 02:34:53 +00:00
Reid Kleckner 4af6415237 Move EH personality type classification to Analysis/LibCallSemantics.h
Summary:
Also add enum types for __C_specific_handler and _CxxFrameHandler3 for
which we know a few things.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7214

llvm-svn: 227284
2015-01-28 01:17:38 +00:00
Quentin Colombet 308b171318 Revert r227242 - Merge vector stores into wider vector stores (PR21711).
This commit creates infinite loop in DAG combine for in the LLVM test-suite
for aarch64 with mcpu=cylcone (just having neon may be enough to expose this).

llvm-svn: 227272
2015-01-27 23:58:01 +00:00
Saleem Abdulrasool c44d71b8df SymbolRewriter: allow rewriting with comdats
COMDATs must be identically named to the symbol.  When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs.  This corrects the behaviour and
adds test cases for this.

llvm-svn: 227261
2015-01-27 22:57:39 +00:00
Ahmed Bougacha 1ac9356524 [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.

llvm-svn: 227250
2015-01-27 21:52:16 +00:00
Sanjoy Das dcf2651043 Teach IRCE to look at branch weights when recognizing range checks
Splitting a loop to make range checks redundant is profitable only if
the range check "never" fails. Make this fact a part of recognizing a
range check -- a branch is a range check only if it is expected to
pass (via branch_weights metadata).

Differential Revision: http://reviews.llvm.org/D7192

llvm-svn: 227249
2015-01-27 21:38:12 +00:00
Alexey Samsonov 533948088e Revert "[x86] Combine x86mmx/i64 to v2i64 conversion to use scalar_to_vector"
This reverts commits r226953 and r226974.

llvm-svn: 227248
2015-01-27 21:34:11 +00:00
Kevin Enderby 9a50944ca0 dd the option, -link-opt-hints to llvm-objdump used with -macho to print the
Mach-O AArch64 linker optimization hints for ADRP code optimization.

llvm-svn: 227246
2015-01-27 21:28:24 +00:00
Sanjay Patel bcf62f2fa2 Merge vector stores into wider vector stores (PR21711)
This patch resolves part of PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).

The 'f3' test case in that report presents a situation where we have two 128-bit
stores extracted from a 256-bit source vector. 

Instead of producing this:

vmovaps %xmm0, (%rdi)
vextractf128    $1, %ymm0, 16(%rdi)

This patch merges the 128-bit stores into a single 256-bit store:

vmovups %ymm0, (%rdi)

Differential Revision: http://reviews.llvm.org/D7208

llvm-svn: 227242
2015-01-27 20:50:27 +00:00
Dmitry Vyukov 91ffdec3ec tsan: properly instrument unaligned accesses
If a memory access is unaligned, emit __tsan_unaligned_read/write
callbacks instead of __tsan_read/write.
Required to change semantics of __tsan_unaligned_read/write to not do the user memory.
But since they were unused (other than through __sanitizer_unaligned_load/store) this is fine.
Fixes long standing issue 17:
https://code.google.com/p/thread-sanitizer/issues/detail?id=17

llvm-svn: 227231
2015-01-27 20:19:17 +00:00
Ramkumar Ramachandra ca49c036ba overloaded-intrinsic-name: exercise anyptr on struct
No other test I know shows how struct names are mangled in overloaded
intrinsic functions.

Differential Revision: http://reviews.llvm.org/D7037

llvm-svn: 227229
2015-01-27 20:03:08 +00:00
Kai Nacke e024539ea0 [mips] Add range checks and transformation to octeon instructions in AsmParser.
This patch adds range checks to the immediate operands of octeon
instructions in the AsmParser. Like gas, it applies the following
transformations if the immediate is to large:

bbit0 $8, 42, foo => bbit032 $8, 10, foo
bbit1 $8, 46, foo => bbit132 $8, 14, foo
cins $8, $31, 32, 31 => cins32 $8, $31, 0, 31
exts $7, $4, 54, 9 => exts32 $7, $4, 22, 9

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D7080

llvm-svn: 227225
2015-01-27 19:11:28 +00:00
Marek Olsak 75170778ec R600/SI: Enable all tests that pass on VI without changes
llvm-svn: 227214
2015-01-27 17:27:15 +00:00
Andrea Di Biagio 086cbc37ad [InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag.
This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by
a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.

Added test InstCombine/select-cmp-cttz-ctlz.ll.

llvm-svn: 227197
2015-01-27 15:58:14 +00:00
Evgeniy Stepanov 3fdfc7b1b3 [sancov] Fix unspecified constructor order between sancov and asan.
Sanitizer coverage constructor must run after asan constructor (for each DSO).
Bump constructor priority to guarantee that.

llvm-svn: 227195
2015-01-27 15:01:22 +00:00
David Majnemer 4c82daea60 LoopRotate: Don't walk the uses of a Constant
LoopRotate wanted to avoid live range interference by looking at the
uses of a Value in the loop latch and seeing if any lied outside of the
loop.  We would wrongly perform this operation on Constants.

This fixes PR22337.

llvm-svn: 227171
2015-01-27 06:21:43 +00:00
Adrian Prantl ad49878697 Replace this testcase with an even shorter one provided by dblaikie.
llvm-svn: 227152
2015-01-27 00:22:17 +00:00
Chad Rosier f9327d6fe9 Commoning of target specific load/store intrinsics in Early CSE.
Phabricator revision: http://reviews.llvm.org/D7121
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!

llvm-svn: 227149
2015-01-26 22:51:15 +00:00
Philip Reames 219b15e85f Add test cases for PRE w/volatile loads
These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access.

llvm-svn: 227146
2015-01-26 22:40:44 +00:00
Simon Pilgrim 0629ba1ad9 [X86][SSE] Float comparisons can sometimes be safely commuted
For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can.

Differential Revision: http://reviews.llvm.org/D7178

llvm-svn: 227145
2015-01-26 22:29:24 +00:00
Simon Pilgrim 9b7c00352d [X86][PCLMUL] Enable commutation for PCLMUL instructions
Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask.

Differential Revision: http://reviews.llvm.org/D7180

llvm-svn: 227141
2015-01-26 22:00:18 +00:00
Simon Pilgrim 106abe47d6 Line endings fix. NFC.
llvm-svn: 227138
2015-01-26 21:28:32 +00:00
Matt Arsenault 7d734f44a3 R600: Cleanup or test
Fix broken check lines, use multiple check prefixes,
add an additional test for i1 or.

llvm-svn: 227137
2015-01-26 21:16:10 +00:00
Simon Pilgrim a2f1c40575 Line endings fix. NFC.
llvm-svn: 227136
2015-01-26 21:15:42 +00:00
Bruno Cardoso Lopes 3da5b126a5 [x86][MMX] Rename and cleanup tests: arith, intrinsics and shuffle
- Rename mmx-builtins to mmx-intrinsics to match other intrinsic test naming.
- Remove tests that duplicate functionality from mmx-intrinsics.ll.
- Move arith related tests to mmx-arith.ll.
- MMX related shuffle goes to vector-shuffle-mmx.ll.

llvm-svn: 227130
2015-01-26 20:06:51 +00:00
Hans Wennborg b64cb271dc SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
The range check would get optimized away later, but we might as well not emit
them in the first place.

http://reviews.llvm.org/D6471

llvm-svn: 227126
2015-01-26 19:52:34 +00:00
Hans Wennborg 6800008f04 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.

Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227125
2015-01-26 19:52:32 +00:00
Justin Holewinski d4d2e9bd0e [NVPTX] Generate a more optimal sequence for select of i1
Instead of creating a pattern like "(p && a) || ((!p) && b)",
just expand the i8 operands to i32 and perform the selp on them.

Fixes PR22246

llvm-svn: 227123
2015-01-26 19:52:20 +00:00
Justin Holewinski 23df659e6d [NVPTX] Handle floating-point conversion patterns that are not explicitly ordered or unordered
Fixes PR22322

llvm-svn: 227117
2015-01-26 19:11:20 +00:00
Alex Rosenberg b9fefdd215 Use a different encoding for debugtrap on PS4.
llvm-svn: 227116
2015-01-26 19:09:27 +00:00
Philip Reames a7ad6a589c Refine memory dependence's notion of volatile semantics
According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile.

The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes.  I will be submitting a PRE w/volatiles test case seperately in the near future.

Differential Revision: http://reviews.llvm.org/D6901

llvm-svn: 227112
2015-01-26 18:54:27 +00:00
Sanjay Patel 805bc02c2b Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221)
This patch fixes the following miscompile:

define void @sqrtsd(<2 x double> %a) nounwind uwtable ssp {
  %0 = tail call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a) nounwind 
  %a0 = extractelement <2 x double> %0, i32 0
  %conv = fptrunc double %a0 to float
  %a1 = extractelement <2 x double> %0, i32 1
  %conv3 = fptrunc double %a1 to float
  tail call void @callee2(float %conv, float %conv3) nounwind
  ret void
}

Current codegen:

sqrtsd	%xmm0, %xmm1        ## high element of %xmm1 is undef here
xorps	%xmm0, %xmm0
cvtsd2ss	%xmm1, %xmm0
shufpd	$1, %xmm1, %xmm1
cvtsd2ss	%xmm1, %xmm1 ## operating on undef value
jmp	_callee

This is a continuation of http://llvm.org/viewvc/llvm-project?view=revision&revision=224624 ( http://reviews.llvm.org/D6330 ) 
which was itself a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).

All of these patches are partial fixes for PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ); 
this should be the final patch needed to resolve that bug.

Differential Revision: http://reviews.llvm.org/D6885

llvm-svn: 227111
2015-01-26 18:42:16 +00:00
Philip Reames 32351455f6 Pass QueryInst down through non-local dependency calculation
This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block.

Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that.

Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself. 

Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well.

Differential Revision: http://reviews.llvm.org/D6895

llvm-svn: 227110
2015-01-26 18:39:52 +00:00
Eric Christopher a576281694 Move the Mips target to storing the ABI in the TargetMachine rather
than on MipsSubtargetInfo.

This required a bit of massaging in the MC level to handle this since
MC is a) largely a collection of disparate classes with no hierarchy,
and b) there's no overarching equivalent to the TargetMachine, instead
only the subtarget via MCSubtargetInfo (which is the base class of
TargetSubtargetInfo).

We're now storing the ABI in both the TargetMachine level and in the
MC level because the AsmParser and the TargetStreamer both need to
know what ABI we have to parse assembly and emit objects. The target
streamer has a pointer to the one in the asm parser and is updated
when the asm parser is created. This is fragile as the FIXME comment
notes, but shouldn't be a problem in practice since we always
create an asm parser before attempting to emit object code via the
assembler. The TargetMachine now contains the ABI so that the DataLayout
can be constructed dependent upon ABI.

All testcases have been updated to use the -target-abi command line
flag so that we can set the ABI without using a subtarget feature.

Should be no change visible externally here.

llvm-svn: 227102
2015-01-26 17:33:46 +00:00
Daniel Berlin 16f7a52628 Fix incorrect partial aliasing
Update testcases

llvm-svn: 227099
2015-01-26 17:31:17 +00:00
Sanjay Patel 078b612de1 fix line-endings; NFC
llvm-svn: 227095
2015-01-26 17:21:36 +00:00
Vasileios Kalintiris ef96a8ecd6 [mips] Enable arithmetic and binary operations for the i128 data type.
Summary:
This patch adds support for some operations that were missing from
128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these
changes we can support the __int128_t and __uint128_t data types
from C/C++.

Depends on D7125

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7143

llvm-svn: 227089
2015-01-26 12:33:22 +00:00
Vasileios Kalintiris 2ed214f387 [mips] Add tests for bitwise binary and integer arithmetic operators.
Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7125

llvm-svn: 227087
2015-01-26 12:04:40 +00:00
Vladimir Medic 0516a5b686 When disassembler meets compact jump instructions for r6 it crashes as the access to operands array is out of range. This patch removes dedicated decoder method that wrongly handles decoding of these instructions.
llvm-svn: 227084
2015-01-26 10:33:43 +00:00
Vasileios Kalintiris 30c5451fbc Revert "[mips] Fix assertion on i128 addition/subtraction on MIPS64"
This reverts commit r227003. Support for addition/subtraction and
various other operations for the i128 data type will be added in a
future commit based on the review D7143.

llvm-svn: 227082
2015-01-26 09:53:30 +00:00
NAKAMURA Takumi 2081cefdbe Revert llvm/test/MC/ELF/noexec.s in r227074, "Fix a problem where the AArch64 ELF assembler was failing with"
It should be split into target-specific location.

llvm-svn: 227080
2015-01-26 09:30:29 +00:00
Erik Eckstein 98df6da740 SLPVectorizer: fix wrong scheduling of atomic load/stores.
This fixes PR22306.

llvm-svn: 227077
2015-01-26 09:07:04 +00:00
Eric Christopher c8c76853b9 Fix a problem where the AArch64 ELF assembler was failing with
-no-exec-stack. This was due to it not deriving from the correct
asm info base class and missing the override for the exec
stack section query. Added another line to the noexec test
line to make sure this doesn't regress.

llvm-svn: 227074
2015-01-26 06:32:17 +00:00
Craig Topper d1e1d106ca [X86] Change comparision immediate type to i8 in test cases for AVX512 floating point comparisons. The type was already changed in the definitions and was being auto upgraded to the new type.
llvm-svn: 227064
2015-01-25 23:26:12 +00:00
Craig Topper 29f2e95185 [X86] Use i8 immediate for comparison type on AVX512 packed integer instructions. This matches floating point equivalents. Includes autoupgrade support to convert old code.
llvm-svn: 227063
2015-01-25 23:26:02 +00:00
Adrian Prantl 40cb819c6f Debug info: Fix PR22296 by omitting the DW_AT_location if we lost the
physical register that is described in a DBG_VALUE.

In the testcase the DBG_VALUE describing "p5" becomes unavailable
because the register its address is in is clobbered and we (currently)
aren't smart enough to realize that the value is rematerialized immediately
after the DBG_VALUE and/or is actually a stack slot.

llvm-svn: 227056
2015-01-25 19:04:08 +00:00
Bill Schmidt 258ac3cc56 [PowerPC] Revert ppc64le-aggregates.ll test changes from r227053
It appears we have different behavior with and without -mcpu=pwr8 even
with ppc64le defaulting to POWER8.  The failure appears as follows:

/home/bb/cmake-llvm-x86_64-linux/llvm-project/llvm/test/CodeGen/PowerPC/ppc64le-aggregates.ll:268:14: error: expected string not found in input
; CHECK-DAG: lfs 1, 0([[REG]])
             ^
<stdin>:497:11: note: scanning from here
 ld 3, .LC1@toc@l(3)
          ^
<stdin>:497:11: note: with variable "REG" equal to "3"
 ld 3, .LC1@toc@l(3)
          ^
<stdin>:514:2: note: possible intended match here
 lfs 1, 0(4)
 ^

Reverting this particular test case change.  Nemanja, please have a look
at the reason for the failure.

llvm-svn: 227055
2015-01-25 18:18:54 +00:00
Bill Schmidt 279cabb450 [PowerPC] Reset the baseline for ppc64le to be equivalent to pwr8
Test by Nemanja Ivanovic.

Since ppc64le implies POWER8 as a minimum, it makes sense that the
same features are included. Since the pwr8 processor model will likely
be getting new features until the implementation is complete, I
created a new list to add these updates to. This will include them in
both pwr8 and ppc64le.

Furthermore, it seems that it would make sense to compose the feature
lists for other processor models (pwr3 and up). Per discussion in the
review, I will make this change in a subsequent patch.

In order to test the changes, I've added an additional run step to
test cases that specify -march=ppc64le -mcpu=pwr8 to omit the -mcpu
option. Since the feature lists are the same, the behaviour should be
unchanged.

llvm-svn: 227053
2015-01-25 18:05:42 +00:00
Simon Atanasyan 5d19c67a68 [ELFYAML] Support mips64 relocation record format in yaml2obj/obj2yaml
MIPS64 ELF file has a very specific relocation record format. Each
record might specify up to three relocation operations. So the `r_info`
field in fact consists of three relocation type sub-fields and optional
code of "special" symbols.

http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf
page 40

The patch implements support of the MIPS64 relocation record format in
yaml2obj/obj2yaml tools by introducing new optional Relocation fields:
Type2, Type3, and SpecSym. These fields are recognized only if the
object/YAML file relates to the MIPS64 target.

Differential Revision: http://reviews.llvm.org/D7136

llvm-svn: 227044
2015-01-25 13:29:25 +00:00
Elena Demikhovsky 1a603b3f13 AVX-512: Changes in operations on masks registers for KNL and SKX
- Added KSHIFTB/D/Q for skx
- Added KORTESTB/D/Q for skx
- Fixed store operation for v8i1 type for KNL
- Store size of v8i1, v4i1 and v2i1 are changed to 8 bits

llvm-svn: 227043
2015-01-25 12:47:15 +00:00
Elena Demikhovsky a3232f764e Implemented cost model for masked load/store operations.
llvm-svn: 227035
2015-01-25 08:44:46 +00:00
Lang Hames b5be1addce Remove a few more redundant ExecutionEngine regression tests.
llvm-svn: 227021
2015-01-24 22:41:13 +00:00
Lang Hames 73e55395e8 Remove a number of redundant ExecutionEngine regression tests.
These tests used to test the legacy JIT but since that has been removed they're
just redundantly testing MCJIT. Remove them and just leave their counterparts in
test/ExecutionEngine/MCJIT.

llvm-svn: 227010
2015-01-24 18:49:51 +00:00
Alexei Starovoitov 8a3d0c1557 bpf: add missing lit.local.cfg
llvm-svn: 227009
2015-01-24 18:20:52 +00:00
Alexei Starovoitov e4c8c807bb BPF backend
Summary:
V8->V9:
- cleanup tests

V7->V8:
- addressed feedback from David:
- switched to range-based 'for' loops
- fixed formatting of tests

V6->V7:
- rebased and adjusted AsmPrinter args
- CamelCased .td, fixed formatting, cleaned up names, removed unused patterns
- diffstat: 3 files changed, 203 insertions(+), 227 deletions(-)

V5->V6:
- addressed feedback from Chandler:
- reinstated full verbose standard banner in all files
- fixed variables that were not in CamelCase
- fixed names of #ifdef in header files
- removed redundant braces in if/else chains with single statements
- fixed comments
- removed trailing empty line
- dropped debug annotations from tests
- diffstat of these changes:
  46 files changed, 456 insertions(+), 469 deletions(-)

V4->V5:
- fix setLoadExtAction() interface
- clang-formated all where it made sense

V3->V4:
- added CODE_OWNERS entry for BPF backend

V2->V3:
- fix metadata in tests

V1->V2:
- addressed feedback from Tom and Matt
- removed top level change to configure (now everything via 'experimental-backend')
- reworked error reporting via DiagnosticInfo (similar to R600)
- added few more tests
- added cmake build
- added Triple::bpf
- tested on linux and darwin

V1 cover letter:
---------------------
recently linux gained "universal in-kernel virtual machine" which is called
eBPF or extended BPF. The name comes from "Berkeley Packet Filter", since
new instruction set is based on it.
This patch adds a new backend that emits extended BPF instruction set.

The concept and development are covered by the following articles:
http://lwn.net/Articles/599755/
http://lwn.net/Articles/575531/
http://lwn.net/Articles/603983/
http://lwn.net/Articles/606089/
http://lwn.net/Articles/612878/

One of use cases: dtrace/systemtap alternative.

bpf syscall manpage:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b4fc1a460f3017e958e6a8ea560ea0afd91bf6fe

instruction set description and differences vs classic BPF:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.txt

Short summary of instruction set:
- 64-bit registers
  R0      - return value from in-kernel function, and exit value for BPF program
  R1 - R5 - arguments from BPF program to in-kernel function
  R6 - R9 - callee saved registers that in-kernel function will preserve
  R10     - read-only frame pointer to access stack
- two-operand instructions like +, -, *, mov, load/store
- implicit prologue/epilogue (invisible stack pointer)
- no floating point, no simd

Short history of extended BPF in kernel:
interpreter in 3.15, x64 JIT in 3.16, arm64 JIT, verifier, bpf syscall in 3.18, more to come in the future.

It's a very small and simple backend.
There is no support for global variables, arbitrary function calls, floating point, varargs,
exceptions, indirect jumps, arbitrary pointer arithmetic, alloca, etc.
From C front-end point of view it's very restricted. It's done on purpose, since kernel
rejects all programs that it cannot prove safe. It rejects programs with loops
and with memory accesses via arbitrary pointers. When kernel accepts the program it is
guaranteed that program will terminate and will not crash the kernel.

This patch implements all 'must have' bits. There are several things on TODO list,
so this is not the end of development.
Most of the code is a boiler plate code, copy-pasted from other backends.
Only odd things are lack or < and <= instructions, specialized load_byte intrinsics
and 'compare and goto' as single instruction.
Current instruction set is fixed, but more instructions can be added in the future.

Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>

Subscribers: majnemer, chandlerc, echristo, joerg, pete, rengolin, kristof.beyls, arsenm, t.p.northover, tstellarAMD, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6494

llvm-svn: 227008
2015-01-24 17:51:26 +00:00
Daniel Sanders 9a4f2c55df [mips] Fix 'jumpy' debug line info around calls.
Summary:
At the moment, address calculation is taking the debug line info from the
address node (e.g. TargetGlobalAddress). When a function is called multiple
times, this results in output of the form:

  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .. address calculation ..
  .loc $second_call_location
  .. function call ..
  .loc $first_call_location
  .. address calculation ..
  .loc $third_call_location
  .. function call ..

This patch makes address calculations for function calls take the debug line
info for the call node and results in output of the form:
  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .loc $second_call_location
  .. address calculation ..
  .. function call ..
  .loc $third_call_location
  .. address calculation ..
  .. function call ..

All other address calculations continue to use the address node.

Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host.

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D7050

llvm-svn: 227005
2015-01-24 14:35:11 +00:00
Daniel Sanders 5a9225b262 [mips] Fix assertion on i128 addition/subtraction on MIPS64
Summary:
In addition to the included tests, this fixes
test/CodeGen/Generic/i128-addsub.ll on a mips64 host.

Reviewers: atanasyan, sagar, vmedic

Reviewed By: vmedic

Subscribers: sdkie, llvm-commits

Differential Revision: http://reviews.llvm.org/D6610

llvm-svn: 227003
2015-01-24 12:58:10 +00:00
Andrea Di Biagio 8381475a75 [DAG] Fix wrong canonicalization performed on shuffle nodes.
This fixes a regression introduced by r226816.
When replacing a splat shuffle node with a constant build_vector,
make sure that the new build_vector has a valid number of elements.

Thanks to Patrik Hagglund for reporting this problem and providing a
small reproducible.

llvm-svn: 227002
2015-01-24 11:54:29 +00:00
Chandler Carruth 43e590e51f [PM] Port LowerExpectIntrinsic to the new pass manager.
This just lifts the logic into a static helper function, sinks the
legacy pass to be a trivial wrapper of that helper fuction, and adds
a trivial wrapper for the new PM as well. Not much to see here.

I switched a test case to run in both modes, but we have to strip the
dead prototypes separately as that pass isn't in the new pass manager
(yet).

llvm-svn: 226999
2015-01-24 11:13:02 +00:00
Yunzhong Gao a8cf495a15 If we see UTF-8 BOM sequence at the beginning of a response file, we shall
remove these bytes before parsing.

Phabricator Revision: http://reviews.llvm.org/D7156

llvm-svn: 226988
2015-01-24 04:23:08 +00:00
Chandler Carruth 83ba269e4b [PM] Port instcombine to the new pass manager!
This is exciting as this is a much more involved port. This is
a complex, existing transformation pass. All of the core logic is shared
between both old and new pass managers. Only the access to the analyses
is separate because the actual techniques are separate. This also uses
a bunch of different and interesting analyses and is the first time
where we need to use an analysis across an IR layer.

This also paves the way to expose instcombine utility functions. I've
got a static function that implements the core pass logic over
a function which might be mildly interesting, but more interesting is
likely exposing a routine which just uses instructions *already in* the
worklist and combines until empty.

I've switched one of my favorite instcombine tests to run with both as
well to make sure this keeps working.

llvm-svn: 226987
2015-01-24 04:19:17 +00:00