Commit Graph

21837 Commits

Author SHA1 Message Date
Matheus Almeida c596839e67 [mips][msa] CHECK-DAG-ize MSA 2rf.ll test.
No functional changes.

llvm-svn: 194387
2013-11-11 16:24:53 +00:00
Matheus Almeida 9826d07a2f [mips][msa] CHECK-DAG-ize MSA 2r.ll test.
No functional changes.

llvm-svn: 194386
2013-11-11 16:16:53 +00:00
Rafael Espindola 9d34018954 Add a testcase for pr17852.
llvm-svn: 194385
2013-11-11 15:37:52 +00:00
Hal Finkel c6a243987d Add PPC option for full register names in asm
On non-Darwin PPC systems, we currently strip off the register name prefix
prior to instruction printing. So instead of something like this:

  mr r3, r4

we print this:

  mr 3, 4

The first form is the default on Darwin, and is understood by binutils, but not
yet understood by our integrated assembler. Once our integrated-as understands
full register names as well, this temporary option will be replaced by tying
this functionality to the verbose-asm option. The numeric-only form is
compatible with legacy assemblers and tools, and is also gcc's default on most
PPC systems. On the other hand, it is harder to read, and there are some
analysis tools that expect full register names.

llvm-svn: 194384
2013-11-11 14:58:40 +00:00
Peter Zotov 18636a8777 [OCaml] Add missing Llvm_target functions
llvm-svn: 194382
2013-11-11 14:47:28 +00:00
Peter Zotov dfa957746c [OCaml] Accept context explicitly in Llvm_target functions
Llvm_target.intptr_type used to implicitly use global context. As
none of other functions in OCaml bindings do, it is changed to
accept context explicitly.

llvm-svn: 194381
2013-11-11 14:47:20 +00:00
Peter Zotov d52cf17584 [OCaml] Make Llvm_target.DataLayout.t automatically managed
This breaks the API by removing Llvm_target.DataLayout.dispose.

llvm-svn: 194380
2013-11-11 14:47:11 +00:00
Evgeniy Stepanov 560e089355 [msan] Propagate origin for insertvalue, extractvalue.
llvm-svn: 194374
2013-11-11 13:37:10 +00:00
NAKAMURA Takumi 5c0be2f67a Mark 36 tests as XFAIL:vg_leak in llvm/test/TableGen.
In historical reason, tblgen is not strictly required to be free from memory leaks.
For now, I mark them as XFAIL, they could be fixed, though.

llvm-svn: 194353
2013-11-10 14:26:08 +00:00
NAKAMURA Takumi cae86ce38b Remove 6 of XFAIL(s) in llvm/test/TableGen, since r193736. They have been XPASSing.
llvm-svn: 194352
2013-11-10 14:25:44 +00:00
Bill Wendling fed6c220ec Revert "Resurrect r191017 " GVN proceeds in the presence of dead code" plus a fix to PR17307 & 17308."
This causes PR17852.

This reverts commit d93e8a06b2ca09ab18f390cd514b7443e2e571f7.

Conflicts:
	test/Transforms/GVN/cond_br2.ll

llvm-svn: 194348
2013-11-10 07:34:34 +00:00
Nadav Rotem 5ba1c6ced8 SimplifyCFG has a heuristics for out-of-order processors that decides when it is worthwhile to merge branches. It tries to estimate if the operands of the instruction that we want to hoist are ready. This commit marks function arguments as 'ready' because they require no calculation. This boosts libquantum and a few other workloads from the testsuite.
llvm-svn: 194346
2013-11-10 04:13:31 +00:00
Matt Arsenault ba035bce21 Resolve TODO in test now that filecheck has multiple check prefixes.
llvm-svn: 194344
2013-11-10 02:16:47 +00:00
Matt Arsenault 13df462691 Allow multiple check prefixes in FileCheck.
This is useful if you want to run multiple variations
of a single test, and the majority of check lines
should be the same.

llvm-svn: 194343
2013-11-10 02:04:09 +00:00
Matt Arsenault 5bcefabcda Teach MergeFunctions about address spaces
llvm-svn: 194342
2013-11-10 01:44:37 +00:00
Matt Arsenault 0fb71e545c Use variable for register name in test
llvm-svn: 194338
2013-11-10 00:57:17 +00:00
Reed Kotler 45c5927c5c Mostly finish up constant islands port for Mips for load constants.
Still need to finish the branch part. Still lots more review of the code,
clean up and testing. 

llvm-svn: 194337
2013-11-10 00:09:26 +00:00
Akira Hatanaka d1c58ed8a7 [mips] Make sure there is a chain edge dependency between loads that read
formal arguments on the stack and stores created afterwards. We need this to
ensure tail call optimized function calls do not write over the argument area
of the stack before it is read out.
 

llvm-svn: 194309
2013-11-09 02:38:51 +00:00
Juergen Ributzka 87ed906b2e [Stackmap] Materialize the jump address within the patchpoint noop slide.
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.

Differential Revision: http://llvm-reviews.chandlerc.com/D2074

Reviewed by Andy

llvm-svn: 194306
2013-11-09 01:51:33 +00:00
Juergen Ributzka 9969d3e6e8 [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).

Differential Revision: http://llvm-reviews.chandlerc.com/D2009

Reviewed by Andy

llvm-svn: 194293
2013-11-08 23:28:16 +00:00
Jim Grosbach 2fca51d3b4 X86: Assembly files with .cfi_cfa_def shouldn't hit llvm_unreachable()
On darwin, when trying to create compact unwind info, a .cfi_cfa_def
directive would case an llvm_unreachable() to be hit. Back off when we
see this directive and generate the regular DWARF style eh_frame.

rdar://15406518

llvm-svn: 194285
2013-11-08 22:33:06 +00:00
Quentin Colombet b06a0ed4b0 [VirtRegMap] Fix for PR17825. Do not ignore noreturn definitions when setting
isPhysRegUsed if the unwind information is required.
Indeed, the runtime may need a correct stack to be able to unwind the call.

llvm-svn: 194271
2013-11-08 18:14:17 +00:00
Tim Northover 93bcc66e73 ARM: fold prologue/epilogue sp updates into push/pop for code size
ARM prologues usually look like:
    push {r7, lr}
    sub sp, sp, #4

If code size is extremely important, this can be optimised to the single
instruction:
    push {r6, r7, lr}

where we don't actually care about the contents of r6, but pushing it subtracts
4 from sp as a side effect.

This should implement such a conversion, predicated on the "minsize" function
attribute (-Oz) since I've yet to find any code it actually makes faster.

llvm-svn: 194264
2013-11-08 17:18:07 +00:00
Artyom Skrobov 202ff08f97 [ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (Thumb encodings)
llvm-svn: 194263
2013-11-08 16:25:50 +00:00
Artyom Skrobov d2116a4ef7 [ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (ARM encodings)
llvm-svn: 194262
2013-11-08 16:17:14 +00:00
Artyom Skrobov e686cec7d4 [ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (ARM encodings)
llvm-svn: 194261
2013-11-08 16:16:30 +00:00
Zoran Jovanovic 2914d2d980 Test for microMIPS trap instructions.
llvm-svn: 194258
2013-11-08 14:55:31 +00:00
NAKAMURA Takumi 0d82bac470 llvm-ar: Let opening a directory failed in llvm-ar.
Linux cannot open directories with open(2), although cygwin and *bsd can.

Motivation: The test, Object/directory.ll, had been failing with --target=cygwin on Linux. XFAIL was improper for host issues.
llvm-svn: 194257
2013-11-08 12:35:56 +00:00
Matheus Almeida a3bac16950 [mips][msa] Update encoding of LDI instruction.
The encoding was updated in MSA r1.07.

llvm-svn: 194255
2013-11-08 10:43:11 +00:00
Artyom Skrobov 8653443902 [ARM] In ARMAsmParser, MatchCoprocessorOperandName() permitted p10 and p11 as operands for coprocessor instructions, resulting in encodings that clash with FP/NEON instruction encodings
llvm-svn: 194253
2013-11-08 09:16:31 +00:00
David Majnemer bd4fef4a89 IR: Do not canonicalize constant GEPs into an out-of-bounds array access
Summary:
Consider a GEP of:
i8* getelementptr ({ [2 x i8], i32, i8, [3 x i8] }* @main.c, i32 0, i32 0, i64 0)

If we proceeded to GEP the aforementioned object by 8, would form a GEP of:
i8* getelementptr ({ [2 x i8], i32, i8, [3 x i8] }* @main.c, i32 0, i32 0, i64 8)

Note that we would go through the first array member, causing an
out-of-bounds accesses.  This is problematic because we might get fooled
if we are trying to evaluate loads using this GEP, for example, based
off of an object with a constant initializer where the array is zero.

This fixes PR17732.

Reviewers: nicholas, chandlerc, void

Reviewed By: void

CC: llvm-commits, echristo, void, aemerson

Differential Revision: http://llvm-reviews.chandlerc.com/D2093

llvm-svn: 194220
2013-11-07 22:15:53 +00:00
Zoran Jovanovic c18b6d1083 Support for microMIPS trap instructions 1.
llvm-svn: 194205
2013-11-07 14:35:24 +00:00
Vincent Lejeune 4f3751f2af R600: Fix LowerUDIVREM
llvm-svn: 194153
2013-11-06 17:36:04 +00:00
Benjamin Kramer 9e9773d46d Add test case for PR12377, it was fixed by r194116.
llvm-svn: 194147
2013-11-06 11:55:41 +00:00
Vladimir Medic 4c29985cd0 Implement gpword directive for mips, test case added. Stype changes using clang-format are also included.
llvm-svn: 194145
2013-11-06 11:27:05 +00:00
Peter Zotov 578267fb73 [OCaml] Impement Llvm_irreader, bindings to LLVM assembly parser
llvm-svn: 194138
2013-11-06 09:21:25 +00:00
Peter Zotov d10ae6c527 [OCaml] Implement Llvm.string_of_llvalue
llvm-svn: 194136
2013-11-06 09:21:08 +00:00
Jiangning Liu f4226f1d7b Implement AArch64 Neon instruction set Perm.
llvm-svn: 194123
2013-11-06 03:35:27 +00:00
Jiangning Liu a50e22ca4f Implement AArch64 Neon instruction set Bitwise Extract.
llvm-svn: 194118
2013-11-06 02:25:49 +00:00
Andrew Trick 34e2f0c4ea Rewrite SCEV's backedge taken count computation.
Patch by Michele Scandale!

Rewrite of the functions used to compute the backedge taken count of a
loop on LT and GT comparisons.

I decided to split the handling of LT and GT cases becasue the trick
"a > b == -a < -b" in some cases prevents the trip count computation
due to the multiplication by -1 on the two operands of the
comparison. This issue comes from the conservative computation of
value range of SCEVs: taking the negative SCEV of an expression that
have a small positive range (e.g. [0,31]), we would have a SCEV with a
fullset as value range.

Indeed, in the new rewritten function I tried to better handle the
maximum backedge taken count computation when MAX/MIN expression are
used to handle the cases where no entry guard is found.

Some test have been modified in order to check the new value correctly
(I manually check them and reasoning on possible overflow the new
values seem correct).

I finally added a new test case related to the multiplication by -1
issue on GT comparisons.

llvm-svn: 194116
2013-11-06 02:08:26 +00:00
Andrew Trick 6664df12fb Slightly change the way stackmap and patchpoint intrinsics are lowered.
MorphNodeTo is not safe to call during DAG building. It eagerly
deletes dependent DAG nodes which invalidates the NodeMap. We could
expose a safe interface for morphing nodes, but I don't think it's
worth it. Just create a new MachineNode and replaceAllUsesWith.

My understaning of the SD design has been that we want to support
early target opcode selection. That isn't very well supported, but
generally works. It seems reasonable to rely on this feature even if
it isn't widely used.

llvm-svn: 194102
2013-11-05 22:44:04 +00:00
Tim Northover f02287db27 ARM: permit bare dmb/dsb/isb aliases on Cortex-M0
Cortex-M0 supports these 32-bit instructions despite being Thumb1 only
(mostly). We knew about that but not that the aliases without the default "sy"
operand were also permitted.

llvm-svn: 194094
2013-11-05 21:36:02 +00:00
Jiangning Liu d7c52676f6 Implement AArch64 Neon Crypto instruction classes AES, SHA, and 3 SHA.
llvm-svn: 194085
2013-11-05 17:42:05 +00:00
Michael Gottesman 24b2f6fdda [objc-arc] Convert the one directional retain/release relation assert to a conditional check + fail.
Due to the previously added overflow checks, we can have a retain/release
relation that is one directional. This occurs specifically when we run into an
additive overflow causing us to drop state in only one direction. If that
occurs, we should bail and not optimize that retain/release instead of
asserting.

Apologies for the size of the testcase. It is necessary to cause the additive
cfg overflow to trigger.

rdar://15377890

llvm-svn: 194083
2013-11-05 16:02:40 +00:00
Alp Toker a2f1b8d238 Provide a test input for opt
This was only working previously due to a quirk in the way lit
concatenates script commands.

llvm-svn: 194078
2013-11-05 13:57:34 +00:00
Peter Zotov 28f6876ecc [OCaml] (PR16318) Add missing argument to Llvm.const_intcast
llvm-svn: 194065
2013-11-05 11:56:20 +00:00
Peter Zotov ce7a91b277 [OCaml] (PR11717) Make declare_qualified_global respect address argument
Original patch by Jonathan Ragan-Kelley

llvm-svn: 194064
2013-11-05 11:56:13 +00:00
Reed Kotler 0f007fc4ce Fix r194019 as requested by Eric Christopher.
Submit the basic port of the rest of ARM constant islands code to Mips. 
Two test cases are added which reflect the next level of functionality:
constants getting moved to water areas that are out of range from the
initial placement at the end of the function and basic blocks being split to
create water when none exists that can be used. There is a bunch of this
code that is not complete and has been marked with IN_PROGRESS. I will
finish cleaning this all up during the next week or two and submit the
rest of the test cases. I have elminated some code for dealing with
inline assembly because to me it unecessarily complicates things and
some of the newer features of llvm like function attributies and builtin
assembler give me better tools to solve the alignment issues created
there. Also, for Mips16 I even have the option of not doing constant
islands in the present of inline assembler if I chose. When everything
has been completed I will summarize the port and notify people that
are knowledgable regarding the ARM Constant Islands code so they can
review it in it's entirety if they wish.

llvm-svn: 194053
2013-11-05 08:14:14 +00:00
Hao Liu d6b40b51c7 Implement AArch64 post-index vector load/store multiple N-element structure class SIMD(lselem-post).
Including following 14 instructions:
4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 194043
2013-11-05 03:39:32 +00:00
Kevin Qin 97f6aaa8ad Implemented aarch64 neon intrinsic vcopy_lane with float type.
llvm-svn: 194041
2013-11-05 02:03:59 +00:00
Yuchen Wu f3e653e9a6 Revert "Added basic unit test for llvm-cov."
This reverts commit 9cacd131c22b888303cb88e9a3235b2d7b2f19a1.

llvm-svn: 194039
2013-11-05 01:56:26 +00:00
Yuchen Wu 0b8e9a1480 Added basic unit test for llvm-cov.
This test compares the output of llvm-cov against a coverage file
generated by gcov.

llvm-svn: 194038
2013-11-05 01:56:23 +00:00
NAKAMURA Takumi 5267613e3a Revert r194019 to r194021, "Submit the basic port of the rest of ARM constant islands code to Mips."
It broke -Asserts build.

llvm-svn: 194026
2013-11-04 23:14:36 +00:00
Tim Northover ace0bd4d33 AArch64: use default asm operand printing when modifier inapplicable
If an inline assembly operand has multiple constraints (e.g. "Ir" for immediate
or register) and an operand modifier (E.g. "w" for "print register as wN") then
we need to decide behaviour when the modifier doesn't apply to the constraint.

Previousely produced some combination of an assertion failure and a fatal
error. GCC's behaviour appears to be to ignore the modifier and print the
operand in the default way. This patch should implement that.

llvm-svn: 194024
2013-11-04 23:04:07 +00:00
Reed Kotler 3fe68871da Add the test case that goes with the previous submission for constant
islands. I forgot to add it to svn on that patch. Ooops.

llvm-svn: 194020
2013-11-04 22:13:41 +00:00
Eric Christopher 542c8d934d Check for both styles of clobbers, those produced by dragonegg and
those produced by clang for the inline asm bswap conversion.

Modified from a patch by Chris Smowton.

llvm-svn: 194016
2013-11-04 21:41:21 +00:00
Matt Arsenault a8e894405c Fix another constant folding address space place I missed.
This fixes an assertion failure with a different sized address space.

llvm-svn: 194014
2013-11-04 20:46:52 +00:00
Matt Arsenault 243140f2fd Scalarize select vector arguments when extracted.
When the elements are extracted from a select on vectors
or a vector select, do the select on the extracted scalars
from the input if there is only one use.

llvm-svn: 194013
2013-11-04 20:36:06 +00:00
Cameron McInally d80f7d34de Add support for AVX512 masked vector blend intrinsics.
llvm-svn: 194006
2013-11-04 19:14:56 +00:00
Manman Ren 289ef7d992 Rename testing case to use - instead of _.
llvm-svn: 194001
2013-11-04 18:52:06 +00:00
Rafael Espindola 48da4f4691 Change BitcodeReader to use error_code instead of bool + string.
In order to create an ObjectFile implementation that uses bitcode files, we
need to propagate the bitcode errors to the ObjectFile interface, so we need
to convert it to use the same error handling as ObjectFile: error_code.

llvm-svn: 193996
2013-11-04 16:16:24 +00:00
Zoran Jovanovic 8a80aa76c8 Support for microMIPS branch instructions.
llvm-svn: 193992
2013-11-04 14:53:22 +00:00
Peter Zotov 7fc270a171 [OCaml] implement Llvm_passmgr_builder, bindings for PassManagerBuilder
llvm-svn: 193968
2013-11-04 01:39:42 +00:00
Peter Zotov 0f22bab63f [OCaml] Implement missing LLVMCore APIs
llvm-svn: 193966
2013-11-04 01:39:26 +00:00
Elena Demikhovsky dacddb0bab AVX-512: added VPCONFLICT instruction and intrinsics,
added EVEX_KZ to tablegen

llvm-svn: 193959
2013-11-03 13:46:31 +00:00
Venkatraman Govindaraju 5ae77f7564 [SparcV9] Handle i64 <-> float conversions in sparcv9 mode.
llvm-svn: 193957
2013-11-03 12:28:40 +00:00
David Majnemer 120f4a06fd Revert "Inliner: Handle readonly attribute per argument when adding memcpy"
This reverts commit r193356, it caused PR17781.

A reduced test case covering this regression has been added to the test suite.

llvm-svn: 193955
2013-11-03 12:22:13 +00:00
Peter Zotov 311548cdad [OCaml] Implement Llvm.MemoryBuffer.{of_string,as_string}
llvm-svn: 193953
2013-11-03 08:27:45 +00:00
Peter Zotov 45451cf62a [OCaml] Implement Llvm_linker, bindings for the IR linker
llvm-svn: 193951
2013-11-03 08:27:32 +00:00
Peter Zotov cbae39416f [OCaml] Implement Llvm_vectorize bindings
llvm-svn: 193950
2013-11-03 08:27:22 +00:00
Peter Zotov 5186033b0b [OCaml] Refactor Llvm_target tests
Llvm_target tests did not check for return values. This actually
caused them to miss a bug.

llvm-svn: 193949
2013-11-03 08:27:13 +00:00
Venkatraman Govindaraju f1d807ee13 [Sparc] Expand FP_TO_UINT, UINT_TO_FP for fp128.
llvm-svn: 193947
2013-11-03 08:00:19 +00:00
Peter Zotov 3e0c21ed53 [OCaml] Llvm_scalar_opts: add missing transforms
llvm-svn: 193946
2013-11-03 07:54:17 +00:00
Peter Zotov e4deac7b4a [OCaml] Llvm_ipo: add missing transforms
llvm-svn: 193945
2013-11-03 07:54:08 +00:00
Bob Wilson d8d92d90fa Convert calls to __sinpi and __cospi into __sincospi_stret
This adds an SimplifyLibCalls case which converts the special __sinpi and
__cospi (float & double variants) into a __sincospi_stret where appropriate to
remove duplicated work.

Patch by Tim Northover

llvm-svn: 193943
2013-11-03 06:48:38 +00:00
Bob Wilson e7dde0c061 Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+.
rdar://12856873
Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller

llvm-svn: 193942
2013-11-03 06:14:38 +00:00
Venkatraman Govindaraju 5615aca219 [SparcV9] Add ctpop instruction for i64. Also, expand ctlz, cttz and bswap.
llvm-svn: 193941
2013-11-03 05:59:07 +00:00
Rafael Espindola 99a3ba7674 A better fix that also works on ppc: add a target tripple.
llvm-svn: 193915
2013-11-02 06:00:09 +00:00
Rafael Espindola 3cd286643d Fix this test to pass on darwin now that llvm-nm is working.
llvm-svn: 193914
2013-11-02 05:29:22 +00:00
Rafael Espindola a135632af0 Fix llvm-nm to mach OS X's nm on some tests.
There is still a long way to go for llvm-nm, but at least we now match
nm's letter output in the cases we test for.

llvm-svn: 193912
2013-11-02 05:03:24 +00:00
Michael Liao b638d05ecb Fix PR17764
- When selecting BLEND from vselect, the operands need swapping as due to the
  difference between vselect and SSE/AVX's BLEND insn

llvm-svn: 193900
2013-11-02 00:10:02 +00:00
David Blaikie ba8125dfd0 DebugInfo: regenerate test case from Clang to adjust for fixes/improvements
I hit some problems with future work due to the member subprogram of
'a_b's type having a subprogram (an implicit default ctor, !52 in the
pre-commit source) with no name. Clang now generates a name for such a
function but in this case doesn't even emit debug info for it as it is
unused (Clang never emits the body of the ctor, instead just emitting
memset if needed).

llvm-svn: 193892
2013-11-01 22:29:28 +00:00
Arnold Schwaighofer a846a7f8f0 LoopVectorizer: Perform redundancy elimination on induction variables
When the loop vectorizer was part of the SCC inliner pass manager gvn would
run after the loop vectorizer followed by instcombine. This way redundancy
(multiple uses) were removed and instcombine could perform scalarization on the
induction variables. Having moved the loop vectorizer to later we no longer run
any form of redundancy elimination before we perform instcombine. This caused
vectorized induction variables to survive that did not before.

On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops.

This should also help lpbench and paq8p.

I ran a Release (without Asserts) build over the test-suite and did not see any
negative impact on compile time.

radar://15339680

llvm-svn: 193891
2013-11-01 22:18:19 +00:00
David Blaikie d0d458665a DebugInfo: Improve readability of test case added in r193878
The point is to ensure that the attribute in question
(DW_AT_data_member_location) is associated with the prior tag, so ensure
that we don't see another tag starting between the intended tag and the
desired attribute.

llvm-svn: 193884
2013-11-01 20:59:53 +00:00
David Blaikie f0bc1ec767 DebugInfo: add a test case for data member locations (coverage for r193835)
llvm-svn: 193878
2013-11-01 18:25:55 +00:00
David Blaikie c5f888c909 Fix a test case broken by r193872
llvm-svn: 193876
2013-11-01 18:18:16 +00:00
Manman Ren 1d0b6bb2ef Add comments.
llvm-svn: 193874
2013-11-01 18:06:25 +00:00
David Blaikie 2ede02f6d0 DebugInfo: Make pubnames header printing similar to unit header printing
In a failed attempt to allow the gnu-public-names.ll test case to not
hardcode the size of the unit that the pubnames section referred to I've
at least managed to have unit headers and pubnames headers print out in
a similar style.

This failed to achieve the desired goal because the header in a unit
specifies the length of the unit without the length element of the
header whereas the length in the pubnames includes this element, so the
numbers are off by 4 bytes. I don't know of any arithmetic powers in
FileCheck so the test case can't simply say "CU_LENGTH + 4".

llvm-svn: 193872
2013-11-01 17:53:30 +00:00
Benjamin Kramer 1fbcdca9e3 LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices
If we have a pointer to a single-element struct we can still build wide loads
and stores to it (if there is no padding).

llvm-svn: 193860
2013-11-01 14:09:50 +00:00
Bradley Smith 2521975a42 [ARM] Add Virtualization subtarget feature and more build attributes in this area
Add a Virtualization ARM subtarget feature along with adding proper build
attribute emission for Tag_Virtualization_use (encodes Virtualization and
TrustZone) and Tag_MPextension_use.

Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to
something that is more maintainable. This changes the focus of this
testcase away from testing CPU defaults (which is tested elsewhere), onto
specifically testing that attributes are encoded correctly.

llvm-svn: 193859
2013-11-01 13:27:35 +00:00
Bradley Smith c848beba5e [ARM] Fix Tag_ABI_HardFP_use build attribute
Fix Tag_ABI_HardFP_use build attribute to handle single precision FP,
replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add
some tests for Tag_ABI_VFP_args.

llvm-svn: 193856
2013-11-01 11:21:16 +00:00
Hal Finkel 4d94930bcb Consider (x == -1) unlikely in BranchProbabilityInfo
This adds another heuristic to BPI, similar to the existing heuristic that
considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by
Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and
equality comparisons with -1 are also unlikely to succeed. Local
experimentation supports this hypothesis: This yields a 1-2% speedup in the
test-suite sqlite benchmark on the PPC A2 core, with no significant
regressions.

llvm-svn: 193855
2013-11-01 10:58:22 +00:00
Arnold Schwaighofer 70a4665f55 LoopVectorizer: If dependency checks fail try runtime checks
When a dependence check fails we can still try to vectorize loops with runtime
array bounds checks.

This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the
scalar performance on a corei7-avx.

radar://15339680

llvm-svn: 193853
2013-11-01 03:05:07 +00:00
Rafael Espindola d7a0e60e8f Use \01 to disable the mangler. Should fix the 32 bit windows bots.
llvm-svn: 193846
2013-11-01 01:14:20 +00:00
David Blaikie 71d34a2eef DebugInfo: Emit member variable locations as data instead of expressions in blocks
Drive by space optimization. Also makes the DIEs more regular which
might speed up DWARF parsing.

llvm-svn: 193835
2013-11-01 00:25:45 +00:00
Andrew Trick f990411256 These test cases for experimental features are a bit too darwin-specific still. Use a triple.
llvm-svn: 193820
2013-10-31 22:46:51 +00:00
Chad Rosier 74b65cd811 [AArch64] Add support for NEON scalar fixed-point convert to floating-point instructions.
llvm-svn: 193816
2013-10-31 22:36:59 +00:00
Andrew Trick a3a11dedca Add new calling convention for WebKit Java Script.
llvm-svn: 193812
2013-10-31 22:12:01 +00:00
Andrew Trick 153ebe6d2a Add support for stack map generation in the X86 backend.
Originally implemented by Lang Hames.

llvm-svn: 193811
2013-10-31 22:11:56 +00:00
Rafael Espindola 57afdc7f09 Relax check line to match what llvm-nm prints for COFF.
llvm-svn: 193810
2013-10-31 22:07:46 +00:00
Manman Ren 87a2adc7fe Do not convert "call asm" to "invoke asm" in Inliner.
Given that backend does not handle "invoke asm" correctly ("invoke asm" will be
handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right
setup for LPadToCallSiteMap) and we already made the assumption that inline asm
does not throw in InstCombiner::visitCallSite, we are going to make the same
assumption in Inliner to make sure we don't convert "call asm" to "invoke asm".

If it becomes necessary to add support for "invoke asm" later on, we will need
to modify the backend as well as remove the assumptions that inline asm does
not throw.

Fix rdar://15317907

llvm-svn: 193808
2013-10-31 21:56:03 +00:00
Rafael Espindola 775ef460c9 XFAIL on ppc64 too.
llvm-svn: 193804
2013-10-31 21:27:02 +00:00
Rafael Espindola cb5bd5e508 XFAIL this for now.
llvm-svn: 193802
2013-10-31 21:22:43 +00:00
Rafael Espindola 282a47037b Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list".
There are two ways one could implement hiding of linkonce_odr symbols in LTO:
* LLVM tells the linker which symbols can be hidden if not used from native
  files.
* The linker tells LLVM which symbols are not used from other object files,
  but will be put in the dso symbol table if present.

GOLD's API is the second option. It was implemented almost 1:1 in llvm by
passing the list down to internalize.

LLVM already had partial support for the first option. It is also very similar
to how ld64 handles hiding these symbols when *not* doing LTO.

This patch then
* removes the APIs for the DSO list.
* marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr
  global values and other linkonce_odr whose address is not used.
* makes the gold plugin responsible for handling the API mismatch.

llvm-svn: 193800
2013-10-31 20:51:58 +00:00
Chad Rosier 77ada678ed [AArch64] Add diagnostic tests for NEON scalar shift immediate instructions (see: r193790).
llvm-svn: 193798
2013-10-31 20:11:32 +00:00
Chad Rosier 20e1f20d69 [AArch64] Add support for NEON scalar shift immediate instructions.
llvm-svn: 193790
2013-10-31 19:28:44 +00:00
Roman Divacky 2262cfaf19 SparcV9 doesnt have rem instruction either.
llvm-svn: 193789
2013-10-31 19:22:33 +00:00
Reid Kleckner 775f29573a Use a larger invalid attribute bitcode number
That way the test won't start faililng when someone adds a new attribute
and wants to use the next logical enum (38) for bitcode.  The new
bitcode file tries to use the number 48 as an attribute instead.

llvm-svn: 193787
2013-10-31 19:12:36 +00:00
Matt Arsenault b78b1b2330 Add FileCheck tests for @LINE
llvm-svn: 193782
2013-10-31 18:18:09 +00:00
Petar Jovanovic 1f8578dca6 [mips] XFAIL several MCJIT remote tests
Two of the tests are new test cases (cross-module-a.ll, multi-module-a.ll)
not yet supported on MIPS, while XFAIL for the other two tests was
accidentally removed in r193570 and this change reverts those lines.

llvm-svn: 193781
2013-10-31 18:10:25 +00:00
Manman Ren 4dbdc9021d Debug Info: remove duplication of DIEs when a DIE can be shared across CUs.
We add a map in DwarfDebug to map MDNodes that are shareable across CUs to the
corresponding DIEs: MDTypeNodeToDieMap. These DIEs can be shared across CUs,
that is why we keep the maps in DwarfDebug instead of CompileUnit.

We make the assumption that if a DIE is not added to an owner yet, we assume
it belongs to the current CU. Since DIEs for the type system are added to
their owners immediately after creation, and other DIEs belong to the current
CU, the assumption should be true.

A testing case is added to show that we only create a single DIE for a type
MDNode and we use ref_addr to refer to the type DIE.

We also add a testing case to show ref_addr relocations for non-darwin
platforms.

llvm-svn: 193779
2013-10-31 17:54:35 +00:00
Roman Divacky 8d72f4a06f Merge and filecheckize.
llvm-svn: 193778
2013-10-31 17:50:45 +00:00
Andrew Trick 2af716afbe Add Verifier test case for variable argument intrinsics.
llvm-svn: 193768
2013-10-31 17:18:17 +00:00
Andrew Trick a2efd99bdf Enable variable arguments support for intrinsics.
llvm-svn: 193766
2013-10-31 17:18:11 +00:00
Cameron McInally 394d557f41 Add AVX512 unmasked integer broadcast intrinsics and support.
llvm-svn: 193748
2013-10-31 13:56:31 +00:00
Elena Demikhovsky 496656900e AVX-512: Implemented CMOV for 512-bit vectors
llvm-svn: 193747
2013-10-31 13:15:32 +00:00
Richard Sandiford f834ea19db [SystemZ] Automatically detect zEC12 and z196 hosts
As on other hosts, the CPU identification instruction is priveleged,
so we need to look through /proc/cpuinfo.  I copied the PowerPC way of
handling "generic".

Several tests were implicitly assuming z10 and so failed on z196.

llvm-svn: 193742
2013-10-31 12:14:17 +00:00
Amara Emerson f80f95fcc7 [AArch64] Make the use of FP instructions optional, but enabled by default.
This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.

llvm-svn: 193739
2013-10-31 09:32:11 +00:00
NAKAMURA Takumi 160cef8ddc llvm/test/Bitcode/invalid.ll: Tweak expresion to mach "llvm-dis.EXE:"
llvm-svn: 193738
2013-10-31 06:21:00 +00:00
Rafael Espindola 26b43cac18 Fix a use after free on invalid input.
llvm-svn: 193737
2013-10-31 04:20:23 +00:00
Jim Grosbach 7236678687 Legalize: Improve legalization of long vector extends.
When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
  %tmp = zext <16 x i8> %a to <16 x i32>
  store <16 x i32> %tmp, <16 x i32>*%p
  ret void
}

Generates:
  .section  __TEXT,__text,regular,pure_instructions
  .section  __TEXT,__const
  .align  5
LCPI0_0:
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _bar
  .align  4, 0x90
_bar:
  vpunpckhbw  %xmm0, %xmm0, %xmm1
  vpunpckhwd  %xmm0, %xmm1, %xmm2
  vpmovzxwd %xmm1, %xmm1
  vinsertf128 $1, %xmm2, %ymm1, %ymm1
  vmovaps LCPI0_0(%rip), %ymm2
  vandps  %ymm2, %ymm1, %ymm1
  vpmovzxbw %xmm0, %xmm3
  vpunpckhwd  %xmm0, %xmm3, %xmm3
  vpmovzxbd %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vandps  %ymm2, %ymm0, %ymm0
  vmovaps %ymm0, (%rdi)
  vmovaps %ymm1, 32(%rdi)
  vzeroupper
  ret

So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and the continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more effecient, termination condition for the loop of "split the
operation until the destination type is legal."

With this change, the above example now compiles to:
_bar:
  vpxor %xmm1, %xmm1, %xmm1
  vpunpcklbw  %xmm1, %xmm0, %xmm2
  vpunpckhwd  %xmm1, %xmm2, %xmm3
  vpunpcklwd  %xmm1, %xmm2, %xmm2
  vinsertf128 $1, %xmm3, %ymm2, %ymm2
  vpunpckhbw  %xmm1, %xmm0, %xmm0
  vpunpckhwd  %xmm1, %xmm0, %xmm3
  vpunpcklwd  %xmm1, %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vmovaps %ymm0, 32(%rdi)
  vmovaps %ymm2, (%rdi)
  vzeroupper
  ret

This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.

rdar://14735100

llvm-svn: 193727
2013-10-31 00:20:48 +00:00
Matt Arsenault 2ba54c3d90 Fix CodeGen for unaligned loads with address spaces
llvm-svn: 193721
2013-10-30 23:30:05 +00:00
Matt Arsenault 38b8ecf378 Teach scalarrepl about address spaces
llvm-svn: 193720
2013-10-30 22:54:58 +00:00
Rafael Espindola 6f1b2852fc Produce .weak_def_can_be_hidden for some linkonce_odr values
With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr
if they are also unnamed_addr or don't have their address taken.

There is not a lot of documentation about .weak_def_can_be_hidden, but
from the old discussion about linkonce_odr_auto_hide and the name of
the directive this looks correct: these symbols can be hidden.

Testing this with the ld64 in Xcode 5 linking clang reduces the number of
exported symbols from 21053 to 19049.

llvm-svn: 193718
2013-10-30 22:08:11 +00:00
Will Dietz b67a714d37 Add DebugInfo testcase for high_pc encoded as constant, fixed in r193555.
llvm-svn: 193711
2013-10-30 20:27:17 +00:00
Matt Arsenault 614ea99da7 Fix GVN creating bitcast between address spaces
llvm-svn: 193710
2013-10-30 19:05:41 +00:00
Tom Roeder 04d88fba3e This commit adds some (but not all) of the x86-64 relocations that are not
currently supported in the ELF object writer, along with a simple test case.

llvm-svn: 193709
2013-10-30 18:47:25 +00:00
Artyom Skrobov c1be9c16bc [ARM] NEON instructions were erroneously decoded from certain invalid encodings
llvm-svn: 193705
2013-10-30 18:10:09 +00:00
Tom Stellard c947d8ca64 R600: Custom lower f32 = uint_to_fp i64
llvm-svn: 193701
2013-10-30 17:22:05 +00:00
Daniel Sanders d5f554f0bb [mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related tests
llvm-svn: 193695
2013-10-30 15:45:42 +00:00
Daniel Sanders ab94b537d7 [mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics)
Also corrected the definition of the intrinsics for these instructions (the
result register is also the first operand), and added intrinsics for bsel and
bseli to clang (they already existed in the backend).

These four operations are mostly equivalent to bsel, and bseli (the difference
is which operand is tied to the result). As a result some of the tests changed
as described below.

bitwise.ll:
- bsel.v test adapted so that the mask is unknown at compile-time. This stops
  it emitting bmnzi.b instead of the intended bsel.v.
- The bseli.b test now tests the right thing. Namely the case when one of the
  values is an uimm8, rather than when the condition is a uimm8 (which is
  covered by bmnzi.b)

compare.ll:
- bsel.v tests now (correctly) emits bmnz.v instead of bsel.v because this
  is the same operation (see MSA.txt).

i8.ll
- CHECK-DAG-ized test.
- bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
  because this is the same operation (see MSA.txt).
- bseli.b still emits bseli.b though because the immediate makes it
  distinguishable from bmnzi.b.

vec.ll:
- CHECK-DAG-ized test.
- bmz.v tests now (correctly) emits bmnz.v with swapped operands (see
  MSA.txt).
- bsel.v tests now (correctly) emits bmnz.v with swapped operands (see
  MSA.txt).

llvm-svn: 193693
2013-10-30 15:20:38 +00:00
Chad Rosier be020d0309 [AArch64] Add support for NEON scalar floating-point compare instructions.
llvm-svn: 193691
2013-10-30 15:19:37 +00:00
Daniel Sanders d74b130cc9 [mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics)
This required correcting the definition of the bins[lr]i intrinsics because
the result is also the first operand.

It also required removing the (arbitrary) check for 32-bit immediates in
MipsSEDAGToDAGISel::selectVSplat().

Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
because the constant is legalized into a ConstantPool. Similar things can
happen with binsri.d with more than 10 bits set in the mask. The resulting
code when this happens is correct but not optimal.

llvm-svn: 193687
2013-10-30 14:45:14 +00:00
Daniel Sanders 53fe6c4d56 [mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT
(or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
where $mask is a constant splat. This allows bitwise operations to make use
of bsel.

It's also a stepping stone towards matching bins[lr], and bins[lr]i from
normal IR.

Two sets of similar tests have been added in this commit. The bsel_* functions
test the case where binsri cannot be used. The binsr_*_i functions will
start to use the binsri instruction in the next commit.

llvm-svn: 193682
2013-10-30 13:51:01 +00:00
Daniel Sanders e7ef0c817b [mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not intrinsics)
splat.d is implemented but this subtest is currently disabled. This is because
it is difficult to match the appropriate IR on MIPS32. There is a patch under
review that should help with this so I hope to enable the subtest soon.

llvm-svn: 193680
2013-10-30 13:07:44 +00:00
Juergen Ributzka 3bd686d493 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
Now Hexagon and SystemZ are not happy with it :-(

llvm-svn: 193677
2013-10-30 06:36:19 +00:00
Juergen Ributzka 6ad05d6b95 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask has
usually the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

llvm-svn: 193676
2013-10-30 05:48:18 +00:00
Manman Ren f4c339e04a Debug Info: instead of calling addToContextOwner which constructs the context
after the DIE creation, we construct the context first.

Ensure that we create the context before we create a type so that we can add
the newly created type to the parent. Remove last use of addToContextOwner
now that it's not needed.

We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs
should be added to their parents right after the creation.

Reviewed off-list by Eric, Thanks.

llvm-svn: 193657
2013-10-29 22:49:29 +00:00
Akira Hatanaka 6b2d841975 [mips] Align the stack to 16-bytes for mfp64.
llvm-svn: 193641
2013-10-29 19:29:03 +00:00
Manman Ren 75cc7658e1 Debug Info: clean up testing case.
Add a tag before the name attribute for readability. Use CHECK-NEXT
instead of CHECK-NOT followed by a CHECK. Add new lines to separate checking
of different DIEs.

llvm-svn: 193629
2013-10-29 17:27:14 +00:00
Weiming Zhao acf48d75e5 add test cases for frameaddr and returnaddr for aarch64
llvm-svn: 193626
2013-10-29 17:01:29 +00:00
Zoran Jovanovic 507e084a18 Support for microMIPS jump instructions
llvm-svn: 193623
2013-10-29 16:38:59 +00:00
Tom Stellard 6e1ee476ab R600/SI: Add compute support for CI v2
v2:
  - Fix LDS size calculation

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 193621
2013-10-29 16:37:28 +00:00
Tom Stellard e118b8becd R600: Expand vector FSQRT ops
llvm-svn: 193620
2013-10-29 16:37:20 +00:00
Bernard Ogden fce246f0c6 Test cleanup for v8 instructions
Add some missing tests, factor out a test not specific to v8 into
its own file.

llvm-svn: 193611
2013-10-29 14:16:09 +00:00
Bernard Ogden ee87e85505 ARM: Add subtarget feature for CRC
Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend.

Differential Revision: http://llvm-reviews.chandlerc.com/D2036

llvm-svn: 193599
2013-10-29 09:47:35 +00:00
Tim Northover d29ddf6713 AArch64: add 'a' inline asm operand modifier
This is used in the Linux kernel, and effectively just means "print an
address".

llvm-svn: 193593
2013-10-29 08:22:33 +00:00
Manman Ren f6b936bc06 Debug Info: instead of calling addToContextOwner which constructs the context
after the DIE creation, we construct the context first.

This touches creation of namespaces and global variables. The purpose is to
handle all DIE creations similarly: constructs the context first, then creates
the DIE and immediately adds the DIE to its parent.

We use createAndAddDIE to wrap around "new DIE(".

llvm-svn: 193589
2013-10-29 05:49:41 +00:00
NAKAMURA Takumi 16c7184ba4 Add llvm/test/Transforms/SLPVectorizer/ARM/lit.local.cfg. Tests there require ARM in targets.
llvm-svn: 193580
2013-10-29 02:46:00 +00:00
Alp Toker 6a03374526 Fix "existant" typos
llvm-svn: 193579
2013-10-29 02:35:28 +00:00
Arnold Schwaighofer 89ae217422 ARM cost model: Unaligned vectorized double stores are expensive
Updated a test case that assumed that <2 x double> would vectorize to use
<4 x float>.

radar://15338229

llvm-svn: 193574
2013-10-29 01:33:57 +00:00
Arnold Schwaighofer 77af0f6e82 ARM cost model: Account for zero cost scalar SROA instructions
By vectorizing a series of srl, or, ... instructions we have obfuscated the
intention so much that the backend does not know how to fold this code away.

radar://15336950

llvm-svn: 193573
2013-10-29 01:33:53 +00:00
Andrew Kaylor 1ca510ea67 Adding a workaround for __main linking with remote lli and Cygwin/MinGW
llvm-svn: 193570
2013-10-29 01:29:56 +00:00
Joerg Sonnenberger fc18473400 Move the STT_FILE symbols out of the normal symbol table processing for
ELF. They can overlap with the other symbols, e.g. if a source file
"foo.c" contains a function "foo" with a static variable "c".

llvm-svn: 193569
2013-10-29 01:06:17 +00:00
Manman Ren 73d697c641 Debug Info: use createAndAddDIE for newly-created Subprogram DIEs.
More patches will be submitted to convert "new DIE(" to use createAddAndDIE in
DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where
we have to decide between ref4 and ref_addr, because DIEs that can be shared
across CU will be added to a CU already.

Reviewed off-list by Eric.

llvm-svn: 193567
2013-10-29 00:58:04 +00:00
Andrew Kaylor 2873b38e69 Renaming MCJIT .ir files to .ll and moving them to Inputs
llvm-svn: 193562
2013-10-28 23:51:03 +00:00
Alp Toker 5e9ed7cf1d lit: add missing substitutions for recently added tools
llvm-mcmarkup, obj2yaml and yaml2obj were missing from the substitutions list,
causing the test suite to fail in a sandboxed environment.

llvm-svn: 193559
2013-10-28 23:37:49 +00:00
Alp Toker 0d44e49e92 Quote potential shell expansions found in tests
llvm-svn: 193558
2013-10-28 23:37:45 +00:00
Rafael Espindola d1cac0af6b Convert another llc -filetype=obj test.
llvm-svn: 193548
2013-10-28 22:17:19 +00:00
Rafael Espindola 3a8c0734f9 Convert another llc -filetype=obj test.
llvm-svn: 193547
2013-10-28 22:11:47 +00:00
Rafael Espindola 060e6444ea Convert another llc -filetype=obj test.
llvm-svn: 193546
2013-10-28 22:05:05 +00:00
Andrew Kaylor 4404eb4857 Standardizing lli's extra module command line option
llvm-svn: 193544
2013-10-28 21:58:15 +00:00
Rafael Espindola 940ca0bada Convert another llc -filetype=obj test.
llvm-svn: 193539
2013-10-28 21:12:15 +00:00
Rafael Espindola 57ec995c37 Convert another llc -filetype=obj test.
llvm-svn: 193538
2013-10-28 21:06:12 +00:00
Rafael Espindola 3a5eecb57c Convert another llc -filetype=obj test.
llvm-svn: 193537
2013-10-28 20:59:41 +00:00
Rafael Espindola 3f018baac0 Convert another llc -filetype=obj test.
llvm-svn: 193536
2013-10-28 20:54:33 +00:00
Lang Hames b52816615b Return early from getUnconditionalBranchTargetOpValue if the branch target is
an MCExpr, in order to avoid writing an encoded zero value in the immediate
field.

When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we
don't know what the final immediate field value should be. We shouldn't
explicitly set the immediate field to an encoded zero value as zero is encoded
with a non-zero bit pattern. This leads to bits being set that pollute the
final immediate value. The nature of the encoding is such that the polluted
bits only affect very large immediate values, explaining why this hasn't
caused problems earlier.

Fixes <rdar://problem/15155975>.

llvm-svn: 193535
2013-10-28 20:51:11 +00:00
Rafael Espindola 889a180e5a Convert a llc -filetype=obj test into a llvm-mc test.
llvm-svn: 193534
2013-10-28 20:40:20 +00:00
Logan Chien 8cbb80d159 [arm] Implement eabi_attribute, cpu, and fpu directives.
This commit allows the ARM integrated assembler to parse
and assemble the code with .eabi_attribute, .cpu, and
.fpu directives.

To implement the feature, this commit moves the code from
AttrEmitter to ARMTargetStreamers, and several new test
cases related to cortex-m4, cortex-r5, and cortex-a15 are
added.

Besides, this commit also change the Subtarget->isFPOnlySP()
to Subtarget->hasD16() to match the usage of .fpu directive.

This commit changes the test cases:

* Several .eabi_attribute directives in
  2010-09-29-mc-asm-header-test.ll are removed because the .fpu
  directive already cover the functionality.

* In the Cortex-A15 test case, the value for
  Tag_Advanced_SIMD_arch has be changed from 1 to 2,
  which is more precise.

llvm-svn: 193524
2013-10-28 17:51:12 +00:00
Richard Sandiford 094e609716 [SystemZ] Set usaAA to true
useAA significantly improves the handling of vector code that has TBAA
information attached.  It also helps other cases, as shown by the testsuite
changes here.  The only real downside I've seen is that it interferes with
MergeConsecutiveStores.  The problem is that that optimization works top
down, starting at the first store in the chain, and looks for cases where
the chain result is only used by a single related store.  These related
stores don't alias, so useAA will have rewritten all the later stores to
use a different chain input (typically the same one as the first store).

I think the advantages outweigh the disadvantages though, so for now I've
just disabled alias analysis for the unaligned-01.ll test.

llvm-svn: 193521
2013-10-28 13:53:37 +00:00
Richard Sandiford 981fdeb477 [DAGCombiner] Respect volatility when checking for aliases
Making useAA() default to true for SystemZ showed that the combiner alias
analysis wasn't handling volatile accesses.  This hit many of the SystemZ
tests, but I arbitrarily picked one for the purpose of this patch.

llvm-svn: 193518
2013-10-28 12:00:00 +00:00
Richard Sandiford 39c1ce4dc1 Keep TBAA info when rewriting SelectionDAG loads and stores
Most SelectionDAG code drops the TBAA info when creating a new form of a
load and store (e.g. during legalization, or when converting a plain
load to an extending one).  This patch tries to catch all cases where
the TBAA information can legitimately be carried over.

The patch adds alternative forms of getLoad() and getExtLoad() that take
a MachineMemOperand instead of individual fields.  (The corresponding
getTruncStore() already exists.)  The idea is to use the MachineMemOperand
forms when all fields are carried over (size, pointer info, isVolatile,
isNonTemporal, alignment and TBAA info).  If some adjustment is being
made, e.g. to narrow the load, then we still pass the individual fields
but also pass the TBAA info.

llvm-svn: 193517
2013-10-28 11:17:59 +00:00
Benjamin Kramer 6094f30da2 SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive.
We can't do this for the general case as saying a GEP with a negative index
doesn't have unsigned wrap isn't valid for negative indices.
  %gep = getelementptr inbounds i32* %p, i64 -1

But an inbounds GEP cannot run past the end of address space. So we check for
the very common case of a positive index and make GEPs derived from that NUW.
Together with Andy's recent non-unit stride work this lets us analyze loops
like

  void foo3(int *a, int *b) {
    for (; a < b; a++) {}
  }

PR12375, PR12376.

Differential Revision: http://llvm-reviews.chandlerc.com/D2033

llvm-svn: 193514
2013-10-28 07:30:06 +00:00
Reed Kotler 91ae9829a9 Make first substantial checkin of my port of ARM constant islands code to Mips.
Before I just ported the shell of the pass. I've tried to keep everything
nearly identical to the ARM version. I think it will be very easy to eventually
merge these two and create a new more general pass that other targets can
use. I have some improvements I would like to make to allow pools to 
be shared across functions and some other things. When I'm all done we
can think about making a more general pass. More to be ported but the
basic mechanism works now almost as good as gcc mips16.

llvm-svn: 193509
2013-10-27 21:57:36 +00:00
NAKAMURA Takumi 5bb014371e MCJIT-remote: __main should be resolved in child context.
- Mark tests as XFAIL:cygming in test/ExecutionEngine/MCJIT/remote.
    Rather to suppress them, I'd like to leave them running as XFAIL.
  - Revert r193472. RecordMemoryManager no longer resolves __main on cygming.

There are a couple of issues.

  - X86 Codegen emits "call __main" in @main for targeting cygming.
    It is useless in JIT. FYI, tests are passing when emitting __main is disabled.
  - Current remote JIT does not resolve any symbols in child context.

FIXME: __main should be disabled, or remote JIT should resolve __main.
llvm-svn: 193498
2013-10-27 10:22:52 +00:00
Elena Demikhovsky 199c823555 AVX-512: PMIN/PMAX intrinsics and patterns
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193497
2013-10-27 08:18:37 +00:00
Shuxin Yang 2e1890e18b Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.
llvm-svn: 193489
2013-10-27 03:08:44 +00:00
NAKAMURA Takumi e00225bf14 llvm/test/lit.cfg: Tighten conditions to enable 'native'.
I saw the case that 'native' was mis-enabled when x86_64-pc-win32 on x86_64-linux.

FIXME: Consider cases that target can be executed even if host_triple were different from target_triple.
llvm-svn: 193459
2013-10-26 02:50:20 +00:00
NAKAMURA Takumi 0328dfa6a4 llvm/test/Other/close-stderr.ll: Remove "XFAIL:win32". It reverts r173509.
"REQUIRES: shell" should cover if this failed.

llvm-svn: 193458
2013-10-26 02:50:14 +00:00
Andrew Trick 57243da70f Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)

When SCEV expands a recurrence outside of a loop it attempts to scale
by the stride of the recurrence. Chained recurrences don't work that
way. We could compute binomial coefficients, but would hve to
guarantee that the chained AddRec's are in a perfectly reduced form.

llvm-svn: 193438
2013-10-25 21:35:56 +00:00
Andrew Trick 29abce3189 Fix LSR: don't normalize quadratic recurrences.
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)

ScalarEvolutionNormalization was attempting to normalize by adding and
subtracting strides. Chained recurrences don't work that way.

llvm-svn: 193437
2013-10-25 21:35:52 +00:00
Rafael Espindola 7749d7ccc7 Handle calls and invokes in GlobalStatus.
This patch teaches GlobalStatus to analyze a call that uses the global value as
a callee, not as an argument.

With this change internalize call handle the common use of linkonce_odr
functions. This reduces the number of linkonce_odr functions in a LTO build of
clang (checked with the emit-llvm gold plugin option) from 1730 to 60.

llvm-svn: 193436
2013-10-25 21:29:52 +00:00
Hal Finkel 02f562df43 LoopVectorizer: Don't attempt to vectorize extractelement instructions
The loop vectorizer does not currently understand how to vectorize
extractelement instructions. The existing check, which excluded all
vector-valued instructions, did not catch extractelement instructions because
it checked only the return value. As a result, vectorization would proceed,
producing illegal instructions like this:

  %58 = extractelement <2 x i32> %15, i32 0
  %59 = extractelement i32 %58, i32 0

where the second extractelement is illegal because its first operand is not a vector.

llvm-svn: 193434
2013-10-25 20:40:15 +00:00
Quentin Colombet 8761a8f5c0 [X86][AVX512] Add patterns that match the AVX512 floating point register vbroadcast intrinsics.
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193422
2013-10-25 18:04:12 +00:00
Quentin Colombet 4bf1c282c2 [X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast intrinsics.
Patch by Cameron McInally <cameron.mcinally@nyu.edu>

llvm-svn: 193421
2013-10-25 17:47:18 +00:00
Tim Northover 1744d0ad83 ARM: allow .thumb_func to be separated from symbol definition
When assembling, a .thumb_func directive is supposed to be applicable to the
next symbol definition, even if there are intervening directives. We were
racing ahead to try and find it, and this commit should fix the issue.

Patch by Gabor Ballabas

llvm-svn: 193403
2013-10-25 12:49:50 +00:00
Tim Northover c7ea8048e7 ARM: don't expand atomicrmw inline on Cortex-M0
There's a barrier instruction so that should still be used, but most actual
atomic operations are going to need a platform decision on the correct
behaviour (either nop if single-threaded or OS-support otherwise).

rdar://problem/15287210

llvm-svn: 193399
2013-10-25 09:30:24 +00:00
Tim Northover a564d329c2 LegalizeDAG: allow libcalls for max/min atomic operations
ARM processors without ldrex/strex need to be able to make libcalls for all
atomic operations, including the newer min/max versions.

The alternative would probably be expanding these operations in terms of
cmpxchg (as x86 does always), but in the configurations where this matters
code-size tends to be paramount so the libcall is more desirable.

llvm-svn: 193398
2013-10-25 09:30:20 +00:00
Tim Northover 41d2049180 ARM: tweak test to pass on all platforms
A TableGen indeterminacy means that the reason for the failure can
vary, and Windows gets the other option.

llvm-svn: 193394
2013-10-25 07:34:56 +00:00
Jim Grosbach c16a657ad0 ARM: Test r193381 a bit more thoroughly.
Make sure we're predicating right based on CPU even if the triple is 'wrong'.

llvm-svn: 193382
2013-10-24 23:11:05 +00:00
Jim Grosbach 1d1d6d4675 ARM: Tweak usage of '*vfp' compiler_rt functions.
Only use them if the subtarget has ARM mode, as these routines are implemented
as ARM code.

rdar://15302004

llvm-svn: 193381
2013-10-24 23:07:11 +00:00
Tom Stellard bc7d87f07c Inliner: Handle readonly attribute per argument when adding memcpy
Patch by: Vincent Lejeune

llvm-svn: 193356
2013-10-24 16:38:33 +00:00
Renato Golin 9f36932c8d I had to move and remove
llvm-svn: 193355
2013-10-24 16:31:43 +00:00
Tim Northover 5620faf771 ARM: Mark double-precision instructions as such
This prevents us from silently accepting invalid instructions on (for example)
Cortex-M4 with just single-precision VFP support.

No tests for the extra Pat Requires because they're essentially assertions: the
affected code should have been lowered to libcalls before ISel.

rdar://problem/15302004

llvm-svn: 193354
2013-10-24 15:49:39 +00:00
Renato Golin e865d70678 Fix broken builds by moving test to x86 dir
llvm-svn: 193351
2013-10-24 15:11:03 +00:00
Renato Golin 1ba143e140 Mark vector loops as already vectorized
Make sure we mark all loops (scalar and vector) when vectorizing,
so that we don't try to vectorize them anymore. Also, set unroll
to 1, since this is what we check for on early exit.

llvm-svn: 193349
2013-10-24 14:50:51 +00:00
Tim Northover 225bcbbe71 ARM: add a couple more NEON predicates.
The fused multiply instructions were added in VFPv4 but are still NEON
instructions, in particular they shouldn't be available on a Cortex-M4 not
matter how floaty it is.

llvm-svn: 193342
2013-10-24 12:48:05 +00:00
Tim Northover 64dacb2b8a ARM: mark various aliases with their architecture requirements.
If an alias inherits directly from InstAlias then it doesn't get any default
"Requires" values, so llvm-mc will allow it even on architectures that don't
support the underlying instruction.

This tidies up the obvious VFP and NEON cases I found.

llvm-svn: 193340
2013-10-24 12:22:58 +00:00
Zoran Jovanovic 2f0a712e18 Added tests for microMIPS relocations 1.
llvm-svn: 193332
2013-10-24 10:55:00 +00:00
Tim Northover 94ecbd2e6c ARM: Use non-VFP softcalls on embedded Darwinish targets
The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1
code to make use of VFP instructions by switching back to ARM mode, they make
no sense for M-class processors which don't even have an ARM mode.

Given that justification, in practice this is a platform ABI decision so the
actual check is based on that rather than CPU features.

rdar://problem/15302004

llvm-svn: 193327
2013-10-24 10:37:09 +00:00
Tim Northover 741e6ef4d4 ARM: fix assert on unpredictable POP instruction.
POP instructions are aliased to the ARM LDM variants but have different syntax.
This caused two problems: we tried to access a non-existent operand to annotate
the '!', and the error message didn't make much sense.

With some vigorous hand-waving in the error message both problems can be
fixed.

llvm-svn: 193322
2013-10-24 09:37:18 +00:00
Yaron Keren 744fcdf587 Added test for -elf configuration, to see that _alloca call is properly
generated. See:

http://llvm.org/viewvc/llvm-project?view=revision&revision=193289

llvm-svn: 193321
2013-10-24 09:36:08 +00:00
Job Noorman a8d35c98fd Make sure SP is always aligned on a 2 byte boundary
llvm-svn: 193320
2013-10-24 09:32:31 +00:00
Nuno Lopes 340b0463e6 fix PR17635: false positive with packed structures
LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object

llvm-svn: 193317
2013-10-24 09:17:24 +00:00
Amara Emerson c5cae0f20c [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen.
When generating the IfTrue basic block during the F128CSEL pseudo-instruction
handling, the NZCV live-in for the newly created BB wasn't being added. This
caused a fault during MI-sched/live range calculation when the predecessor
for the fall-through BB didn't have a live-in for phys-reg as expected.

llvm-svn: 193316
2013-10-24 08:28:24 +00:00
Elena Demikhovsky dd0794e51b AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsics
llvm-svn: 193312
2013-10-24 07:16:35 +00:00
Craig Topper 551862d66b Replace sse41/sse42 with sse4.1/sse4.2 in test command lines to fix bots.
llvm-svn: 193311
2013-10-24 07:00:06 +00:00
Craig Topper 1c1ecb51f0 Add non-AVX tests for AES intrinsics.
llvm-svn: 193310
2013-10-24 06:50:17 +00:00
Craig Topper 5b7c2b4f0d Add tests for SSE intrinsics in non-avx mode by copying from the AVX test cases. Some of these may have been tested by other tests, but most weren't. Patch by Cameron McInally.
llvm-svn: 193309
2013-10-24 06:45:13 +00:00
Juergen Ributzka d04d096ecf Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks.
Reviewed by Andy

llvm-svn: 193303
2013-10-24 05:29:56 +00:00
Benjamin Kramer 0ccab2d66c X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate.
Also update the cost model.

llvm-svn: 193270
2013-10-23 21:06:07 +00:00
Benjamin Kramer da8446b833 X86: Custom lower zext v16i8 to v16i16.
On sandy bridge (PR17654) we now get
	vpxor	%xmm1, %xmm1, %xmm1
	vpunpckhbw	%xmm1, %xmm0, %xmm2
	vpunpcklbw	%xmm1, %xmm0, %xmm0
	vinsertf128	$1, %xmm2, %ymm0, %ymm0

On haswell it's a simple
	vpmovzxbw	%xmm0, %ymm0

There is a maze of duplicated and dead transforms and patterns in this
area. Remove the dead custom lowering of zext v8i16 to v8i32, that's
already handled by LowerAVXExtend.

llvm-svn: 193262
2013-10-23 19:19:04 +00:00
Michael Liao c7bf44d7bb Fix PR17631
- Skip instructions added in prolog. For specific targets, prolog may
  insert helper function calls (e.g. _chkstk will be called when
  there're more than 4K bytes allocated on stack). However, these
  helpers don't use/def YMM/XMM registers.

llvm-svn: 193261
2013-10-23 18:32:43 +00:00
NAKAMURA Takumi 3f1f0c9923 Add llvm-c-test to check-llvm.
llvm-svn: 193258
2013-10-23 17:57:04 +00:00
Shuxin Yang e4fb375995 Use address-taken to disambiguate global variable and indirect memops.
Major steps include:
 1). introduces a not-addr-taken bit-field in GlobalVariable
 2). GlobalOpt pass sets "not-address-taken" if it proves a global varirable 
    dosen't have its address taken.
 3). AA use this info for disambiguation. 

llvm-svn: 193251
2013-10-23 17:28:19 +00:00
Anders Waldenborg 7ec37f8f3b Fix cmake dependency on llvm-c-test in test
llvm-svn: 193243
2013-10-23 15:01:23 +00:00
Matheus Almeida be8681b461 [mips][msa] Direct Object Emission support for the LSA instruction.
llvm-svn: 193240
2013-10-23 13:20:07 +00:00
Daniel Sanders a952160078 [mips][msa] Added support for matching fexp2 from normal IR (i.e. not intrinsics)
llvm-svn: 193239
2013-10-23 10:36:52 +00:00
Artyom Skrobov fc12e7016c Make ARM hint ranges consistent, and add tests for these ranges
llvm-svn: 193238
2013-10-23 10:14:40 +00:00
Anders Waldenborg 49416cf126 Fix check for supported targets in llvm-c lit.local.cfg
llvm-svn: 193235
2013-10-23 08:47:52 +00:00
Anders Waldenborg b932c66955 Add llvm-c-test tool for testing llvm-c
This provides rudimentary testing of the llvm-c api.

The following commands are implemented:

  * --module-dump
    Read bytecode from stdin - print ir

  * --module-list-functions
    Read bytecode from stdin - list summary of functions

  * --module-list-globals
    Read bytecode from stdin - list summary of globals

  * --targets-list
    List available targets

  * --object-list-sections
    Read object file from stdin - list sections

  * --object-list-symbols
    Read object file from stdin - list symbols (like nm)

  * --disassemble
    Read lines of triple, hex ascii machine code from stdin - print disassembly

  * --calc
    Read lines of name, rpn from stdin - print generated module ir

Differential-Revision: http://llvm-reviews.chandlerc.com/D1776
llvm-svn: 193233
2013-10-23 08:10:20 +00:00
Tom Stellard 54774e5681 R600/SI: fix MIMG writemask adjustement
This fixes piglit:
- shaders/glsl-fs-texture2d-masked
- shaders/glsl-fs-texture2d-masked-4

Patch by: Marek Olšák

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 193222
2013-10-23 02:53:47 +00:00
Tom Stellard af77543244 R600: Fix handling of vector kernel arguments
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted.  In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.

llvm-svn: 193215
2013-10-23 00:44:32 +00:00
Tom Stellard fb9616905a R600/SI: Add support for i64 bitwise or
llvm-svn: 193213
2013-10-23 00:44:19 +00:00
Tom Stellard a66cafa096 R600/SI: Use S_LOAD_DWORD instructions for v8i32 and v16i32
llvm-svn: 193212
2013-10-23 00:44:12 +00:00
David Blaikie 15b25dffc3 MC: Support multiple sections with the same name in the same comdat group
Code review by Eric Christopher and Rafael Espindola.

llvm-svn: 193209
2013-10-22 23:41:52 +00:00
Tim Northover 08a8660260 ARM: provide diagnostics on more writeback LDM/STM instructions
The set of circumstances where the writeback register is allowed to be in the
list of registers is rather baroque, but I think this implements them all on
the assembly parsing side.

For disassembly, we still warn about an ARM-mode LDM even if the architecture
revision is < v7 (the required architecture information isn't available). It's
a silly instruction anyway, so hopefully no-one will mind.

rdar://problem/15223374

llvm-svn: 193185
2013-10-22 19:00:39 +00:00
Tom Stellard 26a3b67b3b R600: Simplify handling of private address space
The AMDGPUIndirectAddressing pass was previously responsible for
lowering private loads and stores to indirect addressing instructions.
However, this pass was buggy and way too complicated.  The only
advantage it had over the new simplified code was that it saved one
instruction per direct write to private memory.  This optimization
likely has a minimal impact on performance, and we may be able
to duplicate it using some other transformation.

For the private address space, we now:
1. Lower private loads/store to Register(Load|Store) instructions
2. Reserve part of the register file as 'private memory'
3. After regalloc lower the Register(Load|Store) instructions to
   MOV instructions that use indirect addressing.

llvm-svn: 193179
2013-10-22 18:19:10 +00:00
Manman Ren a1ce9891ac Simplify testing case (Thanks Rafael for the testing case).
llvm-svn: 193177
2013-10-22 18:15:50 +00:00
Matheus Almeida eb68d9d985 [mips][msa] Direct Object Emission support for conditional branches.
These branches have a 16-bit offset (R_MIPS_PC16).

List of conditional branch instructions:
bnz.{b,h,w,d}
bnz.v
bz.{b,h,w,d}
bz.v

llvm-svn: 193157
2013-10-22 09:43:32 +00:00
Elena Demikhovsky 1f3ed4169c AVX-512: aligned / unaligned load and store for 512-bit integer vectors.
llvm-svn: 193156
2013-10-22 09:19:28 +00:00
Bill Wendling 7855b38c0e Add testcase for PR3168. It was fixed over time.
PR3168

llvm-svn: 193152
2013-10-22 08:23:03 +00:00
Manman Ren f3850c0807 TBAA: fix PR17620.
We can have a struct type with a single field and the field does not start
with 0. In that case, we should correctly update the offset.

llvm-svn: 193137
2013-10-22 01:40:25 +00:00
Eric Christopher 874fa0f6c7 Fix spelling, grammar, and match naming convention for test files.
llvm-svn: 193130
2013-10-21 23:14:06 +00:00
Chad Rosier e012cb3783 [AArch64] Add the constraint to NEON scalar mla/mls instructions.
llvm-svn: 193117
2013-10-21 20:11:47 +00:00
Tom Stellard e1631ddf93 SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2
v2:
  - Use CI->cannotDuplicate()

llvm-svn: 193115
2013-10-21 20:07:30 +00:00
Matt Arsenault 51f9f77494 Fix CodeGen for vectors of pointers with address spaces.
llvm-svn: 193112
2013-10-21 20:03:58 +00:00
Matt Arsenault b768912db8 Fix CodeGen for different size address space GEPs
llvm-svn: 193111
2013-10-21 20:03:54 +00:00
Matt Arsenault fa64659bd8 Teach SimplifyCFG about address spaces
llvm-svn: 193104
2013-10-21 18:55:08 +00:00
Matt Arsenault be18b8a3ca Fix creating bitcasts between address spaces in SCEV.
The test before wasn't successfully testing this
since it was missing the datalayout piece to change
the size of the second address space.

llvm-svn: 193102
2013-10-21 18:41:10 +00:00
Lang Hames 2783993fca X86 vector element shift-by-immediate instructions take i8 immediates. Make
the instruction defenitions and ISEL reflect this.

Prior to this patch these instructions took an i32i8imm, and the high bits were
dropped during encoding. This led to incorrect behavior for shifts by
immediates higher than 255. This patch fixes that issue by detecting large
immediate shifts and returning constant zero (for logical shifts) or capping
the shift amount at an encodable value (for arithmetic shifts).

Fixes <rdar://problem/14968098>

llvm-svn: 193096
2013-10-21 17:51:24 +00:00
Rafael Espindola 3d7fc25c7c Optimize more linkonce_odr values during LTO.
When a linkonce_odr value that is on the dso list is not unnamed_addr
we can still look to see if anything is actually using its address. If
not, it is safe to hide it.

This patch implements that by moving GlobalStatus to Transforms/Utils
and using it in Internalize.

llvm-svn: 193090
2013-10-21 17:14:55 +00:00
David Blaikie 63bb3e1182 DebugInfo: Hash DW_FORM_GNU_str_index as a string.
Found while adding type safety to the various DWARF enumerations (form,
attribute, tag, etc) that caused Clang to warn on an incompletely
covered switch. Converting the comment to a default/unreachable
uncovered this case of an unsupported form encoding. Seems we were
skipping fission strings entirely.

llvm-svn: 193089
2013-10-21 16:37:22 +00:00
Elena Demikhovsky 665c90e184 AVX-512: MUL operation lowering for v8i64
llvm-svn: 193083
2013-10-21 13:27:34 +00:00
Matheus Almeida fe0bf9f618 [mips][msa] Direct Object Emission support for LD/ST instructions.
llvm-svn: 193082
2013-10-21 13:07:13 +00:00
Matheus Almeida 8ddad15177 [mips][msa] Direct Object Emission support for LDI instructions.
llvm-svn: 193081
2013-10-21 12:56:20 +00:00
Matheus Almeida 83d797de4a [mips][msa] Direct Object Emission support for MOVE.v.
llvm-svn: 193080
2013-10-21 12:43:54 +00:00
Matheus Almeida a591fdc63c [mips][msa] Direct Object Emission support for CTCMSA and CFCMSA.
These instructions are logically related as they allow read/write of MSA control registers.
Currently MSA control registers are emitted by number but hopefully that will change as soon 
as GAS starts accepting them by name as that would make the assembly easier to read.

llvm-svn: 193078
2013-10-21 12:26:50 +00:00
Matheus Almeida 5798c6f3bb [mips][msa] Direct Object Emission of SPLAT instruction.
llvm-svn: 193077
2013-10-21 12:07:26 +00:00
Matheus Almeida 70fbf77546 [mips][msa] Fix definition of SLD instruction.
The second parameter of the SLD intrinsic is the number of columns (GPR) to 
slide left the source array.

llvm-svn: 193076
2013-10-21 11:47:56 +00:00
Bill Wendling 90dd90afcb Don't eliminate a partially redundant load if it's in a landing pad.
A landing pad can be jumped to only by the unwind edge of an invoke
instruction. If we eliminate a partially redundant load in a landing pad, it
will create a basic block that violates this constraint. It then leads to other
problems down the line if it tries to merge that basic block with the landing
pad. Avoid this by not eliminating the load in a landing pad.

PR17621

llvm-svn: 193064
2013-10-21 04:09:17 +00:00
Nick Lewycky f8c68da7a9 Fix typo in test's XFAIL line. Patch by Dimitry Andric!
llvm-svn: 193063
2013-10-21 00:46:21 +00:00
Michael Gottesman c024f3258a Teach simplify-cfg how to correctly create covered lookup tables for switches on iN with N >= 3.
One optimization simplify-cfg performs is the converting of switches to
lookup tables if the switch has > 4 cases. This is done by:

1. Finding the max/min case value and calculating the switch case range.
2. Create a lookup table basic block.
3. Perform a check in the switch's BB to see if the input value is in
the switch's case range. If the input value satisfies said predicate
branch to the lookup table BB, otherwise branch to the switch's default
destination BB using the default value as the result.

The conditional check consists of subtracting the min case value of the
table from any input iN value and then ensuring that said value is
unsigned less than the size of the lookup table represented as an iN
value.

If the lookup table is a covered lookup table, the size of the table will be N
which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN
value against 0 which is always false yielding the incorrect result.

This patch fixes this problem by recognizing if we have a covered lookup table
and if we do, unconditionally jumps to the lookup table BB since the covering
property of the lookup table implies no input values could not be handled by
said BB.

rdar://15268442

llvm-svn: 193045
2013-10-20 07:04:37 +00:00
Peter Collingbourne e9f45e25f9 Emit prefix data after debug and EH directives.
This ensures that the prefix data is treated as part of the function for
the purpose of debug info.  This provides a better debugging experience,
among other things by allowing a debug info client to correctly look up
a function in debug info given a function pointer.

llvm-svn: 193042
2013-10-20 02:16:21 +00:00
Peter Collingbourne 588009e484 Emit DWARF line entries for all data in the instruction stream.
r182712 attempted to do this, but it failed to handle data emitted via
EmitBytes.

llvm-svn: 193041
2013-10-20 02:16:18 +00:00
Bill Wendling 4fea22c63b Perform an intelligent splice of the predecessor with the single successor.
If the predecessor's being spliced into a landing pad, then we need the PHIs to
come first and the rest of the predecessor's code to come *after* the landing
pad instruction.

llvm-svn: 193035
2013-10-19 11:27:12 +00:00
Andrew Trick 1bda72144d Update PPC loop tests after SCEV non-unit-stride checkin r193015.
llvm-svn: 193021
2013-10-19 00:14:04 +00:00
Andrew Trick 768b917dc8 SCEV should use NSW to get trip count for positive nonunit stride loops.
SCEV currently fails to compute loop counts for nonunit stride
loops. This comes up frequently. It prevents loop optimization and
forces vectorization to insert extra loop checks.

For example:
void foo(int n, int *x) {
 for (int i = 0; i < n; i += 3) {
   x[i] = i;
   x[i+1] = i+1;
   x[i+2] = i+2;
 }
}

We need to properly handle the case in which limit > INT_MAX-stride. In
the above case: n > INT_MAX-3. In this case the loop counter will step
beyond the limit and overflow at the same time. However, knowing that
signed integer overlow in undefined, we can assume the loop test
behavior is arbitrary after overflow. This obeys both C undefined
behavior rules, and the more strict LLVM poison value rules.

I'm finally fixing this in response to Hal Finkel's persistence.
The most probable reason that we never optimized this before is that
we were being careful to handle case where the developer expected a
side-effect free infinite loop relying on overflow:

for (int i = 0; i < n; i += s) {
  ++j;
}
return j;

If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
expect an infinite loop. However there are plenty of ways to achieve
this effect without relying on undefined behavior of signed overflow.

llvm-svn: 193015
2013-10-18 23:43:53 +00:00
Michael J. Spencer c064a9abff [Support][YAML] Add support for accessing tags and tag handle substitution.
llvm-svn: 193004
2013-10-18 22:38:04 +00:00
Hans Wennborg ce69d77cec MC asm parser: allow ?'s in symbol names, and handle @'s in names in MS asm
This is another (final?) stab at making us able to parse our own asm output
on Windows.

Symbols on Windows often contain @'s and ?'s in their names. Our asm parser
didn't like this. ?'s were not allowed, and @'s were intepreted as trying to
reference PLT/GOT/etc.

We can't just add quotes around the bad names, since e.g. for MinGW, we use gas
to assemble, and it doesn't like quotes in some places (notably in .def
directives).

This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS
assembly.

Differential Revision: http://llvm-reviews.chandlerc.com/D1978

llvm-svn: 193000
2013-10-18 20:46:28 +00:00
David Majnemer 2b33845230 Test case for r192957
Forgot to 'svn add'

llvm-svn: 192978
2013-10-18 14:49:59 +00:00
Bill Schmidt 3684fdd59f [PATCH] Fix PR17168 (DAG scheduler inserts DBG_VALUE before PHI with fast-isel)
PR17168 describes a test case that fails when compiling for debug with
fast-isel.  Investigation showed that the test was failing because a DBG_VALUE
machine instruction was placed prior to a PHI.

For this problem to occur requires the following:
 * Compile for debug
 * Compile with fast-isel
 * In a block B, fast-isel must partially succeed before punting to DAG-isel
 * B must start with a PHI
 * The first unhandled node in the DAG must not generate a machine instruction
 * A debug value with an order less than that of that first node exists

When all of these circumstances apply, the existing test that an instruction
was not inserted won't fire.  Currently it tests whether the block is empty,
or whether the last instruction generated is a phi.  When fast-isel has
partially succeeded, the last instruction generated will not be a phi.
Instead, we need to check whether the current insert position is immediately
following a phi.  This patch adds that check, and adds the test case from the
PR as a regression test.

llvm-svn: 192976
2013-10-18 14:20:11 +00:00
Richard Barton 87dacc38b8 Add hint disassembly syntax for 16-bit Thumb hint instructions.
Patch by Artyom Skrobov

llvm-svn: 192972
2013-10-18 14:09:49 +00:00
Chad Rosier fe2f58c8a1 [AArch64] Add support for NEON scalar extract narrow instructions.
llvm-svn: 192970
2013-10-18 14:03:24 +00:00
Silviu Baranga 314e58fdcc Add hardware division as a default feature on Cortex-A15. Also add test cases to check this, and change diagnostics for the hwdiv-arm feature to something useful.
llvm-svn: 192963
2013-10-18 10:18:40 +00:00
Daniel Sanders eb4003651d [mips][msa] Added a regression test that depended on multiple patches to pass.
llvm-svn: 192961
2013-10-18 09:52:21 +00:00
Hans Wennborg 7ddcdc82a5 Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output"
This caused the clang-native-mingw32-win7 buildbot to break.

The assembler was complaining about the following lines that were showing up
in the asm for CrashRecoveryContext.cpp:

  movl  $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax)
  calll "_AddVectoredExceptionHandler@8"
  .def   "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4";
  "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4":
  calll "_RemoveVectoredExceptionHandler@4"

Reverting for now.

llvm-svn: 192940
2013-10-18 02:14:40 +00:00
David Peixotto 8e5abc52cb 17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets
This commit implements the correct lowering of the
COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets.
Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the
post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not
have the post-increment form of these instructions so the generated
assembly contained invalid instructions.

Passing the generated assembly to gcc caused it to complain with an
error like this:

  Error: cannot honor width suffix -- `ldrb r3,[r0],#1'

and the integrated assembler would generate an object file with an
invalid instruction encoding.

This commit contains a small test case that demonstrates the problem
with thumb1 targets as well as an expanded test case that more
throughly tests the lowering of byval struct passing for arm,
thumb1, and thumb2 targets.

llvm-svn: 192916
2013-10-17 19:52:05 +00:00
Chad Rosier 37d29173aa [AArch64] Add support for NEON scalar three register different instruction
class.  The instruction class includes the signed saturating doubling
multiply-add long, signed saturating doubling multiply-subtract long, and
the signed saturating doubling multiply long instructions.

llvm-svn: 192908
2013-10-17 18:12:29 +00:00
Bill Wendling 16754ddc00 Add testcase to make sure we don't generate a compact unwind section for ELF binaries.
This tests r190354.

llvm-svn: 192903
2013-10-17 17:38:49 +00:00
Daniel Sanders a4eaf59f9e [mips][msa] Added lsa instruction
llvm-svn: 192895
2013-10-17 13:38:20 +00:00
Benjamin Kramer d687560d51 Fix tests not to depend on specific regalloc or instruction order.
They were failing with -mcpu=atom.

llvm-svn: 192890
2013-10-17 12:41:05 +00:00
Daniel Sanders 66f5e46a2d Fix r192888: test/CodeGen/Mips/msa/3r_ld_st.ll should have been deleted
llvm-svn: 192889
2013-10-17 12:36:35 +00:00
Richard Sandiford 95f7ba988b Replace sra with srl if a single sign bit is required
E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2).

llvm-svn: 192884
2013-10-17 11:16:57 +00:00
Andrea Di Biagio 561badf717 Fix edge condition in DAGCombiner to improve codegen of shift sequences.
When canonicalizing dags according to the rule
(shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1))

remember to add the new shl dag to the DAGCombiner worklist of nodes.
If we don't explicitly add it to the worklist of nodes to visit, we
may not trigger later on the rule that folds the shift left + logical
shift right into a AND instruction with bitmask.

llvm-svn: 192883
2013-10-17 11:02:58 +00:00
Michael Kuperstein 2cbd5132bf Changing DebugInfoFinder to iterate over all the compile units.
Solves http://llvm.org/bugs/show_bug.cgi?id=17507

Committed on behalf of alon.mishne@intel.com

llvm-svn: 192879
2013-10-17 10:27:12 +00:00
Dmitry Vyukov b1ad5720a2 tsan: implement no_sanitize_thread attribute
If a function has no_sanitize_thread attribute,
do not instrument memory accesses in it.

llvm-svn: 192871
2013-10-17 07:20:06 +00:00
Jim Grosbach c044c65470 x86: Move bitcasts outside concat_vector.
Consider the following:

typedef unsigned short ushort4U __attribute__((ext_vector_type(4),
aligned(2)));
typedef unsigned short ushort4 __attribute__((ext_vector_type(4)));
typedef unsigned short ushort8 __attribute__((ext_vector_type(8)));
typedef int int4 __attribute__((ext_vector_type(4)));

int4 __bbase_cvt_int(ushort4 v) {
  ushort8 a;
  a.lo = v;
  return _mm_cvtepu16_epi32(a);
}

This generates the, not unreasonable, IR:
define <4 x i32> @foo0(double %v.coerce) nounwind ssp {
  %tmp = bitcast double %v.coerce to <4 x i16>
  %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32
  %0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
  %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1)
  ret <4 x i32> %tmp2
}

The problem is when type legalization gets hold of the v4i16. It
legalizes that by spilling to the stack, then doing a zero-extending
load. Things go even more silly from there, ending up with something
like:
_foo0:
  movsd %xmm0, -8(%rsp)       <== Spill to the stack.
  movq  -8(%rsp), %xmm0       <== Reload it right back out.
  pmovzxwd  %xmm0, %xmm1      <== Here's what we actually asked for.
  pblendw $1, %xmm1, %xmm0    <== We don't need this at all
  pmovzxwd  %xmm0, %xmm0      <== We already did this
  ret

The v8i8 to v8i16 zext intrinsic gives even worse results, with two
table lookups via pshufb instructions(!!).

To avoid all that, we can move the bitcasting until after we've formed
the wider (legal) vector type. Then our normal codegen flows along
nicely and we get the expected:
_foo0:
  pmovzxwd  %xmm0, %xmm0
  ret

rdar://15245794

llvm-svn: 192866
2013-10-17 02:58:06 +00:00
Eric Christopher 2c8b7907c3 According to the dwarf standard pubnames and pubtypes for languages
like C++ should be the fully qualified names for the type.

Add a routine that does a language specific context walk to build
up the qualified name and use it when we add types/names to the
tables. Expand the gnu pubnames testcase as it's the most complex
to make sure that qualified types are also being added.

llvm-svn: 192865
2013-10-17 02:06:06 +00:00
Eric Christopher 96eff3f393 Add the subprogram DIEs to the context they're created with only
if they're a declaration, otherwise they're owned by the compile
unit.

llvm-svn: 192861
2013-10-17 01:31:12 +00:00
Hans Wennborg 69918bccab Re-commit r192758 - MC: quote tricky symbol names in asm output
The reason this got reverted was that the @feat.00 symbol which was emitted
for every TU became quoted, and on cygwin/mingw we use the gas assembler which
couldn't handle the quotes.

This commit fixes the problem by only emitting @feat.00 for win32, where we use
clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no
loss there.

With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since
it uses the Itanium ABI, which doesn't put these funny characters in symbols.

> Because of win32 mangling, we produce symbol and section names with
> funny characters in them, most notably @ characters.
>
> MC would choke on trying to parse its own assembly output. This patch addresses
> that by:
>
> - Making @ trigger quoting of symbol names
> - Also quote section names in the same way
> - Just parse section names like other identifiers (to allow for quotes)
> - Don't assume @ signifies a symbol variant if it is in a string.

llvm-svn: 192859
2013-10-17 01:13:02 +00:00
David Blaikie 6316ca45a7 DIEHash: Use DW_FORM_sdata for integers, per spec.
This allows us to produce the same hash as GCC for at least some simple
examples.

llvm-svn: 192855
2013-10-16 23:36:20 +00:00
David Blaikie b32db97a8d Update test case due to DIE hashing in r192836
llvm-svn: 192850
2013-10-16 21:21:51 +00:00
Chad Rosier 846a72539c [AArch64] Add support for NEON scalar negate instruction.
llvm-svn: 192843
2013-10-16 21:04:39 +00:00
Chad Rosier 175601d997 [AArch64] Add support for NEON scalar absolute value instruction.
llvm-svn: 192842
2013-10-16 21:04:34 +00:00
Yunzhong Gao dfc277f350 Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar,
bulldozer and piledriver. Support for the instruction itself seems to have
already been added in r178040.

Differential Revision: http://llvm-reviews.chandlerc.com/D1933

llvm-svn: 192828
2013-10-16 19:04:11 +00:00
Rafael Espindola 6174e5ee68 Create an atom with just the data that failed to disassemble.
Patch by Stephen Checkoway.

llvm-svn: 192827
2013-10-16 19:03:14 +00:00
Arnold Schwaighofer a66582470b SLPVectorizer: Don't vectorize volatile memory operations
radar://15231682

Reapply r192799,
  http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226
showed that the bot is still broken even with this out.

llvm-svn: 192820
2013-10-16 17:52:40 +00:00
Arnold Schwaighofer 06a0324f6a Revert "SLPVectorizer: Don't vectorize volatile memory operations"
This speculatively reverts commit 192799. It might have broken a linux buildbot.

llvm-svn: 192816
2013-10-16 17:19:40 +00:00
Tom Stellard b34186ae38 R600: Fix a crash in the AMDILCFGStructurizer
We were calling llvm_unreachable() when failing to optimize the
branch into if case.  However, it is still possible for us
to structurize the CFG by duplicating blocks even if this optimization
fails.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 192813
2013-10-16 17:06:02 +00:00
Rafael Espindola 678a9431fd Port to FileCheck.
llvm-svn: 192810
2013-10-16 16:47:56 +00:00
Chad Rosier abe458d0bf Update comment.
llvm-svn: 192806
2013-10-16 16:30:10 +00:00
Chad Rosier 178b1cefc7 [AArch64] Add support for NEON scalar signed saturating accumulated of unsigned
value and unsigned saturating accumulate of signed value instructions.

llvm-svn: 192800
2013-10-16 16:09:02 +00:00
Arnold Schwaighofer 5078ea2bd9 SLPVectorizer: Don't vectorize volatile memory operations
radar://15231682

llvm-svn: 192799
2013-10-16 16:09:00 +00:00
Benjamin Kramer 00eb07b791 DAGCombiner: Don't fold xor into not if getNOT would introduce an illegal constant.
This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly
because i64 is illegal. It would be nice if getNOT would handle this
transparently, but I don't see a way to generate a legal constant there right
now. Fixes PR17487.

llvm-svn: 192795
2013-10-16 14:16:19 +00:00
Kostya Serebryany d3d23bec66 [asan] Optimize accesses to global arrays with constant index
Summary:
Given a global array G[N], which is declared in this CU and has static initializer
avoid instrumenting accesses like G[i], where 'i' is a constant and 0<=i<N.
Also add a bit of stats.

This eliminates ~1% of instrumentations on SPEC2006
and also partially helps when asan is being run together with coverage.

Reviewers: samsonov

Reviewed By: samsonov

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1947

llvm-svn: 192794
2013-10-16 14:06:14 +00:00
Richard Sandiford 3e382972d9 [SystemZ] Handle extensions in RxSBG optimizations
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care.  This fixes a FIXME added in r192783.

llvm-svn: 192790
2013-10-16 13:35:13 +00:00
Richard Sandiford f722a8e30e [SystemZ] Improve handling of SETCC
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI".  In most cases it's better to use an IPM-based
sequence instead.

llvm-svn: 192784
2013-10-16 11:10:55 +00:00
Richard Sandiford 374a0e50c4 Handle (shl (anyext (shr ...))) in SimpilfyDemandedBits
This is really an extension of the current (shl (shr ...)) -> shl optimization.
The main difference is that certain upper bits must also not be demanded.

The motivating examples are the first two in the testcase, which occur
in llvmpipe output.

llvm-svn: 192783
2013-10-16 10:26:19 +00:00
NAKAMURA Takumi 272416fda9 Revert r192758 (and r192759), "MC: Better handling of tricky symbol and section names"
GNU AS didn't like quotes in symbol names.

    Error: junk at end of line, first unrecognized character is `"'

        .def "@feat.00";
        "@feat.00" = 1

Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.

llvm-svn: 192775
2013-10-16 08:22:49 +00:00
Rafael Espindola d9db76cb46 Add a triple to this test.
llvm-svn: 192767
2013-10-16 02:27:33 +00:00
Rafael Espindola 0018a59d01 Add support for metadata representing .ident directives.
llvm-svn: 192764
2013-10-16 01:49:05 +00:00
Eric Christopher d2b497b522 Fix a pair of bugs in the emission of pubname tables:
1) Make sure we emit static member variables by checking
at the end of createGlobalVariableDIE rather than piecemeal
in the function.
(As a note, createGlobalVariableDIE needs rewriting.)

2) Make sure we use the definition rather than declaration DIE
for two things: a) determining linkage for gnu pubnames, and b)
as the address of the DIE for global variables.
(As a note, createGlobalVariableDIE really needs rewriting.)

Adjust the testcase to make sure we're checking the correct DIEs.

llvm-svn: 192761
2013-10-16 01:37:49 +00:00
Hans Wennborg 6468493250 dos2unix on quoted-names.ll
llvm-svn: 192759
2013-10-16 01:22:07 +00:00
Hans Wennborg d34cf14339 MC: Better handling of tricky symbol and section names
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.

MC would choke on trying to parse its own assembly output. This patch addresses
that by:

- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.

Differential Revision: http://llvm-reviews.chandlerc.com/D1945

llvm-svn: 192758
2013-10-16 01:20:40 +00:00
Andrew Trick e97d8d6dde Enable MI Sched for x86.
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750
2013-10-15 23:33:07 +00:00
Chad Rosier 9d51708677 [AArch64] Add support for NEON scalar signed saturating absolute value and
scalar signed saturating negate instructions.

llvm-svn: 192733
2013-10-15 21:18:44 +00:00
Manman Ren fd956dbae0 Struct byval: fix a copy-paste error for thumb2.
PR17309

llvm-svn: 192730
2013-10-15 19:42:32 +00:00
Michael Liao ad71659def Fix PR17546
- Type of index used in extract_vector_elt or insert_vector_elt supposes
  to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
  better to truncate (or zero-extend in case it's changed later) it to
  mask element type to guarantee they are matching instead of asserting
  that.

llvm-svn: 192722
2013-10-15 17:51:58 +00:00
Michael Liao 8ba068211d Fix PR16807
- Lower signed division by constant powers-of-2 to target-independent
  DAG operators instead of target-dependent ones to support them better
  on targets where vector types are legal but shift operators on that
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.

llvm-svn: 192721
2013-10-15 17:51:02 +00:00
Daniel Sanders 1dfddc73dc [mips][msa] Added support for build_vector for v4f32 and v2f64.
llvm-svn: 192699
2013-10-15 13:14:41 +00:00
Richard Sandiford 6af6ff1e15 [SystemZ] Use A(G)SI when spilling the target of a constant addition
llvm-svn: 192681
2013-10-15 08:42:59 +00:00
Job Noorman e9a1d4c274 Fix MSP430 calling convention to match MSPGCC
llvm-svn: 192678
2013-10-15 08:19:39 +00:00
NAKAMURA Takumi f845be14f6 llvm/test/CodeGen/X86/break-avx-dep.ll: Relax an expression to be matched to also r[89], not only rXX.
llvm-svn: 192675
2013-10-15 06:36:36 +00:00
Andrew Trick 3a99693c5a Improve on r192635, ExeDepsFix for avx, and add a test case.
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.

This was blocking the switch to MI-Sched.

llvm-svn: 192669
2013-10-15 03:39:43 +00:00
Akira Hatanaka 86c3c794fa [mips] Transfer kill flag to the newly created operand.
llvm-svn: 192662
2013-10-15 01:06:30 +00:00
Akira Hatanaka 8368b3b3df [mips] Set HI/LO registers' HWEncoding field.
llvm-svn: 192661
2013-10-15 01:00:00 +00:00
Quentin Colombet 778dba1dd8 [X86][FastISel] During X86 fastisel, the address of indirect call was resolved
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636
2013-10-14 22:32:09 +00:00
Nick Lewycky ce4ca41e47 Fix a typo, in a comment, in a test.
llvm-svn: 192632
2013-10-14 22:02:53 +00:00
Eric Christopher 740025745b Revert part of a fix from 2010, changes since then:
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.

llvm-svn: 192631
2013-10-14 21:52:26 +00:00
Will Dietz 5cb7f4e3f2 MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one.  The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.

This commit makes a few small changes
to help better realize this goal:

First, modify the loop to ignore registers
defined by this instruction.  We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.

Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)

Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.

Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.

llvm-svn: 192608
2013-10-14 16:57:17 +00:00
Evgeniy Stepanov be83d8f693 [msan] Instrument x86.*_cvt* intrinsics.
Currently MSan checks that arguments of *cvt* intrinsics are fully initialized.
That's too much to ask: some of them only operate on lower half, or even
quarter, of the input register.

llvm-svn: 192599
2013-10-14 15:16:25 +00:00
Chad Rosier d1f40d760a [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192596
2013-10-14 14:37:20 +00:00
Bernard Ogden 53169762d0 Add Cortex-A57 support
llvm-svn: 192591
2013-10-14 13:17:07 +00:00
Bernard Ogden 4400cde89a Add subtarget feature support for Cortex-A53
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590
2013-10-14 13:16:57 +00:00
Matheus Almeida 2102188cfc [mips][msa] Direct Object Emission support for BIT instructions.
List of instructions:
bclri.{b,h,w,d}
binsli.{b,h,w,d}
binsri.{b,h,w,d}
bnegi.{b,h,w,d}
bseti.{b,h,w,d}
sat_s.{b,h,w,d}
sat_u.{b,h,w,d}
slli.{b,h,w,d}
srai.{b,h,w,d}
srari.{b,h,w,d}
srli.{b,h,w,d}
srlri.{b,h,w,d}

llvm-svn: 192589
2013-10-14 13:07:39 +00:00
Matheus Almeida 5be0cd8720 [mips][msa] Direct Object Emission support for VEC instructions.
List of instructions:
and.v, bmnz.v, bmz.v, bsel.v, nor.v, or.v, xor.v.

llvm-svn: 192588
2013-10-14 12:57:18 +00:00
Matheus Almeida 7d65ea76ea [mips][msa] Direct Object Emission of INSVE.{b,h,w,d}.
llvm-svn: 192587
2013-10-14 12:38:17 +00:00
Matheus Almeida bc189eb318 [mips][msa] Direct Object Emission for the majority of the ELM instructions.
List of instructions:
copy_s.{b,h,w}
copy_u.{b,h,w}
sldi.{b,h,w,d}
splati.{b,h,w,d}

llvm-svn: 192586
2013-10-14 12:22:43 +00:00
Matheus Almeida b74293dc55 [mips][msa] Direct Object Emission of INSERT.{B,H,W} instruction.
INSERT is the first type of MSA instruction that requires a change to the way
MSA registers are parsed. This happens because MSA registers may be suffixed by
an index in the form of an immediate or a general purpose register. The changes
to parseMSARegs reflect that requirement.

llvm-svn: 192582
2013-10-14 11:49:30 +00:00
Evgeniy Stepanov 9b5517b127 [msan] Fix handling of scalar select of vectors.
llvm-svn: 192575
2013-10-14 09:52:09 +00:00
Elena Demikhovsky 82a46ebe0a Fixed a bug in dynamic allocation memory on stack.
The alignment of allocated space was wrong, see Bugzila 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573
2013-10-14 07:26:51 +00:00
Craig Topper a422b09ae3 Allow pinsrw/pinsrb/pextrb/pextrw/movmskps/movmskpd/pmovmskb/extractps instructions to parse either GR32 or GR64 without resorting to duplicating instructions.
llvm-svn: 192567
2013-10-14 04:55:01 +00:00
Craig Topper 4432208884 Add disassembler support for SSE4.1 register/register form of PEXTRW. There is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding.
llvm-svn: 192566
2013-10-14 01:42:32 +00:00
Craig Topper 7158745e55 Mark MOVMSKPS/MOVMSKPD/VPINSRWrr64i as AsmParserOnly to remove them from the disassembler tables. Add PINSRWrr64i to complement the AVX version.
llvm-svn: 192565
2013-10-14 01:21:22 +00:00
Vincent Lejeune d6cbede9c5 R600: improve dump of S_WAITCNT
llvm-svn: 192557
2013-10-13 17:56:28 +00:00
Vincent Lejeune fa58a5fb60 R600: Use masked read sel for texture instructions
llvm-svn: 192554
2013-10-13 17:56:10 +00:00
Vincent Lejeune 301beb80d4 R600: fix swizzle export
llvm-svn: 192553
2013-10-13 17:56:04 +00:00
Arnold Schwaighofer 58864d2d5f SLPVectorizer: Sort PHINodes based on their opcode
Before this patch we relied on the order of phi nodes when we looked for phi
nodes of the same type. This could prevent vectorization of cases where there
was a phi node of a second type in between phi nodes of some type.

This is important for vectorization of an internal graphics kernel. On the test
suite + external on x86_64 (and on a run on armv7s) it showed no impact on
either performance or compile time.

radar://15024459

llvm-svn: 192537
2013-10-12 18:56:27 +00:00
Benjamin Kramer f016258068 Force a CPU on test so it doesn't depend on microarchitectural scheduling decisions.
llvm-svn: 192532
2013-10-12 11:17:12 +00:00
Reed Kotler de64774b4d For Mips16, start to consolidate all forms of 32 bit literal loading so that
they can be better handled and optimized in the Mips16 constant island code.

llvm-svn: 192520
2013-10-12 02:19:08 +00:00
Andrew Kaylor 6587bcfdbc Fixing problems in lli's RemoteMemoryManager.
This fixes a problem from a previous check-in where a return value was omitted.

Previously the remote/stubs-remote.ll and remote/stubs-sm-pic.ll tests were reporting passes, but they should have been failing.  Those tests attempt to link against an external symbol and remote symbol resolution is not supported.  The old RemoteMemoryManager implementation resulted in local symbols being used for resolution and the child process crashed but the test didn't notice.  With this check-in remote symbol resolution fails, and so the test (correctly) fails.

llvm-svn: 192514
2013-10-11 22:47:10 +00:00
Andrew Kaylor 7bb1344c67 Adding multiple object support to MCJIT EH frame handling
llvm-svn: 192504
2013-10-11 21:25:48 +00:00
Matt Arsenault 441387832e R600: Add scalar i32 add test
llvm-svn: 192501
2013-10-11 21:03:41 +00:00
Matt Arsenault c15b85715e Use CHECK-LABEL
llvm-svn: 192500
2013-10-11 21:03:39 +00:00
Benjamin Kramer 8a37f63714 Mips: Disassemble sign-extended 64 bit immediates properly.
This doesn't change the meaning of the output, but makes look right. PR17539.

llvm-svn: 192483
2013-10-11 19:05:08 +00:00
Matthias Braun d616ccc069 Remove kill flags after if conversion if necessary
When if converting something like:
true:
   ... = R0<kill>

false:
   ... = R0<kill>

then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.

llvm-svn: 192482
2013-10-11 19:04:37 +00:00
Quentin Colombet c50b943b31 [DAGCombiner] Load slicing test case: attempt to really fix the buildbots (used sse4.2 instead of avx!).
<rdar://problem/14477220>

llvm-svn: 192480
2013-10-11 18:54:49 +00:00
Manman Ren 1848cb6654 Debug Info Testing Case: check for the name of a structure.
llvm-svn: 192478
2013-10-11 18:50:00 +00:00
Stephen Lin e93a3a088f Really fix CHECK-LABEL and CHECK-DAG interaction. This actually just restores the initial implementation that was in r186162 but got lost in some subsequent refactoring. More explicit variable names and comments are present now to hopefully prevent repeat regression, as well as another test.
llvm-svn: 192477
2013-10-11 18:38:36 +00:00