Commit Graph

16935 Commits

Author SHA1 Message Date
Chandler Carruth 881d0a7966 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren cb36b8c2e6 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Jack Carter f4946cfbb9 The define for 64 bit sign extension neglected to
initialize fields of the class that it used.

The result was nonsense code.

Before:
0000000000000000 <foo>:
   0:    00441100     0x441100
   4:    03e00008     jr    ra
   8:    00000000     nop

After:
0000000000000000 <foo>:
   0:    00041000     sll    v0,a0,0x0
   4:    03e00008     jr    ra
   8:    00000000     nop 

llvm-svn: 161377
2012-08-07 00:35:22 +00:00
Jack Carter 612c66314c The Mips64InstrInfo.td definitions DynAlloc64 LEA_ADDiu64
were using a class defined for 32 bit instructions and 
thus the instruction was for addiu instead of daddiu.

This was corrected by adding the instruction opcode as a 
field in the  base class to be filled in by the defs.

llvm-svn: 161359
2012-08-06 23:29:06 +00:00
Jack Carter 84491abb20 Mips relocations R_MIPS_HIGHER and R_MIPS_HIGHEST.
These 2 relocations gain access to the 
highest and the second highest 16 bits
of a 64 bit object.

R_MIPS_HIGHER %higher(A+S)
The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. 

R_MIPS_HIGHEST %highest(A+S)
The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. 

llvm-svn: 161348
2012-08-06 21:26:03 +00:00
Hal Finkel 33e529d56b MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

llvm-svn: 161346
2012-08-06 21:21:44 +00:00
Craig Topper ab47fe4e16 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper 812005e562 Update test to check for r161305
llvm-svn: 161307
2012-08-05 09:06:28 +00:00
Hal Finkel 70381a7b18 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

llvm-svn: 161302
2012-08-04 14:10:46 +00:00
Anton Korobeynikov 218aaf6d04 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Bob Wilson 874886cd66 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Akira Hatanaka 22bec282e9 1. Redo mips16 instructions to avoid multiple opcodes for same instruction.
Change these to patterns.
2. Add another 16 instructions.

Patch by Reed Kotler.

llvm-svn: 161272
2012-08-03 22:57:02 +00:00
Bob Wilson fa59485b94 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Jush Lu 4705da9020 [arm-fast-isel] Add support for shl, lshr, and ashr.
llvm-svn: 161230
2012-08-03 02:37:48 +00:00
NAKAMURA Takumi edac6ee6aa [CMake] Add yaml2obj to check-llvm.
llvm-svn: 161229
2012-08-03 00:45:32 +00:00
Matt Beaumont-Gay 99fc2e19a6 Move test yaml files under Inputs until they are converted to be the actual
test files.

llvm-svn: 161219
2012-08-02 21:52:49 +00:00
Manman Ren ba8122cc25 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Michael J. Spencer 1ffd9de4f1 Add yaml2obj. A utility to convert YAML to binaries.
yaml2obj takes a textual description of an object file in YAML format
and outputs the binary equivalent. This greatly simplifies writing
tests that take binary object files as input.

llvm-svn: 161205
2012-08-02 19:16:56 +00:00
Akira Hatanaka fffad897f2 Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.

llvm-svn: 161189
2012-08-02 18:15:13 +00:00
Jiangning Liu fa18005a4c Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu 6a43bf7d74 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu 288e1af8c8 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu 10dd40e42d Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
NAKAMURA Takumi 7020f51622 llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx.
FIXME: Could +avx be checked here too?
llvm-svn: 161156
2012-08-02 06:36:56 +00:00
NAKAMURA Takumi aaca1e690d llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031.
- Relax to match even if epilogue (pop %ebp) were emitted.
  - Assume the return value is stored to %xmm0.

llvm-svn: 161155
2012-08-02 06:33:58 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Eric Christopher b1b9451337 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Matt Beaumont-Gay 7947aecaf1 Line endings.
llvm-svn: 161117
2012-08-01 16:42:35 +00:00
Nuno Lopes b1c473998d fix 'make check' when ocamlopt returns the compiler path with CFLAGS (and there's a cflag with a = char)
llvm-svn: 161114
2012-08-01 15:50:34 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Nick Lewycky fb78083b1c Stay rational; don't assert trying to take the square root of a negative value.
If it's negative, the loop is already proven to be infinite. Fixes PR13489!

llvm-svn: 161107
2012-08-01 09:14:36 +00:00
Akira Hatanaka d1c43cee24 Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.

llvm-svn: 161090
2012-07-31 22:50:19 +00:00
Akira Hatanaka 02de0e4425 Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.

llvm-svn: 161078
2012-07-31 21:28:49 +00:00
Akira Hatanaka 33a25af5a8 Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.

llvm-svn: 161076
2012-07-31 20:54:48 +00:00
Akira Hatanaka beda2241a4 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.

llvm-svn: 161068
2012-07-31 18:46:41 +00:00
Chad Rosier 710be7df71 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

llvm-svn: 161066
2012-07-31 18:29:21 +00:00
Akira Hatanaka 4ce7c4060d Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.

llvm-svn: 161063
2012-07-31 18:16:49 +00:00
Manman Ren 8c549b586c MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Jakob Stoklund Olesen 0c807dfae2 Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

llvm-svn: 161023
2012-07-31 02:47:24 +00:00
Manman Ren 2b6a0dfd4c Reverse order of the two branches at end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714

llvm-svn: 161018
2012-07-31 01:11:07 +00:00
Jim Grosbach 20666162e9 Keep empty assembly macro argument values in the middle of the list.
Empty macro arguments at the end of the list should be as-if not specified at
all, but those in the middle of the list need to be kept so as not to screw
up the positional numbering. E.g.:
.macro foo
foo_-bash___:
  nop
.endm

foo 1, 2, 3, 4
foo 1, , 3, 4

Should create two labels, "foo_1_2_3_4" and "foo_1__3_4".

rdar://11948769

llvm-svn: 161002
2012-07-30 22:44:17 +00:00
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Kevin Enderby 5c490f1b8f Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Nadav Rotem 77f1b9c477 When constant folding GEP expressions, keep the address space information of pointers.
Together with Ran Chachick <ran.chachick@intel.com>

llvm-svn: 160954
2012-07-30 07:25:20 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Nick Lewycky d2c3bdd269 Add testcases for GlobalOpt changes in r160693 and r160757.
llvm-svn: 160925
2012-07-29 01:15:37 +00:00
Manman Ren 9de95e779c X86 Peephole: fold loads to the source register operand if possible.
Trying to fix the bot by specifying a triple in the failing testing cases.

llvm-svn: 160920
2012-07-28 17:51:24 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Manman Ren 32367c063b X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Eric Christopher 86ca9f9e11 Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Evan Cheng 249716e8ae Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338

llvm-svn: 160894
2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Nuno Lopes 85591f899d fix PR13390: do not loop forever with self-referencing self instructions
llvm-svn: 160876
2012-07-27 18:21:15 +00:00
Nuno Lopes 20c7eb3549 fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst.
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW

llvm-svn: 160874
2012-07-27 18:03:57 +00:00
Pete Cooper abc13af9c6 Simplify demanded bits of select sources where the condition is a constant vector
llvm-svn: 160835
2012-07-26 23:10:24 +00:00
Pete Cooper e807e45bff Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand
llvm-svn: 160823
2012-07-26 22:37:04 +00:00
Jakob Stoklund Olesen ceee4a9d0c Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Duncan Sands 5651452076 Stop reassociate from looking through expressions of arbitrary complexity. This
is a temporary measure until my fix for PR13021 is ready.

llvm-svn: 160778
2012-07-26 09:26:40 +00:00
Craig Topper c7690ac7ac Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Akira Hatanaka 64626fc20f Fix call setup for PIC.
Patch by Reed Kotler.

llvm-svn: 160774
2012-07-26 02:24:43 +00:00
Manman Ren e8c6b15137 Update testing case for Atom when disabling rematerialization in
TwoAddressInstructionPass.

The generated code for Atom has a different code sequence. This is realted
to commit r160749.

llvm-svn: 160755
2012-07-25 20:17:14 +00:00
Nuno Lopes f0626f2205 revert r160742: it's breaking CMake build
original commit msg:
MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings

llvm-svn: 160751
2012-07-25 18:49:28 +00:00
Manman Ren cc1dc6dc11 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
Nuno Lopes f0441e04bd MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings
llvm-svn: 160742
2012-07-25 17:29:22 +00:00
Rafael Espindola 11c38b9657 When a return struct pointer is passed in registers, the called has nothing
to pop.

llvm-svn: 160725
2012-07-25 13:41:10 +00:00
Duncan Sands 77a1f3b564 Don't perform an overaligned load in this test, since that's undefined
behaviour that might be exploited one day.

llvm-svn: 160714
2012-07-25 09:45:37 +00:00
Duncan Sands 0b875a0c29 When folding a load from a global constant, if the load started in the middle
of an array element (rather than at the beginning of the element) and extended
into the next element, then the load from the second element was being handled
wrong due to incorrect updating of the notion of which byte to load next.  This
fixes PR13442.  Thanks to Chris Smowton for reporting the problem, analyzing it
and providing a fix.

llvm-svn: 160711
2012-07-25 09:14:54 +00:00
Akira Hatanaka 5a69c235ae Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.

llvm-svn: 160703
2012-07-25 03:16:47 +00:00
Rafael Espindola a92cf29f0d Add a cpu to the test. Should fix the atom bot.
llvm-svn: 160701
2012-07-24 22:56:06 +00:00
Rafael Espindola f30e9bfb90 Add a triple to the test.
llvm-svn: 160698
2012-07-24 21:55:04 +00:00
Rafael Espindola a44e193a11 In order to correctly compile
struct s {
  double x1;
  float x2;
};
__attribute__((regparm(3))) struct s f(int a, int b, int c);
void g(void) {
  f(41, 42, 43);
}

We need to be able to represent passing the address of s to f (sret) in a
register (inreg). Turns out that all that is needed is to not mark them as
mutually incompatible.

llvm-svn: 160695
2012-07-24 21:40:17 +00:00
David Chisnall 5b8c1680de ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Nuno Lopes 2a4b09c9de teach objectsize about strdup() and strndup()
llvm-svn: 160676
2012-07-24 16:28:13 +00:00
Nick Lewycky faa9c3b035 Teach globalopt to not nuke all stores to globals. Keep them around of they
might be deliberate "one time" leaks, so that leak checkers can find them.
This is a reapply of r160602 with the fix that this time I'm committing the
code I thought I was committing last time; the I->eraseFromParent() goes
*after* the break out of the loop.

llvm-svn: 160664
2012-07-24 07:21:08 +00:00
Akira Hatanaka 26e9ecb7a3 Add basic ability to setup call frame, and make procedure calls.
Hello world will compile and execute with this patch.

Patch by Reed Kotler.

llvm-svn: 160651
2012-07-23 23:45:54 +00:00
Dan Gohman f64ff8ed3a An objc_retain can serve as a may-use for a different pointer.
rdar://11931823.

llvm-svn: 160637
2012-07-23 19:27:31 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem 9056076cab Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Nick Lewycky 9669c198ba Revert r160602.
llvm-svn: 160603
2012-07-21 09:03:15 +00:00
Nick Lewycky 72b83e5eaa Teach globalopt to play nice with leak checkers. This is a reapplication of
r160529 that was subsequently reverted. The fix was to not call
GV->eraseFromParent() right before the caller does the same. The existing
testcases already caught this bug if run under valgrind.

llvm-svn: 160602
2012-07-21 08:29:45 +00:00
Akira Hatanaka f72efdb62f Fix Mips long branch pass.
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.

llvm-svn: 160601
2012-07-21 03:30:44 +00:00
Nuno Lopes 705141d4df baby steps toward fixing some problems with inbound GEPs that overflow, as discussed 2 months ago or so.
Make sure we do not emit index computations with NSW flags so that we dont get an undef value if the GEP overflows

llvm-svn: 160589
2012-07-20 23:07:40 +00:00
Nuno Lopes 20ea62527a move the bounds checking pass to the instrumentation folder, where it belongs. I dunno why in the world I dropped it in the Scalar folder in the first place.
No functionality change.

llvm-svn: 160587
2012-07-20 22:39:33 +00:00
Jakob Stoklund Olesen e2cfd0d45a Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen f62c07f147 Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Richard Osborne 0ab2b0df82 Fix assertion in jump threading (PR13405).
GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.

llvm-svn: 160546
2012-07-20 10:36:17 +00:00
Kostya Serebryany f02c6069ac [asan] make sure that the crash callbacks do not get merged (Chandler's idea: insert an empty InlineAsm). Change the order in which the new BBs are inserted: the slow path BB is insert between old BBs, the crash BB is inserted at the end. Don't create an empty BB (introduced by recent commits). Update the test. The experimental code that does manual crash callback merge will most likely be deleted later.
llvm-svn: 160544
2012-07-20 09:54:50 +00:00
Nick Lewycky 7707e23429 Revert r160529 due to crashes.
llvm-svn: 160532
2012-07-19 23:59:21 +00:00
Nick Lewycky 0fa6a28141 Don't wipe out global variables that are probably storing pointers to heap
memory. This makes clang play nice with leak checkers.

llvm-svn: 160529
2012-07-19 22:35:28 +00:00
Preston Gurd f2ea70ae4a Fix remaining lit tests which were failing when run on an Atom
processor.

Patches by Tyler Nowicki, Andy Zhang, and Preston Gurd!

llvm-svn: 160520
2012-07-19 18:53:21 +00:00
NAKAMURA Takumi 67ce1930c1 test/DebugInfo/dwarfdump-test.test: Tweak expressions for Win32 to match backslashes. They are still odd, though.
For example, Paths are printed on Win32 as below;

/tmp/dbginfo\def2.cc:4:0
/tmp/dbginfo\include\decl2.h:1:0
/tmp/include\decl.h:5:0

llvm-svn: 160505
2012-07-19 13:40:09 +00:00
Jush Lu e67e07b901 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 160500
2012-07-19 09:49:00 +00:00
Alexey Samsonov e16e16add6 DebugInfo library: add support for fetching absolute paths to source files
(instead of basenames) from DWARF. Use this behavior in llvm-dwarfdump tool.

Reviewed by Benjamin Kramer.

llvm-svn: 160496
2012-07-19 07:03:58 +00:00
Manman Ren d0a4ee8427 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Preston Gurd f0a48ec8f1 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Chandler Carruth 985454e0ac Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Andrew Trick e002fb5da3 Added unit test for PR13361: LSR + SCEV "hangs" on reasonably sized test.
llvm-svn: 160439
2012-07-18 18:07:52 +00:00
Victor Oliveira a1de408aa7 test commit
llvm-svn: 160438
2012-07-18 17:53:05 +00:00
Jack Carter a62ba82825 Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.

llvm-svn: 160429
2012-07-18 06:41:36 +00:00
Andrew Trick c08726627c indvars: Linear function test replace should avoid reusing undef.
Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.

I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results benchmark regressions, which I'll track
internally. On x86_64 no LTO I see:

-3% huffbench
-3% 400.perlbench
-8% fhourstones

My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce a some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.

llvm-svn: 160421
2012-07-18 04:35:10 +00:00
Craig Topper 01deb5f2df Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Joel Jones b84f7bea09 More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160410
2012-07-18 00:02:16 +00:00
Evan Cheng f73d7553cc Add test case for r160387
llvm-svn: 160389
2012-07-17 19:40:05 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
NAKAMURA Takumi 7364c32b33 llvm/test/Transforms/LoopRotate/PhiRename-1.ll: FileCheck-ize. It fixes PR13301.
It began choking since Chandler's r159547, possibly due to improper expression on grep from TclParser to ShParser.

llvm-svn: 160367
2012-07-17 15:43:17 +00:00
Alexey Samsonov b604ff2a07 Improve behavior of DebugInfoEntryMinimal::getSubprogramName() introduced in r159512.
To fetch a subprogram name we should not only inspect the DIE for this subprogram, but optionally inspect
its specification, or its abstract origin (even if there is no inlining), or even specification of an abstract origin.

Reviewed by Benjamin Kramer.

llvm-svn: 160365
2012-07-17 15:28:35 +00:00
Nadav Rotem 277a40bc0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng f579beca6d This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Akira Hatanaka 046744467d Fix function select_cc_f32 in test/CodeGen/Mips/selectcc.ll.
llvm-svn: 160329
2012-07-16 23:56:51 +00:00
Nuno Lopes 482fb19fd5 fix PR13339 (remove the predecessor from the unwind BB when removing an invoke)
llvm-svn: 160325
2012-07-16 22:49:40 +00:00
Evan Cheng 75315b877c For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926

llvm-svn: 160312
2012-07-16 19:35:43 +00:00
Nadav Rotem 839a06e9d7 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Tom Stellard fc3db614c0 Revert "test/CodeGen/R600: Add some basic tests v6"
This reverts commit 11d3457afcda7848448dd7f11b2ede6552ffb9ea.

llvm-svn: 160300
2012-07-16 18:19:43 +00:00
Kostya Serebryany 874dae6119 [asan] refactor instrumentation to allow merging the crash callbacks (not fully implemented yet, no functionality change except the BB order)
llvm-svn: 160284
2012-07-16 16:15:40 +00:00
Jack Carter f649043aa5 Doubleword Shift Left Logical Plus 32
Mips shift instructions DSLL, DSRL and DSRA are transformed into
DSLL32, DSRL32 and DSRA32 respectively if the shift amount is between
32 and 63

Here is a description of DSLL:

Purpose: Doubleword Shift Left Logical Plus 32
To execute a left-shift of a doubleword by a fixed amount--32 to 63 bits

Description: GPR[rd] <- GPR[rt] << (sa+32)

The 64-bit doubleword contents of GPR rt are shifted left, inserting
 zeros into the emptied bits; the result is placed in
GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

This patch implements the direct object output of these instructions.

llvm-svn: 160277
2012-07-16 15:14:51 +00:00
Alexey Samsonov 893d3d336a Fix tests that failed on i686-win32 after r160248:
1. FileCheck-ize epilogue.ll and allow another asm instruction to restore %rsp.
2. Remove check in widen_arith-3.ll that was hitting instruction in epilogue instead of
vector add.

llvm-svn: 160274
2012-07-16 14:33:36 +00:00
Tom Stellard 6693fbe3eb test/CodeGen/R600: Add some basic tests v6
llvm-svn: 160273
2012-07-16 14:17:19 +00:00
Nadav Rotem 4968e45b9f Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160260
2012-07-16 10:52:25 +00:00
Chandler Carruth 8b540ab337 Revert r160254 temporarily.
It turns out that ASan relied on the at-the-end block insertion order to
(purely by happenstance) disable some LLVM optimizations, which in turn
start firing when the ordering is made more "normal". These
optimizations in turn merge many of the instrumentation reporting calls
which breaks the return address based error reporting in ASan.

We're looking at several different options for fixing this.

llvm-svn: 160256
2012-07-16 10:01:02 +00:00
Chandler Carruth 3dd6c81492 Teach AddressSanitizer to create basic blocks in a more natural order.
This is particularly useful to the backend code generators which try to
process things in the incoming function order.

Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.

llvm-svn: 160254
2012-07-16 08:58:53 +00:00
Chandler Carruth 663943e23e Add a basic test for AddressSanitizer. This is just a bare-bones
functionality test.

In general, unless the functionality is substantially separated, we
should lump more basic testing into this file. The test running
infrastructure likes having a few test files with more comprehensive
testing within them.

llvm-svn: 160253
2012-07-16 08:56:46 +00:00
Alexey Samsonov dcc1291d17 This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment.
It is intended to fix PR11468.

Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp

The problem was to reference the locations of callee-saved registers in exception handling:
locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would
take some effort to implement this in LLVM, as currently MachineLocation can only have the form
"Register + Offset". Funciton prologue and epilogue are now changed to:

push %rbp
mov %rsp, %rbp
push %14
push %15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp

Reviewed by Chad Rosier.

llvm-svn: 160248
2012-07-16 06:54:09 +00:00
Nadav Rotem 3050e07108 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem eec74c7279 Teach getTargetVShiftNode about TargetConstant nodes.
llvm-svn: 160234
2012-07-15 20:27:43 +00:00
NAKAMURA Takumi 032dc0a06c llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets.
- Make sure existence of "barrier".
  - Confirm reload corresponding to spill.

llvm-svn: 160232
2012-07-15 14:38:35 +00:00
Nadav Rotem ee3552f88d Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Nadav Rotem 9466e81df6 AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.

llvm-svn: 160222
2012-07-14 22:26:05 +00:00
Nadav Rotem 018921002e Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Andrew Trick 653513b8dd LSR Fix: check SCEV expression safety before expansion.
All SCEV expressions used by LSR formulae must be safe to
expand. i.e. they may not contain UDiv unless we can prove nonzero
denominator.

Fixes PR11356: LSR hoists UDiv.

llvm-svn: 160205
2012-07-13 23:33:10 +00:00
Joel Jones 43cb87839c This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160200
2012-07-13 23:25:25 +00:00
Jack Carter 5ddcfda8ef The Mips specific relocation R_MIPS_GOT_DISP
is used in cases where global symbols are 
directly represented in the GOT and we use an 
offset into the global offset table.

This patch adds direct object support for R_MIPS_GOT_DISP.

llvm-svn: 160183
2012-07-13 19:15:47 +00:00
Jack Carter 2e3358a0f8 test case for revision 160084: Alignment filling between Mips function units
llvm-svn: 160177
2012-07-13 18:14:01 +00:00
Duncan Sands a9c373e49d Restrict this to x86, hopefully fixing ARM buildbots.
llvm-svn: 160163
2012-07-13 07:02:00 +00:00
Eric Christopher bf57091f8b The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Akira Hatanaka a13cd0666e Fix check strings in test/MC/Disassembler/Mips/* and run FileCheck.
Patch by Vladimir Medic.

llvm-svn: 160143
2012-07-12 21:19:32 +00:00
Benjamin Kramer 4d0916788d Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it.
I already had the necessary things in place for IR-level passes but missed the machine passes.

llvm-svn: 160137
2012-07-12 18:14:57 +00:00
Nadav Rotem fdce33a495 The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc.
Patch by Michael Liao.

llvm-svn: 160129
2012-07-12 13:45:15 +00:00
NAKAMURA Takumi f415fe70f3 llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC.
llvm-svn: 160124
2012-07-12 10:22:57 +00:00
NAKAMURA Takumi 0b00f994a6 llvm/test/CMakeLists.txt: Add llvm-diff to deps.
llvm-svn: 160123
2012-07-12 10:15:48 +00:00
Benjamin Kramer cbac2f3bc9 Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds.
llvm-svn: 160120
2012-07-12 09:36:29 +00:00
Benjamin Kramer 0ab2794eda Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

llvm-svn: 160117
2012-07-12 09:31:43 +00:00
Duncan Sands 671cc2575d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Craig Topper f7755df776 Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Evan Cheng 493eb32ff4 Instcombine was transforming:
%shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, -1
  %and = and i64 %sub, %shr
  ret i64 %and

to:
  %shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, 2305843009213693951
  %and = and i64 %sub, %shr
  ret i64 %and

The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.

rdar://11793464

llvm-svn: 160101
2012-07-12 01:45:35 +00:00
Manman Ren 34cb93e192 ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.

llvm-svn: 160090
2012-07-11 22:51:44 +00:00
Stepan Dyatkovskiy 326edc579a Fixed diff comparison.
llvm-svn: 160076
2012-07-11 21:02:57 +00:00
Akira Hatanaka 20dced4dbb Test case for r160036.
llvm-svn: 160067
2012-07-11 19:50:46 +00:00
Manman Ren 1553ce0e81 X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.

llvm-svn: 160066
2012-07-11 19:35:12 +00:00
Akira Hatanaka 24cf4e36e5 Implement MipsTargetLowering::LowerSELECT_CC to custom lower SELECT_CC.
llvm-svn: 160064
2012-07-11 19:32:27 +00:00
Benjamin Kramer 3aab6a86a2 PR13326: Fix a subtle edge case in the udiv -> magic multiply generator.
This caused 6 of 65k possible 8 bit udivs to be wrong.

llvm-svn: 160058
2012-07-11 18:31:59 +00:00
Nadav Rotem d2bdcebb14 When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers.
llvm-svn: 160044
2012-07-11 13:27:05 +00:00
Akira Hatanaka 878ad8b28d Lower RETURNADDR node in Mips backend.
Patch by Sasa Stankovic.

llvm-svn: 160031
2012-07-11 00:53:32 +00:00
Jack Carter e8cb2fc616 Mips specific inline asm operand modifier 'L'.
Low order register of a double word register operand. Operands 
   are defined by the name of the variable they are marked with in
   the inline assembler code. This is a way to specify that the 
   operand just refers to the low order register for that variable.
   
   It is the opposite of modifier 'D' which specifies the high order
   register.
   
   Example:
   
 main()
{

    long long ll_input = 0x1111222233334444LL;
    long long ll_val = 3;
    int i_result = 0;

    __asm__ __volatile__( 
		   "or	%0, %L1, %2"
	     : "=r" (i_result) 
	     : "r" (ll_input), "r" (ll_val)); 
}

   Which results in:
   
   	lui	$2, %hi(_gp_disp)
	addiu	$2, $2, %lo(_gp_disp)
	addiu	$sp, $sp, -8
	addu	$2, $2, $25
	sw	$2, 0($sp)
	lui	$2, 13107
	ori	$3, $2, 17476     <-- Low 32 bits of ll_input
	lui	$2, 4369
	ori	$4, $2, 8738      <-- High 32 bits of ll_input
	addiu	$5, $zero, 3  <-- Low 32 bits of ll_val
	addiu	$2, $zero, 0  <-- High 32 bits of ll_val
	#APP
	or	$3, $4, $5        <-- or i_result, high 32 ll_input, low 32 of ll_val
	#NO_APP
	addiu	$sp, $sp, 8
	jr	$ra

If not direction is done for the long long for 32 bit variables results
in using the low 32 bits as ll_val shows.

There is an existing bug if 'L' or 'D' is used for the destination register
for 32 bit long longs in that the target value will be updated incorrectly
for the non-specified part unless explicitly set within the inline asm code.

llvm-svn: 160028
2012-07-10 22:41:20 +00:00
Chad Rosier 3ee9a4c29e Add newline.
llvm-svn: 160006
2012-07-10 17:57:00 +00:00
Chad Rosier 579b1fee6b Add test case accidentally omitted from r160002.
llvm-svn: 160004
2012-07-10 17:49:39 +00:00
Chad Rosier bdb08ac50a Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.  Basically, this is a reapplication of r158087 with a few fixes.

Specifically, (1) the stack pointer is restored from the base pointer before
popping callee-saved registers and (2) in obscure cases (see comments in patch)
we must cache the value of the original stack adjustment in the prologue and
apply it in the epilogue.

rdar://11496434

llvm-svn: 160002
2012-07-10 17:45:53 +00:00
Nadav Rotem d908ddc186 Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Richard Barton 1dc44dcedd Fix instruction description of VMOV (between two ARM core registers and two single-precision resiters) (and do it properly this time!
llvm-svn: 159989
2012-07-10 12:51:09 +00:00
Craig Topper be41e2daa6 Reverse assembler/disassembler operand order for gather instructions.
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Akira Hatanaka efff7b763b Make register Mips::RA allocatable if not in mips16 mode.
llvm-svn: 159971
2012-07-10 00:19:06 +00:00
Chad Rosier aeed158f75 Revert r159938 (and r159945) to appease the buildbots.
llvm-svn: 159960
2012-07-09 20:43:34 +00:00
Owen Anderson d4b841f8f9 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Manman Ren 5f6fa428fa X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.

llvm-svn: 159955
2012-07-09 18:57:12 +00:00
Akira Hatanaka 9bf2b5677d Reapply r158846.
Access mips register classes via MCRegisterInfo's functions instead of via the
TargetRegisterClasses defined in MipsGenRegisterInfo.inc.

llvm-svn: 159953
2012-07-09 18:46:47 +00:00
Nuno Lopes 95cc4f3cb5 instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/...
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway

llvm-svn: 159952
2012-07-09 18:38:20 +00:00
Richard Barton c9e1c94fae Fix instruction description of VMOV (between two ARM core registers and two single-precision resiters)
llvm-svn: 159938
2012-07-09 16:41:33 +00:00
Richard Barton 35aceb86fe Prevent ARM assembler from losing a right shift by #32 applied to a register
llvm-svn: 159937
2012-07-09 16:31:14 +00:00
Richard Barton a39625ecc6 Teach the assembler to use the narrow thumb encodings of various three-register dp instructions where permissable.
llvm-svn: 159935
2012-07-09 16:12:24 +00:00
Manman Ren bb36074047 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.

llvm-svn: 159888
2012-07-07 03:34:46 +00:00
Nuno Lopes fa0dffccee teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users.
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer excapes or has a load.

The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)

llvm-svn: 159876
2012-07-06 23:09:25 +00:00
Akira Hatanaka b577ff116d revert r159851.
llvm-svn: 159854
2012-07-06 20:16:48 +00:00
Akira Hatanaka cfa35fa0ff Reapply r158846.
Include file MipsGenRegisterInfo.inc.

llvm-svn: 159851
2012-07-06 19:29:11 +00:00
Manman Ren c965673707 X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.

llvm-svn: 159838
2012-07-06 17:36:20 +00:00
Chad Rosier 88d53eae56 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Duncan Sands c65aa3f6ae Attempt to fix windows buildbots. Patch by James Benton.
llvm-svn: 159826
2012-07-06 14:43:16 +00:00
NAKAMURA Takumi 4f934676fb test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating.
llvm-svn: 159820
2012-07-06 12:12:39 +00:00
NAKAMURA Takumi 0246724cd6 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

llvm-svn: 159817
2012-07-06 11:12:44 +00:00
Alexey Samsonov 39602781f6 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jush Lu 5e6e6264f4 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 159804
2012-07-06 03:02:37 +00:00
Jack Carter b2af512cef Mips specific inline asm operand modifier D.
Print the second half of a double word operand.
   
   The include list was cleaned up a bit as well.
   
   Also the test case was modified to test for both
   big and little patterns.
   

llvm-svn: 159787
2012-07-05 23:58:21 +00:00
Akira Hatanaka bbf374c4c6 test case for r159770.
llvm-svn: 159771
2012-07-05 19:29:31 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen 2dee812445 Ensure CopyToReg nodes are always glued to the call instruction.
The CopyToReg nodes that set up the argument registers before a call
must be glued to the call instruction. Otherwise, the scheduler may emit
the physreg copies long before the call, causing long live ranges for
the fixed registers.

Besides disabling good register allocation, that can also expose
problems when EmitInstrWithCustomInserter() splits a basic block during
the live range of a physreg.

llvm-svn: 159721
2012-07-04 19:28:31 +00:00
Rafael Espindola 1a7cf13215 Add a testcase for pr13209. It is not a great test, but it still fails if
159509 and 159479 are reverted. It would be really nice to be able to run
just the coalescer :-(

llvm-svn: 159715
2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen 49e4d4b3ef Add early if-conversion support to X86.
Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

llvm-svn: 159695
2012-07-04 00:09:58 +00:00
Nuno Lopes 1e8dffdf27 BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed).
(LLVM optimizers cannot do this optimization by themselves)

llvm-svn: 159668
2012-07-03 17:30:18 +00:00
Craig Topper 676dcd8c39 Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252.
llvm-svn: 159644
2012-07-03 05:49:45 +00:00
NAKAMURA Takumi 2338556320 test/CodeGen/SPARC/private.ll: Fixup. Forgot to prune old RUN lines.
llvm-svn: 159643
2012-07-03 04:29:20 +00:00
NAKAMURA Takumi c2a5bd6822 test/CodeGen/SPARC/private.ll: FileCheck-ize.
llvm-svn: 159642
2012-07-03 04:21:57 +00:00
NAKAMURA Takumi 2a4930c96a llvm/test/lit.cfg: Retweak for Win32 to fix testing.
- execute_external should be;
    - Not on Win32.
    - Using bash.
    In reverse, "execute_internal" shoud be (Win32 && !bash).

  - lit.getBashPath() behaves differently before and after tweaking $PATH.

I will add a few explanations there later.

llvm-svn: 159641
2012-07-03 03:59:34 +00:00
NAKAMURA Takumi dff1a78321 test/CodeGen/X86/sincos.ll: FileCheck-ize.
llvm-svn: 159639
2012-07-03 03:59:22 +00:00
NAKAMURA Takumi 10dc235746 test/CodeGen/X86/fabs.ll: FileCheck-ize.
llvm-svn: 159638
2012-07-03 03:59:15 +00:00
NAKAMURA Takumi ff680b1db6 test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize.
llvm-svn: 159637
2012-07-03 03:59:08 +00:00
NAKAMURA Takumi e5e19e4f7b test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize.
llvm-svn: 159636
2012-07-03 03:58:59 +00:00
Jack Carter b353094f27 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    

llvm-svn: 159625
2012-07-02 23:35:23 +00:00
Eric Christopher dfc3e68c40 Revert " mips32 long long register inline asm constraint support." as
it appears to be breaking the bots.

This reverts commit 1b055ce320fa13f6f1ac81670d11b45e01f79876.

llvm-svn: 159619
2012-07-02 23:22:25 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Jack Carter 939236c2eb deleted test/CodeGen/Mips/inlineasm-cnstrnt-bad-r-1.ll
llvm-svn: 159617
2012-07-02 23:21:22 +00:00
Jack Carter 5c1a01a625 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    

llvm-svn: 159610
2012-07-02 22:39:45 +00:00
Chandler Carruth a7f1f35eb8 Extend the workaround from r159593 to cover a few explicit alias
targets.

llvm-svn: 159597
2012-07-02 21:45:22 +00:00
Chandler Carruth aec961811b Revert r159588, and apply a more principled fix. Place the fix for this
in the abstraction for lit test suites so that the various other layers
of abstraction pick up the same behavioral fix, and so that we still get
a complete list of dependencies for the 'check-all' target.

This should fix the follow-on issues of the same nature with various
other build targets, including Clang targets. Sorry for the churn, and
again thanks to Matt for testing and breaking this more thoroughly.

llvm-svn: 159593
2012-07-02 21:31:03 +00:00
Chandler Carruth 6e80d5934d Work around a really frustrating apparant CMake bug.
No functionality changed here, except that the CMake installed by
default on Ubuntu Lucid should actually work with the makefile
generators now.

Thanks to Matt for the report and head-desking required to figure out
why it was failing.

llvm-svn: 159588
2012-07-02 21:14:06 +00:00
Jack Carter 06de0fb083 Pass the correct ELFOSABI enumeration to the MipsELFObjectWriter constructor
Contributer: Sasa Stankovic 
llvm-svn: 159574
2012-07-02 20:04:43 +00:00
Bob Wilson cac3b90633 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Chandler Carruth ff123d5c63 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

llvm-svn: 159547
2012-07-02 19:09:46 +00:00
Duncan Sands e8ce94fcd7 GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection.
llvm-svn: 159546
2012-07-02 18:55:39 +00:00
Chandler Carruth 5da53436d5 Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

llvm-svn: 159544
2012-07-02 18:37:59 +00:00
Bob Wilson 2297221028 Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

llvm-svn: 159538
2012-07-02 17:22:47 +00:00
Nuno Lopes d0bcfe4d9d fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB
llvm-svn: 159534
2012-07-02 16:14:47 +00:00
Chandler Carruth 665c76bc52 The built-in shell test runner for some reason doesn't like the quoting
and multi-line nature of this test. I don't really feel like bugging
this kind of edge-case, so just put it on one line and use single
quotes. With this, every test *really* passes with the built-in shell
test runner.

llvm-svn: 159530
2012-07-02 13:35:01 +00:00
Chandler Carruth 872ac7cfad Fix the TCL-style quoting in one random test that somehow slipped
through my perl nets.

With this, the test suite passes even if I force it to run with the
built-in shell test logic, except for a test which REQUIREs shell.

llvm-svn: 159529
2012-07-02 13:29:47 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ragnes compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Chandler Carruth a5a29f970e Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

llvm-svn: 159525
2012-07-02 12:47:22 +00:00
Chandler Carruth 0a4a261365 Make tests which first provide a negative assertion via 'not', then
a pipeline, and then a positive assertion via grep, use two RUN lines
instead.

Supporting these complex ideas of 'success' and 'failure' across
multiple stages of a pipeline is brittle in the shell world, and would
block switching to ShTest format; it only worked due to contrivances
introduced by the TclTest format.

Writing this as two separate RUN lines seems clearer in any event.

This is another step toward completely removing TclTests from lit.

llvm-svn: 159524
2012-07-02 12:23:19 +00:00
Chandler Carruth ae00a80869 Rewrite three tests that had truly egregious abuses of 'grep' in them to
use FileCheck.

Aside from removing a dependence on TCL-style quoting, this also makes
the tests ... significantly more robust. =] It would be really, *really*
great of the maintainer(s) of the CellSPU backend went through and
systematically rewrite these tests to use FileCheck. There are a lot
more that have nearly this bad of abuses.

Another step along the path to a TclTest-free testsuite.

llvm-svn: 159523
2012-07-02 12:20:14 +00:00
Chandler Carruth 8bdfe1ec92 Switch a bunch of Linker tests from using elaborate echo productions to
just provide and reference separate input files from an Inputs
subdirectory. This pattern works very well in the Clang tree and is
easier to understand in my opinion. It also has fewer limitations and
will remove one particularly annoying use of TCL-style {} quoting from
the testsuite.

Also teach the LLVM lit configuration to avoid recursing into 'Inputs'
subdirectories. This wasn't required for the previous 'Inputs'
subdirectories used due to fortuitous suffix patterns.

This is the first step to completely removing support for TCL-style tests.

llvm-svn: 159520
2012-07-02 10:18:06 +00:00
Alexey Samsonov f4462fa3ca This patch extends the libLLVMDebugInfo which contains a minimalistic DWARF parser:
1) DIContext is now able to return function name for a given instruction address (besides file/line info).
2) llvm-dwarfdump accepts flag --functions that prints the function name (if address is specified by --address flag).
3) test case that checks the basic functionality of llvm-dwarfdump added

llvm-svn: 159512
2012-07-02 05:54:45 +00:00
Rafael Espindola a77d31d7fd Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Elena Demikhovsky 9af899fa88 Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Chandler Carruth 69ce6652b8 Hoist LLVM's lit testsuite infrastructure into module so that it can be
re-used. Also, build in direct support for accumulating a set of lit
parameters, arguments, and testsuites to run as part of a 'check-all'
rule. This sinks 'check-all' from a Clang-specific construct to
a generic construct of the project.

llvm-svn: 159482
2012-06-30 10:14:14 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Duncan Sands 369c6d270b Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Rafael Espindola efdfb1e6b2 In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

llvm-svn: 159409
2012-06-29 04:22:35 +00:00
Manman Ren 98a5bf24a9 X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Chandler Carruth 0cb6c4bdcc Remove a completely unnecessary mkdir from the CMake build.
Clang has been getting along fine without this for quite some time.

llvm-svn: 159400
2012-06-29 00:45:57 +00:00
Nick Lewycky 474112d82c If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
Nuno Lopes 2f49284f12 make the verifier accept @llvm.donothing as the only intrinsic that can be invoked
While at it, merge 2 tests and FileCheckize them

llvm-svn: 159388
2012-06-28 22:57:00 +00:00
Nuno Lopes b97a4e8bc2 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes 9ac4661afa make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Nuno Lopes 8650fb8e0e make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
Chandler Carruth 3511dd30c8 Move the setup for variables that are expanded in the lit.site.cfg into
a dedicated helper function. This will enable re-using the same logic
for Clang's lit setup, etc.

llvm-svn: 159333
2012-06-28 06:36:24 +00:00
Hal Finkel f2dcb9a9c4 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Jack Carter 6c0bc0b378 The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.

llvm-svn: 159324
2012-06-28 01:33:40 +00:00
Nuno Lopes e6e049020b make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision
llvm-svn: 159320
2012-06-28 01:16:18 +00:00
Chandler Carruth bf2b400f3b Remove 'site.exp' building from both CMake and configure+make.
This is another vestige of the DejaGNU roots. There were FIXMEs in the
lit setup to add a 'lit.site.cfg', which has been around for quite some
time now, so I've properly switched the handling of the 4 things
actually used in site.exp to go through lit.site.cfg now. No more
parsing of the .exp file, one fewer configure-style generated file,
etc., etc.

llvm-svn: 159313
2012-06-28 00:16:51 +00:00
Chandler Carruth fd3a5e33d5 Remove the last vestiges of the '-lit' and '-dg' test runner split by
removing '-lit' qualifiers from make rules. I've left a legacy
'check-local-lit' rule in case build scripts have this encoded
somewhere.

llvm-svn: 159311
2012-06-28 00:03:15 +00:00
Chandler Carruth 256d3a9eaa Rip out legacy DejaGNU support from our Makefiles. This hasn't been the
default in forever, and hasn't even worked since most of the .exp files
were removed.

llvm-svn: 159307
2012-06-27 23:48:39 +00:00
Chandler Carruth b5c1a2b87c LLVM-GCC is dead. Really. I promise. ;]
More importantly, these files don't even have the variable that these
lines purport to substite.

llvm-svn: 159304
2012-06-27 23:34:25 +00:00
Jack Carter ef40238a0e This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159302
2012-06-27 23:13:42 +00:00
Jack Carter b9f9de93df This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159301
2012-06-27 22:48:25 +00:00
Matt Beaumont-Gay a58862310c Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands 514db117bd Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them needs to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Richard Barton 57b7d16e34 Teach assembler to handle capitalised operation values for DSB instructions
llvm-svn: 159259
2012-06-27 09:48:23 +00:00
Chandler Carruth aa324c9078 Clean up the 'check' CMake build rule a bit, notable renaming it to
'check-llvm'.

Don't worry! 'check' still works! =] To rationalize the names of targets
used to run tests, the vague plan is the following:

make check-llvm  # run LLVM reg/unit tests  (currently 'check')
make check-clang # run Clang reg/unit tests (currently 'clang-test')
make check-rt    # run CompilerRT reg/unit tests
make check-asan  # run ASan reg/unit tests (subset of -rt)
make check-tsan  # run TSan reg/unit tests (subset of -rt)
make check-all   # run as much of the above as is available

The last one respects what projects are checked out and built for
a given tree. Personally, I would like to eventually make 'check' be an
alias for 'check-all'. For now however, it is an alias for 'check-llvm',
and thus no behavior has changed.

While this patch and my plan only really apply to CMake, I think it
might be good to similarly rationalize the naming scheme for the Make
builds.

llvm-svn: 159258
2012-06-27 09:44:16 +00:00
Akira Hatanaka ad31cd9a01 Test case for r159240.
llvm-svn: 159242
2012-06-27 00:40:34 +00:00
Evan Cheng 319be53a1f Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Rafael Espindola e0eaa043eb Fix llc's -print-before=pass and -print-after=pass.
llvm-svn: 159227
2012-06-26 21:33:36 +00:00