Commit Graph

3150 Commits

Author SHA1 Message Date
Chris Lattner d39f75ba39 Fix PR2590 by making PatternSortingPredicate actually be
ordered correctly.  Previously it would get in trouble when
two patterns were too similar and give them nondet ordering.
We force this by using the record ID order as a fallback.

The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:

< 	lda $0,-100($16)
---
> 	subq $16,100,$0

llvm-svn: 97509
2010-03-01 22:09:11 +00:00
Chris Lattner 8c1132746b stop using anders-aa
llvm-svn: 97491
2010-03-01 20:24:05 +00:00
Devang Patel 3bf0571bb0 Rewrite test to test VLA using new debug info encoding scheme.
llvm-svn: 97465
2010-03-01 18:30:58 +00:00
Devang Patel 4aefd92040 Remove this generic debug info intrinsic test. LLVM does not use this llvm.dbg.stoppoint intrinsic anymore. There are tests to check new implementation, which attaches location information directly with an instruction using metadata.
llvm-svn: 97464
2010-03-01 18:30:08 +00:00
Chris Lattner 90e7924cf0 add some random nounwinds.
llvm-svn: 97411
2010-02-28 20:36:49 +00:00
Dan Gohman 34021b7445 Don't try to replace physical registers when doing CSE.
llvm-svn: 97360
2010-02-28 01:33:43 +00:00
Dan Gohman 45e7ffc350 Add nounwinds.
llvm-svn: 97349
2010-02-27 23:53:53 +00:00
Evan Cheng 228c31f045 Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap.
llvm-svn: 97310
2010-02-27 07:36:59 +00:00
Jakob Stoklund Olesen ddbf7a858e Use the right floating point load/store instructions in PPCInstrInfo::foldMemoryOperandImpl().
The PowerPC floating point registers can represent both f32 and f64 via the
two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to
allow cross-class coalescing. This coalescing only affects whether registers
are spilled as f32 or f64.

Spill slots must be accessed with load/store instructions corresponding to the
class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking
at the instruction opcode which is wrong.

X86 has similar floating point register classes, but doesn't try to fold
memory operands, so there is no problem there.

llvm-svn: 97262
2010-02-26 21:09:24 +00:00
Sanjiv Gupta 2bdbb3c167 Reapply things reverted back in 97220, with the fixed test case.
llvm-svn: 97228
2010-02-26 17:59:28 +00:00
Richard Osborne 333300e0df Fix XCoreTargetLowering::isLegalAddressingMode() to handle VoidTy.
Previously LoopStrengthReduce would sometimes be unable to find
a legal formula, causing an assertion failure.

llvm-svn: 97226
2010-02-26 16:44:51 +00:00
Chris Lattner f7fc2d8b86 change the scope node to include a list of children to be checked
instead of to have a chained series of scope nodes.  This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more 
reasonable to implement.

llvm-svn: 97160
2010-02-25 19:00:39 +00:00
Dan Gohman 9b80f86e5b Revert r97064. Duncan pointed out that bitcasts are defined in
terms of store and load, which means bitcasting between scalar
integer and vector has endian-specific results, which undermines
this whole approach.

llvm-svn: 97137
2010-02-25 15:20:39 +00:00
Dan Gohman a9c205cc88 Make LoopSimplify change conditional branches in loop exiting blocks
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.

Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.

llvm-svn: 97126
2010-02-25 06:57:05 +00:00
Jakob Stoklund Olesen 63af51c1c8 Create a stack frame on ARM when
- Function uses all scratch registers AND
- Function does not use any callee saved registers AND
- Stack size is too big to address with immediate offsets.

In this case a register must be scavenged to calculate the address of a stack
object, and the scavenger needs a spare register or emergency spill slot.

llvm-svn: 97071
2010-02-24 22:43:17 +00:00
Bob Wilson ba8ac74fd9 Check for comparisons of +/- zero when optimizing less-than-or-equal and
greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions.  This is
only allowed when UnsafeFPMath is set or when at least one of the operands
is known to be nonzero.

llvm-svn: 97065
2010-02-24 22:15:53 +00:00
Dan Gohman 4b2b48daba Make getTypeSizeInBits work correctly for array types; it should return
the number of value bits, not the number of bits of allocation for in-memory
storage.

Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and
vectors.

Fix several places in CodeGen which compute offsets into in-memory vectors
to use TargetData information.

This fixes PR1784.

llvm-svn: 97064
2010-02-24 22:05:23 +00:00
Daniel Dunbar 4811d004be Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in
the hopes of fixing PPC bootstrap.

llvm-svn: 97040
2010-02-24 17:05:47 +00:00
Dan Gohman 3860521406 When forming SSE min and max nodes for UGE and ULE comparisons, it's
necessary to swap the operands to handle NaN and negative zero properly.

Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.

llvm-svn: 97025
2010-02-24 06:52:40 +00:00
Chris Lattner df8a8a8c6f Change the scheduler from adding nodes in allnodes order
to adding them in a determinstic order (bottom up from 
the root) based on the structure of the graph itself.

This updates tests for some random changes, interesting
bits: CodeGen/Blackfin/promote-logic.ll no longer crashes.
I have no idea why, but that's good right?

CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but
now compiles to have one fewer constant pool entry, making
the expected load that was being folded disappear.  Since it
is an unreduced mass of gnast, I just removed it.

This fixes PR6370

llvm-svn: 97023
2010-02-24 06:11:37 +00:00
Jim Grosbach 6ad4bcb0da LowerCall() should always do getCopyFromReg() to reference the stack pointer.
Machine instruction selection is much happier when operands are in virtual
registers.

llvm-svn: 97012
2010-02-24 01:43:03 +00:00
Evan Cheng 328a607490 Re-apply 96540 and 96556 with fixes.
llvm-svn: 97011
2010-02-24 01:42:31 +00:00
Jakob Stoklund Olesen a2d8c97b65 DIV8r must define %AX since X86DAGToDAGISel::Select() sometimes uses it
instead of %AL/%AH.

llvm-svn: 97006
2010-02-24 00:39:35 +00:00
Jakob Stoklund Olesen fe0a8cd210 Remember to handle sub-registers when moving imp-defs to a rematted instruction.
llvm-svn: 96995
2010-02-23 22:44:02 +00:00
Jakob Stoklund Olesen 38b76e27a7 Keep track of phi join registers explicitly in LiveVariables.
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.

llvm-svn: 96994
2010-02-23 22:43:58 +00:00
Wesley Peck e4801e49c9 Adding the MicroBlaze backend.
The MicroBlaze is a highly configurable 32-bit soft-microprocessor for
use on Xilinx FPGAs. For more information see:
http://www.xilinx.com/tools/microblaze.htm
http://en.wikipedia.org/wiki/MicroBlaze

The current LLVM MicroBlaze backend generates assembly which can be
compiled using the an appropriate binutils assembler.

llvm-svn: 96969
2010-02-23 19:15:24 +00:00
Richard Osborne f578196968 Lower BR_JT on the XCore to a jump into a series of jump instructions.
llvm-svn: 96942
2010-02-23 13:25:07 +00:00
Evan Cheng 2a33390e2b These should not have been committed.
llvm-svn: 96827
2010-02-22 23:37:48 +00:00
Chris Lattner 72334622d6 no need to run llvm-as here.
llvm-svn: 96826
2010-02-22 23:34:12 +00:00
Evan Cheng 3688b8fa68 Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly point out of bound, it should just return "i don't know" rather than asserting.
llvm-svn: 96825
2010-02-22 23:34:00 +00:00
Dan Gohman 6c5ac6de5c Canonicalize ConstantInts to the right operand of commutative
operators.

The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.

llvm-svn: 96816
2010-02-22 22:43:23 +00:00
Dan Gohman be24c455c4 Actually enable the -enable-unsafe-fp-math tests.
llvm-svn: 96796
2010-02-22 18:53:26 +00:00
Arnold Schwaighofer 30ece5b807 Mark the return address stack slot as mutable when moving the return address
during a tail call. A parameter might overwrite this stack slot during the tail
call. 

The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg

If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).

This fixes bug 6225.

llvm-svn: 96783
2010-02-22 16:18:09 +00:00
Dan Gohman b87de8d30d Remove the logic for reasoning about NaNs from the code that forms
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.

Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.

llvm-svn: 96775
2010-02-22 04:03:39 +00:00
Dan Gohman 4506fcb3c2 When emitting an instruction which depends on both a post-incremented
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.

llvm-svn: 96774
2010-02-22 03:59:54 +00:00
Chris Lattner 745219ea64 add some no-unwinds, other minor cleanups.
llvm-svn: 96756
2010-02-21 20:33:20 +00:00
Chris Lattner c43c88ebce add a triple so that this doesn't fail due to linux/ppc register printing
syntax.

llvm-svn: 96748
2010-02-21 19:27:38 +00:00
Chris Lattner 53485469b4 filecheckize and add nouwinds.
llvm-svn: 96745
2010-02-21 18:53:28 +00:00
Anton Korobeynikov e96503faa1 IT turns out that during jumpless setcc lowering eq and ne were swapped.
This fixes PR6348

llvm-svn: 96734
2010-02-21 12:28:58 +00:00
Chris Lattner 3c29aff9ff fix and un-xfail X86/vec_ss_load_fold.ll
llvm-svn: 96720
2010-02-21 04:53:34 +00:00
Chris Lattner 7d5f4a4c03 temporarily disable this.
llvm-svn: 96717
2010-02-21 03:24:41 +00:00
Dan Gohman 85af256779 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Charles Davis 7e47767763 Add support for the 'alignstack' attribute to the x86 backend. Fixes PR5254.
Also, FileCheck'ize a test.

llvm-svn: 96686
2010-02-19 18:17:13 +00:00
Duncan Sands d0bf6f640f Revert commits 96556 and 96640, because commit 96556 breaks the
dragonegg self-host build.  I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier.  The
symptom of the 96556 miscompile is the following crash:

  llvm[3]: Compiling AlphaISelLowering.cpp for Release build
  cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
  Stack dump:
  0.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
  g++: Internal error: Aborted (program cc1plus)

This occurs when building LLVM using LLVM built by LLVM (via
dragonegg).  Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM.  Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.

Found by bisection.

r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines

Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines

Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96672
2010-02-19 11:30:41 +00:00
Evan Cheng d2d9252f35 Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96640
2010-02-19 00:34:39 +00:00
Dan Gohman 2446f57503 When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Mon P Wang c94892513d getSplatIndex assumes that the first element of the mask contains the splat index
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.

llvm-svn: 96621
2010-02-18 22:33:18 +00:00
Jakob Stoklund Olesen c953acbd7f Always normalize spill weights, also for intervals created by spilling.
Moderate the weight given to very small intervals.

The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.

This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.

llvm-svn: 96613
2010-02-18 21:33:05 +00:00
Dan Gohman 5ffef745c2 Make CodePlacementOpt detect special EH control flow by
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.

llvm-svn: 96611
2010-02-18 21:25:53 +00:00
Chris Lattner 6a9bdade29 remove empty file
llvm-svn: 96573
2010-02-18 06:29:06 +00:00
Bob Wilson c6c13a3515 Use NEON vmin/vmax instructions for floating-point selects.
Radar 7461718.

llvm-svn: 96572
2010-02-18 06:05:53 +00:00
Evan Cheng 0ceb68a552 Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

llvm-svn: 96556
2010-02-18 02:13:50 +00:00
Dan Gohman 104207b4c5 Don't check for comments, which vary between subtargets.
llvm-svn: 96434
2010-02-17 01:08:57 +00:00
Dan Gohman 5f10d6c52c Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Chris Lattner 1fc2773a33 roundss is an sse 4 thing, fix the test on non-sse41 builders
like llvm-gcc-x86_64-darwin10-selfhost

llvm-svn: 96417
2010-02-17 00:29:06 +00:00
Dale Johannesen cee887425e Make g5 target explicit; scheduling affects register choice.
llvm-svn: 96413
2010-02-16 23:25:23 +00:00
Chris Lattner afac7dad21 fix rdar://7653908, a crash on a case where we would fold a load
into a roundss intrinsic, producing a cyclic dag.  The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection.  Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.

llvm-svn: 96408
2010-02-16 22:35:06 +00:00
Dale Johannesen 0062f7bf59 Adjust register numbers in tests to compensate for the
new lack of R2.

llvm-svn: 96407
2010-02-16 22:31:31 +00:00
Chris Lattner c98beb567c filecheckize
llvm-svn: 96404
2010-02-16 22:13:43 +00:00
Evan Cheng 82b04130cb Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)).
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).

Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335

llvm-svn: 96389
2010-02-16 21:09:44 +00:00
David Greene 9641d06809 Add support for emitting non-temporal stores for DAGs marked
non-temporal.  Fix from r96241 for botched encoding of MOVNTDQ.

Add documentation for !nontemporal metadata.

Add a simpler movnt testcase.

llvm-svn: 96386
2010-02-16 20:50:18 +00:00
Bob Wilson 70aa8d0745 Fix pr6111: Avoid using the LR register for the target address of an indirect
branch in ARM v4 code, since it gets clobbered by the return address before
it is used.  Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.

llvm-svn: 96360
2010-02-16 17:24:15 +00:00
Dan Gohman 521efe68ab Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Anton Korobeynikov ae4ccc10da Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there
llvm-svn: 96285
2010-02-15 22:35:59 +00:00
Jakob Stoklund Olesen 2988d573e5 Fix PR6300.
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.

llvm-svn: 96279
2010-02-15 22:03:29 +00:00
Bob Wilson 9be7200b08 Last week we were generating code with duplicate induction variables in this
test, but the problem seems to have gone away today.  Add a check to make sure
it doesn't come back.

llvm-svn: 96277
2010-02-15 21:56:40 +00:00
Chris Lattner 3818d9763d remove empty file.
llvm-svn: 96271
2010-02-15 21:14:50 +00:00
Chris Lattner bcbaaba532 revert r96241. It breaks two regression tests, isn't documented,
and the testcase needs improvement.

llvm-svn: 96265
2010-02-15 20:53:01 +00:00
Chris Lattner 6fbfe5897c fix PR6305 by handling BlockAddress in a helper function
called by jump threading.

llvm-svn: 96263
2010-02-15 20:47:49 +00:00
David Greene 63cedef74b Add support for emitting non-temporal stores for DAGs marked
non-temporal.

llvm-svn: 96241
2010-02-15 17:02:56 +00:00
Jakob Stoklund Olesen b659c76c77 Fix PR6283.
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.

Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.

llvm-svn: 96072
2010-02-13 02:06:10 +00:00
Bob Wilson 01abf8fc2f Besides removing phi cycles that reduce to a single value, also remove dead
phi cycles.  Adjust a few tests to keep dead instructions from being optimized
away.  This (together with my previous change for phi cycles) fixes Apple
radar 7627077.

llvm-svn: 96057
2010-02-13 00:31:44 +00:00
Dale Johannesen 26062150fa When save/restoring CR at prolog/epilog, in a large
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot.  Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.

SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.

Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.

llvm-svn: 96015
2010-02-12 21:35:34 +00:00
Anton Korobeynikov b9ce3cc458 Testcases for recent stdcall / fastcall mangling improvements
llvm-svn: 95982
2010-02-12 15:29:13 +00:00
Anton Korobeynikov c9276dfe04 Cleanup stdcall / fastcall name mangling.
This should fix alot of problems we saw so far, e.g. PRs 5851 & 2936

llvm-svn: 95980
2010-02-12 15:28:40 +00:00
Dan Gohman 45774ce0ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Bob Wilson 0827e040e0 Add a new pass on machine instructions to optimize away PHI cycles that
reduce down to a single value.  InstCombine already does this transformation
but DAG legalization may introduce new opportunities.  This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized.  I measured the compile time 
impact of this (running llc on 176.gcc) and it was not significant.

llvm-svn: 95951
2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen 93c92225af Reapply coalescer fix for better cross-class coalescing.
This time with fixed test cases.

llvm-svn: 95938
2010-02-11 23:55:29 +00:00
Mon P Wang 5b77f0dac1 The previous fix of widening divides that trap was too fragile as it depends on custom
lowering and requires that certain types exist in ValueTypes.h.  Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements.  It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.

llvm-svn: 95823
2010-02-10 23:37:45 +00:00
Bob Wilson 0f52d0c074 Delete dead PHI machine instructions. These can be created due to type
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.

llvm-svn: 95816
2010-02-10 22:58:57 +00:00
Evan Cheng 29b8f554fc Now that ShrinkDemandedOps() is separated out from DAG combine. It sometimes leave some obvious nops which dag combine used to clean up afterwards e.g. (trunk (ext n)) -> n. Look for them and squash them.
llvm-svn: 95757
2010-02-10 02:17:34 +00:00
Chris Lattner 78360a8184 move tests that depend on the x86 backend out of codegen/generic,
and remove a few old and unreduced ones.  Fixes PR5624. 

llvm-svn: 95656
2010-02-09 06:41:03 +00:00
Chris Lattner 4660c225a2 make target independent.
llvm-svn: 95655
2010-02-09 06:36:30 +00:00
Chris Lattner 933509287b merge a target-specific add test into x86 directory.
llvm-svn: 95654
2010-02-09 06:35:50 +00:00
Chris Lattner 015ecd85d4 merge another test in, drop the trivially constant folded cases.
llvm-svn: 95653
2010-02-09 06:33:27 +00:00
Chris Lattner c77b9eb31c consolidate and filecheckize two tests.
llvm-svn: 95652
2010-02-09 06:24:00 +00:00
Chris Lattner 534e1667a0 merge two tests, make target independent.
llvm-svn: 95651
2010-02-09 06:19:20 +00:00
Chris Lattner ae67ca33ed convert to filecheck.
llvm-svn: 95608
2010-02-08 23:47:34 +00:00
Chris Lattner b6b2164e28 add an x86 implementation of MCTargetExpr for
representing @GOT and friends.  Use it for
personality references as a first use.

llvm-svn: 95588
2010-02-08 22:09:08 +00:00
Dan Gohman 4268d6a7c3 When CodeGen'ing unoptimized code, there may be unfolded constant expressions
in global initializers. Instead of aborting, attempt to fold them on the
spot. If folding succeeds, emit the folded expression instead.

This fixes PR6255.

llvm-svn: 95583
2010-02-08 22:02:38 +00:00
Dan Gohman bd374da130 In guaranteed tailcall mode, don't decline the tailcall optimization
for blocks ending in "unreachable".

llvm-svn: 95565
2010-02-08 20:34:14 +00:00
Evan Cheng ea5c6be766 Run codegen dce pass for all targets at all optimization levels. Previously it's
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.

llvm-svn: 95493
2010-02-06 09:07:11 +00:00
Evan Cheng c72f7882c0 Remove a large test case that (soon will) no longer make sense.
llvm-svn: 95492
2010-02-06 09:00:30 +00:00
Rafael Espindola 4536b9a904 Fix alignment on ppc linux. This fixes the build of crtend.o
llvm-svn: 95477
2010-02-06 03:32:21 +00:00
Evan Cheng d064aefefc Do not emit callseq instructions around sibcalls. This eliminated some unnecessary stack adjustments.
llvm-svn: 95475
2010-02-06 03:28:46 +00:00
Bob Wilson 5638c36efd Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex.
Radar 7614112.

llvm-svn: 95456
2010-02-06 00:24:38 +00:00
Jakob Stoklund Olesen 5f9ead2714 Don't unroll loops containing function calls.
llvm-svn: 95454
2010-02-05 23:21:31 +00:00
Bill Wendling 994da1a479 Make test more fucused eliminating extraneous bits.
llvm-svn: 95384
2010-02-05 11:21:05 +00:00
Evan Cheng c8b4db77be Fix test.
llvm-svn: 95373
2010-02-05 06:37:00 +00:00
Evan Cheng a366c61f77 Handle tail call with byval arguments.
llvm-svn: 95351
2010-02-05 02:21:12 +00:00
Evan Cheng 3b245876c0 When the scheduler unfold a load folding instruction it move some of the predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the cases whether the predecessor is a flagged scheduling unit.
rdar://7604000

llvm-svn: 95339
2010-02-05 01:27:11 +00:00
Bill Wendling 6510dc8dc3 An empty global constant (one of size 0) may have a section immediately
following it. However, the EmitGlobalConstant method wasn't emitting a body for
the constant. The assembler doesn't like that. Before, we were generating this:

  .zerofill __DATA, __common, __cmd, 1, 3

This fix puts us back to that semantic.

llvm-svn: 95336
2010-02-05 00:17:02 +00:00
Jakob Stoklund Olesen c7c89b8325 Fix small bug in handling instructions with more than one implicitly defined operand.
ProcessImplicitDefs would only mark one operand per instruction with <undef>.
This fixed PR6086.

llvm-svn: 95319
2010-02-04 18:46:28 +00:00
Evan Cheng aeba2250a5 Re-enable x86 tail call optimization.
llvm-svn: 95295
2010-02-04 06:47:24 +00:00
Chris Lattner 8228b11abc add support for the sparcv9-*-* target triple to turn on
64-bit sparc codegen.  Patch by Nathan Keynes!

llvm-svn: 95293
2010-02-04 06:34:01 +00:00
Evan Cheng f4139067ee Speculatively disable x86 automatic tail call optimization while we track down a self-hosting issue.
llvm-svn: 95259
2010-02-03 21:40:40 +00:00
Evan Cheng 112a871fe2 Make test less fragile
llvm-svn: 95258
2010-02-03 21:39:04 +00:00
Evan Cheng 27a41d5473 Revert 94937 and move the noreturn check to codegen.
llvm-svn: 95198
2010-02-03 03:55:59 +00:00
Evan Cheng 40905b4302 Allow all types of callee's to be tail called. But avoid automatic tailcall if the callee is a result of bitcast to avoid losing necessary zext / sext etc.
llvm-svn: 95195
2010-02-03 03:28:02 +00:00
Dale Johannesen a466692552 Reapply 95050 with a tweak to check the register class.
llvm-svn: 95183
2010-02-03 01:40:33 +00:00
Chris Lattner dee74e2805 make these less sensitive to asm verbose changes by disabling it for them.
llvm-svn: 95175
2010-02-03 00:48:53 +00:00
Dale Johannesen da431c76fb Test revert 95050; there's a good chance it's causing
buildbot failure.

llvm-svn: 95103
2010-02-02 18:52:56 +00:00
Evan Cheng 55afd2564c Perform sibcall in some cases when arguments are passes memory. Look for cases
where callee's arguments are already in the caller's own caller's stack and
they line up perfectly. e.g.

extern int foo(int a, int b, int c);

int bar(int a, int b, int c) {
  return foo(a, b, c);
}

llvm-svn: 95053
2010-02-02 02:22:50 +00:00
Dale Johannesen c84816a62e Make local RA smarter about reusing input register of a copy
as output.  Needed for (functional) correctness in inline asm,
and should be generally beneficial.  7361612.

llvm-svn: 95050
2010-02-02 02:08:02 +00:00
Evan Cheng a49d8e6d38 Fix PR6196. GV callee may not be a function.
llvm-svn: 95017
2010-02-01 22:40:09 +00:00
Dan Gohman 36bca4e4ba Update this test for a trivial register allocation difference.
llvm-svn: 94989
2010-02-01 19:00:32 +00:00
Evan Cheng ed8ca56eeb Undo r94946 now all the tests are passing again.
llvm-svn: 94970
2010-02-01 02:13:39 +00:00
Evan Cheng 7f62def0f9 Avoid recursive sibcall's.
llvm-svn: 94946
2010-01-31 06:44:49 +00:00
Anton Korobeynikov 25df248382 Fix a gross typo: ARMv6+ may or may not support unaligned memory operations.
Even if they are suported by the core, they can be disabled
(this is just a configuration bit inside some register).

Allow unaligned memops on darwin and conservatively disallow them otherwise.

llvm-svn: 94889
2010-01-30 14:08:12 +00:00
Evan Cheng 70f714fdbe Allow more tailcall optimization: calls with inputs that are all passed in registers.
llvm-svn: 94873
2010-01-30 01:22:00 +00:00
Evan Cheng 297a494f55 Catch more trivial tail call opportunities: no inputs and output types match.
llvm-svn: 94804
2010-01-29 06:45:59 +00:00
Chris Lattner cc9a6f0580 convert the last 3 targets to use EmitFunctionBody() now that
it has before/end body hooks.

 lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp |   49 ++-----------
 lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp   |   87 ++++++------------------
 lib/Target/XCore/AsmPrinter/XCoreAsmPrinter.cpp |   56 +++------------
 test/CodeGen/XCore/ashr.ll                      |    2 
 4 files changed, 48 insertions(+), 146 deletions(-)

llvm-svn: 94741
2010-01-28 06:22:43 +00:00
Evan Cheng 346af88396 Fix a bug introduced by r94490 where it created a X86ISD::CMP whose output type is different from its inputs.
This fixes PR6146.

llvm-svn: 94731
2010-01-28 01:57:22 +00:00
Chris Lattner 73de5fbfc3 Give AsmPrinter the most common expected implementation of
runOnMachineFunction, and switch PPC to use EmitFunctionBody.
The two ppc asmprinters now don't heave to define 
runOnMachineFunction.

llvm-svn: 94722
2010-01-28 01:28:58 +00:00
Chris Lattner 565896b9eb emit a 0 byte instead of a noop if a function is empty on darwin.
"0" is nice and target independent.

llvm-svn: 94718
2010-01-28 01:06:32 +00:00
Chandler Carruth 4fe7a3bc08 Quick fix to a test that is currently failing on every Linux build bot. No idea
if this is the "correct" fix, but it seems a strict improvement.

llvm-svn: 94675
2010-01-27 10:36:15 +00:00
Evan Cheng 85476f304c Perform trivial tail call optimization for callees with "C" ABI. These are done
even when -tailcallopt is not specified and it does not require changing ABI.
First case is the most trivial one. Perform tail call optimization when both
the caller and callee do not return values and when the callee does not take
any input arguments.

llvm-svn: 94664
2010-01-27 06:25:16 +00:00
Chris Lattner b657c4cdc3 emit jump table an alias ".set" directives through MCStreamer as
assignments.

.set x, a-b

is the same as:

x = a-b

llvm-svn: 94596
2010-01-26 21:53:08 +00:00
Rafael Espindola dcb03f0f6b Emit .comm alignment in bytes but .align in powers of 2 for ARM ELF.
Original patch by Sandeep Patel and updated by me.

llvm-svn: 94582
2010-01-26 20:21:43 +00:00
Chris Lattner 3dd38a8112 eliminate MCAsmInfo::NeedsSet: we now just use .set on any platform
that has it.

llvm-svn: 94581
2010-01-26 20:20:43 +00:00
Evan Cheng 555f61bf58 Implement cond ? -1 : 0 with sbb.
llvm-svn: 94490
2010-01-26 02:00:44 +00:00
Rafael Espindola 4cb52db485 Update test for darwin.
llvm-svn: 94421
2010-01-25 15:32:10 +00:00
Chris Lattner 9b83727cfe we removed support for darwin8 tools.
llvm-svn: 94414
2010-01-25 07:43:40 +00:00
Rafael Espindola a1141dd6ab Fix PR6134.
We are not emitting alignments on Darwin for "bar". Not sure what is the
correct way to do it.

llvm-svn: 94400
2010-01-25 02:27:39 +00:00
Daniel Dunbar 75652a6f2b Attempt to unbreak test on Linux. Chris, please check.
llvm-svn: 94399
2010-01-25 00:54:13 +00:00
Chris Lattner 45dd2327cb just remove this test, it is not reduced, is not clear what its testing for and
it is dying due to fragility in the asmprinter .s comments.

llvm-svn: 94372
2010-01-24 19:23:09 +00:00
Mon P Wang 4f45512c23 It seems better to scalarize vectors of size 1 instead of widening them.
Add support to widen SETCC.

llvm-svn: 94342
2010-01-24 00:24:43 +00:00
Mon P Wang 586d997e98 Improved widening loads by adding support for wider loads if
the alignment allows.  Fixed a bug where we didn't use a
vector load/store for PR5626.

llvm-svn: 94338
2010-01-24 00:05:03 +00:00
Chris Lattner d1acffc845 Change constantexpr global variable initializers to convert the constants
to MCExpr then emit them through MCStreamer with EmitValue.  I think all
global variable initializers are now going through mcstreamer.

llvm-svn: 94293
2010-01-23 06:17:14 +00:00
Eric Christopher c1451d764f Don't lower splat vector load to relative to the esp if the
stack may be misaligned.

Update test accordingly.

Patch by Evan Cheng!

llvm-svn: 94291
2010-01-23 06:02:43 +00:00
Chris Lattner 82b86b0fce stop testing for invalid output.
llvm-svn: 94288
2010-01-23 05:45:28 +00:00
Chris Lattner 68eeb5ec9c emit .ascii and .asciz through MCStreamer.
llvm-svn: 94282
2010-01-23 04:54:10 +00:00
Chris Lattner cabb6ff64d remove this test.
llvm-svn: 94276
2010-01-23 03:11:10 +00:00
Evan Cheng 8204911e1d Fix test.
llvm-svn: 94272
2010-01-23 01:21:27 +00:00
Evan Cheng e1b8b5a01b Fix tests.
llvm-svn: 94271
2010-01-23 01:19:28 +00:00
Chris Lattner 88b8b1b419 make this less constrained, we want blank lines between globals.
llvm-svn: 94201
2010-01-22 19:51:08 +00:00
Dan Gohman 045f81981a Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Chris Lattner 1526375827 testcase for r94095
llvm-svn: 94096
2010-01-21 20:01:04 +00:00
Dan Gohman 51ad99d2c5 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Chris Lattner f8dcf784a7 emit basic block labels with mcstreamer.
llvm-svn: 93993
2010-01-20 07:24:05 +00:00
Chris Lattner 4c8b1824f0 emit integer and fp zeros as (e.g.) .byte 0 instead of .space 1,
for tidiness.

llvm-svn: 93992
2010-01-20 07:19:19 +00:00
Chris Lattner 03cb2a3035 signficant cleanups to EmitGlobalConstant (including streamerization
of int initializers), change some methods to be static functions,
use raw_ostream::write_hex instead of a smallstring dance with 
APValue::toStringUnsigned(S, 16).

llvm-svn: 93991
2010-01-20 07:11:32 +00:00
Dan Gohman 954f49014d Fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n)), to simplify some code
that SCEVExpander can produce when running on behalf of LSR.

llvm-svn: 93949
2010-01-19 23:30:49 +00:00
Dan Gohman 221708d826 Make SCEVAddRecExpr's getType return a pointer type when the add
has a pointer member. This helps reduce unnecessary bitcasting
and uglygeps.

llvm-svn: 93939
2010-01-19 22:53:50 +00:00
Dan Gohman fb9bea7150 Add nounwinds.
llvm-svn: 93919
2010-01-19 21:51:51 +00:00
Jakob Stoklund Olesen bdc17f6840 Remove predicates when changing an add into an unpredicable mov.
Since the mov is executed unconditionally, make sure that the add didn't have
any predicate.

llvm-svn: 93909
2010-01-19 21:08:28 +00:00
Evan Cheng bf43525a29 Do not extend extension results beyond the use of a PHI instruction at the start of a use block. A PHI use is expected to kill its source values.
llvm-svn: 93895
2010-01-19 19:45:51 +00:00
Chris Lattner a6368219ac don't let asm-verbose break the check-next lines in these tests.
llvm-svn: 93869
2010-01-19 06:39:54 +00:00
Chris Lattner c7a062d187 Now that we have everything nicely factored (e.g. asmprinter is not
doing global variable classification anymore) and hookized, sink almost
all target targets global variable emission code into AsmPrinter and out
of each target.

Some notes:

1. PIC16 does completely custom and crazy stuff, so it is not changed.
2. XCore has some custom handling for extra directives.  I'll look at it next.
3. This switches linux/ppc to use .globl instead of .global.  If .globl is
   actually wrong, let me know and I'll fix it.
4. This makes linux/ppc get a lot of random cases right which were obviously
   wrong before, it is probably now a bit healthier.
5. Blackfin will probably start getting .comm and other things that it didn't
   before.  If this is undesirable, it should explicitly opt out of these
   things by clearing the relevant fields of MCAsmInfo.

This leads to a nice diffstat:
 14 files changed, 127 insertions(+), 830 deletions(-)

llvm-svn: 93858
2010-01-19 05:38:33 +00:00
Chris Lattner a986aa33eb fix a significant difference between llvm and gcc on ELF systems:
GCC would put weak zero initialized mutable data in the .bss section,
we would put it into a crasy '.gnu.linkonce.b.test,"aw",@nobits' 
section.  Fixing this will allow simplifications next up.

llvm-svn: 93844
2010-01-19 03:06:01 +00:00
Chris Lattner 024734e0f0 there is no need to emit a .section above .comm on linux.
llvm-svn: 93842
2010-01-19 02:46:56 +00:00
Evan Cheng 4668a3b935 Test case for r93758.
llvm-svn: 93824
2010-01-19 00:35:20 +00:00
Evan Cheng 88b65bc835 Canonicalize -1 - x to ~x.
Instcombine does this but apparently there are situations where this pattern will escape the optimizer and / or created by isel. Here is a case that's seen in JavaScriptCore:
  %t1 = sub i32 0, %a
  %t2 = add i32 %t1, -1
The dag combiner pattern: ((c1-A)+c2) -> (c1+c2)-A
will fold it to -1 - %a.

llvm-svn: 93773
2010-01-18 21:38:44 +00:00
Chris Lattner 387c6b20cd reduce this test and convert to filecheck, hopefully the linux buildbot
will tell me something more useful.

llvm-svn: 93688
2010-01-17 19:09:12 +00:00
Bob Wilson 9349437c65 The Neon "vtst" instruction takes a suffix that is the element size alone --
adding an "i" to the suffix, indicating that the elements are integers, is
accepted but not part of the standard syntax.  This helps us pass a few more
of the Neon tests from gcc.

llvm-svn: 93677
2010-01-17 06:35:17 +00:00
Kenneth Uildriks dd6ddd1aeb When checking for sret-demotion, it needs to use legal types. When using the return value of an sret-demoted call, it needs to use possibly illegal types that match the declared Type of the callee.
llvm-svn: 93667
2010-01-16 23:37:33 +00:00
Chris Lattner 08eff61eeb this teestcase takes a long time to crash, remove it. If someone cares about this, they should file a bug, it's not doing any good as an xfail.
llvm-svn: 93604
2010-01-16 00:53:22 +00:00
Bob Wilson 298cdac99c Run the pre-register allocation tail duplication pass by default. Remove
the -pre-regalloc-taildup command-line option, and add a new
-disable-early-taildup option.

llvm-svn: 93597
2010-01-16 00:29:50 +00:00
David Greene b0c0e6433f Fix PR6019. A load has more than one use if it feeds a bitconvert that
has more than one use.

llvm-svn: 93576
2010-01-15 23:23:41 +00:00
Jim Grosbach fd850837a3 add testcase for r93564
llvm-svn: 93567
2010-01-15 22:27:37 +00:00
Anton Korobeynikov 07e8171fcb Reenable tests
llvm-svn: 93555
2010-01-15 21:19:26 +00:00
Anton Korobeynikov 3a0b066d24 Temporary disable tests
llvm-svn: 93501
2010-01-15 02:09:27 +00:00
Anton Korobeynikov fdf7031a1a Add variable-width shifts for MSP430
llvm-svn: 93468
2010-01-14 22:09:38 +00:00
Dan Gohman dd5286dc63 Fix a codegen abort seen in 483.xalancbmk.
llvm-svn: 93417
2010-01-14 03:08:49 +00:00
Chris Lattner fb40a8e5f1 this test requires SSE, thanks to jyasskin for pointing this out.
llvm-svn: 93360
2010-01-13 21:51:41 +00:00
Evan Cheng af0ad65ff2 Commit some changes I had managed to lose last night while refactoring the code. Avoid change use of PHI instructions because it's not legal to insert any instructions before them.
This fixes PR6027.

llvm-svn: 93335
2010-01-13 19:16:39 +00:00
Evan Cheng b5499d09d1 Re-enable extension optimization pass.
llvm-svn: 93313
2010-01-13 08:45:40 +00:00
Chris Lattner 25d8ed3773 remove uses of deprecated functions, this generates slightly
different BlockAddress labels, but nothing semantically important.

Add a FIXME that BlockAddress codegen is broken if the LLVM BB has 
an empty name (e.g. strip was run).

llvm-svn: 93303
2010-01-13 07:30:49 +00:00
Evan Cheng d7d8f6d000 Disable opt-ext pass to unbreak the build for now.
llvm-svn: 93286
2010-01-13 01:51:43 +00:00
Jeffrey Yasskin 0ad23efb0f Try to fix the ARM and PPC buildbots. The -mattr=vector-unaligned-mem
flag doesn't exist there, and this is an x86 test.

llvm-svn: 93279
2010-01-13 00:31:43 +00:00
Evan Cheng 30bebff456 Add a quick pass to optimize sign / zero extension instructions. For targets where the pre-extension values are available in the subreg of the result of the extension, replace the uses of the pre-extension value with the result + extract_subreg.
For now, this pass is fairly conservative. It only perform the replacement when both the pre- and post- extension values are used in the block. It will miss cases where the post-extension values are live, but not used.

llvm-svn: 93278
2010-01-13 00:30:23 +00:00
Evan Cheng 0e9189371f Add nounwind.
llvm-svn: 93244
2010-01-12 18:29:23 +00:00
Duncan Sands b7168c270e Revert commit 93204, since it causes the assembler to barf
on x86-64 linux with messages like this:
Error: Incorrect register `%r14' used with `l' suffix

llvm-svn: 93242
2010-01-12 17:46:16 +00:00
Dan Gohman 9059833de6 Make several tests less fragile.
llvm-svn: 93230
2010-01-12 04:52:47 +00:00
Dan Gohman c119580307 Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS.
llvm-svn: 93229
2010-01-12 04:42:54 +00:00
Evan Cheng 42b07e9600 Add manual ISD::OR fastisel selection routines. TableGen is no longer autogen them after 93152 and 93191.
llvm-svn: 93204
2010-01-11 22:59:27 +00:00
Evan Cheng 99789a7a76 Extend r93152 to work on OR r, r. If the source set bits are known not to overlap, then select as an ADD instead.
llvm-svn: 93191
2010-01-11 22:03:29 +00:00
Chris Lattner f0d26d4b74 reduce this to a sensible testcase.
llvm-svn: 93189
2010-01-11 21:58:19 +00:00
David Greene eb103c404b Shorten up this testcase.
llvm-svn: 93187
2010-01-11 21:50:35 +00:00
Evan Cheng 7bdf339602 Revert 93158. It's breaking quite a few x86_64 tests.
llvm-svn: 93185
2010-01-11 21:13:41 +00:00
Jakob Stoklund Olesen d2a1bee2d4 Avoid adding PHI arguments for a predecessor that has gone away when a BRCOND was constant folded.
This fixes PR5980.

llvm-svn: 93184
2010-01-11 21:02:33 +00:00
Dan Gohman e99a3c191e Use a 32-bit and with implicit zero-extension instead of a 64-bit and if it
has an immediate with at least 32 bits of leading zeros, to avoid needing to
materialize that immediate in a register first.

FileCheckize, tidy, and extend a testcase to cover this case.

This fixes rdar://7527390.

llvm-svn: 93160
2010-01-11 17:58:34 +00:00
Dan Gohman 3a55686345 Re-instate MOV64r0 and MOV16r0, with adjustments to work with the
new AsmPrinter. This is perhaps less elegant than describing them
in terms of MOV32r0 and subreg operations, but it allows the
current register to rematerialize them.

llvm-svn: 93158
2010-01-11 17:37:57 +00:00
Dan Gohman 31e8637ac2 Generalize this check to avoid depending on a specific register assignment.
llvm-svn: 93157
2010-01-11 17:24:27 +00:00
Dan Gohman 355ebc7f58 Make this test less trivial, to avoid spurious failures.
llvm-svn: 93156
2010-01-11 17:23:56 +00:00
Evan Cheng 64d9f40557 Select an OR with immediate as an ADD if the input bits are known zero. This allow the instruction to be 3address-fied if needed.
llvm-svn: 93152
2010-01-11 17:03:47 +00:00
David Greene 206351a1ff Implement a feature (-vector-unaligned-mem) to allow targets to
ignore alignment requirements for SIMD memory operands.  This
is useful on architectures like the AMD 10h that do not trap on
unaligned references if a status bit is twiddled at startup time.

llvm-svn: 93151
2010-01-11 16:29:42 +00:00
Jeffrey Yasskin bb857e5d68 Fix http://llvm.org/PR5729: x86-64 tail calls were putting their targets into
R11, and then asserting that the target was in R9.  Since R9 isn't reserved for
the target anymore, and is used as an argument, this patch changes the
assertion.

llvm-svn: 93065
2010-01-09 18:56:43 +00:00
Dan Gohman 6bd3ef82ff Revert an earlier change to SIGN_EXTEND_INREG for vectors. The VTSDNode
really does need to be a vector type, because
TargetLowering::getOperationAction for SIGN_EXTEND_INREG uses that type,
and it needs to be able to distinguish between vectors and scalars.

Also, fix some more issues with legalization of vector casts.

llvm-svn: 93043
2010-01-09 02:13:55 +00:00
Evan Cheng cc6d56bd3b Fix a critical bug in 64-bit atomic operation lowering for 32-bit. The results of the cmpxchg8b instructions are being thrown away when it branches back to the top of the checking loop. This means the loop always compares against the old value and this can result in a dead lock.
llvm-svn: 93028
2010-01-08 23:41:50 +00:00
Evan Cheng 58ec4fec88 ReplaceAllUsesOfValueWith may delete other nodes that the one being replaced. Do not delete dead nodes again.
llvm-svn: 92988
2010-01-08 02:36:12 +00:00
Chris Lattner dab2cd543f Fix rdar://7517201, a regression introduced by r92849.
When folding a and(any_ext(load)) both the any_ext and the
load have to have only a single use.

This removes the anyext-uses.ll testcase which started failing
because it is unreduced and unclear what it is testing.

llvm-svn: 92950
2010-01-07 21:59:23 +00:00
Evan Cheng 16b75ce19c APInt'fy TargetLowering::SimplifySetCC to fix PR5963.
llvm-svn: 92943
2010-01-07 20:58:44 +00:00
Evan Cheng 90dc43fcf5 Fix a minor regression from my dag combiner changes. One more place which needs to look pass truncates.
llvm-svn: 92885
2010-01-07 00:54:06 +00:00
Jakob Stoklund Olesen f1522d612f Add comments.
llvm-svn: 92883
2010-01-07 00:51:04 +00:00
Jakob Stoklund Olesen 29a64c9575 Add Target hook to duplicate machine instructions.
Some instructions refer to unique labels, and so cannot be trivially cloned
with CloneMachineInstr.

llvm-svn: 92873
2010-01-06 23:47:07 +00:00
Evan Cheng 166a4e6caa Teach dag combine to fold the following transformation more aggressively:
(OP (trunc x), (trunc y)) -> (trunc (OP x, y))

Unfortunately this simple change causes dag combine to infinite looping. The problem is the shrink demanded ops optimization tend to canonicalize expressions in the opposite manner. That is badness. This patch disable those optimizations in dag combine but instead it is done as a late pass in sdisel.

This also exposes some deficiencies in dag combine and x86 setcc / brcond lowering. Teach them to look pass ISD::TRUNCATE in various places.

llvm-svn: 92849
2010-01-06 19:38:29 +00:00
Dan Gohman f34b289057 Move this test from test/Transforms/IndVarSimplify to
test/CodeGen/X86, as doesn't use -indvars, and it does use
llc -march=x86-64.

llvm-svn: 92799
2010-01-05 22:52:54 +00:00
Bill Wendling 03f0af372c Don't assign the shift the same type as the variable being shifted. This could
result in illegal types for the SHL operator.

llvm-svn: 92797
2010-01-05 22:39:10 +00:00
Dan Gohman fb4193625a Delete useless trailing semicolons.
llvm-svn: 92740
2010-01-05 17:55:26 +00:00
Dan Gohman 8c63ee7e28 Make this test more portable.
llvm-svn: 92514
2010-01-04 21:23:34 +00:00
Dan Gohman 52183c3cc9 Add some tests and update an existing test to reflect recent
x86 isel peeps.

llvm-svn: 92509
2010-01-04 20:53:54 +00:00
Anton Korobeynikov d91a14dba5 Fix invalid chain folding for memory variant of sdiv / udiv
llvm-svn: 92472
2010-01-04 10:31:54 +00:00
Chris Lattner 1dae8766b1 fix PR5930, allowing the asmprinter to emit difference between
two labels as a truncate.

llvm-svn: 92455
2010-01-03 18:33:18 +00:00
Chris Lattner f6a585fc2f add PR#
llvm-svn: 92451
2010-01-03 18:10:58 +00:00
Chris Lattner a7cfc43af8 differences between two blockaddress's don't cause a
global variable initializer to require relocations.

llvm-svn: 92450
2010-01-03 18:09:40 +00:00
Chris Lattner 909c71c96a allow this to work on linux hosts.
llvm-svn: 92407
2010-01-02 00:22:15 +00:00
Chris Lattner 1eea3b0ada Teach codegen to handle:
(X != null) | (Y != null) --> (X|Y) != 0
 (X == null) & (Y == null) --> (X|Y) == 0

so that instcombine can stop doing this for pointers.  This is part of PR3351,
which is a case where instcombine doing this for pointers (inserting ptrtoint)
is pessimizing code.

llvm-svn: 92406
2010-01-02 00:00:03 +00:00
Chris Lattner 6eef072eb6 rename file.
llvm-svn: 92405
2010-01-01 23:55:04 +00:00
Chris Lattner 39f18e545e Teach codegen to lower llvm.powi to an efficient (but not optimal)
multiply sequence when the power is a constant integer.  Before, our
codegen for std::pow(.., int) always turned into a libcall, which was
really inefficient.

This should also make many gfortran programs happier I'd imagine.

llvm-svn: 92388
2010-01-01 03:32:16 +00:00
Chris Lattner 5967840a5f Make this more likely to generate a libcall.
llvm-svn: 92387
2010-01-01 03:26:51 +00:00
Sanjiv Gupta 015215ca86 Extern declaration for unordered.f32 libcall was not being emitted. Fixed that.
llvm-svn: 92242
2009-12-29 03:24:34 +00:00
Sanjiv Gupta 1ecffe13b2 Fixed llc crash for zext (i1 -> i8) loads.
llvm-svn: 92201
2009-12-28 04:53:24 +00:00
Chris Lattner f5e3ed64d5 handle equality memcmp of 8 bytes on x86-64 with two unaligned loads and a
compare.  On other targets we end up with a call to memcmp because we don't
want 16 individual byte loads.  We should be able to use movups as well, but
we're failing to select the generated icmp.

llvm-svn: 92107
2009-12-24 01:07:17 +00:00
Chris Lattner 1a32ede6fd move an optimization for memcmp out of simplifylibcalls and into
SDISel.  This optimization was causing simplifylibcalls to 
introduce type-unsafe nastiness.  This is the first step, I'll be 
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.

llvm-svn: 92098
2009-12-24 00:37:38 +00:00
Sanjiv Gupta cd419eebce Reapply 91904.
llvm-svn: 91996
2009-12-23 11:19:09 +00:00
Sanjiv Gupta 6920c17f1f deleting empty file.
llvm-svn: 91994
2009-12-23 10:35:24 +00:00
Sanjiv Gupta f7b4f89588 Reverting back 91904.
llvm-svn: 91993
2009-12-23 09:46:01 +00:00
Dale Johannesen a864a67185 Use more sensible type for flags in asms. PR 5570.
Patch by Sylve`re Teissier (sorry, ASCII only).

llvm-svn: 91988
2009-12-23 07:32:51 +00:00
Eric Christopher fdb33458fc Update objectsize intrinsic and associated dependencies. Fix
lowering code and update testcases.

llvm-svn: 91979
2009-12-23 02:51:48 +00:00
Anton Korobeynikov ef3fdc1cbd Add testcase for PR5703
llvm-svn: 91931
2009-12-22 22:37:23 +00:00
Evan Cheng 71d7eaa87e Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size.
llvm-svn: 91910
2009-12-22 17:47:23 +00:00
Sanjiv Gupta 8c5f05fcee While converting one of the operands to a memory operand, we need to check if it is Legal and does not result into a cyclic dep.
llvm-svn: 91904
2009-12-22 14:25:37 +00:00
Sanjiv Gupta 8ac077df57 Emit direction operand in binary insns that stores in memory.
llvm-svn: 91777
2009-12-19 13:52:01 +00:00
Sanjiv Gupta bda8002e7f Test cases for changes done in 91768.
llvm-svn: 91773
2009-12-19 11:38:14 +00:00
Evan Cheng b175de6356 Increase opportunities to optimize (brcond (srl (and c1), c2)).
llvm-svn: 91717
2009-12-18 21:31:31 +00:00
Evan Cheng 4cf30b72bf On recent Intel u-arch's, folding loads into some unary SSE instructions can
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.

movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0

instead of
cvtss2sd (%rdi), %xmm0

An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0

llvm-svn: 91672
2009-12-18 07:40:29 +00:00
Dan Gohman 51fbfb726f Tidy up this testcase and add test for tailcall optimization
with unreachable.

llvm-svn: 91650
2009-12-18 01:05:06 +00:00
Bob Wilson 3152b0471b Handle ARM inline asm "w" constraints with 64-bit ("d") registers.
The change in SelectionDAGBuilder is needed to allow using bitcasts to convert
between f64 (the default type for ARM "d" registers) and 64-bit Neon vector
types.  Radar 7457110.

llvm-svn: 91649
2009-12-18 01:03:29 +00:00
Dan Gohman 7f4326f8b6 Remove "tail" keywords. These calls are not intended to be tail calls.
This protects this test from depending on codegen not performing the
tail call optimization by default.

llvm-svn: 91648
2009-12-18 01:02:18 +00:00
Jakob Stoklund Olesen 78c5919b14 Add test case for the phi reuse patch.
llvm-svn: 91642
2009-12-18 00:11:44 +00:00
Sean Callanan 04d8cb74f3 Instruction fixes, added instructions, and AsmString changes in the
X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-line violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-line violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have to a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMV_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638
2009-12-18 00:01:26 +00:00
Evan Cheng aadf060b92 Revert this dag combine change:
Fold (zext (and x, cst)) -> (and (zext x), cst)

DAG combiner likes to optimize expression in the other way so this would end up cause an infinite looping.

llvm-svn: 91574
2009-12-17 00:40:05 +00:00
Nick Lewycky 23fbd54cfe Make this test pass on Linux.
llvm-svn: 91521
2009-12-16 07:35:25 +00:00
Evan Cheng 1be6286028 Re-enable 91381 with fixes.
llvm-svn: 91489
2009-12-16 00:53:11 +00:00
Dale Johannesen 56f041406d Do better with physical reg operands (typically, from inline asm)
in local register allocator.  If a reg-reg copy has a phys reg
input and a virt reg output, and this is the last use of the phys
reg, assign the phys reg to the virt reg.  If a reg-reg copy has
a phys reg output and we need to reload its spilled input, reload
it directly into the phys reg than passing it through another reg.

Following 76208, there is sometimes no dependency between the def of
a phys reg and its use; this creates a window where that phys reg
can be used for spilling (this is true in linear scan also).  This
is bad and needs to be fixed a better way, although 76208 works too
well in practice to be reverted.  However, there should normally be
no spilling within inline asm blocks.  The patch here goes a long way
towards making this actually be true.

llvm-svn: 91485
2009-12-16 00:29:41 +00:00
Kenneth Uildriks 792f0913ee For fastcc on x86, let ECX be used as a return register after EAX and EDX
llvm-svn: 91410
2009-12-15 03:27:52 +00:00
Evan Cheng fcb5453dc7 Disable 91381 for now. It's miscompiling ARMISelDAG2DAG.cpp.
llvm-svn: 91405
2009-12-15 03:07:11 +00:00
Evan Cheng 852c486946 Make 91378 more conservative.
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it remove at least one zest.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.

llvm-svn: 91399
2009-12-15 03:00:32 +00:00
Evan Cheng 0e8b9e32d1 Use sbb x, x to materialize carry bit in a GPR. The result is all one's or all zero's.
llvm-svn: 91381
2009-12-15 00:53:42 +00:00