Commit Graph

57054 Commits

Author SHA1 Message Date
Nadav Rotem 15198e94d2 Fix a crash in SimpliftDemandedBits of vectors of pointers.
PR14183.

llvm-svn: 166785
2012-10-26 17:17:05 +00:00
Akira Hatanaka 6fe7acab9d Make sure I is not the end iterator when isInsideBundle is called.
llvm-svn: 166784
2012-10-26 17:11:42 +00:00
Reed Kotler 4e1c629567 (no commit message)
llvm-svn: 166780
2012-10-26 16:18:19 +00:00
Chad Rosier e2f03771c4 [ms-inline asm] Have the target AsmParser create the asmrewrite for the offsetof
operator.

llvm-svn: 166779
2012-10-26 16:09:20 +00:00
Renato Golin 4dab6a1b7c Better handling of OpcodeToISD using enum/switch.
Patch by Pasi Parviainen <pasi.parviainen@iki.fi>

llvm-svn: 166773
2012-10-26 12:24:52 +00:00
Joerg Sonnenberger 7dcded6b11 Don't explicitly require RTTI and EH.
llvm-svn: 166772
2012-10-26 12:15:29 +00:00
Adhemerval Zanella 0f9cff1ab8 PowerPC: Fix for rldcl/rldicl/rldicr MC emission
This patch fixes the rldcl/rldicl/rldicr instruction emission. The issue is
the MDForm_1 instruction defines the PowerISA MB field from 'rldicl'
with the name MBE, but RLDCL/RLDICL/RLDICR definition uses as 'MB'.

It end up by generatint the 'rldicl' enconding at 
'lib/Target/PowerPC/PPCGenMCCodeEmitter.inc' to use the fourth argument as the
third. The patch changes it by adjusting to use the fourth argument as
intended.

Fixes PR14180.

llvm-svn: 166770
2012-10-26 12:09:58 +00:00
Nicolas Geoffray 457b356f3a Remove GC roots that reference dead objects.
llvm-svn: 166763
2012-10-26 09:15:55 +00:00
Nicolas Geoffray 4027f238eb Fix CPP backend for method attributes by creating a block where a new AttrBuilder is defined for each attribute.
llvm-svn: 166762
2012-10-26 09:14:38 +00:00
Reed Kotler 287f0449a2 Implement carry for subtract/add for mips16
llvm-svn: 166755
2012-10-26 04:46:26 +00:00
Nick Lewycky c86037ff01 Hoist out some work done inside a loop doing a linear scan over all
instructions in a block. GetUnderlyingObject is more expensive than it looks as
it can, for instance, call SimplifyInstruction.

This might have some behavioural changes in odd corner cases, but only because
of some strange artefacts of the original implementation. If you were relying
on those, we can fix that by replacing this with a smarter algorithm. Change
passes the existing tests.

llvm-svn: 166754
2012-10-26 04:43:47 +00:00
Hal Finkel 4863448dca Use VTTI->getNumberOfParts in BBVectorize.
This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752
2012-10-26 04:28:06 +00:00
Hal Finkel 9dd045f178 Add VectorTargetTransform::getNumberOfParts.
As discussed on IRC, add VectorTargetTransform::getNumberOfParts
to provide a stable interface to the vector legalization splitting factor.

llvm-svn: 166751
2012-10-26 04:28:02 +00:00
Nick Lewycky 1a32954279 Fix typo in comment.
llvm-svn: 166750
2012-10-26 04:27:49 +00:00
Reed Kotler e47873ab89 implement large (>16 bit) constant loading.
llvm-svn: 166749
2012-10-26 03:09:34 +00:00
Hal Finkel 41a6ded4a0 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741
2012-10-26 00:05:26 +00:00
Nadav Rotem 8255ceb2cf Revert 166726 because it may have broken a number of SPEC tests. PR14183.
llvm-svn: 166739
2012-10-25 23:51:48 +00:00
Hal Finkel 20a49d6f2c BBVectorize, when using VTTI, should not form types that will be split.
This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738
2012-10-25 23:47:16 +00:00
Nadav Rotem bb4cfb5ee1 Fix a crash in ValueTracking. Add support for vectors of pointers.
llvm-svn: 166726
2012-10-25 21:52:52 +00:00
Chad Rosier 240b7b963a [ms-inline asm] Perform field lookups with the dot operator.
llvm-svn: 166724
2012-10-25 21:51:10 +00:00
Reed Kotler 097556d6bd implement mips16 patterns for select nodes
llvm-svn: 166721
2012-10-25 21:33:30 +00:00
Hal Finkel cbf9365f4c Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716
2012-10-25 21:12:23 +00:00
Nadav Rotem 579042f71b LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1.
llvm-svn: 166715
2012-10-25 21:03:48 +00:00
Chad Rosier f0e8720054 [ms-inline asm] Add support for creating AsmRewrites in the target specific
AsmParser logic.  To be used/tested in a subsequent commit.

llvm-svn: 166714
2012-10-25 20:41:34 +00:00
Joerg Sonnenberger 635debe85b Remove exception handling usage from tblgen.
Most places can use PrintFatalError as the unwinding mechanism was not
used for anything other than printing the error. The single exception
was CodeGenDAGPatterns.cpp, where intermediate errors during type
resolution were ignored to simplify incremental platform development.
This use is replaced by an error flag in TreePattern and bailout earlier
in various places if it is set. 

llvm-svn: 166712
2012-10-25 20:33:17 +00:00
Jakob Stoklund Olesen 977f41a1fa Also optimize large switch statements.
The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.

The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.

llvm-svn: 166710
2012-10-25 18:51:15 +00:00
Nadav Rotem 8b749b2364 Minor cleanups.
llvm-svn: 166706
2012-10-25 18:17:48 +00:00
Chad Rosier 911c1f38b0 [ms-inline asm] Add error handling to the ParseIntelDotOperator() function.
llvm-svn: 166698
2012-10-25 17:37:43 +00:00
Joerg Sonnenberger 356f797d66 In preparation for removing exception handling in tablegen, add
PrintFatalError, which combines PrintError with exit(1).

llvm-svn: 166690
2012-10-25 16:35:18 +00:00
Benjamin Kramer 71a3512d60 DependenceAnalysis: Push #includes down into the implementation.
llvm-svn: 166688
2012-10-25 16:15:22 +00:00
Adhemerval Zanella 1be10dc732 This patch fixes the MC object emission of 'nop' for external function calls
and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset.
The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed
by the linker to correct restore the TOC of previous function.

Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc
and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be
DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits a
MC intruction using an uint32_t value while the PPC::BL8_NOP_ELF
and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop').

This patch corrects the remaining ExecutionEngine using MCJIT:

ExecutionEngine/2002-12-16-ArgTest.ll
ExecutionEngine/2003-05-07-ArgumentTest.ll
ExecutionEngine/2005-12-02-TailCallBug.ll
ExecutionEngine/hello.ll
ExecutionEngine/hello2.ll
ExecutionEngine/test-call.ll

llvm-svn: 166682
2012-10-25 14:29:13 +00:00
Bill Schmidt 6ed3b99f43 This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.

llvm-svn: 166680
2012-10-25 13:38:09 +00:00
Adhemerval Zanella 5fc11b3554 PowerPC: Initial support for PowerPC64 MCJIT
This patch adds initial support for MCJIT for PPC64-elf-abi. The TOC
relocation and ODP handling is implemented.

It fixes the following ExecutionEngine testcases:

ExecutionEngine/2003-01-04-ArgumentBug.ll
ExecutionEngine/2003-01-04-LoopTest.ll
ExecutionEngine/2003-01-04-PhiTest.ll
ExecutionEngine/2003-01-09-SARTest.ll
ExecutionEngine/2003-01-10-FUCOM.ll
ExecutionEngine/2003-01-15-AlignmentTest.ll
ExecutionEngine/2003-05-11-PHIRegAllocBug.ll
ExecutionEngine/2003-06-04-bzip2-bug.ll
ExecutionEngine/2003-06-05-PHIBug.ll
ExecutionEngine/2003-08-15-AllocaAssertion.ll
ExecutionEngine/2003-08-21-EnvironmentTest.ll
ExecutionEngine/2003-08-23-RegisterAllocatePhysReg.ll
ExecutionEngine/2003-10-18-PHINode-ConstantExpr-CondCode-Failure.ll
ExecutionEngine/simplesttest.ll
ExecutionEngine/simpletest.ll
ExecutionEngine/stubs.ll
ExecutionEngine/test-arith.ll
ExecutionEngine/test-branch.ll
ExecutionEngine/test-call-no-external-funcs.ll
ExecutionEngine/test-cast.ll
ExecutionEngine/test-common-symbols.ll
ExecutionEngine/test-constantexpr.ll
ExecutionEngine/test-fp-no-external-funcs.ll
ExecutionEngine/test-fp.ll
ExecutionEngine/test-global-init-nonzero.ll
ExecutionEngine/test-global.ll
ExecutionEngine/test-loadstore.ll
ExecutionEngine/test-local.ll
ExecutionEngine/test-logical.ll
ExecutionEngine/test-loop.ll
ExecutionEngine/test-phi.ll
ExecutionEngine/test-ret.ll
ExecutionEngine/test-return.ll
ExecutionEngine/test-setcond-fp.ll
ExecutionEngine/test-setcond-int.ll
ExecutionEngine/test-shift.ll

llvm-svn: 166678
2012-10-25 13:13:48 +00:00
Adhemerval Zanella f2aceda854 Initial TOC support for PowerPC64 object creation
This patch adds initial PPC64 TOC MC object creation using the small mcmodel
(a single 64K TOC) adding the some TOC relocations (R_PPC64_TOC,
R_PPC64_TOC16, and R_PPC64_TOC16DS).

The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter'
is meant to avoid the creation of an unreferenced ".TOC." symbol (used in
the .odp creation) as well to set the R_PPC64_TOC relocation target as the
temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should
not point to any symbol.

llvm-svn: 166677
2012-10-25 12:27:42 +00:00
Michael Liao c6696b04db Atom has SIMD instruction set extension up to SSSE3
llvm-svn: 166665
2012-10-25 07:06:48 +00:00
Michael Liao 6d810bd9b8 Clean up where SlotSize should be used instead of pointer size.
llvm-svn: 166664
2012-10-25 06:29:14 +00:00
Chandler Carruth 58d0556765 Teach SROA how to split whole-alloca integer loads and stores into
smaller integer loads and stores.

The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.

Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.

If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.

Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.

llvm-svn: 166662
2012-10-25 04:37:07 +00:00
Nadav Rotem 5ffb049a55 Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Jakob Stoklund Olesen 9004798da8 Stop running the machine code verifier unconditionally.
llvm-svn: 166646
2012-10-25 00:05:39 +00:00
Nadav Rotem 086ea5c1f5 revert accidental change
llvm-svn: 166643
2012-10-24 23:48:57 +00:00
Nadav Rotem 4a87683a41 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Micah Villmow f07b962801 Fix a compiler warning with an unused variable.
llvm-svn: 166634
2012-10-24 22:32:26 +00:00
Chad Rosier 5dcb4664f2 [ms-inline asm] Add support for parsing the '.' operator. Given,
[register].field

The operator returns the value at the location pointed to by register plus the
offset of field within its structure or union.  This patch only handles
immediate fields (i.e., [eax].4).  The original displacement has to be a
MCConstantExpr as well.
Part of rdar://12470415 and rdar://12470514

llvm-svn: 166632
2012-10-24 22:21:50 +00:00
Chad Rosier 6844ea09fa Tidy up. No functional change intended.
llvm-svn: 166630
2012-10-24 22:13:37 +00:00
Hal Finkel 69b07a2c3a Update GVN to support vectors of pointers.
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.

llvm-svn: 166624
2012-10-24 21:22:30 +00:00
Nadav Rotem e4f491e7ee whitespace
llvm-svn: 166622
2012-10-24 20:58:40 +00:00
Nadav Rotem a721b21c64 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
llvm-svn: 166620
2012-10-24 20:36:32 +00:00
Evan Cheng 59ed7d45a6 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385

llvm-svn: 166613
2012-10-24 19:53:01 +00:00
Hal Finkel 30bd9346a0 getSmallConstantTripMultiple should never return zero.
When the trip count is -1, getSmallConstantTripMultiple could return zero,
and this would cause runtime loop unrolling to assert. Instead of returning
zero, one is now returned (consistent with the existing overflow cases).
Fixes PR14167.

llvm-svn: 166612
2012-10-24 19:46:44 +00:00
Micah Villmow bf3eeb2dfc Add some cleanup to the DataLayout changes requested by Chandler.
llvm-svn: 166607
2012-10-24 18:36:13 +00:00
Micah Villmow 51e7246cb4 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Nadav Rotem 2289f2c932 Implement a basic VectorTargetTransformInfo interface to be used by the loop and bb vectorizers for modeling the cost of instructions.
llvm-svn: 166593
2012-10-24 17:22:41 +00:00
Chad Rosier 91c8266200 [ms-inline asm] Create a register operand, rather than a memory operand when we
see the offsetof operator.  Previously, we were matching something like MOVrm
in the front-end and later matching MOVrr in the back-end.  This change makes
things more consistent.  It also fixes cases where we can't match against a 
memory operand as the source (test cases coming).
Part of rdar://12470317

llvm-svn: 166592
2012-10-24 17:22:29 +00:00
Micah Villmow 6a8f3f9e20 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Elena Demikhovsky d6afb03bc9 Special calling conventions for Intel OpenCL built-in library.
llvm-svn: 166566
2012-10-24 14:46:16 +00:00
Michael Liao 5922979e49 Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Michael Liao c5af149e70 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.

llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Akira Hatanaka 868b3a333b [mips] Make sure sret argument is returned in register V0.
llvm-svn: 166539
2012-10-24 02:10:54 +00:00
Rafael Espindola 4e6e537314 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

llvm-svn: 166536
2012-10-24 01:58:48 +00:00
Jakub Staszak a6addc2741 Keep coding standard. Don't evaluate getNumOperands() every time.
llvm-svn: 166531
2012-10-24 00:38:25 +00:00
Richard Smith 1f6f455f7c Fix ODR violations: a virtual function must be defined, even if it's never
called. Provide an (asserting) definition of Operator's private destructor.
Remove destructors from all classes derived from Operator. We don't need them
for safety, because their implicit definitions would be ill-formed (they'd call
Operator's private destructor), and we don't need them to avoid emitting
vtables, because we don't do anything with Operator subclasses which would
trigger vtable instantiation.

The Operator hierarchy is still a complete disaster with regard to undefined
behavior, but this at least allows LLVM to link when using Clang's
-fcatch-undefined-behavior with a new vptr-based type checking mechanism.

llvm-svn: 166530
2012-10-24 00:30:41 +00:00
Chad Rosier a623524487 [ms-inline asm] Offset operator - the size should be based on the size of a
pointer, not the size of the variable.
Part of rdar://12470317

llvm-svn: 166526
2012-10-23 23:42:06 +00:00
Chad Rosier eac2b2003e [ms-inline asm] Clean up comment.
llvm-svn: 166525
2012-10-23 23:34:28 +00:00
Chad Rosier 146310a1c1 [ms-inline asm] When parsing inline assembly we set the base register to a
non-zero value as we don't know the actual value at this point.  This is
necessary to get the matching correct in some cases.  However, the actual value
set as the base register doesn't matter, since we're just matching not emitting.

llvm-svn: 166523
2012-10-23 23:31:33 +00:00
Michael Liao 6d106b7bfd Clean up code and put transformation on (build_vec (ext x)) into a helper func
llvm-svn: 166519
2012-10-23 23:06:52 +00:00
Kevin Enderby dccdac6a06 Make branch heavy code for generating marked up disassembly simpler
and easier to read by adding a couple helper functions.  Suggestion by
Chandler Carruth and seconded by Meador Inge!

llvm-svn: 166515
2012-10-23 22:52:52 +00:00
Michael Liao 2843625bb5 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.

llvm-svn: 166504
2012-10-23 21:40:15 +00:00
Nadav Rotem 33e034a4b3 Make the indirect branch optimization deterministic. No functionality change.
Patch by Daniel Reynaud.

llvm-svn: 166501
2012-10-23 21:05:33 +00:00
Matt Beaumont-Gay bdcebd323a Silence -Wsign-compare
llvm-svn: 166494
2012-10-23 19:46:36 +00:00
Nadav Rotem 5bed7b4fad Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.

llvm-svn: 166491
2012-10-23 18:44:18 +00:00
Bill Wendling 5858b56ce3 Ignore unreachable blocks when doing memory dependence analysis on non-local
loads. It's not really profitable and may result in GVN going into an infinite
loop when it hits constructs like this:

     %x = gep %some.type %x, ...

Found via an LTO build of LLVM.

llvm-svn: 166490
2012-10-23 18:37:11 +00:00
Chad Rosier 37e755cee2 [ms-inline asm] Add an implementation of the offset operator. This is a follow
on patch to r166433.
rdar://12470317

llvm-svn: 166488
2012-10-23 17:43:43 +00:00
Michael Liao c03c03d56e Add custom UINT_TO_FP from v4i8/v4i16/v8i8/v8i16 to v4f32/v8f32
- Replace v4i8/v8i8 -> v8f32 DAG combine with custom lowering to reduce
  DAG combine overhead.
- Extend the support to v4i16/v8i16 as well.

llvm-svn: 166487
2012-10-23 17:36:08 +00:00
Michael Liao 1be96bb5ce Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Eric Christopher c33f622c6f Grammar.
llvm-svn: 166485
2012-10-23 17:19:15 +00:00
Bill Schmidt 57d6de5fd9 This is another TLC patch for separating code for the Darwin and ELF ABIs
for the PowerPC target, and factoring the results.  This will ease future
maintenance of both subtargets.

PPCTargetLowering::LowerCall_Darwin_Or_64SVR4() has grown a lot of special-case
code for the different ABIs, making maintenance difficult.  This is getting
worse as we repair errors in the 64-bit ELF ABI implementation, while avoiding
changes to the Darwin ABI logic.  This patch splits the routine into
LowerCall_Darwin() and LowerCall_64SVR4(), allowing both versions to be
significantly simplified.  I've factored out chunks of similar code where it
made sense to do so.  I also performed similar factoring on
LowerFormalArguments_Darwin() and LowerFormalArguments_64SVR4().

There are no functional changes in this patch, and therefore no new test
cases have been developed.

Built and tested on powerpc64-unknown-linux-gnu with no new regressions.

llvm-svn: 166480
2012-10-23 15:51:16 +00:00
Duncan Sands 5ed3900d77 Fix typo that somehow escaped both testing and code inspection.
llvm-svn: 166475
2012-10-23 09:07:02 +00:00
Duncan Sands 533c8ae79f Transform code like this
%V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
 %t1 = getelementptr i32* %arr, i32 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.

llvm-svn: 166474
2012-10-23 08:28:26 +00:00
Richard Smith 6289a4e85e Per the C++ standard, we need to include the definition of llvm::Calculate in
every TU where it's implicitly instantiated, even if there's an implicit
instantiation for the same types available in another TU.

llvm-svn: 166470
2012-10-23 06:19:46 +00:00
Nadav Rotem 58df27cb2e Add a comment which explains why the assert fired and how to fix it.
llvm-svn: 166467
2012-10-23 04:35:40 +00:00
Reed Kotler 164bb37c7b implement setXX patterns
llvm-svn: 166459
2012-10-23 01:35:48 +00:00
Julien Lerouge a302b6d95e Fix typo.
llvm-svn: 166456
2012-10-23 00:38:15 +00:00
Julien Lerouge d7fa5e420d Explain why DenseMap is still used here instead of MapVector.
llvm-svn: 166454
2012-10-23 00:23:46 +00:00
Eli Friedman 0f4871d487 [ms-inline-asm] Implement _emit directive (which is roughly equivalent to .byte).
<rdar://problem/12470345>.

llvm-svn: 166451
2012-10-22 23:58:19 +00:00
Bill Wendling 12cda50f1f When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>

llvm-svn: 166448
2012-10-22 23:30:04 +00:00
Kevin Enderby 62183c4e18 Add support for annotated disassembly output for X86 and arm.
Per the October 12, 2012 Proposal for annotated disassembly output sent out by
Jim Grosbach this set of changes implements this for X86 and arm.  The llvm-mc
tool now has a -mdis option to produced the marked up disassembly and a couple
of small example test cases have been added.

rdar://11764962

llvm-svn: 166445
2012-10-22 22:31:46 +00:00
Eli Friedman 15e9b33678 [ms-inline asm] Don't rewrite out parts of an inline-asm skipped by .if 0 and friends.
It's unnecessary and makes the generated assembly less faithful to the original source.

llvm-svn: 166440
2012-10-22 20:50:25 +00:00
Chad Rosier 5bca3f9b8e [ms-inline asm] Add the isOffsetOf() function.
Part of rdar://12470317

llvm-svn: 166436
2012-10-22 19:50:35 +00:00
Julien Lerouge 8cf84fa4e2 Iterating over a DenseMap<std::pair<BasicBlock*, unsigned>, PHINode*> is not
deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>,
PHINode*> (we already have a map from BasicBlock to unsigned).

<rdar://problem/12541389>

llvm-svn: 166435
2012-10-22 19:43:56 +00:00
Chad Rosier c14ed95da4 [ms-inline asm] Add support for parsing the offset operator. Callback for
CodeGen in the front-end not implemented yet.
rdar://12470317

llvm-svn: 166433
2012-10-22 19:42:52 +00:00
Nadav Rotem 1c7fc71e69 Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
2012-10-22 18:27:56 +00:00
Argyrios Kyrtzidis 54ff5e81a1 Revert r166407 because it caused analyzer tests to crash and broke self-host bots.
llvm-svn: 166424
2012-10-22 18:16:14 +00:00
Hal Finkel 931c52b84c BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

llvm-svn: 166423
2012-10-22 18:00:55 +00:00
Nadav Rotem dbf4783634 Add the "ForceSizeOpt" attribute.
Patch by Quentin Colombet <qcolombet@apple.com>

Original description:
"""
The attached patch is the first step to have a better control on Oz related optimizations.
The Oz optimization level focuses on code size, thus I propose to add an attribute called ForceSizeOpt.
"""

llvm-svn: 166422
2012-10-22 17:33:31 +00:00
Nadav Rotem f17cd27362 Rename a variable.
llvm-svn: 166410
2012-10-22 04:53:05 +00:00
Nadav Rotem 03011f1393 Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
llvm-svn: 166409
2012-10-22 04:38:00 +00:00
Nadav Rotem c9741887c3 Update the loop vectorizer docs.
llvm-svn: 166408
2012-10-22 03:52:53 +00:00
Nick Lewycky 8b67e1e0b9 Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a
very small but very important bugfix:
  bool shouldExplore(Use *U) {
    Value *V = U->get();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
    [...]
should have read:
  bool shouldExplore(Use *U) {
    Value *V = U->getUser();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
Fixes PR14143!

llvm-svn: 166407
2012-10-22 03:03:52 +00:00
NAKAMURA Takumi 60d56d2eea Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether"
It broke selfhosting stage2 in several builders.

llvm-svn: 166406
2012-10-22 00:48:51 +00:00