Commit Graph

66793 Commits

Author SHA1 Message Date
Rafael Espindola 24ea09ef7d Construct the MCStreamer before constructing the MCTargetStreamer.
This has a few advantages:
* Only targets that use a MCTargetStreamer have to worry about it.
* There is never a MCTargetStreamer without a MCStreamer, so we can use a
  reference.
* A MCTargetStreamer can talk to the MCStreamer in its constructor.

llvm-svn: 200129
2014-01-26 06:06:37 +00:00
Venkatraman Govindaraju a302d29f1d [Sparc] Add support for parsing DW_CFA_GNU_window_save.
llvm-svn: 200127
2014-01-26 05:13:44 +00:00
Rafael Espindola eb0a8af670 Convert some easy uses of EmitRawText to TargetStreamer methods.
llvm-svn: 200122
2014-01-26 05:06:48 +00:00
Craig Topper aefaab640c Improve some x86 type constraints.
llvm-svn: 200120
2014-01-26 04:59:39 +00:00
Jiangning Liu fb3c17b6c9 Improve pattern match from v1i8 to v1i32 for AArch64 Neon.
llvm-svn: 200119
2014-01-26 04:55:53 +00:00
Rui Ueyama 10ed9ddc8f llvm-readobj: add support for PE32+ (Windows 64 bit executable).
PE32+ supports 64 bit address space, but the file format remains 32 bit.
So its file format is pretty similar to PE32 (32 bit executable). The
differences compared to PE32 are (1) the lack of "BaseOfData" field and
(2) some of its data members are 64 bit.

In this patch, I added a new member function to the COFFObjectFile class that
returns a PE32+ Header object, and made llvm-readobj use it.

llvm-svn: 200117
2014-01-26 04:15:52 +00:00
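The two layout differences described in r200117 above can be illustrated with a small parser. This is a rough Python sketch, not the llvm-readobj code: `read_image_base` is a hypothetical helper, and the magic values (0x10b for PE32, 0x20b for PE32+) and field offsets are assumptions taken from the PE/COFF optional header layout rather than from the commit itself.

```python
import struct

PE32_MAGIC = 0x10B       # optional header magic for PE32
PE32_PLUS_MAGIC = 0x20B  # optional header magic for PE32+


def read_image_base(optional_header: bytes):
    """Return (format name, BaseOfData or None, ImageBase).

    Hypothetical helper: it reads only the fields needed to show where
    the two formats diverge.
    """
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    if magic == PE32_MAGIC:
        # PE32: 32-bit BaseOfData at offset 24, 32-bit ImageBase at offset 28.
        base_of_data, image_base = struct.unpack_from("<II", optional_header, 24)
        return "PE32", base_of_data, image_base
    if magic == PE32_PLUS_MAGIC:
        # PE32+: no BaseOfData; a 64-bit ImageBase starts at offset 24.
        (image_base,) = struct.unpack_from("<Q", optional_header, 24)
        return "PE32+", None, image_base
    raise ValueError(f"unknown optional header magic: {magic:#x}")


# Synthetic headers: magic, linker versions, five 32-bit fields, then the
# format-specific tail.
pe32 = struct.pack("<HBBIIIIIII", PE32_MAGIC, 0, 0, 0, 0, 0, 0, 0x1000,
                   0x2000, 0x400000)
pe32_plus = struct.pack("<HBBIIIIIQ", PE32_PLUS_MAGIC, 0, 0, 0, 0, 0, 0,
                        0x1000, 0x140000000)
```

Calling `read_image_base(pe32)` and `read_image_base(pe32_plus)` shows the same leading fields feeding two different tails, which is why a separate header accessor is useful.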
Jiangning Liu 6398d839c6 Implement pattern match from v1xx to v1xx for AArch64 Neon.
llvm-svn: 200113
2014-01-26 03:27:40 +00:00
Venkatraman Govindaraju cdee0edf2a [Sparc] Add support for sparc relocation types in ELF object file.
llvm-svn: 200112
2014-01-26 03:21:28 +00:00
Kevin Qin 18662f4b7c [AArch64 NEON] Add patterns for concat_vector on v2i32.
llvm-svn: 200111
2014-01-26 02:46:15 +00:00
Kevin Qin fb9871ff50 [AArch64 NEON] Fix pattern match failed on FP_ROUND from v1f128 to v1f64.
llvm-svn: 200109
2014-01-26 02:19:35 +00:00
Craig Topper 399e39e0de Set displacementSize to 1 for instructions with mod==0x1. Fixes PR17310. Modified from patch by James Courtier-Dutton.
llvm-svn: 200100
2014-01-25 22:48:43 +00:00
Evan Cheng b8d499fe2f Clean up hack which is no longer needed after r198617. No functionality change.
llvm-svn: 200095
2014-01-25 19:51:19 +00:00
Hal Finkel dbebb52a2f Disable the use of TBAA when using AA in CodeGen
There are currently two issues, of which I currently know, that prevent TBAA
from being correctly usable in CodeGen:

  1. Stack coloring does not update TBAA when merging allocas. This is easy
     enough to fix, but is not the largest problem.

  2. CGP inserts ptrtoint/inttoptr pairs when sinking address computations.
     Because BasicAA does not handle inttoptr, we'll often miss basic type punning
     idioms that we need to catch so we don't miscompile real-world code (like LLVM).

I don't yet have a small test case for this, but this fixes self hosting a
non-asserts build of LLVM on PPC64 when using -enable-aa-sched-mi and -misched=shuffle.

llvm-svn: 200093
2014-01-25 19:24:54 +00:00
Hal Finkel 9b2617a5a8 Add combiner-aa-only-func (debug only)
This option (which is !NDEBUG only) allows restricting the use of alias
analysis in DAGCombiner to a specific function. This has proved extremely
valuable to isolating bugs related to this feature, and mirrors the
misched-only-func option provided by the new instruction scheduler.

llvm-svn: 200088
2014-01-25 17:32:39 +00:00
Hal Finkel 5fb07341f1 Improve descriptions of combiner-alias-analysis and combiner-global-alias-analysis
llvm-svn: 200087
2014-01-25 17:32:37 +00:00
Artyom Skrobov eab7515385 Reverting r199886 (Prevent repetitive warnings for unrecognized processors and features)
llvm-svn: 200083
2014-01-25 16:56:18 +00:00
Rafael Espindola 14d02fe5c8 This reverts commit r200064 and r200051.
r200064 depends on r200051.

r200051 is broken: It tries to replace .mips_hack_elf_flags, which is a good
thing, but what it replaces it with is even worse.

The new emitMipsELFFlags it adds corresponds to no assembly directive, is not
marked as a hack and is not even printed to the .s file.

The patch also introduces more uses of hasRawTextSupport.

The correct way to remove .mips_hack_elf_flags is to have the mips target
streamer handle the default flags (and command line options). That way the
same code path is used for asm and obj. The streamer interface should *really*
correspond to what is printed in the .s file.

llvm-svn: 200078
2014-01-25 15:06:56 +00:00
Chandler Carruth 3aebcb99f7 [LPM] Conclude my immediate work by making the LoopVectorizer
a FunctionPass. With this change the loop vectorizer no longer is a loop
pass and can readily depend on function analyses. In particular, with
this change we no longer have to form a loop pass manager to run the
loop vectorizer which simplifies the entire pass management of LLVM.

The next step here is to teach the loop vectorizer to leverage profile
information through the profile information providing analysis passes.

llvm-svn: 200074
2014-01-25 10:01:55 +00:00
Chandler Carruth 8765cf702f [LPM] Make LCSSA a utility with a FunctionPass that applies it to all
the loops in a function, and teach LICM to work in the presence of
LCSSA.

Previously, LCSSA was a loop pass. That made passes requiring it also be
loop passes and unable to depend on function analysis passes easily. It
also caused outer loops to have a different "canonical" form from inner
loops during analysis. Instead, we go into LCSSA form and preserve it
through the loop pass manager run.

Note that this has the same problem as LoopSimplify that prevents
enabling its verification -- loop passes which run at the end of the loop
pass manager and don't preserve these are valid, but the subsequent loop
pass runs of outer loops that do preserve this pass trigger too much
verification and fail because the inner loop no longer verifies.

The other problem this exposed is that LICM was completely unable to
handle LCSSA form. It didn't preserve it and it actually would give up
on moving instructions in many cases when they were used by an LCSSA phi
node. I've taught LICM to support detecting LCSSA-form PHI nodes and to
hoist and sink around them. This may actually let LICM fire
significantly more because we put everything into LCSSA form to rotate
the loop before running LICM. =/ Now LICM should handle that fine and
preserve it correctly. The down side is that LICM has to require LCSSA
in order to preserve it. This is just a fact of life for LCSSA. It's
entirely possible we should completely remove LCSSA from the optimizer.

The test updates are essentially accommodating LCSSA phi nodes in the
output of LICM, and the fact that we now completely sink every
instruction in ashr-crash below the loop bodies prior to unrolling.

With this change, LCSSA is computed only three times in the pass
pipeline. One of them could be removed (and potentially a SCEV run and
a separate LoopPassManager entirely!) if we had a LoopPass variant of
InstCombine that ran InstCombine on the loop body but refused to combine
away LCSSA PHI nodes. Currently, this also prevents loop unrolling from
being in the same loop pass manager as rotate, LICM, and unswitch.

There is one thing that I *really* don't like -- preserving LCSSA in
LICM is quite expensive. We end up having to re-run LCSSA twice for some
loops after LICM runs because LICM can undo LCSSA both in the current
loop and the parent loop. I don't really see good solutions to this
other than to completely move away from LCSSA and using tools like
SSAUpdater instead.

llvm-svn: 200067
2014-01-25 04:07:24 +00:00
Rafael Espindola 6b9ee9bce3 Remove an easy use of EmitRawText from PPC.
This makes lib/Target/PowerPC EmitRawText free.

llvm-svn: 200065
2014-01-25 02:35:56 +00:00
Juergen Ributzka f26beda7c7 Revert "Revert "Add Constant Hoisting Pass" (r200034)"
This reverts commit r200058 and adds the using directive for
ARMTargetTransformInfo to silence two g++ overload warnings.

llvm-svn: 200062
2014-01-25 02:02:55 +00:00
Reid Kleckner 2084403afb Fix llvm-dis to print the inalloca bit on allocas.
llvm-svn: 200059
2014-01-25 01:24:06 +00:00
Hans Wennborg 4d67a2e85a Revert "Add Constant Hoisting Pass" (r200034)
This commit caused -Woverloaded-virtual warnings. The two new
TargetTransformInfo::getIntImmCost functions were only added to the superclass,
and to the X86 subclass. The other targets were not updated, and the
warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was
hiding the two new getIntImmCost variants.

We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost"
to the various subclasses, or turning it off, but I suspect that it's wrong to
leave the functions unimplemented in those targets. The default implementations
return TCC_Free, which I don't think is right e.g. for ARM.

llvm-svn: 200058
2014-01-25 01:18:18 +00:00
Jack Carter ca2ae49d55 [Mips] TargetStreamer ELF flag Support for default and commandline options.
This patch uses a common MipsTargetStreamer interface for both
MipsAsmPrinter and MipsAsmParser for recording default and commandline
driven directives that affect ELF header flags.

It has been noted that the .ll tests affected by this patch belong in
test/Codegen/Mips. I will move them in a separate patch.

Also, a number of directives do not get expressed by AsmPrinter in the 
resultant .s assembly such as setting the correct ASI. I have noted this
in the tests and they will be addressed in later patches.

llvm-svn: 200051
2014-01-25 00:24:07 +00:00
Ana Pazos cd3b9f763e [AArch64] Removed unused i8 type from FPR8 register class.
The i8 type is not registered with any register class.
This causes a segmentation fault in MachineLICM::getRegisterClassIDAndCost.

The code selects the first type associated with register class FPR8,
which happens to be i8.
It uses this type (i8) to get the representative class pointer, which is 0.
It then uses this pointer to access a field, resulting in segmentation fault.

Since i8 type is not being used for printing any neon instruction
we can safely remove it.

llvm-svn: 200046
2014-01-24 22:36:53 +00:00
Rafael Espindola afcc3df7f4 Make ObjectFile ownership of the MemoryBuffer optional.
This allows llvm-ar to mmap the input files only once.

llvm-svn: 200040
2014-01-24 21:32:21 +00:00
Juergen Ributzka 4f3df4ad64 Add Constant Hoisting Pass
Retry commit r200022 with a fix for the build bot errors. Constant expressions
have (unlike instructions) module scope use lists and therefore may have users
in different functions. The fix is to simply ignore these out-of-function uses.

llvm-svn: 200034
2014-01-24 20:18:00 +00:00
Hal Finkel 51a9838049 Fix DAGCombiner::GatherAllAliases to account for non-chain dependencies
DAGCombiner::GatherAllAliases, which is only used when AA use is enabled
during DAGCombine, had a fundamentally incorrect assumption for which this
change compensates. GatherAllAliases, which is used to find aliasing
predecessor chain nodes (so that a better chain can be selected for a load or
store to enable subsequent optimizations) assumed that walking up the chain
would always catch all possibly-aliasing loads and stores. This is not true: To
really find all aliases, we also need to search for aliases through the value
operand of a store, etc.  Consider the following situation:

  Token1 = ...
  L1 = load Token1, %52
  S1 = store Token1, L1, %51
  L2 = load Token1, %52+8
  S2 = store Token1, L2, %51+8
  Token2 = Token(S1, S2)
  L3 = load Token2, %53
  S3 = store Token2, L3, %52
  L4 = load Token2, %53+8
  S4 = store Token2, L4, %52+8

If we search for aliases of S3 (which stores to address %52), and we look only
through the chain, then we'll miss the trivial dependence on L1 (which loads
from %52). We then might change all loads and stores to use Token1 as their
chain operand, which could result in copying %53 into %52 before copying
%52 into %51 (which should happen first).

The problem is, however, that searching for such data dependencies can become
expensive, and the cost is not directly related to the chain depth. Instead,
we'll rule out such configurations by insisting that we've visited all chain
users (except for users of the original chain, which is not necessary).  When
doing this, we need to look through nodes we don't care about (otherwise,
things like register copies will interfere with trivial use cases).

Unfortunately, I don't have a small test case for this problem. Creating the
underlying situation is not hard (a pair of memcpys will do it), but arranging
for the default instruction schedule to be incorrect is very fragile.

This unbreaks self hosting on PPC64 when using
-mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis.

llvm-svn: 200033
2014-01-24 20:12:02 +00:00
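The hazard described in r200033 above can be reproduced with a toy memory model. This is a minimal Python sketch under simplifying assumptions: only the first half of each copy (L1/S1 and L3/S3) is modelled, `run` is a hypothetical interpreter, and the "reordered" schedule stands for what a scheduler might pick if every node were rechained to Token1 and the S3/L1 dependence were missed.

```python
def run(ops, initial):
    """Execute (kind, value_name, address) operations against a memory dict."""
    memory = dict(initial)
    values = {}
    for kind, name, address in ops:
        if kind == "load":
            values[name] = memory[address]   # name = load address
        else:
            memory[address] = values[name]   # store name -> address
    return memory


INITIAL = {"%51": "old51", "%52": "old52", "%53": "old53"}

# Order implied by the original chains: copy %52 into %51, then %53 into %52.
CHAIN_ORDER = [
    ("load", "L1", "%52"),
    ("store", "L1", "%51"),
    ("load", "L3", "%53"),
    ("store", "L3", "%52"),
]

# If S3 is not recognised as aliasing L1, nothing stops this schedule:
# %52 is overwritten before L1 has read it.
REORDERED = [
    ("load", "L3", "%53"),
    ("store", "L3", "%52"),
    ("load", "L1", "%52"),
    ("store", "L1", "%51"),
]

correct = run(CHAIN_ORDER, INITIAL)
broken = run(REORDERED, INITIAL)
```

In the chain order `%51` ends up holding the old contents of `%52`; in the reordered schedule it holds the old contents of `%53`, which is the miscompile the commit guards against.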
Benjamin Kramer 09b0f88a7f InstCombine: Don't try to use aggregate elements of ConstantExprs.
PR18600.

llvm-svn: 200028
2014-01-24 19:02:37 +00:00
Juergen Ributzka 50e7e80d00 Revert "Add Constant Hoisting Pass"
This reverts commit r200022 to unbreak the build bots.

llvm-svn: 200024
2014-01-24 18:40:30 +00:00
Hal Finkel ccc18e1330 Restrict FindBetterChain DAG combines to unindexed nodes
These transformations obviously won't work for indexed (pre/post-inc) loads and
stores. In practice, I'm not sure there is any benefit to enabling them for
indexed nodes because other transformations that these might enable likely also
won't handle indexed nodes.

I don't have an in-tree test case that hits this problem, but an upcoming bug
fix will make it much more likely.

llvm-svn: 200023
2014-01-24 18:25:26 +00:00
Juergen Ributzka 38b67d0caf Add Constant Hoisting Pass
This pass identifies expensive constants to hoist and coalesces them to
better prepare them for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.

First it scans all instructions for integer constants and calculates their
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.

If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.

When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.

This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast expressions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)

Reviewed by Eric

llvm-svn: 200022
2014-01-24 18:23:08 +00:00
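The cost filter described in r200022 above can be sketched as follows. This is an illustrative Python model, not the pass itself: the numeric values of the cost constants, the `hoisting_candidates` helper, and the `toy_cost` model are all assumptions made for the example. Only the rule "a constant is a candidate when its cost exceeds TCC_BASIC" comes from the commit message.

```python
# Illustrative cost levels; the actual enum values are not given in the message.
TCC_FREE = 0
TCC_BASIC = 1
TCC_EXPENSIVE = 4


def hoisting_candidates(constants, get_int_imm_cost):
    """Keep only constants whose cost is above TCC_BASIC."""
    return [c for c in constants if get_int_imm_cost(c) > TCC_BASIC]


def default_cost(_value):
    """Mirrors the described default: every immediate is free."""
    return TCC_FREE


def toy_cost(value):
    """Hypothetical target: small immediates fold, 32-bit ones take one
    instruction, anything wider is expensive to materialize."""
    if -128 <= value < 128:
        return TCC_FREE
    if -2**31 <= value < 2**31:
        return TCC_BASIC
    return TCC_EXPENSIVE


constants = [1, 100000, 2**40]
with_default = hoisting_candidates(constants, default_cost)
with_toy = hoisting_candidates(constants, toy_cost)
```

With the default cost function nothing is hoisted, which matches the statement that the pass leaves constants alone unless a target overrides getIntImmCost.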
Juergen Ributzka 3e752e7af9 Add final and override keywords to TargetTransformInfo's subclasses.
llvm-svn: 200021
2014-01-24 18:22:59 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Benjamin Kramer 5e1794eedb InstSimplify: Make shift, select and GEP simplifications vector-aware.
llvm-svn: 200016
2014-01-24 17:09:53 +00:00
Rafael Espindola e75837564c Unify duplicated functions.
llvm-svn: 200014
2014-01-24 16:13:20 +00:00
Rafael Espindola 65fd0a8c6b Move emitInlineAsmEnd to the AsmPrinter interface.
There is no inline asm in a .s file. Therefore, there should be no logic to
handle it in the streamer. Inline asm only exists in bitcode files, so the
logic can live in the (long misnamed) AsmPrinter class.

llvm-svn: 200011
2014-01-24 15:47:54 +00:00
NAKAMURA Takumi 7409e84381 DWARFContext: Fix possible memory leak since r198908.
llvm-svn: 200000
2014-01-24 13:40:43 +00:00
Eric Christopher cf48ade87e Revert "Use DW_AT_high_pc and DW_AT_low_pc for the high and low pc for a"
in order to fix the cygwin/mingw bots.

This reverts commit r199990.

llvm-svn: 199991
2014-01-24 11:52:53 +00:00
Eric Christopher c528858cbd Use DW_AT_high_pc and DW_AT_low_pc for the high and low pc for a
compile unit. Make these relocations on the platforms that need
relocations and add a routine to ensure that we don't put the
addresses in an offset table for split dwarf.

llvm-svn: 199990
2014-01-24 11:40:29 +00:00
Kevin Qin 21cd2152d3 [AArch64 NEON] Fix a bug in implementing register copy between FPR16.
llvm-svn: 199978
2014-01-24 07:53:04 +00:00
Venkatraman Govindaraju dc3bcc19cf [SparcV9] Add support for JIT in Sparc64.
With this change, all supported tests in test/ExecutionEngine pass in sparcv9.

llvm-svn: 199977
2014-01-24 07:10:19 +00:00
Juergen Ributzka e758ddcd16 [X86] Prevent the creation of redundant ops for sadd and ssub with overflow.
This commit teaches the X86 backend to create the same X86 instructions when it
lowers an sadd/ssub with overflow intrinsic and a conditional branch that uses
that overflow result. This allows SelectionDAG to recognize and remove one of
the redundant operations.

This fixes <rdar://problem/15874016> and <rdar://problem/15661073>.

Reviewed by Nadav

llvm-svn: 199976
2014-01-24 06:47:57 +00:00
Jakob Stoklund Olesen 05ae2d6715 Implement atomicrmw operations in 32 and 64 bits for SPARCv9.
These all use the compare-and-swap CASA/CASXA instructions.

llvm-svn: 199975
2014-01-24 06:23:31 +00:00
Venkatraman Govindaraju 98aa7fab7e [Sparc] Correct quad register list in the asm parser.
Add test cases to check parsing of v9 double registers and their aliased quad registers.

llvm-svn: 199974
2014-01-24 05:24:01 +00:00
Rafael Espindola 0e2ccb2df1 Simplify the logic for deciding when to initialize the sections.
llvm-svn: 199971
2014-01-24 03:54:40 +00:00
Rafael Espindola 61adb27de4 Most streamers' InitSections just create a text section. Make that the default
llvm-svn: 199969
2014-01-24 02:42:26 +00:00
Rafael Espindola 100859c608 Use the actual .text section, it is less code than building a dummy one.
llvm-svn: 199968
2014-01-24 02:31:35 +00:00
Rafael Espindola f2812535e4 Inline trivial functions called only once or twice.
llvm-svn: 199967
2014-01-24 02:28:11 +00:00
Chandler Carruth cc497b6ab5 [LPM] Fix a logic error in LICM spotted by inspection.
We completely skipped promotion in LICM if the loop has a preheader or
dedicated exits, but not *both*. We hoist if there is a preheader, and
sink if there are dedicated exits, but either hoisting or sinking can
move loop invariant code out of the loop!

I have no idea if this has a practical consequence. If anyone has ideas
for a test case, let me know.

llvm-svn: 199966
2014-01-24 02:24:47 +00:00
Rafael Espindola 247f951e3a Inline functions that are only called once.
llvm-svn: 199965
2014-01-24 02:18:40 +00:00
Chandler Carruth abfa3e5652 [cleanup] Use the type-based preservation method rather than a string
literal that bakes a pass name and forces parsing it in the pass
manager.

llvm-svn: 199963
2014-01-24 01:59:49 +00:00
Rafael Espindola f144034c98 InitToTextSection is redundant with InitSections. Remove it.
llvm-svn: 199955
2014-01-23 23:14:14 +00:00
Eric Christopher 1bca60d652 Make the use of DW_AT_ranges in the compile unit depend also upon
the existence of comdat/special sections.

llvm-svn: 199954
2014-01-23 22:55:47 +00:00
Rafael Espindola e308c0cd0d Remove duplicated info on what .text, .data and .bss look like.
llvm-svn: 199951
2014-01-23 22:49:25 +00:00
Kevin Enderby bc570f289a Update the X86 assembler for .intel_syntax to produce an error for invalid base
registers in memory addresses that do not match the index register. As it does
for .att_syntax.

rdar://15887380

llvm-svn: 199948
2014-01-23 22:34:42 +00:00
Kevin Enderby 9d11702f5d Update the X86 assembler for .intel_syntax to produce an error for invalid
scale factors in memory addresses. As it does for .att_syntax.

It was producing:
Assertion failed: (((Scale == 1 || Scale == 2 || Scale == 4 || Scale == 8)) && "Invalid scale!"), function CreateMem, file /Volumes/SandBox/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp, line 1133.

rdar://14967214

llvm-svn: 199942
2014-01-23 21:52:41 +00:00
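The change in r199942 above replaces an assertion failure with a diagnostic. A minimal Python sketch of that idea follows; `check_scale` and its message text are hypothetical, and only the set of legal scale factors (1, 2, 4, 8) is taken from the assertion quoted in the commit.

```python
VALID_SCALES = (1, 2, 4, 8)


def check_scale(scale):
    """Return None for a legal scale factor, or an error string otherwise.

    Reporting an error lets the assembler reject bad input gracefully
    instead of tripping an assertion later on.
    """
    if scale in VALID_SCALES:
        return None
    return f"invalid scale factor {scale}: expected 1, 2, 4 or 8"


ok = check_scale(4)
bad = check_scale(3)
```

Here `ok` is `None`, while `bad` carries a message the parser could surface to the user.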
Eric Christopher 7383d4a9c5 Fix out of bounds access to the double regs array. Given the
code this looks correct, but could use review. The previous
was definitely not correct.

llvm-svn: 199940
2014-01-23 21:41:10 +00:00
Lang Hames b1ce33379a Add a few missing cases from r199933. Testcase coming shortly.
llvm-svn: 199938
2014-01-23 21:27:27 +00:00
Lang Hames 23de211c5d Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator
loops. Writing back to the accumulator (231-type) allows the coalescer to
eliminate an extra copy.

llvm-svn: 199933
2014-01-23 20:23:36 +00:00
Weiming Zhao 5930ae6cc2 [Thumbv8] Fix the value of BLXOperandIndex of isV8EligibleForIT
Originally, BLX was passed as operand #0 in MachineInstr and as operand
#2 in MCInst. But now, it's operand #2 in both cases.

This patch also removes unnecessary FileCheck in the test case added by r199127.

llvm-svn: 199928
2014-01-23 19:55:33 +00:00
Juergen Ributzka 5fe955cb75 Add target analysis passes to the codegen pipeline for MCJIT.
This patch adds the target analysis passes (usually TargetTransformInfo) to the
codegen pipeline. We also now expose the AddAnalysisPasses method through the C
API, because the optimizer passes would also benefit from better target-specific
cost models.

Reviewed by Andrew Kaylor

llvm-svn: 199926
2014-01-23 19:23:28 +00:00
Ana Pazos 5d31f6945b [AArch64] Added vselect patterns with float and double types
llvm-svn: 199925
2014-01-23 19:18:57 +00:00
Eric Christopher 4c96056acd Avoid emitting a DWARF type attribute for an ObjC property of type
void.

Patch by Scott Talbot.

llvm-svn: 199924
2014-01-23 19:16:28 +00:00
Tom Stellard a64353e5bd R600: Remove successive JUMP in AnalyzeBranch when AllowModify is true
This fixes a crash in the OpenCV OpenCL test suite.

There is no lit test for this, because the test would be very large
and could easily be invalidated by changes to the scheduler
or other parts of the compiler.

Patch by:  Vincent Lejeune

llvm-svn: 199919
2014-01-23 18:49:34 +00:00
Tom Stellard a2a4b8ee2f R600: Disable the BFE pattern
This pattern uses an SDNodeXForm, which isn't being emitted for some
reason.  I can get it to work by attaching the PatLeaf that has the
XForm to the argument in the output pattern, but this results in an
immediate being used in a register operand, which the backend can't
handle yet.

llvm-svn: 199918
2014-01-23 18:49:33 +00:00
Tom Stellard 805890b252 R600: Correctly handle vertex fetch clauses that precede ENDIFs
The control flow finalizer would sometimes use an ALU_POP_AFTER
instruction before the vertex fetch clause instead of using a POP
instruction after it.

llvm-svn: 199917
2014-01-23 18:49:31 +00:00
Tom Stellard 8cce9bdf17 R600: Unconditionally unroll loops that contain GEPs with alloca pointers
Implement the getUnrollingPreferences() function for
AMDGPUTargetTransformInfo so that loops that do address calculations
on pointers derived from alloca are unconditionally unrolled.

Unrolling these loops makes it more likely that SROA will be able to
eliminate the allocas, which is a big win for R600 since memory
allocated by alloca (private memory) is really slow.

llvm-svn: 199916
2014-01-23 18:49:28 +00:00
Rafael Espindola 2a05ea5c0e Remove tail marker when changing an argument to an alloca.
Argument promotion can replace an argument of a call with an alloca. This
requires clearing the tail marker as it is very likely that the callee is now
using an alloca in the caller.

This fixes pr14710.

llvm-svn: 199909
2014-01-23 17:19:42 +00:00
Tom Stellard 348273df97 R600: Recommit 199842: Add work-around for the CF stack entry HW bug
The unit test is now disabled on non-asserts builds.

The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when the number of sub-entries modulo 8 is either 7 or 0).

We choose to be conservative and always apply the work-around when the
number of sub-entries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199905
2014-01-23 16:18:02 +00:00
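The two conditions in r199905 above (the exact trigger for the hardware bug and the conservative rule the patch applies) can be compared in a short Python sketch. Both function names are hypothetical, the stack entry size of 4 used in the check is an arbitrary example value, and reading the Cedar case as also requiring the size threshold is an assumption, since the message is ambiguous on that point.

```python
def bug_can_trigger(sub_entries, stack_entry_size, is_cedar=False):
    """Exact condition as described in the commit message."""
    if sub_entries < stack_entry_size:
        return False
    if is_cedar:
        # Cedar: affected when sub-entries modulo 8 is 7 or 0.
        return sub_entries % 8 in (7, 0)
    # Other parts: affected when sub-entries modulo 4 is 0 or 3.
    return sub_entries % 4 in (0, 3)


def apply_workaround(sub_entries, stack_entry_size):
    """Conservative rule: ignore the modulo test entirely."""
    return sub_entries >= stack_entry_size


# Every case where the bug can trigger is covered by the conservative rule.
covered = all(
    apply_workaround(n, 4)
    for n in range(64)
    for cedar in (False, True)
    if bug_can_trigger(n, 4, cedar)
)
```

The conservative rule fires in some cases where the exact condition would not, which is the price paid for being able to over-allocate the stack safely.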
Elena Demikhovsky a5d38a39a0 AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions,
they give better sequences than VPERMI

llvm-svn: 199893
2014-01-23 14:27:26 +00:00
Tim Northover 55c625f222 ARM: use litpools for normal i32 imms when compiling minsize.
With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but
movw/movt pairs consume 8*N. This means litpools are better than movw/movt even
with just one use. Other materialisation strategies can still be better though,
so the logic is a little odd.

llvm-svn: 199891
2014-01-23 13:43:47 +00:00
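The size argument in r199891 above is simple arithmetic and can be checked directly. In this Python sketch the two formulas come from the commit message; the function names are made up for illustration.

```python
def litpool_bytes(uses):
    """One shared 4-byte pool entry plus a 2-byte load per use."""
    return 4 + 2 * uses


def movw_movt_bytes(uses):
    """An 8-byte movw/movt pair at every use."""
    return 8 * uses


# Bytes saved by the literal pool for 1..4 uses of the same constant.
savings = [movw_movt_bytes(n) - litpool_bytes(n) for n in range(1, 5)]
```

Even a single use costs 6 bytes with a literal pool against 8 for movw/movt, and the gap grows by 6 bytes per additional use.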
Artyom Skrobov a515896343 Prevent repetitive warnings for unrecognized processors and features
llvm-svn: 199886
2014-01-23 11:31:38 +00:00
Chandler Carruth aa7fa5e4b2 [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
function and a FunctionPass.

This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.

There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
  because they can simply run it after changing a loop. Because
  subsequent passes can assume LoopSimplify is preserved we can reduce
  the runs of this pass to the times when we actually mutate a loop
  structure.
- The new pass manager should be able to more easily support loop passes
  factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
  across SCEV. This *halves* the number of times we run LoopSimplify!!!

Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analyses are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.

Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.

The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.

Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.

llvm-svn: 199884
2014-01-23 11:23:19 +00:00
Daniel Sanders 37463f7259 [mips][sched] Split IIStore into II_S[BHWD], II_S[WD][LR], and II_SAVE
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199876
2014-01-23 10:31:31 +00:00
Eric Christopher 15abef6df9 Add a variable to track whether or not we've used a unique section,
e.g. linkonce, to TargetMachine and set it when we've done so
for ELF targets currently. This involved making TargetMachine
non-const in a TLOF use and propagating that change around - I'm
open to other ideas.

This will be used in a future commit to handle emitting debug
information with ranges.

llvm-svn: 199871
2014-01-23 06:47:25 +00:00
Kevin Qin 50944eb638 Fix some spelling mistakes around 'ConcatVector' and 'ShuffleVector' in AArch64 backend.
llvm-svn: 199858
2014-01-23 01:35:13 +00:00
NAKAMURA Takumi 372f05d537 X86Disassembler.cpp: Fix @param introduced in r199804. [-Wdocumentation]
llvm-svn: 199855
2014-01-23 00:37:25 +00:00
Jack Carter 3b2c96ee86 [Mips] formatting through clang-format
llvm-svn: 199853
2014-01-22 23:31:38 +00:00
Jack Carter 39536724a7 [Mips] TargetStreamer Support for .set mips16.
This patch updates .set mips16 support which
affects the ELF ABI and its flags. In addition the patch uses
a common interface for both the MipsTargetStreamer and
MipsObjectStreamer that the assembler uses for
both ELF and ASCII output for these directives.

llvm-svn: 199851
2014-01-22 23:08:42 +00:00
Owen Anderson 77e4d44411 Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal.
This is a horrible bit of code.  We're calling a simplification routine *in the middle* of type legalization.  We tell the
simplification routine that it's running after legalization, but some of the types it will encounter will be illegal!  The
fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated.

llvm-svn: 199847
2014-01-22 22:34:17 +00:00
Tom Stellard 31e16388d7 Revert "R600: Add work-around for the CF stack entry HW bug"
This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba.

The -debug-only flag for llc doesn't appear to be available in
all build configurations.

llvm-svn: 199845
2014-01-22 22:20:54 +00:00
Rafael Espindola 20fcda7162 Provide a dummy section to fix a crash with inline assembly in LTO.
Fixes pr18508.

llvm-svn: 199843
2014-01-22 22:11:14 +00:00
Tom Stellard e89373e062 R600: Add work-around for the CF stack entry HW bug
The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when the number of sub-entries modulo 8 is either 7 or 0).

We choose to be conservative and always apply the work-around when the
number of sub-entries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199842
2014-01-22 21:55:46 +00:00
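
The trigger condition in the R600 CF stack work-around message above can be written as a small predicate. This is only a sketch of the arithmetic in the message, not the backend code: the function names and parameters (hitsHwBug, appliesWorkaround, subEntries, stackEntrySize, isCedar) are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical helper: true when the stack state matches the hardware bug
// as the commit message describes it.
//   subEntries     - number of sub-entries currently on the CF stack
//   stackEntrySize - stack entry size, in sub-entries
//   isCedar        - cedar uses the modulo-8 rule instead of modulo 4
inline bool hitsHwBug(unsigned subEntries, unsigned stackEntrySize,
                      bool isCedar) {
  if (subEntries < stackEntrySize)
    return false;
  if (isCedar) {
    unsigned r = subEntries % 8;
    return r == 7 || r == 0;
  }
  unsigned r = subEntries % 4;
  return r == 0 || r == 3;
}

// Hypothetical helper: the conservative rule the message says was chosen,
// which ignores the modulo test and looks only at the stack depth.
inline bool appliesWorkaround(unsigned subEntries, unsigned stackEntrySize) {
  return subEntries >= stackEntrySize;
}
```

Every state accepted by hitsHwBug is also accepted by appliesWorkaround, which is why the conservative rule is safe.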
Tom Stellard 59ed4794c4 R600: Add some missing CF instruction definitions to the .td files.
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199841
2014-01-22 21:55:44 +00:00
Tom Stellard a40f97154b R600: Refactor stack size calculation
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199840
2014-01-22 21:55:43 +00:00
Tom Stellard afbb697e0b R600: CF_PUSH is the same on Evergreen and Cayman
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199839
2014-01-22 21:55:41 +00:00
Tom Stellard 8c347b024e R600: Add wavefront size property to the subtargets v2
v2:
  - Initialize wavefront size to 0

reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199838
2014-01-22 21:55:40 +00:00
Tom Stellard 08b6af91c3 R600: Add stack size to .AMDGPUcsdata section
reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 199837
2014-01-22 21:55:35 +00:00
Matt Arsenault 84de61148b Handle an addrspacecast case in memcpyopt
llvm-svn: 199836
2014-01-22 21:53:19 +00:00
Matt Arsenault 339506d151 Get right cost for addrspacecast in cost model
llvm-svn: 199833
2014-01-22 20:30:16 +00:00
Rafael Espindola 28a85a84ac Fix pr18515.
My understanding (from reading just the llvm code) is that
* most ppc cpus have a "sync n" instruction and an msync alias that is "sync 0".
* "book e" cpus instead have a msync instruction and not the more
general "sync n"

This patch reflects that in the .td files, allowing a single codepath for
asm and obj streamer and incidentally fixes a crash when EmitRawText was
called on an obj streamer.

llvm-svn: 199832
2014-01-22 20:20:52 +00:00
Tom Stellard 476437cbbc R600: MOVA is vector only
llvm-svn: 199827
2014-01-22 19:24:24 +00:00
Tom Stellard 598f3945c0 R600: Take alignment into account when calculating the stack offset
llvm-svn: 199826
2014-01-22 19:24:23 +00:00
Tom Stellard 04c0e9851b R600: Add support for global addresses with constant initializers
llvm-svn: 199825
2014-01-22 19:24:21 +00:00
Tom Stellard 27982b1d4a R600: Begin private memory at the second GPR.
This way private memory does not over-write work group information
stored in GPRs 0 and 1.

llvm-svn: 199824
2014-01-22 19:24:19 +00:00
Tom Stellard e93736057f R600/SI: Add support for i8 and i16 private loads/stores
llvm-svn: 199823
2014-01-22 19:24:14 +00:00
Matt Arsenault fc3c91d0cb Bug 18228 - Fix accepting bitcasts between vectors of pointers with a
different number of elements.

Bitcasts were passing with vectors of pointers with different number of
elements since the number of elements was checking
SrcTy->getVectorNumElements() == SrcTy->getVectorNumElements() which
isn't helpful. The addrspacecast was also wrong, but that case at least
is caught by the verifier. Refactor bitcast and addrspacecast handling
in castIsValid to be more readable and fix this problem.

llvm-svn: 199821
2014-01-22 19:21:33 +00:00
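
The self-comparison described in the Bug 18228 message above is easy to reproduce in isolation. The sketch below uses a simplified stand-in type (VecTy, an assumption; it is not llvm::Type) to contrast the check that always passes with the intended source-versus-destination comparison.

```cpp
#include <cassert>

// Simplified stand-in for a vector type; only the element count matters here.
struct VecTy {
  unsigned NumElements;
};

// Shape of the buggy check: the source count is compared with itself,
// so the result is true for any pair of types.
inline bool sameCountBuggy(const VecTy &SrcTy, const VecTy & /*DstTy*/) {
  return SrcTy.NumElements == SrcTy.NumElements;
}

// Intended check: compare the source count against the destination count.
inline bool sameCountFixed(const VecTy &SrcTy, const VecTy &DstTy) {
  return SrcTy.NumElements == DstTy.NumElements;
}
```

Some compilers warn about the tautological comparison in sameCountBuggy; that warning is the point of the example.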
Greg Fitzgerald 1f6a6086ae Fix inline assembly that switches between ARM and Thumb modes
This patch restores the ARM mode if the user's inline assembly
does not.  In the object streamer, it ensures that instructions
following the inline assembly are encoded correctly and that
correct mapping symbols are emitted.  For the asm streamer, it
emits a .arm or .thumb directive.

This patch does not ensure that the inline assembly contains
the ADR instruction to switch modes at runtime.

The problem we need to solve is code like this:

  int foo(int a, int b) {
    int r = a + b;
    asm volatile(
        ".align 2     \n"
        ".arm         \n"
        "add r0,r0,r0 \n"
    : : "r"(r));
    return r+1;
  }

If we compile this function in thumb mode then the inline assembly
will switch to arm mode. We need to make sure that we switch back to
thumb mode after emitting the inline assembly or we will incorrectly
encode the instructions that follow (i.e. the assembly instructions
for return r+1).

Based on patch by David Peixotto

Change-Id: Ib57f6d2d78a22afad5de8693fba6230ff56ba48b
llvm-svn: 199818
2014-01-22 18:32:35 +00:00
Benjamin Kramer f5f23b09bf Remove param doxygen comment for non-existing parameter.
Found by -Wdocumentation.

llvm-svn: 199814
2014-01-22 16:22:17 +00:00
Rafael Espindola ec46f3182b Pass the computed magic to createBinary and createObjectFile if available.
identify_magic is not free, so we should avoid calling it twice. The argument
also makes it cheap for createBinary to just forward to createObjectFile.

llvm-svn: 199813
2014-01-22 16:04:52 +00:00
David Woodhouse 7a7c192e3e [x86] Silence unused diReg variable warning in non-asserting builds
llvm-svn: 199812
2014-01-22 15:31:32 +00:00
David Woodhouse fee418c2c0 [x86] Fix uninitialized variable warning in translate{Src,Dst}Index
llvm-svn: 199811
2014-01-22 15:31:29 +00:00
David Woodhouse e4e815d660 [x86] Remove now-unused isSrcOp() and isDstOp() from X86AsmParser
llvm-svn: 199810
2014-01-22 15:08:58 +00:00
David Woodhouse 4ce66069a0 [x86] Allow segment and address-size overrides for INS[BWLQ] (PR9385)
llvm-svn: 199809
2014-01-22 15:08:55 +00:00
David Woodhouse c472b813bf [x86] Allow segment and address-size overrides for OUTS[BWLQ] (PR9385)
llvm-svn: 199808
2014-01-22 15:08:49 +00:00
David Woodhouse 6f417dea33 [x86] Allow segment and address-size overrides for MOVS[BWLQ] (PR9385)
llvm-svn: 199807
2014-01-22 15:08:42 +00:00
David Woodhouse 9bbf7ca13d [x86] Allow segment and address-size overrides for CMPS[BWLQ] (PR9385)
llvm-svn: 199806
2014-01-22 15:08:36 +00:00
David Woodhouse 20fe48047d [x86] Allow address-size overrides for SCAS{8,16,32,64} (PR9385)
llvm-svn: 199805
2014-01-22 15:08:27 +00:00
David Woodhouse b33c2ef215 [x86] Allow address-size overrides for STOS[BWLQ] (PR9385)
llvm-svn: 199804
2014-01-22 15:08:21 +00:00
David Woodhouse 2ef8d9c05c [x86] Allow segment and address-size overrides for LODS[BWLQ] (PR9385)
llvm-svn: 199803
2014-01-22 15:08:08 +00:00
Tim Northover bc6659c4e9 Loop strength reduce: fix function name.
llvm-svn: 199801
2014-01-22 13:27:00 +00:00
Elena Demikhovsky 9d56f1e0e5 AVX512: combining setcc and zext is wrong on AVX512
because vector compare instruction puts result in mask register.

llvm-svn: 199798
2014-01-22 12:26:19 +00:00
James Molloy d787d3e593 MachineCopyPropagation has special logic for removing COPY instructions. It will remove plain COPYs using eraseFromParent(), but if the COPY has imp-defs/imp-uses it will convert it to a KILL, to keep the imp-def around.
This actually totally breaks and causes the machine verifier to cry in several cases, one of which being:

%RAX<def> = COPY %RCX<kill>
%ECX<def> = COPY %EAX<kill>, %RAX<imp-use,kill>

These subregister copies are together identified as noops, so are both removed. However, the second one as it has an imp-use gets converted into a kill:

%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

As the original COPY has been removed, the verifier goes into tears at the use of undefined EAX and RAX.

There are several hacky solutions to this hacky problem (which is all to do with imp-use/def weirdnesses), but the least hacky I've come up with is to *always* remove COPYs by converting to KILLs. KILLs are no-ops to the code generator so the generated code doesn't change (which is why they were partially used in the first place), but using them also keeps the def/use and imp-def/imp-use chains alive:

%RAX<def> = KILL %RCX<kill>
%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

The patch passes all test cases including the ones that check the removal of MOVs in this circumstance, along with an extra test I added to check subregister behaviour (which made the machine verifier fall over before my patch).

The patch also adds some DEBUG() statements because the file hadn't got any.

llvm-svn: 199797
2014-01-22 09:12:27 +00:00
Kevin Qin ce0190c6d5 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
llvm-svn: 199791
2014-01-22 06:11:03 +00:00
Andrew Trick 4675351afd Reformat a loop for basic hygiene. Self review.
llvm-svn: 199788
2014-01-22 03:38:55 +00:00
Venkatraman Govindaraju dd634cac74 [Sparc] Add support for inline assembly constraints which specify registers by their aliases.
llvm-svn: 199786
2014-01-22 03:18:42 +00:00
Matt Arsenault d850a06604 Fix typo
llvm-svn: 199784
2014-01-22 02:38:23 +00:00
Venkatraman Govindaraju 407e442245 [Sparc] Add support for inline assembly constraint 'I'.
llvm-svn: 199781
2014-01-22 01:29:51 +00:00
Rafael Espindola 51cc360204 Change createObjectFile to return an ErrorOr.
llvm-svn: 199776
2014-01-22 00:14:49 +00:00
Venkatraman Govindaraju f52927fb1b [Sparc] Do not add PC to _GLOBAL_OFFSET_TABLE_ address to access GOT in absolute code.
Fixes PR#18521

llvm-svn: 199775
2014-01-22 00:13:18 +00:00
Chandler Carruth 4de315430c [SROA] Fix a bug which could cause the common type finding to return
inconsistent results for different orderings of alloca slices. The
fundamental issue is that it is just always a mistake to return early
from this function. There is no effective early exit to leverage. This
patch stops trying to do so and simplifies the code a bit as
a consequence.

Original diagnosis and patch by James Molloy with some name tweaks by me
in part reflecting feedback from Duncan Smith on the mailing list.

llvm-svn: 199771
2014-01-21 23:16:05 +00:00
Rafael Espindola 692410efcb Be a bit more consistent about using ErrorOr when constructing Binary objects.
The constructors of classes deriving from Binary normally take an error_code
as an argument to the constructor. My original intent was to change them
to have a trivial constructor and move the initial parsing logic to a static
method returning an ErrorOr. I changed my mind because:

* A constructor with an error_code out parameter is extremely convenient from
  the implementation side. We can incrementally construct the object and give
  up when we find an error.
* It is very efficient when constructing on the stack or when there is no
  error. The only inefficient case is when heap allocating and an error is
  found (we have to free the memory).

The result is that this is a much smaller patch. It just standardizes the
create* helpers to return an ErrorOr.

Almost no functionality change: The only difference is that this found that
we were trying to read past the end of COFF import library but ignoring the
error.

llvm-svn: 199770
2014-01-21 23:06:54 +00:00
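
The pattern described in the ErrorOr message above (a constructor with an error_code out parameter, wrapped by a create* helper that returns either an error or the object) might look roughly like this. It is a standalone sketch using only the standard library: Binary, BinaryOrError and createBinary here are illustrative stand-ins and are not the LLVM classes or llvm::ErrorOr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <system_error>
#include <utility>

// Stand-in object whose constructor reports failure through an out parameter.
class Binary {
public:
  Binary(const std::string &Data, std::error_code &EC) : Size(Data.size()) {
    // Construction is incremental; give up as soon as an error is found.
    if (Data.empty())
      EC = std::make_error_code(std::errc::invalid_argument);
  }
  std::size_t size() const { return Size; }

private:
  std::size_t Size;
};

// Minimal "error or value" result, assumed here in place of llvm::ErrorOr.
struct BinaryOrError {
  std::error_code EC;
  std::unique_ptr<Binary> Value;
};

// The create helper hides the out parameter from callers. On failure the
// heap object is freed, which is the one inefficient case the message notes.
inline BinaryOrError createBinary(const std::string &Data) {
  std::error_code EC;
  std::unique_ptr<Binary> B(new Binary(Data, EC));
  if (EC)
    return {EC, nullptr};
  return {EC, std::move(B)};
}
```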
Duncan P. N. Exon Smith 50ed9af23d CodeGen: Stop treating vectors as aggregates
Fix a crash in SjLjEHPrepare::lowerIncomingArguments caused by treating
VectorType like an aggregate.  It's first-class!

<rdar://problem/15854596>

llvm-svn: 199768
2014-01-21 22:46:46 +00:00
Andrew Trick 350ff2c084 Fix PR18572 - llc crash during GenericScheduler::initPolicy().
Generalized the heuristic that looks at the (very rough) size of the
register file before enabling regpressure tracking.

llvm-svn: 199766
2014-01-21 21:27:37 +00:00
Hal Finkel 3e4a34c8c3 Fix pointer info on PPC byval stores
For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters
from registers into the stack frame had MachinePointerInfo objects with
incorrect offsets. These offsets are relative to the object itself, not to the
stack frame base.

This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi.

llvm-svn: 199763
2014-01-21 20:15:58 +00:00
Yunzhong Gao a88d7abeb1 Adding new LTO APIs to parse metadata nodes and extract linker options and
dependent libraries from a bitcode module.

Differential Revision: http://llvm-reviews.chandlerc.com/D2343

llvm-svn: 199759
2014-01-21 18:31:27 +00:00
Rafael Espindola 23a9750c47 Rename these methods to match the style guide.
llvm-svn: 199751
2014-01-21 16:09:45 +00:00
Daniel Sanders 0b385ac138 [mips][sched] Split IILoad into II_L[BHWD], II_L[BHW]U, II_L[WD][LR], and II_RESTORE
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199749
2014-01-21 15:21:14 +00:00
Daniel Sanders 3d345b11c8 [mips][sched] Split IIFmoveC1 into II_M[FT]C1, II_M[FT]HC1, II_DM[FT]C1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199748
2014-01-21 15:03:52 +00:00
Daniel Sanders bf8aa22902 [mips][sched] Split IIFStore into II_S[WD]C1, and II_S[WDU]XC1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199747
2014-01-21 14:50:20 +00:00
Justin Holewinski 7706107e6f [NVPTX] Add missing patterns for div.approx with immediate denominator
llvm-svn: 199746
2014-01-21 14:40:05 +00:00
Daniel Sanders 7741274534 [mips][sched] Split IIFLoad into II_L[WD]C1, and II_L[WDU]XC1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199743
2014-01-21 13:59:56 +00:00
Daniel Sanders 7af732c915 [mips][sched] Removed IIFrecipFsqrtStep. No instructions use it.
llvm-svn: 199742
2014-01-21 13:45:41 +00:00
Daniel Sanders 3424067527 [mips][sched] Renamed II_FsqrtSingle and II_FsqrtDouble to II_SQRT_S and II_SQRT_D respectively
No functional change

llvm-svn: 199741
2014-01-21 13:36:45 +00:00
Daniel Sanders 072f60f0dc [mips][sched] Renamed II_FdivSingle and II_FdivDouble to II_DIV_S and II_DIV_D respectively
No functional change

llvm-svn: 199738
2014-01-21 13:22:08 +00:00
Daniel Sanders 2ce72b061c [mips][sched] Split IIFmulDouble into II_MUL_D, II_MADD_D, II_MSUB_D, II_NMADD_D, and II_NMSUB_S
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199737
2014-01-21 13:07:31 +00:00
Daniel Sanders 47b4b6dd78 [mips][sched] Split IIFmulSingle into II_MUL_S, II_MADD_S, II_MSUB_S, II_NMADD_S, and II_NMSUB_S
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199734
2014-01-21 12:51:44 +00:00
Daniel Sanders 4bf6078841 [mips][sched] Split IIFadd into II_ADD_[DS], II_SUB_[DS]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199732
2014-01-21 12:38:07 +00:00
Daniel Sanders b8013baf8f [mips][sched] Split IIFcmp into II_C_CC_[SD]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199728
2014-01-21 11:42:48 +00:00
Daniel Sanders f5fb34137e [mips][sched] Split IIFmove into II_C[FT]C1, II_MOV[FNTZ]_[SD], II_MOV_[SD]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199727
2014-01-21 11:28:03 +00:00
Daniel Sanders 555f4c5672 [mips][sched] Split IIFcvt into II_(ROUND|TRUNC|CEIL|FLOOR|CVT), II_ABS, II_NEG
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199722
2014-01-21 10:56:23 +00:00
Daniel Sanders 298ad0f277 [mips][sched] Split IIslt into II_SLT_SLTU, II_SLTI_SLTIU
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199719
2014-01-21 10:42:13 +00:00
Renato Golin e195f9ce15 Checked return warning from coverity
llvm-svn: 199716
2014-01-21 10:24:35 +00:00
Saleem Abdulrasool d9f086036a ARM IAS: add support for .unwind_raw directive
This implements the unwind_raw directive for the ARM IAS.  The unwind_raw
directive takes the form of a stack offset value followed by one or more bytes
representing the opcodes to be emitted.  The opcode emitted will be interpreted as
if it were assembled by the opcode assembler via the standard unwinding
directives.

Thanks to Logan Chien for an extra test!

llvm-svn: 199707
2014-01-21 02:33:10 +00:00
Saleem Abdulrasool 662f5c1a5a ARM IAS: support .personalityindex
The .personalityindex directive is equivalent to the .personality directive with
the ARM EABI personality with the specific index (0, 1, 2).  Both of these
directives indicate personality routines, so enhance the personality directive
handling to take into account personalityindex.

Bonus fix: flush the UnwindContext at the beginning of a new function.

Thanks to Logan Chien for additional tests!

llvm-svn: 199706
2014-01-21 02:33:02 +00:00
Kevin Qin 6d379abd8f [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
It was committed as r199628 but reverted in r199628 because it caused a
regression test to fail. The cause was an old version of the patch
that I used for the commit. Sorry for the mistake.

llvm-svn: 199704
2014-01-21 01:48:52 +00:00
Kevin Enderby c030848f8f Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back
to not guess at a symbol name in some cases.

The problem is that in object files assembled starting at address 0, when
trying to symbolicate something that starts like this:

% cat x.s
_t1:
	vpshufd	$0x0, %xmm1, %xmm0

the symbolic disassembly can end up like this:

% otool -tV x.o 
x.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$_t1, %xmm1, %xmm0

Which in this case is incorrect symbolication.

But it is useful in some cases to use the SymbolLookUp() call back
to guess at some immediate values.  For example one like this
that does not have an external relocation entry:

% cat y.s
_t1:
	movl	$_d1, %eax
.data
_d1:	.long	0

% clang -c -arch i386 y.s

% otool -tV y.o 
y.o:
(__TEXT,__text) section
_t1:
0000000000000000	movl	$_d1, %eax

% otool -rv y.o 
y.o:
Relocation information (__TEXT,__text) 1 entries
address  pcrel length extern type    scattered symbolnum/value
00000001 False long   False  VANILLA False     2 (__DATA,__data)

So the change is based on the observation that an immediate Value
coming from an instruction field with a width of 1 byte, other than branches
and items with relocation, is not likely to be a symbol address.

With the change the first case above simply becomes:

% otool -tV x.o 
x.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$0x0, %xmm1, %xmm0

and the second case continues to work as expected.

rdar://14863405

llvm-svn: 199698
2014-01-21 00:23:17 +00:00
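
The heuristic stated in the MCExternalSymbolizer message above can be read as a single predicate. This is only a sketch of that rule as worded in the message; the function name and its parameters (mayGuessSymbol, fieldWidthBytes, isBranch, hasRelocation) are assumptions and do not come from the LLVM sources.

```cpp
#include <cassert>

// Hypothetical predicate: should the symbolizer try the SymbolLookUp()
// call back to guess a symbol name for this immediate value?
inline bool mayGuessSymbol(unsigned fieldWidthBytes, bool isBranch,
                           bool hasRelocation) {
  // Branch targets and relocated items are still treated as possible symbols.
  if (isBranch || hasRelocation)
    return true;
  // A 1-byte immediate field is unlikely to hold a symbol address.
  return fieldWidthBytes != 1;
}
```

Under this rule the 1-byte shuffle immediate of vpshufd is left as $0x0, while the 4-byte immediate of movl can still be looked up.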
Kevin Enderby debfea62d4 To allow the X86 verbose assembly to print its informative comments
when used with symbolic disassembly, add a check that the operand
is an immediate and has not been symbolicated to MCExpr operand.

I’m trying to enable the ‘C’ disassembly API option
LLVMDisassembler_Option_SetInstrComments for darwin’s
otool(1) that uses the llvm disassembler API.  The problem is
that the disassembler API can change an immediate operand to
an MCExpr operand if it symbolicates it with the call backs.
And if it does the code in llvm::EmitAnyX86InstComments()
will crash when it assumes these operands are immediates.

The fix for this is very straightforward: just protect the call
to getImm() with a check of isImm().  So if the immediate for
an instruction is symbolicated it simply doesn’t get the X86
verbose assembly comments:

% otool -tV test_asm.o
test_asm.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$_t1, %xmm1, %xmm0
0000000000000005	retq
0000000000000006	nopw	%cs:_t1(%rax,%rax)
_t2:
0000000000000010	vpshufd	$-0x1, %xmm0, %xmm0     ## xmm0 = xmm0[3,3,3,3]
0000000000000015	retq
0000000000000016	nopw	%cs:_t1(%rax,%rax)
_t3:
0000000000000020	vpshufd	$_t1, %xmm1, %xmm0
0000000000000025	retq
0000000000000026	nopw	%cs:_t1(%rax,%rax)
_t4:
0000000000000030	vpshufd	$0x2d, %xmm0, %xmm0     ## xmm0 = xmm0[1,3,2,0]
0000000000000035	retq

The fact that the immediate $0x0 is being symbolicated at
all in this case is a different problem which my next patch
will address.

rdar://10989286

llvm-svn: 199697
2014-01-21 00:18:51 +00:00
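
The isImm() guard described in the X86 verbose assembly message above can be illustrated with a tiny stand-in operand type. Operand and shuffleComment below are assumptions made for this sketch and are not the MCOperand API or the code in llvm::EmitAnyX86InstComments(). The decoding splits the 8-bit shuffle immediate into four 2-bit lane selectors.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in operand: either a plain immediate or a symbolicated expression.
struct Operand {
  bool IsImm;
  int64_t Imm;
  bool isImm() const { return IsImm; }
  int64_t getImm() const {
    assert(IsImm && "getImm() called on a non-immediate operand");
    return Imm;
  }
};

// Returns the lane list for a 4-lane shuffle immediate, or an empty string
// when the operand was symbolicated and therefore gets no comment.
inline std::string shuffleComment(const Operand &Op) {
  if (!Op.isImm())
    return "";
  uint8_t Imm = static_cast<uint8_t>(Op.getImm());
  std::string S = "[";
  for (int I = 0; I < 4; ++I) {
    if (I)
      S += ",";
    // Each destination lane is selected by two bits, lowest lane first.
    S += std::to_string((Imm >> (2 * I)) & 3);
  }
  S += "]";
  return S;
}
```

For the immediates in the listing above, 0x2d decodes to [1,3,2,0] and -0x1 to [3,3,3,3], matching the comments otool prints.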
Hal Finkel a69e5b8b9d Update StackProtector when coloring merges stack slots
StackProtector keeps a ValueMap of alloca instructions to layout kind tags for
use by PEI and other later passes. When stack coloring replaces one alloca with
a bitcast to another one, the key replacement in this map does not work.
Instead, provide an interface to manage this updating directly. This seems like
an improvement over the old behavior, where the layout map would not get
updated at all when the stack slots were merged. In practice, however, there is
likely no observable difference because PEI only did anything special with
'large array' kinds, and if one large array is merged with another, then the
replacement should already have been a large array.

This is an attempt to unbreak the clang-x86_64-darwin11-RA builder.

llvm-svn: 199684
2014-01-20 19:49:14 +00:00