Commit Graph

66793 Commits

Author SHA1 Message Date
Rafael Espindola ec46f3182b Pass the computed magic to createBinary and createObjectFile if available.
identify_magic is not free, so we should avoid calling it twice. The argument
also makes it cheap for createBinary to just forward to createObjectFile.

llvm-svn: 199813
2014-01-22 16:04:52 +00:00
David Woodhouse 7a7c192e3e [x86] Silence unused diReg variable warning in non-asserting builds
llvm-svn: 199812
2014-01-22 15:31:32 +00:00
David Woodhouse fee418c2c0 [x86] Fix uninitialized variable warning in translate{Src,Dst}Index
llvm-svn: 199811
2014-01-22 15:31:29 +00:00
David Woodhouse e4e815d660 [x86] Remove now-unused isSrcOp() and isDstOp() from X86AsmParser
llvm-svn: 199810
2014-01-22 15:08:58 +00:00
David Woodhouse 4ce66069a0 [x86] Allow segment and address-size overrides for INS[BWLQ] (PR9385)
llvm-svn: 199809
2014-01-22 15:08:55 +00:00
David Woodhouse c472b813bf [x86] Allow segment and address-size overrides for OUTS[BWLQ] (PR9385)
llvm-svn: 199808
2014-01-22 15:08:49 +00:00
David Woodhouse 6f417dea33 [x86] Allow segment and address-size overrides for MOVS[BWLQ] (PR9385)
llvm-svn: 199807
2014-01-22 15:08:42 +00:00
David Woodhouse 9bbf7ca13d ]x86] Allow segment and address-size overrides for CMPS[BWLQ] (PR9385)
llvm-svn: 199806
2014-01-22 15:08:36 +00:00
David Woodhouse 20fe48047d [x86] Allow address-size overrides for SCAS{8,16,32,64} (PR9385)
llvm-svn: 199805
2014-01-22 15:08:27 +00:00
David Woodhouse b33c2ef215 [x86] Allow address-size overrides for STOS[BWLQ] (PR9385)
llvm-svn: 199804
2014-01-22 15:08:21 +00:00
David Woodhouse 2ef8d9c05c [x86] Allow segment and address-size overrides for LODS[BWLQ] (PR9385)
llvm-svn: 199803
2014-01-22 15:08:08 +00:00
Tim Northover bc6659c4e9 Loop strength reduce: fix function name.
llvm-svn: 199801
2014-01-22 13:27:00 +00:00
Elena Demikhovsky 9d56f1e0e5 AVX512: combining setcc and zext is wrong on AVX512
because vector compare instruction puts result in mask register.

llvm-svn: 199798
2014-01-22 12:26:19 +00:00
James Molloy d787d3e593 MachineCopyPropagation has special logic for removing COPY instructions. It will remove plain COPYs using eraseFromParent(), but if the COPY has imp-defs/imp-uses it will convert it to a KILL, to keep the imp-def around.
This actually totally breaks and causes the machine verifier to cry in several cases, one of which being:

%RAX<def> = COPY %RCX<kill>
%ECX<def> = COPY %EAX<kill>, %RAX<imp-use,kill>

These subregister copies are together identified as noops, so are both removed. However, the second one as it has an imp-use gets converted into a kill:

%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

As the original COPY has been removed, the verifier goes into tears at the use of undefined EAX and RAX.

There are several hacky solutions to this hacky problem (which is all to do with imp-use/def weirdnesses), but the least hacky I've come up with is to *always* remove COPYs by converting to KILLs. KILLs are no-ops to the code generator so the generated code doesn't change (which is why they were partially used in the first place), but using them also keeps the def/use and imp-def/imp-use chains alive:

%RAX<def> = KILL %RCX<kill>
%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

The patch passes all test cases including the ones that check the removal of MOVs in this circumstance, along with an extra test I added to check subregister behaviour (which made the machine verifier fall over before my patch).

The patch also adds some DEBUG() statements because the file hadn't got any.

llvm-svn: 199797
2014-01-22 09:12:27 +00:00
Kevin Qin ce0190c6d5 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
llvm-svn: 199791
2014-01-22 06:11:03 +00:00
Andrew Trick 4675351afd Reformat a loop for basic hygeine. Self review.
llvm-svn: 199788
2014-01-22 03:38:55 +00:00
Venkatraman Govindaraju dd634cac74 [Sparc] Add support for inline assembly constraints which specify registers by their aliases.
llvm-svn: 199786
2014-01-22 03:18:42 +00:00
Matt Arsenault d850a06604 Fix typo
llvm-svn: 199784
2014-01-22 02:38:23 +00:00
Venkatraman Govindaraju 407e442245 [Sparc] Add support for inline assembly constraint 'I'.
llvm-svn: 199781
2014-01-22 01:29:51 +00:00
Rafael Espindola 51cc360204 Change createObjectFile to return an ErrorOr.
llvm-svn: 199776
2014-01-22 00:14:49 +00:00
Venkatraman Govindaraju f52927fb1b [Sparc] Do not add PC to _GLOBAL_OFFSET_TABLE_ address to access GOT in absolute code.
Fixes PR#18521

llvm-svn: 199775
2014-01-22 00:13:18 +00:00
Chandler Carruth 4de315430c [SROA] Fix a bug which could cause the common type finding to return
inconsistent results for different orderings of alloca slices. The
fundamental issue is that it is just always a mistake to return early
from this function. There is no effective early exit to leverage. This
patch stops trynig to do so and simplifies the code a bit as
a consequence.

Original diagnosis and patch by James Molloy with some name tweaks by me
in part reflecting feedback from Duncan Smith on the mailing list.

llvm-svn: 199771
2014-01-21 23:16:05 +00:00
Rafael Espindola 692410efcb Be a bit more consistent about using ErrorOr when constructing Binary objects.
The constructors of classes deriving from Binary normally take an error_code
as an argument to the constructor. My original intent was to change them
to have a trivial constructor and move the initial parsing logic to a static
method returning an ErrorOr. I changed my mind because:

* A constructor with an error_code out parameter is extremely convenient from
  the implementation side. We can incrementally construct the object and give
  up when we find an error.
* It is very efficient when constructing on the stack or when there is no
  error. The only inefficient case is where heap allocating and an error is
  found (we have to free the memory).

The result is that this is a much smaller patch. It just standardizes the
create* helpers to return an ErrorOr.

Almost no functionality change: The only difference is that this found that
we were trying to read past the end of COFF import library but ignoring the
error.

llvm-svn: 199770
2014-01-21 23:06:54 +00:00
Duncan P. N. Exon Smith 50ed9af23d CodeGen: Stop treating vectors as aggregates
Fix a crash in SjLjEHPrepare::lowerIncomingArguments caused by treating
VectorType like an aggregate.  It's first-class!

<rdar://problem/15854596>

llvm-svn: 199768
2014-01-21 22:46:46 +00:00
Andrew Trick 350ff2c084 Fix PR18572 - llc crash during GenericScheduler::initPolicy().
Generalized the heuristic that looks at the (very rough) size of the
register file before enabling regpressure tracking.

llvm-svn: 199766
2014-01-21 21:27:37 +00:00
Hal Finkel 3e4a34c8c3 Fix pointer info on PPC byval stores
For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters
from registers into the stack frame had MachinePointerInfo objects with
incorrect offsets. These offsets are relative to the object itself, not to the
stack frame base.

This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi.

llvm-svn: 199763
2014-01-21 20:15:58 +00:00
Yunzhong Gao a88d7abeb1 Adding new LTO APIs to parse metadata nodes and extract linker options and
dependent libraries from a bitcode module.

Differential Revision: http://llvm-reviews.chandlerc.com/D2343

llvm-svn: 199759
2014-01-21 18:31:27 +00:00
Rafael Espindola 23a9750c47 Rename these methods to match the style guide.
llvm-svn: 199751
2014-01-21 16:09:45 +00:00
Daniel Sanders 0b385ac138 [mips][sched] Split IILoad into II_L[BHWD], II_L[BHW]U, II_L[WD][LR], and II_RESTORE
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199749
2014-01-21 15:21:14 +00:00
Daniel Sanders 3d345b11c8 [mips][sched] Split IIFmoveC1 into II_M[FT]C1, II_M[FT]HC1, II_DM[FT]C1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199748
2014-01-21 15:03:52 +00:00
Daniel Sanders bf8aa22902 [mips][sched] Split IIFStore into II_S[WD]C1, and II_S[WDU]XC1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199747
2014-01-21 14:50:20 +00:00
Justin Holewinski 7706107e6f [NVPTX] Add missing patterns for div.approx with immediate denominator
llvm-svn: 199746
2014-01-21 14:40:05 +00:00
Daniel Sanders 7741274534 [mips][sched] Split IIFLoad into II_L[WD]C1, and II_L[WDU]XC1
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199743
2014-01-21 13:59:56 +00:00
Daniel Sanders 7af732c915 [mips][sched] Removed IIFrecipFsqrtStep. No instructions use it.
llvm-svn: 199742
2014-01-21 13:45:41 +00:00
Daniel Sanders 3424067527 [mips][sched] Renamed II_FsqrtSingle and II_FsqrtDouble to II_SQRT_S and II_SQRT_D respectively
No functional change

llvm-svn: 199741
2014-01-21 13:36:45 +00:00
Daniel Sanders 072f60f0dc [mips][sched] Renamed II_FdivSingle and II_FdivDouble to II_DIV_S and II_DIV_D respectively
No functional change

llvm-svn: 199738
2014-01-21 13:22:08 +00:00
Daniel Sanders 2ce72b061c [mips][sched] Split IIFmulDouble into II_MUL_D, II_MADD_D, II_MSUB_D, II_NMADD_D, and II_NMSUB_S
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199737
2014-01-21 13:07:31 +00:00
Daniel Sanders 47b4b6dd78 [mips][sched] Split IIFmulSingle into II_MUL_S, II_MADD_S, II_MSUB_S, II_NMADD_S, and II_NMSUB_S
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199734
2014-01-21 12:51:44 +00:00
Daniel Sanders 4bf6078841 [mips][sched] Split IIFadd into II_ADD_[DS], II_SUB_[DS]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199732
2014-01-21 12:38:07 +00:00
Daniel Sanders b8013baf8f [mips][sched] Split IIFcmp into II_C_CC_[SD]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199728
2014-01-21 11:42:48 +00:00
Daniel Sanders f5fb34137e [mips][sched] Split IIFmove into II_C[FT]C1, II_MOV[FNTZ]_[SD], II_MOV_[SD]
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199727
2014-01-21 11:28:03 +00:00
Daniel Sanders 555f4c5672 [mips][sched] Split IIFcvt into II_(ROUND|TRUNC|CEIL|FLOOR|CVT), II_ABS, II_NEG
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199722
2014-01-21 10:56:23 +00:00
Daniel Sanders 298ad0f277 [mips][sched] Split IIslt into II_SLT_SLTU, II_SLTI_SLTIU
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199719
2014-01-21 10:42:13 +00:00
Renato Golin e195f9ce15 Checked return warning from coverity
llvm-svn: 199716
2014-01-21 10:24:35 +00:00
Saleem Abdulrasool d9f086036a ARM IAS: add support for .unwind_raw directive
This implements the unwind_raw directive for the ARM IAS.  The unwind_raw
directive takes the form of a stack offset value followed by one or more bytes
representing the opcodes to be emitted.  The opcode emitted will interpreted as
if it were assembled by the opcode assembler via the standard unwinding
directives.

Thanks to Logan Chien for an extra test!

llvm-svn: 199707
2014-01-21 02:33:10 +00:00
Saleem Abdulrasool 662f5c1a5a ARM IAS: support .personalityindex
The .personalityindex directive is equivalent to the .personality directive with
the ARM EABI personality with the specific index (0, 1, 2).  Both of these
directives indicate personality routines, so enhance the personality directive
handling to take into account personalityindex.

Bonus fix: flush the UnwindContext at the beginning of a new function.

Thanks to Logan Chien for additional tests!

llvm-svn: 199706
2014-01-21 02:33:02 +00:00
Kevin Qin 6d379abd8f [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
It was commited as r199628 but reverted in r199628 as causing
regression test failed. It's because of old vervsion of patch
I used to commit. Sorry for mistake.

llvm-svn: 199704
2014-01-21 01:48:52 +00:00
Kevin Enderby c030848f8f Tweak the MCExternalSymbolizer to not use the SymbolLookUp() call back
to not guess at a symbol name in some cases.

The problem is that in object files assembled starting at address 0, when
trying to symbolicate something that starts like this:

% cat x.s
_t1:
	vpshufd	$0x0, %xmm1, %xmm0

the symbolic disassembly can end up like this:

% otool -tV x.o 
x.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$_t1, %xmm1, %xmm0

Which is in this case produced incorrect symbolication.

But it is useful in some cases to use the SymbolLookUp() call back
to guess at some immediate values.  For example one like this
that does not have an external relocation entry:

% cat y.s
_t1:
	movl	$_d1, %eax
.data
_d1:	.long	0

% clang -c -arch i386 y.s

% otool -tV y.o 
y.o:
(__TEXT,__text) section
_t1:
0000000000000000	movl	$_d1, %eax

% otool -rv y.o 
y.o:
Relocation information (__TEXT,__text) 1 entries
address  pcrel length extern type    scattered symbolnum/value
00000001 False long   False  VANILLA False     2 (__DATA,__data)

So the change is based on it is not likely that an immediate Value
coming from an instruction field of a width of 1 byte, other than branches
and items with relocation, are not likely symbol addresses.

With the change the first case above simply becomes:

% otool -tV x.o 
x.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$0x0, %xmm1, %xmm0

and the second case continues to work as expected.

rdar://14863405

llvm-svn: 199698
2014-01-21 00:23:17 +00:00
Kevin Enderby debfea62d4 To allow the X86 verbose assembly to print its informative comments
when used with symbolic disassembly, add a check that the operand
is an immediate and has not been symbolicated to MCExpr operand.

I’m trying to enable the ‘C’ disassembly API option
LLVMDisassembler_Option_SetInstrComments for darwin’s
otool(1) that uses the llvm disassembler API.  The problem is
that the disassembler API can change an immediate operand to
an MCExpr operand if it symbolicates it with the call backs.
And if it does the code in llvm::EmitAnyX86InstComments()
will crash when it assumes these operands are immediates.

The fix for this is very straight forward to just protect the call
to getImm() with a check of isImm().  So if the immediate for
an instruction is symbolicated it simply doesn’t get the X86
verbose assembly comments:

% otool -tV test_asm.o
test_asm.o:
(__TEXT,__text) section
_t1:
0000000000000000	vpshufd	$_t1, %xmm1, %xmm0
0000000000000005	retq
0000000000000006	nopw	%cs:_t1(%rax,%rax)
_t2:
0000000000000010	vpshufd	$-0x1, %xmm0, %xmm0     ## xmm0 = xmm0[3,3,3,3]
0000000000000015	retq
0000000000000016	nopw	%cs:_t1(%rax,%rax)
_t3:
0000000000000020	vpshufd	$_t1, %xmm1, %xmm0
0000000000000025	retq
0000000000000026	nopw	%cs:_t1(%rax,%rax)
_t4:
0000000000000030	vpshufd	$0x2d, %xmm0, %xmm0     ## xmm0 = xmm0[1,3,2,0]
0000000000000035	retq

The fact that the immediate $0x0 is being symbolicated at
all in this case is a different problem which my next patch
will address.

rdar://10989286

llvm-svn: 199697
2014-01-21 00:18:51 +00:00
Hal Finkel a69e5b8b9d Update StackProtector when coloring merges stack slots
StackProtector keeps a ValueMap of alloca instructions to layout kind tags for
use by PEI and other later passes. When stack coloring replaces one alloca with
a bitcast to another one, the key replacement in this map does not work.
Instead, provide an interface to manage this updating directly. This seems like
an improvement over the old behavior, where the layout map would not get
updated at all when the stack slots were merged. In practice, however, there is
likely no observable difference because PEI only did anything special with
'large array' kinds, and if one large array is merged with another, than the
replacement should already have been a large array.

This is an attempt to unbreak the clang-x86_64-darwin11-RA builder.

llvm-svn: 199684
2014-01-20 19:49:14 +00:00
Andrea Di Biagio 450d1661be [X86] Teach how to combine a vselect into a movss/movsd
Add target specific rules for combining vselect dag nodes into movss/movsd
when possible.

If the vector type of the vselect dag node in input is either MVT::v4i13 or
MVT::v4f32, then try to fold according to rules:

  1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B)
  2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A)

If the vector type of the vselect dag node in input is either MVT::v2i64 or
MVT::v2f64 (and we have SSE2), then try to fold according to rules:

  3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B)
  4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A)

llvm-svn: 199683
2014-01-20 19:35:22 +00:00
Adrian Prantl 671af5ca4b Debug info: On ARM ensure that all __TEXT sections come before the
optional DWARF sections, so compiling with -g does not result in
different code being generated for PC-relative loads.

This is reapplying a diet r197922 (__TEXT-only).

llvm-svn: 199681
2014-01-20 19:15:59 +00:00
Adrian Prantl 1a89924d84 Revert "Debug info: On ARM ensure that the data sections come before the"
Cut back on the cargo cult. The order of __DATA sections doesn't affect
generated code.

This reverts commit r197922.

llvm-svn: 199680
2014-01-20 19:15:55 +00:00
Owen Anderson fb00d5bc7c Allow SMUL_LOHI and UMUL_LOHI to be narrow to MUL on targets where MUL is Custom rather than Legal. Even if the target is doing some kind of expansion for MUL, it's pretty much guaranteed to be more efficent than whatever it does for SMUL_LOHI or UMUL_LOHI!
llvm-svn: 199678
2014-01-20 18:41:34 +00:00
James Molloy 43ccae1bb4 Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN.
llvm-svn: 199675
2014-01-20 17:14:48 +00:00
Hal Finkel cd9569c19e Update IR when merging slots in stack coloring
The way that stack coloring updated MMOs when merging stack slots, while
correct, is suboptimal, and is incompatible with the use of AA during
instruction scheduling. The solution, which involves the use of const_cast (and
more importantly, updating the IR from within an MI-level pass), obviously
requires some explanation:

When the stack coloring pass was originally committed, the code in
ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using
GetUnderlyingObject, and all load/store and store/store memory control
dependencies where added between SUs at the object level (where only one
object, that returned by GetUnderlyingObject, was used to identify the object
associated with each MMO). When stack coloring merged stack slots, it would
replace MMOs derived from the remapped alloca with the alloca with which the
remapped alloca was being replaced. Because ScheduleDAGInstrs only used single
objects, and tracked alias sets at the object level, this was a fine solution.

In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use
GetUnderlyingObjects, and track alias sets using, potentially, multiple
underlying objects for each MMO. This was done, primarily, to provide the
ability to look through PHIs, and provide better scheduling for
induction-variable-dependent loads and stores inside loops. At this point, the
MMO-updating code in stack coloring became suboptimal, because it would clear
the MMOs for (i.e. completely pessimize) all instructions for which r169744
might help in scheduling. Updating the IR directly is the simplest fix for this
(and the one with, by far, the least compile-time impact), but others are
possible (we could give each MMO a small vector of potential values, or make
use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs).

Unfortunately, replacing all MMO values derived from the remapped alloca with
the base replacement alloca fundamentally breaks our ability to use AA during
instruction scheduling (which is critical to performance on some targets). The
reason is that the original MMO might have had an offset (either constant or
dynamic) from the base remapped alloca, and that offset is not present in the
updated MMO. One possible way around this would be to use
GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also
its offset based on the original offset. Unfortunately, this solution would
only handle constant offsets, and for safety (because AA is not completely
restricted to deducing relationships with constant offsets), we would need to
clear all MMOs without constant offsets over the entire function. This would be
an even worse pessimization than the current single-object restriction. Any
other solution would involve passing around a vector of remapped allocas, and
teaching AA to use it, introducing additional complexity and overhead into AA.

Instead, when remapping an alloca, we replace all IR uses of that alloca as
well (optionally inserting a bitcast as necessary). This is even more efficient
that the old MMO-updating code in the stack coloring pass (because it removes
the need to call GetUnderlyingObject on all MMO values), removes the
single-object pessimization in the default configuration, and enables the
correct use of AA during instruction scheduling (all without any additional
overhead).

LLVM now no longer miscompiles itself on x86_64 when using -enable-misched
-enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle!
Fixed PR18497.

Because the alloca replacement is now done at the IR level, unless the MMO
directly refers to the remapped alloca, the change cannot be seen at the MI
level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll.

llvm-svn: 199658
2014-01-20 14:03:16 +00:00
Hal Finkel a228a8187b Track multiple stores per object when using AA in ScheduleDAGInstrs
When using AA to break false chain dependencies, we need to track multiple
stores per object in ScheduleDAGInstrs. Historically, we tracked potential alias
chains at the object level, and so all loads of an object would retain
dependencies on any store to that object. With AA, however, this is not
sufficient: non-overlapping stores and loads to the same object all need to be
tested for dependencies separately, we cannot only test all loads to an object
against only the last store (see PR18497 for an explicit example).

To mitigate any unwelcome compile-time impact when not using AA, only one store
is kept in the list per object when not using AA.

This, along with a stack coloring change to come shortly, will provide a test
case, fix PR18497 (and allow LLVM to compile itself using -enable-aa-sched-mi
on x86-64).

llvm-svn: 199657
2014-01-20 14:03:02 +00:00
David Woodhouse caaa2850c0 [x86] Fix disassembly of MOV16ao16 et al.
The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
also turns out to have been unnecessary. The disassembler handles the
AdSize prefix for itself, and doesn't care about the difference between
(e.g.) MOV8ao8 and MOB8ao8_16 definitions. So just let them coexist and
don't worry about it.

llvm-svn: 199654
2014-01-20 12:02:53 +00:00
David Woodhouse 9c74fdb8b9 [x86] Fix 16-bit disassembly of JCXZ/JECXZ
llvm-svn: 199653
2014-01-20 12:02:48 +00:00
David Woodhouse 3442f3429e [x86] Rename MOVSD/STOSD/LODSD/OUTSD to MOVSL/STOSL/LODSL/OUTSL
The disassembler has a special case for 'L' vs. 'W' in its heuristic for
checking for 32-bit and 16-bit equivalents. We could expand the heuristic,
but better just to be consistent in using the 'L' suffix.

llvm-svn: 199652
2014-01-20 12:02:44 +00:00
David Woodhouse 70ced3e0b2 [x86] Fix disassembly of callw instruction
Not quite sure why this was marked isAsmParserOnly, but it means that the
disassembler can't see it either.

llvm-svn: 199651
2014-01-20 12:02:40 +00:00
David Woodhouse 5cf4c6750d [x86] Fix 16-bit handling of OpSize bit
When disassembling in 16-bit mode the meaning of the OpSize bit is
inverted. Instructions found in the IC_OPSIZE context will actually
*not* have the 0x66 prefix, and instructions in the IC context will
have the 0x66 prefix. Make use of the existing special-case handling
for the 0x66 prefix being in the wrong place, to cope with this.

llvm-svn: 199650
2014-01-20 12:02:35 +00:00
David Woodhouse 7dd218245c [x86] Infer disassembler mode from SubtargetInfo feature bits
Aside from cleaning up the code, this also adds support for the -code16
environment and actually enables the MODE_16BIT mode that was previously
not accessible.

There is no point adding any testing for 16-bit yet though; basically
nothing will work because we aren't handling the OpSize prefix correctly
for 16-bit mode.

llvm-svn: 199649
2014-01-20 12:02:31 +00:00
David Woodhouse 71d15edaf3 [x86] Support i386-*-*-code16 triple for emitting 16-bit code
llvm-svn: 199648
2014-01-20 12:02:25 +00:00
Chandler Carruth 4d35631a6c [PM] Wire up the Verifier for the new pass manager and connect it to the
various opt verifier commandline options.

Mostly mechanical wiring of the verifier to the new pass manager.
Exercises one of the more unusual aspects of it -- a pass can be either
a module or function pass interchangably. If this is ever problematic,
we can make things more constrained, but for things like the verifier
where there is an "obvious" applicability at both levels, it seems
convenient.

This is the next-to-last piece of basic functionality left to make the
opt commandline driving of the new pass manager minimally functional for
testing and further development. There is still a lot to be done there
(notably the factoring into .def files to kill the current boilerplate
code) but it is relatively uninteresting. The only interesting bit left
for minimal functionality is supporting the registration of analyses.
I'm planning on doing that on top of the .def file switch mostly because
the boilerplate for the analyses would be significantly worse.

llvm-svn: 199646
2014-01-20 11:34:08 +00:00
Kai Nacke e51c813859 ARM: add tlsldo relocation
Add support for the symbol(tlsldo) relocation. This is required in order to 
solve PR18554.

Reviewed by R. Golin, A. Korobeynikov.

llvm-svn: 199644
2014-01-20 11:00:40 +00:00
NAKAMURA Takumi 6acf320a99 [CMake] llvm_process_sources: Introduce a parameter, ADDITIONAL_HEADERS.
ADDITIONAL_HEADERS is intended to add header files for IDEs as hint.

For example:
  add_llvm_library(LLVMSupport
    Host.cpp
    ADDITIONAL_HEADERS
      Unix/Host.inc
      Windows/Host.inc
    )

llvm-svn: 199639
2014-01-20 10:20:23 +00:00
Artyom Skrobov 10e76a4eda [ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is non-optional: it should have the default value of AllowDIVIfExists
llvm-svn: 199638
2014-01-20 10:18:42 +00:00
Chandler Carruth f835fc6f4f Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT."
This test fails the newly added regression tests.

llvm-svn: 199631
2014-01-20 08:18:01 +00:00
Chandler Carruth b587ab679f Fix a DenseMap iterator invalidation bug causing lots of crashes when
type units were enabled. The crux of the issue is that the
addDwarfTypeUnitType routine can end up being indirectly recursive. In
this case, the reference into the dense map (TU) became invalid by the
time we popped all the way back and used it to add the DIE type
signature.

Instead, use early return in the case where we can bypass the recursive
step and creating a type unit. Then use the pointer to the new type unit
to set up the DIE type signature in the case where we have to.

I tried really hard to reduce a testcase for this, but it's really
annoying. You have to get this to be mid-recursion when the densemap
grows. Even if we got a test case for this today, it'd be very unlikely
to continue exercising this pattern.

llvm-svn: 199630
2014-01-20 08:07:07 +00:00
Owen Anderson 1664dc8973 Fix all the remaining lost-fast-math-flags bugs I've been able to find. The most important of these are cases in the generic logic for combining BinaryOperators.
This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.

llvm-svn: 199629
2014-01-20 07:44:53 +00:00
Kevin Qin ff42e06ef4 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
llvm-svn: 199628
2014-01-20 07:32:26 +00:00
Kevin Qin ef66ff78ca [AArch64 NEON] Accept both #0.0 and #0 for comparing with floating point zero in asm parser.
For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be
printed as #0.0 instead of #0. To support the history codes using #0,
we consider to let asm parser accept both #0.0 and #0.

llvm-svn: 199621
2014-01-20 02:14:05 +00:00
Michael Gottesman 8347c34e67 Move the retrieval of VT after all of the early exits from PerformOrCombine that do not use VT. NFC.
llvm-svn: 199612
2014-01-19 21:06:00 +00:00
Benjamin Kramer b80e1699b3 InstCombine: Modernize a bunch of cast combines.
Also make them vector-aware.

llvm-svn: 199608
2014-01-19 20:05:13 +00:00
Benjamin Kramer 970f4959d4 InstCombine: Hoist 3 copies of AddOne/SubOne into a header.
llvm-svn: 199605
2014-01-19 16:56:10 +00:00
Benjamin Kramer 7a74bd4703 InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with the real thing.
llvm-svn: 199604
2014-01-19 16:48:41 +00:00
Benjamin Kramer 72196f3ae5 InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors.
llvm-svn: 199602
2014-01-19 15:24:22 +00:00
Benjamin Kramer 76b15d04ff InstCombine: Refactor fmul/fdiv combines to handle vectors.
llvm-svn: 199598
2014-01-19 13:36:27 +00:00
Chandler Carruth 1bf38c6a71 Fix a really nasty SROA bug with how we handled out-of-bounds memcpy
intrinsics.

Reported on the list by Evan with a couple of attempts to fix, but it
took a while to dig down to the root cause. There are two overlapping
bugs here, both centering around the circumstance of discovering
a memcpy operand which is known to be completely outside the bounds of
the alloca.

First, we need to kill the *other* side of the memcpy if it was added to
this alloca. Otherwise we'll factor it into our slicing and try to
rewrite it even though we know for a fact that it is dead. This is made
more tricky because we can visit the sides in either order. So we have
to both kill the other side and skip instructions marked as dead. The
latter really should be goodness in every case, but here is a matter of
correctness.

Second, we need to actually remove the *uses* of the alloca by the
memcpy when queuing it for later deletion. Otherwise it may still be
using the alloca when we go to promote it (if the rewrite re-uses the
existing alloca instruction). Do this by factoring out the
use-clobbering used when for nixing a Phi argument and re-using it
across the operands of a to-be-deleted instruction.

llvm-svn: 199590
2014-01-19 12:16:54 +00:00
Saleem Abdulrasool 9390005577 ARM ELF: ensure that the tag types are corrected
Ensure that the tag types are reflected on a replacement.  This is particularly
important for the compatibility tag which has multiple representations where the
last definition wins.

llvm-svn: 199577
2014-01-19 08:25:41 +00:00
Saleem Abdulrasool 196c3212ba ARM: update build attributes for ABI r2.09
Update names for the names as per the current ABI errata.  Mark deprecated tags
as such.

llvm-svn: 199576
2014-01-19 08:25:35 +00:00
Saleem Abdulrasool 278a9f4336 Move ARM build attributes into Support
This moves the ARM build attributes definitions and support routines into the
Support library.  The support routines simply permit the conversion of the value
to and from a string representation.

The movement is prompted in order to permit access to the constants and string
representations from readobj in order to facilitate decoding of the attributes
section.

llvm-svn: 199575
2014-01-19 08:25:27 +00:00
Saleem Abdulrasool 9dedf64238 ARM IAS: remove unnecessary special case
Tag_nodefaults is even and greater than 32 and thus does not need the special
check to fall into the correct category.

llvm-svn: 199574
2014-01-19 08:25:19 +00:00
Arnold Schwaighofer cc742dd9e4 LoopVectorizer: A reduction that has multiple uses of the reduction value is not
a reduction.

Really. Under certain circumstances (the use list of an instruction has to be
set up right - hence the extra pass in the test case) we would not recognize
when a value in a potential reduction cycle was used multiple times by the
reduction cycle.

Fixes PR18526.
radar://15851149

llvm-svn: 199570
2014-01-19 03:18:31 +00:00
Chandler Carruth 043949d446 [PM] Make the verifier work independently of any pass manager.
This makes the 'verifyFunction' and 'verifyModule' functions totally
independent operations on the LLVM IR. It also cleans up their API a bit
by lifting the abort behavior into their clients and just using an
optional raw_ostream parameter to control printing.

The implementation of the verifier is now just an InstVisitor with no
multiple inheritance. It also is significantly more const-correct, and
hides the const violations internally. The two layers that force us to
break const correctness are building a DomTree and dispatching through
the InstVisitor.

A new VerifierPass is used to implement the legacy pass manager
interface in terms of the other pieces.

The error messages produced may be slightly different now, and we may
have slightly different short circuiting behavior with different usage
models of the verifier, but generally everything works equivalently and
this unblocks wiring the verifier up to the new pass manager.

llvm-svn: 199569
2014-01-19 02:22:18 +00:00
Chandler Carruth 6a93692a34 Add a const lookup routine to get a BlockAddress constant if there is
one, but not create one. This is useful in the verifier when we want to
query the constant if it exists but not create one. To be used in an
upcoming commit.

llvm-svn: 199568
2014-01-19 02:13:50 +00:00
Eli Bendersky 157a97a6a0 Support AddrSpaceCast in ConstantExpr::getAsInstruction.
It's handled similarly to the other casts. CastInst::Create already knows how
to handle it.

llvm-svn: 199565
2014-01-18 22:54:33 +00:00
Nick Lewycky a6a17d77d2 Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu!
llvm-svn: 199564
2014-01-18 22:47:12 +00:00
Benjamin Kramer dd39a98b1c ARM: Let the assembler reject v5 instructions in v4 mode.
PR18524.

llvm-svn: 199559
2014-01-18 19:03:19 +00:00
Benjamin Kramer fea9ac99b0 InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle vectors too.
PR18532.

llvm-svn: 199553
2014-01-18 16:43:14 +00:00
Benjamin Kramer 5d2ff221f6 Upgrade ConstantFP's negative zero and infinity getters to handle vector types.
Will be used soon.

llvm-svn: 199552
2014-01-18 16:43:06 +00:00
Adrian Prantl ef129fbb41 Debug info (LTO): Move the creation of accessibility flags to
getOrCreateSubprogramDIE to avoid attributes being added twice when DIEs
are merged.

rdar://problem/15842330.

llvm-svn: 199536
2014-01-18 02:12:00 +00:00
Owen Anderson 48b842ef7c Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep).
llvm-svn: 199528
2014-01-18 00:48:14 +00:00
Reid Kleckner 436c42ec3d Add an inalloca flag to allocas
Summary:
The only current use of this flag is to mark the alloca as dynamic, even
if its in the entry block.  The stack adjustment for the alloca can
never be folded into the prologue because the call may clear it and it
has to be allocated at the top of the stack.

Reviewers: majnemer

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2571

llvm-svn: 199525
2014-01-17 23:58:17 +00:00
Rui Ueyama 24fc2d641f 80-column.
llvm-svn: 199519
2014-01-17 22:11:27 +00:00
Rui Ueyama e5df6095b1 llvm-objdump/COFF: Print ordinal base number.
llvm-svn: 199518
2014-01-17 22:02:24 +00:00
Juergen Ributzka e625013071 Add two new calling conventions for runtime calls
This patch adds two new target-independent calling conventions for runtime
calls - PreserveMost and PreserveAll.
The target-specific implementation for X86-64 is defined as following:
  - Arguments are passed as for the default C calling convention
  - The same applies for the return value(s)
  - PreserveMost preserves all GPRs - except R11
  - PreserveAll preserves all GPRs and all XMMs/YMMs - except R11

Reviewed by Lang and Philip

llvm-svn: 199508
2014-01-17 19:47:03 +00:00
Daniel Sanders b825a634d6 [mips][msa] Correct pattern for LSA
Summary:
$rs and $rt were the wrong way round in the .td and the testcase wasn't
strict enough to detect the mistake.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2554

llvm-svn: 199498
2014-01-17 15:40:05 +00:00
Daniel Sanders c7a9f8d298 [mips] Split IIIdiv int II_DIV, II_DIVU, II_DDIV, and II_DDIVU
No functional change since the InstrItinData's were duplicated

llvm-svn: 199497
2014-01-17 14:48:06 +00:00
Daniel Sanders e95a137b96 [mips][sched] Split IIImul and IIImult into subclasses.
IIImul -> II_MUL
IIImult -> II_MULT, II_MULTU, II_MADD, II_MADDU, II_MSUB, II_MSUBU, II_DMULT, II_DMULTU

No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199495
2014-01-17 14:32:41 +00:00
Daniel Sanders 9342557ea7 [mips][sched] Split IIHiLo into II_MFHI_MFLO and II_MTHI_MTLO
No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199493
2014-01-17 14:17:34 +00:00
Renato Golin afc43a1cbe Add MLA alias for ARMv4 support.
Fix MLA defs to use register class GPRnopc.
Add encoding tests for multiply instructions.
(Alias for MUL/SMLAL/UMLAL added by r199026.)

Patch by Zhaoshi.

llvm-svn: 199491
2014-01-17 13:53:08 +00:00
Chandler Carruth bf2b652c05 [PM] [cleanup] Rename some of the Verifier's members, re-arrange them,
and tweak comments prior to more invasive surgery. Also clean up some
other non-doxygen comments, and run clang-format over the parts that are
going to change dramatically in subsequent commits so that those don't
get cluttered with formatting changes.

No functionality changed.

llvm-svn: 199489
2014-01-17 11:09:34 +00:00
Kostya Serebryany 714c67c31e [asan] extend asan-coverage (still experimental).
- add a mode for collecting per-block coverage (-asan-coverage=2).
   So far the implementation is naive (all blocks are instrumented),
   the performance overhead on top of asan could be as high as 30%.
 - Make sure the one-time calls to __sanitizer_cov are moved to function buttom,
   which in turn required to copy the original debug info into the call insn.

Here is the performance data on SPEC 2006
(train data, comparing asan with asan-coverage={0,1,2}):

                             asan+cov0     asan+cov1      diff 0-1    asan+cov2       diff 0-2      diff 1-2
       400.perlbench,        65.60,        65.80,         1.00,        76.20,         1.16,         1.16
           401.bzip2,        65.10,        65.50,         1.01,        75.90,         1.17,         1.16
             403.gcc,         1.64,         1.69,         1.03,         2.04,         1.24,         1.21
             429.mcf,        21.90,        22.60,         1.03,        23.20,         1.06,         1.03
           445.gobmk,       166.00,       169.00,         1.02,       205.00,         1.23,         1.21
           456.hmmer,        88.30,        87.90,         1.00,        91.00,         1.03,         1.04
           458.sjeng,       210.00,       222.00,         1.06,       258.00,         1.23,         1.16
      462.libquantum,         1.73,         1.75,         1.01,         2.11,         1.22,         1.21
         464.h264ref,       147.00,       152.00,         1.03,       160.00,         1.09,         1.05
         471.omnetpp,       115.00,       116.00,         1.01,       140.00,         1.22,         1.21
           473.astar,       133.00,       131.00,         0.98,       142.00,         1.07,         1.08
       483.xalancbmk,       118.00,       120.00,         1.02,       154.00,         1.31,         1.28
            433.milc,        19.80,        20.00,         1.01,        20.10,         1.02,         1.01
            444.namd,        16.20,        16.20,         1.00,        17.60,         1.09,         1.09
          447.dealII,        41.80,        42.20,         1.01,        43.50,         1.04,         1.03
          450.soplex,         7.51,         7.82,         1.04,         8.25,         1.10,         1.05
          453.povray,        14.00,        14.40,         1.03,        15.80,         1.13,         1.10
             470.lbm,        33.30,        34.10,         1.02,        34.10,         1.02,         1.00
         482.sphinx3,        12.40,        12.30,         0.99,        13.00,         1.05,         1.06

llvm-svn: 199488
2014-01-17 11:00:30 +00:00
Chandler Carruth 7677760ec8 [PM] Remove the preverifier and directly compute the DominatorTree for
the verifier after ensuring the CFG is at least usefully formed.

This fixes a number of problems:
1) The PreVerifier was missing the controls the Verifier provides over
   *how* an invalid module is handled -- it just aborted the program!
   Now it uses the same logic as the Verifier which is significantly
   more library-friendly.
2) The DominatorTree used previously could have been cached and not
   updated due to bugs in prior passes and we would silently use the
   stale tree. This could cause dominance errors to not be as quickly
   diagnosed.
3) We can now (in the next patch) pull the functionality of the verifier
   apart from the pass infrastructure so that you can verify IR without
   having any form of pass manager. This in turn frees the code to share
   logic between old and new pass manager variants.

Along the way I fixed at least one annoying bug -- the state for
'Broken' wasn't being cleared from run to run causing all functions
visited after the first broken function to be marked as broken
regardless of whether *they* were a problem. Fortunately, I don't really
know much of a way to observe this peculiarity.

In case folks are worried about the runtime cost, its negligible.
I looked at running the entire regression test suite (which should be
a relatively good use of the verifier) before and after but was unable
to even measure the time spent on the verifier and there was no
regresion from before to after. I checked both with debug builds and
optimized builds.

llvm-svn: 199487
2014-01-17 10:56:02 +00:00
Kevin Qin e0faea11b1 [AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations.
llvm-svn: 199485
2014-01-17 09:54:30 +00:00
Craig Topper 80ab268b06 Switch a few instructions to use RI instead I so they don't require REX_W to be explicitly specified.
llvm-svn: 199479
2014-01-17 08:16:57 +00:00
Craig Topper f124c6a5ef Add OpSize16 flags to 32-bit CRC32 instructions so they can be encoded correctly in 16-bit mode.
llvm-svn: 199478
2014-01-17 08:01:20 +00:00
Craig Topper 2d4b3c9770 Teach x86 asm parser to handle 'opaque ptr' in Intel syntax.
llvm-svn: 199477
2014-01-17 07:44:10 +00:00
Craig Topper 9ac290ad5b Teach X86 asm parser to understand 'ZMMWORD PTR' in Intel syntax.
llvm-svn: 199476
2014-01-17 07:37:39 +00:00
Craig Topper a49c2960c6 Fix intel syntax for 64-bit version of FXSAVE/FXRSTOR to use '64' suffix instead of 'q'
llvm-svn: 199474
2014-01-17 07:25:39 +00:00
Craig Topper 5a44496988 VEX_PREFIX_66 doesn't need to set the hasOpSize flag since VEX instructions don't use the size fields it controls.
llvm-svn: 199470
2014-01-17 07:11:45 +00:00
Craig Topper 3cbe160619 Replace duplicated code with a existing helper function.
llvm-svn: 199468
2014-01-17 06:42:38 +00:00
Hao Liu 17457a2ee2 [AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16.
Also add copy support for FPR16.
Also add a missing test case file belongs to commit r197361.

llvm-svn: 199463
2014-01-17 06:23:30 +00:00
Kevin Qin 212d9b4a56 [AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match.
llvm-svn: 199462
2014-01-17 05:52:35 +00:00
Hao Liu 18d92262c5 [AArch64]Fix the problem can't select concat_vectors of two v1i32 types.
Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32.

llvm-svn: 199461
2014-01-17 05:44:46 +00:00
Reid Kleckner 60d3a835ff Change inalloca rules to make it only apply to the last parameter
This makes things a lot easier, because we can now talk about the
"argument allocation", which allocates all the memory for the call in
one shot.

The only functional change is to the verifier for a feature that hasn't
shipped yet.

llvm-svn: 199434
2014-01-16 22:59:24 +00:00
Quentin Colombet dc0b2ea2bc [opt][PassInfo] Allow opt to run passes that need target machine.
When registering a pass, a pass can now specify a second construct that takes as
argument a pointer to TargetMachine.
The PassInfo class has been updated to reflect that possibility.
If such a constructor exists opt will use it instead of the default constructor
when instantiating the pass.

Since such IR passes are supposed to be rare, no specific support has been
added to this commit to allow an easy registration of such a pass.
In other words, for such pass, the initialization function has to be
hand-written (see CodeGenPrepare for instance).

Now, codegenprepare can be tested using opt:
opt -codegenprepare -mtriple=mytriple input.ll

llvm-svn: 199430
2014-01-16 21:44:34 +00:00
Owen Anderson e7321660c1 Fix two cases where we could lose fast math flags when optimizing FADD expressions.
llvm-svn: 199427
2014-01-16 21:26:02 +00:00
Owen Anderson 4557a156e3 Fix an instance where we would drop fast math flags when performing an fdiv to reciprocal multiply transformation.
llvm-svn: 199425
2014-01-16 21:07:52 +00:00
Owen Anderson e8537fc7e0 Fix a bug in InstCombine where we failed to preserve fast math flags when optimizing an FMUL expression.
llvm-svn: 199424
2014-01-16 20:59:41 +00:00
Rui Ueyama da49d0d44c llvm-objdump/COFF: Print DLL name in the export table header.
llvm-svn: 199422
2014-01-16 20:50:34 +00:00
Owen Anderson f74cfe031f Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which LLVM expresses as (fsub -0.0, X).
llvm-svn: 199420
2014-01-16 20:36:42 +00:00
Rui Ueyama 686738e226 Use static instead of anonymous namespace.
llvm-svn: 199419
2014-01-16 20:30:36 +00:00
Rui Ueyama 5efa665f7b Reduce nesting.
llvm-svn: 199418
2014-01-16 20:22:55 +00:00
Rui Ueyama 8ff24d25de Use the current local variable naming style.
llvm-svn: 199417
2014-01-16 20:11:48 +00:00
Kevin Enderby 8f4921c333 Tweak the MCExternalSymbolizer to print references to C string literals
with raw_ostream's write_escaped() method.

For example darwin's otool(1) program that uses the llvm
disassembler now produces disassembly like this:

leaq	0x7b(%rip), %rdi ## literal pool for: "%f\ntoto\n"

and not print the new lines which messes up the output.

rdar://15145300

llvm-svn: 199407
2014-01-16 18:43:56 +00:00
Daniel Sanders 59f991505c [mips][sched] Removed IIXfer. No instructions use it.
llvm-svn: 199403
2014-01-16 17:23:08 +00:00
Daniel Sanders 4aefdc7bda [mips][sched] Put AND, OR, XOR, MOVT_I, and MOVF_I in the same itinerary class as their non-microMIPS counterparts.
No functional change since both classes have the same InstrItinData definition.

llvm-svn: 199402
2014-01-16 17:13:57 +00:00
Rafael Espindola 0b694814a8 Add an emitRawComment function and use it to simplify some uses of EmitRawText.
llvm-svn: 199397
2014-01-16 16:28:37 +00:00
Daniel Sanders 4d20f0c00f [mips][sched] Split IIseb into II_SEB and II_SEH
No functional change since there are no InstrItinData's.

llvm-svn: 199396
2014-01-16 16:19:38 +00:00
Daniel Sanders 306ef07b8d [mips][sched] Split IILogic into II_AND, II_OR, II_XOR, II_ANDI, II_ORI, II_XORI
This is necessary because the classes are shared between all implementations.

No functional change since the InstrItinData's have been duplicated.

llvm-svn: 199394
2014-01-16 15:57:05 +00:00
Daniel Sanders 980589a803 [mips][sched] Split IIArith in preparation for the first scheduler targeting a specific MIPS CPU.
IIArith -> II_ADD, II_ADDU, II_AND, II_CL[ZO], II_DADDIU, II_DADDU,
  II_DROTR, II_DROTR32, II_DROTRV, II_DSLL, II_DSLL32, II_DSLLV,
  II_DSR[AL], II_DSR[AL]32, II_DSR[AL]V, II_DSUBU, II_LUI, II_MOV[ZFNT],
  II_NOR, II_OR, II_RDHWR, II_ROTR, II_ROTRV, II_SLL, II_SLLV, II_SR[AL],
  II_SR[AL]V, II_SUBU, II_XOR

No functional change since the InstrItinData's have been duplicated.

This is necessary because the classes are shared between all schedulers.

Once this patch series is committed there will be an InstrItinClass for
each mnemonic with minimal grouping. This does increase the size of the
itinerary tables for each MIPS scheduler but we have a few options for dealing
with that later. These options include reducing the number of classes once
we see the best way to simplify them, or by extending tablegen to be able
to compress the table by eliminating duplicates entries, etc.

llvm-svn: 199391
2014-01-16 14:27:20 +00:00
Daniel Sanders bfe1830ab9 [mips] Correct itin class for MULT_MM and MULTu_MM to IIImult.
This matches the itin class used by the non-microMIPS equivalents of these
instructions.

llvm-svn: 199389
2014-01-16 14:02:48 +00:00
Daniel Sanders 818058b2ab [mips] IIImult should have an InstrItinData in the generic scheduler. Used the same one as for IIImul.
Affects:
  DMULT, DMULTu, MADD, MADD_MM, MADDU, MADDU_MM, MSUB, MSUB_MM, MSUBU,
  MSUBU_MM, MULT, MULTu

Does not affect MULT_MM, MULTu_MM since they are currently miscategorised
as IIImul.

llvm-svn: 199381
2014-01-16 13:45:53 +00:00
Tim Northover 3657cb0350 ReMat: fix overly cavalier attitude to sub-register indices
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed. First, we see if the original register
would have worked, but this is inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then
we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try
to hunt for a class between v1 and v2 that works. Unfortunately, this is also
wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current
logic would decide that v2 could be a QQ (no interest is taken in later uses).

This patch, therefore, always accepts the widened register class without trying
to be clever. Generally there is no penalty to this (e.g. in the common GR32 <
GR64 case, expanding the width doesn't matter because it's not like you were
going to do anything else with the high bits of a GR32 register). It can
increase register pressure in cases like the ARM VFP regs though (multiple
non-overlapping but equivalent subregisters). This situation can be
spotted by the fact that both source and destination in the
not-quite-coalesced pair have a sub-register index and
rematerialisation is skipped in that situation.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

rdar://problem/15775279

llvm-svn: 199376
2014-01-16 12:29:55 +00:00
Evgeniy Stepanov 13665367a0 [asan] Remove -fsanitize-address-zero-base-shadow command line
flag from clang, and disable zero-base shadow support on all platforms
where it is not the default behavior.

- It is completely unused, as far as we know.
- It is ABI-incompatible with non-zero-base shadow, which means all
objects in a process must be built with the same setting. Failing to
do so results in a segmentation fault at runtime.
- It introduces a backward dependency of compiler-rt on user code,
which is uncommon and complicates testing.

This is the LLVM part of a larger change.

llvm-svn: 199371
2014-01-16 10:19:12 +00:00
Jiangning Liu 4df2363a23 For ARM, fix assertuib failures for some ld/st 3/4 instruction with wirteback.
llvm-svn: 199369
2014-01-16 09:16:13 +00:00
Elena Demikhovsky d1487261a0 AVX-512: fixed a compare pattern
llvm-svn: 199366
2014-01-16 08:45:54 +00:00
Craig Topper a9d2c67cc2 Copy segment register when optimizing to MOV8ao8/MOV16ao16/MOV32ao32.
llvm-svn: 199365
2014-01-16 07:57:45 +00:00
Craig Topper 35da3d190a Allow x86 mov instructions to/from memory with absolute address to be encoded and disassembled with a segment override prefix. Fixes PR16962.
llvm-svn: 199364
2014-01-16 07:36:58 +00:00
Rafael Espindola 74c3e63193 Use a slightly smaller hack.
llvm-svn: 199363
2014-01-16 07:36:00 +00:00
Rui Ueyama ad882ba896 llmv-objdump/COFF: Print export table contents.
This patch adds the capability to dump export table contents. An example
output is this:

  Export Table:
   Ordinal      RVA  Name
         5   0x2008  exportfn1
         6   0x2010  exportfn2

By adding this feature to llvm-objdump, we will be able to use it to check
export table contents in LLD's tests. Currently we are doing binary
comparison in the tests, which is fragile and not readable to humans.

llvm-svn: 199358
2014-01-16 07:05:49 +00:00
Rafael Espindola f69b850d60 CommentColumn is always 40. Simplify.
llvm-svn: 199357
2014-01-16 07:04:11 +00:00
Bill Wendling 91686d6dee Reapply r194218 with fix:
Move copying of global initializers below the cloning of functions.

The BlockAddress doesn't have access to the correct basic blocks until the
functions have been cloned. This causes the BlockAddress to point to the old
values. Just wait until the functions have been cloned before copying the
initializers.
PR13163

llvm-svn: 199354
2014-01-16 06:29:36 +00:00
Craig Topper 8a60fff260 Remove use of OpSize for populating VEX_PP field. A prefix encoding is now used instead. Simplify some other code. No functional changes intended.
llvm-svn: 199353
2014-01-16 06:14:45 +00:00
Rafael Espindola 098000eb72 Attempt to fix the MSVC build.
llvm-svn: 199352
2014-01-16 05:09:32 +00:00
Arnold Schwaighofer e3ac099726 BasicAA: We need to check both access sizes when comparing a gep and an
underlying object of unknown size.

Fixes PR18460.

llvm-svn: 199351
2014-01-16 04:53:18 +00:00
Rafael Espindola c3d687667d Prevent calls to __jit_debug_register_code from being optimized out.
Patch by Andrew MacPherson. I just tweaked the comment.

llvm-svn: 199350
2014-01-16 04:50:58 +00:00
Rui Ueyama a045b73a96 Don't use DataRefImpl to implement ImportDirectoryEntryRef.
DataRefImpl (a union of two integers and a pointer) is not the ideal data type
to represent a reference to an import directory entity. We should just use the
pointer to the import table and an offset instead to simplify. No functionality
change.

llvm-svn: 199349
2014-01-16 03:13:19 +00:00
Manman Ren 2ebfb42fe9 Report a warning when dropping outdated debug info metadata.
Use DiagnosticInfo to emit the warning.

llvm-svn: 199346
2014-01-16 01:51:12 +00:00
Reed Kotler 43788a20c6 Adjust offsets for max load instruction offsets. This is more pessimistic
than it needs to be by 1 bit but I need to finish some other things so 
that all the boundary cases will work in that situation. constpool.c
in test-suite will fail to assemble under our new internal test-suite sync
without this change.

llvm-svn: 199343
2014-01-16 00:47:46 +00:00
David Peixotto c0f92a2dc9 Fix parsing of .symver directive on ARM
ARM assembly syntax uses @ for a comment, execpt for the second
parameter of the .symver directive which requires @ as part of the
symbol name. This commit fixes the parsing of this directive by
adding a special case for ARM for this one argumnet.

To make the change we had to move the AllowAtInIdentifier variable
to the MCAsmLexer interface (from AsmLexer) and expose a setter for
the value.  The ELFAsmParser then toggles this value when parsing
the second argument to the .symver directive for a target that
uses @ as a comment symbol

llvm-svn: 199339
2014-01-15 22:40:02 +00:00
Quentin Colombet 5fa1f6f57a [LTO] Add a hook to map LLVM diagnostics into the clients of LTO.
Add a hook in the C API of LTO so that clients of the code generator can set
their own handler for the LLVM diagnostics.
The handler is defined like this:
typedef void (*lto_diagnostic_handler_t)(lto_codegen_diagnostic_severity_t
severity, const char *diag, void *ctxt)
- severity says how bad this is.
- diag is a string that contains the diagnostic message.
- ctxt is the registered context for this handler.

This hook is more general than the lto_get_error_message, since this function
keeps only the latest message and can only be queried when something went wrong
(no warning for instance).

<rdar://problem/15517596>

llvm-svn: 199338
2014-01-15 22:04:35 +00:00
Bob Wilson f8d5da6e0b Remove support for armv7f slice. <rdar://problem/12478440>
This was never used for anything so we should just get rid of it.

llvm-svn: 199337
2014-01-15 21:44:14 +00:00
Andrea Di Biagio d7c03ec348 [DAGCombiner] Fix a wrong check in method SimplifyVBinOp.
This fixes a regression intruced by r199135.

Revision 199135 tried to simplify part of the logic in method
DAGCombiner::SimplifyVBinOp introducing calls to method BuildVectorSDNode::isConstant().

However, that revision wrongly changed the check performed by method
SimplifyVBinOp to identify dag nodes that can be folded.
Before revision 199135, that method only tried to simplify vector binary operations
if both operands were build_vector of Constant/ConstantFP/Undef only.

After revision 199135, method SimplifyVBinop tried to
simplify also vector binary operations with only one constant operand.

This fixes the problem restoring the old behavior of SimplifyVBinOp.

llvm-svn: 199328
2014-01-15 19:51:32 +00:00
Rafael Espindola 63da295045 Return an ErrorOr<Binary *> from createBinary.
I did write a version returning ErrorOr<OwningPtr<Binary> >, but it is too
cumbersome to use without std::move. I will keep the patch locally and submit
when we switch to c++11.

llvm-svn: 199326
2014-01-15 19:37:43 +00:00
Kevin Enderby 2e13b1c7f1 Update the X86 assembler for .intel_syntax to accept
the | and & bitwise operators.

rdar://15570412

llvm-svn: 199323
2014-01-15 19:05:24 +00:00
Zoran Jovanovic 7d63392da9 LL and SC decoder method fix.
llvm-svn: 199316
2014-01-15 13:17:33 +00:00
Zoran Jovanovic d4cb61cf0e Added support for LWU microMIPS instruction.
llvm-svn: 199315
2014-01-15 13:01:18 +00:00
David Majnemer dee105772c WinCOFF: Transform IR expressions featuring __ImageBase into image relative relocations
MSVC on x64 requires that we create image relative symbol
references to refer to RTTI data. Seeing as how there is no way to
explicitly make reference to a given relocation type in LLVM IR, pattern
match expressions of the form &foo - &__ImageBase.

Differential Revision: http://llvm-reviews.chandlerc.com/D2523

llvm-svn: 199312
2014-01-15 09:16:42 +00:00
Elena Demikhovsky 79b75d9048 Fixed identation.
llvm-svn: 199301
2014-01-15 07:18:11 +00:00
Andrew Trick ee5aa7f71a Fix PR18449: SCEV needs more precise max BECount for multi-exit loop.
llvm-svn: 199299
2014-01-15 06:42:11 +00:00
Craig Topper 30a134b68d Add OpSize16 to the two byte forms of INC/DEC that we only use in 64-bit mode and a 64-bit only LEA. Even though we'll not be in 16-bit mode when we use them it makes their tables consistent with their 32-bit counterparts.
llvm-svn: 199297
2014-01-15 05:20:59 +00:00
Jiangning Liu 0a791c348b For AArch64, lowering sext_inreg and generate optimized code by using SXTL.
llvm-svn: 199296
2014-01-15 05:08:01 +00:00
Hans Wennborg 4744ac1733 Switch-to-lookup tables: set threshold to 3 cases
There has been an old FIXME to find the right cut-off for when it's worth
analyzing and potentially transforming a switch to a lookup table.

The switches always have two or more cases. I could not measure any speed-up
by transforming a switch with two cases. A switch with three cases gets a nice
speed-up, and I couldn't measure any compile-time regression, so I think this
is the right threshold.

In a Clang self-host, this causes 480 new switches to be transformed,
and reduces the final binary size with 8 KB.

llvm-svn: 199294
2014-01-15 05:00:27 +00:00
Arnold Schwaighofer dc4c9460a2 LoopVectorize: Only strip casts from integer types when replacing symbolic
strides

Fixes PR18480.

llvm-svn: 199291
2014-01-15 03:35:46 +00:00
Rafael Espindola 9d795caea4 Fix uninitialized variable.
llvm-svn: 199288
2014-01-15 03:27:26 +00:00
Rafael Espindola 26e917cde0 Only mark functions as micromips.
The GNU as behavior is a bit different and very strange. It will mark any
label that contains an instruction. We can implement that, but using the
type looks more natural since gas will not mark a function if a .word is
used to output the instructions!

llvm-svn: 199287
2014-01-15 03:07:12 +00:00
Weiming Zhao fe26fd27b4 PR 18466: Fix ARM Pseudo Expansion
When expanding neon pseudo stores, it may miss the implicit uses of sub
regs, which may cause post RA scheduler reorder instructions that
breakes anti dependency.

For example:
  VST1d64QPseudo %R0<kill>, 16, %Q9_Q10, pred:14, pred:%noreg
  will be expanded to
    VST1d64Q %R0<kill>, 16, %D18, pred:14, pred:%noreg;

An instruction that defines %D20 may be scheduled before the store by
mistake.

This patches adds implicit uses for such case. For the example above, it
emits:
  VST1d64Q %R0<kill>, 8, %D18, pred:14, pred:%noreg, %Q9_Q10<imp-use>

llvm-svn: 199282
2014-01-15 01:32:12 +00:00
Rafael Espindola 8f31e213e4 Make parseBitcodeFile return an ErrorOr<Module *>.
llvm-svn: 199279
2014-01-15 01:08:23 +00:00
Eric Christopher 1ad8457570 Make sure we emit a relocation to the debug_ranges section in the
presence of CU ranges.

llvm-svn: 199276
2014-01-15 00:04:29 +00:00
Rafael Espindola e9fab9b077 Return an error_code from materializeAllPermanently.
llvm-svn: 199275
2014-01-14 23:51:27 +00:00
Rafael Espindola 1d06f7208c Use error_code in Module::materializeAll.
llvm-svn: 199269
2014-01-14 23:02:01 +00:00
Tim Northover 463a5f24d1 ARM: correctly determine final tBX_LR in Thumb1 functions
The changes caused by folding an sp-adjustment into a "pop" previously
disrupted the forward search for the final real instruction in a
terminating block. This switches to a backward search (skipping debug
instrs).

This fixes PR18399.

Patch by Zhaoshi.

llvm-svn: 199266
2014-01-14 22:53:28 +00:00
Tim Northover 6e219cd588 AArch64: don't try to handle [SU]MUL_LOHI nodes
We should set them to expand for now since there are no patterns
dealing with them. Actually, there are no instructions either so I
doubt they'll ever be acceptable.

llvm-svn: 199265
2014-01-14 22:53:22 +00:00
Eric Christopher 39cde8cc90 Enable use of ranges for translation units in the presence of
-ffunction-sections and update comments and TODOs about other
places that we should enable this.

llvm-svn: 199263
2014-01-14 22:44:17 +00:00
Matt Arsenault 2d353d1a10 Do pointer cast simplifications on addrspacecast
llvm-svn: 199254
2014-01-14 20:00:45 +00:00
Matt Arsenault f08a44f903 Remove a check for an illegal condition.
Bitcasts can't be between address spaces anymore.

llvm-svn: 199253
2014-01-14 19:56:57 +00:00
Lang Hames 06234ec147 Add FPExt option to CCValAssign::LocInfo. When generating calling-convention
promotion code, Tablegen will now select FPExt for floating point promotions
(previously it had returned AExt, which is not valid for floating point types).

Any out-of-tree targets that were relying on AExt being returned for FP
promotions will need to update their code check for FPExt instead.

llvm-svn: 199252
2014-01-14 19:56:36 +00:00
Rafael Espindola 08ff298d51 Revert "[AArch64] Added vselect patterns with float and double types"
This reverts commit r199242.

It is causing CodeGen/AArch64/neon-bsl.ll to fail.

llvm-svn: 199248
2014-01-14 19:24:08 +00:00
Matt Arsenault e55a2c2e6b Make nocapture analysis work with addrspacecast
llvm-svn: 199246
2014-01-14 19:11:52 +00:00
Rafael Espindola 6633d57ae4 Fix a low hanging use of hasRawTextSupport.
This also fixes the placement of the function label comment. It was being
placed next to the mips16 directive instead of next to the label.

llvm-svn: 199245
2014-01-14 18:57:12 +00:00
Duncan P. N. Exon Smith 93be7c4fb3 Reapply "LTO: add API to set strategy for -internalize"
Reapply r199191, reverted in r199197 because it carelessly broke
Other/link-opts.ll.  The problem was that calling
createInternalizePass("main") would select
createInternalizePass(bool("main")) instead of
createInternalizePass(ArrayRef<const char *>("main")).  This commit
fixes the bug.

The original commit message follows.

Add API to LTOCodeGenerator to specify a strategy for the -internalize
pass.

This is a new attempt at Bill's change in r185882, which he reverted in
r188029 due to problems with the gold linker.  This puts the onus on the
linker to decide whether (and what) to internalize.

In particular, running internalize before outputting an object file may
change a 'weak' symbol into an internal one, even though that symbol
could be needed by an external object file --- e.g., with arclite.

This patch enables three strategies:

- LTO_INTERNALIZE_FULL: the default (and the old behaviour).
- LTO_INTERNALIZE_NONE: skip -internalize.
- LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
  visibility.

LTO_INTERNALIZE_FULL should be used when linking an executable.

Outputting an object file (e.g., via ld -r) is more complicated, and
depends on whether hidden symbols should be internalized.  E.g., for
ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
eventually need to link with others.

lto_codegen_set_internalize_strategy() sets the strategy for subsequent
calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().

<rdar://problem/14334895>

llvm-svn: 199244
2014-01-14 18:52:17 +00:00
Ana Pazos 787f540daa [AArch64] Added vselect patterns with float and double types
llvm-svn: 199242
2014-01-14 18:45:48 +00:00
Nico Rieck c60647f0db Handle dllexport for global aliases
llvm-svn: 199219
2014-01-14 15:23:25 +00:00
Nico Rieck 7157bb765e Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199218
2014-01-14 15:22:47 +00:00
Elena Demikhovsky 767fc967b4 AVX-512: optimized scalar compare patterns
removed AVX512SI format, since it is similar to AVX512BI.

llvm-svn: 199217
2014-01-14 15:10:08 +00:00
Patrik Hagglund 682a10d4cc Fix valgrind warning for gcc builds.
Sorry, I don't understand why the warning is generated (a gcc
bug?). Anyhow, the change should improve readablity. No functionality
change intended.

llvm-svn: 199214
2014-01-14 14:09:00 +00:00
Andrea Di Biagio 5448a3c771 [X86] Fix assertion failure caused by a wrong folding of vector shifts by immediate count.
This fixes a regression intruced by r198113.

Revision r198113 introduced an algorithm that tries to fold a vector shift
by immediate count into a build_vector if the input vector is a known vector
of constants.

However the algorithm only worked under the assumption that the input vector
type and the shift type are exactly the same.

This patch disables the folding of vector shift by immediate count if the
input vector type and the shift value type are not the same.

llvm-svn: 199213
2014-01-14 13:17:12 +00:00
Tim Northover 56cc5c92db ARM: add constraint that RdLo != Rn != RdHi for v5 MLA insts.
llvm-svn: 199212
2014-01-14 13:05:47 +00:00
Tim Northover c4c34b4f5c ARM: remove unused UMAALv5 node
It was incorrect anyway, since it didn't have accumulator inputs and wasn't
even supported on v5.

llvm-svn: 199211
2014-01-14 13:05:42 +00:00
Nico Rieck 9d2e0df049 Revert "Decouple dllexport/dllimport from linkage"
Revert this for now until I fix an issue in Clang with it.

This reverts commit r199204.

llvm-svn: 199207
2014-01-14 12:38:32 +00:00
Nico Rieck 1794b62f54 Revert "Handle dllexport for global aliases"
This reverts commit r199205.

llvm-svn: 199206
2014-01-14 12:36:54 +00:00
Nico Rieck 4192acdbc3 Handle dllexport for global aliases
llvm-svn: 199205
2014-01-14 11:55:40 +00:00
Nico Rieck e43aaf7967 Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199204
2014-01-14 11:55:03 +00:00
Nico Rieck da881a2742 Fix fastcall mangling of dllimported symbols
fastcall requires @ as global prefix instead of _ but getNameWithPrefix
wrongly assumes the OutName buffer is empty and replaces at index 0.
For imported functions this buffer is pre-filled with "__imp_" resulting
in broken "@_imp_foo@0" mangling.

Instead replace at the proper index. We also never have to prepend the
@-prefix because this fastcall mangling is only used on 32-bit Windows
targets which have _ has global prefix.

llvm-svn: 199203
2014-01-14 11:53:26 +00:00
NAKAMURA Takumi 23c0ab53b2 Revert r199191, "LTO: add API to set strategy for -internalize"
Please update also Other/link-opts.ll, in next time.

llvm-svn: 199197
2014-01-14 09:40:18 +00:00
Craig Topper ae11aed9d7 Separate the concept of 16-bit/32-bit operand size controlled by 0x66 prefix and the current mode from the concept of SSE instructions using 0x66 prefix as part of their encoding without being affected by the mode.
This should allow SSE instructions to be encoded correctly in 16-bit mode which r198586 probably broke.

llvm-svn: 199193
2014-01-14 07:41:20 +00:00
Duncan P. N. Exon Smith 43ea3478bf LTO: add API to set strategy for -internalize
Add API to LTOCodeGenerator to specify a strategy for the -internalize
pass.

This is a new attempt at Bill's change in r185882, which he reverted in
r188029 due to problems with the gold linker.  This puts the onus on the
linker to decide whether (and what) to internalize.

In particular, running internalize before outputting an object file may
change a 'weak' symbol into an internal one, even though that symbol
could be needed by an external object file --- e.g., with arclite.

This patch enables three strategies:

- LTO_INTERNALIZE_FULL: the default (and the old behaviour).
- LTO_INTERNALIZE_NONE: skip -internalize.
- LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden
  visibility.

LTO_INTERNALIZE_FULL should be used when linking an executable.

Outputting an object file (e.g., via ld -r) is more complicated, and
depends on whether hidden symbols should be internalized.  E.g., for
ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and
LTO_INTERNALIZE_HIDDEN can be used otherwise.  However,
LTO_INTERNALIZE_FULL is inappropriate, since the output object file will
eventually need to link with others.

lto_codegen_set_internalize_strategy() sets the strategy for subsequent
calls to lto_codegen_write_merged_modules() and lto_codegen_compile*().

<rdar://problem/14334895>

llvm-svn: 199191
2014-01-14 06:37:26 +00:00
Jakob Stoklund Olesen b6b35a4955 Always let value types influence register classes.
When creating a virtual register for a def, the value type should be
used to pick the register class. If we only use the register class
constraint on the instruction, we might pick a too large register class.

Some registers can store values of different sizes. For example, the x86
xmm registers can hold f32, f64, and 128-bit vectors. The three
different value sizes are represented by register classes with identical
register sets: FR32, FR64, and VR128. These register classes have
different spill slot sizes, so it is important to use the right one.

The register class constraint on an instruction doesn't necessarily care
about the size of the value its defining. The value type determines
that.

This fixes a problem where InstrEmitter was picking 32-bit register
classes for 64-bit values on SPARC.

llvm-svn: 199187
2014-01-14 06:18:38 +00:00
Jakob Stoklund Olesen 209120621a Switch the NEON register class from QPR to DPair.
The already allocatable DPair superclass contains odd-even D register
pair in addition to the even-odd pairs in the QPR register class. There
is no reason to constrain the set of D register pairs that can be used
for NEON values. Any NEON instructions that require a Q register will
automatically constrain the register class to QPR.

The allocation order for DPair begins with the QPR registers, so
register allocation is unlikely to change much.

llvm-svn: 199186
2014-01-14 06:18:34 +00:00
Rafael Espindola 6d5f7ce348 Replace .mips_hack_stocg with ".set micromips" and ".set nomicromips".
This matches what gnu as does and implementing this is easier than arguing
about it.

llvm-svn: 199181
2014-01-14 04:25:13 +00:00
Mark Seaborn 8271118a65 Fix llc to not reuse spill slots in functions that invoke setjmp()
We need to ensure that StackSlotColoring.cpp does not reuse stack
spill slots in functions that call "returns_twice" functions such as
setjmp(), otherwise this can lead to miscompiled code, because a stack
slot would be clobbered when it's still live.

This was already handled correctly for functions that call setjmp()
(though this wasn't covered by a test), but not for functions that
invoke setjmp().

We fix this by changing callsFunctionThatReturnsTwice() to check for
invoke instructions.

This fixes PR18244.

llvm-svn: 199180
2014-01-14 04:20:01 +00:00
Rafael Espindola 4a1a360634 Make getTargetStreamer return a possibly null pointer.
This will allow it to be called from target independent parts of the main
streamer that don't know if there is a registered target streamer or not. This
in turn will allow targets to perform extra actions at specified points in the
interface: add extra flags for some labels, extra work during finalization, etc.

llvm-svn: 199174
2014-01-14 01:21:46 +00:00
Cameron McInally f0379fa41a Fix uninitialized warning in llvm/lib/IR/DataLayout.cpp.
llvm-svn: 199147
2014-01-13 22:04:55 +00:00
Juergen Ributzka 6840282c99 [DAG] Refactor ReassociateOps - no functional change intended.
llvm-svn: 199146
2014-01-13 21:49:25 +00:00
Juergen Ributzka 7384405f23 [DAG] Teach DAG to also reassociate vector operations
This commit teaches DAG to reassociate vector ops, which in turn enables
constant folding of vector op chains that appear later on during custom lowering
and DAG combine.

Reviewed by Andrea Di Biagio

llvm-svn: 199135
2014-01-13 20:51:35 +00:00
Andrew Trick 7daf6a45f4 Hide the pre-RA-sched= option.
This is a very confusing option for a feature that will go away.

-enable-misched is exposed instead to help triage issues with the new
scheduler.

llvm-svn: 199133
2014-01-13 20:08:27 +00:00
Weiming Zhao f66be56bf7 Fix PR 18369: [Thumbv8] asserts due to inconsistent CPSR liveness of IT blocks
The issue is caused when Post-RA scheduler reorders a bundle instruction
(IT block). However, it only flips the CPSR liveness of the bundle instruction,
leaves the instructions inside the bundle unchanged, which causes inconstancy and crashes
Thumb2SizeReduction.cpp::ReduceMBB().

llvm-svn: 199127
2014-01-13 18:47:54 +00:00
Rafael Espindola 5b6c1e8e59 Update getLazyBitcodeModule to use ErrorOr for error handling.
llvm-svn: 199125
2014-01-13 18:31:04 +00:00
Andrea Di Biagio 9bc0415c1f [AArch64] Fix assertion failure caused by an invalid comparison between APInt values.
APInt only knows how to compare values with the same BitWidth and asserts
in all other cases.

With this fix, function PerformORCombine does not use the APInt equality
operator if the APInt values returned by 'isConstantSplat' differ in BitWidth.
In that case they are different and no comparison is needed.

llvm-svn: 199119
2014-01-13 16:51:00 +00:00
Joerg Sonnenberger 808df6725f Fix indentation.
llvm-svn: 199118
2014-01-13 15:50:36 +00:00
Richard Sandiford 32379b8141 [SystemZ] Optimize (sext (ashr (shl ...), ...))
...into (ashr (shl (anyext X), ...), ...), which requires one fewer
instruction.  The (anyext X) can sometimes be simplified too.

I didn't do this in DAGCombiner because widening shifts isn't a win
on all targets.

llvm-svn: 199114
2014-01-13 15:17:53 +00:00
Tim Northover 1328c1ae32 ARM: constrain Thumb LDRLIT pseudo-instructions to r0-r7.
Previously we only used GPR for the destination placeholder in "ldr rD, [pc,
incorrect codegen under the integrated assembler.

This should fix both issues (which probably only affect MachO targets at the
moment).

rdar://problem/15800156

llvm-svn: 199108
2014-01-13 14:19:17 +00:00
David Woodhouse 4e033b0e92 [x86] Fix retq/retl handling in 64-bit mode
This finishes the job started in r198756, and creates separate opcodes for
64-bit vs. 32-bit versions of the rest of the RET instructions too.

LRETL/LRETQ are interesting... I can't see any justification for their
existence in the SDM. There should be no 'LRETL' in 64-bit mode, and no
need for a REX.W prefix for LRETQ. But this is what GAS does, and my
Sandybridge CPU and an Opteron 6376 concur when tested as follows:

asm __volatile__("pushq $0x1234\nmovq $0x33,%rax\nsalq $32,%rax\norq $1f,%rax\npushq %rax\nlretl $8\n1:");
asm __volatile__("pushq $1234\npushq $0x33\npushq $1f\nlretq $8\n1:");
asm __volatile__("pushq $0x33\npushq $1f\nlretq\n1:");
asm __volatile__("pushq $0x1234\npushq $0x33\npushq $1f\nlretq $8\n1:");

cf. PR8592 and commit r118903, which added LRETQ. I only added LRETIQ to
match it.

I don't quite understand how the Intel syntax parsing for ret
instructions is working, despite r154468 allegedly fixing it. Aren't the
explicitly sized 'retw', 'retd' and 'retq' supposed to work? I have at
least made the 'lretq' work with (and indeed *require*) the 'q'.

llvm-svn: 199106
2014-01-13 14:05:59 +00:00
Chandler Carruth 73523021d0 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Elena Demikhovsky b19c9dc1a1 AVX-512: Embedded Rounding Control - encoding and printing
Changed intrinsics for vrcp14/vrcp28 vrsqrt14/vrsqrt28 - aligned with GCC.

llvm-svn: 199102
2014-01-13 12:55:03 +00:00
Chandler Carruth e509db410a [PM] Pull the generic graph algorithms and data structures for dominator
trees into the Support library.

These are all expressed in terms of the generic GraphTraits and CFG,
with no reliance on any concrete IR types. Putting them in support
clarifies that and makes the fact that the static analyzer in Clang uses
them much more sane. When moving the Dominators.h file into the IR
library I claimed that this was the right home for it but not something
I planned to work on. Oops.

So why am I doing this? It happens to be one step toward breaking the
requirement that IR verification can only be performed from inside of
a pass context, which completely blocks the implementation of
verification for the new pass manager infrastructure. Fixing it will
also allow removing the concept of the "preverify" step (WTF???) and
allow the verifier to cleanly flag functions which fail verification in
a way that precludes even computing dominance information. Currently,
that results in a fatal error even when you ask the verifier to not
fatally error. It's awesome like that.

The yak shaving will continue...

llvm-svn: 199095
2014-01-13 10:52:56 +00:00
Tim Northover 7fdd4857f7 Revert "ReMat: fix overly cavalier attitude to sub-register indices"
Very sorry, this was a premature patch that I still need to investigate and
finish off (for some reason beyond me at the moment it doesn't actually fix the
issue in all cases).

This reverts commit r199091.

llvm-svn: 199093
2014-01-13 10:49:11 +00:00
Tim Northover 59f8d4b4ee ReMat: fix overly cavalier attitude to sub-register indices
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed. First, we see if the original register
would have worked, but this is inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then
we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try
to hunt for a class between v1 and v2 that works. Unfortunately, this is also
wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current
logic would decide that v2 could be a QQ (no interest is taken in later uses).

This patch, therefore, always accepts the widened register class without trying
to be clever. Generally there is no penalty to this (e.g. in the common GR32 <
GR64 case, expanding the width doesn't matter because it's not like you were
going to do anything else with the high bits of a GR32 register). It can
increase register pressure in cases like the ARM VFP regs though (multiple
non-overlapping but equivalent subregisters). Hopefully this situation is rare
enough that it won't matter.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

llvm-svn: 199091
2014-01-13 10:47:01 +00:00
Chandler Carruth 5ad5f15cff [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Chandler Carruth 07baed53e8 Re-sort #include lines again, prior to moving headers around.
llvm-svn: 199080
2014-01-13 08:04:33 +00:00
Chandler Carruth b7bdfd65ac [PM] Wire up support for writing bitcode with new PM.
This moves the old pass creation functionality to its own header and
updates the callers of that routine. Then it adds a new PM supporting
bitcode writer to the header file, and wires that up in the opt tool.
A test is added that round-trips code into bitcode and back out using
the new pass manager.

llvm-svn: 199078
2014-01-13 07:38:24 +00:00
Kevin Qin cfef55d6d4 [AArch64 NEON] Add missing patterns for bitcast from or to v1f64
llvm-svn: 199070
2014-01-13 01:58:38 +00:00
Kevin Qin 21e8f1c4eb [AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector
This patch covered 2 more scenarios:

1.  Two operands of shuffle_vector are the same, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

2. One of operands is undef, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

After this patch, perm instructions will have chance to be emitted instead of lots of INS.

llvm-svn: 199069
2014-01-13 01:56:29 +00:00
Saleem Abdulrasool a6505ca4c2 correct target directive handling error handling
The target specific parser should return `false' if the target AsmParser handles
the directive, and `true' if the generic parser should handle the directive.
Many of the target specific directive handlers would `return Error' which does
not follow these semantics.  This change simply changes the target specific
routines to conform to the semantis of the ParseDirective correctly.

Conformance to the semantics improves diagnostics emitted for the invalid
directives.  X86 is taken as a sample to ensure that multiple diagnostics are
not presented for a single error.

llvm-svn: 199068
2014-01-13 01:15:39 +00:00
Jakob Stoklund Olesen 1995b9fead Handle bundled terminators in isBlockOnlyReachableByFallthrough.
Targets like SPARC and MIPS have delay slots and normally bundle the
delay slot instruction with the corresponding terminator.

Teach isBlockOnlyReachableByFallthrough to find any MBB operands on
bundled terminators so SPARC doesn't need to specialize this function.

llvm-svn: 199061
2014-01-12 19:24:08 +00:00
NAKAMURA Takumi 4961f7a888 raw_fd_ostream: Don't change STDERR to O_BINARY, or w*printf() (in assert()) would barf wide chars after llvm::errs().
llvm-svn: 199057
2014-01-12 16:14:24 +00:00
NAKAMURA Takumi 79addb8d8f raw_stream formatter: [Win32] Use std::signbit() if available, instead of _fpclass().
FIXME: It should be generic to C++11. For now, it is dedicated to mingw-w64.
llvm-svn: 199052
2014-01-12 14:44:46 +00:00
Nico Rieck b5262d6d8f Fix non-deterministic SDNodeOrder-dependent codegen
Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code
generation.

llvm-svn: 199050
2014-01-12 14:09:17 +00:00
Chandler Carruth 52eef8876e [PM] Add module and function printing passes for the new pass manager.
This implements the legacy passes in terms of the new ones. It adds
basic testing using explicit runs of the passes. Next up will be wiring
the basic output mechanism of opt up when the new pass manager is
engaged unless bitcode writing is requested.

llvm-svn: 199049
2014-01-12 12:15:39 +00:00
Chandler Carruth e0af664cd8 [PM] Simplify the IR printing passes significantly now that a narrower
API is exposed.

This removes the support for deleting the ostream, switches the member
and constructor order arround to be consistent with the creation
routines, and switches to using references.

llvm-svn: 199047
2014-01-12 11:40:03 +00:00
Chandler Carruth 9d805139bd [PM] Simplify the interface exposed for IR printing passes.
Nothing was using the ability of the pass to delete the raw_ostream it
printed to, and nothing was trying to pass it a pointer to the
raw_ostream. Also, the function variant had a different order of
arguments from all of the others which was just really confusing. Now
the interface accepts a reference, doesn't offer to delete it, and uses
a consistent order. The implementation of the printing passes haven't
been updated with this simplification, this is just the API switch.

llvm-svn: 199044
2014-01-12 11:30:46 +00:00
Chandler Carruth 3dd261d0c9 [PM] Run clang-format and remove redundant or obvious comments before
the heavy factoring needed to share logic between the new pass manager
and the old.

llvm-svn: 199043
2014-01-12 11:16:01 +00:00
Chandler Carruth b8ddc7043c [PM] Rename the IR printing pass header to a more generic and correct
name to match the source file which I got earlier. Update the include
sites. Also modernize the comments in the header to use the more
recommended doxygen style.

llvm-svn: 199041
2014-01-12 11:10:32 +00:00
Saleem Abdulrasool bdae4b8743 ARM IAS: fix diagnostics of improper qualification
An improper qualifier would result in a superfluous error due to the parser not
consuming the remainder of the statement.  Simply consume the remainder of the
statement to avoid the error.

llvm-svn: 199035
2014-01-12 05:25:44 +00:00
Venkatraman Govindaraju cd4d9ac62a [Sparc] Add support for parsing floating point instructions.
llvm-svn: 199033
2014-01-12 04:48:54 +00:00
Saleem Abdulrasool fb3950ec63 ARM: change implicit immediate forms of {ld,st}r{,b}t to psuedo-instructions
The implicit immediate 0 forms are assembly aliases, not distinct instruction
encodings.  Fix the initial implementation introduced in r198914 to an alias to
avoid two separate instruction definitions for the same encoding.

An InstAlias is insufficient in this case as the necessary due to the need to
add a new additional operand for the implicit zero.  By using the AsmPsuedoInst,
fall back to the C++ code to transform the instruction to the equivalent
_POST_IMM form, inserting the additional implicit immediate 0.

llvm-svn: 199032
2014-01-12 04:36:01 +00:00
Venkatraman Govindaraju 0b9debf1f6 [Sparc] Replace (unsigned)-1 with ~OU as suggested by Reid Kleckner.
llvm-svn: 199031
2014-01-12 04:34:31 +00:00
Jakob Stoklund Olesen e7084a1c5c The SPARCv9 ABI returns a float in %f0.
This is different from the argument passing convention which puts the
first float argument in %f1.

With this patch, all returned floats are treated as if the 'inreg' flag
were set. This means multiple float return values get packed in %f0,
%f1, %f2, ...

Note that when returning a struct in registers, clang will set the
'inreg' flag on the return value, so that behavior is unchanged. This
also happens when returning a float _Complex.

llvm-svn: 199028
2014-01-12 04:13:17 +00:00
Joerg Sonnenberger 485f00fe0f Add missing mul aliases for armv4 support. Add checks that armv4 can
assemble the various mul instructions.

llvm-svn: 199026
2014-01-12 03:35:18 +00:00
Hans Wennborg ac114a3ce7 Switch-to-lookup tables: Don't require a result for the default
case when the lookup table doesn't have any holes.

This means we can build a lookup table for switches like this:

  switch (x) {
    case 0: return 1;
    case 1: return 2;
    case 2: return 3;
    case 3: return 4;
    default: exit(1);
  }

The default case doesn't yield a constant result here, but that doesn't matter,
since a default result is only necessary for filling holes in the lookup table,
and this table doesn't have any holes.

This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB
off the resulting clang binary.

llvm-svn: 199025
2014-01-12 00:44:41 +00:00
Venkatraman Govindaraju a66b314c34 [Sparc] Add missing processor types: v7 and niagara
llvm-svn: 199024
2014-01-11 23:56:13 +00:00
Saleem Abdulrasool 2d48edeca3 ARM IAS: support emitting constant values in target expressions
A 32-bit immediate value can be formed from a constant expression and loaded
into a register.  Add support to emit this into an object file.  Because this
value is a constant, a relocation must *not* be produced for it.

llvm-svn: 199023
2014-01-11 23:03:48 +00:00
Arnold Schwaighofer 66c742aeea LoopVectorizer: Enable strided memory accesses versioning per default
I saw no compile or execution time regressions on x86_64 -mavx -O3.

radar://13075509

llvm-svn: 199015
2014-01-11 20:40:34 +00:00
Venkatraman Govindaraju 0653218b2b [Sparc] Bundle instruction with delay slow and its filler. Now, we can use -verify-machineinstrs with SPARC backend.
llvm-svn: 199014
2014-01-11 19:38:03 +00:00
Alp Toker 798060e006 Fix 'ned' typo in doc comment
Patch by Jasper Neumann!

llvm-svn: 199007
2014-01-11 14:01:43 +00:00
Chandler Carruth a13f27cc34 [PM] Add names to passes under the new pass manager, and a debug output
mode that can be used to debug the execution of everything.

No support for analyses here, that will come later. This already helps
show parts of the opt commandline integration that isn't working. Tests
of that will start using it as the bugs are fixed.

llvm-svn: 199004
2014-01-11 11:52:05 +00:00