Commit Graph

71354 Commits

Author SHA1 Message Date
Tim Northover 4e80b584fe ARM: support legalisation of "fptrunc ... to half" operations.
llvm-svn: 213373
2014-07-18 13:01:19 +00:00
Tim Northover 20bd0ced30 CodeGen: soften f16 type by default instead of marking legal.
Actual support for softening f16 operations is still limited, and can be added
when it's needed.  But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.

Longer term, we probably want something between Soften and Promote semantics
for most targets, it'll be more efficient to promote the 4 basic operations to
f32 than libcall them.

llvm-svn: 213372
2014-07-18 12:41:46 +00:00
Renato Golin e48d9dc15e Suppress 'not handled in switch' warning
llvm-svn: 213371
2014-07-18 12:13:04 +00:00
Tilmann Scheller 0fc933d6b8 [ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

This fixes PR20323.

llvm-svn: 213369
2014-07-18 12:05:49 +00:00
Renato Golin c17a07b36a Refactor ARM subarchitecture parsing
Re-commit of a patch to rework the triple parsing on ARM to a more sane
model.

Patch by Gabor Ballabas.

llvm-svn: 213367
2014-07-18 12:00:48 +00:00
Artyom Skrobov 78d5daf8ce extracting swapStruct into include/llvm/Support/MachO.h (no functional change)
llvm-svn: 213361
2014-07-18 09:26:16 +00:00
Tim Northover f861de3d7b R600: support f16 -> f64 conversion intrinsic.
Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.

llvm-svn: 213357
2014-07-18 08:43:24 +00:00
Tim Northover 5e54fe14a4 NVPTX: support direct f16 <-> f64 conversions via intrinsics.
Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.

llvm-svn: 213356
2014-07-18 08:30:10 +00:00
Hal Finkel e15442c8aa Rename AlignAttribute to IntAttribute
Currently the only kind of integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).

No functionality change intended.

llvm-svn: 213352
2014-07-18 06:51:55 +00:00
Matt Arsenault 3dd43fc75d R600: Implement TTI:getPopcntSupport
The test is just copied from X86, and I don't know of a better
way to test it.

llvm-svn: 213351
2014-07-18 06:07:13 +00:00
Jim Grosbach b6535c32f5 X86: Constant fold converting vector setcc results to float.
Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the SSE code is generated as:
LCPI0_0:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0
  retq

After, the code is improved to:
LCPI0_0:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq

The cvtdq2ps has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.

llvm-svn: 213342
2014-07-18 00:40:56 +00:00
Jim Grosbach f7502c4884 AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791

llvm-svn: 213341
2014-07-18 00:40:52 +00:00
Michael J. Spencer 1eb023013e Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation."
There's a bug where this can create cycles in the DAG. It will take a bit
to fix, so I'm backing it out for now.

llvm-svn: 213339
2014-07-18 00:15:50 +00:00
Eric Christopher 8ef7a6a15b Reset the Subtarget in the AsmPrinter for each machine function
and add explanatory comment about dual initialization. Fix
use of the Subtarget to grab the information off of the target machine.

llvm-svn: 213336
2014-07-18 00:08:53 +00:00
Eric Christopher 7394e23423 Avoid resetting the UseSoftFloat and FloatABIType on the TargetMachine
Options struct and move the comment to inMips16HardFloat. Use the
fact that we now know whether or not we cared about soft float to
set the libcalls.
Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and
propagate since it's no longer CPU specific.

llvm-svn: 213335
2014-07-18 00:08:50 +00:00
Lang Hames e5fc826f88 [MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly
relaxed in the big RuntimeDyldMachO cleanup of r213293.

No test case yet - this was found via inspection and there's no easy way to test
GOT alignment in RuntimeDyldChecker at the moment. I'm working on adding support
for this now, and hope to have a test case for this soon.

llvm-svn: 213331
2014-07-17 23:11:30 +00:00
Nico Weber 42f79dbf02 ms inline asm: Don't add x86 segment registers to the clobber list.
Clang tries to check the clobber list but doesn't list segment registers in its
x86 register list. This fixes PR20343.

llvm-svn: 213303
2014-07-17 20:24:55 +00:00
Alp Toker 11698180c3 Drop the udis86 wrapper from llvm::sys
This optional dependency on the udis86 library was added some time back to aid
JIT development, but doesn't make much sense to link into LLVM binaries these
days.

llvm-svn: 213300
2014-07-17 20:05:29 +00:00
Arnaud A. de Grandmaison d3d671604f [AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us.
llvm-svn: 213296
2014-07-17 19:08:14 +00:00
Suyog Sarda 1a212203bc Rectify r213231. Use proper version of 'ComputeNumSignBits'.
Earlier when the code was in InstCombine, we were calling the version of ComputeNumSignBits in InstCombine.h
that automatically added the DataLayout* before calling into ValueTracking.
When the code moved to InstSimplify, we are calling into ValueTracking directly without passing in the DataLayout*.
This patch rectifies the same by passing DataLayout in ComputeNumSignBits.

llvm-svn: 213295
2014-07-17 19:07:00 +00:00
Lang Hames a521688cf4 [MCJIT] Significantly refactor the RuntimeDyldMachO class.
The previous implementation of RuntimeDyldMachO mixed logic for all targets
within a single class, creating problems for readability, maintainability, and
performance. To address these issues, this patch strips the RuntimeDyldMachO
class down to just target-independent functionality, and moves all
target-specific functionality into target-specific subclasses RuntimeDyldMachO.

The new class hierarchy is as follows:

class RuntimeDyldMachO
Implemented in RuntimeDyldMachO.{h,cpp}
Contains logic that is completely independent of the target. This consists
mostly of MachO helper utilities which the derived classes use to get their
work done.


template <typename Impl>
class RuntimeDyldMachOCRTPBase<Impl> : public RuntimeDyldMachO
Implemented in RuntimeDyldMachO.h
Contains generic MachO algorithms/data structures that defer to the Impl class
for target-specific behaviors.

RuntimeDyldMachOARM : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM>
RuntimeDyldMachOARM64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM64>
RuntimeDyldMachOI386 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOI386>
RuntimeDyldMachOX86_64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOX86_64>
Implemented in their respective *.h files in lib/ExecutionEngine/RuntimeDyld/MachOTargets
Each of these contains the relocation logic specific to their target architecture.

llvm-svn: 213293
2014-07-17 18:54:50 +00:00
Alexey Samsonov 535b6f9361 [ASan] Don't instrument load/stores with !nosanitize metadata.
This is used to avoid instrumentation of instructions added by UBSan
in Clang frontend (see r213291). This fixes PR20085.

Reviewed in http://reviews.llvm.org/D4544.

llvm-svn: 213292
2014-07-17 18:48:12 +00:00
Hans Wennborg 0fc52d4e85 Typo: exists -> exits
llvm-svn: 213290
2014-07-17 18:33:44 +00:00
Justin Holewinski 428cf0e49a [NVPTX] Improve handling of FP fusion
We now consider the FPOpFusion flag when determining whether
to fuse ops.  We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its
own.

llvm-svn: 213287
2014-07-17 18:10:09 +00:00
Matt Arsenault 97483694e7 Fix typos
llvm-svn: 213285
2014-07-17 17:50:22 +00:00
Adam Nemet 5933c2f824 [X86] AVX512: Add disassembler support for compressed displacement
There are two parts here.  First is to modify tablegen to adjust the encoding
type ENCODING_RM with the scaling factor.

The second is to use the new encoding types to compute the correct
displacement in the decoder.

Fixes <rdar://problem/17608489>

llvm-svn: 213281
2014-07-17 17:04:56 +00:00
Adam Nemet 4c339abab3 [X86] AVX512: Rename EVEX_CD8V to CD8_Form
This is to match the naming of CD8_EltSize, CD8_Scale, etc.

No functional change.

llvm-svn: 213280
2014-07-17 17:04:52 +00:00
Adam Nemet 54adb0fcbc [X86] AVX512: Use the TD version of CD8_Scale in the assembler
Passes the computed scaling factor in TSFlags rather than the old attributes.

Also removes the C++ version of computing the scaling factor (MemObjSize)
along with the asserts added by the previous patch.

No functional change.

llvm-svn: 213279
2014-07-17 17:04:50 +00:00
Adam Nemet 4dc92b9a84 [X86] AVX512: Move compressed displacement logic to TD
This does not actually move the logic yet but reimplements it in the Tablegen
language.  Then asserts that the new implementation results in the same value.

The next patch will remove the assert and the temporary use of the TSFlags and
remove the C++ implementation.

The formula requires a limited form of the logical left and right operators.
I implemented these with the bit-extract/insert operator (i.e. blah{bits}).

No functional change.

llvm-svn: 213278
2014-07-17 17:04:34 +00:00
Adam Nemet 017fca0272 [TableGen] Allow shift operators to take bits<n>
Convert the operand to int if possible, i.e. if the value is properly
initialized.  (I suppose there is further room for improvement here to also
peform the shift if the uninitialized bits are shifted out.)

With this little change we can now compute the scaling factor for compressed
displacement with pure tablegen code in the X86 backend.  This is useful
because both the X86-disassembler-specific part of tablegen and the assembler
need this and TD is the natural sharing place.

The patch also adds the missing documentation for the shift and add operator.

llvm-svn: 213277
2014-07-17 17:04:27 +00:00
Justin Holewinski e5a1173f67 [NVPTX] Add missing .v4 qualifier on vector store instruction
llvm-svn: 213276
2014-07-17 16:58:56 +00:00
Saleem Abdulrasool 7d09530cef MC: correct DWARF header for PE/COFF assembly input
The header contains an offset to the DWARF abbreviations for the CU.  The offset
must be section relative for COFF and absolute for others.  The non-assembly
code path for the DWARF header generation already had the correct emission for
the headers.  This corrects just the assembly path.  Due to the invalid
relocation, processing of the debug information would halt previously on the
first assembly input as the associated abbreviations would be out of range as
they would have the location increased by image base and the section offset.

This address PR20332.

llvm-svn: 213275
2014-07-17 16:27:44 +00:00
Saleem Abdulrasool 862e60c75c MC: fix MCAsmInfo usage for windows-itanium
Windows itanium uses the GNUCOFF assmebly format, not ELF.

llvm-svn: 213274
2014-07-17 16:27:40 +00:00
Saleem Abdulrasool 19f8bc65f6 MC: collapse emission of producer
Rather than use three EmitBytes, concatenate the string at compile time,
constructing a single StringRef and emitting the data in one shot.  This also
creates nicer assembly output.  NFC.

llvm-svn: 213273
2014-07-17 16:27:35 +00:00
Justin Holewinski 18cfe7d634 [NVPTX] Flag surface/texture query instructions with IsTexSurfQuery
Also, add some tests to make sure we can handle surface/texture
queries on both Fermi and Kepler+.

llvm-svn: 213268
2014-07-17 14:51:33 +00:00
Justin Holewinski 9a2350e459 [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch
This also uses TSFlags to mark machine instructions that are surface/texture
accesses, as well as the vector width for surface operations.  This is used
to simplify some of the switch statements that need to detect surface/texture
instructions

llvm-svn: 213256
2014-07-17 11:59:04 +00:00
Tim Northover 53f3bcf0e9 ARM: support direct f16 <-> f64 conversions
ARMv8 has instructions to handle it, otherwise a libcall is needed.

llvm-svn: 213254
2014-07-17 11:27:04 +00:00
Tim Northover 84ce0a642e CodeGen: generate single libcall for fptrunc -> f16 operations.
Previously we asserted on this code. Currently compiler-rt doesn't
actually implement any of these new libcalls, but external help is
pretty much the only viable option for LLVM.

I've followed the much more generic "__truncST2" naming, as opposed to
the odd name for f32 -> f16 truncation. This can obviously be changed
later, or overridden by any targets that need to.

llvm-svn: 213252
2014-07-17 11:12:12 +00:00
Tim Northover 2131044814 X86: support double extension of f16 type.
x86 has no native ability to extend an f16 to f64, but the same result
is obtained if we expand it into two separate extensions: f16 -> f32
-> f64.

Unfortunately the same is not true for truncate, so that still results
in a compilation failure.

llvm-svn: 213251
2014-07-17 11:04:04 +00:00
Tim Northover fd7e424935 CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

llvm-svn: 213248
2014-07-17 10:51:23 +00:00
Yi Kong 2355066e43 Port memory barriers intrinsics to AArch64
Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.

This patch ports ARM dmb, dsb, isb intrinsic to AArch64.

Differential Revision: http://reviews.llvm.org/D4520

llvm-svn: 213247
2014-07-17 10:50:20 +00:00
Daniel Sanders 701e961650 [mips] .reginfo is 8 byte aligned on N32.
Differential Revision: http://reviews.llvm.org/D4540

llvm-svn: 213246
2014-07-17 10:10:04 +00:00
Daniel Sanders 7f70573ed9 [mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple
Summary:
Generally speaking, mips-* vs mips64-* should not be used to make decisions
about the content or format of the ELF. This should be based on the ABI
and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64`
should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`.
Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as
should `mips-linux-gnu-clang -mips64r2 -mabi=n32`.

This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now
since there is no apparent way to base this decision on the ABI and CPU.

Differential Revision: http://reviews.llvm.org/D4539

llvm-svn: 213244
2014-07-17 10:02:08 +00:00
Daniel Sanders 185f23adbc [mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6
Summary:
The cpr1_size field describes the minimum register width to run the program
rather than the size of the registers on the target. MIPS32r6 was acting
as if -mfp64 has been given because it starts off with 64-bit FPU registers.

Differential Revision: http://reviews.llvm.org/D4538

llvm-svn: 213243
2014-07-17 09:57:23 +00:00
Daniel Sanders 16ec6c1939 [mips] Fix ELF e_flags related to -mabicalls and -mplt.
Summary:
These options are not implemented yet but we act as if they are always
given.

The integrated assembler is driven by the clang driver so the e_flag test
cases should match the e_flags emitted by GCC+GAS rather than GAS
by itself.

Differential Revision: http://reviews.llvm.org/D4536

llvm-svn: 213242
2014-07-17 09:52:56 +00:00
Yi Kong 7d78ab5753 Fix the prefix for arm64 triple
Triple.cpp still returns "arm64" as prefix for arm64 triple, causing Clang not
being able to select the correct GCCBuiltin IR.

This patch changes the value to correct prefix "aarch64". Regression test will
be added in the coming patch.

Differential Revision: http://reviews.llvm.org/D4516

llvm-svn: 213240
2014-07-17 09:43:27 +00:00
Evgeniy Stepanov c8227aa14d [msan] Avoid redundant origin stores.
Origin is meaningless for fully initialized values. Avoid
storing origin for function arguments that are known to
be always initialized (i.e. shadow is a compile-time null
constant).

This is not about correctness, but purely an optimization.
Seems to affect compilation time of blacklisted functions
significantly.

llvm-svn: 213239
2014-07-17 09:10:37 +00:00
Suyog Sarda 68862414b5 Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.

Differential Revision: http://reviews.llvm.org/D4102
 

llvm-svn: 213231
2014-07-17 06:28:15 +00:00
Matt Arsenault ac6e39cf3b Use range for
llvm-svn: 213230
2014-07-17 06:19:06 +00:00
Matt Arsenault 5e2b0f51e7 R600: Short circuit alloca check if address space isn't private.
Skip calling GetUnderlyingObject in cases where it obviously
isn't from an alloca. This should only be a compile time improvement.

llvm-svn: 213229
2014-07-17 06:13:41 +00:00
Suyog Sarda de409fd798 Fix Typo (first commit to test commit access)
llvm-svn: 213228
2014-07-17 06:09:34 +00:00
Saleem Abdulrasool ab820860fa MC: make WinEH opcode an opaque value
This makes the opcode an opaque value (unsigned int) rather than the
enumeration.  This permits the use of target specific operands.

Split out the generic type into a MCWinEH header and add a supporting
MCWin64EH::Instruction to abstract out the selection of the opcode and
construction of the actual instruction.

llvm-svn: 213221
2014-07-17 03:08:50 +00:00
Hal Finkel 354e23b029 Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.

Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.

Original commit message:

BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.

Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.

This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.

Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).

llvm-svn: 213219
2014-07-17 01:28:25 +00:00
Jingyue Wu 0bdc027e31 Partially revert r210444 due to performance regression
Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting

... = array[zext(a)]
... = array[zext(a) + 1]

to

... = array[sext(a)]
... = array[zext(a) + 1],

the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.

Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.

Test Plan: make check-all

Reviewers: eliben, meheff

Reviewed By: meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D4542

llvm-svn: 213209
2014-07-16 23:25:00 +00:00
Sanjay Patel d3bbfa1cb6 Fixed formatting, removed bug reference, renamed testcase
Thanks to Duncan Exon Smith for reviewing and cleanup suggestions. 

llvm-svn: 213205
2014-07-16 22:40:28 +00:00
Juergen Ributzka 618ce3e85e [FastISel] Local values shouldn't be alive across an inline asm call with side effects.
This fixes an issue where a local value is defined before and used after an
inline asm call with side effects.

This fix simply flushes the local value map, which updates the insertion point
for the inline asm call to be above any previously defined local values.

This fixes <rdar://problem/17694203>

llvm-svn: 213203
2014-07-16 22:20:51 +00:00
Lang Hames 8b20530689 [MCJIT] Improve a RuntimeDyldChecker diagnostic.
When a RuntimeDyldChecker test requests an invalid operand for an instruction,
print the decoded instruction to aid diagnosis.

llvm-svn: 213202
2014-07-16 22:02:20 +00:00
Sanjay Patel ab60d04363 trivial fix for PR20314
Make sure that the AddrInst is an Instruction.

llvm-svn: 213197
2014-07-16 21:08:10 +00:00
Sanjay Patel 6360441f99 Remove Atom references in description.
Any CPU can run this pass.

llvm-svn: 213190
2014-07-16 20:18:49 +00:00
Manuel Jacob 405fb18213 Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.
llvm-svn: 213189
2014-07-16 20:13:45 +00:00
Chris Bieneman df4b763be5 [RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.
llvm-svn: 213188
2014-07-16 20:13:31 +00:00
Justin Holewinski ac451066f4 [NVPTX] Honor alignment on vector loads/stores
We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.

Now, for IR like:

  %1 = load <4 x float>, <4 x float>* %ptr, align 4

we will generate correct, conservative PTX like:

  ld.f32 ... [%ptr]
  ld.f32 ... [%ptr+4]
  ld.f32 ... [%ptr+8]
  ld.f32 ... [%ptr+12]

Or if we have an alignment of 8 (for example), we can
generate code like:

  ld.v2.f32 ... [%ptr]
  ld.v2.f32 ... [%ptr+8]

llvm-svn: 213186
2014-07-16 19:45:35 +00:00
David Blaikie 07557be68f Remove unnecessary/redundant std::move
(run returns unique_ptr by value already)

llvm-svn: 213174
2014-07-16 17:09:21 +00:00
Chris Bieneman 80a866a316 Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations.
No functional code changes.

llvm-svn: 213169
2014-07-16 16:27:31 +00:00
Justin Holewinski 3e037d98e6 [NVPTX] Rename registers %fl -> %fd and %rl -> %rd
This matches the internal behavior of NVIDIA tools like libnvvm.

llvm-svn: 213168
2014-07-16 16:26:58 +00:00
Tim Northover 7f3e11e7c0 CodeGen: don't form illegail EXTLOAD operations.
It turns out that in most cases (the main exception being i1-related
types) once these operations are formed we cannot separate them and
the targets end up having to deal with them whether they want to or
not.

This is not a good situation, and a more reasonable default can be
formed by ackowledging this and having targets leave them as Legal.
Only x86 seems to be affected (other targets don't even try marking
the operation Expand).

Mostly there's no visible change here yet, but it will be useful to
have truly expanded EXTLOADS for MVT::f16 softening support.

llvm-svn: 213162
2014-07-16 15:37:24 +00:00
Daniel Sanders 68dcb4ffa3 [mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI.
Summary:
A few instructions (mostly cvt.d.w and similar) are causing problems with
-mfp64 and -mno-odd-spreg and it looks like fixing it properly may
take several weeks. In the meantime, let's disable the odd-numbered
double-precision registers so that the generated code is at least valid.

The problem is that instructions like cvt.d.w read from the 32-bit low
subregister of a double-precision FPU register. This often leads to the compiler
to inserting moves to transfer a GPR32 to a FGR32 using mtc1. Such moves
violate the rules against 32-bit writes to odd-numbered FPU registers imposed
by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it
becomes impossible for the 32-bit low subregister to be odd-numbered.

This fixes numerous test-suite failures when compiling for the FP64A ABI
('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to
test that odd-numbered FPU registers are not allocatable. Instead, we depend on
the assembler (GAS and -fintegrated-as) raising errors when the rules are
violated.

Differential Revision: http://reviews.llvm.org/D4532

llvm-svn: 213160
2014-07-16 15:34:07 +00:00
Andrea Di Biagio a03624d8ab [X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.
Before this change, method 'isShuffleMaskLegal' didn't know that shuffles
implementing a 'movhlps' operation were perfectly legal for SSE targets.

This patch adds the missing check for 'isMOVHLPSMask' inside method
'isShuffleMaskLegal' to fix the problem.

The reason why it is important to do this is because the DAGCombiner
conservatively avoids combining a pair of shuffles if the resulting shuffle
node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were
wrongly considered not to be legal. This was the root cause of some poor-code
generation bugs.

llvm-svn: 213137
2014-07-16 11:29:39 +00:00
Reid Kleckner 56b56ea15b Roundtrip the inalloca bit on allocas through bitcode
This was an oversight in the original support.  As it is, I stuffed this
bit into the alignment.  The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.

Reviewers: nicholas

Differential Revision: http://reviews.llvm.org/D3943

llvm-svn: 213118
2014-07-16 01:34:27 +00:00
Manuel Jacob b4db99c76a Fix comment in InstCombiner::visitAddrSpaceCast.
In the original version of the patch the behaviour was like described in
the comment.  This behaviour was changed before committing it without
updating the comment.

llvm-svn: 213117
2014-07-16 01:34:21 +00:00
Hans Wennborg 21f0f13394 Perform wildcard expansion in Process::GetArgumentVector on Windows (PR17098)
On Windows, wildcard expansion isn't performed by the shell, but left to the
program itself. The common way to do this is to link with setargv.obj, which
performs the expansion on argc/argv before main is entered. However, we don't
use argv in Clang on Windows, but instead call GetCommandLineW so we can handle
unicode arguments. This means we have to do wildcard expansion ourselves.

A test case will be added on the Clang side.

Differential Revision: http://reviews.llvm.org/D4529

llvm-svn: 213114
2014-07-16 00:52:11 +00:00
Tyler Nowicki 641d8a06bd Emit warnings if vectorization is forced and fails.
This patch modifies the existing DiagnosticInfo system to create a generic base
class that is inherited to produce diagnostic-based warnings. This is used by
the loop vectorizer to trigger a warning when vectorization is forced and
fails. Several tests have been added to verify this behavior.

Reviewed by: Arnold Schwaighofer

llvm-svn: 213110
2014-07-16 00:36:00 +00:00
Juergen Ributzka 480872b4ce Remove TLI from isInTailCallPosition's arguments. NFC.
There is no need to pass on TLI separately to the function. As Eric pointed out
the Target Machine already provides everything we need.

llvm-svn: 213108
2014-07-16 00:01:22 +00:00
Matt Arsenault 22ca3f8860 R600/SI: Allow using f32 rcp / rsq when denormals not handled.
These are precise enough to use for OpenCL unless denormals
are handled.

llvm-svn: 213107
2014-07-15 23:50:10 +00:00
David Majnemer 3821ff03cd X86: Simplify X86WindowsTargetObjectFile::getSectionForConstant
There exists a helper function to abstract away the various differences
between ConstantVector, ConstantDataVector, ConstantAggregateZero, etc.

Use it to simplify X86WindowsTargetObjectFile::getSectionForConstant.

llvm-svn: 213104
2014-07-15 23:01:10 +00:00
Sanjay Patel a2f658d69d Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended

    Removed PostRAScheduler bits from subtargets (X86, ARM).
    Added PostRAScheduler bit to MCSchedModel class.
    This bit is set by a CPU's scheduling model (if it exists).
    Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
    Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
    Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
    Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
    Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: 
       a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. 
       b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. 
       c. PPC overrides the CPU's postRA settings by enabling postRA for everything. 
       d. X86 is the only target that actually has postRA specified via sched model info.

Differential Revision: http://reviews.llvm.org/D4217

llvm-svn: 213101
2014-07-15 22:39:58 +00:00
Peter Collingbourne 9947c49812 [dfsan] Introduce further optimization to reduce the number of union queries.
Specifically, do not compute a union if it is statically known that one
shadow set subsumes the other.

llvm-svn: 213100
2014-07-15 22:13:19 +00:00
Matt Arsenault 0d89e849bd R600/SI: Fix select on i1
llvm-svn: 213096
2014-07-15 21:44:37 +00:00
Matt Arsenault e9fa3b8e6b R600/SI: Implement less wrong f32 fdiv
Assuming single precision denormals and accurate sqrt/div are not
reported, this passes the OpenCL conformance test.

llvm-svn: 213089
2014-07-15 20:18:31 +00:00
Matt Arsenault 1d077749ea R600: Add predicate for UnsafeFPMath
llvm-svn: 213088
2014-07-15 20:18:24 +00:00
Matt Arsenault 84446a026b R600: Remove intrinsics that appear to be unused
llvm-svn: 213087
2014-07-15 20:10:27 +00:00
Lang Hames 84bc818baf [RuntimeDyld] Revert r211652 - MachO object GDB registration support.
The registration scheme used in r211652 violated the read-only contract of
MemoryBuffer. This caused crashes in llvm-rtdyld where macho objects were backed
by read-only mmap'd memory.

llvm-svn: 213086
2014-07-15 19:35:22 +00:00
Chris Bieneman 03695ab57e [RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing.
The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing.

This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825.

llvm-svn: 213078
2014-07-15 17:18:41 +00:00
Cameron McInally 44f3e30cf2 Revert r213070. It's breaking the build in MCELFStreamer::EmitInstToData(...).
llvm-svn: 213073
2014-07-15 16:24:24 +00:00
Jan Vesely 6ddb8dd442 R600: Implement zero undef variants of ctlz/cttz
v2: use ffbh/l if available
v3: Rebase on top of Matt's SI patches

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 213072
2014-07-15 15:51:09 +00:00
Daniel Sanders a6e125f07e [mips] Correct .MIPS.abiflags fp_abi field for -mfpxx and without .module
Summary: Previously all the test cases set it after initialization with '.module fp=xx'.

Differential Revision: http://reviews.llvm.org/D4489

llvm-svn: 213071
2014-07-15 15:31:39 +00:00
Cameron McInally 53bc7a3330 Add x86 patterns to match a specific add-with-carry.
llvm-svn: 213070
2014-07-15 15:03:32 +00:00
Andrea Di Biagio bd5555cc3f [DAGCombiner] Add more rules to fold shuffles.
This patch adds two new rules to the DAGCombiner:
 1.  shuffle (shuffle A, Undef, M0), B, M1 -> shuffle A, B, M2
 2.  shuffle (shuffle A, Undef, M0), A, M1 -> shuffle A, Undef, M2

We only do this if the combined shuffle is legal for the target.

Example:
;;
define <4 x float> @test(<4 x float> %a, <4 x float> %b) {
  %1 = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32><i32 6, i32 0, i32 1, i32 7>
  %2 = shufflevector <4 x float> %1, <4 x float> %b, <4 x i32><i32 1, i32 2, i32 4, i32 5>
  ret <4 x i32> %2
}
;;

(using llc -mcpu=corei7 -march=x86-64)
Before, the x86 backend generated:
  pshufd $120, %xmm0, %xmm0
  shufps $-108, %xmm0, %xmm1
  movaps %xmm1, %xmm0

Now the x86 backend generates:
  movsd %xmm1, %xmm0

llvm-svn: 213069
2014-07-15 13:26:28 +00:00
NAKAMURA Takumi 04b8b37f56 Prune Redundant libdeps in CMake's target_link_libraries and LLVMBuild.txt.
I checked this with Release+Asserts on x86_64-mingw32. Please restore partially if this were overkill.

llvm-svn: 213064
2014-07-15 11:37:03 +00:00
Andrea Di Biagio 04d5a7b337 Silence a warning in conditional expression.
Fixes a gcc warning caused by a typo. A redundant assignment operation was
accidentally used as the third operand of a conditional expression.
No functional change intended.

llvm-svn: 213061
2014-07-15 10:53:44 +00:00
Stepan Dyatkovskiy dee612d4f6 MergeFunc patch from Björn Steinbrink.
Phabricator ticket: D4246, Don't merge functions with different range metadata on call/invoke.
Thanks!

llvm-svn: 213060
2014-07-15 10:46:51 +00:00
Tim Northover e4b8e138e1 AArch64: fall back to generic code for out of range extract/insert.
rdar://problem/17624784

llvm-svn: 213059
2014-07-15 10:00:26 +00:00
David Majnemer d4d9944416 Fix typo in comment
No functionality changed.

llvm-svn: 213052
2014-07-15 07:11:32 +00:00
Juergen Ributzka 8f073c8d60 [FastISel][X86] Remove no longer needed functions.
llvm-svn: 213051
2014-07-15 06:35:53 +00:00
Juergen Ributzka 3566c08dd9 [FastISel][X86] Implement the FastLowerIntrinsicCall hook.
Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively
implements the target hook.

llvm-svn: 213050
2014-07-15 06:35:50 +00:00
Juergen Ributzka 23d43318c7 [FastISel][X86] Implement the FastLowerCall hook.
This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.

This should also enable patchpoint intrinsic lowering for FastISel on X86.

Related to <rdar://problem/17427052>.

llvm-svn: 213049
2014-07-15 06:35:47 +00:00
Juergen Ributzka 5ee9d90248 Revert "[FastISel][X86] Remove no longer needed functions."
Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook."
Revert "[FastISel][X86] Implement the FastLowerCall hook."

This reverts commit r213035, r213036, and r213037 to make the
buildbots happy again.

llvm-svn: 213048
2014-07-15 05:23:40 +00:00
Peter Collingbourne 705a1ae3c8 [dfsan] Introduce an optimization to reduce the number of union queries.
Specifically, when building a union query, if we are dominated by an identical
query then use the result of that query instead.

llvm-svn: 213047
2014-07-15 04:41:17 +00:00
Peter Collingbourne 83def1cbe7 [dfsan] Move combineShadows to DFSanFunction in preparation for it to use a domtree.
llvm-svn: 213046
2014-07-15 04:41:14 +00:00
Peter Collingbourne 818f5c4837 Give SplitBlockAndInsertIfThen the ability to update a domtree.
llvm-svn: 213045
2014-07-15 04:40:27 +00:00
David Majnemer 4e3ccc0505 CodeGen: Handle ConstantVector and undef in WinCOFF constant pools
The constant pool entry code for WinCOFF assumed that vector constants
would be formed using ConstantDataVector, it did not expect to see a
ConstantVector.  Furthermore, it did not expect undef as one of the
elements of the vector.

ConstantVectors should be handled like ConstantDataVectors, treat Undef
as zero.

llvm-svn: 213038
2014-07-15 02:34:12 +00:00
Juergen Ributzka 9fbf33d70f [FastISel][X86] Remove no longer needed functions.
llvm-svn: 213037
2014-07-15 02:22:56 +00:00
Juergen Ributzka 170f9354bb [FastISel][X86] Implement the FastLowerIntrinsicCall hook.
Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively
implements the target hook.

llvm-svn: 213036
2014-07-15 02:22:53 +00:00
Juergen Ributzka a9cced8a94 [FastISel][X86] Implement the FastLowerCall hook.
This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.

This should also enable patchpoint intrinsic lowering for FastISel on X86.

Related to <rdar://problem/17427052>.

llvm-svn: 213035
2014-07-15 02:22:49 +00:00
Juergen Ributzka 718bb71ade [FastISel] Insert patchpoint instruction before the target generated call instruction.
The patchpoint instruction should have been inserted before the target
generated call instruction to be inside the ADJSTACKDOWN/ADJSTACKUP call
sequence window.

llvm-svn: 213034
2014-07-15 02:22:46 +00:00
Juergen Ributzka a415943590 [FastISel] Fix patchpoint lowering to set the result register.
Always update the value map with the result register (if there is one), for the
patchpoint instruction we created to replace the target-specific call
instruction.

llvm-svn: 213033
2014-07-15 02:22:43 +00:00
Matt Arsenault ca3976f7ae R600: Add dag combine for copy of an illegal type.
This helps avoid redundant instructions to unpack, and repack
the vectors. Ideally we could recognize that pattern and eliminate
it. Currently v4i8 and other small element type vectors are scalarized,
so this has the added bonus of avoiding that.

llvm-svn: 213031
2014-07-15 02:06:31 +00:00
Matt Arsenault f1a7e62033 Teach computeKnownBits to look through addrspacecast.
This fixes inferring alignment through an addrspacecast.

llvm-svn: 213030
2014-07-15 01:55:03 +00:00
Reid Kleckner 15fe7a530d Document the maximum LLVM IR alignment, which is 1 << 29 or 0.5 GiB
Add verifier checks.  We already check these in the assembly parser, but
a frontend producing IR in memory wouldn't hit those checks.

llvm-svn: 213027
2014-07-15 01:16:09 +00:00
Matt Arsenault 70f4db88d5 Teach GetUnderlyingObject / BasicAA about addrspacecast
llvm-svn: 213025
2014-07-15 00:56:40 +00:00
Nick Lewycky 7a63c3b389 Revert r212572 "improve BasicAA CS-CS queries", it causes PR20303.
llvm-svn: 213024
2014-07-15 00:53:38 +00:00
Andrea Di Biagio 2152a6c78b [DAGCombiner] Avoid calling method 'isShuffleMaskLegal' on illegal vector types.
This patch fixes a crasher in method 'DAGCombiner::visitOR' due to an invalid
call to method 'isShuffleMaskLegal'. On x86, method 'isShuffleMaskLegal'
always expects a legal vector value type in input.

With this patch, we immediately check if the input OR dag node has a legal
vector type; we only try to fold a OR dag node into a single shufflevector
if we know that the resulting shuffle will have a legal type.
This is to avoid calling method 'isShuffleMaskLegal' on a potentially
illegal vector value type.

Added a new test-case to file 'CodeGen/X86/combine-or.ll' to verify that
DAGCombiner doesn't crash in the attempt to check/combine an OR between shuffles
with illegal types.

llvm-svn: 213020
2014-07-15 00:02:32 +00:00
Matt Arsenault f171cf23b8 R600: Add denormal handling subtarget features.
llvm-svn: 213018
2014-07-14 23:40:49 +00:00
Matt Arsenault c6ae7b4763 R600/SI: Default to no single precision denormals.
llvm-svn: 213017
2014-07-14 23:40:43 +00:00
Lang Hames c832ae3eae [RuntimeDyld] Handle endiannes differences between the host and target while
reading MachO files magic numbers in RuntimeDyld.

This is required now that we're testing cross-platform JITing (via
RuntimeDyldChecker), and should fix some issues that David Fang has seen on PPC
builds.

llvm-svn: 213012
2014-07-14 23:19:50 +00:00
Adam Nemet cf7c905cfb [X86] Specify all TSFlags bit-offsets symbolically
No functional change.

The offsets for the other bitfields are specified symbolically.  I need to
increase the size for one of the earlier fields which is easier after this
cleanup.

Why these bits are relative to VEXShift is a bit strange but that is for
another cleanup.

I made sure that the values for the enums are unchanged after this change.

llvm-svn: 213011
2014-07-14 23:18:39 +00:00
David Majnemer 8bce66b093 CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF
COFF lacks a feature that other object file formats support: mergeable
sections.

To work around this, MSVC sticks constant pool entries in special COMDAT
sections so that each constant is in it's own section.  This permits
unused constants to be dropped and it also allows duplicate constants in
different translation units to get merged together.

This fixes PR20262.

Differential Revision: http://reviews.llvm.org/D4482

llvm-svn: 213006
2014-07-14 22:57:27 +00:00
Alp Toker 6a10223d4a Fix a -Wunused-local-typedefs warning
llvm-svn: 213002
2014-07-14 22:46:45 +00:00
Andrea Di Biagio 3960a9571f [DAGCombiner] Add more rules to combine shuffle vector dag nodes.
This patch teaches the DAGCombiner how to fold a pair of shuffles
according to rules:
  1.  shuffle(shuffle A, B, M0), B, M1) -> shuffle(A, B, M2)
  2.  shuffle(shuffle A, B, M0), A, M1) -> shuffle(A, B, M3)

The new rules would only trigger if the resulting shuffle has legal type and
legal mask.

Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly
folds shuffles on x86 when the resulting mask is legal. Also added some negative
cases to verify that we avoid introducing illegal shuffles.

llvm-svn: 213001
2014-07-14 22:46:26 +00:00
Matt Arsenault caa9c71f32 Look through addrspacecast in IsConstantOffsetFromGlobal
llvm-svn: 213000
2014-07-14 22:39:26 +00:00
Matt Arsenault fd78d0c934 Look through addrspacecast in GetPointerBaseWithConstantOffset
llvm-svn: 212999
2014-07-14 22:39:22 +00:00
David Majnemer 5a1c4b8283 CodeGen: Add a getSectionKind method to MachineConstantPoolEntry
This is just a helper routine, no functionality has changed.

llvm-svn: 212993
2014-07-14 22:06:29 +00:00
David Majnemer af9180fd04 InstSimplify: Correct sdiv x / -1
Determining the bounds of x/ -1 would start off with us dividing it by
INT_MIN.  Suffice to say, this would not work very well.

Instead, handle it upfront by checking for -1 and mapping it to the
range: [INT_MIN + 1, INT_MAX.  This means that the result of our
division can be any value other than INT_MIN.

llvm-svn: 212981
2014-07-14 20:38:45 +00:00
David Majnemer 5ea4fc0b33 InstSimplify: The upper bound of X / C was missing a rounding step
Summary:
When calculating the upper bound of X / -8589934592, we would perform
the following calculation: Floor[INT_MAX / 8589934592]

However, flooring the result would make us wrongly come to the
conclusion that 1073741824 was not in the set of possible values.
Instead, use the ceiling of the result.

Reviewers: nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4502

llvm-svn: 212976
2014-07-14 19:49:57 +00:00
Justin Bogner 973b2ff322 Support: Use a range-based for
llvm-svn: 212973
2014-07-14 19:24:13 +00:00
Matt Arsenault 199b39e063 Look through addrspacecast when checking isDereferenceablePointer
llvm-svn: 212971
2014-07-14 18:54:12 +00:00
Nick Lewycky 703e488ed9 Don't eliminate memcpy's when the address of the pointer may itself be relevant. Fixes PR18304. Patch by David Wiberg!
llvm-svn: 212970
2014-07-14 18:52:02 +00:00
Bill Wendling c80b6c92e2 Unify the lowering of arguments during SjLj prepare.
The 'select true, %arg, undef' instruction can be used for both aggregate and
non-aggregate arguments.

llvm-svn: 212967
2014-07-14 18:21:11 +00:00
Sanjay Patel b49bf168f2 fixed typo
llvm-svn: 212966
2014-07-14 18:21:07 +00:00
Matt Arsenault d0d6c0b4c9 Use pointer type cast helpers.
llvm-svn: 212963
2014-07-14 17:24:38 +00:00
Matt Arsenault 740980ee69 Add CreatePointerBitCastOrAddrSpaceCast to IRBuilder and co.
llvm-svn: 212962
2014-07-14 17:24:35 +00:00
Saleem Abdulrasool b51d464f1e X86: correct 64-bit atomics on 32-bit
We would emit a libcall for a 64-bit atomic on x86 after SVN r212119.  This was
due to the misuse of hasCmpxchg16 to indicate if cmpxchg8b was supported on a
32-bit target.  They were added at different times and would result in the
border condition being mishandled.

This fixes the border case to emit the cmpxchg8b instruction for 64-bit atomic
operations on x86 at the cost of restoring a long-standing bug in the codegen.
We emit a cmpxchg8b on all x86 targets even where the CPU does not support this
instruction (pre-Pentium CPUs).  Although this bug should be fixed, this was
present prior to SVN r212119 and this change, so this is not really introducing
a regression.

llvm-svn: 212956
2014-07-14 16:28:13 +00:00
Saleem Abdulrasool 271ac58eb3 CodeGen: add missing include
Found during windows unwinding work.  This header is indirectly included through
a chain leading through Support/Win64EH.h.  Explicitly include the header.  NFC.

llvm-svn: 212955
2014-07-14 16:28:09 +00:00
Tim Northover 6c647eae8b X86: remove temporary atomicrmw used during lowering.
We construct a temporary "atomicrmw xchg" instruction when lowering atomic
stores for widths that aren't supported natively. This isn't on the top-level
worklist though, so it won't be removed automatically and we have to do it
ourselves once that itself has been lowered.

Thanks Saleem for pointing this out!

llvm-svn: 212948
2014-07-14 15:31:13 +00:00
Daniel Sanders 41ffa5d1ba Re-commit: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags
The lld tests will temporarily fail again but Simon Atanasyan will commit a fix for those shortly.

llvm-svn: 212946
2014-07-14 15:05:51 +00:00
Daniel Sanders cb0d36e592 Revert: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags
This commit causes multiple lld tests to fail. Reverting while I investigate the issue.

llvm-svn: 212945
2014-07-14 14:43:45 +00:00
Daniel Sanders 8e254166e1 [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags
Summary:
.bss, .text, and .data are at least 16-byte aligned.
.reginfo is 4-byte aligned and has a 24-byte EntrySize.
.MIPS.abiflags has an 24-byte EntrySize.
.MIPS.options is 8-byte aligned and has 1-byte EntrySize.

Using a 1-byte EntrySize for .MIPS.options seems strange because the
records are neither 1-byte long nor fixed-length but this matches the value
that GAS emits.

Differential Revision: http://reviews.llvm.org/D4487

llvm-svn: 212939
2014-07-14 14:02:14 +00:00
Daniel Sanders 7ddb0ab85f [mips] For the FP64A ABI, odd-numbered double-precision moves must not use mtc1/mfc1.
Summary:
This is because the FP64A the hardware will redirect 32-bit reads/writes
from/to odd-numbered registers to the upper 32-bits of the corresponding
even register. In effect, simulating FR=0 mode when FR=0 mode is not
available.

Unfortunately, we have to make the decision to avoid mfc1/mtc1 before
register allocation so we currently do this for even registers too.

FPXX has a similar requirement on 32-bit architectures that lack
mfhc1/mthc1 so this patch also handles the affected moves from the FPU for
FPXX too. Moves to the FPU were supported by an earlier commit.

Differential Revision: http://reviews.llvm.org/D4484

llvm-svn: 212938
2014-07-14 13:08:14 +00:00
Daniel Sanders 24e08fd5c0 [mips] Use MFHC1 when it is available (MIPS32r2 and later) for both FP32 and FP64 moves
Summary:
This is similar to r210771 which did the same thing for MTHC1.

Also corrected MTHC1_D32 and MTHC1_D64 which used AFGR64 and FGR64 on the
wrong definitions.

Differential Revision: http://reviews.llvm.org/D4483

llvm-svn: 212936
2014-07-14 12:41:31 +00:00
Tim Northover 3cb24110b1 AArch64: remove unnecessary pseudo-instruction.
Sufficiently twisted use of TableGen lets us write patterns directly for f16
(as an i16 promoted to i32) -> f32 conversion.

llvm-svn: 212933
2014-07-14 11:16:02 +00:00
Daniel Sanders 9ee2aee859 [mips] Correct the AFL_FLAGS1_ODDSPREG flag in .MIPS.abiflags when no '.module oddspreg' is used
Differential Revision: http://reviews.llvm.org/D4486

llvm-svn: 212932
2014-07-14 10:26:15 +00:00
Sasa Stankovic b976fee83c [mips] Expand BuildPairF64 to a spill and reload when the O32 FPXX ABI is
enabled and mthc1 and dmtc1 are not available (e.g. on MIPS32r1)

This prevents the upper 32-bits of a double precision value from being moved to
the FPU with mtc1 to an odd-numbered FPU register. This is necessary to ensure
that the code generated executes correctly regardless of the current FPU mode.

MIPS32r2 and above continues to use mtc1/mthc1, while MIPS-IV and above continue
to use dmtc1.

Differential Revision: http://reviews.llvm.org/D4465

llvm-svn: 212930
2014-07-14 09:40:29 +00:00
Bill Wendling 151b44d653 Support lowering of empty aggregates.
This crash was pretty common while compiling Rust for iOS (armv7). Reason -
SjLj preparation step was lowering aggregate arguments as ExtractValue +
InsertValue. ExtractValue has assertion which checks that there is some data in
value, which is not true in case of empty (no fields) structures. Rust uses
them quite extensively so this patch uses a 'select true, %val, undef'
instruction to lower the argument.

Patch by Valerii Hiora.

llvm-svn: 212922
2014-07-14 06:22:36 +00:00
NAKAMURA Takumi c3b3897e8a NVPTX/LLVMBuild.txt: Add "Scalar" to required_libraries. It is really referenced.
llvm-svn: 212918
2014-07-14 02:52:19 +00:00
NAKAMURA Takumi a56f860e29 Object/LLVMBuild.txt: Sort required_libraries by alphabetical order.
llvm-svn: 212917
2014-07-14 02:52:08 +00:00
Andrea Di Biagio 67d8b2e2b0 [DAGCombiner] Fix a crash caused by a missing check for legal type when trying to fold shuffles.
Verify that DAGCombiner does not crash when trying to fold a pair of shuffles
according to rule (added at r212539):
  (shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2)

The DAGCombiner avoids folding shuffles if the resulting shuffle dag node
is not legal for the target. That means, the resulting shuffle must have
legal type and legal mask.

Before, the DAGCombiner only called method
'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according
to the above-mentioned rule. However, this caused a crash in the x86 backend
since method 'isShuffleMaskLegal' always expects to be called on a
legal vector type.

llvm-svn: 212915
2014-07-13 21:02:14 +00:00
Saleem Abdulrasool c7c3cb1f4e MC: make MCWin64EHInstruction a POD-like struct
This is the first of a number of changes designed to generalise
MCWin64EHInstruction to support different target architectures.  An ordered set
(vector) of these instructions is saved per frame to permit the emission of
information for Windows NT style unwinding.  The only bit of information which
is actually target specific here is the Opcode for the unwinding bytecode.  The
remainder of the information is simply generic information that is relevant to
the Windows NT unwinding model.

Remove the accessors for the fields, making them const and public instead.  Sink
the knowledge of the alias'ed name into the single source and sink a single-use
check method into the use.

llvm-svn: 212914
2014-07-13 19:03:45 +00:00
Saleem Abdulrasool 5705cd8cf6 MC: make helper function be more const-correct
Introduce const-ness on parameters, they are used as read-only and should not be
modified.  NFC.

llvm-svn: 212913
2014-07-13 19:03:40 +00:00
Saleem Abdulrasool 3f3cefd392 MC: make DWARF and Windows unwinding handling more similar
Rename member variables and functions for the MCStreamer for DWARF-like
unwinding management.  Rename the Windows ones as well and make the naming and
handling similar across the two.  No functional change intended.

llvm-svn: 212912
2014-07-13 19:03:36 +00:00
Simon Atanasyan 6abd4be930 Add forgotten `break` statement.
llvm-svn: 212910
2014-07-13 16:18:56 +00:00
Simon Atanasyan 1cd169f137 [Mips] Support SHT_MIPS_ABIFLAGS section type flag in the llvm-readobj,
obj2yaml and yaml2obj tools.

llvm-svn: 212908
2014-07-13 15:28:54 +00:00
NAKAMURA Takumi c4fc6eec53 [CMake] Add LLVM_LINK_COMPONENTS to loadable modules, LLVMHello and BugpointPasses, on Win32.
llvm-svn: 212904
2014-07-13 13:36:48 +00:00
David Majnemer ebc741168b IR: Allow comdats to be applied to globals with internal linkage
Our verifier check for checking if a global has local linkage was too
strict.  Forbid private linkage but permit local linkage.

Object file formats permit this and forbidding it prevents elimination
of unused, internal, vftables under the MSVC ABI.

llvm-svn: 212900
2014-07-13 04:56:11 +00:00
David Majnemer 299674e94f MC: Let non-temporary COFF aliases be in symtab
MC was aping a binutils bug where aliases would default their linkage to
private instead of internal.

I've sent a patch to the binutils maintainers and they've recently
applied it to the GNU assembler sources.

This fixes PR20152.

Differential Revision: http://reviews.llvm.org/D4395

llvm-svn: 212899
2014-07-13 04:31:19 +00:00
Matt Arsenault c3f6a7e44e Remove unused include
llvm-svn: 212898
2014-07-13 03:08:59 +00:00
Matt Arsenault d32dbb6a10 R600: Use range for and fix missing consts.
llvm-svn: 212897
2014-07-13 03:06:43 +00:00
Matt Arsenault 762af96f46 R600: Make ShaderType private
llvm-svn: 212896
2014-07-13 03:06:39 +00:00
Matt Arsenault d9a23ab20d R600: Add option to disable promote alloca
This can make writing some tests harder, so add a flag
to disable it.

llvm-svn: 212893
2014-07-13 02:08:26 +00:00
Matt Arsenault 4181ea36a9 Templatify DominanceFrontier.
Theoretically this should now work for MachineBasicBlocks.

llvm-svn: 212885
2014-07-12 21:59:52 +00:00
Saleem Abdulrasool f74d48a011 AArch64: add support for llvm.aarch64.hint intrinsic
This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to
support the various hint intrinsic functions in the ACLE.

Add an optional pattern field that permits the subclass to specify the pattern
that matches the selection.  The intrinsic pattern is set as mayLoad, mayStore,
so overload the value for the definition of the hint instruction.

llvm-svn: 212883
2014-07-12 21:20:49 +00:00
Saleem Abdulrasool db514056de MC: remove use of unnecessary variable
Due to the fact that the windows unwinding has the concept of chained frames, we
maintain a current frame info pointer that is adjusted on any push and pop of a
unwinding context.  This just removes an unnecessary variable that was used to
mirror the DWARF unwinding code.

llvm-svn: 212882
2014-07-12 20:49:13 +00:00
Saleem Abdulrasool 4a1a2f7790 MC: rename MCW64UnwindInfo to MCWinFrameInfo
This structure contains information related to the call frame used to generate
unwinding information.  Rename this to reflect the future use to represent the
shared state between various architectures for WinCFI information.

llvm-svn: 212881
2014-07-12 20:49:09 +00:00
Simon Atanasyan 8ebb6aed9b [ELFYAML] Group ELF section type flags to target specific blocks.
Recognize only flags which correspond to the current target.

llvm-svn: 212880
2014-07-12 18:25:08 +00:00
Owen Anderson a8d1c3e74e Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did
not properly handle the case where the predecessor block was the entry block to
the function.  The only in-tree client of this is JumpThreading, which worked
around the issue in its own code.  This patch moves the solution into the helper
so that JumpThreading (and other clients) do not have to replicate the same fix
everywhere.

llvm-svn: 212875
2014-07-12 07:12:47 +00:00
Alexey Samsonov 15c9669615 [ASan] Collect unmangled names of global variables in Clang to print them in error reports.
Currently ASan instrumentation pass creates a string with global name
for each instrumented global (to include global names in the error report). Global
name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).

Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.

This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.

llvm-svn: 212872
2014-07-12 00:42:52 +00:00
Duncan P. N. Exon Smith 6075510839 BFI: Add constructor for Weight
llvm-svn: 212868
2014-07-12 00:26:00 +00:00
Duncan P. N. Exon Smith 345c287da9 BFI: Clean up BlockMass
Implementation is small now -- the interesting logic was moved to
`BranchProbability` a while ago.  Move it into `bfi_detail` and get rid
of the related TODOs.

I was originally planning to define it within `BlockFrequencyInfoImpl`
(or `BFIIBase`), but it seems cleaner in a namespace.  Besides,
`isPodLike` needs to be specialized before `BlockMass` can be used in
some of the other data structures, and there isn't a clear way to do
that.

llvm-svn: 212866
2014-07-12 00:21:30 +00:00
Lang Hames 5d10284238 [RuntimeDyld] Fix stub size and offset for AArch64 in RuntimeDyldMachO.h.
<rdar://problem/17648000>

llvm-svn: 212864
2014-07-12 00:16:47 +00:00
Reid Kleckner fb9519838a Avoid a warning from MSVC on "*/" in this code by inserting a space
llvm-svn: 212862
2014-07-12 00:06:46 +00:00
Duncan P. N. Exon Smith b5650e5eae BFI: Mark the end of namespaces
llvm-svn: 212861
2014-07-11 23:56:50 +00:00
Lang Hames cb314ceaa9 [RuntimeDyld] Add GOT support for AArch64 to RuntimeDyldMachO.
Test cases to follow once RuntimeDyldChecker supports introspection of stubs.

Fixes <rdar://problem/17648000>

llvm-svn: 212859
2014-07-11 23:52:07 +00:00
Juergen Ributzka d755e9f730 Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook."
This reverts commit r212851, because it broke the memset lowering.

llvm-svn: 212855
2014-07-11 23:10:08 +00:00
Juergen Ributzka 04b444913b [FastISel][X86] Implement the FastLowerIntrinsicCall hook.
Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively
implements the target hook.

llvm-svn: 212851
2014-07-11 22:37:43 +00:00
Alexey Samsonov 08f022ae84 [ASan] Introduce a struct representing the layout of metadata entry in llvm.asan.globals.
No functionality change.

llvm-svn: 212850
2014-07-11 22:36:02 +00:00
Juergen Ributzka 3d9e6755e4 [FastISel] Add target-independent patchpoint intrinsic support. WIP.
This implements the target-independent lowering for the patchpoint
intrinsic. Targets have to implement the FastLowerCall
hook to support this intrinsic.

Related to <rdar://problem/17427052>

llvm-svn: 212849
2014-07-11 22:19:02 +00:00
Juergen Ributzka 8179e9e5ad [FastISel] Add basic infrastructure to support a target-independent call lowering hook in FastISel. WIP
The infrastructure mimics the call lowering we have already in place for
SelectionDAG, but with limitations. For example structure return demotion and
non-simple types are not supported (yet).

Currently every backend has its own implementation and duplicated code for call
lowering. There is also no specified interface that could be called from
target-independent code. The target-hook is opt-in and doesn't affect current
implementations.

llvm-svn: 212848
2014-07-11 22:01:42 +00:00
Aditya Nandakumar 0b5a674243 When we sink an instruction, this can open up opportunity for the operands to be sunk - add them to the worklist
llvm-svn: 212847
2014-07-11 21:49:39 +00:00
Argyrios Kyrtzidis 730abd2f4a Move the API and implementation of clang::driver::getARMCPUForMArch() to llvm::Triple::getARMCPUForArch().
Suggested by Eric Christopher.

llvm-svn: 212846
2014-07-11 21:44:54 +00:00
Juergen Ributzka 4ce9863d0b [FastISel] Make isInTailCallPosition independent of SelectionDAG.
Break out the arguemnts required from SelectionDAG, so that this function can
also be used by FastISel.

llvm-svn: 212844
2014-07-11 20:50:47 +00:00
Juergen Ributzka 5dd32136b9 [FastISel] Breakout intrinsic lowering into a separate function and add a target-hook.
Create a separate helper function for target-independent intrinsic lowering. Also
add an target-hook that allows to directly call into a target-sepcific intrinsic
lowering method. Currently the implementation is opt-in and doesn't affect
existing target implementations.

llvm-svn: 212843
2014-07-11 20:42:12 +00:00
Alp Toker 48bbd061bc Simplify the raw_svector_ostream tweak from r212816
The memcpy() and overlap helps didn't help much with timings, so clean up the change.

The difference at this point is that we now leave growth of the storage buffer
up to SmallVector's implementation:

 -   OS.reserve(OS.capacity() * 2);
 +   OS.reserve(OS.size() + 64);

llvm-svn: 212837
2014-07-11 18:23:08 +00:00
Ulrich Weigand 0a51abc100 [MC] Constify MCELF::GetVisibility and MCELF::getOther
These two routines didn't take a "const MCSymbolData &SD"
like the other MCELF::Get routines for some reason ...

llvm-svn: 212834
2014-07-11 17:34:44 +00:00
Ulrich Weigand ea147a9d43 [PowerPC] Fix invalid displacement created by LocalStackAlloc
This commit fixes a bug in PPCRegisterInfo::isFrameOffsetLegal that
could result in the LocalStackAlloc pass creating an MI instruction
out-of-range displacement:
        %vreg17<def> = LD 33184, %vreg31; mem:LD8[%g](align=32)
        %G8RC:%vreg17 G8RC_and_G8RC_NOX0:%vreg31
(In final assembler output the top bits are stripped off, resulting
in a negative offset loading from below the stack pointer.)

Common code expects the isFrameOffsetLegal routine to verify whether
adding a given offset to the offset already present in the instruction
results in a valid displacement.  However, on PowerPC the routine
did not take the already present instruction offset into account.

This commit fixes isFrameOffsetLegal to add the instruction offset,
and updates a local caller (needsFrameBaseReg) to no longer add the
instruction offset itself before calling isFrameOffsetLegal.

Reviewed by Hal Finkel.

llvm-svn: 212832
2014-07-11 17:19:31 +00:00
Marek Olsak eac5062cc0 R600/SI: Use i32 vectors for resources and samplers
This affects new intrinsics only.

What surprises me is that v32i8 still works.

llvm-svn: 212831
2014-07-11 17:11:52 +00:00
Marek Olsak d8ecaeec02 R600/SI: add sample and image intrinsics exposing all instruction fields
We need the intrinsics with offsets, so why not just add them all.
The R128 parameter will also be useful for reducing SGPR usage.
GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent",
so Mesa will probably translate those to slc, glc, etc.

When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics.

llvm-svn: 212830
2014-07-11 17:11:46 +00:00
Marek Olsak ba77c3e4ed R600/SI: fix shadow mapping for 1D and 2D array textures
It was conflicting with def TEX_SHADOW_ARRAY, which also handles them.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 212829
2014-07-11 17:11:39 +00:00
Alp Toker bc4d1a3604 raw_svector_ostream: grow and reserve atomically
Including the scratch buffer size in the initial reservation eliminates the
subsequent malloc+move operation and offers a healthier constant growth with
less memory wastage.

When doing this, take care to avoid invalidating the source buffer.

llvm-svn: 212816
2014-07-11 14:02:04 +00:00
Oliver Stannard 6eda6ffc0c ARM: Allow __fp16 as a function arg or return type for AArch64
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.

llvm-svn: 212812
2014-07-11 13:33:46 +00:00
Quentin Colombet 0f179c4d8a [X86] Fix the inversion of low and high bits for the lowering of MUL_LOHI.
Also add a few comments.

<rdar://problem/17581756>

llvm-svn: 212808
2014-07-11 12:08:23 +00:00
Marcello Maggioni 78035b11ec Fixup PHIs in LowerSwitch when a Leaf node is not emitted.
This commit fixes bug http://llvm.org/bugs/show_bug.cgi?id=20103.

Thanks to Qwertyuiop for the report and the proposed fix.

llvm-svn: 212802
2014-07-11 10:34:36 +00:00
Adam Nemet 26f817497c [X86] AVX512: Improve readability of isCDisp8
No functional change.  As I was trying to understand this function, I found
that variables were reused with confusing names and the broadcast case was a
bit too implicit.  Hopefully, this is an improvement.

llvm-svn: 212795
2014-07-11 05:23:25 +00:00
Adam Nemet e311c3c836 [X86] AVX512: Simplify logic in isCDisp8
It was computing the VL/n case as:
  MemObjSize = VectorByteSize / ElemByteSize / Divider * ElemByteSize

ElemByteSize not only falls out but VectorByteSize/Divider now actually
matches the definition of VL/n.

Also some formatting fixes.

llvm-svn: 212794
2014-07-11 05:23:12 +00:00
David Blaikie de1e1a60e8 Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""
This reverts commit r212776.

Nope, still seems to be failing on the sanitizer bots... but hey, not
the msan self-host anymore, it's failing in asan now. I'll start looking
there next.

llvm-svn: 212793
2014-07-11 02:42:57 +00:00
Mark Heffernan 675d401a26 Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop
llvm-svn: 212782
2014-07-10 23:30:06 +00:00
Lang Hames 16086b984e [RuntimeDyld] Improve error diagnostic in RuntimeDyldChecker.
The compiler often emits assembler-local labels (beginning with 'L') for use in
relocation expressions, however these aren't included in the object files.
Teach RuntimeDyldChecker to warn the user if they try to use one of these in an
expression, since it will never work.

llvm-svn: 212777
2014-07-10 23:26:20 +00:00
David Blaikie 3ca92d2406 Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
Committed in r212205 and reverted in r212226 due to msan self-hosting
failure, I believe I've got that fixed by r212761 to Clang.

Original commit message:

"Originally committed in r211723, reverted in r211724 due to failure
cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065),
committed again in r212085 and reverted again in r212089 after fixing
some other cases, such as debug info subprogram lists not keeping track
of the function they represent (r212128) and then short-circuiting
things like LiveDebugVariables that build LexicalScopes for functions
that might not have full debug info.

And again, I believe the invariant actually holds for some reasonable
amount of code (but I'll keep an eye on the buildbots and see what
happens... ).

Original commit message:

PR20038: DebugInfo: Inlined call sites where the caller has debug info
but the call itself has no debug location.

This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined
instructions."

llvm-svn: 212776
2014-07-10 22:59:39 +00:00
Jan Vesely 2cb62ce2a0 R600: Implement float to long/ulong
Use alg. from LegalizeDAG.cpp
Move Expand setting to SIISellowering

v2: Extend existing tests instead of creating new ones
v3: use separate LowerFPTOSINT function
v4: use TargetLowering::expandFP_TO_SINT
    add comment about using FP_TO_SINT for uints

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 212773
2014-07-10 22:40:21 +00:00
Jan Vesely eca89d283e SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer
Move the code to a helper function to allow calls from TypeLegalizer.

No functionality change intended

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Owen Anderson <resistor@mac.com>
llvm-svn: 212772
2014-07-10 22:40:18 +00:00
Brad Smith 733cb6437d Use the integrated assembler by default on OpenBSD.
llvm-svn: 212771
2014-07-10 22:37:28 +00:00
Zoran Jovanovic f34b454219 [mips] Emit two CFI offset directives per double precision SDC1/LDC1
instead of just one for FR=1 registers
Differential Revision: http://reviews.llvm.org/D4310

llvm-svn: 212769
2014-07-10 22:23:30 +00:00
Matt Arsenault 3332b70627 Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.""
Don't try to convert the select condition type.

llvm-svn: 212750
2014-07-10 18:21:04 +00:00
Andrea Di Biagio b2921c7ca0 [DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal.
This patch teaches the DAGCombiner how to fold shuffles according to the
following new rules:
  1. shuffle(shuffle(x, y), undef) -> x
  2. shuffle(shuffle(x, y), undef) -> y
  3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef)
  4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef)

The backend avoids to combine shuffles according to rules 3. and 4. if
the resulting shuffle does not have a legal mask. This is to avoid introducing
illegal shuffles that are potentially expanded into a sub-optimal sequence of
target specific dag nodes during vector legalization.

Added test case combine-vec-shuffle-2.ll to verify that we correctly triggers
the new rules when combining shuffles.

llvm-svn: 212748
2014-07-10 18:04:55 +00:00
Akira Hatanaka 7cc27649a6 [X86] Mark pseudo instruction TEST8ri_NOEREX as hasSIdeEffects=0.
Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable
macro-fusion.

<rdar://problem/15680770>

llvm-svn: 212747
2014-07-10 18:00:53 +00:00
Eric Christopher 54fe1b260c Add the CSR company and the Kalimba DSP processor to Triple.
Patch by Matthew Gardiner with fixes by me.

llvm-svn: 212745
2014-07-10 17:26:54 +00:00
Eric Christopher 22405e4bbf Make it possible for the Subtarget to change between function
passes in the mips back end. This, unfortunately, required a
bit of churn in the various predicates to use a pointer rather
than a reference.

llvm-svn: 212744
2014-07-10 17:26:51 +00:00
Duncan P. N. Exon Smith 04934b0fec InstCombine: Fix a crash in Descale for multiply-by-zero
Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets
created as an argument to a GEP partway through an iteration, causing
-instcombine to optimize the GEP before the multiply.

rdar://problem/17615671

llvm-svn: 212742
2014-07-10 17:13:27 +00:00
David Majnemer ed33243e86 IR: Aliases don't belong to an explicit comdat
Aliases inherit their comdat from their aliasee, they don't have an
explicit comdat.

This fixes PR20279.

llvm-svn: 212732
2014-07-10 16:26:10 +00:00
Hal Finkel 511fea7acd Feeding isSafeToSpeculativelyExecute its DataLayout pointer (in Sink)
This is the one remaining place I see where passing
isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for
loads) -- I think I got the others in r212720. Most of the other remaining
callers of isSafeToSpeculativelyExecute only use it for call sites (or
otherwise exclude loads).

llvm-svn: 212730
2014-07-10 16:07:11 +00:00
David Majnemer 99ef236542 Mips: Silence a -Wcovered-switch-default
Remove a default label which covered no enumerators, replace it with a
llvm_unreachable.

No functionality changed.

llvm-svn: 212729
2014-07-10 16:04:04 +00:00
Zoran Jovanovic 255d00dc23 [mips] Added FPXX modeless calling convention.
Differential Revision: http://reviews.llvm.org/D4293

llvm-svn: 212726
2014-07-10 15:36:12 +00:00
Arnaud A. de Grandmaison f643231163 [AArch64] Add logical alias instructions to MC AsmParser
This patch teaches the AsmParser to accept some logical+immediate
instructions and convert them as shown:

  bic  Rd, Rn, #imm  ->  and Rd, Rn, #~imm
  bics Rd, Rn, #imm  ->  ands Rd, Rn, #~imm
  orn  Rd, Rn, #imm  ->  orr Rd, Rn, #~imm
  eon  Rd, Rn, #imm  ->  eor Rd, Rn, #~imm

Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers. For example, the bic construct is used by the linux kernel.

llvm-svn: 212722
2014-07-10 15:12:26 +00:00
Hal Finkel a995f92627 Feeding isSafeToSpeculativelyExecute its DataLayout pointer
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.

This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.

llvm-svn: 212720
2014-07-10 14:41:31 +00:00
Tim Northover fee2adefba AArch64: correctly fast-isel i8 & i16 multiplies
We were asking for a register for type i8 or i16 which caused an assert.

rdar://problem/17620015

llvm-svn: 212718
2014-07-10 14:18:46 +00:00
Daniel Sanders 7e527423f5 [mips] Add support for -modd-spreg/-mno-odd-spreg
Summary:
When -mno-odd-spreg is in effect, 32-bit floating point values are not
permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit
floating point comparison results from being written to odd registers.

This option has three purposes:
* It allows support for certain MIPS implementations such as loongson-3a that
  do not allow the use of odd registers for single precision arithmetic.
* When using -mfpxx, -mno-odd-spreg is the default and this allows us to
  statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1
  instructions to/from odd registers are guaranteed not to appear for any
  reason. Once this has been established, the user can then re-enable
  -modd-spreg to regain the use of all 32 single-precision registers.
* When using -mfp64 and -mno-odd-spreg together, an O32 extension named
  O32 FP64A is used as the ABI. This is intended to provide almost all
  functionality of an FR=1 processor but can also be executed on a FR=0 core
  with the assistance of a hardware compatibility mode which emulates FR=0
  behaviour on an FR=1 processor.

* Added '.module oddspreg' and '.module nooddspreg' each of which update
  the .MIPS.abiflags section appropriately
* Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller
  doesn't have to remember to do it.
* MipsABIFlags now calculates the flags1 and flags2 member on demand rather
  than trying to maintain them in the same format they will be emitted in.

There is one portion of the -mfp64 and -mno-odd-spreg combination that is not
implemented yet. Moves to/from odd-numbered double-precision registers must not
use mtc1. I will fix this in a follow-up.

Differential Revision: http://reviews.llvm.org/D4383

llvm-svn: 212717
2014-07-10 13:38:23 +00:00
Zinovy Nis cad431c122 [x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin).
This is minimal change for backend required to have "hello world" compiled and working on x32 target (x86_64-linux-gnux32). More patches for x32 will follow.

Differential Revision: http://reviews.llvm.org/D4181

llvm-svn: 212716
2014-07-10 13:03:26 +00:00
Chandler Carruth 0b666e0648 [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous
to the zero-extend-vector-inreg node introduced previously for the same
purpose: manage the type legalization of widened extend operations,
especially to support the experimental widening mode for x86.

I'm adding both because sign-extend is expanded in terms of any-extend
with shifts to propagate the sign bit. This removes the last
fundamental scalarization from vec_cast2.ll (a test case that hit many
really bad edge cases for widening legalization), although the trunc
tests in that file still appear scalarized because the the shuffle
legalization is scalarizing. Funny thing, I've been working on that.

Some initial experiments with this and SSE2 scenarios is showing
moderately good behavior already for sign extension. Still some work to
do on the shuffle combining on X86 before we're generating optimal
sequences, but avoiding scalarization is a huge step forward.

llvm-svn: 212714
2014-07-10 12:32:32 +00:00
Richard Sandiford 02bb0ec368 [SystemZ] Use SystemZCallingConv.td to define callee-saved registers
Just a clean-up.  No behavioral change intended.

llvm-svn: 212711
2014-07-10 11:44:37 +00:00
NAKAMURA Takumi f862ce8908 Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."
This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations.

llvm-svn: 212708
2014-07-10 11:37:28 +00:00
Richard Sandiford 909aa3ad21 [SystemZ] Tweak instruction format classifications
There's no real need to have Shift as a separate format type from Binary.
The comments for other format types were too specific and in some cases
no longer accurate.

Just a clean-up, no behavioral change intended.

llvm-svn: 212707
2014-07-10 11:29:23 +00:00
Chandler Carruth df8d0caab7 [x86] Add another combine that is particularly useful for the new vector
shuffle lowering: match shuffle patterns equivalent to an unpcklwd or
unpckhwd instruction.

This allows us to use generic lowering code for v8i16 shuffles and match
the unpack pattern late.

llvm-svn: 212705
2014-07-10 11:09:29 +00:00
Richard Sandiford e66e8c8b66 [SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA
These instructions aren't used for codegen since the original L*DB instructions
are suitable for fround.

llvm-svn: 212703
2014-07-10 11:00:55 +00:00
Richard Sandiford ca44614ac0 [SystemZ] Avoid using i8 constants for immediate fields
Immediate fields that have no natural MVT type tended to use i8 if the
field was small enough.  This was a bit confusing since i8 isn't a legal
type for the target.  Fields for short immediates in a 32-bit or 64-bit
operation use i32 or i64 instead, so it would be better to do the same
for all fields.

No behavioral change intended.

llvm-svn: 212702
2014-07-10 10:52:51 +00:00
Richard Sandiford ac1dba0fdf [SystemZ] Fix FPR dwarf numbering
The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6,
F1, F3, F5, F7, F8, etc., which matches the pairing of registers for
long doubles.  E.g. a long double stored in F0 is paired with F2.

llvm-svn: 212701
2014-07-10 10:45:11 +00:00
Daniel Sanders cbd44c591d Make it possible for ints/floats to return different values from getBooleanContents()
Summary:
On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer
comparisons return 0 or 1.

Updated the various uses of getBooleanContents. Two simplifications had to be
disabled when float and int boolean contents differ:
- ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially
  discoverable (i.e. when the condition of the VSELECT is a SETCC node).
- visitVSELECT (select C, 0, 1) -> (xor C, 1).
  Come to think of it, this one could test for the common case of 'C'
  being a SETCC too.

Preserved existing behaviour for all other targets and updated the affected
MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low'
variable was counting in the wrong direction because it thought it could simply
add the result of the comparison.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D4389

llvm-svn: 212697
2014-07-10 10:18:12 +00:00
Chandler Carruth 853fa0ac8d [x86] Expand the target DAG combining for PSHUFD nodes to be able to
combine into half-shuffles through unpack instructions that expand the
half to a whole vector without messing with the dword lanes.

This fixes some redundant instructions in splat-like lowerings for
v16i8, which are now getting to be *really* nice.

llvm-svn: 212695
2014-07-10 09:57:36 +00:00
Chandler Carruth a34a8e230d [x86] Tweak the v16i8 single input special case lowering for shuffles
that splat i8s into i16s.

Previously, we would try much too hard to arrange a sequence of i8s in
one half of the input such that we could unpack them into i16s and
shuffle those into place. This isn't always going to be a cheaper i8
shuffle than our other strategies. The case where it is always going to
be cheaper is when we can arrange all the necessary inputs into one half
using just i16 shuffles. It happens that viewing the problem this way
also makes it much easier to produce an efficient set of shuffles to
move the inputs into one half and then unpack them.

With this, our splat code gets one step closer to being not terrible
with the new experimental lowering strategy. It also exposes two
combines missing which I will add next.

llvm-svn: 212692
2014-07-10 09:16:40 +00:00
Hal Finkel 66e23f126d Fix isDereferenceablePointer not to try to take the size of an unsized type.
I'll add a test-case shortly.

llvm-svn: 212687
2014-07-10 06:06:11 +00:00
Hal Finkel 2e42c34d05 Allow isDereferenceablePointer to look through some bitcasts
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.

In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer though
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

llvm-svn: 212686
2014-07-10 05:27:53 +00:00
Saleem Abdulrasool 1e76cbdff7 MC: modernise for loop
Convert a for loop to range bsaed form.  NFC.

llvm-svn: 212684
2014-07-10 04:50:09 +00:00
Saleem Abdulrasool 427c08d48b MC: add and use an accessor for WinCFI
This adds a utility method to access the WinCFI information in bulk and uses
that to iterate rather than requesting the count and individually iterating
them.  This is in preparation for restructuring WinCFI handling to enable more
clear sharing across architectures to enable unwind information emission for
Windows on ARM.

llvm-svn: 212683
2014-07-10 04:50:06 +00:00
Peter Collingbourne 8876c3face Remove move assignment operator to appease older GCCs.
llvm-svn: 212682
2014-07-10 04:39:40 +00:00
Chandler Carruth 7d2ffb5492 [x86] Initial improvements to the new shuffle lowering for v16i8
shuffles specifically for cases where a small subset of the elements in
the input vector are actually used.

This is specifically targetted at improving the shuffles generated for
trunc operations, but also helps out splat-like operations.

There is still some really low-hanging fruit here that I want to address
but this is a huge step in the right direction.

llvm-svn: 212680
2014-07-10 04:34:06 +00:00
Peter Collingbourne 05b9ebf2f9 Explicitly define move constructor and move assignment operator to appease MSVC.
llvm-svn: 212679
2014-07-10 04:29:06 +00:00
Peter Collingbourne d5feb7ba42 SpecialCaseList: use std::unique_ptr.
llvm-svn: 212678
2014-07-10 03:55:02 +00:00
Hao Liu 71224b02fb [AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector.
llvm-svn: 212677
2014-07-10 03:41:50 +00:00
Matt Arsenault b0df92577d R600/SI: Add support for llvm.convert.{to|from}.fp16
llvm-svn: 212676
2014-07-10 03:22:20 +00:00
Chandler Carruth b3840a55ae [x86] Refactor some of the new code for lowering v16i8 shuffles to
remove duplication and make it easier to select different strategies.

No functionality changed.

llvm-svn: 212674
2014-07-10 02:24:26 +00:00
Peter Collingbourne 2e28edf8e1 [dfsan] Handle bitcast aliases.
llvm-svn: 212668
2014-07-10 01:30:39 +00:00
Chandler Carruth d3561f6fec [SDAG] Make the new zext-vector-inreg node default to expand so targets
don't need to set it manually.

This is based on feedback from Tom who pointed out that if every target
needs to handle this we need to reach out to those maintainers. In fact,
it doesn't make sense to duplicate everything when anything other than
expand seems unlikely at this stage.

llvm-svn: 212661
2014-07-09 22:53:04 +00:00
David Blaikie 029bd3350e Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson
reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped
provide a reduced reproduction, though the failure wasn't too hard to
guess, and even easier with the example to confirm.

The assertion that the subprogram metadata associated with an
llvm::Function matches the scope data referenced by the DbgLocs on the
instructions in that function is not valid under LTO. In LTO, a C++
inline function might exist in multiple CUs and the subprogram metadata
nodes will refer to the same llvm::Function. In this case, depending on
the order of the CUs, the first intance of the subprogram metadata may
not be the one referenced by the instructions in that function and the
assertion will fail.

A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the
assertion removed and a comment added to explain this situation.

Original commit message:

If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.

While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.

Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.

Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).

llvm-svn: 212649
2014-07-09 21:02:41 +00:00
Alexey Samsonov b7dd329f2f Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
  * DFSanABIList in DFSan instrumentation pass.
  * SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).

Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.

No functionality change.

llvm-svn: 212643
2014-07-09 19:40:08 +00:00
Matt Arsenault 658c5576d1 Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.
Do this if the truncate is free and the select is legal.

llvm-svn: 212640
2014-07-09 19:12:07 +00:00
Jim Grosbach 34cc92b475 AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.

rdar://17594379

llvm-svn: 212638
2014-07-09 18:55:52 +00:00
Benjamin Kramer c560a6cadc TargetRegisterInfo: Remove function that fell out of use years ago.
llvm-svn: 212636
2014-07-09 18:53:57 +00:00
Adam Nemet 2820a5b9e9 [X86] AVX512: Enable it in the Loop Vectorizer
This lets us experiment with 512-bit vectorization without passing
force-vector-width manually.

The code generated for a simple integer memset loop is properly vectorized.
Disassembly is still broken for it though :(.

llvm-svn: 212634
2014-07-09 18:22:33 +00:00
Louis Gerbarg 1ce0c37bf0 Make AArch64FastISel::EmitIntExt explicitly check its source and destination types
This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be
an i8/i16/i32/i64.

rdar://17516686

llvm-svn: 212633
2014-07-09 17:54:32 +00:00
Sanjay Patel 58814445d4 Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap)
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).

This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.

Differential Revision: http://reviews.llvm.org/D4424

llvm-svn: 212629
2014-07-09 16:34:54 +00:00
Daniel Sanders c5626f4444 Add Imagination Technologies to the vendors in llvm::Triple
Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang.

Differential Revision: http://reviews.llvm.org/D4435

llvm-svn: 212626
2014-07-09 16:03:10 +00:00
Tim Northover ac002d3e34 Generic: add range-adapter for option parsing.
I want to use it in lld, but while I'm here I'll update LLVM uses.

llvm-svn: 212615
2014-07-09 13:03:37 +00:00
Chandler Carruth 5865a73a82 [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were
not widening the input type to the node sufficiently to let the ext take
place in a register.

This would in turn result in a mysterious bitcast assertion failure
downstream. First change here is to add back the helpful assert I had in
an earlier version of the code to catch this immediately.

Next change is to add support to the type legalization to detect when we
have widened the operand either too little or too much (for whatever
reason) and find a size-matched legal vector type to convert it to
first. This can also fail so we get a new fallback path, but that seems
OK.

With this, we no longer crash on vec_cast2.ll when using widening. I've
also added the CHECK lines for the zero-extend cases here. We still need
to support sign-extend and trunc (or something) to get plausible code
for the other two thirds of this test which is one of the regression
tests that showed the most scalarization when widening was
force-enabled. Slowly closing in on widening being a viable legalization
strategy without it resorting to scalarization at every turn. =]

llvm-svn: 212614
2014-07-09 12:36:54 +00:00