Commit Graph

78410 Commits

Author SHA1 Message Date
Simon Pilgrim 09f3ff9a0a [DAGCombiner] Add support for TRUNCATE + FP_EXTEND vector constant folding
This patch adds supports for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match.

It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions.

Differential Revision: http://reviews.llvm.org/D8593

llvm-svn: 233224
2015-03-25 22:30:31 +00:00
Andrew Kaylor 51fcf0fc5f Fix remaining MSVC warning
llvm-svn: 233220
2015-03-25 21:33:24 +00:00
Matthias Braun 5d27ef6449 RegisterCoalescer: Fix implicit def handling in register coalescer
If liveranges induced by an IMPLICIT_DEF get completely covered by a
proper liverange the IMPLICIT_DEF instructions and its corresponding
definitions have to be removed from the live ranges. This has to happen
in the subregister live ranges as well (I didn't see this case earlier
because in most programs only some subregisters are covered and the
IMPLCIT_DEF won't get removed).

No testcase, I spent hours trying to create one for one of the public
targets, but ultimately failed because I couldn't manage to properly
control the placement of COPY and IMPLICIT_DEF instructions from an .ll
file.

llvm-svn: 233217
2015-03-25 21:18:24 +00:00
Matthias Braun e962e52a45 MachineVerifier: slightly simplify code that is only called with vregs
llvm-svn: 233216
2015-03-25 21:18:22 +00:00
Krzysztof Parzyszek 6001847e8f Revert r233206
llvm-svn: 233213
2015-03-25 20:21:16 +00:00
Reid Kleckner 7e9546b378 WinEH: Create an unwind help alloca for __CxxFrameHandler3 xdata tables
We don't have any logic to emit those tables yet, so the sdag lowering
of this intrinsic is just a stub. We can see the intrinsic in the
prepared IR, though.

llvm-svn: 233209
2015-03-25 20:10:36 +00:00
Krzysztof Parzyszek 62b41b9458 [Hexagon] Keep the bare getSubtargetImpl for now
llvm-svn: 233206
2015-03-25 19:51:52 +00:00
Kit Barton 535e69de34 Add Hardware Transactional Memory (HTM) Support
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], but currently only the
'PowerPC HTM Low Level Built-in Function' are implemented.

The HTM instructions follows the RC ones and the transaction initiation result
is set on RC0 (with exception of tcheck). Currently approach is to create a
register copy from CR0 to GPR and comapring. Although this is suboptimal, since
the branch could be taken directly by comparing the CR0 value, it generates code
correctly on both test and branch and just return value. A possible future
optimization could be elimitate the MFCR instruction to branch directly.

The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.

This is send along a clang patch to enabled the builtins and option switch.

[1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html

Phabricator Review: http://reviews.llvm.org/D8247

llvm-svn: 233204
2015-03-25 19:36:23 +00:00
Rafael Espindola 59f90b215d clang-format bits of code to make another patch readable.
llvm-svn: 233203
2015-03-25 19:24:39 +00:00
Peter Collingbourne b736065f78 DebugInfo: Permit DW_TAG_structure_type, DW_TAG_member, DW_TAG_typedef tags with empty file names.
Some languages, such as Go, have pre-defined structure types (e.g. "string"
is essentially a pointer/length pair) or pre-defined "typedef" types
(e.g. "error" is essentially a typedef for a specific interface type).
Such types do not have associated source location, so a Go frontend would
be correct not to associate a file name with such types.

This change relaxes the DIType verifier to permit unlocated types with
these tags.

Differential Revision: http://reviews.llvm.org/D8588

llvm-svn: 233200
2015-03-25 17:44:49 +00:00
Sanjay Patel 2f8f019daf [X86, AVX] improve insertion into zero element of 256-bit vector
This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.

For f32, instead of:

   vblendps	$1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
   vblendps	$15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

   vblendps	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

   vmovsd	%xmm1, %xmm0, %xmm1     ## xmm1 = xmm1[0],xmm0[1]
   vblendpd	$3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

   vblendpd	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]

For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.

Differential Revision: http://reviews.llvm.org/D8609

llvm-svn: 233199
2015-03-25 17:36:01 +00:00
Benjamin Kramer b4b5150dfc [APInt] Add an isSplat helper and use it in some places.
To complement getSplat. This is more general than the binary
decomposition method as it also handles non-pow2 splat sizes.

llvm-svn: 233195
2015-03-25 16:49:59 +00:00
Benjamin Kramer 327ec24b4d [Hexagon] Pattern match a CTZ loop into a call to countTrailingZeros.
No functional change intended.

llvm-svn: 233192
2015-03-25 15:36:57 +00:00
Benjamin Kramer 860323fd4f [ARM] Rewrite .save/.vsave emission with bit math
Hopefully makes it a bit easier to understand what's going on.
No functional change intended.

llvm-svn: 233191
2015-03-25 15:27:58 +00:00
Rafael Espindola f275ad8af1 Fix fixup evaluation when deciding what to relocate with.
The previous logic was to first try without relocations at all
and failing that stop on the first defined symbol.

That was inefficient and incorrect in the case part of the
expression could be simplified and another part could not
(see included test).

We now stop the evaluation when we get to a variable whose value
can change (i.e. is weak).

llvm-svn: 233187
2015-03-25 13:16:53 +00:00
Andrea Di Biagio 460948c9ab [optnone] Skip pass Float2Int on optnone functions.
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.

llvm-svn: 233183
2015-03-25 12:22:37 +00:00
James Molloy cb75d92458 Reapply r233062: "float2int": Add a new pass to demote from float to int where possible.
Now with a fix for PR23008 and extra regression test.

llvm-svn: 233175
2015-03-25 10:03:42 +00:00
Craig Topper f2071f2672 [X86] Remove GetCpuIDAndInfo, GetCpuIDAndInfoEx and DetectFamilyModel functions from X86 MC layer. They haven't been used since CPU autodetection was removed from X86Subtarget.cpp.
llvm-svn: 233170
2015-03-25 04:16:50 +00:00
Lang Hames 8389b55237 [Orc] Refactor JITCompileCallbackManagerBase and CompileOnDemandLayer to support
target-independent callback management.

This is a prerequisite for adding orc-based lazy-jitting to lli.

llvm-svn: 233166
2015-03-25 02:45:50 +00:00
Duncan P. N. Exon Smith 004ced3b08 Linker: Drop function pointers for overridden subprograms
Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`.  This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.

The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`?  Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.

Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists.  A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.

Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out.  We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.

This still isn't completely satisfactory.  Two flaws at least that I can
think of:

  - I still haven't found a straightforward way to make this symmetric
    in the IR.  (Interestingly, the DWARF output is already symmetric,
    and I've tested for that to be sure we don't regress.)
  - Using `DebugInfoFinder` to find the canonical subprogram for a
    function is kind of crazy.  We should just attach metadata to the
    function, like this:

        define weak i32 @foo(i32, i32) !dbg !MDSubprogram(...) {

llvm-svn: 233164
2015-03-25 02:26:32 +00:00
Rafael Espindola 44cc654869 Fix warning on non-assert build.
llvm-svn: 233158
2015-03-25 00:45:41 +00:00
Rafael Espindola dbb4021b64 Produce an error instead of asserting on invalid .sleb128/.uleb128.
llvm-svn: 233155
2015-03-25 00:25:37 +00:00
Paul Robinson 284f0451cf 'optnone' should not disable DAG combiner.
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.

llvm-svn: 233153
2015-03-25 00:10:24 +00:00
Philip Reames 4dbd88f3b4 !invariant.load semantics with potentially clobbering calls
A load from an invariant location is assumed to not alias any otherwise potentially aliasing stores. Our implementation only applied this rule to store instructions themselves whereas they it should apply for any memory accessing instruction. This results in both FRE and PRE becoming more effective at eliminating invariant loads.

Note that as a follow on change I will likely move this into AliasAnalysis itself. That's where the TBAA constant flag is handled and the semantics are essentially the same. I'd like to separate the semantic change from the refactoring and thus have extended the hack that's already in MemoryDependenceAnalysis for this change.

Differential Revision: http://reviews.llvm.org/D8591

llvm-svn: 233140
2015-03-24 23:54:54 +00:00
Rafael Espindola c9e7068cdd Don't be over eager in evaluating a subtraction with a weak symbol.
In a subtraction of the form A - B, if B is weak, there is no way to represent
that on ELF since all relocations add the value of a symbol.

llvm-svn: 233139
2015-03-24 23:48:44 +00:00
Reid Kleckner 11470c48d0 X86: Fix frameescape when not using an FP
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.

llvm-svn: 233137
2015-03-24 23:46:01 +00:00
Andrew Kaylor 5c73e1f85c Disabling warnings for MSVC build to enable /W4 use.
Differential Revision: http://reviews.llvm.org/D8572

llvm-svn: 233133
2015-03-24 23:37:10 +00:00
David Blaikie 156d46eda0 Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)

llvm-svn: 233131
2015-03-24 23:34:31 +00:00
Sanjay Patel e304bea010 optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.

This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110

llvm-svn: 233127
2015-03-24 22:39:29 +00:00
David Blaikie 68d535c45f Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)

llvm-svn: 233126
2015-03-24 22:38:16 +00:00
Philip Reames 2b969d7010 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297

llvm-svn: 233125
2015-03-24 22:28:45 +00:00
David Blaikie 1a6bb9fcf6 Revert "Remove an InstCombine that seems to have become redundant."
Assertion fires in compiler-rt. Guess it does fire..

This reverts commit r233116.

llvm-svn: 233121
2015-03-24 21:50:35 +00:00
Rafael Espindola 8b4817b5f7 Reset the CFA offset at the start of every FDE.
This fixes PR21515.

llvm-svn: 233120
2015-03-24 21:47:31 +00:00
Peter Collingbourne e8813e6c2c AArch64: use a different means to determine whether to byte swap relocations.
This code depended on a bug in the FindAssociatedSection function that would
cause it to return the wrong result for certain absolute expressions. Instead,
use EvaluateAsRelocatable.

llvm-svn: 233119
2015-03-24 21:47:03 +00:00
David Blaikie e37e10dc57 Remove an InstCombine that seems to have become redundant.
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.

llvm-svn: 233116
2015-03-24 21:31:31 +00:00
Sanjay Patel 43a87fdc79 [X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.

This is also a continuation of the transform that started in D8486. 
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle. 

This is an implementation of that suggestion.

Differential Revision: http://reviews.llvm.org/D8567

llvm-svn: 233110
2015-03-24 20:36:42 +00:00
Hans Wennborg e42c64551a Revert r233062 ""float2int": Add a new pass to demote from float to int where possible."
This caused PR23008, compiles failing with: "Use still stuck around after Def is
destroyed: %.sroa.speculated"

Also reverting follow-up r233064.

llvm-svn: 233105
2015-03-24 20:07:08 +00:00
Sanjoy Das 45dc94a856 [IRCE] Fix how IRCE checks for no-sign-overflow.
IRCE requires the induction variables it handles to not sign-overflow.
The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)}
fails when SCEV simplifies sext(X) too.  After this change we //also//
check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr.

llvm-svn: 233102
2015-03-24 19:29:22 +00:00
Sanjoy Das 337d46b36f [IRCE] Fix a regression introduced in r232444.
IRCE should not try to eliminate range checks that check an induction
variable against a loop-varying length.

llvm-svn: 233101
2015-03-24 19:29:18 +00:00
Sanjay Patel 99d246d7d7 [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984)
vperm2x128 instructions have the special ability (aka free hardware capability)
to shuffle zero values into a vector.

This patch recognizes that type of shuffle and generates the appropriate
control byte.

https://llvm.org/bugs/show_bug.cgi?id=22984

Differential Revision: http://reviews.llvm.org/D8563

llvm-svn: 233100
2015-03-24 19:19:07 +00:00
Duncan P. N. Exon Smith fc25da101c Verifier: Start recursing into !dbg attachments
The main verifier already recurses through the other entry points, so we
might as well descend here too.

This temporarily duplicates some work already done in
`verifyDebugInfo()`, but eventually I'll be removing the other side.

llvm-svn: 233095
2015-03-24 17:32:19 +00:00
Duncan P. N. Exon Smith f238c78c4c Verifier: !llvm.dbg.cu must point at compile units
Duplicate this check from `verifyDebugInfo()`.

llvm-svn: 233094
2015-03-24 17:18:03 +00:00
David Blaikie 19ef0d3b97 Refactor: Simplify boolean expressions in lib/Analysis
Simplify boolean expressions using `true` and `false` with `clang-tidy`

Patch by Richard Thomson.

Reviewed By: nlewycky

Differential Revision: http://reviews.llvm.org/D8528

llvm-svn: 233091
2015-03-24 16:33:19 +00:00
David Blaikie 186d2cbd1d Refactor: Simplify boolean expressions in AArch64 target
Simplify boolean expressions using `true` and `false` with `clang-tidy`

Patch by Richard Thomson.

Reviewed By: rengolin

Differential Revision: http://reviews.llvm.org/D8525

llvm-svn: 233089
2015-03-24 16:24:01 +00:00
Daniel Sanders c676f2a8bb [mips] Support 16-bit offsets for 'm' inline assembly memory constraint.
Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8435

llvm-svn: 233086
2015-03-24 15:19:14 +00:00
Marek Olsak aab1a8daee R600/SI: Insert more NOPs after READLANE on VI, don't use NOPs on CI
This is a candidate for stable.

llvm-svn: 233080
2015-03-24 13:40:38 +00:00
Marek Olsak 949f5dab95 R600/SI: Select V_BFE_U32 for and+shift with a non-literal offset
llvm-svn: 233079
2015-03-24 13:40:34 +00:00
Marek Olsak 9b72868d17 R600/SI: Custom-select 32-bit S_BFE from bitwise opcodes
llvm-svn: 233078
2015-03-24 13:40:27 +00:00
Marek Olsak 63a7b084eb R600/SI: Improve BFM support
llvm-svn: 233077
2015-03-24 13:40:21 +00:00
Marek Olsak 7d77728c97 R600/SI: Use V_FRACT_F64 for faster 64-bit floor on SI
Other f64 opcodes not supported on SI can be lowered in a similar way.

v2: use complex VOP3 patterns
llvm-svn: 233076
2015-03-24 13:40:15 +00:00
Marek Olsak 43650e45c3 R600/SI: Expand fract to floor, then only select V_FRACT on CI
V_FRACT is buggy on SI.

R600-specific code is left intact.

v2: drop the multiclass, use complex VOP3 patterns
llvm-svn: 233075
2015-03-24 13:40:08 +00:00
Benjamin Kramer 722ff28643 Internalize the StackMapLiveness pass.
No need to have its own header when it's not used anywhere. NFC.

llvm-svn: 233072
2015-03-24 13:20:54 +00:00
Michael Kuperstein 29704e7fb4 Revert "Use std::bitset for SubtargetFeatures"
This reverts commit r233055.

It still causes buildbot failures (gcc running out of memory on several platforms, and a self-host failure on arm), although less than the previous time.

llvm-svn: 233068
2015-03-24 12:56:59 +00:00
Aaron Ballman d5cc45f192 Silencing some MSVC warnings "C4805: '^' : unsafe mix of type 'bool' and type 'unsigned int' in operation"; NFC.
llvm-svn: 233067
2015-03-24 12:47:51 +00:00
Simon Atanasyan c99ce681ca [mips] Simplify boolean expressions in Mips target with `clang-tidy`
No functional changes.

Patch by Richard Thomson.

Differential Revision: http://reviews.llvm.org/D8522

llvm-svn: 233065
2015-03-24 12:24:56 +00:00
Benjamin Kramer e3b961a6e2 [float2int] Sort includes and add missing raw_ostream include.
llvm-svn: 233064
2015-03-24 11:28:47 +00:00
Daniel Sanders a73d8fe2ad [mips] Distinguish 'R', 'ZC', and 'm' inline assembly memory constraint.
Summary:
Previous behaviour of 'R' and 'm' has been preserved for now. They will be
improved in subsequent commits.

The offset permitted by ZC varies according to the subtarget since it is
intended to match the restrictions of the pref, ll, and sc instructions.

The restrictions on these instructions are:
* For microMIPS: 12-bit signed offset.
* For Mips32r6/Mips64r6: 9-bit signed offset.
* Otherwise: 16-bit signed offset.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8414

llvm-svn: 233063
2015-03-24 11:26:34 +00:00
James Molloy 408df5160c "float2int": Add a new pass to demote from float to int where possible.
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.

This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.

This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.

To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.

llvm-svn: 233062
2015-03-24 11:15:23 +00:00
Michael Kuperstein 774b441b5e Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first time this was committed (r229831), it caused several buildbot failures. 
At least some of the ARM ones were due to gcc/binutils issues, and should now be fixed.

Differential Revision: http://reviews.llvm.org/D8542

llvm-svn: 233055
2015-03-24 09:17:25 +00:00
Lang Hames cd118e7632 [Orc] Move delta-handling for trampoline sizes into the resolver block.
This is the first step towards adding a target-independent callback
handler API.

llvm-svn: 233049
2015-03-24 04:27:02 +00:00
Simon Pilgrim 481f4146cd [SelectionDAG] Fixed issue with uitofp vector constant folding being treated as sitofp
While the uitofp scalar constant folding treats an integer as an unsigned value (from lang ref):

%X = sitofp i8 -1 to double ; yields double:-1.0
%Y = uitofp i8 -1 to double ; yields double:255.0

The vector constant folding was always using sitofp:

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>

This patch fixes this so that the correct opcode is used for sitofp and uitofp.

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double 255.0, double 255.0>

Differential Revision: http://reviews.llvm.org/D8560

llvm-svn: 233033
2015-03-23 22:44:55 +00:00
Duncan P. N. Exon Smith 9b9cc2dad4 DebugInfo: Overload get() in DIDescriptor subclasses
Continue to simplify the `DIDescriptor` subclasses, so that they behave
more like raw pointers.  Remove `getRaw()`, replace it with an
overloaded `get()`, and overload the arrow and cast operators.  Two
testcases started to crash on the arrow operators with this change
because of `scope:` references that weren't real scopes.  I fixed them.
Soon I'll add verifier checks for them too.

This also adds explicit dereference operators.  Previously, the builtin
dereference against `operator MDNode *()` would have worked, but now the
builtins are ambiguous.

llvm-svn: 233030
2015-03-23 21:54:07 +00:00
Rafael Espindola f2b408c64e Refactor how passes get a symbol at the end of a section.
There is now a canonical symbol at the end of a section that different
passes can request.

This also allows us to assert that we don't switch back to a section whose
end symbol has already been printed.

llvm-svn: 233026
2015-03-23 21:22:04 +00:00
Ahmed Bougacha d1655cb1c0 [AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1.
The pass used to be enabled by default with CodeGenOpt::Less (-O1).
This is too aggressive, considering the pass indiscriminately merges
all globals together.

Currently, performance doesn't always improve, and, on code that uses
few globals (e.g., the odd file- or function- static), more often than
not is degraded by the optimization.  Lengthy discussion can be found
on llvmdev (AArch64-focused;  ARM has similar problems):
  http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html
Also, it makes tooling and debuggers less useful when dealing with
globals and data sections.

GlobalMerge needs to better identify those cases that benefit, and this
will be done separately.  In the meantime, move the pass to run with
-O3 rather than -O1, on both ARM and AArch64.

llvm-svn: 233024
2015-03-23 21:17:36 +00:00
David Blaikie 4eaa79c8d9 Refactor: Simplify boolean expressions in R600 target
Simplify boolean expressions with `true` and `false` using `clang-tidy`

Patch by Richard Thomson.

Differential Revision: http://reviews.llvm.org/D8520

llvm-svn: 233020
2015-03-23 20:56:44 +00:00
Rafael Espindola ae3d78ac18 Update variable name and reuse existing variable. NFC.
llvm-svn: 233014
2015-03-23 20:25:31 +00:00
Chris Bieneman 6a1b54acc7 Raising minimum required CMake version to 2.8.12.2.
This commit is in reference to the llvm-dev thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/083672.html

llvm-svn: 233008
2015-03-23 20:03:57 +00:00
David Blaikie 9965c5ae14 Refactor: Simplify boolean expressions in llvm IR
Simplify boolean expressions using `true` and `false` with `clang-tidy`

Patch by Richard Thomson with a few other simplifications to fix
else-after-returns in the surrounding code.

Differential Revision: http://reviews.llvm.org/D8527

llvm-svn: 233005
2015-03-23 19:51:23 +00:00
David Blaikie 4f75c097b0 Refactor: Simplify boolean expressions in llvm Support
Simplify boolean expressions using `true` and `false` with `clang-tidy`

Patch by Richard Thomson - I dropped the parens and != 0 test, for
consistency with other patches/tests like this, but I'm open to the
notion that we should add the explicit non-zero test in all these sort
of cases (non-bool assigned to a bool).

Differential Revision: http://reviews.llvm.org/D8526

llvm-svn: 233004
2015-03-23 19:45:40 +00:00
David Blaikie 50e4f9e4c8 Refactor: Simplify boolean expressions in x86 target
Simplify boolean expressions with `true` and `false` with `clang-tidy`

Patch by Richard Thomson.

Differential Revision: http://reviews.llvm.org/D8519

llvm-svn: 233002
2015-03-23 19:42:36 +00:00
Benjamin Kramer 799003bf8c Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Benjamin Kramer 1f7c328bf2 [ctorutils] Update and sort includes. NFC.
llvm-svn: 232995
2015-03-23 19:06:17 +00:00
Benjamin Kramer a8d61b104d [winehprepare] Update and sort includes. NFC.
llvm-svn: 232994
2015-03-23 18:57:17 +00:00
Benjamin Kramer b85d3756a6 Another set of missing raw_ostream.h. Still no functional change.
llvm-svn: 232993
2015-03-23 18:45:56 +00:00
Matt Arsenault 88a13c6c8d R600/SI: Merge tables for commuting
Don't use a separate table for compares anymore,
and use the same VOP2_REV class.

llvm-svn: 232992
2015-03-23 18:45:41 +00:00
Matt Arsenault 0943b0e30f R600/SI: Only use one range of isCommutable for compares
Also don't count the class instructions as isCompare anymore.

llvm-svn: 232991
2015-03-23 18:45:38 +00:00
Matt Arsenault 448dac05cd R600/SI: Remove redundant unsetting of hasSideEffects
These are already set in the base class for the instruction.

llvm-svn: 232990
2015-03-23 18:45:36 +00:00
Matt Arsenault 42f39e1a3f R600/SI: Move hasSideEffects setting into VOPCX classes
llvm-svn: 232989
2015-03-23 18:45:35 +00:00
Matt Arsenault f5b2cd891a R600/SI: Allow commuting compares
This enables very common cases to switch to the
smaller encoding.

All of the standard LLVM canonicalizations of comparisons
are the opposite of what we want. Compares with constants
are moved to the RHS, but the first operand can be an inline
immediate, literal constant, or SGPR using the 32-bit VOPC
encoding.

There are additional bad canonicalizations that should
also be fixed, such as canonicalizing ge x, k to gt x, (k + 1)
if this makes k no longer an inline immediate value.

llvm-svn: 232988
2015-03-23 18:45:30 +00:00
Matt Arsenault 05b617fed5 R600/SI: Use right class for cmpsx f64 instructions
Use VOPCX_F64 to not need the let Defs = [EXEC]

llvm-svn: 232987
2015-03-23 18:45:23 +00:00
Matt Arsenault a2dd76f41c R600/SI: Remove cond operand to VOPCX classes
It isn't used, and these will probably never be directly selected.

llvm-svn: 232986
2015-03-23 18:45:20 +00:00
Yaron Keren 7773a72c0f Add missing ELFObjectWriter::reset() override, like other MC classes.
See detailed discussion at

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140915/235418.html

and r217907, r217948:

 http://llvm.org/viewvc/llvm-project?view=revision&revision=217907
 http://llvm.org/viewvc/llvm-project?view=revision&revision=217948

llvm-svn: 232982
2015-03-23 18:35:01 +00:00
Benjamin Kramer de9f090e10 More missing includes only visible to MSVC.
NFC.

llvm-svn: 232981
2015-03-23 18:23:08 +00:00
Benjamin Kramer 4073ce8d04 Add missing include that MSVC complains about.
Also reorder includes a bit, NFC.

llvm-svn: 232980
2015-03-23 18:19:41 +00:00
Benjamin Kramer 16132e6faa Purge unused includes throughout libSupport.
NFC.

llvm-svn: 232976
2015-03-23 18:07:13 +00:00
Chad Rosier affe181b39 [AArch64] Enable rematerialization of float 0 values.
Patch by Geoff Berry<gberry@codeaurora.org>.

llvm-svn: 232967
2015-03-23 17:19:34 +00:00
Bradley Smith ae0ad9c95d Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions"
This change is incorrect since it converts double rounding into single rounding,
which can produce different results. Instead this optimization will be done by
modifying Clang's codegen to not produce double rounding in the first place.

This reverts commit r232954.

llvm-svn: 232962
2015-03-23 16:52:52 +00:00
Eli Bendersky 3e84019a39 Simplify boolean expressions with true and false using clang-tidy
Patch by Richard (legalize@xmission.com)

Differential Revision: http://reviews.llvm.org/D8521

llvm-svn: 232961
2015-03-23 16:26:23 +00:00
James Molloy fa041153e5 [ARM] Remove target-specific ITOFP/FPTOI nodes
Anton tried this 5 years ago but it was reverted due to extra VMOVs
being emitted. This can be easily fixed with a liberal application
of patterns - matching loads/stores and extractelts.

llvm-svn: 232958
2015-03-23 16:15:16 +00:00
Tom Stellard f0a575f6be R600/SI: Fix crash in SIInstrInfo::areLoadsFromSameBasePtr()
This function assumed that SMRD instructions always have immediate
offsets, which is not always the case.

llvm-svn: 232957
2015-03-23 16:06:01 +00:00
Colin LeMahieu 473e34782d [Hexagon] Simplify boolean expression
Patch by Richard
http://reviews.llvm.org/D8523

llvm-svn: 232955
2015-03-23 16:01:03 +00:00
Bradley Smith bc0f0d8c49 [ARM] Add more pattern matching for f16 <-> f64 conversions
Specifically when the conversion is done in two steps, f16 -> f32 -> f64.

For example:

%1 = tail call float @llvm.convert.from.fp16.f32(i16 %0)
%conv = fpext float %1 to double

to:

vcvtb.f64.f16

llvm-svn: 232954
2015-03-23 15:59:54 +00:00
Benjamin Kramer b1d8c46f0e [gcov] Move formatBranchInfo into an anonymous namespace.
NFC.

llvm-svn: 232949
2015-03-23 13:59:13 +00:00
Benjamin Kramer 51f6096cf8 Move private classes into anonymous namespaces
NFC.

llvm-svn: 232944
2015-03-23 12:30:58 +00:00
Petar Jovanovic 5b4362276b Fix sign extension for MIPS64 in makeLibCall function
Fixing sign extension in makeLibCall for MIPS64. In MIPS64 architecture all
32 bit arguments (int, unsigned int, float 32 (soft float)) must be sign
extended. This fixes test "MultiSource/Applications/oggenc/".

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D7791

llvm-svn: 232943
2015-03-23 12:28:13 +00:00
Daniel Sanders f731eee322 [aarch64] Distinguish the 'Q' and 'm' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

Clang also has code for 'Ump', 'Utf', 'Usa', and 'Ush' but calls
llvm_unreachable() on this code path so they are not converted to a
constraint id at the moment.

No functional change intended.

Reviewers: t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D8177

llvm-svn: 232941
2015-03-23 11:33:15 +00:00
Hal Finkel 8f7c5a7f18 [SDAG] Don't widen VSETCC during type legalization for split operands
Because the operands of a vector SETCC node can be of a different type from the
result (and often are), it can happen that even if we'd prefer to widen the
result type of the SETCC, the operands have been split instead. In this case,
the SETCC result also must be split. This mirrors what is done in
WidenVecRes_SELECT, and should be NFC elsewhere because if the operands are not
widened the following calls to GetWidenedVector will assert (which is what was
happening in the test case).

llvm-svn: 232935
2015-03-23 08:22:43 +00:00
Craig Topper 3b1c3501f2 Fix typo 'AVX too' instead of 'AVX2'
llvm-svn: 232929
2015-03-23 04:17:11 +00:00
Craig Topper 1e1b0f732a [X86] Add one stepping of Broadwell to the CPU name autodetection for march=native.
llvm-svn: 232927
2015-03-23 00:15:06 +00:00
David Majnemer abd9f5bfb6 Silence a GCC warning
llvm-svn: 232923
2015-03-22 21:27:10 +00:00
Benjamin Kramer 66f486fe11 FoldingSet: Make FoldingSetImpl's dtor protected and non-virtual
It's not intended to be polymorphically deleted. Make FoldingSet
and ContextualFoldingSet final to avoid noise from -Wnon-virtual-dtor.

No functional change intended.

llvm-svn: 232922
2015-03-22 18:22:33 +00:00
Simon Pilgrim 3f229eaf3f Fixed MSVC compile warning issue introduced in r232837
- was reporting 'warning C4715: 'getType32' : not all control paths return a value'

llvm-svn: 232913
2015-03-22 13:38:36 +00:00
Benjamin Kramer d6aa0ec737 [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
llvm-svn: 232903
2015-03-21 22:04:26 +00:00
Benjamin Kramer 7857d723f1 [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

llvm-svn: 232902
2015-03-21 21:09:33 +00:00
Benjamin Kramer 691363e7f2 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

llvm-svn: 232896
2015-03-21 15:36:21 +00:00
Benjamin Kramer 0248a3e549 ValueTracking: Forward getConstantStringInfo's TrimAtNul param into recursive invocation
Currently this is only used to tweak the backend's memcpy inlining
heuristics, testing that isn't very helpful. A real test case will
follow in the next commit, where this behavior would cause a real
miscompilation.

llvm-svn: 232895
2015-03-21 15:36:06 +00:00
David Majnemer e165502ed7 MemoryDependenceAnalysis: Don't miscompile atomics
r216771 introduced a change to MemoryDependenceAnalysis that allowed it
to reason about acquire/release operations.  However, this change does
not ensure that the acquire/release operations pair.  Unfortunately,
this leads to miscompiles as we won't see an acquire load as properly
memory effecting.  This largely reverts r216771.

This fixes PR22708.

llvm-svn: 232889
2015-03-21 06:19:17 +00:00
Eric Christopher 4d0f35a901 Remove the target independent TargetMachine::getSubtarget and
TargetMachine::getSubtargetImpl routines.

This keeps the target independent code free of bare subtarget
calls while the remainder of the backends are migrated, or not
if they don't wish to support per-function subtargets as would
be needed for function multiversioning or LTO of disparate
cpu subarchitecture types, e.g.

clang -msse4.2 -c foo.c -emit-llvm -o foo.bc
clang -c bar.c -emit-llvm -o bar.bc
llvm-link foo.bc bar.bc -o baz.bc
llc baz.bc

and get appropriate code for what the command lines requested.

llvm-svn: 232885
2015-03-21 04:22:23 +00:00
Eric Christopher faad620569 Remove the bare getSubtargetImpl call from the AArch64 port. As part
of this add a test that shows we can generate code for functions
that specifically enable a subtarget feature.

llvm-svn: 232884
2015-03-21 04:04:50 +00:00
Eric Christopher 83eb13c967 Remove the bare getSubtargetImpl call from the PPC port. As part
of this add a test that shows we can generate code with
for functions that differ by subtarget feature.

llvm-svn: 232882
2015-03-21 03:36:02 +00:00
Eric Christopher 8024d030fb Grab a subtarget off of an AMDGPUTargetMachine rather than a
bare target machine in preparation for the TargetMachine bare
getSubtarget/getSubtargetImpl calls going away.

llvm-svn: 232880
2015-03-21 03:17:25 +00:00
Eric Christopher c5a85af3b2 Cache the Function dependent subtarget on the MachineFunction.
As preparation for removing the getSubtargetImpl() call from
TargetMachine go ahead and flip the switch on caching the function
dependent subtarget and remove the bare getSubtargetImpl call
from the X86 port. As part of this add a few tests that show we
can generate code and assemble on X86 based on features/cpu on
the Function.

llvm-svn: 232879
2015-03-21 03:13:10 +00:00
Eric Christopher cba722f8c1 Grab the cached subtarget off of the MachineFunction.
llvm-svn: 232878
2015-03-21 03:13:07 +00:00
Eric Christopher 948bdf996b Grab a subtarget off of a MipsTargetMachine rather than a
bare target machine in preparation for the TargetMachine bare
getSubtarget/getSubtargetImpl calls going away.

llvm-svn: 232877
2015-03-21 03:13:05 +00:00
Eric Christopher 5c3dffc459 Simplify the query for a subtarget in the NVPTX pass manager.
llvm-svn: 232876
2015-03-21 03:13:03 +00:00
Eric Christopher cd53d6eda7 Change getISAEncoding to use the target triple to determine
thumb-ness similar to the rest of the Module level asm printing
infrastructure as debug info finalization happens after the function
may be missing.

llvm-svn: 232875
2015-03-21 03:13:01 +00:00
Eric Christopher 23a7d1e6f4 Make the Hexagon ISelDAGToDAG pass set the subtarget dynamically
on each runOnMachineFunction invocation.

llvm-svn: 232874
2015-03-21 03:12:59 +00:00
Kostya Serebryany f4e35cc47d [sanitizer] experimental tracing for cmp instructions
llvm-svn: 232873
2015-03-21 01:29:36 +00:00
Ahmed Bougacha 7173b669b4 [CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators.
If we couldn't analyze its terminator (i.e., it's an indirectbr, or some
other weirdness), we can't safely re-if-convert a predicated block,
because we can't tell whether the predicated terminator can
fallthrough (it does).

Currently, we would completely ignore the fallthrough successor. In
the added testcase, this means we used to generate:

    ...
  @ %entry:
    cmp   r5, #21
    ittt  ne
  @ %cc1f:
    cmpne r7, #42
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %cc1t:
    ...

Whereas the successor of %cc1f was originally %bb1.
With the fix, we get the correct:

    ...
  @ %entry:
    cmp   r5, #21
    itt   eq
  @ %cc1t:
    streq.w       r5, [r11]
    moveq pc, r0
  @ %cc1f:
    cmp   r7, #42
    itt   ne
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %bb1:
    ...

rdar://20192768
Differential Revision: http://reviews.llvm.org/D8509

llvm-svn: 232872
2015-03-21 01:23:15 +00:00
Ahmed Bougacha e6bb09ac3f [AArch64] Prefer UZP for concat_vector of illegal truncs.
Follow-up to r232459: prefer a UZP shuffle to the intermediate truncs.

llvm-svn: 232871
2015-03-21 01:08:39 +00:00
Filipe Cabecinhas 008067aca9 Make getLastArgNoClaim work for up to 4 arguments.
Summary:
This is needed for http://reviews.llvm.org/D8507
I have no idea what stand-alone tests could be done, if needed.

Reviewers: Bigcheese, craig.topper, samsonov

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8508

llvm-svn: 232859
2015-03-20 23:32:58 +00:00
Sanjay Patel ccf5f24b7b [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles
vperm2* intrinsics are just shuffles. 
In a few special cases, they're not even shuffles.

Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:

1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
   really angry (and so I have regrets about some patches from last week).

2. Doing mask conversion logic in header files is hard to write and 
   subsequently read.

There are a couple of TODOs in this patch to complete this optimization.

Differential Revision: http://reviews.llvm.org/D8486

llvm-svn: 232852
2015-03-20 21:47:56 +00:00
Andrew Kaylor 3170e5620e Fixing a bug with WinEH PHI handling
llvm-svn: 232851
2015-03-20 21:42:54 +00:00
Sanjay Patel c88f724fed [X86] Prefer blendps over insertps codegen for one special case
With this patch, for this one exact case, we'll generate:

  blendps %xmm0, %xmm1, $1

instead of:

  insertps %xmm0, %xmm1, $0

If there's a memory operand available for load folding and we're
optimizing for size, we'll still generate the insertps.

The detailed performance data motivation for this may be found in D7866; 
in summary, blendps has 2-3x throughput vs. insertps on widely used chips.

Differential Revision: http://reviews.llvm.org/D8332

llvm-svn: 232850
2015-03-20 21:19:52 +00:00
Benjamin Kramer 063667cea2 X86: Make helper functions static. NFC.
llvm-svn: 232848
2015-03-20 21:07:30 +00:00
Eric Christopher 594fa96a57 Remove dead calls and function arguments dealing with TRI in StackMaps.
llvm-svn: 232847
2015-03-20 21:05:18 +00:00
Rafael Espindola 36a15cb975 Don't declare all text sections at the start of the .s
The code this patch removes was there to make sure the text sections went
before the dwarf sections. That is necessary because MachO uses offsets
relative to the start of the file, so adding a section can change relaxations.

The dwarf sections were being printed at the start just to produce symbols
pointing at the start of those sections.

The underlying issue was fixed in r231898. The dwarf sections are now printed
when they are about to be used, which is after we printed the text sections.

To make sure we don't regress, the patch makes the MachO streamer assert
if CodeGen puts anything unexpected after the DWARF sections.

llvm-svn: 232842
2015-03-20 20:00:01 +00:00
Duncan P. N. Exon Smith 23e56ecf26 AsmPrinter: Check subprogram before using it
Check return of `getDISubprogram()` before using it.  A WIP patch makes
`DIDescriptor` accessors more strict (and would crash on this).

llvm-svn: 232838
2015-03-20 19:50:00 +00:00
Rafael Espindola bdfbde56e0 Reorganize the x86 ELF relocation selection logic.
The main differences are:

* Split in 32 and 64 bit functions.
* First switch on the Modifier so that we have only one non fully covered
  switch.
* Map the fixup kind first to a x86_64 (or i386) specific enum, to make
  it easy to handle cases like X86::reloc_riprel_4byte_movq_load.
* Switch on IsPCRel last, which reduces code duplication.

Fixes pr22308.

llvm-svn: 232837
2015-03-20 19:48:54 +00:00
Duncan P. N. Exon Smith d3a057733f DwarfDebug: Check for null DebugLocs
`DL` might be null, so check for that before using accessors.  A WIP
patch to make `DIDescriptors` more strict fails otherwise.

As a bonus, I think the logic is easier to follow now (despite the extra
nesting depth).

llvm-svn: 232836
2015-03-20 19:37:03 +00:00
Duncan P. N. Exon Smith a3bdc328a5 Verifier: Check that !dbg attachments have the right type
A WIP patch makes `DIDescriptor` accessors more strict, which in turn
causes the `DebugInfoFinder` to crash on wrongly typed `!dbg`
attachments.  Catch that error up front in
`Verifier::visitInstruction()`.

Also remove a test that we "handle" invalid `!dbg` attachments, added
back in r99938.  We don't want to handle those anymore.

Note: I'm *not* recursing and verifying the debug info graph reachable
from this node; that work is already done by `verifyDebugInfo()`.

llvm-svn: 232834
2015-03-20 19:26:58 +00:00
Duncan P. N. Exon Smith d4e07c973c DebugInfoFinder: Check for null imported entities
Don't use the accessors in `DIImportedEntity` on a null pointer.  (A WIP
patch to make `DIDescriptor` accessors more strict crashes here
otherwise.)

llvm-svn: 232833
2015-03-20 19:13:53 +00:00
Duncan P. N. Exon Smith 18c97fa2a0 SanitizerCoverage: Check for null DebugLocs
After a WIP patch to make `DIDescriptor` accessors more strict, this
started asserting.

llvm-svn: 232832
2015-03-20 18:48:45 +00:00
Hans Wennborg 90aa1a9653 SelectionDAGBuilder: Rangeify a loop. NFC.
llvm-svn: 232831
2015-03-20 18:48:40 +00:00
Hans Wennborg 2bdc4cf35f SelectionDAGBuilder::handleJTSwitchCase, simplify loop; NFC
llvm-svn: 232830
2015-03-20 18:48:31 +00:00
Wei Mi 6c428d6ff6 Correctly estimate SROA savings for store operands in inline cost analysis.
When estimating SROA savings, we want to see if an address is derived
off an alloca in the caller. For store instructions, operand 1 is the
address operand, but the current code uses operand 0.  Use
getPointerOperand for loads and stores to fix this.

Patch by Easwaran Raman.
http://reviews.llvm.org/D8425

llvm-svn: 232827
2015-03-20 18:33:12 +00:00
Daniel Berlin 9e77de2a1e Small optimization to avoid getting pass info when we will not run loop
llvm-svn: 232826
2015-03-20 18:05:49 +00:00
John Brawn 1f26a47630 [ARM] Fix handling of thumb1 out-of-range frame offsets
LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its
answer when the base register changes. Unfortunately this isn't true
in thumb1, where SP-based loads allow a larger offset than
non-SP-based loads, and this causes the base register reuse code to
generate instructions that are unencodable, causing an assertion
failure. 

Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which
ARMBaseRegisterInfo can then make use of to give the correct answer. 

Differential Revision: http://reviews.llvm.org/D8419

llvm-svn: 232825
2015-03-20 17:20:07 +00:00
Simon Pilgrim 180cad2e57 Stripped trailing whitespace. NFC.
llvm-svn: 232822
2015-03-20 16:08:17 +00:00
Eric Christopher cef8e71394 Rewrite StackMap location handling to pre-compute the dwarf register
numbers before emission.

This removes a dependency on being able to access TRI at the module
level and is similar to the DwarfExpression handling. I've modified
the debug support into print/dump routines that'll do the same dumping
but is now callable anywhere and if TRI isn't available will go ahead
and just print out raw register numbers.

llvm-svn: 232821
2015-03-20 16:03:42 +00:00
Eric Christopher d43c5c75b6 At the beginning of doFinalization set the MachineFunction to
nullptr so that users get an earlier dereferencing error and
so that we can use it to conditionalize access to MachineFunction
specific data.

llvm-svn: 232820
2015-03-20 16:03:39 +00:00
Chad Rosier c83fa9e9eb Typo.
llvm-svn: 232819
2015-03-20 15:45:14 +00:00
Tom Stellard 3b0dab9f3f R600/SI: Refactor VOP2 instruction defs
llvm-svn: 232817
2015-03-20 15:14:23 +00:00
Tom Stellard 23c2c3d0f4 R600/SI: Refactor VOP1 instruction defs
llvm-svn: 232816
2015-03-20 15:14:21 +00:00
Rafael Espindola 8c8d15879f Reduce indentation after return. NFC.
llvm-svn: 232814
2015-03-20 14:33:25 +00:00
Rafael Espindola 2d74274017 Use early returns. NFC.
llvm-svn: 232813
2015-03-20 14:23:46 +00:00
Rafael Espindola 9e77cba164 Fold a llvm_unreachable into an assert. NFC.
llvm-svn: 232811
2015-03-20 13:50:15 +00:00
Rafael Espindola 5f5e24bb92 clang-format a function. NFC.
llvm-svn: 232810
2015-03-20 13:47:40 +00:00
Daniel Jasper 214997c63b [MBP] Don't outline short optional branches
With the option -outline-optional-branches, LLVM will place optional
branches out of line (more details on r231230).

With this patch, this is not done for short optional branches. A short
optional branch is a branch containing a single block with an
instruction count below a certain threshold (defaulting to 3). Still
everything is guarded under -outline-optional-branches).

Outlining a short branch can't significantly improve code locality. It
can however decrease performance because of the additional jmp and in
cases where the optional branch is hot. This fixes a compile time
regression I have observed in a benchmark.

Review: http://reviews.llvm.org/D8108
llvm-svn: 232802
2015-03-20 10:00:37 +00:00
Craig Topper 3a8eb896c9 [Tablegen] Attempt to add support for patterns containing nodes with multiple results.
This is needed for AVX512 masked scatter/gather support.

The R600 change is necessary to remove a hack that was working around the lack of multiple results.

llvm-svn: 232798
2015-03-20 05:09:06 +00:00
Nick Lewycky 2ce2832c9b Fix comment from r232794. NFC
llvm-svn: 232796
2015-03-20 02:52:23 +00:00
Alexei Starovoitov f049a68d78 [bpf] fix build
fix BPF backend build broken by r232699

llvm-svn: 232795
2015-03-20 02:35:29 +00:00
Nick Lewycky be8af48824 When simplifying a SCEV truncate by distributing, consider it a simplification to replace a cast, even if we end up with a trunc around the term. Fixes PR22960!
llvm-svn: 232794
2015-03-20 02:25:00 +00:00
Duncan P. N. Exon Smith 41a1546ebc SampleProfile: Check for missing debug locations
Don't use `DebugLoc` accessors if we're pointing at null, which will be
a problem after a WIP patch to make the `DIDescriptor` accessors more
strict.  Caught by Frontend/profile-sample-use-loc-tracking.c (in
clang).

llvm-svn: 232792
2015-03-20 00:56:55 +00:00
Duncan P. N. Exon Smith 8b866d1a6d Verifier: Remove the separate DebugInfoVerifier class
Remove the separate `DebugInfoVerifier` class, as a partial step toward
better integrating debug info verification with the `Verifier`.

Right now, verification of debug info is kind of a mess.

  - There are `DIDescriptor::Verify()` checks live in `DebugInfo.cpp`.
    These return `bool`, and there's no way to see (except by opening a
    debugger) why they fail.
  - We rely on `DebugInfoFinder` to traverse the debug info graph and
    dig up nodes.  However, the regular `Verifier` visits many of these
    nodes when it calls into debug info intrinsic operands.  Visiting
    twice and running different checks is kind of absurd.
  - Moreover, `DebugInfoFinder` asserts on failed type resolution -- the
    verifier should never assert!

By integrating the two verifiers, I'm aiming at solving these problems
(work to be done, obviously).  Verification can be localized to the
`Verifier`; we can use a naive `MDNode` operand traversal to find all
the nodes; we can verify type references instead of asserting on
failure.

There are `assert()`s sprinkled throughout the optimizer and dwarf
backend on `DIDescriptor::Verify()` checks.  This is a hangover from
when the debug info verifier was off, so I plan to remove them as I go
(once I confirm that the checks are done at verification time).

Note: to keep the behaviour of only running the debug info verifier when
-verify succeeds, I've added an `EverBroken` flag.  Once the
`DebugInfoFinder` assertions are gone and the two traversals have been
merged, I expect to be able to remove this.

llvm-svn: 232790
2015-03-20 00:48:23 +00:00
Hans Wennborg 077845eb81 Rewrite SelectionDAGBuilder::Clusterify to run in linear time. NFC.
It was previously repeatedly erasing elements from the middle of a vector,
causing O(n^2) worst-case run-time.

llvm-svn: 232789
2015-03-20 00:41:03 +00:00
Eric Christopher d83003ea59 Use the cached subtarget on the MachineFunction when the AsmPrinter
will have a MachineFunction, i.e. in places other than the module
level doInitialize/doFinalize.

llvm-svn: 232783
2015-03-19 23:27:42 +00:00
Eric Christopher 7585fb2d9f Use the cached subtarget off of the machine function.
llvm-svn: 232782
2015-03-19 23:06:21 +00:00
Sanjay Patel 803fb7c85c move insert, extract, concat helper functions closer to related helper functions; NFCI
llvm-svn: 232781
2015-03-19 23:04:25 +00:00
Owen Anderson db4201235b Fix a nasty bug in DAGCombine of STORE nodes.
This is very related to the bug fixed in r174431.  The problem is that
SelectionDAG does not include alignment in the uniquing of loads and
stores.  When an otherwise no-op DAGCombine would increase the alignment
of a load or store, the original node would be returned (with the
alignment increased), which would cause the node not to be processed by
any further DAGCombines.

I don't have a direct testcase for this that manifests on an in-tree
target, but I did see some noise in the tests for other targets and have
updated them for it.

llvm-svn: 232780
2015-03-19 22:48:57 +00:00
Eric Christopher cf7b5f5fc5 Remove unused headers.
llvm-svn: 232777
2015-03-19 22:36:38 +00:00
Eric Christopher 12cf76fe26 Add an MCSubtargetInfo variable to the TargetMachine.
This enables us to remove calls to the subtarget from the TargetMachine
and with a small hack for backends that require global subtarget
information for module level code generation, e.g. mips abi flags, as
mentioned in a fixme in the code.

llvm-svn: 232776
2015-03-19 22:36:37 +00:00
Eric Christopher 72e23a219c Add a TargetMachine local MCRegisterInfo and MCInstrInfo so that
they can be used without a subtarget in constructing subtarget
independent passes.

llvm-svn: 232775
2015-03-19 22:36:32 +00:00
Reid Kleckner c759fe90bc WinEH: Make llvm.eh.actions emission match the EH docs
This switches the sense of the i32 values and updates the test cases.

We can also use CHECK-SAME to clean up some tests, and reduce the visual
noise from bitcasts.

llvm-svn: 232774
2015-03-19 22:31:02 +00:00
Sanjay Patel d5c2d287f9 [X86, AVX] use blends instead of insert128 with index 0
Another case of x86-specific shuffle strength reduction:
avoid generating insert*128 instructions with index 0 because
they are slower than their non-lane-changing blend equivalents.

Shuffle lowering already catches most of these cases, but
the zero vector case and some other paths such as in the
modified test in vector-shuffle-256-v32.ll were getting
through.

Differential Revision: http://reviews.llvm.org/D8366

llvm-svn: 232773
2015-03-19 22:29:40 +00:00
Duncan P. N. Exon Smith ab58a568ee Verifier: Remove the separate -verify-di pass
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
Instead, call into the `DebugInfoVerifier` from inside
`VerifierLegacyPass::finalizeModule()`.  This better matches the logic
in `verifyModule()` (used by the new PassManager), avoids requiring two
separate passes to verify the IR, and makes the API for "add a pass to
verify the IR" simple.

Note: the `-verify-debug-info` flag still works (for now, at least;
eventually it might make sense to just remove it).

llvm-svn: 232772
2015-03-19 22:24:17 +00:00
Peter Collingbourne 994ba3d29c LowerBitSets: Avoid reusing byte set addresses.
Each use of the byte array uses a different alias. This makes the
backend less likely to reuse previously computed byte array addresses,
improving the security of the CFI mechanism based on this pass.

Differential Revision: http://reviews.llvm.org/D8455

llvm-svn: 232770
2015-03-19 22:02:10 +00:00
Peter Collingbourne 070843d60b libLTO, llvm-lto, gold: Introduce flag for controlling optimization level.
This change also introduces a link-time optimization level of 1. This
optimization level runs only the globaldce pass as well as cleanup passes for
passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html

llvm-svn: 232769
2015-03-19 22:01:00 +00:00
Duncan P. N. Exon Smith 0a93e2db9c PassManagerBuilder: Remove effectively dead 'StripDebug' option
`StripDebug` was only used by tools/opt/opt.cpp in
`AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
command-line flag before it calls `AddStandardLinkPasses()`.  Stripping
debug info twice isn't very useful.

llvm-svn: 232765
2015-03-19 21:37:17 +00:00
Hans Wennborg b4db1420c2 Switch lowering: extract NextBlock function. NFC.
llvm-svn: 232759
2015-03-19 20:41:48 +00:00
Peter Collingbourne 0dbc7088da GlobalDCE: Improve performance for large modules containing comdats.
When we encounter a global with a comdat, rather than iterating over
every global in the module to find globals in the same comdat, store the
members in a multimap. This effectively lowers the complexity to O(N log N),
improving performance significantly for large modules such as might be
encountered during LTO.

It looks like we used to do something like this until r219191.

No functional change.

Differential Revision: http://reviews.llvm.org/D8431

llvm-svn: 232743
2015-03-19 18:23:29 +00:00
Artem Belevich 9e8a039318 Add support for __nvvm_reflect changes in libdevice in CUDA-7.0
Summary:
CUDA 7.0's libdevice uses slightly different IR to call __nvvm_reflect
and that triggers an assertion in nvvm_reflect optimization pass. This
change allows nvvm_reflect pass to deal with both old and new ways to
pass an argument to __nvvm_reflect.

Test Plan: ninja check-all

Reviewers: eliben, echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8399

llvm-svn: 232732
2015-03-19 17:05:35 +00:00
Hans Wennborg 783254386e Switch lowering: remove unnecessary ConstantInt casts. NFC.
llvm-svn: 232729
2015-03-19 16:42:21 +00:00
Krzysztof Parzyszek 421133470f [Hexagon] Add support for vector instructions
llvm-svn: 232728
2015-03-19 16:33:08 +00:00
Krzysztof Parzyszek c6f19333cf [Hexagon] ENDLOOP is a non-reversible conditional branch
llvm-svn: 232725
2015-03-19 15:18:57 +00:00
Benjamin Kramer 717e973a51 Internalize PEI. NFC.
llvm-svn: 232722
2015-03-19 14:09:20 +00:00
Daniel Sanders b1fbacab5f [sparc] Small fix to r232719 to make 2007-12-17-InvokeAsm.ll pass on the buildbot.
llvm-svn: 232720
2015-03-19 11:27:23 +00:00
Daniel Sanders f5d1110075 [sparc] Only support the 'm' inline assembly memory constraint. NFC.
Summary:
SPARC doesn't seem to support any additional constraints. Therefore remove
the target hook.

No functional change intended.

Reviewers: venkatra

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8214

llvm-svn: 232719
2015-03-19 11:26:05 +00:00
Daniel Jasper 5add63f21e [InstCombine] Don't fold a GEP into itself through a PHI node
This can only occur (I think) through the back-edge of the loop.

However, folding a GEP into itself means that the value of the previous
iteration needs to be stored in the meantime, thus requiring an
additional register variable to be live, but not actually achieving
anything (the gep still needs to be executed once per loop iteration).

The attached test case is derived from:
  typedef unsigned uint32;
  typedef unsigned char uint8;
  inline uint8 *f(uint32 value, uint8 *target) {
    while (value >= 0x80) {
      value >>= 7;
      ++target;
    }
    ++target;
    return target;
  }
  uint8 *g(uint32 b, uint8 *target) {
    target = f(b, f(42, target));
    return target;
  }

What happens is that the GEP stored in incptr2 is folded into itself
through the loop's back-edge and the phi-node stored in loopptr,
effectively incrementing the ptr by "2" in each iteration instead of "1".

In this case, it is actually increasing the number of GEPs required as
the GEP before the loop can't be folded away anymore. For comparison:

With this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
    %shr = lshr i32 %newval, 7
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
    ret i8* %incptr3
  }

Without this patch:
  define i8* @test4(i32 %value, i8* %buffer) {
  entry:
    %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
    %cmp = icmp ugt i32 %value, 127
    br i1 %cmp, label %loop.header, label %exit

  loop.header:                                      ; preds = %entry
    br label %loop.body

  loop.body:                                        ; preds = %loop.body, %loop.header
    %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
    %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
    %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
    %shr = lshr i32 %newval, 7
    %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
    %cmp2 = icmp ugt i32 %newval, 16383
    br i1 %cmp2, label %loop.body, label %loop.exit

  loop.exit:                                        ; preds = %loop.body
    br label %exit

  exit:                                             ; preds = %loop.exit, %entry
    %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
    %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
    ret i8* %incptr3
  }

Review: http://reviews.llvm.org/D8245
llvm-svn: 232718
2015-03-19 11:05:08 +00:00
Rafael Espindola 7f45440bd9 Note that we don't support COFF on PPC.
Should bring back the windows bots.

llvm-svn: 232701
2015-03-19 02:40:56 +00:00
Rafael Espindola cd584a809d Split the object streamer callback in one per file format.
There are two main advantages to doing this

* Targets that only need to handle one of the formats specially don't have
  to worry about the others. For example, x86 now only registers a
  constructor for the COFF streamer.

* Changes to the arguments passed to one format constructor will not impact
  the other formats.

llvm-svn: 232699
2015-03-19 01:50:16 +00:00
Hans Wennborg 5b64657e36 SelectionDAGBuilder: update comment in HandlePHINodesInSuccessorBlocks.
From what I can tell, the code is checking for PHIs that expect any value from
this block, not just constants.

llvm-svn: 232697
2015-03-19 00:57:51 +00:00
Matthias Braun a25e13aaf1 Do not track subregister liveness when it brings no benefits
Some subregisters are only to indicate different access sizes, while not
providing any way to actually divide the register up into multiple
disjunct parts. Avoid tracking subregister liveness in these cases as it
is not beneficial.

Differential Revision: http://reviews.llvm.org/D8429

llvm-svn: 232695
2015-03-19 00:21:58 +00:00
Hans Wennborg 81cfeb1999 SelectionDAGIsel: Fix comment about terminators being "handled below".
That changed in r102128.

llvm-svn: 232692
2015-03-19 00:02:22 +00:00
Quentin Colombet 7bdd50d2a0 [CodeGenPrepare] Remove broken, dead, code.
NFC.

llvm-svn: 232690
2015-03-18 23:17:28 +00:00
Rafael Espindola 69244c3e78 two or more, use a for.
llvm-svn: 232688
2015-03-18 23:15:49 +00:00
Rafael Espindola 242548906d Teach getDefaultFormat that we only support ELF on some architectures.
This should bring the windows bots back.

It is a bit ugly, but it is better than what we had before: The triple would
say that the object format was COFF, but llc/llvm-mc would produce an ELF.

llvm-svn: 232683
2015-03-18 22:19:16 +00:00
Simon Pilgrim 5ec5c9cafe [X86][SSE] Avoid scalarization of v2i64 vector shifts (REAPPLIED)
Fixed broken tests.

Differential Revision: http://reviews.llvm.org/D8416

llvm-svn: 232682
2015-03-18 22:18:51 +00:00
Bill Schmidt 1723525e01 [PowerPC] Correct typo in PPCInstrAltivec.td
llvm-svn: 232681
2015-03-18 22:13:03 +00:00
Eric Christopher 050f590a0c Revert "[X86][SSE] Avoid scalarization of v2i64 vector shifts" as it
appears to have broken tests/bots.

This reverts commit r232660.

llvm-svn: 232670
2015-03-18 21:01:00 +00:00
Eric Christopher 5ac4e120be Revert "Add a TargetMachine local MCRegisterInfo and MCInstrInfo so that"
Committed too early.

This reverts commit r232666.

llvm-svn: 232667
2015-03-18 20:41:44 +00:00
Eric Christopher 4e80e18e78 Add a TargetMachine local MCRegisterInfo and MCInstrInfo so that
they can be used without a subtarget in constructing subtarget
independent passes.

llvm-svn: 232666
2015-03-18 20:37:36 +00:00
Eric Christopher a0de253d27 Revert "Migrate the AArch64 TargetRegisterInfo to its TargetMachine"
as we don't necessarily need to do this yet - though we could move
the base class to the TargetMachine as it isn't subtarget dependent.

This reverts commit r232103.

llvm-svn: 232665
2015-03-18 20:37:30 +00:00
Reid Kleckner 0f9e27a371 Use WinEHPrepare to outline SEH finally blocks
No outlining is necessary for SEH catch blocks. Use the blockaddr of the
handler in place of the usual outlined function.

Reviewers: majnemer, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D8370

llvm-svn: 232664
2015-03-18 20:26:53 +00:00
Simon Pilgrim 5c837edc2a [X86][SSE] Avoid scalarization of v2i64 vector shifts
Currently v2i64 vectors shifts (non-equal shift amounts) are scalarized, costing 4 x extract, 2 x x86-shifts and 2 x insert instructions - and it gets even more awkward on 32-bit targets.

This patch separately shifts the vector by both shift amounts and then shuffles the partial results back together, costing 2 x shuffles and 2 x sse-shifts instructions (+ 2 movs on pre-AVX hardware).

Note - this patch only improves the SHL / LSHR logical shifts as only these are supported in SSE hardware.

Differential Revision: http://reviews.llvm.org/D8416

llvm-svn: 232660
2015-03-18 19:35:31 +00:00
Rafael Espindola 105270f68c Add a default implementation of createObjectStreamer.
This removes duplicated code from backends that don't need to do anything
fancy.

llvm-svn: 232658
2015-03-18 19:08:20 +00:00
Krzysztof Parzyszek 36ccfa5779 [Hexagon] Use pseudo-instructions for true/false predicate values
llvm-svn: 232657
2015-03-18 19:07:53 +00:00
Krzysztof Parzyszek 7a9cd80f54 Revert "[Hexagon] Use pseudo-instructions for true/false predicate values"
This reverts r232650.

Missed a piece of code in the previous commit.

llvm-svn: 232656
2015-03-18 18:50:06 +00:00
Rafael Espindola 38438bae21 Handle X86::reloc_riprel_4byte in 32 bits mode.
We can get there with .code64.

Fixes pr22349.

llvm-svn: 232651
2015-03-18 17:33:40 +00:00
Krzysztof Parzyszek 5d7e8fcd52 [Hexagon] Use pseudo-instructions for true/false predicate values
llvm-svn: 232650
2015-03-18 17:20:51 +00:00
Krzysztof Parzyszek 47ab1f2007 [Hexagon] Intrinsics for circular and bit-reversed loads and stores
llvm-svn: 232645
2015-03-18 16:23:44 +00:00
Krzysztof Parzyszek 78cc36fed7 [Hexagon] Handle ENDLOOP0 in InsertBranch and RemoveBranch
llvm-svn: 232643
2015-03-18 15:56:43 +00:00
Sid Manning 51c3560c48 Add support for .ifnes psuedo-op.
llvm-svn: 232636
2015-03-18 14:20:54 +00:00
John Brawn 0dbcd65442 [ARM] Align stack objects passed to memory intrinsics
Memcpy, and other memory intrinsics, typically tries to use LDM/STM if
the source and target addresses are 4-byte aligned. In CodeGenPrepare
look for calls to memory intrinsics and, if the object is on the
stack, 4-byte align it if it's large enough that we expect that memcpy
would want to use LDM/STM to copy it.

Differential Revision: http://reviews.llvm.org/D7908

llvm-svn: 232627
2015-03-18 12:01:59 +00:00
Yaron Keren 92e1b62d45 Remove many superfluous SmallString::str() calls.
Now that SmallString is a first-class citizen, most SmallString::str()
calls are not required. This patch removes a whole bunch of them, yet
there are lots more.

There are two use cases where str() is really needed:
1) To use one of StringRef member functions which is not available in
SmallString.
2) To convert to std::string, as StringRef implicitly converts while 
SmallString do not. We may wish to change this, but it may introduce
ambiguity.

llvm-svn: 232622
2015-03-18 10:17:07 +00:00
Kai Nacke c44b162c4b [mips] Add itineraries for ext and ins instructions.
Currently, there are no itineraries defined for ext and ins instructions.
This patch adds these itineraries and uses them in the instruction definitions.

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D7209

llvm-svn: 232613
2015-03-18 06:28:38 +00:00
Alexei Starovoitov 6d93de6439 [bpf] fix build
fix BPF backend build broken by r232429

Patch by Brenden Blanco

llvm-svn: 232581
2015-03-18 01:39:40 +00:00
Krzysztof Parzyszek 8c1cab9a27 Generate bit manipulation instructions on Hexagon
llvm-svn: 232577
2015-03-18 00:43:46 +00:00
Sanjoy Das cb8bca1777 [SCEV] Make isImpliedCond smarter.
Summary:
This change teaches isImpliedCond to infer things like "X sgt 0" => "X -
1 sgt -1".  The `ConstantRange` class has the logic to do the heavy
lifting, this change simply gets ScalarEvolution to exploit that when
reasonable.

Depends on D8345

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8346

llvm-svn: 232576
2015-03-18 00:41:29 +00:00
Sanjoy Das 7182d36f66 [ConstantRange] Split makeICmpRegion in two.
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts.  The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true.  The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.

Reviewers: nlewycky

Reviewed By: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8345

llvm-svn: 232575
2015-03-18 00:41:24 +00:00
David Majnemer e48237df95 DAGCombiner: fold (xor (shl 1, x), -1) -> (rotl ~1, x)
Targets which provide a rotate make it possible to replace a sequence of
(XOR (SHL 1, x), -1) with (ROTL ~1, x).  This saves an instruction on
architectures like X86 and POWER(64).

Differential Revision: http://reviews.llvm.org/D8350

llvm-svn: 232572
2015-03-18 00:03:36 +00:00
David Majnemer 7db449a6e7 COFF: Let globals with private linkage reside in their own section
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Differential Revision: http://reviews.llvm.org/D8394

llvm-svn: 232570
2015-03-17 23:54:51 +00:00
Krzysztof Parzyszek d89581c5e7 Remove unneeded selection functions from HexagonISelDAGToDAG
- SelectSelect, and
- SelectTruncate

llvm-svn: 232569
2015-03-17 23:54:48 +00:00
Pirama Arumuga Nainar 12aeefc63b Fix bug while building FP16 constant vectors for AArch64
Summary: Building FP16 constant vectors caused the FP16 data to be bitcast to i64.  This patch creates a BITCAST node with the correct value, and adds a test to verify correct handling.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, jmolloy, ab, srhines, llvm-commits, rengolin, aemerson

Differential Revision: http://reviews.llvm.org/D8369

llvm-svn: 232562
2015-03-17 23:10:29 +00:00
NAKAMURA Takumi c085eca176 Appease AArch64ISelLowering.cpp miscompiled by g++-4.7.2.
I will revert this when 4.7.3 is ready.

llvm-svn: 232561
2015-03-17 22:55:01 +00:00
Simon Pilgrim 257849f11a XformToShuffleWithZero - Added clearer early outs and general tidy up. NFCI
llvm-svn: 232557
2015-03-17 22:19:08 +00:00
Krzysztof Parzyszek ae14e7bf99 Selection DAG preprocessing on Hexagon
Simplify: (or (select c x 0) z)  ->  (select c (or x z) z)
          (or (select c 0 y) z)  ->  (select c z (or y z))
llvm-svn: 232553
2015-03-17 21:47:16 +00:00
Duncan P. N. Exon Smith 719994b907 DebugInfo: Drop fake DW_TAG_expression
Break MDExpression off of DebugNode (inherit directly from `MDNode`) and
drop the fake `DW_TAG_expression` tag in the process.

AFAICT, there's no real functionality change here.  The tag was
originally used by `DIDescriptor::isExpression()` to discriminate
between `MDNode`s, but in the new hierarchy we don't need that.

Fixes PR22780.

llvm-svn: 232550
2015-03-17 21:32:46 +00:00
Rafael Espindola 7fce7e62db Emit the offset directly instead of creating a dummy expression.
We were creating an expression of the form (S+C)-S which is just C.

Patch by Frédéric Riss. I just added the testcase.

llvm-svn: 232549
2015-03-17 21:30:21 +00:00
David Majnemer 63b1d99943 Revert "COFF: Let globals with private linkage reside in their own section"
This reverts commit r232539.  This was committed accidently.

llvm-svn: 232543
2015-03-17 20:41:11 +00:00
Benjamin Kramer cced8bee52 Internalize BitcodeReader. Not used outside of BitcodeReader.cpp.
NFC.

llvm-svn: 232542
2015-03-17 20:40:24 +00:00
David Majnemer 21fecf9441 Revert "Address review comments"
This reverts commit r232540.  This was committed accidently.

llvm-svn: 232541
2015-03-17 20:40:21 +00:00
David Majnemer 564404cc96 Address review comments
llvm-svn: 232540
2015-03-17 20:39:40 +00:00
David Majnemer 47e3842982 COFF: Let globals with private linkage reside in their own section
Summary:
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8374

llvm-svn: 232539
2015-03-17 20:39:25 +00:00
Michael Zolotukhin 9ef5671d36 Try to fix a test broken by one of my previous commits.
llvm-svn: 232536
2015-03-17 20:31:56 +00:00
Rafael Espindola 9ab09237dc Centralize the handling of unique ids for temporary labels.
Before this patch code wanting to create temporary labels for a given entity
(function, cu, exception range, etc) had to keep its own counter to have stable
symbol names.

createTempSymbol would still add a suffix to make sure a new symbol was always
returned, but it kept a single counter. Because of that, if we were to use
just createTempSymbol("cu_begin"), the label could change from cu_begin42 to
cu_begin43 because some other code started using temporary labels.

Simplify this by just keeping one counter per prefix and removing the various
specialized counters.

llvm-svn: 232535
2015-03-17 20:07:06 +00:00
Benjamin Kramer ba05a784ff Internalize llvm::AssemblyWriter. It's not used outside of AsmWriter.cpp.
This is an artifact of an implementation detail of DebugIR that has been
long refactored away. NFC.

llvm-svn: 232532
2015-03-17 19:53:41 +00:00
Michael Zolotukhin 6d8a2aa976 TLI: Add addVectorizableFunctionsFromVecLib.
Also, add several entries to vectorizable functions table, and
corresponding tests. The table isn't complete, it'll be populated later.

Review: http://reviews.llvm.org/D8131
llvm-svn: 232531
2015-03-17 19:50:55 +00:00
Michael Zolotukhin 9b3cf604ce LoopVectorize: teach loop vectorizer to vectorize calls.
The tests would be committed in a commit for http://reviews.llvm.org/D8131

Review: http://reviews.llvm.org/D8095
llvm-svn: 232530
2015-03-17 19:46:50 +00:00
Samuel Antao 0d59f31d12 Add assertion to detect invalid registers in the PowerPC MC instruction lowering.
We have observed that noreg was being generated due to a bug in FastIsel and was not being detected during emission. It happens that in the Asm emission there is an assertion that detects this in getRegisterName() from the tbl-generated file PPCGenAsmWriter.inc. However, when emitting an Obj file, invalid registers can be emitted given that no check are made in getBinaryCodeFromInstr() from PPCGenMCCodeEmitter.inc. In order to cover all cases this adds an assertion for reg operands in LowerPPCMachineInstrToMCInst.

llvm-svn: 232525
2015-03-17 19:31:19 +00:00
Michael Zolotukhin 7ed84a8151 TTI: Add getCallInstrCost.
Review: http://reviews.llvm.org/D8094
llvm-svn: 232524
2015-03-17 19:26:23 +00:00
Michael Zolotukhin e8f2551f67 TLI: Add interface for querying whether a function is vectorizable.
Review: http://reviews.llvm.org/D8093
llvm-svn: 232523
2015-03-17 19:22:30 +00:00
Michael Zolotukhin 1d4e52512c LoopVectorizer: Add TargetTransformInfo.
Review: http://reviews.llvm.org/D8092
llvm-svn: 232522
2015-03-17 19:17:18 +00:00
Kostya Serebryany b1870a64cf [asan] remove redundant ifndefs. NFC
llvm-svn: 232521
2015-03-17 19:13:23 +00:00
Yaron Keren d7c546c935 Remove LookupSymbol(StringRef) and optimize LookupSymbol(Twine).
Same as MakeArgString in r232465, keep only LookupSymbol(Twine)
while making sure it handles the StringRef like cases efficiently
using twine::toStringRef.

llvm-svn: 232517
2015-03-17 18:55:30 +00:00
Richard Barton 30934c0926 [ARM] Fix offset calculation in ARMBaseRegisterInfo::needsFrameBaseReg
The input offset to needsFrameBaseReg is a negative value below the top of the
stack frame, but when converting to a positive offset from the bottom of the
stack frame this value was negated, causing the final offset to be too large
by twice the input offset's magnitude. Fix that by not negating the offset.

Patch by John Brawn

Differential Revision: http://reviews.llvm.org/D8316

llvm-svn: 232513
2015-03-17 18:20:47 +00:00
Michael Liao 24fcae8fa0 [SwitchLowering] Remove incoming values in the reverse order
- To prevent invalidating *successive* indices.
 

llvm-svn: 232510
2015-03-17 18:03:10 +00:00
David Blaikie c4dfa63928 Fix GCC -Wparentheses warning (& reformat now that the precedence is fixed)
Benign warning (clang deliberately suppresses this case) but does
regularly produce bad formatting, so it's nice to fix/reformat.

llvm-svn: 232508
2015-03-17 17:48:24 +00:00
Duncan P. N. Exon Smith 0633fd7fb4 Verifier: Set --verify-debug-info=true by default
r186634 started verifying debug info, and r194986 disabled it by default
because it was too expensive to run the checks on every function (since
most of the graph was reachable from each function).

r206300 moved the checks to module-level to make it cheaper, but there
was already quite a bit of testcase bitrot (and the verifier would only
print `<badref>`) so I guess no one had time to turn it back on.

This does just that.  Upgrade scripts this past autumn and winter
probably fixed some of the bitrot, and this weekend I fixed the verifier
output (r232275, r232417, r232418) and thusly the remaining failing
testcases (r232290, r232415).

This is part of PR22777.

llvm-svn: 232505
2015-03-17 17:28:41 +00:00
Dmitry Vyukov 618d580ec9 asan: optimization experiments
The experiments can be used to evaluate potential optimizations that remove
instrumentation (assess false negatives). Instead of completely removing
some instrumentation, you set Exp to a non-zero value (mask of optimization
experiments that want to remove instrumentation of this instruction).
If Exp is non-zero, this pass will emit special calls into runtime
(e.g. __asan_report_exp_load1 instead of __asan_report_load1). These calls
make runtime terminate the program in a special way (with a different
exit status). Then you run the new compiler on a buggy corpus, collect
the special terminations (ideally, you don't see them at all -- no false
negatives) and make the decision on the optimization.

The exact reaction to experiments in runtime is not implemented in this patch.
It will be defined and implemented in a subsequent patch.

http://reviews.llvm.org/D8198

llvm-svn: 232502
2015-03-17 16:59:19 +00:00
Reid Kleckner 0b16859805 Use an underlying enum type of unsigned to silence a -Wmicrosoft warning about being unable to put (unsigned)-1 into the default underyling type of int
llvm-svn: 232498
2015-03-17 16:50:20 +00:00
Daniel Sanders 2eeace219d [systemz] Distinguish the 'Q', 'R', 'S', and 'T' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

No functional change intended.

Reviewers: uweigand

Reviewed By: uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8251

llvm-svn: 232495
2015-03-17 16:16:14 +00:00
Rafael Espindola a45fed82e8 Remove the error prone GetTempSymbol API.
llvm-svn: 232487
2015-03-17 15:02:17 +00:00
Samuel Antao f68156015d Fix R0 use in PowerPC VSX store for FastIsel.
The VSX stores are sometimes generated with a undefined index register, causing %noreg to be used and R0 to be emitted later on. The semantics of the VSX store (e.g. stdsdx) requires R0 to be used as base if we want zero to be used in the computation of the effective address instead of the content of R0. This patch checks if no index register was generated and forces R0 to be used as base address.

llvm-svn: 232486
2015-03-17 15:00:57 +00:00
Rafael Espindola c0eb4de58d Convert the last 4 users of GetTempSymbol to createTempSymbol.
Despite using the same name these are unrelated.

llvm-svn: 232485
2015-03-17 14:58:47 +00:00
Rafael Espindola ccfbbd5596 Use createTempSymbol to avoid collisions instead of an ad hoc method.
llvm-svn: 232483
2015-03-17 14:50:32 +00:00
Rafael Espindola 8dc4e1007a Make EmitFunctionHeader a private helper.
llvm-svn: 232481
2015-03-17 14:38:30 +00:00
Daniel Sanders 49f643c472 Re-commit: [hexagon] Distinguish the 'o', 'v', and 'm' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

No functional change intended.

Reviewers: kparzysz, adasgupt

Reviewed By: kparzysz, adasgupt

Subscribers: colinl, llvm-commits

Differential Revision: http://reviews.llvm.org/D8204

Like for the PowerPC target, I've had to add 'i' to the constraint mappings in
order to pass 2007-12-17-InvokeAsm.ll. It's not clear why 'i' has historically
been treated as a memory constraint.

llvm-svn: 232480
2015-03-17 14:37:39 +00:00
Rafael Espindola ec8da3de01 Call EmitFunctionHeader just before EmitFunctionBody.
This avoids switching to .AMDGPU.config and back and hardcoding the
section it switches back to.

llvm-svn: 232479
2015-03-17 14:34:42 +00:00
Rafael Espindola 5345e420c4 Convert the easy cases of GetTempSymbol to createTempSymbol.
In these cases no code was depending on GetTempSymbol finding an existing
symbol.

llvm-svn: 232478
2015-03-17 14:22:31 +00:00