113867 Commits

Author SHA1 Message Date
Zachary Turner 7b0871cc74 Revert "[llvm-pdbdump] Add some tests for llvm-pdbdump."
It is not correctly detecting the situations where the test is
unsupported.  Reverting until we can figure it out.

llvm-svn: 230085
2015-02-20 23:21:21 +00:00
Philip Reames 1f017547bb [RewriteStatepointsForGC] Use DenseSet in place of std::set [NFC]
This should be the last cleanup on non-llvm preferred data structures.  I left one use of std::set in an assertion; DenseSet didn't seem to have a tombstone for CallSite defined.  That might be worth fixing, but wasn't worth it for a debug only use.

llvm-svn: 230084
2015-02-20 23:16:52 +00:00
Zachary Turner d1b1136e0e [llvm-pdbdump] Add some tests for llvm-pdbdump.
This adds only a very basic set of tests that dump a few
functions and object files.

Differential Revision: http://reviews.llvm.org/D7656
Reviewed By: David Blaikie

llvm-svn: 230083
2015-02-20 23:05:57 +00:00
Philip Reames e9c3b9bd46 [RewriteStatepointsForGC] Replace std::map with DenseMap
I'd done the work of extracting the typedef in a previous commit, but didn't actually change it.  Hopefully this will make any subtle changes easier to isolate.

llvm-svn: 230081
2015-02-20 22:48:20 +00:00
Philip Reames d2b664642f [RewriteStatepointsForGC] Cleanup - replace std::vector usage [NFC]
Migrate std::vector usage to a combination of SmallVector and ArrayRef.

llvm-svn: 230079
2015-02-20 22:39:41 +00:00
Eric Christopher d4e723f2cf Used the cached subtarget off of the MachineFunction.
llvm-svn: 230078
2015-02-20 22:36:11 +00:00
Reid Kleckner 8142a08ce7 X86: Remove pre-2010 dead code in mergeSPUpdatesDown
llvm-svn: 230075
2015-02-20 22:13:25 +00:00
Simon Pilgrim b7875837c7 LowerScalarImmediateShift - Merged v16i8 and v32i8 shift lowering. NFC.
llvm-svn: 230074
2015-02-20 22:13:03 +00:00
Matt Arsenault 20711b7bae R600/SI: Remove v_sub_f64 pseudo
The expansion code does the same thing. Since
the operands were not defined with the correct
types, this has the side effect of fixing operand
folding since the expanded pseudo would never use
SGPRs or inline immediates.

llvm-svn: 230072
2015-02-20 22:10:45 +00:00
Matt Arsenault 8d6300346f R600: Use new fmad node.
This enables a few useful combines that used to only
use fma.

Also since v_mad_f32 apparently does not support denormals,
disable the existing cases that are custom handled if they are
requested.

llvm-svn: 230071
2015-02-20 22:10:41 +00:00
Matt Arsenault 0dc54c4dee Add generic fmad DAG node.
This allows sharing of FMA forming combines to work
with instructions that have the same semantics as a separate
multiply and add.

This is expand by default, and only formed post legalization
so it shouldn't have much impact on targets that do not want it.

llvm-svn: 230070
2015-02-20 22:10:33 +00:00
Philip Reames 860660ea5e [RewriteStatepointsForGC] More style cleanup [NFC]
Use llvm_unreachable where appropriate, use SmallVector where easy to do so, introduce typedefs for planned type migrations.

llvm-svn: 230068
2015-02-20 22:05:18 +00:00
Philip Reames 0a3240f4de [RewriteStatepointsForGC] Remove notion of SafepointBounds [NFC]
The notion of a range of inserted safepoint related code is no longer really applicable.  This survived over from an earlier implementation.  Just saving the inserted gc.statepoint and working from that is far clearer given the current code structure.  Particularly when invokable statepoints get involved.

llvm-svn: 230063
2015-02-20 21:34:11 +00:00
Chris Bieneman e88a396f86 Raising minimum required CMake version to 2.8.12.2.
llvm-svn: 230062
2015-02-20 21:28:18 +00:00
Eric Christopher 9ecaa174d6 Grab the DataLayout off of the TargetMachine since that's where
it's stored.

llvm-svn: 230059
2015-02-20 20:56:39 +00:00
Benjamin Kramer 911d5b3ace LoopRotate: When reconstructing loop simplify form don't split edges from indirectbrs.
Yet another chapter in the endless story. While this looks like we leave
the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.

PR21968

llvm-svn: 230058
2015-02-20 20:49:25 +00:00
Duncan P. N. Exon Smith a5c57ccf2d IR: Change MDFile to directly store the filename/directory
In the old (well, current) schema, there are two types of file
references: untagged and tagged (the latter references the former).

    !0 = !{!"filename", !"/directory"}
    !1 = !{!"0x29", !0} ; DW_TAG_file_type [filename] [/directory]

The interface to `DIBuilder` universally takes the tagged version,
described by `DIFile`.  However, most `file:` references actually use
the untagged version directly.

In the new hierarchy, I'm merging this into a single node: `MDFile`.

Originally I'd planned to keep the old schema unchanged until after I
moved the new hierarchy into place.

However, it turns out to be trivial to make `MDFile` match both nodes at
the same time.

  - Anyone referencing !1 does so through `DIFile`, whose implementation
    I need to gut anyway (as I do the rest of the `DIDescriptor`s).
  - Anyone referencing !0 just references an `MDNode`, and expects a
    node with two `MDString` operands.

This commit achieves that, and updates all the testcases for the parts
of the new hierarchy that used the two-node schema (I've replaced the
untagged nodes with `distinct !{}` to make the diff clear (otherwise the
metadata all gets renumbered); it might be worthwhile to come back and
delete those nodes and renumber the world, not sure).

llvm-svn: 230057
2015-02-20 20:35:17 +00:00
Peter Collingbourne e6909c8e8b Introduce bitset metadata format and bitset lowering pass.
This patch introduces a new mechanism that allows IR modules to co-operatively
build pointer sets corresponding to addresses within a given set of
globals. One particular use case for this is to allow a C++ program to
efficiently verify (at each call site) that a vtable pointer is in the set
of valid vtable pointers for the class or its derived classes. One way of
doing this is for a toolchain component to build, for each class, a bit set
that maps to the memory region allocated for the vtables, such that each 1
bit in the bit set maps to a valid vtable for that class, and lay out the
vtables next to each other, to minimize the total size of the bit sets.

The patch introduces a metadata format for representing pointer sets, an
'@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals
and builds the bitsets, and documents the new feature.

Differential Revision: http://reviews.llvm.org/D7288

llvm-svn: 230054
2015-02-20 20:30:47 +00:00
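The membership check that r230054 describes can be pictured with a small model. This is only a sketch under stated assumptions: `PointerBitSet`, its fields, and `contains` are hypothetical names, and the code is not the lowering that the pass or the `@llvm.bitset.test` intrinsic actually produces.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified, hypothetical model of a pointer set over one memory region.
// Bit i is set when (Base + i * Align) is a valid member address.
struct PointerBitSet {
  uint64_t Base;          // start address of the region
  uint64_t Align;         // spacing between candidate addresses
  std::vector<bool> Bits; // one bit per candidate address

  // Returns true only for addresses that are in range, correctly
  // aligned, and marked valid in the bit set.
  bool contains(uint64_t Addr) const {
    assert(Align != 0 && "alignment must be nonzero");
    if (Addr < Base)
      return false;
    uint64_t Offset = Addr - Base;
    if (Offset % Align != 0)
      return false;
    uint64_t Index = Offset / Align;
    return Index < Bits.size() && Bits[Index];
  }
};
```

Laying the globals out next to each other, as the commit message notes, keeps `Bits` short because the region being covered stays small.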
Jozef Kolek 0365675522 Reverted revision 229706. The reason is a regression caused by
CodeGen's use of the ADDU16 instruction. An improper register is allocated
for this instruction, i.e. a register that is not in the register set defined
for the instruction.

llvm-svn: 230053
2015-02-20 20:26:52 +00:00
David Majnemer ab457815f3 Verifier: Unused comdats might not have a corresponding GV
This fixes PR22646.

llvm-svn: 230051
2015-02-20 19:58:48 +00:00
Eric Christopher db51e2a52a Fix an asan use-after-free bug introduced by the asm printer
changes to remove non-Function based subtargets out of the asm
printer. For module level emission we'll need to construct up
an MCSubtargetInfo so that we can encode instructions for
emission.

llvm-svn: 230050
2015-02-20 19:54:07 +00:00
Philip Reames fa2fcf173b [GC, RewriteStatepointsForGC] Style cleanup and bug fix
When doing style cleanup, I noticed a minor bug in this code.  If we have a pointer that we think is unused after a statepoint and thus doesn't need relocation, we store a null pointer into the alloca we're about to promote.  This helps turn a mistake in liveness analysis into an easily debuggable crash.  It turned out this code had never been updated to handle invoke statepoints.  

There's no test for this.  Without a bug in liveness, it appears impossible to make this trigger in a way which is visible in the resulting IR.  We might store the null, but when promoting the alloca, there will be no uses and thus nothing to test against.  Suggestions on how to test are very welcome.

llvm-svn: 230047
2015-02-20 19:51:56 +00:00
Reid Kleckner a070ee5ef5 Use unreachable instead of assert(false) to silence MSVC warning
llvm-svn: 230045
2015-02-20 19:46:02 +00:00
Andrea Di Biagio 7035178aeb [X86][FastIsel] Teach how to select float-half conversion intrinsics.
This patch teaches X86FastISel how to select intrinsic 'convert_from_fp16' and
intrinsic 'convert_to_fp16'.
If the target has F16C, we can select VCVTPS2PHrr for a float-half conversion,
and VCVTPH2PSrr for a half-float conversion.

Differential Revision: http://reviews.llvm.org/D7673

llvm-svn: 230043
2015-02-20 19:37:14 +00:00
Philip Reames f20413245a [GC] Style cleanup for RewriteStatepointForGC (1 of many) [NFC]
Starting to update variable naming and types to match LLVM style.  This will be an incremental process to minimize the chance of breakage as I work.  Step one, rename member variables to LLVM CamelCase and use llvm's ADT.  Much more to come.

llvm-svn: 230042
2015-02-20 19:26:04 +00:00
Chris Bieneman 9b7f832935 Setting up CMake to default to Debug when no build type is specified.
Summary: Turns out if you don't set CMAKE_BUILD_TYPE the default is an empty string. This results in some of the behaviors of debug builds, but not all of them. For example ENABLE_ASSERTIONS is false.

Reviewers: rnk

Reviewed By: rnk

Subscribers: chapuni, llvm-commits

Differential Revision: http://reviews.llvm.org/D7360

llvm-svn: 230041
2015-02-20 19:02:59 +00:00
Philip Reames 2ef029c7ae Bugfix for 229954
Before calling Function::getGC to test for enablement, we need to make sure there's actually a GC at all via Function::hasGC.  Otherwise, we'd crash on functions without a GC.  Thankfully, this only mattered if you manually scheduled the pass, but still, oops. :(

llvm-svn: 230040
2015-02-20 18:56:14 +00:00
Eric Christopher 9cda4b7ed9 Remove a use of the Subtarget in the darwin ppc asm printer.
EmitFunctionStubs is called from doFinalization and so can't
depend on the Subtarget existing. It's also irrelevant as
we know we're darwin since we're in the darwin asm printer.

llvm-svn: 230039
2015-02-20 18:53:42 +00:00
Eric Christopher f734a8bae7 Get the function specific subtarget.
llvm-svn: 230038
2015-02-20 18:44:17 +00:00
Eric Christopher 1df0c519fc Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 230037
2015-02-20 18:44:15 +00:00
Sanjay Patel 1b53f74c4c canonicalize a v2f64 blendi of 2 registers
This canonicalization step saves us 3 pattern matching possibilities * 4 math ops
for scalar FP math that uses xmm regs. The backend can re-commute the operands
post-instruction-selection if that makes register allocation better.

The tests in llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll cover this scenario already,
so there are no new tests with this patch.

Differential Revision: http://reviews.llvm.org/D7777

llvm-svn: 230024
2015-02-20 16:55:27 +00:00
Benjamin Kramer d50f33d996 Put MSVC back into the dumb compiler's corner.
It fails to compile std::is_trivially_copyable for forward-declared enums.

llvm-svn: 230023
2015-02-20 16:35:42 +00:00
Benjamin Kramer f4b269bdf0 Base isPodLike on is_trivially_copyable for GCC 5 and MSVC
It would be nice to get rid of the version checks here, but that will
have to wait until libstdc++ is upgraded to 5.0 everywhere ...

llvm-svn: 230021
2015-02-20 16:19:28 +00:00
Kit Barton 263edb99ab I incorrectly marked the VORC instruction as isCommutable when I added it.
This fix removes the VORC instruction definition from the isCommutable block.

Phabricator review: http://reviews.llvm.org/D7772

llvm-svn: 230020
2015-02-20 15:54:58 +00:00
Igor Laevsky 7fc58a4ad8 Generalize statepoint lowering to use ImmutableStatepoint. Move statepoint lowering into a separate function 'LowerStatepoint' which uses ImmutableStatepoint instead of a CallInst. Also related utility functions are changed to receive ImmutableCallSite.
Differential Revision: http://reviews.llvm.org/D7756 

llvm-svn: 230017
2015-02-20 15:28:35 +00:00
Benjamin Kramer 7af984b710 Constants.cpp: Only read 32 bits for float.
Otherwise we'll discard the wrong half of a uint64_t on big-endian systems.

llvm-svn: 230016
2015-02-20 15:11:55 +00:00
NAKAMURA Takumi e86b9b76c5 Constants.cpp: getElementAsAPFloat(): Don't handle constant value via host's float/double, just handle with APInt/APFloat.
The x87 FPU doesn't preserve SNaN; it demotes it to QNaN.

llvm-svn: 230013
2015-02-20 14:24:49 +00:00
Benjamin Kramer 6f66545ae6 RewriteStatepointsForGC: Move details into anonymous namespaces. NFC.
While there reduce the number of duplicated std::map lookups.

llvm-svn: 230012
2015-02-20 14:00:58 +00:00
Benjamin Kramer ca1ba4b280 Make the static instance of None just const.
This way there shouldn't be any unused variable warnings.

llvm-svn: 230010
2015-02-20 13:16:05 +00:00
Benjamin Kramer d4a3a55564 Wrap recursive function only used in assert in #ifndef NDEBUG.
Avoids unused function warnings in Release builds.

llvm-svn: 230009
2015-02-20 13:15:49 +00:00
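The pattern behind r230009 looks roughly like the following. It is a minimal sketch, not the code touched by that commit; `isSortedFrom` and `smallest` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifndef NDEBUG
// Only referenced from the assert below, so it is compiled only when
// asserts are enabled. Release builds never see an unused function.
static bool isSortedFrom(const std::vector<int> &V, std::size_t I) {
  if (I + 1 >= V.size())
    return true;
  return V[I] <= V[I + 1] && isSortedFrom(V, I + 1);
}
#endif

// Returns the smallest element of a non-empty vector sorted ascending.
int smallest(const std::vector<int> &V) {
  assert(!V.empty() && isSortedFrom(V, 0) && "expected sorted input");
  return V.front();
}
```

When `NDEBUG` is defined the `assert` expands to nothing, so the call to `isSortedFrom` disappears together with its definition.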
Chandler Carruth 0c5f059865 [x86] Switching the shuffle equivalence test to a variadic template was
the wrong answer. We also got initializer lists which are *way* cleaner
for this kind of thing. Let's use those and make this a normal, boring
function accepting ArrayRef.

llvm-svn: 230004
2015-02-20 10:47:28 +00:00
Eric Christopher 0218f8cfed Fix wording and grammar in Mips subtarget options.
llvm-svn: 230001
2015-02-20 08:42:34 +00:00
Eric Christopher 3ee30d0607 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 230000
2015-02-20 08:39:06 +00:00
Eric Christopher 22b2ad265f Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 229999
2015-02-20 08:24:37 +00:00
Eric Christopher 155290edf9 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

llvm-svn: 229998
2015-02-20 08:24:34 +00:00
Eric Christopher ad1ef04ab1 Save the MachineFunction in startFunction so that we can use it for
lookups of the subtarget later.

llvm-svn: 229996
2015-02-20 08:01:55 +00:00
Eric Christopher 4369c9b42c Use the cached subtarget from the MachineFunction rather than
doing a lookup on the TargetMachine.

llvm-svn: 229995
2015-02-20 08:01:52 +00:00
Eric Christopher 1947a9e2e2 Make the TargetMachine::getSubtarget that takes a Function argument
take a reference to match the getSubtargetImpl that takes a Function
argument.

llvm-svn: 229994
2015-02-20 07:32:59 +00:00
Justin Bogner c4f5a5e863 Disallow implicit conversions from None to integer types
This fixes an error introduced in r228934 where None was converted to
an int instead of the int being converted to an Optional as intended.
We make that sort of mistake a compile error by changing NoneType into
a scoped enum.

Finally, provide a static NoneType called None to avoid forcing all
users to spell it NoneType::None.

llvm-svn: 229980
2015-02-20 07:28:28 +00:00
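The kind of mistake r229980 guards against can be reproduced with stand-in types. The names `LegacyNoneType`, `LegacyNone`, `widthOrDefault`, and `legacyCall` are hypothetical, and the `NoneType` below is a simplified illustration rather than LLVM's definition.

```cpp
#include <cassert>
#include <type_traits>

// An unscoped enum converts implicitly to int, so passing it where an
// integer is expected compiles silently.
enum LegacyNoneType { LegacyNone };

// A scoped enum has no implicit conversion to integer types.
enum class NoneType { None };

// A constant so callers need not spell NoneType::None.
const NoneType None = NoneType::None;

static_assert(std::is_convertible<LegacyNoneType, int>::value,
              "unscoped enum converts to int");
static_assert(!std::is_convertible<NoneType, int>::value,
              "scoped enum does not convert to int");

// Hypothetical API that takes a bit width, with 0 meaning "use 32".
int widthOrDefault(int Width) {
  assert(Width >= 0 && "width must be non-negative");
  return Width == 0 ? 32 : Width;
}

// Compiles with the unscoped enum: LegacyNone silently becomes 0.
int legacyCall() { return widthOrDefault(LegacyNone); }

// widthOrDefault(None) would not compile with the scoped enum.
```

With the scoped enum the accidental call becomes a compile error instead of a silent conversion to zero.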
Nick Lewycky b73c041005 Fix build with gcc. This has a -Wsequence-point error on 'MII', which is a good point.
llvm-svn: 229979
2015-02-20 07:17:40 +00:00
Eric Christopher a7249ec1a7 Remove more uses of TargetMachine::getSubtargetImpl from the
AsmPrinter.

getSubtargetInfo now asserts that the MachineFunction exists.
Debug printing of register naming now uses the register info
from MCAsmInfo as that's unchanging.

llvm-svn: 229978
2015-02-20 07:16:19 +00:00
Nick Lewycky a2bda08806 Fix build in release mode, -Wunused-variable on this lambda function used only in an assert.
llvm-svn: 229977
2015-02-20 07:16:17 +00:00
Nick Lewycky eb3231eefa Fix build in release mode, four cases of -Wunused-variable.
llvm-svn: 229976
2015-02-20 07:14:02 +00:00
Eric Christopher 78a3f6cc4d AsmPrinter::doFinalization is at the module level and so doesn't
have access to a target specific subtarget info. Grab the module
level MCSubtargetInfo for the JumpInstrTable output stubs.

llvm-svn: 229974
2015-02-20 06:59:48 +00:00
Lang Hames 02be3f36a3 [Orc] Add a new JITSymbol constructor to build a symbol from an existing address.
This constructor is more efficient for symbols that have already been emitted,
since it avoids the construction/execution of a std::function.

Update the ObjectLinkingLayer to use this new constructor where possible.

llvm-svn: 229973
2015-02-20 06:48:29 +00:00
Eric Christopher 97ea7622b5 Remove the MCInstrInfo cached variable as it was only used in a
single place and replace calls to getSubtargetImpl with calls
to get the subtarget from the MachineFunction where valid.

llvm-svn: 229971
2015-02-20 06:35:21 +00:00
David Blaikie 9a3644c472 Fix -Wunused-variable warning in non-asserts build, and optimize a little bit while I'm here.
llvm-svn: 229970
2015-02-20 06:28:38 +00:00
Hal Finkel e5aaf3f2cd [PowerPC] Loop Data Prefetching for the BG/Q
The IBM BG/Q supercomputer's A2 cores have a hardware prefetching unit, the
L1P, but it does not prefetch directly into the A2's L1 cache. Instead, it
prefetches into its own L1P buffer, and the latency to access that buffer is
significantly higher than that to the L1 cache (although smaller than the
latency to the L2 cache). As a result, especially when multiple hardware
threads are not actively busy, explicitly prefetching data into the L1 cache is
advantageous.

I've been using this pass out-of-tree for data prefetching on the BG/Q for well
over a year, and it has worked quite well. It is enabled by default only for
the BG/Q, but can be enabled for other cores as well via a command-line option.

Eventually, we might want to add some TTI interfaces and move this into
Transforms/Scalar (there is nothing particularly target dependent about it,
although only machines like the BG/Q will benefit from its simplistic
strategy).

llvm-svn: 229966
2015-02-20 05:08:21 +00:00
Chandler Carruth 4041f2217b [x86] Remove the old vector shuffle lowering code and its flag.
The new shuffle lowering has been the default for some time. I've
enabled the new legality testing by default with no really blocking
regressions. I've fuzz tested this very heavily (many millions of fuzz
test cases have passed at this point). And this cleans up a ton of code.
=]

Thanks again to the many folks that helped with this transition. There
was a lot of work by others that went into the new shuffle lowering to
make it really excellent.

In case you aren't using a diff algorithm that can handle this:
  X86ISelLowering.cpp: 22 insertions(+), 2940 deletions(-)

llvm-svn: 229964
2015-02-20 04:25:04 +00:00
Chandler Carruth eb206aa1ea [x86] Now that the new vector shuffle legality is enabled and everything
is going well, remove the flag and the code for the old legality tests.

This is the first step toward removing the entire old vector shuffle
lowering. *Much* more code to delete coming up next.

llvm-svn: 229963
2015-02-20 03:59:35 +00:00
Duncan P. N. Exon Smith ad6eb127c9 Bitcode: Stop assuming non-null fields
When writing the bitcode serialization for the new debug info hierarchy,
I assumed two fields would never be null.

Drop that assumption, since it's brittle (and crashes the
`BitcodeWriter` if wrong), and is a check better left for the verifier
anyway.  (No need for a bitcode upgrade here, since the new hierarchy is
still not in place.)

The fields in question are `MDCompileUnit::getFile()` and
`MDDerivedType::getBaseType()`, the latter of which isn't null in
test/Transforms/Mem2Reg/ConvertDebugInfo2.ll (see !14, a pointer to
nothing).  While the testcase might have bitrotted, there's no reason
for the bitcode format to rely on non-null for metadata operands.

This also fixes a bug in `AsmWriter` where if the `file:` is null it
isn't emitted (caught by the double-round trip in the testcase I'm
adding) -- this is a required field in `LLParser`.

I'll circle back to ConvertDebugInfo2.  Once the specialized nodes are
in place, I'll be trying to turn the debug info verifier back on by
default (in the newer module pass form committed r206300) and throwing
more logic in there.  If the testcase has bitrotted (as opposed to me
not understanding the schema correctly) I'll fix it then.

llvm-svn: 229960
2015-02-20 03:17:58 +00:00
Hal Finkel 847e05f569 [InstCombine] Remove unnecessary variable indexing into single-element arrays
This change addresses a deficiency pointed out in PR22629. To copy from the bug
report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as
the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has
some variable as its first non-zero index, and determine if that index must be
zero (or else an out-of-bounds access would occur). We can do this for allocas
and globals with constant initializers where we know the maximum size of the
underlying object. When we find such a GEP, we create a new one for the memory
access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is
still useful because it removes unnecessary indexing calculations.

llvm-svn: 229959
2015-02-20 03:05:53 +00:00
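At the source level, the rewrite described in r229959 amounts to the following. This is an illustrative sketch only; the pass works on GEPs in LLVM IR, and `fWithConstantIndex` is a hypothetical name.

```cpp
#include <cassert>

// The function from the bug report: a one-element array indexed by x.
int f(int x) {
  int a[] = {12};
  return a[x];
}

// For any in-bounds call x must be 0, so the access can use a constant
// index. This mirrors replacing the variable GEP index with zero.
int fWithConstantIndex(int x) {
  assert(x == 0 && "any other index is out of bounds");
  (void)x; // keeps the parameter used when asserts are compiled out
  int a[] = {12};
  return a[0];
}
```

Once the index is a constant, later passes can fold the load to the initializer value, which is the `movl $12, %eax` that GCC emits.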
Chandler Carruth d2b14b296c [x86] Make the new vector shuffle legality test on by default, which
reflects the fact that the x86 backend can in fact lower any shuffle you
want it to with reasonably high code quality.

My recent work on the new vector shuffle has made this regress *very*
little. The diff in the test cases makes me very, very happy.

llvm-svn: 229958
2015-02-20 03:05:47 +00:00
Kostya Serebryany 2e3622bddd [fuzzer] one more experimental search mode: -use_coverage_pairs=1
llvm-svn: 229957
2015-02-20 03:02:37 +00:00
Justin Bogner 3e18de2dbb utils: Teach lldbDataFormatters about llvm::Optional
llvm-svn: 229956
2015-02-20 02:55:22 +00:00
Chandler Carruth 6677809820 [x86] Clean up a couple of test cases with the new update script. Split
one test case that is only partially tested in 32-bits into two test
cases so that the script doesn't generate massive spews of tests for the
cases we don't care about.

llvm-svn: 229955
2015-02-20 02:44:13 +00:00
Philip Reames 6faacf4772 Adjust enablement of RewriteStatepointsForGC
When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC.  We want the rewriting to be off by default (even when manually added to the pass order), not on by default.  To keep the current tests working, mark them as using the statepoint-example GC and whitelist that GC.

Longer term, we need a better selection mechanism here for both actual usage and testing.  As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. 

llvm-svn: 229954
2015-02-20 02:34:49 +00:00
Duncan P. N. Exon Smith f86505abdf IR: Extract macros from DILocation, NFC
`DILocation` is a lightweight wrapper.  Its accessors check for null and
the correct type, and then forward to `MDLocation`.

Extract a couple of macros to do the `dyn_cast_or_null<>` and default
return logic.  I'll be using these to minimize error-prone boilerplate
when I move the new hierarchy into place -- since all the other
subclasses of `DIDescriptor` will similarly become lightweight wrappers.

(Note that I hope to obsolete these wrappers fairly quickly, with the
goal of renaming the underlying types (e.g., I'll rename `MDLocation` to
`DILocation` once the name is free).)

llvm-svn: 229953
2015-02-20 02:28:49 +00:00
Chandler Carruth 301ed0c3b4 Revert r229944: EH: Prune unreachable resume instructions during Dwarf EH preparation
This doesn't pass 'ninja check-llvm' for me. Lots of tests, including
the ones updated, fail with crashes and other explosions.

llvm-svn: 229952
2015-02-20 02:15:36 +00:00
Kostya Serebryany 086a919cae [sanitizer] fix a test broken by r229940
llvm-svn: 229951
2015-02-20 02:12:25 +00:00
Lang Hames c75a932df3 [Orc][Kaleidoscope] Fix the orc/kaleidoscope tutorials on linux.
llvm-svn: 229949
2015-02-20 02:03:30 +00:00
Duncan P. N. Exon Smith 9c73c4aff2 IR: Add getRaw() helper, NFC
llvm-svn: 229947
2015-02-20 01:18:47 +00:00
Philip Reames d16a9b1fdc Add a pass for constructing gc.statepoint sequences w/explicit relocations
This patch consists of a single pass whose only purpose is to visit previously inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'.

This patch is setting the stage for work to continue in tree.  In particular, there are known naming and style violations in the current patch.  I'll try to get those resolved over the next week or so.  As I touch each area to make style changes, I need to make sure we have adequate testing in place.  As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage.

The pass has several main subproblems it needs to address:
- First, it has to identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future.  Note that the current change doesn't actually contain a useful liveness analysis.  It was separated into a followup change as the code wasn't ready to be shared.  Instead, the current implementation just considers any dominating def of appropriate pointer type to be live.
- Second, it has to identify base pointers for each live pointer. This is a fairly straightforward data flow algorithm.
- Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us.

llvm-svn: 229945
2015-02-20 01:06:44 +00:00
Reid Kleckner 0b647e6cca EH: Prune unreachable resume instructions during Dwarf EH preparation
Today a simple function that only catches exceptions and doesn't run
destructor cleanups ends up containing a dead call to _Unwind_Resume
(PR20300). We can't remove these dead resume instructions during normal
optimization because inlining might introduce additional landingpads
that do have cleanups to run. Instead we can do this during EH
preparation, which is guaranteed to run after inlining.

Fixes PR20300.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D7744

llvm-svn: 229944
2015-02-20 01:00:19 +00:00
Eric Christopher 0d94fa98e5 Revert "AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics."
The instructions were being generated on architectures that don't support avx512.

This reverts commit r229837.

llvm-svn: 229942
2015-02-20 00:45:28 +00:00
Eric Christopher 06b32cdfed Add a license header to the AVX512 file.
llvm-svn: 229941
2015-02-20 00:36:53 +00:00
Kostya Serebryany 885994618c [sanitizer] when dumping the basic block trace, also dump the module names. Patch by Laszlo Szekeres
llvm-svn: 229940
2015-02-20 00:30:44 +00:00
Eric Christopher cd37bf5483 This needs to be a const variable so the two sides of the ternary
operator agree on type.

llvm-svn: 229938
2015-02-20 00:03:45 +00:00
Michael Gottesman 0fc2accb58 [objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity.
I also cleaned up the code to make it more understandable for mere mortals.

<rdar://problem/19853758>

llvm-svn: 229937
2015-02-20 00:02:49 +00:00
Michael Gottesman 5ab64de62b [objc-arc] Add the predicate CanDecrementRefCount.
This is different from CanAlterRefCount since CanDecrementRefCount is
attempting to prove specifically whether or not an instruction can
decrement instead of the more general question of whether it can
decrement or increment.

llvm-svn: 229936
2015-02-20 00:02:45 +00:00
Duncan P. N. Exon Smith d34db1716e IR: Fix MDType fields from unsigned to uint64_t
When trying to match the current schema with the new debug info
hierarchy, I downgraded `SizeInBits`, `AlignInBits` and `OffsetInBits`
to 32-bits (oops!).  Caught this while testing my upgrade script to move
the hierarchy into place.  Bump it back up to 64-bits and update tests.

llvm-svn: 229933
2015-02-19 23:56:07 +00:00
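The truncation fixed in r229933 is easy to demonstrate in isolation. The sketch below assumes a 32-bit field modelled with `uint32_t`; `roundTrips` and `narrowChecked` are hypothetical helpers, not part of the debug info classes.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when a value survives being stored in a field of type T.
template <typename T> bool roundTrips(uint64_t Value) {
  T Stored = static_cast<T>(Value);
  return static_cast<uint64_t>(Stored) == Value;
}

// Narrows a 64-bit size to 32 bits, asserting that no bits are lost.
uint32_t narrowChecked(uint64_t Value) {
  assert(roundTrips<uint32_t>(Value) && "size does not fit in 32 bits");
  return static_cast<uint32_t>(Value);
}
```

A 4 GiB object is 2^35 bits, so its `SizeInBits` does not survive a 32-bit field but does survive a 64-bit one.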
Ahmed Bougacha db141ac37d [ARM] Re-re-apply VLD1/VST1 base-update combine.
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.

The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes.  When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.

However, for ARMISD nodes, we just used the MMO alignment, no matter
what.  In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.

And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).

To fix this, we can just mirror the hack done for ISD nodes:  only
take into account the MMO alignment when the access is overaligned.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

rdar://19717869, rdar://14062261.

llvm-svn: 229932
2015-02-19 23:52:41 +00:00
Eric Christopher 2105ae98f6 Only use the initialized MCInstrInfo if it's been initialized already
during SetupMachineFunction. This is also the single use of MII
and it'll be changing to TargetInstrInfo (which is MachineFunction
based) in the next commit here.

llvm-svn: 229931
2015-02-19 23:52:35 +00:00
Duncan P. N. Exon Smith 72379706ea DebugInfo: Match Name and DisplayName in testcase
There's no way for `DIBuilder` to create a subprogram or global variable
where `getName()` and `getDisplayName()` give different answers.  This
testcase managed to achieve the feat though.  This was probably just
left behind in some sort of upgrade along the way.

llvm-svn: 229930
2015-02-19 23:48:17 +00:00
Ahmed Bougacha dfdf54bed0 [ARM] Minor cleanup to CombineBaseUpdate. NFC.
In preparation for a future patch:
- rename isLoad to isLoadOp: the former is confusing, and can be taken
  to refer to the fact that the node is an ISD::LOAD.  (it isn't, yet.)
- change formatting here and there.
- add some comments.
- const-ify bools.

llvm-svn: 229929
2015-02-19 23:30:37 +00:00
Eric Christopher 7330264146 Migrate away a use of the subtarget (and TargetMachine) from
AsmPrinterDwarf since the information is on the MCRegisterInfo
via the MCContext and MMI that we already have on the AsmPrinter.

llvm-svn: 229928
2015-02-19 23:29:42 +00:00
Duncan P. N. Exon Smith a9f0a8d325 IR: Add missing null operand to MDSubroutineType
Add missing `nullptr` from `MDSubroutineType`'s operands for
`MDCompositeTypeBase::getIdentifier()` (and add tests for all the other
unused fields).  This highlights just how crazy it is that
`MDSubroutineType` inherits from `MDCompositeTypeBase`.

llvm-svn: 229926
2015-02-19 23:25:21 +00:00
Ahmed Bougacha 4c2b0781a5 [CodeGen] Use ArrayRef instead of std::vector&. NFC.
The former lets us use SmallVectors.  Do so in ARM and AArch64.

llvm-svn: 229925
2015-02-19 23:13:10 +00:00
Eric Christopher cbdbf39881 MCTargetOptions reside on the TargetMachine that we always have via
TargetOptions.

llvm-svn: 229917
2015-02-19 21:29:51 +00:00
Eric Christopher 457864178f Remove a call to TargetMachine::getSubtarget from the inline
asm support in the asm printer. If we can get a subtarget from
the machine function then we should do so, otherwise we can
go ahead and create a default one since we're at the module
level.

llvm-svn: 229916
2015-02-19 21:24:23 +00:00
Colin LeMahieu 1174fea31c [Hexagon] Moving remaining methods off of HexagonMCInst in to HexagonMCInstrInfo and eliminating HexagonMCInst class.
llvm-svn: 229914
2015-02-19 21:10:50 +00:00
Benjamin Kramer 68ca67b212 MC: Allow multiple comma-separated expressions on the .uleb128 directive.
For compatibility with GNU as. Binutils documents this as
'.uleb128 expressions'. Subtle, isn't it?
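
For reference, a minimal sketch of what a comma-separated operand list amounts to at the byte level: each expression is ULEB128-encoded in turn and the results are emitted back to back. `encodeULEB128List` is a hypothetical helper name used for illustration only; this is not the MC implementation.

```cpp
#include <cstdint>
#include <vector>

// Append the ULEB128 encoding of Value to Out: 7 payload bits per byte,
// least-significant group first, high bit set on every byte but the last.
inline void encodeULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical helper: encode every operand of a list such as
// ".uleb128 2, 127, 128" in order, concatenating the encodings.
inline std::vector<uint8_t>
encodeULEB128List(const std::vector<uint64_t> &Values) {
  std::vector<uint8_t> Out;
  for (uint64_t V : Values)
    encodeULEB128(V, Out);
  return Out;
}
```

Under this sketch, the operands 2, 127 and 128 would produce the bytes 02, 7f, 80 01.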

llvm-svn: 229911
2015-02-19 20:24:04 +00:00
Benjamin Kramer dfedfeb298 SSAUpdater: Use range-based for. NFC.
llvm-svn: 229908
2015-02-19 20:04:02 +00:00
Eric Christopher 64d35be6d6 Remove unused argument from emitInlineAsmStart.
llvm-svn: 229907
2015-02-19 19:52:25 +00:00
Michael Gottesman 2e0e4e07b4 [objc-arc] Convert the bodies of ARCInstKind predicates into covered switches.
This is much better than the previous manner of just using
short-circuiting booleans from:

1. A "naive" efficiency perspective: we do not have to rely on the
compiler to change the short circuiting boolean operations into a
switch.
2. An understanding perspective by making the implicit behavior of
negative predicates explicit.
3. A maintainability perspective through the covered switch flag making
it easy to know where to update code when adding new ARCInstKinds.
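
The difference between the two styles can be sketched as follows. The enum and predicate names are illustrative stand-ins (the real ARCInstKind has many more members), and the sketch assumes the compiler's `-Wswitch` style diagnostics are enabled.

```cpp
// Illustrative enum only; not the real ARCInstKind.
enum class InstKind { Retain, Release, Autorelease, Call, None };

// Short-circuiting form: kinds that are not mentioned silently yield false,
// so a newly added enumerator is never flagged here.
inline bool isRetainOrReleaseBool(InstKind K) {
  return K == InstKind::Retain || K == InstKind::Release;
}

// Covered-switch form: every enumerator is listed and there is no default
// label, so a new enumerator produces a -Wswitch warning at this switch
// until a case is added for it.
inline bool isRetainOrRelease(InstKind K) {
  switch (K) {
  case InstKind::Retain:
  case InstKind::Release:
    return true;
  case InstKind::Autorelease:
  case InstKind::Call:
  case InstKind::None:
    return false;
  }
  return false; // only reached for values outside the enumerator list
}
```

Both functions return the same answers; the second makes the negative cases explicit, which is point 2 above.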

llvm-svn: 229906
2015-02-19 19:51:36 +00:00
Michael Gottesman 6f729fa675 [objc-arc] Change the InstructionClass to be an enum class called ARCInstKind.
I also renamed ObjCARCUtil.cpp -> ARCInstKind.cpp. That file only contained
items related to ARCInstKind anyways.

llvm-svn: 229905
2015-02-19 19:51:32 +00:00
Chris Bieneman a747e5935d Checking if TARGET_OS_IPHONE is defined isn't good enough for 10.7 and earlier.
Older versions of the TargetConditionals header always defined TARGET_OS_IPHONE to something (0 or 1), so we need to test not only for the existence but also if it is 1.
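
A small sketch of why the existence-only test misfires. `DEMO_TARGET_OS_IPHONE` is a hypothetical stand-in for `TARGET_OS_IPHONE`, defined here to 0 to mimic what the older headers are described as doing on a Mac build; the constant names are likewise illustrative.

```cpp
// Hypothetical stand-in for TARGET_OS_IPHONE as an older
// TargetConditionals.h would define it on a Mac build: present, but 0.
#define DEMO_TARGET_OS_IPHONE 0

// Existence-only test: this branch is taken even though the value is 0.
#if defined(DEMO_TARGET_OS_IPHONE)
constexpr bool ExistenceOnlyCheck = true;
#else
constexpr bool ExistenceOnlyCheck = false;
#endif

// Existence-and-value test: taken only when the macro is defined and nonzero.
#if defined(DEMO_TARGET_OS_IPHONE) && DEMO_TARGET_OS_IPHONE
constexpr bool ExistenceAndValueCheck = true;
#else
constexpr bool ExistenceAndValueCheck = false;
#endif
```

With the macro defined to 0, only the second form gives the intended "not an iPhone build" answer.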

This resolves PR22631.

llvm-svn: 229904
2015-02-19 19:50:52 +00:00
Colin LeMahieu 745c4710db [Hexagon] Moving more functions off of HexagonMCInst and in to HexagonMCInstrInfo.
llvm-svn: 229903
2015-02-19 19:49:27 +00:00
Adam Nemet 57ac766ee9 [LoopAccesses] Change LAA:getInfo to return a constant reference
As expected, this required a few more const-correctness fixes.

Based on Hal's feedback on D7684.

llvm-svn: 229899
2015-02-19 19:15:21 +00:00
Adam Nemet e91cc6ef93 [LoopAccesses] Add -analyze support
The LoopInfo in combination with depth_first is used to enumerate the
loops.

Right now -analyze is not yet complete.  It only prints the result of
the analysis, the report and the run-time checks.  Printing the unsafe
dependences will require a bit more reshuffling which I'd like to do in a
follow-on to this patchset.  Unsafe dependences are currently checked
via -debug-only=loop-accesses in the new test.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229898
2015-02-19 19:15:19 +00:00
Adam Nemet 2bd6e984ef [LoopAccesses] Split out LoopAccessReport from VectorizerReport
The only difference between these two is that VectorizerReport adds a
vectorizer-specific prefix to its messages.  When LAA is used in the
vectorizer context the prefix is added when we promote the
LoopAccessReport into a VectorizerReport via one of the constructors.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229897
2015-02-19 19:15:15 +00:00
Adam Nemet 3e87634fd8 [LoopAccesses] Add missing const to APIs in VectorizationReport
When I split out LoopAccessReport from this, I need to create some temps
so constness becomes necessary.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229896
2015-02-19 19:15:13 +00:00
Adam Nemet 929c38e8ff [LoopAccesses] Add canAnalyzeLoop
This allows the analysis to be attempted with any loop.  This feature
will be used with -analyze.  (LV only requests the analysis on loops
that have already satisfied these tests.)

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229895
2015-02-19 19:15:10 +00:00
Adam Nemet 339f42b396 [LoopAccesses] Change debug messages from LV to LAA
Also add pass name as an argument to VectorizationReport::emitAnalysis.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229894
2015-02-19 19:15:07 +00:00
Adam Nemet 3bfd93d789 [LoopAccesses] Create the analysis pass
This is a function pass that runs the analysis on demand.  The analysis
can be initiated by querying the loop access info via LAA::getInfo.  It
either returns the cached info or runs the analysis.
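
The on-demand behaviour can be sketched with a toy model. Everything here is an assumption made for illustration: loops are keyed by an integer id, the "analysis" is a single boolean, and the symbolic stride handling discussed below is left out. It is not the pass's actual code.

```cpp
#include <map>

// Toy stand-in for a loop; the real pass works on llvm::Loop.
struct Loop {
  int Id;
  bool HasUnsafeDependence;
};

// Runs the (toy) analysis once at construction and stores the result.
class LoopAccessInfo {
public:
  explicit LoopAccessInfo(const Loop &L) { analyzeLoop(L); }
  // Query function for the stored result.
  bool canVectorizeMemory() const { return CanVecMem; }

private:
  void analyzeLoop(const Loop &L) { CanVecMem = !L.HasUnsafeDependence; }
  bool CanVecMem = false;
};

class LoopAccessAnalysis {
public:
  // Return the cached info for L, running the analysis on first request.
  const LoopAccessInfo &getInfo(const Loop &L) {
    auto It = Cache.find(L.Id);
    if (It == Cache.end()) {
      ++NumAnalysesRun;
      It = Cache.emplace(L.Id, LoopAccessInfo(L)).first;
    }
    return It->second;
  }

  unsigned NumAnalysesRun = 0; // exposed only to make the caching visible

private:
  std::map<int, LoopAccessInfo> Cache;
};
```

Querying the same loop twice returns the same object and runs the analysis only once.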

Symbolic stride information continues to reside outside of this analysis
pass. We may move it inside later but it's not a priority for me right
now.  The idea is that Loop Distribution won't support run-time stride
checking at least initially.

This means that when querying the analysis, symbolic stride information
can be provided optionally.  Whether stride information is used can
invalidate the cache entry and rerun the analysis.  Note that if the
loop does not have any symbolic stride, the entry should be preserved
across Loop Distribution and LV.

Since currently the only user of the pass is LV, I just check that the
symbolic stride information didn't change when using a cached result.

On the LV side, LoopVectorizationLegality requests the info object
corresponding to the loop from the analysis pass.  A large chunk of the
diff is due to LAI becoming a pointer from a reference.

A test will be added as part of the -analyze patch.

Also tested that with AVX, we generate identical assembly output for the
testsuite (including the external testsuite) before and after.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229893
2015-02-19 19:15:04 +00:00
Adam Nemet 436018c3ff [LoopAccesses] Cache the result of canVectorizeMemory
LAA will be an on-demand analysis pass, so we need to cache the result
of the analysis.  canVectorizeMemory is renamed to analyzeLoop which
computes the result.  canVectorizeMemory becomes the query function for
the cached result.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229892
2015-02-19 19:15:00 +00:00
Adam Nemet c922853b93 [LoopAccesses] Stash the report from the analysis rather than emitting it
The transformation passes will query this and then emit it as part of
their own report.  The only current user, LV, is modified to do just
that.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229891
2015-02-19 19:14:56 +00:00
Adam Nemet f219c64723 [LoopAccesses] Make VectorizerParams global + fix for cyclic dep
As LAA is becoming a pass, we can no longer pass the params to its
constructor.  This changes the command line flags to have external
storage.  These can now be accessed both from LV and LAA.

VectorizerParams is moved out of LoopAccessInfo in order to shorten the
code to access it.

This commit also has the fix (D7731) to break the dependence cycle
between the analysis and vector libraries.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229890
2015-02-19 19:14:52 +00:00
Adam Nemet 04d4163e95 Revert "Reformat."
This reverts commit r229651.

I'd like to ultimately revert r229650 but this reformat stands in the
way.  I'll reformat the affected files once the loop-access pass is
fully committed.

llvm-svn: 229889
2015-02-19 19:14:34 +00:00
David Blaikie 8d2d6b1d05 [orcjit] Include CMake support for the fully_lazy example and fix the build
Not sure if/how to make the CMake build use C++14 for the examples, so
let's stick to C++11 for now.

llvm-svn: 229888
2015-02-19 19:06:04 +00:00
Colin LeMahieu af304e5192 [Hexagon] Creating HexagonMCInstrInfo namespace as landing zone for static functions detached from HexagonMCInst.
llvm-svn: 229885
2015-02-19 19:00:00 +00:00
Eric Christopher 1e61ffddc7 Fix grammar in documentation.
Patch by Ralph Campbell!

llvm-svn: 229884
2015-02-19 18:46:25 +00:00
Eric Christopher 504f388a84 Update and remove a few calls to TargetMachine::getSubtargetImpl
out of the asm printer.

llvm-svn: 229883
2015-02-19 18:46:23 +00:00
Kostya Serebryany 016852c396 [fuzzer] split main() into FuzzerDriver() that takes a callback as a parameter and a tiny main() in a separate file
llvm-svn: 229882
2015-02-19 18:45:37 +00:00
Ben Langmuir 0897091730 Assume the original file is created before release in LockFileManager
This is true in clang, and lets us remove the problematic code that
waits around for the original file and then times out if it doesn't get
created in short order.  This caused any 'dead' lock file or legitimate
time out to cause a cascade of timeouts in any processes waiting on the
same lock (even if they only just showed up).

llvm-svn: 229881
2015-02-19 18:22:35 +00:00
Kostya Serebryany 2117269dd1 [fuzzer] properly annotate fallthrough, add one more entry to FAQ
llvm-svn: 229880
2015-02-19 18:21:12 +00:00
Colin LeMahieu f08a3ccf50 [Hexagon] Removing static variable holding MCInstrInfo.
llvm-svn: 229872
2015-02-19 17:38:39 +00:00
Benjamin Kramer 1c2beed7fd LSR: Move set instead of copying. NFC.
llvm-svn: 229871
2015-02-19 17:19:43 +00:00
Sanjay Patel f34a29a845 add X86 load folding tests for unary math ops
X86 load folding is fragile; e.g., the tests here
don't work without AVX even though they should. This
is because we have a mix of tablegen patterns that have
been added over time, and we have a load folding table
used by the peephole optimizer that has to be kept in 
sync with the ever-changing ISA and tablegen defs.

llvm-svn: 229870
2015-02-19 16:59:11 +00:00
Rafael Espindola 8c97e19124 Avoid conversion to float when creating ConstantDataArray/ConstantDataVector.
Patch by Raoux, Thomas F!

llvm-svn: 229864
2015-02-19 16:08:20 +00:00
Benjamin Kramer ea68a944a1 Demote vectors to arrays. No functionality change.
llvm-svn: 229861
2015-02-19 15:26:17 +00:00
Chandler Carruth 5d1a84b7b8 [x86] Delete still more piles of complex code now that we have a good
systematic lowering of v8i16.

This required a slight strategy shift to prefer unpack lowerings in more
places. While this isn't a cut-and-dry win in every case, it is in the
overwhelming majority. There are only a few places where the old
lowering would probably be a touch faster, and then only by a small
margin.

In some cases, this is yet another significant improvement.

llvm-svn: 229859
2015-02-19 15:21:57 +00:00
Chandler Carruth 0b39536390 [x86] Teach the unpack lowering how to lower with an initial unpack in
addition to lowering to trees rooted in an unpack.

This saves shuffles and/or registers in many ways, lets us
handle another class of v4i32 shuffles pre SSE4.1 without domain
crosses, etc.

llvm-svn: 229856
2015-02-19 15:06:13 +00:00
Chandler Carruth 352eba1c29 [x86] Dramatically improve v8i16 shuffle lowering by not using its
terribly complex partial blend logic.

This code path was one of the more complex and bug prone when it first
went in and it hasn't fared much better. Ultimately, with the simpler
basis for unpack lowering and support bit-math blending, this is
completely obsolete. In the worst case without this we generate
different but equivalent instructions. However, in many cases we
generate much better code. This is especially true when blends or pshufb
is available.

This does expose one (minor) weakness of the unpack lowering that I'll
try to address.

In case you were wondering, this is actually a big part of what I've
been trying to pull off in the recent string of commits.

llvm-svn: 229853
2015-02-19 14:08:24 +00:00
Chandler Carruth 2c0390ca4b [x86] Remove the final fallback in the v8i16 lowering that isn't really
needed, and significantly improve the SSSE3 path.

This makes the new strategy much more clear. If we can blend, we just go
with that. If we can't blend, we try to permute into an unpack so
that we handle cases where the unpack doing the blend also simplifies
the shuffle. If that fails and we've got SSSE3, we now call into
factored-out pshufb lowering code so that we leverage the fact that
pshufb can set up a blend for us while shuffling. This generates great
code, especially because we *know* we don't have a fast blend at this
point. Finally, we fall back on decomposing into permutes and blends
because we do at least have a bit-math-based blend if we need to use
that.

This pretty significantly improves some of the v8i16 code paths. We
never need to form pshufb for the single-input shuffles because we have
effective target-specific combines to form it there, but we were missing
its effectiveness in the blends.

llvm-svn: 229851
2015-02-19 13:56:49 +00:00
Chandler Carruth f0f0d27391 [x86] Simplify the pre-SSSE3 v16i8 lowering significantly by decomposing
them into permutes and a blend with the generic decomposition logic.

This works really well in almost every case and lets the code only
manage the expansion of a single input into two v8i16 vectors to perform
the actual shuffle. The blend-based merging is often much nicer than the
pack-based merging that this replaces. The only place where it isn't is
where we end up blending between two packs when we could do a single pack. To
handle that case, just teach the v2i64 lowering to handle these blends
by digging out the operands.

With this we're down to only really random permutations that cause an
explosion of instructions.

llvm-svn: 229849
2015-02-19 13:15:12 +00:00
Chandler Carruth 8817e5e01b [x86] Remove the insanely over-aggressive unpack lowering strategy for
v16i8 shuffles, and replace it with new facilities.

This uses precise patterns to match exact unpacks, and the new
generalized unpack lowering only when we detect a case where we will
have to shuffle both inputs anyways and they terminate in exactly
a blend.

This fixes all of the blend horrors that I uncovered by always lowering
blends through the vector shuffle lowering. It also removes *sooooo*
much of the crazy instruction sequences required for v16i8 lowering
previously. Much cleaner now.

The only "meh" aspect is that we sometimes use pshufb+pshufb+unpck when
it would be marginally nicer to use pshufb+pshufb+por. However, the
difference there is *tiny*. In many cases it's a win because we re-use
the pshufb mask. In others, we get to avoid the pshufb entirely. I've
left a FIXME, but I'm dubious we can really do better than this. I'm
actually pretty happy with this lowering now.

For SSE2 this exposes some horrors that were really already there. Those
will have to be fixed by changing a different path through the v16i8
lowering.

llvm-svn: 229846
2015-02-19 12:10:37 +00:00
Jozef Kolek 5d171fc291 [mips][microMIPS] Make usage of AND16, OR16 and XOR16 by code generator
Differential Revision: http://reviews.llvm.org/D7611

llvm-svn: 229845
2015-02-19 11:51:32 +00:00
Chandler Carruth 38dea42ddf [x86] The SELECT x86 DAG combine also does legalization. It used to rely
on things not being marked as either custom or legal, but we now do
custom lowering of more VSELECT nodes. To cope with this, manually
replicate the legality tests here. These have to stay in sync with the
set of tests used in the custom lowering of VSELECT.

Ideally, we wouldn't do any of this combine-based-legalization when we
have an actual custom legalization step for VSELECT, but I'm not going
to be able to rewrite all of that today.

I don't have a test case for this currently, but it was found when
compiling a number of the test-suite benchmarks. I'll try to reduce
a test case and add it.

This should at least fix the test-suite fallout on build bots.

llvm-svn: 229844
2015-02-19 11:43:37 +00:00
Igor Laevsky 55d60a4a2f Add a few simple tests to check statepoint placement for invoke instructions.
Differential Revision: http://reviews.llvm.org/D7535

llvm-svn: 229842
2015-02-19 11:39:04 +00:00
Michael Kuperstein efd7a96d2e Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.
llvm-svn: 229841
2015-02-19 11:38:11 +00:00
Igor Laevsky 9570ff94f7 Implement invoke statepoint verification.
Differential Revision: http://reviews.llvm.org/D7366

llvm-svn: 229840
2015-02-19 11:28:47 +00:00
Igor Laevsky 77f118f878 Add invoke related functionality into StatepointSite classes.
Differential Revision: http://reviews.llvm.org/D7364

llvm-svn: 229838
2015-02-19 11:02:11 +00:00
Elena Demikhovsky 69e8b45b13 AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics.
llvm-svn: 229837
2015-02-19 10:48:04 +00:00
Chandler Carruth bcb6c5f62d [x86] Add support for bit-wise blending and use it in the v8 and v16
lowering paths. I'm going to be leveraging this to simplify a lot of the
overly complex lowering of v8 and v16 shuffles in pre-SSSE3 modes.
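
As a rough illustration of what a bit-wise blend computes, here is a scalar C++ model of the and/andnot/or pattern over eight 16-bit lanes. The type alias and function name are made up for this sketch; the actual lowering operates on DAG nodes, not arrays.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8i16 = std::array<uint16_t, 8>;

// Take lane I from A where Mask[I] is all-ones and from B where it is zero.
// This is the bit-math form (A & M) | (B & ~M), applied lane by lane.
inline V8i16 bitBlend(const V8i16 &A, const V8i16 &B, const V8i16 &Mask) {
  V8i16 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = static_cast<uint16_t>((A[I] & Mask[I]) | (B[I] & ~Mask[I]));
  return R;
}
```

Because the mask is a constant known at compile time for these shuffles, the whole blend reduces to three bitwise operations on full vectors.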

Sadly, this isn't profitable on v4i32 and v2i64. There, the float and
double blending instructions for pre-SSE4.1 are actually pretty good,
and we can't beat them with bit math. And once SSE4.1 comes around we
have direct blending support and this ceases to be relevant.

Also, some of the test cases look odd because the domain fixer
canonicalizes these to floating point domain. That's OK, it'll use the
integer domain when it matters and some day I may be able to update
enough of LLVM to canonicalize the other way.

This restores almost all of the regressions from teaching x86's vselect
lowering to always use vector shuffle lowering for blends. The remaining
problems are because the v16 lowering path is still doing crazy things.
I'll be re-arranging that strategy in more detail in subsequent commits
to finish recovering the performance here.

llvm-svn: 229836
2015-02-19 10:46:52 +00:00
Chandler Carruth b89464a9b6 [x86,sdag] Two interrelated changes to the x86 and sdag code.
First, don't combine bit masking into vector shuffles (even ones the
target can handle) once operation legalization has taken place. Custom
legalization of vector shuffles may exist for these patterns (making the
predicate return true) but that custom legalization may in some cases
produce the exact bit math this matches. We only really want to handle
this prior to operation legalization.

However, the x86 backend, in a fit of awesome, relied on this. What it
would do is mark VSELECTs as expand, which would turn them into
arithmetic, which this would then match back into vector shuffles, which
we would then lower properly. Amazing.

Instead, the second change is to teach the x86 backend to directly form
vector shuffles from VSELECT nodes with constant conditions, and to mark
all of the vector types we support lowering blends as shuffles as custom
VSELECT lowering. We still mark the forms which actually support
variable blends as *legal* so that the custom lowering is bypassed, and
the legal lowering can even be used by the vector shuffle legalization
(yes, I know, this is confusing. But that's how the patterns are
written).

This makes the VSELECT lowering much more sensible, and in fact should
fix a bunch of bugs with it. However, as you'll see in the test cases,
right now what it does is point out the *hilarious* deficiency of the
new vector shuffle lowering when it comes to blends. Fortunately, my
very next patch fixes that. I can't submit it yet, because that patch,
somewhat obviously, forms the exact and/or pattern that the DAG combine
is matching here! Without this patch, teaching the vector shuffle
lowering to produce the right code infloops in the DAG combiner. With
this patch alone, we produce terrible code but at least lower through
the right paths. With both patches, all the regressions here should be
fixed, and a bunch of the improvements (like using 2 shufps with no
memory loads instead of 2 andps with memory loads and an orps) will
stay. Win!

There is one other change worth noting here. We had hilariously wrong
vectorization cost estimates for vselect because we fell through to the
code path that assumed all "expand" vector operations are scalarized.
However, the "expand" lowering of VSELECT is vector bit math, most
definitely not scalarized. So now we go back to the correct if horribly
naive cost of "1" for "not scalarized". If anyone wants to add actual
modeling of shuffle costs, that would be cool, but this seems an
improvement on its own. Note the removal of 16 and 32 "costs" for doing
a blend. Even in SSE2 we can blend in fewer than 16 instructions. ;] Of
course, we don't right now because of OMG bad code, but I'm going to fix
that. Next patch. I promise.

llvm-svn: 229835
2015-02-19 10:36:19 +00:00
Michael Kuperstein ba5b04c798 Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, this switches the features to use a bitset.

No functional change.
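
A minimal sketch of the idea, not the patch itself: with `std::bitset` the feature count is no longer capped at 64. The constant `MaxSubtargetFeatures`, the alias and the two helpers are hypothetical names chosen for this example.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical feature count; anything above 64 would not fit in a
// uint64_t mask, where a shift such as (1ULL << 70) is undefined.
constexpr std::size_t MaxSubtargetFeatures = 96;
using FeatureBitset = std::bitset<MaxSubtargetFeatures>;

// Return a copy of Bits with the given feature index enabled.
// std::bitset::set throws std::out_of_range for an index >= size().
inline FeatureBitset withFeature(FeatureBitset Bits, std::size_t Feature) {
  Bits.set(Feature);
  return Bits;
}

// Check whether a feature index is enabled.
inline bool hasFeature(const FeatureBitset &Bits, std::size_t Feature) {
  return Bits.test(Feature);
}
```

Feature index 70 can be set and tested here just like index 3, which is the headroom the change is after.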

Differential Revision: http://reviews.llvm.org/D7065

llvm-svn: 229831
2015-02-19 09:01:04 +00:00
Davide Italiano faafae33fa [Support/Timer] Make GetMallocUsage() aware of jemalloc.
Differential Revision:	D7657
Reviewed by:	shankarke, majnemer

llvm-svn: 229824
2015-02-19 07:27:14 +00:00
Lang Hames c6ba0bf33b [Orc][Kaleidoscope] Fix typo in tutorial comment.
llvm-svn: 229821
2015-02-19 05:33:30 +00:00
Dmitri Gribenko 3e1551c96f Provide the same ABI regardless of NDEBUG
For projects depending on LLVM, I find it very useful to combine a
release-no-asserts build of LLVM with a debug+asserts build of the dependent
project.  The motivation is that when developing a dependent project, you are
debugging that project itself, not LLVM.  In my usecase, a significant part of
the runtime is spent in LLVM optimization passes, so I would like to build LLVM
without assertions to get the best performance from this combination.

Currently, `lib/Support/Debug.cpp` changes the set of symbols it provides
depending on NDEBUG, while `include/llvm/Support/Debug.h` requires extra
symbols when NDEBUG is not defined.  Thus, it is not possible to enable
assertions in an external project that uses facilities of `Debug.h`.

This patch changes `Debug.cpp` and `Valgrind.cpp` to always define the symbols
that other code may depend on when #including LLVM headers without NDEBUG.

http://reviews.llvm.org/D7662

llvm-svn: 229819
2015-02-19 05:30:16 +00:00
Lang Hames 56678fe634 [Orc][Kaleidoscope] Make the 'fully lazy' orc kaleidoscope tutorial lazier still.
The new JIT doesn't IRGen stubs until they're referenced.

llvm-svn: 229807
2015-02-19 01:32:43 +00:00
Lang Hames af53ed1a7f [Orc] Fix a bug in the compile callback manager: trampoline ids need to be fixed
up before returning them to the available pool.

llvm-svn: 229806
2015-02-19 01:31:25 +00:00
Eric Christopher d84f5d30e2 Remove the local subtarget variable from the SystemZ asm printer
and update the two calls accordingly.

llvm-svn: 229805
2015-02-19 01:26:28 +00:00
Eric Christopher 0795a2ef0c Remove a few more calls to TargetMachine::getSubtarget from the
R600 port.

llvm-svn: 229804
2015-02-19 01:10:55 +00:00
Eric Christopher 7edca437f5 Grab the subtarget off of the machine function for the R600
asm printer and clean up a bunch of uses.

llvm-svn: 229803
2015-02-19 01:10:53 +00:00
Eric Christopher 96caeda730 Remove the DisasmEnabled AsmPrinter variable and just look it
up on the subtarget where it's set anyhow, rather than looking it up
2-3 times in the same place.

llvm-svn: 229802
2015-02-19 01:10:49 +00:00
Peter Collingbourne fb8002cbe0 MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer.
llvm-svn: 229799
2015-02-19 00:45:07 +00:00
Peter Collingbourne f4498a4fd3 llvm-mc: Use Target::createNullStreamer to fix crashes on target-specific asm directives.
llvm-svn: 229798
2015-02-19 00:45:04 +00:00
Peter Collingbourne 20c7259ce9 Introduce Target::createNullTargetStreamer and use it from IRObjectFile.
A null MCTargetStreamer allows IRObjectFile to ignore target-specific
directives. Previously we were crashing.

Differential Revision: http://reviews.llvm.org/D7711

llvm-svn: 229797
2015-02-19 00:45:02 +00:00
Michael Gottesman e5ad66f8a9 [objc-arc] Introduce the concept of RCIdentity and rename all relevant functions to use that name. NFC.
The RCIdentity root ("Reference Count Identity Root") of a value V is a
dominating value U for which retaining or releasing U is equivalent to
retaining or releasing V. In other words, ARC operations on V are
equivalent to ARC operations on U.

This is a useful property to ascertain since we can use this in the ARC
optimizer to make it easier to match up ARC operations by always mapping
ARC operations to RCIdentityRoots instead of pointers themselves. Then
we perform pairing of retains and releases which are applied to the same
RCIdentityRoot.

In general, the two ways that we see RCIdentical values in ObjC are via:

  1. PointerCasts
  2. Forwarding Calls that return their argument verbatim.

As such in ObjC, two RCIdentical pointers must always point to the same
memory location.
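
A toy model of the definition, assuming only the two sources of RCIdentical values listed above. The `Value` struct and `getRCIdentityRoot` below are simplified stand-ins written for this sketch and do not use LLVM's IR classes.

```cpp
// Toy value kinds: a fresh object, a pointer cast, or a call that
// returns its argument verbatim.
enum class ValueKind { Alloc, PointerCast, ForwardingCall };

struct Value {
  ValueKind Kind;
  const Value *Operand; // casted pointer or forwarded argument; null for Alloc
};

// Walk through casts and forwarding calls until reaching a value that is
// neither; retaining or releasing V is equivalent to doing so on the result.
inline const Value *getRCIdentityRoot(const Value *V) {
  while (V->Kind == ValueKind::PointerCast ||
         V->Kind == ValueKind::ForwardingCall)
    V = V->Operand;
  return V;
}
```

Two values with the same root in this model are RCIdentical, so a retain on one can be paired with a release on the other.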

Previously this concept was implicit in the code and various methods
that dealt with this concept were given functional names that did not
conform to any name in the "ARC" model. This oftentimes resulted in
code that was hard for the non-ARC acquainted to understand, resulting in
unhappiness and confusion.

llvm-svn: 229796
2015-02-19 00:42:38 +00:00