Commit Graph

121625 Commits

Author SHA1 Message Date
Chandler Carruth 233edd20a7 [ADT] Rewrite the StringRef::find implementation to be simpler, clearer,
and tremendously less reliant on the optimizer to fix things.

The code is always necessarily looking for the entire length of the
string when doing the equality tests in this find implementation, but it
previously was needlessly re-checking the size each time among other
annoyances.

By writing this so simply an ddirectly in terms of memcmp, it also is
about 8x faster in a debug build, which in turn makes FileCheck about 2x
faster in 'ninja check-llvm'. This saves about 8% of the time for
FileCheck-heavy parts of the test suite like the x86 backend tests.

llvm-svn: 247269
2015-09-10 11:17:49 +00:00
Silviu Baranga df9ce8408a [DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors
Summary:
The BUILD_VECTOR node will truncate its operators to match the
type. We need to take this into account when constant folding -
we need to perform a truncation before constant folding the elements.
This is because the upper bits can change the result, depending on
the operation type (for example this is the case for min/max).

This change also adds a regression test.

Reviewers: jmolloy

Subscribers: jmolloy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12697

llvm-svn: 247265
2015-09-10 10:34:34 +00:00
James Molloy d47634d781 Enable GlobalsAA by default
This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations.

llvm-svn: 247264
2015-09-10 10:22:20 +00:00
James Molloy efbba72cb2 Add GlobalsAA as preserved to a bunch of transforms
GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA.

llvm-svn: 247263
2015-09-10 10:22:12 +00:00
Chandler Carruth 9e1c0c1500 [ADT] Force inline several super boring and unusually hot methods on
SmallVector to further help debug builds not waste their time calling
one line functions.

To give you an idea of why this is worthwhile, this change alone gets
another >10% reduction in the runtime of TripleTest.Normalization! It's
now under 9 seconds for me. Sadly, this is the end of the easy wins for
that test. Anything further will require some different architecture of
the test itself. Still, I'm pretty happy. 'check-llvm' now is under 35s
for me.

llvm-svn: 247259
2015-09-10 09:46:47 +00:00
Chandler Carruth 693683426b [ADT] Micro-optimize and force inlining for string switches.
These are now quite heavily used in unit tests and the host tools,
making it worth having them be reasonably fast even in an unoptimized
build. This change reduces the total runtime of TripleTest.Normalization
by yet another 10% to 15%. It is now under 10 seconds on my machine, and
the total check-llvm time has dropped from 38s to around 36s.

I experimented with a number of different options, and the code pattern
here consistently seemed to lower the cleanest, likely due to the
significantly simple CFG and far fewer redundant tests of 'Result'.

llvm-svn: 247257
2015-09-10 09:25:59 +00:00
James Molloy 8c995a93ce [ARM] Do not use vtrn for vectorshuffle if the order is reversed
The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case.

Patch by Jeroen Ketema!

llvm-svn: 247254
2015-09-10 08:42:28 +00:00
Chandler Carruth 6f77949d8b [ADT] Apply a large hammer to StringRef functions: attribute always_inline.
The logic of this follows something Howard does in libc++ and something
I discussed with Chris eons ago -- for a lot of functions, there is
really no benefit to preserving "debug information" by leaving the
out-of-line even in debug builds. This is especially true as we now do
a very good job of preserving most debug information even in the face of
inlining. There are a bunch of methods in StringRef that we are paying
a completely unacceptable amount for with every debug build of every
LLVM developer.

Some day, we should fix Clang/LLVM so that developers can reasonable
use a default of something other than '-O0' and not waste their lives
waiting on *completely* unoptimized code to execute. We should have
a default that doesn't impede debugging while providing at least
plausable performance.

But today is not that day.

So today, I'm applying always_inline to the functions that are really
hurting the critical path for stuff like 'check_llvm'. I'm being very
cautious here, but there are a few other APIs that we really should do
this for as a matter of pragmatism. Hopefully we can rip this out some
day.

With this change, TripleTest.Normalization runtime decreases by over
10%, and the total 'check-llvm' time on my 48-core box goes from 38s to
just under 37s.

llvm-svn: 247253
2015-09-10 08:29:35 +00:00
Chandler Carruth 4f4541356b [Support] Fix the always_inline attribute macro to not include the
'inline' specifier. That specifier may or may not be valid for a given
function, or it may be required for correct linkage even when the
compiler doesn't support the always_inline attribute.

llvm-svn: 247252
2015-09-10 08:29:30 +00:00
Chandler Carruth f054eca167 [ADT] Micro-optimize the Triple constructor by doing a single split and
re-using the resulting components rather than repeatedly splitting and
re-splitting to compute each component as part of the initializer list.

This is more work on PR23676. Sadly, it doesn't help much. It removes
the constructor from my profile, but doesn't make a sufficient dent in
the total time. But it should play together nicely with subsequent
changes.

llvm-svn: 247250
2015-09-10 07:51:43 +00:00
Chandler Carruth 4425c91dea [ADT] Fix a confusing interface spec and some annoying peculiarities
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).

The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.

The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.

Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.

I'll start using this in code in the next commit.

llvm-svn: 247249
2015-09-10 07:51:37 +00:00
NAKAMURA Takumi 1a296ec6d1 GlobalsAAResult(&&): Move every members.
Or, one of MSVC builders failed with unexpected behavior.

llvm-svn: 247247
2015-09-10 07:16:42 +00:00
Elena Demikhovsky 5cf3a02992 Added isUndef() interface for SDNode
Differential Revision: http://reviews.llvm.org/D12720

llvm-svn: 247246
2015-09-10 06:33:13 +00:00
Chandler Carruth e4405e949f [ADT] Switch a bunch of places in LLVM that were doing single-character
splits to actually use the single character split routine which does
less work, and in a debug build is *substantially* faster.

llvm-svn: 247245
2015-09-10 06:12:31 +00:00
Chandler Carruth 477121721b [ADT] Add a single-character version of the small vector split routine
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.

Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.

llvm-svn: 247244
2015-09-10 06:07:03 +00:00
Chandler Carruth 93d5d3b5db Add a way to skip the Go bindings tests even when Go is configured in
CMake.

The Go bindings tests in an unoptimized build take over 30 seconds for
me, making it the slowest test in 'check-llvm' by a factor of two.

I've only rigged this up fully to the CMake build. If someone is
interested in rigging it up to the autoconf build, they're welcome to do
so.

llvm-svn: 247243
2015-09-10 05:47:43 +00:00
Sanjoy Das f3132d3b03 [ScalarEvolution] Fix PR24757.
Summary:
PR24757 was caused by some incorect math in
`ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X
in

  2^N * A = 2^N * X

is not necessarily A.

Reviewers: atrick, majnemer, meheff

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12721

llvm-svn: 247242
2015-09-10 05:27:38 +00:00
Chandler Carruth 87275186d1 [LPM] Simplify this code and fix a compile error for compilers that
don't correctly implement the scoping rules of C++11 range based for
loops. This kind of aliasing isn't a good idea anyways (and wasn't
really intended).

llvm-svn: 247241
2015-09-10 04:22:36 +00:00
Chandler Carruth b1e3a9ae8d [LPM] Use a map from analysis ID to immutable passes in the legacy pass
manager to avoid a slow linear scan of every immutable pass and on every
attempt to find an analysis pass.

This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV.
It should also help (a tiny bit) other folks that are really
bottlenecked on repeated runs of tiny pass pipelines across small IR
files.

llvm-svn: 247240
2015-09-10 02:31:42 +00:00
Kit Barton d3b904d440 Enable the shrink wrapping optimization for PPC64.
The changes in this patch are as follows:
  1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function
  2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function
  3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run:
      Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line

A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64.

Phabricator review: http://reviews.llvm.org/D11817

llvm-svn: 247237
2015-09-10 01:55:44 +00:00
Ahmed Bougacha 05541459fa [AArch64] Match FI+offset in STNP addressing mode.
First, we need to teach isFrameOffsetLegal about STNP.
It already knew about the STP/LDP variants, but those were probably
never exercised, because it's only the load/store optimizer that
generates STP/LDP, and the only user of the method is frame lowering,
which runs earlier.
The STP/LDP cases were wrong: they didn't take into account the fact
that they return two results, not one, so the immediate offset will be
the 4th operand, not the 3rd.

Follow-up to r247234.

llvm-svn: 247236
2015-09-10 01:54:43 +00:00
Davide Italiano ddedd7255a [MC] Convert all the remaining tests from macho-dump to llvm-readobj.
This sort-of deprecates macho-dump. It may take still a little while
to garbage collect it, but at least there's no real usage of it in
the tree anymore. New tests should always rely on llvm-readobj or
llvm-objdump.

llvm-svn: 247235
2015-09-10 01:50:00 +00:00
Ahmed Bougacha c0ac38d584 [AArch64] Match base+offset in STNP addressing mode.
Followup to r247231.

llvm-svn: 247234
2015-09-10 01:48:29 +00:00
Mehdi Amini 8d4611648f Makes EmitRecord() accepting ArrayRef and raw array (NFC)
After r247186, a vector is no longer needed as the push_front for
the code is removed.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247232
2015-09-10 01:45:55 +00:00
Ahmed Bougacha b8886b517d [AArch64] Support selecting STNP.
We could go through the load/store optimizer and match STNP where
we would have matched a nontemporal-annotated STP, but that's not
reliable enough, as an opportunistic optimization.
Insetad, we can guarantee emitting STNP, by matching them at ISel.
Since there are no single-input nontemporal stores, we have to
resort to some high-bits-extracting trickery to generate an STNP
from a plain store.

Also, we need to support another, LDP/STP-specific addressing mode,
base + signed scaled 7-bit immediate offset.
For now, only match the base. Let's make it smart separately.

Part of PR24086.

llvm-svn: 247231
2015-09-10 01:42:28 +00:00
Matt Arsenault 80f766a032 AMDGPU/SI: Fix more cases of losing exec operands
llvm-svn: 247230
2015-09-10 01:23:28 +00:00
Matt Arsenault ad46e0c1ab AMDGPU/SI: Fix creating v_mov_b32s without exec uses
This will be caught by existing tests with a
verifier check to be added in a future commit.

llvm-svn: 247229
2015-09-10 01:06:06 +00:00
Hans Wennborg d2799a963f Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
This caused build breakges, e.g.
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926

llvm-svn: 247226
2015-09-10 00:57:26 +00:00
Ahmed Bougacha 37bffd83f0 [CodeGen] Make x86 nontemporal store patfrags generic. NFC.
To be used by other targets.

llvm-svn: 247225
2015-09-10 00:53:15 +00:00
Philip Reames 953817b65d [RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC]
llvm-svn: 247223
2015-09-10 00:44:10 +00:00
Philip Reames b4e55f3923 [RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC]
The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point interation.  That's not true.  The tighter form of the assert is useful documentation.

llvm-svn: 247221
2015-09-10 00:32:56 +00:00
Philip Reames c8ded462c4 [RewriteStatepointsForGC] One last bit of naming [NFCI]
llvm-svn: 247220
2015-09-10 00:27:50 +00:00
Reid Kleckner 7878391208 [WinEH] Add codegen support for cleanuppad and cleanupret
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.

This small example now compiles and executes successfully on win32:
  extern "C" int printf(const char *, ...) noexcept;
  struct Dtor {
    ~Dtor() { printf("~Dtor\n"); }
  };
  void has_cleanup() {
    Dtor o;
    throw 42;
  }
  int main() {
    try {
      has_cleanup();
    } catch (int) {
      printf("caught it\n");
    }
  }

Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.

llvm-svn: 247219
2015-09-10 00:25:23 +00:00
Philip Reames 34d7a7493d [RewriteStatepointsForGC] Further style/naming fixup [NFCI]
llvm-svn: 247217
2015-09-10 00:22:49 +00:00
Hans Wennborg 6fa09455ed Fix Clang-tidy misc-use-override warnings, other minor fixes
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D12740

llvm-svn: 247216
2015-09-10 00:12:56 +00:00
Mehdi Amini c7aa5ca8a8 Bitcode Writer: EmitRecordWith* takes an ArrayRef instead of a SmallVector (NFC)
This reapply commit r247178 after post-commit review from D.Blaikie
in a way that makes it compatible with the existing API.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247215
2015-09-10 00:05:09 +00:00
Mehdi Amini defa546551 Add makeArrayRef() overload for ArrayRef input (no-op/identity) NFC
The purpose is to allow templated wrapper to work with either
ArrayRef or any convertible operation:

template<typename Container>
void wrapper(const Container &Arr) {
  impl(makeArrayRef(Arr));
}

with Container being a std::vector, a SmallVector, or an ArrayRef.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247214
2015-09-10 00:05:04 +00:00
Philip Reames 7540e3a45d [RewriteStatepointsForGC] More naming cleanup [NFCI]
llvm-svn: 247213
2015-09-10 00:01:53 +00:00
Philip Reames ece70b8042 [RewriteStatepointsForGC] Code cleanup [NFC]
Factor out common code related to naming values, fix a small style issue.  More to follow in separate changes.

llvm-svn: 247211
2015-09-09 23:57:18 +00:00
Philip Reames 6628713f4f [RewriteStatepointsForGC] Extend base pointer inference to handle insertelement
This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done.

Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time.

Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered *all* insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well.

Differential Revision: http://reviews.llvm.org/D12583

llvm-svn: 247210
2015-09-09 23:40:12 +00:00
Philip Reames 15d5563cea [RewriteStatepointsForGC] Make base pointer inference deterministic
Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem.

Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't mater provided it's deterministic for a given input.

(Q: It is safe to rely on a deterministic order of operands right?)

Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :)

Differential Revision: http://reviews.llvm.org/D12640

llvm-svn: 247208
2015-09-09 23:26:08 +00:00
Peter Collingbourne 1cbc91eccf LowerBitSets: Fix non-determinism bug.
Visit disjoint sets in a deterministic order based on the maximum BitSetNM
index, otherwise the order in which we visit them will depend on pointer
comparisons. This was being exposed by MSan.

llvm-svn: 247201
2015-09-09 22:30:32 +00:00
Reid Kleckner 94b704c469 [SEH] Emit 32-bit SEH tables for the new EH IR
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.

The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.

llvm-svn: 247192
2015-09-09 21:10:03 +00:00
Dan Gohman 5e0668426c [WebAssembly] Update target datalayout strings.
llvm-svn: 247187
2015-09-09 20:54:31 +00:00
Teresa Johnson 0f251a1cc6 Change EmitRecordWithAbbrevImpl to take Optional record code. NFC.
This change enables EmitRecord to pass the supplied record Code to
EmitRecordWithAbbrevImpl, rather than insert it into the Vals array.
It is an enabler for changing EmitRecord to take an ArrayRef<uintty> instead
of a SmallVectorImpl<uintty>&

Patch suggested by Duncan P. N. Exon Smith, modified by myself a bit to get
correct assertion checking.

llvm-svn: 247186
2015-09-09 20:53:31 +00:00
Piotr Padlewski 0dde00d239 ScalarEvolution assume hanging bugfix
http://reviews.llvm.org/D12719

llvm-svn: 247184
2015-09-09 20:47:30 +00:00
Mehdi Amini c9a85abc6c Revert "Bitcode Writer: EmitRecordWith* takes an ArrayRef instead of a SmallVector (NFC)"
This reverts commit r247178.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247182
2015-09-09 20:35:15 +00:00
David Majnemer d34dbf07bd Revert trunc(lshr (sext A), Cst) to ashr A, Cst
This reverts commit r246997, it introduced a regression (PR24763).

llvm-svn: 247180
2015-09-09 20:20:08 +00:00
Mehdi Amini 7d2bf53ed1 Bitcode Writer: EmitRecordWith* takes an ArrayRef instead of a SmallVector (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247178
2015-09-09 20:08:39 +00:00
Renato Golin db7ea86bf4 Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding."
This reverts commit r247149, as it was breaking numerous buildbots of varied architectures.

llvm-svn: 247177
2015-09-09 19:44:40 +00:00
Sanjay Patel 66dcafc3d6 allow unpredictable metadata on switch statements
llvm-svn: 247174
2015-09-09 18:38:30 +00:00
Matthias Braun d9da162789 Save LaneMask with livein registers
With subregister liveness enabled we can detect the case where only
parts of a register are live in, this is expressed as a 32bit lanemask.
The current code only keeps registers in the live-in list and therefore
enumerated all subregisters affected by the lanemask. This turned out to
be too conservative as the subregister may also cover additional parts
of the lanemask which are not live. Expressing a given lanemask by
enumerating a minimum set of subregisters is computationally expensive
so the best solution is to simply change the live-in list to store the
lanemasks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets

Differential Revision: http://reviews.llvm.org/D12442

llvm-svn: 247171
2015-09-09 18:08:03 +00:00
Matthias Braun cc58005885 VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC
Now that we have an explicit iterator over the idx2MBBMap in SlotIndices
we can use the fact that segments and the idx2MBBMap is sorted by
SlotIndex position so can advance both simultaneously instead of
starting from the beginning for each segment.

This complicates the code for the subregister case somewhat but should
be more efficient and has the advantage that we get the final lanemask
for each block immediately which will be important for a subsequent
change.

Removes the now unused SlotIndexes::findMBBLiveIns function.

Differential Revision: http://reviews.llvm.org/D12443

llvm-svn: 247170
2015-09-09 18:07:54 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Matthias Braun 80595460d8 MachineVerifier: Check that SlotIndex MBBIndexList is sorted.
This introduces a check that the MBBIndexList is sorted as proposed in
http://reviews.llvm.org/D12443 but split up into a separate commit.

llvm-svn: 247166
2015-09-09 17:49:46 +00:00
Matt Arsenault ef67d76869 AMDGPU: Extract full 64-bit subregister and use subregs
Instead of extracting both 32-bit components from the 128-bit
register. This produces fewer copies and is easier for
the copy peephole optimizer to understand and see the actual uses
as extracts from a reg_sequence.

This avoids needing to handle subregister composing in the
PeepholeOptimizer's ValueTracker for this case.

llvm-svn: 247162
2015-09-09 17:03:29 +00:00
Matt Arsenault b5541fb098 AMDGPU: Remove unused multiclass argument
llvm-svn: 247161
2015-09-09 17:03:18 +00:00
Tom Stellard 5268c17e52 llvm-config: Add --build-system option
Summary:
This can be used for distinguishing between cmake and autoconf builds.
Users may need this in order to handle inconsistencies between the
outputs of the two build systems.

Reviewers: echristo, chandlerc, beanz

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11838

llvm-svn: 247159
2015-09-09 16:39:30 +00:00
Dan Gohman f71abef701 [WebAssembly] Implement calls with void return types.
llvm-svn: 247158
2015-09-09 16:13:47 +00:00
Tom Stellard 9a197676b1 AMDGPU/SI: Fold operands through REG_SEQUENCE instructions
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12256

llvm-svn: 247157
2015-09-09 15:43:26 +00:00
Silviu Baranga a3e27edb5d [CostModel][AArch64] Remove amortization factor for some of the vector select instructions
Summary:
We are not scalarizing the wide selects in codegen for i16 and i32 and
therefore we can remove the amortization factor. We still have issues
with i64 vectors in codegen though.

Reviewers: mcrosier

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12724

llvm-svn: 247156
2015-09-09 15:35:02 +00:00
Sanjay Patel 6eccf487c9 don't repeat function names in comments; NFC
llvm-svn: 247154
2015-09-09 15:24:36 +00:00
Dan Gohman 1ce7ba5fe0 [WebAssembly] Tidy up some unneeded newline characters.
llvm-svn: 247152
2015-09-09 15:13:36 +00:00
Joseph Tremoulet e5e75afe8f [CMake] Flag recursive cmake invocations for cross-compile
Summary:
Cross-compilation uses recursive cmake invocations to build native host
tools.  These recursive invocations only forward a fixed set of
variables/options, since the native environment is generally the default.
This change adds -DLLVM_TARGET_IS_CROSSCOMPILE_HOST=TRUE to the recursive
cmake invocations, so that cmake files can distinguish these recursive
invocations from top-level ones, which can explain why expected options
are unset.

LLILC will use this to avoid trying to generate its build rules in the
crosscompile native host target (where it is not needed), which would fail
if attempted because LLILC requires a cmake variable passed on the command
line, which is not forwarded in the recursive invocation.

Reviewers: rnk, beanz

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12679

llvm-svn: 247151
2015-09-09 14:57:06 +00:00
Sanjay Patel e283441836 function names start with a lower case letter; NFC
llvm-svn: 247150
2015-09-09 14:54:29 +00:00
Igor Breger ac29a82921 AVX512: Implemented encoding and intrinsics for
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11802

llvm-svn: 247149
2015-09-09 14:35:09 +00:00
Sanjay Patel 2fbab9d893 don't repeat function names in comments; NFC
llvm-svn: 247148
2015-09-09 14:34:26 +00:00
Zoran Jovanovic 6b28f09d67 [mips][microMIPS] Implement ADDU16, AND16, ANDI16, NOT16, OR16, SLL16 and SRL16 instructions
Differential Revision: http://reviews.llvm.org/D11178

llvm-svn: 247146
2015-09-09 13:55:45 +00:00
Alex Lorenz b9a68dbcae Fix PR 24633 - Handle undef values when parsing standalone constants.
llvm-svn: 247145
2015-09-09 13:44:33 +00:00
James Molloy 520838977b Rename ExitCount to BackedgeTakenCount, because that's what it is.
We called a variable ExitCount, stored the backedge count in it, then redefined it to be the exit count again.

llvm-svn: 247140
2015-09-09 12:51:10 +00:00
James Molloy 89eccee4db Delay predication of stores until near the end of vector code generation
Predicating stores requires creating extra blocks. It's much cleaner if we do this in one pass instead of mutating the CFG while writing vector instructions.

Besides which we can make use of helper functions to update domtree for us, reducing the work we need to do.

llvm-svn: 247139
2015-09-09 12:51:06 +00:00
Alexandros Lamprineas 712099ccfd LLVM does not distinguish Cortex-M4 from Cortex-M4F neither Cortex-R5 from R5F.
Removed "cortex-r5f" and "cortex-m4f" from Target Parser, sinced they are
unknown cpu names for llvm and clang. Also updated default FPUs for R5 and M4
accordingly.

Differential Revision: http://reviews.llvm.org/D12692

Change-Id: Ib81c7216521a361d8ee1296e4b6a2aa00bd479c5
llvm-svn: 247136
2015-09-09 11:20:48 +00:00
Daniel Sanders 2038747fce Fix vector splitting for extract_vector_elt and vector elements of <8-bits.
Summary:
One of the vector splitting paths for extract_vector_elt tries to lower:
    define i1 @via_stack_bug(i8 signext %idx) {
      %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx
      ret i1 %1
    }
to:
    define i1 @via_stack_bug(i8 signext %idx) {
      %base = alloca <2 x i1>
      store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base
      %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx
      %3 = load i1, i1* %2
      ret i1 %3
    }
However, the elements of <2 x i1> are not byte-addressible. The result of this
is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies
to '%base + %idx * 0', and then simply '%base' causing all values of %idx to
extract element zero.

This commit fixes this by promoting the vector elements of <8-bits to i8 before
splitting the vector.

This fixes a number of test failures in pocl.

Reviewers: pekka.jaaskelainen

Subscribers: pekka.jaaskelainen, llvm-commits

Differential Revision: http://reviews.llvm.org/D12591

llvm-svn: 247128
2015-09-09 09:53:20 +00:00
Chandler Carruth 1688a772fc Fix a typo I spotted when hacking on SROA. Somewhat alarming that
nothing broke.

llvm-svn: 247127
2015-09-09 09:46:16 +00:00
Zoran Jovanovic d9790793d6 [mips][microMIPS] Implement CACHEE and PREFE instructions
Differential Revision: http://reviews.llvm.org/D11628

llvm-svn: 247125
2015-09-09 09:10:46 +00:00
Matt Arsenault d768737454 AMDGPU: Fix not encoding src2 of VOP3b instructions
Broken by r247074. Should include an assembler test,
but the assembler is currently broken for VOP3b apparently.

llvm-svn: 247123
2015-09-09 08:39:49 +00:00
Sanjoy Das da0d79e0a0 [IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations.
IRCE was just using INITIALIZE_PASS(), which is incorrect.

llvm-svn: 247122
2015-09-09 03:47:18 +00:00
Lang Hames 856e4767ff [RuntimeDyld] Add support for MachO x86_64 SUBTRACTOR relocation.
llvm-svn: 247119
2015-09-09 03:14:29 +00:00
Dan Gohman e590b33bf8 [WebAssembly] Fix lowering of calls with more than one argument.
llvm-svn: 247118
2015-09-09 01:52:45 +00:00
Matt Arsenault acd68b58ae SelectionDAG: Support Expand of f16 extloads
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.

This moves a hack out of SI's argument lowering and
is covered by existing tests.

llvm-svn: 247113
2015-09-09 01:12:27 +00:00
Dan Gohman 4f52e00ecb [WebAssembly] Implement WebAssemblyInstrInfo::copyPhysReg
llvm-svn: 247110
2015-09-09 00:52:47 +00:00
Matt Arsenault 3099156261 Fix typos / grammar
llvm-svn: 247109
2015-09-09 00:38:33 +00:00
Duncan P. N. Exon Smith 78b66ecd70 Revert "Bitcode: ArrayRef-ize EmitRecordWithAbbrev(), NFC"
This reverts commit r247107.  Turns out clang calls these functions
directly, and `ArrayRef<T>` doesn't have a working implicit conversion
from `SmallVector<T>`.

http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_build/14247

llvm-svn: 247108
2015-09-09 00:37:52 +00:00
Duncan P. N. Exon Smith 98b3cd9280 Bitcode: ArrayRef-ize EmitRecordWithAbbrev(), NFC
Change `EmitRecordWithAbbrev()` and friends to take an `ArrayRef<T>`
instead of requiring a `SmallVectorImpl<T>`.  No functionality change
intended.

llvm-svn: 247107
2015-09-09 00:34:25 +00:00
Davide Italiano 9a429b766f [llvm-readobj] MachO -- dump LinkerOptions load command.
Example output:

Linker Options {
  Size: 32
  Count: 2
  Strings [
    Value: -framework
    Value: Cocoa
  ]
}

There were only two tests using this -- so I converted them as part of
this commit rather than separately.

Differential Revision:	 http://reviews.llvm.org/D12702

llvm-svn: 247106
2015-09-09 00:21:18 +00:00
Reid Kleckner 51189f0a1d [WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code
Typically these are catchpads, which hold data used to decide whether to
catch the exception or continue unwinding. We also shouldn't create MBBs
for catchendpads, cleanupendpads, or terminatepads, since no real code
can live in them.

This fixes a problem where MI passes (like the register allocator) would
try to put code into catchpad blocks, which are not executed by the
runtime. In the new world, blocks ending in invokes now have many
possible successors.

llvm-svn: 247102
2015-09-08 23:28:38 +00:00
Peter Collingbourne 8d24ae9441 Re-apply r247080 with order of evaluation fix.
llvm-svn: 247095
2015-09-08 22:49:35 +00:00
Reid Kleckner df1295173f [WinEH] Emit prologues and epilogues for funclets
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent funciton via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.

64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12546

llvm-svn: 247092
2015-09-08 22:44:41 +00:00
Peter Collingbourne 07f3af2e82 Revert r247080, "LowerBitSets: Extend pass to support functions as bitset
members." as it causes test failures on a number of bots.

llvm-svn: 247088
2015-09-08 22:33:23 +00:00
Vedant Kumar 9ebd49a4cf [Bitcode] Add compatibility tests for new instructions
Adds basic compatibility tests for the following instructions:

  catchpad, catchendpad, cleanuppad, cleanupendpad, terminatepad,
  cleanupret, catchret

llvm-svn: 247087
2015-09-08 22:33:23 +00:00
Vedant Kumar ee6110cd39 [docs] Fix typo in catchret example
An example usage of catchret omitted the "to" in "to label" in
ExceptionHandling.rst.

llvm-svn: 247086
2015-09-08 22:28:38 +00:00
Eric Christopher 71f6e2f568 Fix the PPC CTR Loop pass to look for calls to the intrinsics that
read CTR and count them as reading the CTR.

llvm-svn: 247083
2015-09-08 22:14:58 +00:00
Peter Collingbourne c634ed0b1a LowerBitSets: Extend pass to support functions as bitset members.
This change extends the bitset lowering pass to support bitsets that may
contain either functions or global variables. A function bitset is lowered to
a jump table that is laid out before one of the functions in the bitset.

Also add support for non-string bitset identifier names. This allows for
distinct metadata nodes to stand in for names with internal linkage,
as done in D11857.

Differential Revision: http://reviews.llvm.org/D11856

llvm-svn: 247080
2015-09-08 21:57:45 +00:00
Ivan Krasin a610cb5ba0 [libFuzzer]Add a test for defeating a hash sum.
Summary:
Add a test for a data followed by 4-byte hash value.
I use a slightly modified Jenkins hash function,
as described in https://en.wikipedia.org/wiki/Jenkins_hash_function

The modification is to ensure that hash(zeros) != 0.

Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12648

llvm-svn: 247076
2015-09-08 21:22:52 +00:00
Matt Arsenault 86d336e91b AMDGPU/SI: Fix input vcc operand for VOP2b instructions
Adds vcc to output string input for e32. Allows option
of using e64 encoding with assembler.

Also fixes these instructions not implicitly reading exec.

llvm-svn: 247074
2015-09-08 21:15:00 +00:00
Artem Belevich 0127d80986 [NVPTX] Added run NVVMReflect pass to NVPTX back-end.
The pass is needed to remove __nvvm_reflect calls when we link in
libdevice bitcode that comes with CUDA.

Differential Revision: http://reviews.llvm.org/D11663

llvm-svn: 247072
2015-09-08 21:04:55 +00:00
Derek Schuff 45c832c5d8 Fix comments and RUN line in x86-64 stdarg test leftover from last commit
From http://reviews.llvm.org/D12346

llvm-svn: 247070
2015-09-08 20:58:41 +00:00
Derek Schuff eef533f422 x32. Fixes a bug in how struct va_list is initialized in x32
Summary: This patch modifies X86TargetLowering::LowerVASTART so that
struct va_list is initialized with 32 bit pointers in x32. It also
includes tests that call @llvm.va_start() for x32.

Patch by João Porto

Subscribers: llvm-commits, hjl.tools
Differential Revision: http://reviews.llvm.org/D12346

llvm-svn: 247069
2015-09-08 20:51:31 +00:00
Kostya Serebryany 4b82de2e47 [libFuzzer] remove a piece of stale code
llvm-svn: 247067
2015-09-08 20:40:10 +00:00
Kostya Serebryany 9cdea94f66 [libFuzzer] be more robust when dealing with files on disk (e.g. don't crash if a file was there but disappeared)
llvm-svn: 247066
2015-09-08 20:36:33 +00:00
Dan Gohman e32c57443f [WebAssembly] Support running without a register allocator in the default CodeGen passes
This allows backends which don't use a traditional register allocator,
but do need PHI lowering and other passes, to use the default
TargetPassConfig::addFastRegAlloc and
TargetPassConfig::addOptimizedRegAlloc implementations.

Differential Revision: http://reviews.llvm.org/D12691

llvm-svn: 247065
2015-09-08 20:36:33 +00:00
Matt Arsenault 90e0340850 Add const overload of findRegisterUseOperand
llvm-svn: 247063
2015-09-08 20:21:29 +00:00
Vedant Kumar 9e1998e198 [docs] Update documentation for the landingpad instruction
llvm-svn: 247062
2015-09-08 20:16:35 +00:00
Sanjay Patel b54e62fe17 refactor matches for De Morgan's Laws; NFCI
llvm-svn: 247061
2015-09-08 20:14:13 +00:00
Matt Arsenault 8ac35cd031 AMDGPU: Mark s_barrier as a high latency instruction
These were marked as WriteSALU, which is low latency.
I'm guessing at the value to use, but it should probably
be considered the highest latency instruction.

I'm not sure this has any actual effect since hasSideEffects
probably is preventing any moving of these.

llvm-svn: 247060
2015-09-08 19:54:32 +00:00
Matt Arsenault 8fb810a1d2 AMDGPU: Fix s_barrier flags
This should be convergent. This is not a
barrier in the isBarrier sense, nor
hasCtrlDep.

llvm-svn: 247059
2015-09-08 19:54:25 +00:00
Derek Schuff ee4e947e23 x32. Fixes a bug in i8mem_NOREX declaration.
The old implementation assumed LP64 which is broken for x32.  Specifically, the
MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit
physreg copy instruction' error message to be reported.

This patch also enable the h-register*ll tests for x32.

Differential Revision: http://reviews.llvm.org/D12336

Patch by João Porto

llvm-svn: 247058
2015-09-08 19:47:15 +00:00
Matt Arsenault 966a94f861 AMDGPU: Handle sub of constant for DS offset folding
sub C, x - > add (sub 0, x), C for DS offsets.

This is mostly to fix regressions that show up when
SeparateConstOffsetFromGEP is enabled.

llvm-svn: 247054
2015-09-08 19:34:22 +00:00
Diego Novillo f9aa39b0cf Fix PR 24723 - Handle 0-mass backedges in irreducible loops
This corner case happens when we have an irreducible SCC that is
deeply nested.  As we work down the tree, the backedge masses start
getting smaller and smaller until we reach one that is down to 0.

Since we distribute the incoming mass using the backedge masses as
weight, the distributor does not allow zero weights.  So, we simply
ignore them (which will just use the weights of the non-zero nodes).

llvm-svn: 247050
2015-09-08 19:22:17 +00:00
Davide Italiano cb2da7166a [MC/ELF] Accept zero for .align directive
.align directive refuses alignment 0 -- a comment in the code hints this is
done for GNU as compatibility, but it seems GNU as accepts .align 0
(and silently rounds up alignment to 1).

Differential Revision:	 http://reviews.llvm.org/D12682

llvm-svn: 247048
2015-09-08 18:59:47 +00:00
David Blaikie 12dd3c4ebb Fix CPP Backend for GEP API changes for opaque pointer types
Based on a patch by Jerome Witmann.

llvm-svn: 247047
2015-09-08 18:42:29 +00:00
Benjamin Kramer c3c183554b Merge or combine tests and convert to FileCheck.
- Move tests only exercising instsimplify to instsimplify's apint-or.ll
- Actually test the CHECK lines in instsimplify's apint-or.ll
- Merge the remaining tests in apint-or1.ll and apint-or2.ll, use FileCheck

llvm-svn: 247045
2015-09-08 18:36:56 +00:00
Evgeniy Stepanov 80d5569dba Fix isDiscardableIfUnused to include available_externally linkage.
AvailableExternally functions are discardable.

llvm-svn: 247044
2015-09-08 18:25:20 +00:00
Sanjay Patel 1854927556 remove function names from comments; NFC
llvm-svn: 247043
2015-09-08 18:24:36 +00:00
Andrew Kaylor e2ea93c6c0 Fix for bz24500: Avoid non-deterministic code generation triggered by the x86 call frame optimization
Patch by Dave Kreitzer

Differential Revision: http://reviews.llvm.org/D12620

llvm-svn: 247042
2015-09-08 18:18:46 +00:00
Sanjay Patel 21a145c341 add tests for De Morgan instcombines based on PR22723
llvm-svn: 247040
2015-09-08 18:13:03 +00:00
Sanjay Patel 35f673ef08 fix typos, remove noise; NFCI
llvm-svn: 247035
2015-09-08 17:58:22 +00:00
Kostya Serebryany b06fae5ede [libFuzzer] better documentatio for -save_minimized_corpus=1
llvm-svn: 247033
2015-09-08 17:43:51 +00:00
Vedant Kumar 52cd4eecac [Bitcode] Add compatibility test for llvm 3.7.0
This patch adds llvm-3.7 IR and generated bitcode for our compatibility
test (in accordance with the developer policy).

llvm-svn: 247031
2015-09-08 17:39:21 +00:00
Kostya Serebryany 468ed78434 [libFuzzer] remove -iterations as redundant (there is also -num_runs)
llvm-svn: 247030
2015-09-08 17:30:35 +00:00
JF Bastien 749ed88aa5 WebAssembly: NFC rename shr/sar
Renamed from: https://github.com/WebAssembly/design/pull/332

llvm-svn: 247028
2015-09-08 17:21:21 +00:00
Kostya Serebryany 25425ad920 [libFuzzer] add one more mutator: Mutate_ChangeASCIIInteger
llvm-svn: 247027
2015-09-08 17:19:31 +00:00
Jun Bum Lim 4d3c5986f2 Remove white space (test commit)
llvm-svn: 247021
2015-09-08 16:11:22 +00:00
Zoran Jovanovic 2da1437d62 [mips][microMIPS] Implement LLE, LUI, LW and LWE instructions
Differential Revision: http://reviews.llvm.org/D1179

llvm-svn: 247017
2015-09-08 15:02:50 +00:00
Dan Gohman d4a12d25ff [WebAssembly] Temporarily disable this test, as it depends on additional patches that aren't yet checked in.
llvm-svn: 247011
2015-09-08 13:21:12 +00:00
Igor Breger a54a1a84dd AVX512: kunpck encoding implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D12061

llvm-svn: 247010
2015-09-08 13:10:00 +00:00
Dan Gohman 25d2a0dda4 [WebAssembly] Enable SSA lowering and other pre-regalloc passes
llvm-svn: 247008
2015-09-08 12:39:25 +00:00
Elena Demikhovsky ddf715ef77 Removed an old comment, NFC
llvm-svn: 247006
2015-09-08 12:22:22 +00:00
Alex Lorenz 37e026260b MIRLangRef: Add documentation for the subregister indices.
llvm-svn: 247005
2015-09-08 11:39:47 +00:00
Alex Lorenz d4990ebd02 MIRLangRef: Add documentation for the global value machine operands.
llvm-svn: 247004
2015-09-08 11:38:16 +00:00
Zoran Jovanovic 9eaa30d2bf [mips][microMIPS] Implement SB, SBE, SCE, SH and SHE instructions
Differential Revision: http://reviews.llvm.org/D11801

llvm-svn: 246999
2015-09-08 10:18:38 +00:00
Jakub Kuderski 7cd4810021 There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.

Differential Revision: http://reviews.llvm.org/D12520

llvm-svn: 246997
2015-09-08 10:03:17 +00:00
Daniel Sanders 808dfb8ba7 [mips] Reserve address spaces 1-255 for software use.
Summary: And define them to have noop casts with address spaces 0-255.

Reviewers: pekka.jaaskelainen

Subscribers: pekka.jaaskelainen, llvm-commits

Differential Revision: http://reviews.llvm.org/D12678

llvm-svn: 246990
2015-09-08 09:07:03 +00:00
Zoran Jovanovic 68be5f21a9 [mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit LBU16, LHU16, LW16, LWGP and LWSP instructions
Differential Revision: http://reviews.llvm.org/D10956

llvm-svn: 246987
2015-09-08 08:25:34 +00:00
NAKAMURA Takumi bb7483dd77 [CMake][CMP0051] Avoid for user of objlib to use llvm_update_compile_flags().
$<TARGET_OBJECTS> shouldn't require compile flags. Flags are set in obj.${name}.

llvm-svn: 246984
2015-09-08 07:42:06 +00:00
Elena Demikhovsky dec0f0885f compilation issue, NFC
llvm-svn: 246983
2015-09-08 07:34:06 +00:00
Elena Demikhovsky d240d778b3 fixed compilation issue, NFC.
llvm-svn: 246982
2015-09-08 07:10:08 +00:00
Elena Demikhovsky e88038f235 AVX-512: Lowering for 512-bit vector shuffles.
Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer.

Differential Revision: http://reviews.llvm.org/D10683

llvm-svn: 246981
2015-09-08 06:38:21 +00:00
Davide Italiano bb9a6ccfa8 [llvm-readobj] Shrink code a little bit. No functional change.
llvm-svn: 246976
2015-09-07 20:47:03 +00:00
Sanjay Patel de28573e49 add missing regression tests for De Morgan's Law transform in InstCombine
llvm-svn: 246973
2015-09-07 19:00:38 +00:00
Zoran Jovanovic 7b85682541 [mips][microMIPS] Implement ABS.fmt, CEIL.L.fmt, CEIL.W.fmt, FLOOR.L.fmt, FLOOR.W.fmt, TRUNC.L.fmt, TRUNC.W.fmt, RSQRT.fmt and SQRT.fmt instructions
Differential Revision: http://reviews.llvm.org/D11674

llvm-svn: 246968
2015-09-07 13:01:04 +00:00
Zoran Jovanovic ada7091812 [mips][microMIPS] Implement BC16, BEQZC16 and BNEZC16 instructions
Differential Revision: http://reviews.llvm.org/D11181

llvm-svn: 246963
2015-09-07 11:56:37 +00:00
John Brawn d8b405abf7 [ARM] Get rid of SelectT2ShifterOperandReg, NFC
SelectT2ShifterOperandReg has identical behaviour to SelectImmShifterOperand,
so get rid of it and use SelectImmShifterOperand instead.

Differential Revision: http://reviews.llvm.org/D12195

llvm-svn: 246962
2015-09-07 11:45:18 +00:00
Zoran Jovanovic 14f308e44f [mips][microMIPS] Implement CVT.D.fmt, CVT.L.fmt, CVT.S.fmt, CVT.W.fmt, MAX.fmt, MIN.fmt, MAXA.fmt, MINA.fmt and CMP.condn.fmt instructions
Differential Revision: http://reviews.llvm.org/D12141

llvm-svn: 246960
2015-09-07 10:31:31 +00:00
David Majnemer 907abf799d CODE_OWNERS.TXT is supposed to be sorted by surname
llvm-svn: 246954
2015-09-07 00:41:40 +00:00
NAKAMURA Takumi 0d72539d5a Prune utf8 chars in comments.
llvm-svn: 246953
2015-09-07 00:26:54 +00:00
Simon Pilgrim 15fc134865 [X86][MMX] Added missing stack folding tests for MMX/SSSE3 instructions
llvm-svn: 246949
2015-09-06 17:50:15 +00:00
Simon Pilgrim b38c09d7ff [X86][AVX512] Added 512-bit vector shift tests.
Only works for avx512f (dq) targets so far - need to add avx512bw tests once char/short shifts are supported.

llvm-svn: 246943
2015-09-06 13:36:32 +00:00
David Majnemer 135ca40a7d [InstCombine] Don't divide by zero when evaluating a potential transform
Trivial multiplication by zero may survive the worklist.  We tried to
reassociate the multiplication with a division instruction, causing us
to divide by zero; bail out instead.

This fixes PR24726.

llvm-svn: 246939
2015-09-06 06:49:59 +00:00
Hal Finkel 10aac5fd0e [SelectionDAG] Swap commutative binops before constant-based folding
In searching for a fix for the underlying code-quality bug highlighted by
r246937 (that SDAG simplification can lead to us generating an ISD::OR node
with a constant zero LHS), I ran across this:

We generically canonicalize commutative binary-operation nodes in SDAG getNode
so that, if only one operand is a constant, it will be on the RHS.  However, we
were doing this only after a bunch of constant-based simplification checks that
all assume this canonical form (that any constant will be on the RHS). Moving
the operand-swapping canonicalization prior to these checks seems like the
right thing to do (and, as it turns out, causes SDAG to completely fold away the
computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine
would do).

llvm-svn: 246938
2015-09-06 05:42:13 +00:00
Hal Finkel ccf9259c00 [PowerPC] Don't commute trivial rlwimi instructions
To commute a trivial rlwimi instructions (meaning one with a full mask and zero
shift), we'd need to ability to form an all-zero mask (instead of an all-one
mask) using rlwimi. We can't represent this, however, and we'll miscompile code
if we try.

The code quality problem that this highlights (that SDAG simplification can
lead to us generating an ISD::OR node with a constant zero LHS) will be fixed
as a follow-up.

Fixes PR24719.

llvm-svn: 246937
2015-09-06 04:17:30 +00:00
Craig Topper 1c8fbd24f3 [TableGen] Use make_unique. NFC.
llvm-svn: 246936
2015-09-06 03:44:50 +00:00
Andrew Wilkins df17a9c5a9 [bindings] Update Go bindings to DIBuilder
Summary:
Update the Go bindings to DIBuilder to match
the split of creating local variables into
auto and parameter variables.

Reviewers: pcc

Subscribers: llvm-commits, axw

Differential Revision: http://reviews.llvm.org/D11864

llvm-svn: 246935
2015-09-06 02:22:15 +00:00
David Majnemer daa24b9789 [InstCombine] Don't assume m_Mul gives back an Instruction
This fixes PR24713.

llvm-svn: 246933
2015-09-05 20:44:56 +00:00
Alexandros Lamprineas ea33e5e88e Added arch extensions and default target features in TargetParser.
Differential: http://reviews.llvm.org/D11590
llvm-svn: 246930
2015-09-05 17:05:33 +00:00
Simon Pilgrim a1ceba8ab4 [X86] Updated vector lzcnt tests. Added missing vec512 tests.
llvm-svn: 246927
2015-09-05 11:56:30 +00:00
Simon Pilgrim 5cce4381ab [X86] Updated vector tzcnt tests. Added vec512 tests.
llvm-svn: 246922
2015-09-05 10:19:07 +00:00
Simon Pilgrim ba8ae5a814 [X86] Updated vector popcnt tests. Added vec512 tests.
llvm-svn: 246921
2015-09-05 09:59:59 +00:00
Zoran Jovanovic 89ca2b982e [mips][microMIPS] Implement ADD.fmt, SUB.fmt, MOV.fmt, MUL.fmt, DIV.fmt, MADDF.fmt, MSUBF.fmt and NEG.fmt instructions
Differential Revision: http://reviews.llvm.org/D11978

llvm-svn: 246919
2015-09-05 09:25:30 +00:00
Andrew Wilkins bb6d95fc3a [cmake] rework LLVM_LINK_LLVM_DYLIB option handling
Summary:
This diff attempts to address the concerns raised in
http://reviews.llvm.org/D12488.

We introduce a new USE_SHARED option to llvm_config,
which, if set, causes the target to be linked against
libLLVM.

add_llvm_utility now uniformly disables linking against
libLLVM. These utilities are not intended for distribution,
and this keeps the option handling more centralised.

llvm-shlib is now processes before any other "tools"
subdirectories, ensuring the libLLVM target is defined
before its dependents.

One main difference from what was requested: llvm_config
does not prune LLVM_DYLIB_COMPONENTS from the components
passed into explicit_llvm_config. This is because the "all"
component does something special, adding additional
libraries (namely libLTO). Adding the component libraries
after libLLVM should not be a problem, as symbols will be
resolved in libLLVM first.

Finally, I'm not really happy with the
DISABLE_LLVM_LINK_LLVM option, but I'm not sure of a
better way to get the following:
 - link all tools and shared libraries to libLLVM if
   LLVM_LINK_LLVM_DYLIB is set
 - some way of explicitly *not* doing so for utilities
   and libLLVM itself
Suggestions for improvement here are particularly welcome.

Reviewers: beanz

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12590

llvm-svn: 246918
2015-09-05 08:27:33 +00:00
Craig Topper 02a55d701d Fix build warning.
llvm-svn: 246908
2015-09-05 04:49:44 +00:00
NAKAMURA Takumi 2f9e8c0570 WinCOFFObjectWriter.cpp: Roll back TimeDateStamp along ENABLE_TIMESTAMPS.
We want a deterministic output. GNU AS leaves it zero.

FIXME: It may be optional by its user, like llc and clang.
llvm-svn: 246905
2015-09-05 01:17:49 +00:00
Davide Italiano 591b5a5e72 [MC] Convert other MachO tests from macho-dump to llvm-readobj.
This commit accomplish two goals:
1) it's a step forward to deprecate macho-dump, now less than 40 tests
rely on it.

2) It tests all the MachO specific features introduced in llvm-readobj in
the following commits:  r246789, r246665, r246474.

While the conversion is mostly mechanical (I double-checked all the
tests output one by one, but still), a post-commit review is greatly
appreciated.

llvm-svn: 246904
2015-09-05 01:02:05 +00:00
Andrew Kaylor 2a9a6d8c38 Fix build warning
llvm-svn: 246903
2015-09-05 01:00:51 +00:00
Hal Finkel b1518d6c24 [PowerPC] Fix and(or(x, c1), c2) -> rlwimi generation
PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from
an input pattern that looks like this:

  and(or(x, c1), c2)

but the associated logic does not work if there are bits that are 1 in c1 but 0
in c2 (these are normally canonicalized away, but that can't happen if the 'or'
has other users. Make sure we abort the transformation if such bits are
discovered.

Fixes PR24704.

llvm-svn: 246900
2015-09-05 00:02:59 +00:00
Andrew Kaylor a212aba680 Fix build warning
llvm-svn: 246899
2015-09-04 23:58:32 +00:00
Evgeniy Stepanov 4dd7104549 Fix passed env var name in lit for Android tests.
The variable is actually called ANDROID_SERIAL.
This was not exercised on the bots until today.
Should fix the sanitizer-x86_64-linux failures.

llvm-svn: 246898
2015-09-04 23:52:04 +00:00
Andrew Kaylor a89baa21b8 Fixing bad test syntax.
llvm-svn: 246897
2015-09-04 23:47:34 +00:00
Andrew Kaylor 50e4e86c26 [WinEH] Teach SimplfyCFG to eliminate empty cleanup pads.
Differential Revision: http://reviews.llvm.org/D12434

llvm-svn: 246896
2015-09-04 23:39:40 +00:00
Kostya Serebryany e641dd6479 [libFuzzer] more accurate logic for traces, 80-char fix
llvm-svn: 246888
2015-09-04 22:32:25 +00:00
Davide Italiano baec437f54 [llvm-readobj] MachO: dump the correct field.
This was found while converting a test from macho-dump to llvm-readobj
and will once I commit the converted test.

llvm-svn: 246868
2015-09-04 20:43:00 +00:00
Yaron Keren 771e31964d Remove two unused includes and C++11 rangify for loops.
llvm-svn: 246865
2015-09-04 20:24:24 +00:00
Simon Pilgrim 0fccef6ac2 [X86][AVX] Test tidyup + regeneration. NFCI.
llvm-svn: 246863
2015-09-04 19:47:56 +00:00
Peter Collingbourne 07dcb58f8a Add powerpc64 to parallel.ll unsupported architecture list.
llvm-svn: 246862
2015-09-04 19:45:36 +00:00
Ben Craig dfe3d56d87 Adding full stops to comments
Also, test commit

llvm-svn: 246855
2015-09-04 15:28:13 +00:00
Chad Rosier a67b2d0117 Typo. NFC.
llvm-svn: 246851
2015-09-04 12:34:55 +00:00
Silviu Baranga 44077da1b7 Simplify testcase added in r246759. NFC
llvm-svn: 246848
2015-09-04 11:37:20 +00:00
David Majnemer 5ca46f0df1 [MC] Replace comparison with isUInt<32>.
Casting to unsigned long can cause the time to get truncated to 32-bits,
making it appear to be a valid timestamp.  Just use isUInt<32> instead.

llvm-svn: 246840
2015-09-04 07:22:36 +00:00
NAKAMURA Takumi c95358b1ea WinCOFFObjectWriter.cpp: Appease a warning in checking std::time_t. [-Wsign-compare]
llvm-svn: 246839
2015-09-04 05:19:37 +00:00
Richard Smith 55f5e657ee Fix APInt value initialization to give a zero value as any sane integer type
should, rather than giving a broken value that doesn't even zero/sign-extend
properly.

llvm-svn: 246836
2015-09-04 04:08:36 +00:00
Steven Wu 332eeca1e4 Fix the testcase in r246790
Using generic neon syntax to avoid test failure on apple platforms.

llvm-svn: 246833
2015-09-04 01:39:24 +00:00
Kostya Serebryany b2e9897644 [libFuzzer] when a single mutation fails try a few more times with other mutations before returning un-mutated data
llvm-svn: 246828
2015-09-04 00:40:29 +00:00
Kostya Serebryany 7d21166218 [libFuzzer] actually make the dictionaries work (+docs)
llvm-svn: 246825
2015-09-04 00:12:11 +00:00
Hal Finkel 4a7be23976 [PowerPC] Enable interleaved-access vectorization
This adds a basic cost model for interleaved-access vectorization (and a better
default for shuffles), and enables interleaved-access vectorization by default.
The relevant difference from the default cost model for interleaved-access
vectorization, is that on PPC, the shuffles that end up being used are *much*
cheaper than modeling the process with insert/extract pairs (which are
quite expensive, especially on older cores).

llvm-svn: 246824
2015-09-04 00:10:41 +00:00
Hal Finkel 75afa2b6b6 [PowerPC] Always use aggressive interleaving on the A2
On the A2, with an eye toward QPX unaligned-load merging, we should always use
aggressive interleaving. It is generally superior to only using concatenation
unrolling.

llvm-svn: 246819
2015-09-03 23:23:00 +00:00
Hal Finkel e6702ca0e2 [PowerPC] Try harder to find a base+offset when looking for consecutive accesses
When forming permutation-based unaligned vector loads, we need to know whether
it is valid to read ahead of the requested address by a full vector length.
Doing so is more efficient (and allows for more CSE with later loads), but
could trigger a page fault if invalid. To determine validity, we look for other
loads in the same block that access the relevant address range.

The relevant point here is that we need to do this as part of the process of
forming permutation-based vector loads, and this happens quite early in the
SDAG pipeline - specifically before many of the address calculations are fully
canonicalized. As a result, we need to try harder to recognize base+offset
address computations, because they still might appear as chain of adds
(base+offset+offset, for example). To account for this, we'll look through
chains of adds, accumulating the constant offsets.

llvm-svn: 246813
2015-09-03 22:37:44 +00:00
Sanjoy Das 88d0fdeb00 [IR] Have AttrBuilder::clear clear `TargetDepAttrs`.
Test case attached -- currently the parser smears the "foo bar" to all
of the formal arguments.

llvm-svn: 246812
2015-09-03 22:27:42 +00:00
Philip Reames 3ea158950e [RewriteStatepointsForGC] Extract common code, comment, and fix a build warning [NFC]
llvm-svn: 246810
2015-09-03 21:57:40 +00:00
Philip Reames f5b8e47651 [RewriteStatepointsForGC] Strengthen invariants around BDVs
As a first step towards a new implementation of the base pointer inference algorithm, introduce an abstraction for BDVs, strengthen the assertions around them, and rewrite the BDV relation code in terms of the abstraction which includes an explicit notion of whether the BDV is also a base. The later is motivated by the fact we had a bug where insertelement was always assumed to be a base pointer even though the BDV code knew it wasn't. The strengthened assertions in this patch would have caught that bug.

The next step will be to separate the DefiningValueMap into a BDV use list cache (entirely within findBasePointers) and a base pointer cache. Having the former will allow me to use a deterministic visit order when visiting BDVs in the inference algorithm and remove a bunch of ordering related hacks. Before actually doing the last step, I'm likely going to extend the lattice with a 'BaseN' (seen only base inputs) state so that I can kill the post process optimization step.

Phabricator Revision: http://reviews.llvm.org/D12608

llvm-svn: 246809
2015-09-03 21:34:30 +00:00
Kostya Serebryany ec2dcb1d91 [libFuzzer] refactor the mutation functions so that they are now methods of a class. NFC
llvm-svn: 246808
2015-09-03 21:24:19 +00:00
Hal Finkel f11bc761d8 [PowerPC] Include the permutation cost for unaligned vector loads
Pre-P8, when we generate code for unaligned vector loads (for Altivec and QPX
types), even when accounting for the combining that takes place for multiple
consecutive such loads, there is at least one load instructions and one
permutation for each load. Make sure the cost reported reflects the cost of the
permutes as well.

llvm-svn: 246807
2015-09-03 21:23:18 +00:00
Hal Finkel 99d95328d6 [PowerPC] Compute the MMO offset for an unaligned load with signed arithmetic
If you compute the MMO offset using unsigned arithmetic, you end up with a
large positive offset instead of a small negative one. In theory, this could
cause bad instruction-scheduling decisions later.

I noticed this by inspection from the debug output, and using that for the
regression test is the best I can do right now.

llvm-svn: 246805
2015-09-03 21:12:15 +00:00
Philip Reames 246e618e77 [RewriteStatepointsForGC] Workaround a lack of determinism in visit order
The visit order being used in the base pointer inference algorithm is currently non-deterministic.  When working on http://reviews.llvm.org/D12583, I discovered that we were relying on a peephole optimization to get deterministic ordering in one of the test cases.  

This change is intented to let me test and land http://reviews.llvm.org/D12583.  The current code will not be long lived.  I'm starting to investigate a rewrite of the algorithm which will combine the post-process step into the initial algorithm and make the visit order determistic.  Before doing that, I wanted to make sure the existing code was complete and the test were stable.  Hopefully, patches should be up for review for the new algorithm this week or early next.

llvm-svn: 246801
2015-09-03 20:24:29 +00:00
Kostya Serebryany 9838b2be87 [libFuzzer] adding a parser for AFL-style dictionaries + tests.
llvm-svn: 246800
2015-09-03 20:23:46 +00:00
Reid Kleckner df52337bfc [sancov] Disable sanitizer coverage on functions using SEH
Splitting basic blocks really messes up WinEHPrepare. We can remove this
change when SEH uses the new EH IR.

llvm-svn: 246799
2015-09-03 20:18:29 +00:00
Jonathan Roelofs 32eeb76a4f llvm.vim: 'musttail' is a keyword too
llvm-svn: 246798
2015-09-03 20:10:40 +00:00
Dan Liew 50456fb98e Try to clarify the semantics of fptrunc
* ``the value cannot fit within the destination type`` is ambiguous.
  It could mean overflow, underflow (not in the IEEE-754 sense) or a
  result that cannot be exactly represented and requires rounding or it
  could mean some combination of these. The semantics now state it means
  overflow **only**.

* Using "truncation" in the semantics is very misleading given that it
  doesn't necessarily truncate (i.e. round to zero). For example on
  x86_64 with SSE2 this is currently mapped to cvtsd2ss instruction
  who's rounding behaviour is dependent on the MXCSR register which
  is usually set to round to nearest even by default. The semantics
  now state that the rounding mode is undefined.

llvm-svn: 246792
2015-09-03 18:43:56 +00:00
Chad Rosier 6c36eff1d6 [AArch64] Improve ISel using across lane addition reduction.
In vectorized add reduction code, the final "reduce" step is sub-optimal.
This change wll combine :

ext  v1.16b, v0.16b, v0.16b, #8
add  v0.4s, v1.4s, v0.4s
dup  v1.4s, v0.s[1]
add  v0.4s, v1.4s, v0.4s

into

addv s0, v0.4s

PR21371
http://reviews.llvm.org/D12325
Patch by Jun Bum Lim <junbuml@codeaurora.org>!

llvm-svn: 246790
2015-09-03 18:13:57 +00:00
Davide Italiano 4410b22cea [llvm-readobj] Dump MachO indirect symbols.
Example output:

File: <stdin>
Format: Mach-O 32-bit i386
Arch: i386
AddressSize: 32bit
Indirect Symbols {

Number: 3
Symbols [
  Entry {
    Entry Index: 0
    Symbol Index: 0x4
  }
  Entry {
    Entry Index: 1
    Symbol Index: 0x0
  }
  Entry {
    Entry Index: 2
    Symbol Index: 0x1
  }
]
}

Differential Revision:	http://reviews.llvm.org/D12570

llvm-svn: 246789
2015-09-03 18:10:28 +00:00
Karl Schimpf 7772978ccf Allow global address space forward decls using IDs in .ll files.
Summary:
This fixes bugzilla bug 24656. Fixes the case where there is a forward
reference to a global variable using an ID (i.e. @0). It does this by
passing the address space of the initializer pointer for which the
forward referenced global is used.

llvm-svn: 246788
2015-09-03 18:06:44 +00:00