Commit Graph

25289 Commits

David Blaikie 48af9c3527 DebugInfo: Fix up some test cases to have more correct debug info metadata.
* Add CUs to the named CU node
* Add missing DW_TAG_subprogram nodes
* Add llvm::Functions to the DW_TAG_subprogram nodes

This cleans up the tests so that they don't break under a
soon-to-be-made change that is more strict about such things.

llvm-svn: 213951
2014-07-25 16:05:18 +00:00
Hal Finkel ff0bcb60c9 Convert noalias parameter attributes into noalias metadata during inlining
This functionality is currently turned off by default.

Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.

The overall process is fairly simple:
 1. Create a new unique scope domain.
 2. For each (used) noalias parameter, create a new alias scope.
 3. For each pointer, collect the underlying objects. Add a noalias scope for
    each noalias parameter from which the pointer is not derived (and which
    has not been captured prior to that point).
 4. Add an alias.scope for each noalias parameter from which the pointer
    might be derived (or which has been captured before that point).

Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.
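
For illustration, a hypothetical sketch (function names and scope numbering
are made up; the real metadata is produced by the inliner):

define void @callee(float* noalias %a, float* noalias %b) {
  %t = load float* %b
  store float %t, float* %a
  ret void
}

After inlining a call to @callee on %p and %q, the cloned accesses might
carry metadata along these lines:

%t.i = load float* %q, !alias.scope !2, !noalias !1
store float %t.i, float* %p, !alias.scope !1, !noalias !2

where !1 and !2 are scope lists for the scopes of %a and %b, created in a
fresh scope domain for this call site.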

llvm-svn: 213949
2014-07-25 15:50:08 +00:00
Hal Finkel 029cde639c Simplify and improve scoped-noalias metadata semantics
In the process of fixing the noalias parameter -> metadata conversion process
that will take place during inlining (which will be committed soon, but not
turned on by default), I have come to realize that the semantics provided by
yesterday's commit are not really what we want. Here's why:

void foo(noalias a, noalias b, noalias c, bool x) {
  q = x ? a : b;
  *c = *q;
}

Generically, we know that *c does not alias with *a and with *b (so there is an
'and' in what we know we're not), and we know that *q might be derived from *a
or from *b (so there is an 'or' in what we know that we are). So we do not want
the semantics currently, where any noalias scope matching any alias.scope
causes a NoAlias return. What we want to know is that the noalias scopes form a
superset of the alias.scope list (meaning that all the things we know we're not
is a superset of all of the things the other instruction might be).

Making that change, however, introduces a composability problem. If we inline
once, adding the noalias metadata, and then inline again, adding more, we
append new scopes onto the noalias and alias.scope lists each time. This means
that we could change what was previously a NoAlias result into a MayAlias
result because we appended an additional scope onto one of the alias.scope
lists. So, instead of giving scopes the ability to have parents (which I had
borrowed from the TBAA implementation, but which seems increasingly unlikely
to be useful in practice), I've given them domains. The subset/superset
condition now
applies within each domain independently, and we only need it to hold in one
domain. Each time we inline, we add the new scopes in a new scope domain, and
everything now composes nicely. In addition, this simplifies the
implementation.
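
For example (a stylized sketch; the node names are illustrative), after two
rounds of inlining an access might carry:

... = load float* %p, !alias.scope !{ !a1, !b1 }, !noalias !{ !a2, !b2 }

where !a1 and !a2 live in the domain created by the first inlining, and !b1
and !b2 in the domain created by the second. The superset condition is
checked within each domain independently, and NoAlias only requires it to
hold in one of them.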

llvm-svn: 213948
2014-07-25 15:50:02 +00:00
Duncan P. N. Exon Smith 6b6fdc992a IPO: Add use-list-order verifier
Add a -verify-use-list-order pass, which shuffles use-list order, writes
to bitcode, reads back, and verifies that the (shuffled) order matches.

  - The utility functions live in lib/IR/UseListOrder.cpp.

  - Moved (and renamed) the command-line option to enable writing
    use-lists, so that this pass can return early if the use-list orders
    aren't being serialized.

It's not clear that this pass is the right direction long-term (perhaps
a separate tool instead?), but short-term it's a great way to test the
use-list order prototype.  I've added an XFAIL-ed testcase that I'm
hoping to get working pretty quickly.

This is part of PR5680.

llvm-svn: 213945
2014-07-25 14:49:26 +00:00
Amara Emerson 115d2df8a4 [ARM] Emit ABI_PCS_R9_use build attribute.
Patch by Ben Foster!

Differential Revision: http://reviews.llvm.org/D4657

llvm-svn: 213944
2014-07-25 14:03:14 +00:00
NAKAMURA Takumi b4553da481 llvm/test/CodeGen/ARM/inlineasm-global.ll: Add explicit triple to appease builds targeting *-win32.
llvm-svn: 213933
2014-07-25 09:55:01 +00:00
NAKAMURA Takumi dd620a242a llvm/test/CodeGen/ARM/inlineasm-global.ll: Avoid specifying the source file on the llc command line.
It sometimes confuses FileCheck. Consider the case where the path contains 'stmib'. :)

llvm-svn: 213932
2014-07-25 09:54:49 +00:00
Akira Hatanaka 16e47ff42e [ARM] In thumb mode, emit directive ".code 16" before file-level inline
assembly instructions.

This is necessary to ensure the ARM assembler switches to Thumb mode before it
starts assembling the file-level inline assembly instructions at the beginning
of a .s file.
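
A hypothetical sketch (the module-level assembly content is made up):

module asm "add r0, r0, r1"

When targeting Thumb, the ".code 16" directive is now printed before this
file-level assembly, so the assembler parses it in Thumb mode rather than
ARM mode.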

<rdar://problem/17757232>

llvm-svn: 213924
2014-07-25 05:12:49 +00:00
Lang Hames 98c3c0f38a [X86] Add comments to clarify some non-obvious lines in the stackmap-nops.ll
testcases.

Based on code review from Philip Reames. Thanks Philip!

llvm-svn: 213923
2014-07-25 04:50:08 +00:00
Bill Schmidt c9fa5dd618 [PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*.
Because the PowerPC vmrgh* and vmrgl* instructions have a built-in
big-endian bias, it is necessary to swap their inputs in little-endian
mode when using them to implement a vector shuffle.  This was
previously missed in the vector LE implementation.

There was already logic to distinguish between unary and "normal"
vmrg* vector shuffles, so this patch extends that logic to use a third
option:  "swapped" vmrg* vector shuffles that are used for little
endian in place of the "normal" ones.
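
A stylized IR-level illustration (the shuffle mask is made up):

%m = shufflevector <8 x i16> %v1, <8 x i16> %v2,
     <8 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>

On big endian this interleaving maps directly onto a vmrg* instruction; on
little endian the same shuffle must now select the "swapped" form with the
inputs reversed.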

I've updated the vec-shuffle-le.ll test to check for the expected
register ordering on the generated instructions.

This bug was discovered when testing the LE and ELFv2 patches for
safety if they were backported to 3.4.  A different vectorization
decision was made in 3.4 than on mainline trunk, and that exposed the
problem.  I've verified this fix takes care of that issue.

llvm-svn: 213915
2014-07-25 01:55:55 +00:00
Kevin Enderby 08e1bbd645 Add an implementation for llvm-nm’s -print-file-name option (aka -o and -A).
The -print-file-name option in llvm-nm is meant to precede each symbol with
the name of the object file it came from.  While code for parsing this option
and its aliases existed, there was no code to implement it.

llvm-svn: 213906
2014-07-24 23:31:52 +00:00
David Majnemer ab131e86ce Opportunistically fix the builders
A builder complained that it couldn't find llvm-vtabledump; this is
probably because it wasn't a dependency of the 'test' target.

llvm-svn: 213905
2014-07-24 23:26:54 +00:00
David Majnemer 72ab1a5aee llvm-vtabledump: A vtable dumper
This tool's job is to dump the vtables inside object files.  It is
currently limited to MS ABI vf- and vb-tables but it will eventually
support Itanium-style v-tables as well.

Differential Revision: http://reviews.llvm.org/D4584

llvm-svn: 213903
2014-07-24 23:14:40 +00:00
Mark Heffernan 8ec1474f7f After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates.  This patch fixes
an issue where loop unrolling could also be disabled for other loops that
share the same llvm.loop metadata.
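
A stylized sketch of the metadata involved (node numbering is illustrative):

br i1 %cond, label %loop, label %exit, !llvm.loop !0

!0 = metadata !{ metadata !0, metadata !1 }
!1 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 }

After unrolling, the count hint is replaced with llvm.loop.unroll.disable --
and with this fix, only for the loop that was actually unrolled, not for
every loop referencing the same !llvm.loop node.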

llvm-svn: 213900
2014-07-24 22:36:40 +00:00
Joerg Sonnenberger b5459e6e22 Don't use 128bit functions on PPC32.
llvm-svn: 213899
2014-07-24 22:20:10 +00:00
Chandler Carruth 9f4530b95d [SDAG] Introduce a combined set to the DAG combiner which tracks nodes
which have successfully round-tripped through the combine phase, and use
this to ensure all operands to DAG nodes are visited by the combiner,
even if they are only added during the combine phase.

This is critical to have the combiner reach nodes that are *introduced*
during combining. Previously these would sometimes be visited and
sometimes not be visited based on whether they happened to end up on the
worklist or not. Now we always run them through the combiner.

This fixes quite a few bad codegen test cases lurking in the suite while
also being more principled. Among these, the TLS codegeneration is
particularly exciting for programs that have this in the critical path
like TSan-instrumented binaries (although I believe they arrange to use
a different TLS model that is faster anyway).

I've tried to check for compile-time regressions here by running llc
over a merged (but not LTO-ed) clang bitcode file and observed at most
a 3% slowdown in llc. Given that this is essentially a worst case (none
of opt or clang are running at this phase) I think this is tolerable.
The actual LTO case should be even less costly, and the cost in normal
compilation should be negligible.

With this combining logic, it is possible to re-legalize as we combine
which is necessary to implement PSHUFB formation on x86 as
a post-legalize DAG combine (my ultimate goal).

Differential Revision: http://reviews.llvm.org/D4638

llvm-svn: 213898
2014-07-24 22:15:28 +00:00
Chandler Carruth 80b869461e [x86] Make vector legalization of extloads work more like the "normal"
vector operation legalization with support for custom target lowering
and fallback to expand when it fails, and use this to implement sext and
anyext load lowering for x86 in a more principled way.

Previously, the x86 backend relied on a target DAG combine to "combine
away" sextload and extload nodes prior to legalization, or would expand
them during legalization with terrible code. This is particularly
problematic because the DAG combine relies on running over non-canonical
DAG nodes at just the right time to match several common and important
patterns. It used a combine rather than lowering because we didn't have
good lowering support, and to expose some of the tricks being employed
to other combine phases.

With this change it becomes a proper lowering operation, the backend
marks that it can lower these nodes, and I've added support for handling
the canonical forms that don't have direct legal representations such as
sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases
for this behavior continue to pass even after the DAG combiner begins
running more systematically over every node.
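
For example (a stylized sketch), a sextload pattern with no direct legal
form on AVX1:

define <4 x i64> @sext_v4i8_to_v4i64(<4 x i8>* %p) {
  %v = load <4 x i8>* %p
  %e = sext <4 x i8> %v to <4 x i64>
  ret <4 x i64> %e
}

The load and sext fuse into a sextload node, which the backend can now lower
through legal steps instead of hoping a pre-legalization combine catches it.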

There is some noise caused by this in the test suite where we actually
use vector extends instead of subregister extraction. This doesn't
really seem like the right thing to do, but is unlikely to be a critical
regression. We do regress in one case where, by lowering to the
target-specific patterns early, we were able to combine away extraneous
legal math nodes. However, this regression is completely addressed by
switching to a widening based legalization which is what I'm working
toward anyways, so I've just switched the test to that mode.

Differential Revision: http://reviews.llvm.org/D4654

llvm-svn: 213897
2014-07-24 22:09:56 +00:00
Lang Hames f49bc3f1b1 [X86] Optimize stackmap shadows on X86.
This patch minimizes the number of nops that must be emitted on X86 to satisfy
stackmap shadow constraints.

To minimize the number of nops inserted, the X86AsmPrinter now records the
size of the most recent stackmap's shadow in the StackMapShadowTracker class,
and tracks the number of instruction bytes emitted since that stackmap
instruction was encountered. Padding is emitted (if it is required at all)
immediately before the next stackmap/patchpoint instruction, or at the end of
the basic block.

This optimization should reduce code-size and improve performance for people
using the llvm stackmap intrinsic on X86.
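
For example (a stylized sketch), a stackmap requesting an 8-byte shadow:

call void (i64, i32, ...)* @llvm.experimental.stackmap(i64 1, i32 8)

If at least 8 bytes of real instructions follow before the next
stackmap/patchpoint or the end of the basic block, no nops are emitted at all.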

<rdar://problem/14959522>

llvm-svn: 213892
2014-07-24 20:40:55 +00:00
Reid Kleckner 9a412d13c1 Replace an assertion with a fatal error
Frontends are responsible for putting inalloca on parameters that would
be passed in memory and not registers.

llvm-svn: 213891
2014-07-24 19:53:33 +00:00
Manman Ren 29a2005596 Try to fix the bots again by moving test to X86 directory.
llvm-svn: 213884
2014-07-24 17:57:09 +00:00
Saleem Abdulrasool c61ed0474e X86: correct library call setup for Windows itanium
This target is identical to the Windows MSVC environment (and follows the
Microsoft ABI for C).  Correct the library call setup for this target; the
same set of library calls is missing in this environment.

llvm-svn: 213883
2014-07-24 17:46:36 +00:00
Matt Arsenault 83592a2d32 R600: Add FMA instructions for Evergreen
llvm-svn: 213882
2014-07-24 17:41:01 +00:00
Manman Ren a8bc9a4c36 Try to fix the bots. If this does not work, I am going to move it to X86 directory.
llvm-svn: 213880
2014-07-24 17:18:33 +00:00
Nico Weber 155dccd1eb Let the integrated assembler understand .exitm, PR20426.
llvm-svn: 213876
2014-07-24 17:08:39 +00:00
Nico Weber 404012b7dc Let the integrated assembler understand .warning, PR20428.
llvm-svn: 213873
2014-07-24 16:26:06 +00:00
Hal Finkel 9414665a3b Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities is added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similarly to
TBAA nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.
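
For example, using the scopes above (stylized as before):

... = load %ptr3, !alias.scope !{ !scope3 }
... = load %ptr4, !noalias !{ !scope2 }

This pair is answered NoAlias, because !scope3 is a descendant of !scope2 in
the scope hierarchy.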

Note that if the first element of the scope metadata is a string, then it can
be combined across functions and translation units. The string can be replaced
by a self-reference to create globally unique scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functions produce metadata implying that a1 does not alias with b2.

llvm-svn: 213864
2014-07-24 14:25:39 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
Tilmann Scheller 96ef72e54a [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRH instructions.
The ARM ARM prohibits STRH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRH instructions with unpredictable behavior.

llvm-svn: 213850
2014-07-24 09:55:46 +00:00
Matt Arsenault 9acb978105 R600: Match rcp node on pre-SI
llvm-svn: 213844
2014-07-24 06:59:24 +00:00
Matt Arsenault 0daeb63f03 R600: Fix LowerSDIV24
Use ComputeNumSignBits instead of checking for i8 / i16 which only
worked when AMDIL was lying about having legal i8 / i16.

If an integer is known to fit in 24 bits, we can
do division faster with float ops.
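
A stylized sketch of the fast path (the real lowering also performs an
integer fixup on the result):

%fa = sitofp i32 %a to float
%fb = sitofp i32 %b to float
%fq = fdiv float %fa, %fb
%q  = fptosi float %fq to i32

This works because f32 has a 24-bit significand, so 24-bit operands are
represented exactly.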

llvm-svn: 213843
2014-07-24 06:59:20 +00:00
Rafael Espindola 7fd11896a8 Remove unused substitution.
llvm-svn: 213839
2014-07-24 04:09:04 +00:00
Kevin Qin 9a2a2c502b [AArch64] Fix a bug generating incorrect instruction when building small vector.
This bug was introduced by r211144. The elements of the operand may be
smaller than the elements of the result, but the previous commit could only
handle the opposite case. This commit handles this scenario and generates
optimized code using instructions like ZIP1.

llvm-svn: 213830
2014-07-24 02:05:42 +00:00
Jiangning Liu 451f30e89f [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimizations are micro-architecture dependent and only make sense for Cyclone. A new Cyclone predicate is introduced in the .td file.
llvm-svn: 213827
2014-07-24 01:29:59 +00:00
Filipe Cabecinhas 933cccf3fa Fixed PR20411 - bug in getINSERTPS()
When we had a vector_shuffle with an input from each vector, we could
miscompile it because we assumed the input from V2 wouldn't be moved
from its original position in the vector.
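
A stylized example of the problematic shape (mask values made up):

%s = shufflevector <4 x float> %v1, <4 x float> %v2,
     <4 x i32> <i32 0, i32 6, i32 2, i32 3>

One lane comes from %v2 (index 6, i.e. element 2 of %v2) and does not land
in its original position, which is the kind of case that was mishandled.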

Added a test case.

llvm-svn: 213826
2014-07-24 01:28:21 +00:00
Manman Ren edc60376ed SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to a wider integer type.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071

llvm-svn: 213815
2014-07-23 23:13:23 +00:00
NAKAMURA Takumi c0eb860c21 Make llvm/test/CodeGen/X86/avx512*-mask-op.ll(s) aware of the Win32 x64 calling convention.
llvm-svn: 213812
2014-07-23 22:38:25 +00:00
Chandler Carruth 9f2a54c579 [x86] Rip out some broken test cases for avx512 i1 store support.
It isn't reasonable to test storing things using undef pointers --
storing through those is at best "good luck" and really should be
transformed to "unreachable". Random changes in the combiner can
randomly break these tests for no good reason. I'm following up on the
original commit regarding the right long-term strategy here.

llvm-svn: 213810
2014-07-23 22:29:19 +00:00
Juergen Ributzka fa154f03d1 [RuntimeDyld][AArch64] Update relocation tests and also add a simple GOT test.
llvm-svn: 213807
2014-07-23 22:23:17 +00:00
David Blaikie 8e9cfa5497 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

llvm-svn: 213805
2014-07-23 22:09:29 +00:00
David Blaikie f997c6f90b Test debug info in arg promotion with an actual promotion case, rather than a degenerate arg promotion that's actually DAE performed by ArgPromo
Also the debug location I had here was bogus, describing the location of
the call site as in the callee - and unnecessary, so just drop it.

llvm-svn: 213803
2014-07-23 21:30:59 +00:00
Jim Grosbach d3c7942f4a Use an explicit triple in testcase.
Make the test work better on non-darwin hosts. Hopefully.

llvm-svn: 213801
2014-07-23 20:46:32 +00:00
Jim Grosbach 724e438c62 [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison also applies when the constant is not a splat of
a scalar.

llvm-svn: 213800
2014-07-23 20:41:43 +00:00
Jim Grosbach 8f6f0858ec X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

llvm-svn: 213799
2014-07-23 20:41:38 +00:00
Jim Grosbach 19dd3088c0 DAG: fp->int conversion for non-splat constants.
Constant fold the lanes of the input constant build_vector individually
so we correctly handle when the vector elements are not all the same
constant value.
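
A stylized IR-level analogue of the DAG fold (values made up):

%r = fptosi <2 x double> <double 1.0, double -2.5> to <2 x i32>
; folds lane-by-lane to <2 x i32> <i32 1, i32 -2>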

PR20394

llvm-svn: 213798
2014-07-23 20:41:31 +00:00
Justin Holewinski 4d6f783281 [NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types
llvm-svn: 213794
2014-07-23 20:23:49 +00:00
Mark Heffernan 9e112443b6 Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
llvm-svn: 213789
2014-07-23 20:05:44 +00:00
Juergen Ributzka 1b014504ab [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

llvm-svn: 213788
2014-07-23 20:03:13 +00:00
Justin Holewinski ecca715b3c [NVPTX] mul.wide generation works for any smaller integer source types, not just the next smaller power of two
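
A stylized sketch (source width made up): for a 32-bit multiply whose
operands are sign-extended from i12, the backend can still form
mul.wide.s16:

%a32 = sext i12 %a to i32
%b32 = sext i12 %b to i32
%m   = mul i32 %a32, %b32
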
llvm-svn: 213784
2014-07-23 18:46:03 +00:00
Robert Khasanov 11d08548fd [SKX] Added missing test files for rev 213757
llvm-svn: 213780
2014-07-23 18:17:49 +00:00
Saleem Abdulrasool df2c3e89b5 AsmParser: remove deprecated LLIR support
linker_private and linker_private_weak were deprecated in 3.5.  Remove support
for them now that the 3.5 branch has been created.

llvm-svn: 213777
2014-07-23 18:09:31 +00:00