Commit Graph

71547 Commits

Author SHA1 Message Date
Petar Jovanovic b7c305f091 Add support for scalarizing ctlz_zero_undef
Fix the missing case in ScalarizeVectorResult() that was exposed with
libclcore.bc in Android.

Differential Revision: http://reviews.llvm.org/D4645

llvm-svn: 214266
2014-07-30 00:44:03 +00:00
Richard Smith 5e23fb8691 Header hygiene: remove using directive and #undef DEBUG_TYPE once we're done.
llvm-svn: 214263
2014-07-30 00:25:24 +00:00
Joerg Sonnenberger 130765558b Add rfci instruction.
llvm-svn: 214256
2014-07-29 23:45:20 +00:00
Lang Hames 9cb7353e0b [MCJIT] Add options to llvm-rtdyld to describe a phony target address space for
use in -verify mode.

This patch adds three hidden command line options to llvm-rtdyld:

 -target-addr-start <start-addr> : Specify the start of the virtual address
                                   space on the phony target.

 -target-addr-end   <end-addr>   : Specify the end of the virtual address space
                                   on the phony target.

 -target-section-sep <sep>       : Specify the separation (in bytes) between the
                                   end of one section and the start of the next.

These options automatically default to sane values for the target platform. In
particular, they allow narrow (e.g. 32-bit, 16-bit) targets to be tested from
wider (e.g. 64-bit, 32-bit) hosts without overflowing pointers.

The section separation option defaults to zero, but can be set to a large number
(e.g. 1 << 32) to force large separations between sections in order to
stress-test large-code-model code.

llvm-svn: 214255
2014-07-29 23:43:13 +00:00
Joerg Sonnenberger 2450768251 mbar without argument is equivalent to mbar 0.
llvm-svn: 214250
2014-07-29 23:31:27 +00:00
Duncan P. N. Exon Smith b57aef0030 Revert "UseListOrder: Order GlobalValue uses after initializers"
This reverts commits r214242 and r214243 while I investigate buildbot
failures [1][2][3].  I can't reproduce these failures locally, so if
anyone can see what I've done wrong, I'd appreciate a note.

[1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/9840
[2]: http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/14981
[3]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/15191

llvm-svn: 214249
2014-07-29 23:31:11 +00:00
Joerg Sonnenberger 99ef10f915 Recognize BookE's mbar instruction.
llvm-svn: 214244
2014-07-29 23:16:31 +00:00
Duncan P. N. Exon Smith 1d501e8f46 UseListOrder: Order GlobalValue uses after initializers
To avoid unnecessary forward references, the reader doesn't process
initializers of `GlobalValue`s until after the constant pool has been
processed, and then in reverse order.  Model this when predicting
use-list order.  This gets two more Bitcode tests passing with
`llvm-uselistorder`.

Part of PR5680.

llvm-svn: 214242
2014-07-29 23:06:14 +00:00
Duncan P. N. Exon Smith 2e6a87b281 UseListOrder: Create a struct around OrderMap, NFC
llvm-svn: 214241
2014-07-29 23:03:40 +00:00
Manman Ren 72b07e8578 Feedback on r214189, no functionality change.
llvm-svn: 214240
2014-07-29 22:58:13 +00:00
Joerg Sonnenberger 053566aeab Fix typo in alias: DSIR -> DSISR
llvm-svn: 214238
2014-07-29 22:42:44 +00:00
Justin Bogner cf36a366a8 llvm-profdata: Clean up and reorganize some tests
This moves some tests around to make it clearer what's being tested,
and adds very rudimentary comment syntax to the text input format to
make specifying this kind of test a little bit simpler.

llvm-svn: 214235
2014-07-29 22:29:23 +00:00
Joerg Sonnenberger 9e9623ca64 Support move to/from segment register.
llvm-svn: 214234
2014-07-29 22:21:57 +00:00
Alex Lorenz 251b3e3fbc Coverage: improve efficiency of the counter propagation to the expansion regions.
This patch reduces the complexity of the two inner loops in order to speed up 
the loading of coverage data for very large functions.

llvm-svn: 214228
2014-07-29 21:42:24 +00:00
Lang Hames b2eb492df4 [MCJIT] Make sure we print the full 64-bit result of exprs in RuntimeDyldChecker.
llvm-svn: 214227
2014-07-29 21:38:20 +00:00
Matt Arsenault 1acc72f431 R600/SI: Implement getLdStBaseRegImmOfs
llvm-svn: 214225
2014-07-29 21:34:55 +00:00
Rafael Espindola 2743525032 Have a single enum for "not a bitcode" error.
This is more convenient for callers. No functionality change, this will
be used in a next patch to the gold plugin.

llvm-svn: 214218
2014-07-29 21:01:24 +00:00
Matt Arsenault 1eb18309d5 R600/SI: Enable named operand table for DS instructions
llvm-svn: 214217
2014-07-29 21:00:56 +00:00
Matt Arsenault 620155fd1d Remove line with no effect
llvm-svn: 214216
2014-07-29 21:00:53 +00:00
Lang Hames 480763f814 [MCJIT] Make the RuntimeDyldChecker stub_addr builtin use file names rather than
full paths for its first argument.

This allows us to remove the annoying sed lines in the test cases, and write
direct references to file names in stub_addr calls (rather than <filename>
placeholders).

llvm-svn: 214211
2014-07-29 20:40:37 +00:00
Rafael Espindola c3f2e73006 Move the bitcode error enum to the include directory.
This will let users in other libraries know which error occurred. In particular,
it will be possible to check if the parsing failed or if the file is not
bitcode.

llvm-svn: 214209
2014-07-29 20:22:46 +00:00
Alex Lorenz a422911c3a Coverage: fix the missing output stream in recursive call to CoverageMappingContext::dump
llvm-svn: 214206
2014-07-29 19:58:16 +00:00
Juergen Ributzka 0e913b17d6 [RuntimeDyld][AArch64] Make encode/decodeAddend also work on big-endian hosts.
llvm-svn: 214205
2014-07-29 19:57:15 +00:00
Juergen Ributzka fbd40c36eb [RuntimeDyld][AArch64] Make encode/decodeAddend more typesafe by using the relocation enum type. NFCI.
llvm-svn: 214204
2014-07-29 19:57:11 +00:00
Joerg Sonnenberger b1ccf5623b Add a number of aliases for SPR access.
llvm-svn: 214196
2014-07-29 18:55:43 +00:00
Matt Arsenault e2fabd35b5 R600/SI: Add isMUBUF / isMTBUF
Also add missing comments about how the flags work.

llvm-svn: 214195
2014-07-29 18:51:56 +00:00
Matt Arsenault 0040f18256 R600/SI: Set bits on SMRD instructions
Set mayStore = 0 and enable named operand table.

llvm-svn: 214194
2014-07-29 18:51:54 +00:00
Manman Ren f93ac4bfad [Debug Info] remove DITrivialType and use null to represent unspecified param.
Per feedback on r214111, we are going to use null to represent unspecified
parameter. If the type array is {null}, it means a function that returns void;
If the type array is {null, null}, it means a variadic function that returns
void. In summary if we have more than one element in the type array and the last
element is null, it is a variadic function.

rdar://17628609

llvm-svn: 214189
2014-07-29 18:20:39 +00:00
Duncan P. N. Exon Smith d7a281ad2e IR: Create the use-list order shuffle vector in-place
Per David Blaikie's review of r214135, this is a more natural way to
initialize.

llvm-svn: 214184
2014-07-29 16:58:18 +00:00
Joerg Sonnenberger accbc9426d Add rfi instruction. Based on feedback by Ulrich Weigand.
llvm-svn: 214181
2014-07-29 15:49:09 +00:00
Sasa Stankovic f4a9e3bc28 [mips] Don't use odd-numbered single precision registers for fastcc calling
convention if -mno-odd-spreg is used.

Differential Revision: http://reviews.llvm.org/D4682

llvm-svn: 214180
2014-07-29 14:39:24 +00:00
Tim Northover e2239ff3eb CodeGenPrep: fall back to MVT::Other if instruction's type isn't an EVT.
The test being performed is just an approximation anyway, so it really
shouldn't crash when things don't go entirely as expected.

Should fix PR20474.

llvm-svn: 214177
2014-07-29 10:20:22 +00:00
Tim Northover 4e13a61413 ARM: add __aeabi_d2h for truncation on AEABI systems
ARM does actually define the name for this conversion, so we should use it on
"-eabi" platforms.

llvm-svn: 214176
2014-07-29 09:56:45 +00:00
Tim Northover f67bb2079d ARM: fix @llvm.convert.from.fp16 on softfloat targets.
We need to make sure we use the softened version of all appropriate operands in
the libcall, or things go horribly wrong. This may entail actually executing a
1-stage softening.

llvm-svn: 214175
2014-07-29 09:56:38 +00:00
Jiangning Liu cd296378a7 Implement AArch64 TTI interface isAsCheapAsAMove.
llvm-svn: 214159
2014-07-29 02:09:26 +00:00
Jiangning Liu c3053129b9 Add TargetInstrInfo interface isAsCheapAsAMove.
llvm-svn: 214158
2014-07-29 01:55:19 +00:00
Duncan P. N. Exon Smith 3f0fc7bca9 Bitcode: Correctly compare a Use against itself
Fix the sort of expected order in the reader to correctly return `false`
when comparing a `Use` against itself.

This was caught by test/Bitcode/binaryIntInstructions.3.2.ll, so I'm
adding a `RUN` line using `llvm-uselistorder` for every test in
`test/Bitcode` that passes.

A few tests still fail, so I'll investigate those next.

This is part of PR5680.

llvm-svn: 214157
2014-07-29 01:13:56 +00:00
Duncan P. N. Exon Smith 78141cef2c IR: Augment debug statements for use-list order
llvm-svn: 214155
2014-07-29 01:09:46 +00:00
Matt Arsenault 57e74d2010 Fix typos / grammar.
llvm-svn: 214147
2014-07-29 00:02:40 +00:00
Matt Arsenault 60bd28cefd Fix header including itself
llvm-svn: 214146
2014-07-29 00:02:37 +00:00
Manman Ren bd1628a595 [Debug Info] unique MDNodes in the enum types of each compile unit.
The enum types array by design contains pointers to MDNodes rather than DIRefs.
Unique them when handling the enum types in DwarfDebug.

rdar://17628609

llvm-svn: 214139
2014-07-28 23:04:20 +00:00
Duncan P. N. Exon Smith f849ace2ab IR: Optimize size of use-list order shuffle vectors
Since we're storing lots of these, save two-pointers per vector with a
custom type rather than using the relatively heavy `SmallVector`.

Part of PR5680.

llvm-svn: 214135
2014-07-28 22:41:50 +00:00
Manman Ren f8a1967c8c [Debug Info] add DISubroutineType and its creation takes DITypeArray.
DITypeArray is an array of DITypeRef, at its creation, we will create
DITypeRef (i.e use the identifier if the type node has an identifier).

This is the last patch to unique the type array of a subroutine type.

rdar://17628609

llvm-svn: 214132
2014-07-28 22:24:06 +00:00
Duncan P. N. Exon Smith 1f66c856b5 Bitcode: Serialize (and recover) use-list order
Predict and serialize use-list order in bitcode.  This makes the option
`-preserve-bc-use-list-order` work *most* of the time, but this is still
experimental.

  - Builds a full value-table up front in the writer, sets up a list of
    use-list orders to write out, and discards the table.  This is a
    simpler first step than determining the order from the various
    overlapping IDs of values on-the-fly.

  - The shuffles stored in the use-list order list have an unnecessarily
    large memory footprint.

  - `blockaddress` expressions cause functions to be materialized
    out-of-order.  For now I've ignored this problem, so use-list orders
    will be wrong for constants used by functions that have block
    addresses taken.  There are a couple of ways to fix this, but I
    don't have a concrete plan yet.

  - When materializing functions lazily, the use-lists for constants
    will not be correct.  This use case is out of scope: what should the
    use-list order be, if it's incomplete?

This is part of PR5680.

llvm-svn: 214125
2014-07-28 21:19:41 +00:00
Manman Ren 1a125c95de [Debug Info] add a template class DITypedArray.
Typedef DIArray to DITypedArray<DIDescriptor>. Also typedef DITypeArray as
DITypedArray<DITypeRef>.

This is the third of a series of patches to handle type uniqueing of the
type array for a subroutine type.

This commit should have no functionality change.

llvm-svn: 214115
2014-07-28 19:33:20 +00:00
Manman Ren ab8ffbaaee [Debug Info] rename getTypeArray to getElements, setTypeArray to setArrays.
This is the second of a series of patches to handle type uniqueing of the
type array for a subroutine type.

For vector and array types, getElements returns the array of subranges, so it
is a better name than getTypeArray. Even for class, struct and enum types,
getElements returns the members, which can be subprograms.

setArrays can set up to two arrays, the second is the templates.

This commit should have no functionality change.

llvm-svn: 214112
2014-07-28 19:14:13 +00:00
Manman Ren bf696e3930 [Debug Info] replace DIUnspecifiedParameter with DITrivialType.
This is the first of a series of patches to handle type uniqueing of the
type array for a subroutine type.

This commit makes sure unspecified_parameter is a DIType to enable converting
the type array for a subroutine type to an array of DITypes.

This commit should have no functionality change. With this commit, we may
change unspecified type to be a DITrivialType instead of a DIType.

llvm-svn: 214111
2014-07-28 18:52:30 +00:00
Matt Arsenault b9f46eeff1 R600/SI: Fix return type for isMIMG / isSMRD
All the others use bool, so these should too.

llvm-svn: 214106
2014-07-28 17:59:38 +00:00
Chandler Carruth b143274ad0 [SDAG] Add DEBUG logging to the legalizer, fixing a "bug" found by
inspection in the proccess, and shuffle the logging in the DAG combiner
around a bit.

With this it is much easier to follow what the legalizer is doing. It
should even accurately present most of the strange legalization
operations where a single node is replaced by multiple nodes, etc. There
is still some information lost (we log SDNodes not SDValues so we don't
log which result is used for which thing), but I think this is much
closer to a usable system. Notably, this will make it *much* more
apparant when legalization is actually happening inside the combiner, or
when there is a cycle caused by interactions of the legalizer and the
combiner.

The "bug" I fixed here I'm not sure is remotely possible to trigger. We
were only adding one of the nodes in a replacement to the updated set
rather than all of the nodes in the replacement. Realistically, the
worst result of this are nodes not getting back onto the worklist in the
DAG combiner. I doubt it is possible to trigger this today, and
I certainly don't have any ideas about how, but this at least brings the
code into alignment with the principled operation of the routine.

llvm-svn: 214105
2014-07-28 17:55:07 +00:00
Matt Arsenault 46645fa102 R600/SI: Implement getOptimalMemOpType
The default guess uses i32. This needs an address space argument
to really do the right thing in all cases.

llvm-svn: 214104
2014-07-28 17:49:26 +00:00
Matt Arsenault 86033cae84 R600/SI: Make argument loads invariant
llvm-svn: 214101
2014-07-28 17:31:39 +00:00
Robert Khasanov 595683da00 [SKX] Enabling mask logic instructions: encoding, lowering
Instructions: KAND{BWDQ}, KANDN{BWDQ}, KOR{BWDQ}, KXOR{BWDQ}, KXNOR{BWDQ}

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214081
2014-07-28 13:46:45 +00:00
Ulrich Weigand 90a5de88a8 [PowerPC] Support ELFv1/ELFv2 ABI selection via features
While LLVM now supports both ELFv1 and ELFv2 ABIs, their use is currently
hard-coded via the target triple: powerpc64-linux is always ELFv1, while
powerpc64le-linux is always ELFv2.

These are of course the most common scenarios, but in principle it is
possible to support the ELFv2 ABI on big-endian or the ELFv1 ABI on
little-endian systems (and GCC does support that), and there are some
special use cases for that (e.g. certain Linux kernel versions could
only be built using ELFv1 on LE).

This patch implements the LLVM side of supporting this.  As precedent
on other platforms suggests, ABI options are passed to the back-end as
features.  Thus, this patch implements two features "elfv1" and "elfv2"
that select the desired ABI if present.  (If not, the LLVM uses the
same default rules as now.)

llvm-svn: 214072
2014-07-28 13:09:28 +00:00
Saleem Abdulrasool 8988c2a524 ARM: correct handling of features in arch_extension
The subtarget information is the ultimate source of truth for the feature set
that is enabled at this point.  We would previously not propagate the feature
information to the subtarget.  While this worked for the most part (features
would be enabled/disabled as requested), if another operation that changed the
feature bits was encountered (such as a mode switch via a .arm or .thumb
directive), we would end up resetting the behaviour of the architectural
extensions.

Handling this properly requires a slightly more complicated handling.  We need
to check if the feature is now being toggled.  If so, only then do we toggle the
features.  In return, we no longer have to calculate the feature bits ourselves.

The test changes are mostly to the diagnosis, which is now more uniform (a nice
side effect!).  Add an additional test to ensure that we handle this case
properly.

Thanks to Nico Weber for alerting me to this issue!

llvm-svn: 214057
2014-07-27 19:07:09 +00:00
Saleem Abdulrasool 45cf67b8e9 ARM: convert loop to range based
Convert a loop to use range based iteration.  Rename structure members to help
naming, and make structure definition anonymous.  NFC.

llvm-svn: 214056
2014-07-27 19:07:05 +00:00
Matt Arsenault 6f2a526101 Add alignment value to allowsUnalignedMemoryAccess
Rename to allowsMisalignedMemoryAccess.

On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.

llvm-svn: 214055
2014-07-27 17:46:40 +00:00
Tim Northover 2c46beb0d1 AArch64: fix conversion of 'J' inline asm constraints.
'J' represents a negative number suitable for an add/sub alias
instruction, but while preparing it to become an int64_t we were
mangling the sign extension. So "i32 -1" became 0xffffffffLL, for
example.

Should fix one half of PR20456.

llvm-svn: 214052
2014-07-27 07:10:29 +00:00
Chandler Carruth 64a7c828cb [x86] Sink a variable only used by asserts into the asserts. Should fix
some -Werror bots, sorry for the noise.

llvm-svn: 214043
2014-07-27 01:45:49 +00:00
Chandler Carruth 80c5bfd843 [x86] Add a much more powerful framework for combining x86 shuffle
instructions in the legalized DAG, and leverage it to combine long
sequences of instructions to PSHUFB.

Eventually, the other x86-instruction-specific shuffle combines will
probably all be driven out of this routine. But the real motivation is
to detect after we have fully legalized and optimized a shuffle to the
minimal number of x86 instructions whether it is profitable to replace
the chain with a fully generic PSHUFB instruction even though doing so
requires either a load from a constant pool or tying up a register with
the mask.

While the Intel manuals claim it should be used when it replaces 5 or
more instructions (!!!!) my experience is that it is actually very fast
on modern chips, and so I've gon with a much more aggressive model of
replacing any sequence of 3 or more instructions.

I've also taught it to do some basic canonicalization to special-purpose
instructions which have smaller encodings than their generic
counterparts.

There are still quite a few FIXMEs here, and I've not yet implemented
support for lowering blends with PSHUFB (where its power really shines
due to being able to zero out lanes), but this starts implementing real
PSHUFB support even when using the new, fancy shuffle lowering. =]

llvm-svn: 214042
2014-07-27 01:15:58 +00:00
Matt Arsenault a5789bb4e1 R600: Move intrinsic lowering to separate functions
llvm-svn: 214023
2014-07-26 06:23:37 +00:00
Chandler Carruth 5a85c7beb8 [SDAG] Add an assert that we don't mess up the number of values when
replacing nodes in the legalizer.

This caught a number of bugs for me during development.

llvm-svn: 214022
2014-07-26 05:53:16 +00:00
Chandler Carruth 98655fa4d8 [SDAG] Simplify the code for handling single-value nodes and add
a missing transfer of debug information (without which tests fail).

llvm-svn: 214021
2014-07-26 05:52:51 +00:00
Chandler Carruth 411fb407f8 [SDAG] When performing post-legalize DAG combining, run the legalizer
over each node in the worklist prior to combining.

This allows the combiner to produce new nodes which need to go back
through legalization. This is particularly useful when generating
operands to target specific nodes in a post-legalize DAG combine where
the operands are significantly easier to express as pre-legalized
operations. My immediate use case will be PSHUFB formation where we need
to build a constant shuffle mask with a build_vector node.

This also refactors the relevant functionality in the legalizer to
support this, and updates relevant tests. I've spoken to the R600 folks
and these changes look like improvements to them. The avx512 change
needs to be investigated, I suspect there is a disagreement between the
legalizer and the DAG combiner there, but it seems a minor issue so
leaving it to be re-evaluated after this patch.

Differential Revision: http://reviews.llvm.org/D4564

llvm-svn: 214020
2014-07-26 05:49:40 +00:00
Nick Lewycky d7c726c5e9 Fix broken assert.
llvm-svn: 214019
2014-07-26 05:44:15 +00:00
NAKAMURA Takumi 1fa7769ba9 X86ShuffleDecode.cpp: Silence a warning. [-Wunused-variable]
llvm-svn: 214016
2014-07-26 04:53:05 +00:00
Chandler Carruth 5896698e2e [x86] Fix PR20355 (for real). There are many layers to this bug.
The tale starts with r212808 which attempted to fix inversion of the low
and high bits when lowering MUL_LOHI. Sadly, that commit did not include
any positive test cases, and just removed some operations from a test
case where the actual logic being changed isn't fully visible from the
test.

What this commit did was two things. First, it reversed the low and high
results in the formation of the MERGE_VALUES node for the multiple
results. This is entirely correct.

Second it changed the shuffles for extracting the low and high
components from the i64 results of the multiplies to extract them
assuming a big-endian-style encoding of the multiply results. This
second change is wrong. There is no big-endian encoding in x86, the
results of the multiplies are normal v2i64s: when cast to v4i32, the low
i32s are at offsets 0 and 2, and the high i32s are at offsets 1 and 3.

However, the first change wasn't enough to actually fix the bug, which
is (I assume) why the second change was also made. There was another bug
in the MERGE_VALUES formation: we weren't using a VTList, and so were
getting a single result node! When grabbing the *second* result from the
node, we got... well.. colud be anything. I think this *appeared* to
invert things, but had to be causing other problems as well.

Fortunately, I fixed the MERGE_VALUES issue in r213931, so we should
have been fine, right? NOOOPE! Because the core bug was never addressed,
the test in vector-idiv failed when I fixed the MERGE_VALUES node.
Because there are essentially no docs for this node, I had to guess at
how to fix it and tried swapping the operands, restoring the order of
the original code before r212808. While this "fixed" the test case (in
that we produced the write instructions) we were still extracting the
wrong elements of the i64s, and thus PR20355 was still broken.

This commit essentially reverts the big-endian-style extraction part of
r212808 and goes back to the original masks which were correct. Now that
the MERGE_VALUES node formation is also correct, everything works. I've
also included a more detailed test from PR20355 to make sure this stays
fixed.

llvm-svn: 214011
2014-07-26 03:46:57 +00:00
Chandler Carruth f6406ac5d6 [x86] Revert r214007: Fix PR20355 ...
The clever way to implement signed multiplication with unsigned *is
already implemented* and tested and working correctly. The bug is
somewhere else. Re-investigating.

This will teach me to not scroll far enough to read the code that did
what I thought needed to be done.

llvm-svn: 214009
2014-07-26 02:14:54 +00:00
Chandler Carruth 1bf4d19172 [x86] Fix PR20355 (and dups) by not using unsigned multiplication when
signed multiplication is requested. While there is not a difference in
the *low* half of the result, the *high* half (used specifically to
implement the signed division by these constants) certainly is used. The
test case I've nuked was actively asserting wrong code.

There is a delightful solution to doing signed multiplication even when
we don't have it that Richard Smith has crafted, but I'll add the
machinery back and implement that in a follow-up patch. This at least
restores correctness.

llvm-svn: 214007
2014-07-26 01:52:13 +00:00
NAKAMURA Takumi 8b2e7bfac1 Update X86/Utils/LLVMBuild.txt corresponding to r213986. "Core" has been introduced.
llvm-svn: 213995
2014-07-26 00:45:43 +00:00
Chandler Carruth 0e469609f3 [x86] Fix unused variable warning in no-asserts build.
llvm-svn: 213989
2014-07-26 00:04:41 +00:00
Chandler Carruth 185cc18d42 [x86] Teach the X86 backend to print shuffle comments for PSHUFB
instructions which happen to have a constant mask.

Currently, this only handles a very narrow set of cases, but those
happen to be the cases that I care about for testing shuffles sanely.
This is a bit trickier than other shuffle instructions because we're
decoding constants out of the constant pool. The current MC layer makes
it completely impossible to inspect a constant pool entry, so we have to
do it at the MI level and attach the comment to the streamer on its way
out. So no joy for disassembling, but it does make test cases and asm
dumps *much* nicer.

Sorry for no test cases, but it didn't really seem that valuable to go
trolling through existing old test cases and updating them. I'll have
lots of testing of this in the upcoming patch for SSSE3 emission in the
new vector shuffle lowering code paths.

llvm-svn: 213986
2014-07-25 23:47:11 +00:00
Matt Arsenault c824458e81 R600/SI: Allow partial unrolling and increase thresholds.
llvm-svn: 213985
2014-07-25 23:02:42 +00:00
Eric Christopher ac4b69e40b Move R600 subtarget dependent variables onto the subtarget.
No functional change.

llvm-svn: 213982
2014-07-25 22:22:39 +00:00
Alex Lorenz b2ebf2a08b coverage: remove empty mapping regions
This patch removes the empty coverage mapping regions.
Those regions were produced by clang's old mapping region generation 
algorithm, but the new algorithm doesn't generate them.

llvm-svn: 213981
2014-07-25 22:22:24 +00:00
Hal Finkel f5867a79c5 Canonicalization for @llvm.assume
Adds simple logical canonicalization of assumption intrinsics to instcombine,
currently:
 - invariant(a && b) -> invariant(a); invariant(b)
 - invariant(!(a || b)) -> invariant(!a); invariant(!b)

llvm-svn: 213977
2014-07-25 21:45:17 +00:00
Nico Weber a822d94f57 Wrap to 80 columns, no behavior change.
llvm-svn: 213975
2014-07-25 21:37:41 +00:00
Hal Finkel 930469107d Add @llvm.assume, lowering, and some basic properties
This is the first commit in a series that add an @llvm.assume intrinsic which
can be used to provide the optimizer with a condition it may assume to be true
(when the control flow would hit the intrinsic call). Some basic properties are added here:

 - llvm.invariant(true) is dead.
 - llvm.invariant(false) is unreachable (this directly corresponds to the
   documented behavior of MSVC's __assume(0)), so is llvm.invariant(undef).

The intrinsic is tagged as writing arbitrarily, in order to maintain control
dependencies. BasicAA has been updated, however, to return NoModRef for any
particular location-based query so that we don't unnecessarily block code
motion.

llvm-svn: 213973
2014-07-25 21:13:35 +00:00
Akira Hatanaka e5b6e0d231 [stack protector] Fix a potential security bug in stack protector where the
address of the stack guard was being spilled to the stack.

Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register. 

<rdar://problem/12475629>

llvm-svn: 213967
2014-07-25 19:31:34 +00:00
Rafael Espindola 78cfa0c72e Remove dead code.
llvm-svn: 213963
2014-07-25 19:06:39 +00:00
Hal Finkel 7c8ae53506 [PowerPC] Support TLS on PPC32/ELF
Patch by Justin Hibbits!

llvm-svn: 213960
2014-07-25 17:47:22 +00:00
Juergen Ributzka 5d6c43e294 [FastISel][AArch64] Add support for frameaddress intrinsic.
This commit implements the frameaddress intrinsic for the AArch64 architecture
in FastISel.

There were two test cases that pretty much tested the same, so I combined them
to a single test case.

Fixes <rdar://problem/17811834>

llvm-svn: 213959
2014-07-25 17:47:14 +00:00
Duncan P. N. Exon Smith 4b4d8ecde1 Move -verify-use-list-order into llvm-uselistorder
Ugh.  Turns out not even transformation passes link in how to read IR.
I sincerely believe the buildbots will finally agree with my system
after this though.  (I don't really understand why all of this has been
working on my system, but not on all the buildbots.)

Create a new tool called llvm-uselistorder to use for verifying use-list
order.  For now, just dump everything from the (now defunct)
-verify-use-list-order pass into the tool.

This might be a better way to test use-list order anyway.

Part of PR5680.

llvm-svn: 213957
2014-07-25 17:13:03 +00:00
David Blaikie 29459ae83c Reapply "DebugInfo: Don't put fission type units in comdat sections."
This recommits r208930, r208933, and r208975 (by reverting r209338) and
reverts r209529 (the FIXME to readd this functionality once the tools
were fixed) now that DWP has been fixed to cope with a single section
for all fission type units.

Original commit message:

"Since type units in the dwo file are handled by a debug aware tool,
they don't need to leverage the ELF comdat grouping to implement
deduplication. Avoid creating all the .group sections for these as a
space optimization."

llvm-svn: 213956
2014-07-25 17:11:58 +00:00
Hans Wennborg 82f490c0ba Fix MSVC2012 build error in UseListOrder.cpp
I think the compiler got confused by the nested DEBUG macros.
It was failing with:

  UseListOrder.cpp(80) : error C2059: syntax error : '}'

llvm-svn: 213954
2014-07-25 16:22:13 +00:00
Duncan P. N. Exon Smith 15eb0ab28d Bitcode: Don't optimize constants when preserving use-list order
`ValueEnumerator::OptimizeConstants()` creates forward references within
the constant pools, which makes predicting constants' use-list order
difficult.  For now, just disable the optimization.

This can be re-enabled in the future in one of two ways:

  - Enable a limited version of this optimization that doesn't create
    forward references.  One idea is to categorize constants by their
    "height" and make that the top-level sort.

  - Enable it entirely.  This requires predicting how may times each
    constant will be recreated as its operands' and operands' operands'
    (etc.) forward references get resolved.

This is part of PR5680.

llvm-svn: 213953
2014-07-25 16:13:16 +00:00
David Blaikie 2f04011435 Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson
reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped
provide a reduced reproduction, though the failure wasn't too hard to
guess, and even easier with the example to confirm.

The assertion that the subprogram metadata associated with an
llvm::Function matches the scope data referenced by the DbgLocs on the
instructions in that function is not valid under LTO. In LTO, a C++
inline function might exist in multiple CUs and the subprogram metadata
nodes will refer to the same llvm::Function. In this case, depending on
the order of the CUs, the first intance of the subprogram metadata may
not be the one referenced by the instructions in that function and the
assertion will fail.

A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the
assertion removed and a comment added to explain this situation.

This was then reverted again in r213581 as it caused PR20367. The root
cause of this was the early exit in LiveDebugVariables meant that
spurious DBG_VALUE intrinsics that referenced dead variables were not
removed, causing an assertion/crash later on. The fix is to have
LiveDebugVariables strip all DBG_VALUE intrinsics in functions without
debug info as they're not needed anyway. Test case added to cover this
situation (that occurs when a debug-having function is inlined into a
nodebug function) in test/DebugInfo/X86/nodebug_with_debug_loc.ll

Original commit message:

If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.

While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.

Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.

Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).

llvm-svn: 213952
2014-07-25 16:10:16 +00:00
Hal Finkel ff0bcb60c9 Convert noalias parameter attributes into noalias metadata during inlining
This functionality is currently turned off by default.

Part of the motivation for introducing scoped-noalias metadata is to enable the
preservation of noalias parameter attribute information after inlining.
Sometimes this can be inferred from the code in the caller after inlining, but
often we simply lose valuable information.

The overall process if fairly simple:
 1. Create a new unqiue scope domain.
 2. For each (used) noalias parameter, create a new alias scope.
 3. For each pointer, collect the underlying objects. Add a noalias scope for
    each noalias parameter from which we're not derived (and has not been
    captured prior to that point).
 4. Add an alias.scope for each noalias parameter from which we might be
    derived (or has been captured before that point).

Note that the capture checks apply only if one of the underlying objects is not
an identified function-local object.

llvm-svn: 213949
2014-07-25 15:50:08 +00:00
Hal Finkel 029cde639c Simplify and improve scoped-noalias metadata semantics
In the process of fixing the noalias parameter -> metadata conversion process
that will take place during inlining (which will be committed soon, but not
turned on by default), I have come to realize that the semantics provided by
yesterday's commit are not really what we want. Here's why:

void foo(noalias a, noalias b, noalias c, bool x) {
  *q = x ? a : b;
  *c = *q;
}

Generically, we know that *c does not alias with *a and with *b (so there is an
'and' in what we know we're not), and we know that *q might be derived from *a
or from *b (so there is an 'or' in what we know that we are). So we do not want
the semantics currently, where any noalias scope matching any alias.scope
causes a NoAlias return. What we want to know is that the noalias scopes form a
superset of the alias.scope list (meaning that all the things we know we're not
is a superset of all of things the other instruction might be).

Making that change, however, introduces a composibility problem. If we inline
once, adding the noalias metadata, and then inline again adding more, and we
append new scopes onto the noalias and alias.scope lists each time. But, this
means that we could change what was a NoAlias result previously into a MayAlias
result because we appended an additional scope onto one of the alias.scope
lists. So, instead of giving scopes the ability to have parents (which I had
borrowed from the TBAA implementation, but seems increasingly unlikely to be
useful in practice), I've given them domains. The subset/superset condition now
applies within each domain independently, and we only need it to hold in one
domain. Each time we inline, we add the new scopes in a new scope domain, and
everything now composes nicely. In addition, this simplifies the
implementation.

llvm-svn: 213948
2014-07-25 15:50:02 +00:00
Duncan P. N. Exon Smith 20a005f27a Try to fix a layering violation introduced by r213945
The dragonegg buildbot (and others?) started failing after
r213945/r213946 because `llvm-as` wasn't linking in the bitcode reader.
I think moving the verify functions to the same file as the verify pass
should fix the build.  Adding a command-line option for maintaining
use-list order in assembly as a drive-by to prevent warnings about
unused static functions.

llvm-svn: 213947
2014-07-25 15:41:49 +00:00
Duncan P. N. Exon Smith f62acab1c0 Fix -Werror build after r213945
llvm-svn: 213946
2014-07-25 15:00:02 +00:00
Duncan P. N. Exon Smith 6b6fdc992a IPO: Add use-list-order verifier
Add a -verify-use-list-order pass, which shuffles use-list order, writes
to bitcode, reads back, and verifies that the (shuffled) order matches.

  - The utility functions live in lib/IR/UseListOrder.cpp.

  - Moved (and renamed) the command-line option to enable writing
    use-lists, so that this pass can return early if the use-list orders
    aren't being serialized.

It's not clear that this pass is the right direction long-term (perhaps
a separate tool instead?), but short-term it's a great way to test the
use-list order prototype.  I've added an XFAIL-ed testcase that I'm
hoping to get working pretty quickly.

This is part of PR5680.

llvm-svn: 213945
2014-07-25 14:49:26 +00:00
Amara Emerson 115d2df8a4 [ARM] Emit ABI_PCS_R9_use build attribute.
Patch by Ben Foster!

Differential Revision: http://reviews.llvm.org/D4657

llvm-svn: 213944
2014-07-25 14:03:14 +00:00
Benjamin Kramer 1f8930e3d3 Run sort_includes.py on the AArch64 backend.
No functionality change.

llvm-svn: 213938
2014-07-25 11:42:14 +00:00
Chandler Carruth 3de980d2ff [SDAG] Enable the new assert for out-of-range result numbers in
SDValues, fixing the two bugs left in the regression suite.

The key for both of these was the use a single value type rather than
a VTList which caused an unintentionally single-result merge-value node.
Fix this by getting the appropriate VTList in place.

Doing this exposed that the comments in x86's code abouth how MUL_LOHI
operands are handle is wrong. The bug with the use of out-of-range
result numbers was hiding the bug about the order of operands here (as
best i can tell). There are more places where the code appears to get
this backwards still...

llvm-svn: 213931
2014-07-25 09:19:23 +00:00
Chandler Carruth eae2d28cc9 [SDAG] Don't insert the VRBase into a mapping from SDValues when the def
doesn't actually correspond to an SDValue at all. Fixes most of the
remaining asserts on out-of-range SDValue result numbers.

llvm-svn: 213930
2014-07-25 09:19:18 +00:00
Matt Arsenault 197a1e26e3 Store nodes only have 1 result.
llvm-svn: 213928
2014-07-25 07:56:42 +00:00
Chandler Carruth 94bd553eb8 [SDAG] Start plumbing an assert into SDValues that we don't form one
with a result number outside the range of results for the node.

I don't know how we managed to not really check this very basic
invariant for so long, but the code is *very* broken at this point.
I have over 270 test failures with the assert enabled. I'm committing it
disabled so that others can join in the cleanup effort and reproduce the
issues. I've also included one of the obvious fixes that I already
found. More fixes to come.

llvm-svn: 213926
2014-07-25 07:23:23 +00:00
Akira Hatanaka 16e47ff42e [ARM] In thumb mode, emit directive ".code 16" before file level inline
assembly instructions.

This is necessary to ensure ARM assembler switches to Thumb mode before it
starts assembling the file level inline assembly instructions at the beginning
of a .s file.

<rdar://problem/17757232>

llvm-svn: 213924
2014-07-25 05:12:49 +00:00
Ehsan Akhgari 29b61ce770 Fix a warning in CoverageMappingReader.cpp
llvm-svn: 213920
2014-07-25 02:51:57 +00:00
Lang Hames 5432649be7 [X86] Clarify some stackmap shadow optimization code as based on review
feedback from Eric Christopher.

No functional change.

llvm-svn: 213917
2014-07-25 02:29:19 +00:00
Bill Schmidt c9fa5dd618 [PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*.
Because the PowerPC vmrgh* and vmrgl* instructions have a built-in
big-endian bias, it is necessary to swap their inputs in little-endian
mode when using them to implement a vector shuffle.  This was
previously missed in the vector LE implementation.

There was already logic to distinguish between unary and "normal"
vmrg* vector shuffles, so this patch extends that logic to use a third
option:  "swapped" vmrg* vector shuffles that are used for little
endian in place of the "normal" ones.

I've updated the vec-shuffle-le.ll test to check for the expected
register ordering on the generated instructions.

This bug was discovered when testing the LE and ELFv2 patches for
safety if they were backported to 3.4.  A different vectorization
decision was made in 3.4 than on mainline trunk, and that exposed the
problem.  I've verified this fix takes care of that issue.

llvm-svn: 213915
2014-07-25 01:55:55 +00:00
Alex Lorenz a20a5d50ba Add code coverage mapping data, reader, and writer.
This patch implements the data structures, the reader and
the writers for the new code coverage mapping system. 
The new code coverage mapping system uses the instrumentation
based profiling to provide code coverage analysis.

llvm-svn: 213910
2014-07-24 23:57:54 +00:00
Alex Lorenz 817e485470 Add code coverage mapping data, reader, and writer.
This patch implements the data structures, the reader and
the writers for the new code coverage mapping system. 
The new code coverage mapping system uses the instrumentation
based profiling to provide code coverage analysis.

llvm-svn: 213909
2014-07-24 23:55:56 +00:00
Mark Heffernan 8ec1474f7f After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates.  This patch fixes
an issue where loop unrolling could be disabled for other loops as well which
share the same llvm.loop metadata.

llvm-svn: 213900
2014-07-24 22:36:40 +00:00
Joerg Sonnenberger b5459e6e22 Don't use 128bit functions on PPC32.
llvm-svn: 213899
2014-07-24 22:20:10 +00:00
Chandler Carruth 9f4530b95d [SDAG] Introduce a combined set to the DAG combiner which tracks nodes
which have successfully round-tripped through the combine phase, and use
this to ensure all operands to DAG nodes are visited by the combiner,
even if they are only added during the combine phase.

This is critical to have the combiner reach nodes that are *introduced*
during combining. Previously these would sometimes be visited and
sometimes not be visited based on whether they happened to end up on the
worklist or not. Now we always run them through the combiner.

This fixes quite a few bad codegen test cases lurking in the suite while
also being more principled. Among these, the TLS codegeneration is
particularly exciting for programs that have this in the critical path
like TSan-instrumented binaries (although I think they engineer to use
a different TLS that is faster anyways).

I've tried to check for compile-time regressions here by running llc
over a merged (but not LTO-ed) clang bitcode file and observed at most
a 3% slowdown in llc. Given that this is essentially a worst case (none
of opt or clang are running at this phase) I think this is tolerable.
The actual LTO case should be even less costly, and the cost in normal
compilation should be negligible.

With this combining logic, it is possible to re-legalize as we combine
which is necessary to implement PSHUFB formation on x86 as
a post-legalize DAG combine (my ultimate goal).

Differential Revision: http://reviews.llvm.org/D4638

llvm-svn: 213898
2014-07-24 22:15:28 +00:00
Chandler Carruth 80b869461e [x86] Make vector legalization of extloads work more like the "normal"
vector operation legalization with support for custom target lowering
and fallback to expand when it fails, and use this to implement sext and
anyext load lowering for x86 in a more principled way.

Previously, the x86 backend relied on a target DAG combine to "combine
away" sextload and extload nodes prior to legalization, or would expand
them during legalization with terrible code. This is particularly
problematic because the DAG combine relies on running over non-canonical
DAG nodes at just the right time to match several common and important
patterns. It used a combine rather than lowering because we didn't have
good lowering support, and to expose some tricks being employed to more
combine phases.

With this change it becomes a proper lowering operation, the backend
marks that it can lower these nodes, and I've added support for handling
the canonical forms that don't have direct legal representations such as
sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases
for this behavior continue to pass even after the DAG combiner beigns
running more systematically over every node.

There is some noise caused by this in the test suite where we actually
use vector extends instead of subregister extraction. This doesn't
really seem like the right thing to do, but is unlikely to be a critical
regression. We do regress in one case where by lowering to the
target-specific patterns early we were able to combine away extraneous
legal math nodes. However, this regression is completely addressed by
switching to a widening based legalization which is what I'm working
toward anyways, so I've just switched the test to that mode.

Differential Revision: http://reviews.llvm.org/D4654

llvm-svn: 213897
2014-07-24 22:09:56 +00:00
Saleem Abdulrasool 8dc8fb18d8 Target: invert condition for Windows
The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI.
The long double routines are not part of this environment.  However, cygwin and
MinGW both provide supplementary implementations.  Change the condition to
reflect this reality.

llvm-svn: 213896
2014-07-24 22:09:06 +00:00
Manman Ren 4d189fb9a6 Feedback from Hans on r213815. No functionaility change.
llvm-svn: 213895
2014-07-24 21:13:20 +00:00
Hans Wennborg e34a71aa91 Windows: Don't wildcard expand /? or -?
Even if there's a file called c:\a, we want /? to be preserved as
an option, not expanded to a filename.

llvm-svn: 213894
2014-07-24 21:09:45 +00:00
Lang Hames f49bc3f1b1 [X86] Optimize stackmap shadows on X86.
This patch minimizes the number of nops that must be emitted on X86 to satisfy
stackmap shadow constraints.

To minimize the number of nops inserted, the X86AsmPrinter now records the
size of the most recent stackmap's shadow in the StackMapShadowTracker class,
and tracks the number of instruction bytes emitted since the that stackmap
instruction was encountered. Padding is emitted (if it is required at all)
immediately before the next stackmap/patchpoint instruction, or at the end of
the basic block.

This optimization should reduce code-size and improve performance for people
using the llvm stackmap intrinsic on X86.

<rdar://problem/14959522>

llvm-svn: 213892
2014-07-24 20:40:55 +00:00
Reid Kleckner 9a412d13c1 Replace an assertion with a fatal error
Frontends are responsible for putting inalloca on parameters that would
be passed in memory and not registers.

llvm-svn: 213891
2014-07-24 19:53:33 +00:00
Joerg Sonnenberger 6637d4e2e7 Use the same .eh_frame encoding for 32bit PPC as on i386.
llvm-svn: 213890
2014-07-24 19:25:16 +00:00
Saleem Abdulrasool c61ed0474e X86: correct library call setup for Windows itanium
This target is identical to the Windows MSVC (and follows Microsoft ABI for C).
Correct the library call setup for this target.  The same set of library calls
are missing on this environment.

llvm-svn: 213883
2014-07-24 17:46:36 +00:00
Matt Arsenault 83592a2d32 R600: Add FMA instructions for Evergreen
llvm-svn: 213882
2014-07-24 17:41:01 +00:00
Saleem Abdulrasool 34610e33ae X86: silence sign comparison warning
GCC 4.8 detected a signed compare [-Wsign-compare].  Add a cast for the
destination index.  Add an assert to catch a potential overflow however unlikely
it may be.

llvm-svn: 213878
2014-07-24 17:12:06 +00:00
Matt Arsenault 83e60581c3 R600: Add new functions for splitting vector loads and stores.
These will be used in future patches and shouldn't change anything yet.

llvm-svn: 213877
2014-07-24 17:10:35 +00:00
Nico Weber 155dccd1eb Let the integrated assembler understand .exitm, PR20426.
llvm-svn: 213876
2014-07-24 17:08:39 +00:00
Nico Weber 2a8f922b1c Remove unused field MacroInstantiation::TheMacro. No behavior change.
llvm-svn: 213874
2014-07-24 16:29:04 +00:00
Nico Weber 404012b7dc Let the integrated assembler understand .warning, PR20428.
llvm-svn: 213873
2014-07-24 16:26:06 +00:00
Joerg Sonnenberger c7dbc13e77 Include relative path for header outside the current directory.
llvm-svn: 213872
2014-07-24 16:04:46 +00:00
Rafael Espindola 8c4c0213fd Remove dead code.
Every user has been switched to using EngineBuilder.

llvm-svn: 213871
2014-07-24 16:02:28 +00:00
Tim Northover 7324e845a4 AArch64: refactor ReconstructShuffle function
Quite a bit of cruft had accumulated as we realised the various different cases
it had to handle and squeezed them in where possible. This refactoring mostly
flattens the logic and special-cases. The result is slightly longer, but I
think clearer.

Should be no functionality change.

llvm-svn: 213867
2014-07-24 15:39:55 +00:00
Hal Finkel 9414665a3b Add scoped-noalias metadata
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
  1. To preserve noalias function attribute information when inlining
  2. To provide the ability to model block-scope C99 restrict pointers

Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.

What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:

!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:

... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.

Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.

[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]

Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.

llvm-svn: 213864
2014-07-24 14:25:39 +00:00
Aaron Ballman 99e0ea0aa8 Fixing an MSVC conversion warning about implicitly converting the shift results to 64-bits. No functional change intended.
llvm-svn: 213863
2014-07-24 14:24:59 +00:00
Hal Finkel cc39b67530 AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

llvm-svn: 213859
2014-07-24 12:16:19 +00:00
NAKAMURA Takumi 8d745ca7cc Prune redundant libdeps.
llvm-svn: 213857
2014-07-24 11:45:27 +00:00
NAKAMURA Takumi 98d18be5fe Prune dependency to MC from each target disassembler.
llvm-svn: 213856
2014-07-24 11:45:11 +00:00
Tilmann Scheller 96ef72e54a [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRH instructions.
The ARM ARM prohibits STRH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRH instructions with unpredictable behavior.

llvm-svn: 213850
2014-07-24 09:55:46 +00:00
Daniel Sanders bdcfab117c [mips] Fix ll and sc instructions
Summary: The ll and sc instructions for r6 and non-r6 are misplaced. This patch fixes that.

Patch by Jyun-Yan You

Differential Revision: http://reviews.llvm.org/D4578

llvm-svn: 213847
2014-07-24 09:47:14 +00:00
Matt Arsenault 9acb978105 R600: Match rcp node on pre-SI
llvm-svn: 213844
2014-07-24 06:59:24 +00:00
Matt Arsenault 0daeb63f03 R600: Fix LowerSDIV24
Use ComputeNumSignBits instead of checking for i8 / i16 which only
worked when AMDIL was lying about having legal i8 / i16.

If an integer is known to fit in 24-bits, we can
do division faster with float ops.

llvm-svn: 213843
2014-07-24 06:59:20 +00:00
NAKAMURA Takumi 9c3bd7618a Update library dependencies.
llvm-svn: 213832
2014-07-24 02:10:42 +00:00
Matt Arsenault 034d666bb7 R600: Implement enableClusterLoads()
llvm-svn: 213831
2014-07-24 02:10:17 +00:00
Kevin Qin 9a2a2c502b [AArch64] Fix a bug generating incorrect instruction when building small vector.
This bug is introduced by r211144. The element of operand may be
smaller than the element of result, but previous commit can
only handle the contrary condition. This commit is to handle this
scenario and generate optimized codes like ZIP1.

llvm-svn: 213830
2014-07-24 02:05:42 +00:00
Jiangning Liu 451f30e89f [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file.
llvm-svn: 213827
2014-07-24 01:29:59 +00:00
Filipe Cabecinhas 933cccf3fa Fixed PR20411 - bug in getINSERTPS()
When we had a vector_shuffle where we had an input from each vector, we
could miscompile it because we were assuming the input from V2 wouldn't
be moved from where it was on the vector.

Added a test case.

llvm-svn: 213826
2014-07-24 01:28:21 +00:00
Manman Ren edc60376ed SimplifyCFG: fix a bug in switch to table conversion
We use gep to access the global array "switch.table", and the table index
should be treated as unsigned. When the highest bit is 1, this commit
zero-extends the index to an integer type with larger size.

For a switch on i2, we used to generate:
%switch.tableidx = sub i2 %0, -2
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx

It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate
%switch.tableidx = sub i2 %0, -2
%switch.tableidx.zext = zext i2 %switch.tableidx to i3
getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext

rdar://17735071

llvm-svn: 213815
2014-07-23 23:13:23 +00:00
Rafael Espindola 45bcf8a59c Fix the build when building with only the ARM backend.
llvm-svn: 213814
2014-07-23 22:54:28 +00:00
Eric Christopher f19d12ba3c Fix indenting.
llvm-svn: 213811
2014-07-23 22:34:13 +00:00
Eric Christopher 6d0e40bfbf Reorganize and simplify local variables.
llvm-svn: 213809
2014-07-23 22:27:10 +00:00
Rafael Espindola 5addace56d Finish inverting the MC -> Object dependency.
There were still some disassembler bits in lib/MC, but their use of Object
was only visible in the includes they used, not in the symbols.

llvm-svn: 213808
2014-07-23 22:26:07 +00:00
Eric Christopher 9d9167950e Remove the query for TargetMachine and TargetInstrInfo since we're
already inside TargetInstrInfo.

llvm-svn: 213806
2014-07-23 22:12:03 +00:00
David Blaikie 8e9cfa5497 ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

llvm-svn: 213805
2014-07-23 22:09:29 +00:00
Jim Grosbach 724e438c62 [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison applies when the constant is not a splat of a scalar
as well.

llvm-svn: 213800
2014-07-23 20:41:43 +00:00
Jim Grosbach 8f6f0858ec X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

llvm-svn: 213799
2014-07-23 20:41:38 +00:00
Jim Grosbach 19dd3088c0 DAG: fp->int conversion for non-splat constants.
Constant fold the lanes of the input constant build_vector individually
so we correctly handle when the vector elements are not all the same
constant value.

PR20394

llvm-svn: 213798
2014-07-23 20:41:31 +00:00
Justin Holewinski 2cb5e181d1 [NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.

llvm-svn: 213793
2014-07-23 20:23:47 +00:00
Mark Heffernan 9e112443b6 Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
llvm-svn: 213789
2014-07-23 20:05:44 +00:00
Juergen Ributzka 1b014504ab [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

llvm-svn: 213788
2014-07-23 20:03:13 +00:00