Commit Graph

142269 Commits

Author SHA1 Message Date
Geoff Berry 66d1f0ff1f [LiveRangeEdit] Change eliminateDeadDef assert to if condition.
The assert could potentially fire (though no cases have been
encountered), so just check that the instruction we're handling
specially for rematerialization only has one def to begin with.

Reviewed by Wei Mi over email.

llvm-svn: 289861
2016-12-15 19:55:19 +00:00
Peter Collingbourne e089554c8f LibDriver: Allow resource files to be archive members.
It seems pointless to add a resource to an archive because it won't have
any symbols to link against (and link.exe doesn't have an equivalent of
--whole-archive), but lib.exe allows it for some reason.

llvm-svn: 289859
2016-12-15 19:37:46 +00:00
Zachary Turner 578113ffb7 Re-add the check for __has_attribute in StringLiteral.
llvm-svn: 289858
2016-12-15 19:33:31 +00:00
Boris Ulasevich b76f6c2745 BrainF example: fixing segfault caused by outdated code with missing MCJIT dependency
Differential Revision: https://reviews.llvm.org/D26280

llvm-svn: 289857
2016-12-15 19:29:42 +00:00
Zachary Turner b0aa31bb25 Ignore -Wgcc-compat diagnostic in StringLiteral.
llvm-svn: 289856
2016-12-15 19:22:58 +00:00
Sanjay Patel d640641a61 [InstCombine] add folds for icmp (smin X, Y), X
Min/max canonicalization (r287585) exposes the fact that we're missing combines for min/max patterns. 
This patch won't solve the example that was attached to that thread, so something else still needs fixing.

The line between InstCombine and InstSimplify gets blurry here because sometimes the icmp instruction that
we want to fold to already exists, but sometimes it's the swapped form of what we want.

Corresponding changes for smax/umin/umax to follow.

Differential Revision: https://reviews.llvm.org/D27531

llvm-svn: 289855
2016-12-15 19:13:37 +00:00
Reid Kleckner e793966d80 Fix some remaining documentation references to MSVC 2013
MSVC 2015 has been the minimum supported version of VS since October.

Differential Revision: https://reviews.llvm.org/D25710

llvm-svn: 289854
2016-12-15 19:08:02 +00:00
Zachary Turner 182b4652e5 [StringRef] Add enable-if to StringLiteral.
to prevent StringLiteral from being created with a non-literal
char array, clang has a macro enable_if() that can be used
in such a way as to guarantee that the constructor is disabled
unless the length fo the string can be computed at compile time.

This only works on clang, but at least it should allow bots
to catch abuse of StringLiteral.

Differential Revision: https://reviews.llvm.org/D27780

llvm-svn: 289853
2016-12-15 19:02:43 +00:00
Kostya Serebryany 9a038c188c [libFuzzer] doc update
llvm-svn: 289849
2016-12-15 18:47:22 +00:00
Ahmed Bougacha 5228603387 [GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC.
MachineLegalizer used to be the name of both the class and the member,
causing GCC errors. r276522 fixed that by renaming the member to just
'Legalizer'.  The 'class' workaround isn't necessary anymore; drop it.

llvm-svn: 289848
2016-12-15 18:45:30 +00:00
Sanjay Patel a97358bc8e [x86] use a single shufps for 256-bit vectors when it can save instructions
This is the 256-bit counterpart to the 128-bit transform checked in here:
https://reviews.llvm.org/rL289837

This patch is based on the draft by @sroland (Roland Scheidegger) that is
attached to PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

llvm-svn: 289846
2016-12-15 18:43:46 +00:00
Matthew Simpson 2c8de192a1 [AArch64] Guard Misaligned 128-bit store penalty by subtarget feature
This patch checks that the SlowMisaligned128Store subtarget feature is set
when penalizing such stores in getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D27677

llvm-svn: 289845
2016-12-15 18:36:59 +00:00
Ahmed Bougacha 2a26a5f1f0 [AArch64][GlobalISel] Remove redundant RBI comments. NFC.
It's brittle, and Doxygen already picks the overriden method's comment
anyway.

llvm-svn: 289844
2016-12-15 18:22:15 +00:00
Teresa Johnson 1b859a2306 [ThinLTO] Ensure callees get hot threshold when first seen on cold path
This is split out from D27696, since it turned out to be a bug fix and
not part of the NFC efficiency change.

Keep the same adjusted (possibly decayed) threshold in both the worklist
and the ImportList. Otherwise if we encountered it first along a cold
path, the callee would be added to the worklist with a lower decayed
threshold than when it is later encountered along a hot path. But the
logic uses the threshold recorded in the ImportList entry to check if
we should re-add it, and without this patch the threshold recorded there
is the same along both paths so we don't re-add it. Using the
same possibly decayed threshold in the ImportList ensures we re-add it
later with the higher non-decayed hot path threshold.

llvm-svn: 289843
2016-12-15 18:21:01 +00:00
Chris Bieneman dc9b0db8e3 [CMake] Minor change to symlink generation for LLDB
If OUTPUT_DIR is not specified we can assume the symlink is linking to a file in the same directory, so we can use $<TARGET_FILE_NAME:${target}> to create a relative symlink.

In the case of LLDB, when we build a framework, we are creating symlinks in a different directory than the file we're pointing to, and we don't install those links. To make this work in the build directory we can use $<TARGET_FILE:${target}> instead, which uses the full path to the target.

llvm-svn: 289840
2016-12-15 18:17:07 +00:00
Sanjay Patel a0d8a278a7 [x86] use a single shufps when it can save instructions
This is a tiny patch with a big pile of test changes.
This partially fixes PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

My motivating case looks like this:

  - vpshufd {{.*#+}} xmm1 = xmm1[0,1,0,2]
  - vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
  - vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7]

  + vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]

And this happens several times in the diffs. For chips with domain-crossing penalties,
the instruction count and size reduction should usually overcome any potential 
domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such
as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so
using shufps is a pure win.

So the test case diffs all appear to be improvements except one test in 
vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate 
zero elements and one test in combine-sra.ll where multiple uses prevent the expected
shuffle combining.

Differential Revision: https://reviews.llvm.org/D27692

llvm-svn: 289837
2016-12-15 18:03:38 +00:00
Simon Pilgrim 7522f54feb [X86][SSE] Fix domains for scalar store instructions
As discussed on D27692

llvm-svn: 289834
2016-12-15 17:09:24 +00:00
Robert Lougher 6ea759a83e Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
Reverting as it is causing buildbot failures (address sanitizer).

llvm-svn: 289833
2016-12-15 16:59:13 +00:00
Jacques Pienaar ccffe38352 [lanai] Simplify small section check in LowerGlobalAddress and treat ldata sections specially.
Move the check for the code model into isGlobalInSmallSectionImpl and return false (not in small section) for variables placed in sections prefixed with .ldata (workaround for a tool limitation).

llvm-svn: 289832
2016-12-15 16:56:16 +00:00
Simon Pilgrim ba46422694 [X86][AVX512] Moved instruction domain lookups to the right table. NFCI.
Avoid duplicating instructions in the int32/int64 domains.

llvm-svn: 289830
2016-12-15 16:38:51 +00:00
Robert Lougher cf17674211 [SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst
Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Differential Revision: https://reviews.llvm.org/D27590

llvm-svn: 289828
2016-12-15 16:17:53 +00:00
Krzysztof Parzyszek 0ca1987977 Fix ubsan failures in lane mask shifts
llvm-svn: 289826
2016-12-15 16:08:49 +00:00
Simon Pilgrim d7518896ff [X86][SSE] Fix domains for VZEXT_LOAD type instructions
Add the missing domain equivalences for movss, movsd, movd and movq zero extending loading instructions.

Differential Revision: https://reviews.llvm.org/D27684

llvm-svn: 289825
2016-12-15 16:05:29 +00:00
Alexander Timofeev a57511c451 Fix for regression after Global Load Scalarization patch
llvm-svn: 289822
2016-12-15 15:17:19 +00:00
Krzysztof Parzyszek 91b5cf8412 Extract LaneBitmask into a separate type
Specifically avoid implicit conversions from/to integral types to
avoid potential errors when changing the underlying type. For example,
a typical initialization of a "full" mask was "LaneMask = ~0u", which
would result in a value of 0x00000000FFFFFFFF if the type was extended
to uint64_t.

Differential Revision: https://reviews.llvm.org/D27454

llvm-svn: 289820
2016-12-15 14:36:06 +00:00
Simon Pilgrim 2f7f0e7a48 [CostModel][X86] Updated reverse shuffle costs
llvm-svn: 289819
2016-12-15 14:24:07 +00:00
Alexey Bataev 4160264e30 [TEST] Initial commit of tests for minmax horizontal reductions.
llvm-svn: 289817
2016-12-15 13:21:29 +00:00
Alexey Bataev 2db6045b29 Revert "[TESTS] Initial commit of tests, by Andrew Tischenko"
This reverts commit ee709f8988653a0334fbf100cdbbdd83a3933347.

llvm-svn: 289814
2016-12-15 12:26:18 +00:00
Ehsan Amiri 795b0671c5 [InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp
A number of new patterns for simplifying and/xor of icmp:

(icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true:
1- (%x = and %a, %mask) and (%y = and %b, %mask)
2- %mask is a power of 2.

(icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true:
1- (%x = and %a, %mask1) and (%y = and %b, %mask2)
2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any
   %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t.
For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8
violates condition (2) above. So this optimization cannot be applied.

llvm-svn: 289813
2016-12-15 12:25:13 +00:00
Simon Pilgrim 9876ed07f6 [CostModel] Fix long standing bug with reverse shuffle mask detection
Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles

llvm-svn: 289811
2016-12-15 12:12:45 +00:00
Alexey Bataev 67c90c7d95 [TESTS] Initial commit of tests, by Andrew Tischenko
llvm-svn: 289807
2016-12-15 11:48:24 +00:00
Nemanja Ivanovic 552c8e960e [Power9] Allow AnyExt immediates for XXSPLTIB
In some situations, the BUILD_VECTOR node that builds a v18i8 vector by
a splat of an i8 constant will end up with signed 8-bit values and other
situations, it'll end up with unsigned ones. Handle both situations.

Fixes PR31340.

llvm-svn: 289804
2016-12-15 11:16:20 +00:00
Dylan McKay 4f590f28e7 [AVR] Support floats in the instrumention pass
This also refactors some common code into the 'GetTypeName' method.

llvm-svn: 289803
2016-12-15 11:02:41 +00:00
Simon Pilgrim 9ebeac3eed [CostModel][X86] Add tests for reverse shuffle costs
llvm-svn: 289800
2016-12-15 10:45:53 +00:00
Prakhar Bahuguna bc35f21f70 Add missing triple target for numeric section flag test
llvm-svn: 289798
2016-12-15 10:20:48 +00:00
Pavel Labath 08c2e86802 Simplify format member detection in FormatVariadic
Summary:
This replaces the format member search, which was quite complicated, with a more
direct approach to detecting whether a class should be formatted using the
format-member method. Instead we use a special type llvm::format_adapter, which
every adapter must inherit from. Then the search can be simply implemented with
the is_base_of type trait.

Aside from the simplification, I like this way more because it makes it more
explicit that you are supposed to use this type only for adapter-like
formattings, and the other approach (format_provider overloads) should be used
as a default (a mistake I made when first trying to use this library).

The only slight change in behaviour here is that now choose the format-adapter
branch even if the format member invocation will fail to compile (e.g. because it is a
non-const member function and we are passing a const adapter), whereas
previously we would have gone on to search for format_providers for the type.
However, I think that is actually a good thing, as it probably means the
programmer did something wrong.

Reviewers: zturner, inglorion

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27679

llvm-svn: 289795
2016-12-15 09:40:27 +00:00
Sjoerd Meijer 96e10b5a9e [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
This is essentially a recommit of r285893, but with a correctness fix. The
problem of the original commit was that this:

bic r5, r7, #31
cbz r5, .LBB2_10

got rewritten into:

lsrs  r5, r7, #5
beq .LBB2_10

The result in destination register r5 is not the same and this is incorrect
when r5 is not dead. So this fix includes checking the uses of the AND
destination register. And also, compared to the original commit, some regression
tests didn't need changing anymore because of this extra check.

For completeness, this was the original commit message:

For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more
efficient instruction selection if the bitmask is one consecutive sequence of
set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and
set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and
set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit
into the sign bit with one LSLS and change the condition query from NE/EQ to
MI/PL (we could also implement this by shifting into the carry bit and
branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower
zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two
16-bit instructions but can elide the CMP and doesn't require materializing a
complex immediate, so is also a win.

Differential Revision: https://reviews.llvm.org/D27761

llvm-svn: 289794
2016-12-15 09:38:59 +00:00
Dylan McKay 4b028e2ee1 [AVR] Add argument indices to the instrumention hook functions
This allows the instrumention hook functions to do better
pretty-printing.

llvm-svn: 289793
2016-12-15 09:38:09 +00:00
Prakhar Bahuguna 13e9921ccc Fix for build warning in execute-only support
llvm-svn: 289788
2016-12-15 08:42:04 +00:00
Prakhar Bahuguna e640c6f765 Allow ELF section flags to be specified numerically
Summary:
GAS already allows flags for sections to be specified directly as a
numeric value. This functionality is particularly useful for setting
processor or application-specific values that may not be directly
supported or understood by LLVM. This patch allows LLVM to use numeric
section flag values verbatim if specified by the assembly file.

Reviewers: grosbach, rafael, t.p.northover, rengolin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27451

llvm-svn: 289785
2016-12-15 07:59:15 +00:00
Prakhar Bahuguna 52a7dd7d78 [ARM] Implement execute-only support in CodeGen
This implements execute-only support for ARM code generation, which
prevents the compiler from generating data accesses to code sections.
The following changes are involved:

* Add the CodeGen option "-arm-execute-only" to the ARM code generator.
* Add the clang flag "-mexecute-only" as well as the GCC-compatible
  alias "-mpure-code" to enable this option.
* When enabled, literal pools are replaced with MOVW/MOVT instructions,
  with VMOV used in addition for floating-point literals. As the MOVT
  instruction is required, execute-only support is only available in
  Thumb mode for targets supporting ARMv8-M baseline or Thumb2.
* Jump tables are placed in data sections when in execute-only mode.
* The execute-only text section is assigned section ID 0, and is
  marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'.
  This also overrides selection of ELF sections for globals.

llvm-svn: 289784
2016-12-15 07:59:08 +00:00
Sanjoy Das 93b1de0f8c Add missing -mtriple to MIR test case
llvm-svn: 289779
2016-12-15 07:13:50 +00:00
Yaxun Liu 6f8d90999e Attempt to fix llvm-readobj crash on ppc64 due to r289674
llvm-svn: 289777
2016-12-15 06:59:23 +00:00
Daniel Jasper befe7a3fc4 Fix go bindings after r289702 (hopefully, don't really know how to build
them, build.sh seems to be broken).

llvm-svn: 289775
2016-12-15 06:54:29 +00:00
Kostya Serebryany 628b43aab6 [libFuzzer] enable the failure-resistant merge by default (with trace-pc-guard only)
llvm-svn: 289772
2016-12-15 06:21:21 +00:00
Dylan McKay dc58eb543f [AVR] Whitelist the avrlit config environment variables
This allows us to use `lit` to run on-target execution tests.

llvm-svn: 289769
2016-12-15 06:04:53 +00:00
Hal Finkel f19e114237 Revert part of r289765 that is not necessary
CS.doesNotAccessMemory(ArgNo) and CS.onlyReadsMemory(ArgNo) calls
dataOperandHasImpliedAttr, so revert this part of r289765 because
it should not be necessary.

llvm-svn: 289768
2016-12-15 05:50:45 +00:00
Hal Finkel 34f9d6ac11 Trying to fix NDEBUG build after r289764
llvm-svn: 289766
2016-12-15 05:33:19 +00:00
Hal Finkel 39fed399e1 Fix argument attribute queries with bundle operands
When iterating over data operands in AA, don't make argument-attribute-specific
queries on bundle operands. Trying to fix self hosting...

llvm-svn: 289765
2016-12-15 05:09:15 +00:00
Sanjoy Das d7389d6261 [MachineBlockPlacement] Don't make blocks "uneditable"
Summary:
This fixes an issue with MachineBlockPlacement due to a badly timed call
to `analyzeBranch` with `AllowModify` set to true.  The timeline is as
follows:

 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls
    `TailDup.shouldTailDuplicate` on its argument, which in turn calls
    `analyzeBranch` with `AllowModify` set to true.

 2. This `analyzeBranch` call edits the terminator sequence of the block
    based on the physical layout of the machine function, turning an
    unanalyzable non-fallthrough block to a unanalyzable fallthrough
    block.  Normally MBP bails out of rearranging such blocks, but this
    block was unanalyzable non-fallthrough (and thus rearrangeable) the
    first time MBP looked at it, and so it goes ahead and decides where
    it should be placed in the function.

 3. When placing this block MBP fails to analyze and thus update the
    block in keeping with the new physical layout.

Concretely, before (1) we have something like:

```
LBL0:
  < unknown terminator op that may branch to LBL1 >
  jmp LBL1

LBL1:
  ... A

LBL2:
  ... B
```

In (2), analyze branch simplifies this to

```
LBL0:
  < unknown terminator op that may branch to LBL2 >
  ;; jmp LBL1 <- redundant jump removed

LBL1:
  ... A

LBL2:
  ... B
```

In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2
after the first block since that is profitable.

```
LBL0:
  < unknown terminator op that may branch to LBL2 >
  ;; jmp LBL1 <- redundant jump

LBL2:
  ... B

LBL1:
  ... A
```

and the program now has incorrect behavior (we no longer fall-through
from `LBL0` to `LBL1`) because MBP can no longer edit LBL0.

There are several possible solutions, but I went with removing the teeth
off of the `analyzeBranch` calls in TailDuplicator.  That makes thinking
about the result of these calls easier, and breaks nothing in the lit
test suite.

I've also added some bookkeeping to the MachineBlockPlacement pass and
used that to write an assert that would have caught this.

Reviewers: chandlerc, gberry, MatzeB, iteratee

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27783

llvm-svn: 289764
2016-12-15 05:08:57 +00:00
Craig Topper ab5f355d8c [AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts.
llvm-svn: 289759
2016-12-15 03:49:45 +00:00
Hal Finkel 321053a7ca Fix iterator-invalidation issue
Inserting a new key into a DenseMap potentially invalidates iterators into that
map. Trying to fix an issue from r289755 triggering this assertion:

  Assertion `isHandleInSync() && "invalid iterator access!"' failed.

llvm-svn: 289757
2016-12-15 03:30:40 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Hal Finkel cb9f78e1c3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
Eli Friedman db07ebbab6 Add testcases for some shuffle bugs.
See https://llvm.org/bugs/show_bug.cgi?id=31301 and
https://llvm.org/bugs/show_bug.cgi?id=31364 .

llvm-svn: 289751
2016-12-15 01:47:15 +00:00
Nico Weber d43d3ba5cd Fix test/tools/lto/hide-linkonce-odr.ll after r289719
llvm-svn: 289750
2016-12-15 01:31:38 +00:00
Justin Lebar 7853d3b9dd [NVPTX] Remove dead #defines from NVPTXUtilities.h.
llvm-svn: 289747
2016-12-15 00:45:06 +00:00
Joerg Sonnenberger 400e7b7811 Use PIC relocation model as default for PowerPC64 ELF.
Most of the PowerPC64 code generation for the ELF ABI is already PIC.
There are four main exceptions:
(1) Constant pointer arrays etc. should in writeable sections.
(2) The TOC restoration NOP after a call is needed for all global
symbols. While GNU ld has a workaround for questionable GCC self-calls,
we trigger the checks for calls from COMDAT sections as they cross input
sections and are therefore not considered self-calls. The current
decision is questionable and suboptimal, but outside the scope of the
change.
(3) TLS access can not use the initial-exec model.
(4) Jump tables should use relative addresses. Note that the current
encoding doesn't work for the large code model, but it is more compact
than the default for any non-trivial jump table. Improving this is again
beyond the scope of this change.

At least (1) and (3) are assumptions made in target-independent code and
introducing additional hooks is a bit messy. Testing with clang shows
that a -fPIC binary is 600KB smaller than the corresponding -fno-pic
build. Separate testing from improved jump table encodings would explain
only about 100KB or so. The rest is expected to be a result of more
aggressive immediate forming for -fno-pic, where the -fPIC binary just
uses TOC entries.

This change brings the LLVM output in line with the GCC output, other
PPC64 compilers like XLC on AIX are known to produce PIC by default
as well. The relocation model can still be provided explicitly, i.e.
when using MCJIT.

One test case for case (1) is included, other test cases with relocation
mode sensitive behavior are wired to static for now. They will be
reviewed and adjusted separately.

Differential Revision: https://reviews.llvm.org/D26566

llvm-svn: 289743
2016-12-15 00:01:53 +00:00
Justin Lebar a091da75b2 [AMDGPU] Fix runtime-metadata.ll test so it doesn't leave an object file in the source tree.
llvm-svn: 289742
2016-12-14 23:24:43 +00:00
Justin Lebar a54f4d7052 [NVPTX] Remove dead code.
I've chosen to remove NVPTXInstrInfo::CanTailMerge but not
NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead)
because while the latter two are reasonably useful utilities, the former
cannot be used safely: It relies on successful address space inference
to identify writes to shared memory, but addrspace inference is a
best-effort thing.

llvm-svn: 289740
2016-12-14 23:20:40 +00:00
Sanjay Patel afee21a5b2 [DAG] allow more select folding for targets that have 'and not' (PR31175)
The original motivation for this patch comes from wanting to canonicalize 
more IR to selects and also canonicalizing min/max.

If we're going to do that, we need more backend fixups to undo select codegen 
when simpler ops will do. I chose AArch64 for the tests because that shows the
difference in the simplest way. This should fix:
https://llvm.org/bugs/show_bug.cgi?id=31175

Differential Revision: https://reviews.llvm.org/D27489

llvm-svn: 289738
2016-12-14 22:59:14 +00:00
Davide Italiano 1ebbd176b3 [gold] Add datalayout to two tests where it was missing.
Reported by: thakis via chromium bots.

llvm-svn: 289737
2016-12-14 22:53:43 +00:00
Eugene Zelenko f9f8c68290 [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289736
2016-12-14 22:50:46 +00:00
Greg Clayton 52fe1f68c8 Add the ability to get attribute values as Optional<T>
When getting attributes it is sometimes nicer to use Optional<T> some of the time instead of magic values. I tried to cut over to only using the Optional values but it made many of the call sites very messy, so it makes sense the leave in the calls that can return a default value. Otherwise code that looks like this:

uint64_t CallColumn = Die.getAttributeValueAsAddress(DW_AT_call_line, 0);

Has to be turned into:

uint64_t CallColumn = 0;
if (auto CallColumnValue = Die.getAttributeValueAsAddress(DW_AT_call_line))
    CallColumn = *CallColumnValue;

The first snippet of code looks much better. But in cases where you want an offset that may or may not be there, the following code looks better:

if (auto StmtOffset = Die.getAttributeValueAsSectionOffset(DW_AT_stmt_list)) {
  // Use StmtOffset
}

Differential Revision: https://reviews.llvm.org/D27772

llvm-svn: 289731
2016-12-14 22:38:08 +00:00
Justin Lebar e7bbf7fde3 Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll.
llvm-svn: 289730
2016-12-14 22:32:55 +00:00
Justin Lebar 19bf9d2b6d [NVPTX] Support .maxnreg annotation.
Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D27638

llvm-svn: 289729
2016-12-14 22:32:50 +00:00
Justin Lebar e6867085fa [NVPTX] Remove string constants from NVPTXBaseInfo.h.
Summary:
Previously they were defined as a 2D char array in a header file.  This
is kind of overkill -- we can let the linker lay out these strings
however it pleases.  While we're at it, we might as well just inline
these constants where they're used, as each of them is used only once.

Also move NVPTXUtilities.{h,cpp} into namespace llvm.

Reviewers: tra

Subscribers: jholewinski, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D27636

llvm-svn: 289728
2016-12-14 22:32:44 +00:00
Peter Collingbourne b677fe00fb LibDriver: Reject inputs that are not COFF objects or bitcode files.
Fixes PR31372.

Differential Revision: https://reviews.llvm.org/D27776

llvm-svn: 289726
2016-12-14 22:19:22 +00:00
Dehao Chen 40dd8c5109 Only sets profile summary when it was not preset.
Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass.

Reviewers: eraman, davidxl

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27733

llvm-svn: 289725
2016-12-14 22:06:49 +00:00
Dehao Chen fb699619a0 Fix the bug in r289714 (NFC).
llvm-svn: 289724
2016-12-14 22:03:08 +00:00
Jan Sjodin 071b0fa06a Revert revision 289721.
llvm-svn: 289723
2016-12-14 21:58:42 +00:00
Jan Sjodin 9419021bba Dummy commit.
llvm-svn: 289721
2016-12-14 21:57:18 +00:00
Davide Italiano ebed410ca0 [LTO] Add the missing datalayout in a test.
llvm-svn: 289720
2016-12-14 21:57:14 +00:00
Davide Italiano 2ceb628f36 [LTO] Reject modules without datalayout.
Also, udpate the ~60 failing tests in the tree which did
not contain a valid datalayout.
This fixes PR31123. lld will be updated in a following patch,
immediately after this is committed.

Differential Revision:  https://reviews.llvm.org/D27082

llvm-svn: 289719
2016-12-14 21:57:04 +00:00
Filipe Cabecinhas dd9688703c [asan] Don't skip instrumentation of masked load/store unless we've seen a full load/store on that pointer.
Reviewers: kcc, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27625

llvm-svn: 289718
2016-12-14 21:57:04 +00:00
Filipe Cabecinhas 1e69017a6d [asan] Hook ClInstrumentWrites and ClInstrumentReads to masked operation instrumentation.
Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27548

llvm-svn: 289717
2016-12-14 21:56:59 +00:00
Dehao Chen a99e082e15 Create SampleProfileLoader pass in llvm instead of clang
Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289714
2016-12-14 21:40:47 +00:00
Eli Friedman cbed30c501 [ARM] Split 128-bit vectors in BUILD_VECTOR lowering
Given that INSERT_VECTOR_ELT operates on D registers anyway, combining
64-bit vectors into a 128-bit vector is basically free. Therefore, try
to split BUILD_VECTOR nodes before giving up and lowering them to a series
of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically
better lowerings; see testcases for examples. Inspired by similar code
in the x86 backend for AVX.

Differential Revision: https://reviews.llvm.org/D27624

llvm-svn: 289706
2016-12-14 20:44:38 +00:00
Nico Weber 53816d074d fix gcc warning about a superfluous ;
llvm-svn: 289705
2016-12-14 20:33:54 +00:00
Robert Lougher cfd7198698 [InstCombine] Folding of a compare with RHS const should merge debug locations
If all the operands to a phi node are compares that have a RHS constant,
instcombine will try to pull them through the phi node, combining them into
a single operation. When it does this, the debug location of the new op
should be the merged debug locations of the phi node arguments.

Patch 8 of 8 for D26256.  Folding of a compare that has a RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289704
2016-12-14 20:27:22 +00:00
Eli Friedman 10576e73c9 [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.

Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).

Differential Revision: https://reviews.llvm.org/D27694

llvm-svn: 289703
2016-12-14 20:25:26 +00:00
Amjad Aboud 43c8b6b7b2 [DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory.
This way it will be easier to expand DIFile (e.g., to contain checksum) without the need to modify the createCompileUnit() API.

Reviewers: llvm-commits, rnk

Differential Revision: https://reviews.llvm.org/D27762

llvm-svn: 289702
2016-12-14 20:24:54 +00:00
Yaxun Liu 04334b527d Fix build failure due to r289674 on certain systems
Removed a useless include which caused conflict.

llvm-svn: 289700
2016-12-14 20:17:47 +00:00
Robert Lougher c9f7354776 [InstCombine] Folding of a binop with RHS const should merge the debug locations
If all the operands to a phi node are a binop with a RHS constant, instcombine
will try to pull them through the phi node, combining them into a single
operation. When it does this, the debug location of the new op should be the
merged debug locations of the phi node arguments.

Patch 7 of 8 for D26256.  Folding of a binop with RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289699
2016-12-14 20:07:49 +00:00
David Blaikie b461468958 DebugInfo: Improve type safety and simplify some subprogram finalization code
This probably ended up this way aften the subprogram<>function link
inversion and debug info metadata schema changes.

llvm-svn: 289697
2016-12-14 19:38:39 +00:00
Geoff Berry ca11a1e147 [GVNHoist] Move GVNHoist to function simplification part of pipeline.
Summary:
Move GVNHoist to later in the optimization pipeline, specifically, to
the function simplification part of the pipeline.  The new pipeline
location allows GVNHoist to run on a function after its callees have
been inlined but before the function has been considered for inlining
into its callers, exposing more opportunities for hoisting.

Performance results on AArch64 kryo:
Improvements:
  Benchmarks/CoyoteBench/fftbench  -24.952%
  spec2006/bzip2                    -4.071%
  internal bmark                    -3.177%
  Benchmarks/PAQ8p/paq8p            -1.754%
  spec2000/perlbmk                  -1.328%
  spec2006/h264ref                  -1.140%

Regressions:
  internal bmark                    +1.818%
  Benchmarks/mafft/pairlocalalign   +1.084%

Reviewers: sebpop, dberlin, hiraditya

Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27722

llvm-svn: 289696
2016-12-14 19:38:22 +00:00
Andrew Kaylor ce3bcae632 [WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements
Differential Revision: https://reviews.llvm.org/D27693

llvm-svn: 289694
2016-12-14 19:30:18 +00:00
Robert Lougher f02d9b8325 [InstCombine] When folding casts through a phi node merge the debug locations
If all the operands to a phi node are a cast, instcombine will try to pull
them through the phi node, combining them into a single cast. When it does
this, the debug location of the new cast should be the merged debug locations
of the phi node arguments.

Patch 6 of 8 for D26256.  Folding of a cast operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289693
2016-12-14 19:24:01 +00:00
Sean Callanan 62204ad74a Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots.
llvm-svn: 289691
2016-12-14 19:19:53 +00:00
Sean Callanan 032dbf9ee3 Prepare PrettyStackTrace for LLDB adoption
This patch fixes the linkage for __crashtracer_info__, making it have the proper mangling (extern "C") and linkage (private extern).
It also adds a new PrettyStackTrace type, allowing LLDB to adopt this instead of Host::SetCrashDescriptionWithFormat().

Without this patch, CrashTracer on macOS won't pick up pretty stack traces from any LLVM client. 
An LLDB commit adopting this API will follow shortly.

Differential Revision: https://reviews.llvm.org/D27683

llvm-svn: 289689
2016-12-14 19:09:43 +00:00
Robert Lougher 373e36a410 [InstCombine] Folding loads through a phi node should merge the debug locations
If all the operands to a phi node are a load, instcombine will try to pull
them through the phi node, combining them into a single load. When it does
this, the debug location of the new load should be the merged debug locations
of the phi node arguments.

Patch 5 of 8 for D26256.  Folding of a load operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289688
2016-12-14 19:02:14 +00:00
Robert Lougher 8fc1e89bbb [InstCombine] When folding GEP through a phi node merge the debug locations
If all the operands to a phi node are getelementptr, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the new getelementptr
should be the merged debug locations of the phi node arguments.

Patch 4 of 8 for D26256.  Folding of a getelementptr operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289684
2016-12-14 18:37:50 +00:00
Eric Christopher ba1024cfb8 This change does two things:
Adds a "Discriminator" field to struct DILineInfo, which defaults to 0.
Fills out the "Discriminator" field in DILineInfo in DWARFDebugLine::LineTable::getFileLineInfoForAddress().

in order to have a slightly nicer interface in getFileLineInfoForAddress.

Patch by Simon Que!

Differential Revision: https://reviews.llvm.org/D27649

llvm-svn: 289683
2016-12-14 18:29:39 +00:00
Robert Lougher 4b0790d488 [InstCombine] Merge debug locations when folding through a phi node
If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 3 of 8 for D26256.  Folding of a compare operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289681
2016-12-14 18:14:57 +00:00
Kostya Serebryany d9d9a54511 [libFuzzer] disable msan for one more hook that reads target's data that might be uninitialized
llvm-svn: 289680
2016-12-14 18:13:02 +00:00
Robert Lougher 2428a4050f [InstCombine] Merge debug locations when folding through a phi node
If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 2 of 8 for D26256.  Folding of a binary operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289679
2016-12-14 17:49:19 +00:00
Dehao Chen 23025f8483 revert r289669 which breaks bots
llvm-svn: 289676
2016-12-14 17:23:16 +00:00
Yaxun Liu 07d659bc76 AMDGPU: Emit runtime metadata version 2 as YAML
Differential Revision: https://reviews.llvm.org/D25046

llvm-svn: 289674
2016-12-14 17:16:52 +00:00
Derek Schuff ebd8110aa1 lit.cfg: Check value of build config rather than converting to boolean
This is a CMake var which never evaluates to false.

llvm-svn: 289673
2016-12-14 17:05:34 +00:00
Matt Arsenault bdc0ac0a0e AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs
Since SGPRs should spill to VGPRs, they should be allocated first.
I don't think this is sufficient for SGPRs to always spill to
VGPRs though.

llvm-svn: 289671
2016-12-14 16:52:06 +00:00
Dehao Chen cb61c94d87 Create SampleProfileLoader pass in llvm instead of clang
Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289669
2016-12-14 16:49:28 +00:00
Nirav Dave f5bf03c7ef Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
Reverting due to ARM MCJIT and MIPS LLD error.

This reverts commit r289659.

llvm-svn: 289667
2016-12-14 16:43:44 +00:00
Matt Arsenault ebfba7027e AMDGPU: Change vintrp printing
llvm-svn: 289664
2016-12-14 16:36:12 +00:00
Derek Schuff 112b303905 Revert gold part of change, just liblto
llvm-svn: 289663
2016-12-14 16:20:25 +00:00
Derek Schuff 0c2796dc36 Disable libLTO tests when libLTO is not built
Summary:
The current test only checks whether ld64 is available, causing tests
to fail when ld64 is avilable but libLTO is not built.

Reviewers: beanz, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27739

llvm-svn: 289662
2016-12-14 16:20:22 +00:00
Robert Lougher 7bd04e3b2d New API for merging debug locations. NFC.
Given two debug locations the function getMergedLocation combines the
locations into a single location (which may be an empty location).
Please see https://reviews.llvm.org/D26256 for the discussion leading
up to this API.

Note the function is currently a stub.  This allows optimisations to
use the API although no location will actually be used.

This is patch 1 out of 8 for D26256.  As suggested by David Blaikie,
each change in D26256 has been broken out into a separate patch.

llvm-svn: 289661
2016-12-14 16:14:17 +00:00
Nirav Dave 8527ab0ad2 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after fixing after removing load-store factoring through
token factors in favor of improved token factor operand pruning

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.

Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).

Additional Minor Changes:

   1. Finishes removing unused AliasLoad code
   2. Unifies the the chain aggregation in the merged stores across
      code paths
   3. Re-add the Store node to the worklist after calling
      SimplifyDemandedBits.
   4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
      arbitrary, but seemed sufficient to not cause regressions in
      tests.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations

Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289659
2016-12-14 15:44:26 +00:00
Simon Pilgrim facbd35696 Wdocumentation fix
llvm-svn: 289655
2016-12-14 15:14:44 +00:00
Simon Pilgrim 05ab8ffc7e [DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2
Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.

Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.

Differential Revision: https://reviews.llvm.org/D27714

llvm-svn: 289654
2016-12-14 15:08:13 +00:00
Michael Zuckerman 1ce2a23a1e Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction
adding new optimization opportunity by adding new X86ISelLowering pattern. The test case was shown in https://llvm.org/bugs/show_bug.cgi?id=30945.

Test explanation:
Select gets three arguments mask, op and op2. In this case, the Mask is a result of ICMP. The ICMP instruction compares (with equal operand) the zero initializer vector and the result of the first ICMP.

In general, The result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By rearranging of the two arguments inside the Select instruction, we can get the same result. Without the necessary of the middle phase ("cmp eq, op1, zero initializers").

Missed optimization opportunity: 
vpcmpled %zmm0, %zmm1, %k0
knotw %k0, %k1

can be combine to 
vpcmpgtd %zmm0, %zmm2, %k1

Reviewers: 
1. delena
2. igorb 

Commited after check all 
Differential Revision: https://reviews.llvm.org/D27160

llvm-svn: 289653
2016-12-14 14:57:10 +00:00
Simon Pilgrim ebe58191c8 [X86][SSE] Add AVX1 tests to sdiv/udiv srem/urem combine tests
As requested on D27714

llvm-svn: 289652
2016-12-14 14:39:51 +00:00
Renato Golin ce1dd3c949 Revert "[AVR] Add the very first on-target test"
This reverts commit r289648, as it's an execution test and relies on the
emulator/dispatcher being available on all builders.

llvm-svn: 289651
2016-12-14 13:24:20 +00:00
Stephan Bergmann 7d94d54a36 Adapt to recent APFloat change
llvm-svn: 289649
2016-12-14 12:11:35 +00:00
Dylan McKay 452e266cd6 [AVR] Add the very first on-target test
This test runs on actual AVR hardware.

llvm-svn: 289648
2016-12-14 12:03:39 +00:00
Stephan Bergmann 17c7f70362 Replace APFloatBase static fltSemantics data members with getter functions
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.

Differential Revision: https://reviews.llvm.org/D26671

llvm-svn: 289647
2016-12-14 11:57:17 +00:00
Artur Pilipenko f3ee444010 Add a couple of assertions to the load combine code introduced by r289538
llvm-svn: 289646
2016-12-14 11:55:47 +00:00
Dylan McKay cfd1ce6a52 [AVR] Add the integrated testing tool to the .gitignore
We build it as an LLVM tool.

llvm-svn: 289645
2016-12-14 11:47:14 +00:00
Oliver Stannard 268f42f1ce [Assembler] Better error messages for .org directive
Currently, the error messages we emit for the .org directive when the
expression is not absolute or is out of range do not include the line
number of the directive, so it can be hard to track down the problem if
a file contains many .org directives.

This patch stores the source location in the MCOrgFragment, so that it
can be used for diagnostics emitted during layout.

Since layout is an iterative process, and the errors are detected during
each iteration, it would have been possible for errors to be reported
multiple times. To prevent this, I've made the assembler bail out after
each iteration if any errors have been reported. This will still allow
multiple unrelated errors to be reported in the common case where they
are all detected in the first round of layout.

Differential Revision: https://reviews.llvm.org/D27411

llvm-svn: 289643
2016-12-14 10:43:58 +00:00
Dylan McKay 3abd1d3e12 [AVR] Add a function instrumentation pass
This will be used for an on-chip test suite.

llvm-svn: 289641
2016-12-14 10:15:00 +00:00
Craig Topper aeaa52cc11 [X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics.
llvm-svn: 289639
2016-12-14 07:46:12 +00:00
Hal Finkel 065b756528 [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)
This change aims to unify and correct our logic for when we need to allow for
the possibility of the linker adding a TOC restoration instruction after a
call. This comes up in two contexts:

 1. When determining tail-call eligibility. If we make a tail call (i.e.
    directly branch to a function) then there is no place for the linker to add
    a TOC restoration.
 2. When determining when we need to add a nop instruction after a call.
    Likewise, if there is a possibility that the linker might need to add a
    TOC restoration after a call, then we need to put a nop after the call
    (the bl instruction).

First problem: We were using similar, but different, logic to decide (1) and
(2). This is just wrong. Both the resideInSameModule function (used when
determining tail-call eligibility) and the isLocalCall function (used when
deciding if the post-call nop is needed) were supposed to be determining the
same underlying fact (i.e. might a TOC restoration be needed after the call).
The same logic should be used in both places.

Second problem: The logic in both places was wrong. We only know that two
functions will share the same TOC when both functions come from the same
section of the same object. Otherwise the linker might cause the functions to
use different TOC base addresses (unless the multi-TOC linker option is
disabled, in which case only shared-library boundaries are relevant). There are
a number of factors that can cause functions to be placed in different sections
or come from different objects (-ffunction-sections, explicitly-specified
section names, COMDAT, weak linkage, etc.). All of these need to be checked.
The existing logic only checked properties of the callee, but the properties of
the caller must also be checked (for example, calling from a function in a
COMDAT section means calling between sections).

There was a conceptual error in the resideInSameModule function in that it
allowed tail calls to functions with weak linkage and protected/hidden
visibility. While protected/hidden visibility does prevent the function
implementation from being replaced at runtime (via interposition), it does not
prevent the linker from using an alternate implementation at link time (i.e.
using some strong definition to replace the provided weak one during linking).
If this happens, then we're still potentially looking at a required TOC
restoration upon return.

Otherwise, in general, the post-call nop is needed wherever ELF interposition
needs to be supported. We don't currently support ELF interposition at the IR
level (see http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
for more information), and I don't think we should try to make it appear to
work in the backend in spite of that fact. This will yield subtle bugs if
interposition is attempted. As a result, regardless of whether we're in PIC
mode, we don't assume that we need to add the nop to support the possibility of
ELF interposition. However, the necessary check is in place (i.e. calling
GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions for
which interposition is allowed at the IR level, we'll add the nop as necessary.
In the mean time, we'll generate more tail calls and fewer nops when compiling
position-independent code.

Differential Revision: https://reviews.llvm.org/D27231

llvm-svn: 289638
2016-12-14 07:24:50 +00:00
Craig Topper 268b3abe6d [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better.
Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking.

llvm-svn: 289636
2016-12-14 06:06:58 +00:00
Craig Topper dfd268d76b [X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts.
Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly.

llvm-svn: 289632
2016-12-14 05:43:05 +00:00
Mehdi Amini 8e13bc4562 [ThinLTO] Add an API to trigger file-based API for returning objects to the linker
Summary:
The motivation is to support better the -object_path_lto option on
Darwin. The linker needs to write down the generate object files on
disk for later use by lldb or dsymutil (debug info are not present
in the final binary). We're moving this into libLTO so that we can
be smarter when a cache is enabled and hard-link when possible
instead of duplicating the files.

Reviewers: tejohnson, deadalnix, pcc

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D27507

llvm-svn: 289631
2016-12-14 04:56:42 +00:00
Craig Topper eb6a20e79e [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

llvm-svn: 289629
2016-12-14 03:17:30 +00:00
Craig Topper a0372dec26 [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

llvm-svn: 289628
2016-12-14 03:17:27 +00:00
Mehdi Amini 76a00b51f0 Don't double-initialize cl::opt for iterating in reverse order to uncover non-determinism in codegen by default
Bots are broken and needs to be fixed before having this on by default.
The feature was committed in r289619.

I tried to disable it in r289624 and failed because it was initialized in two places.

llvm-svn: 289626
2016-12-14 02:35:32 +00:00
Mehdi Amini fd1184efb5 Disable Iterating SmallPtrSet in reverse order to uncover non-determinism in codegen by default
Bots are broken and needs to be fixed before having this on by default.
The feature was committed in r289619.

llvm-svn: 289624
2016-12-14 02:02:28 +00:00
Kostya Serebryany 8efb35b4cb [libFuzzer] document one more desired feature of a fuzz target
llvm-svn: 289622
2016-12-14 01:31:21 +00:00
Peter Collingbourne 1a0720e8c4 LTO: Add support for multi-module bitcode files.
Differential Revision: https://reviews.llvm.org/D27313

llvm-svn: 289621
2016-12-14 01:17:59 +00:00
Paul Robinson 8fec3da00c [DWARF] Preserve column number when emitting 'line 0' record
Follow-up to r289256, address a FIXME to avoid resetting the column
number. This reduced .debug_line by 2.6% in a RelWithDebInfo
self-build of clang.

llvm-svn: 289620
2016-12-14 00:27:35 +00:00
Mandeep Singh Grang f6b069c7db [llvm] Iterate SmallPtrSet in reverse order to uncover non-determinism in codegen
Summary:
Given a flag (-mllvm -reverse-iterate) this patch will enable iteration of SmallPtrSet in reverse order.
The idea is to compile the same source with and without this flag and expect the code to not change.
If there is a difference in codegen then it would mean that the codegen is sensitive to the iteration order of SmallPtrSet.
This is enabled only with LLVM_ENABLE_ABI_BREAKING_CHECKS.

Reviewers: chandlerc, dexonsmith, mehdi_amini

Subscribers: mgorny, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D26718

llvm-svn: 289619
2016-12-14 00:15:57 +00:00
Evandro Menezes 54eb192b25 [ARM] Fix typo in checking prefix
llvm-svn: 289617
2016-12-14 00:02:03 +00:00
Evandro Menezes aeec780e42 Add support for Samsung Exynos M3 (NFC)
llvm-svn: 289613
2016-12-13 23:31:41 +00:00
Greg Clayton 74c265e537 Update the header docs to match a recent checkin.
llvm-svn: 289612
2016-12-13 23:22:53 +00:00
Greg Clayton 1cbf3fa94a Switch functions that returned bool and filled in a DWARFFormValue arg with ones that return Optional<DWARFFormValue>
Differential Revision: https://reviews.llvm.org/D27737

llvm-svn: 289611
2016-12-13 23:20:56 +00:00
Peter Collingbourne 98d40e0557 llvm-cat: Allow bitcode files to be created with no modules.
llvm-svn: 289610
2016-12-13 23:14:55 +00:00
Chris Bieneman da1c84c01e [llvm-config] Fixing one check where shared libs implied dylib
We shouldn't print the dylib if LinkDylib is false.

llvm-svn: 289609
2016-12-13 23:08:52 +00:00
Derek Schuff 7ff587a96d llvm-config: Set LinkMode in addition to LinkDyLib when using --ignore-llvm
Summary:
LinkDyLib is only used (before arg processing) to set up the default for
LinkMode. So reset LinkMode as well, and process before --link-shared or
--link-static to allow those flags to continue to override it.

Reviewers: beanz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27736

llvm-svn: 289608
2016-12-13 23:01:53 +00:00
Kostya Serebryany f6f82c2cc8 [libFuzzer] fix an UB (invalid shift) spotted by ubsan. The code worked fine by luck, because the way shifts actually work on clang+x86
llvm-svn: 289607
2016-12-13 22:49:14 +00:00
Chris Bieneman 7f6611cf3e [llvm-config] Add --ignore-libllvm
This flag forces off linking libLLVM. This should resolve some issues reported on llvm-commits.

llvm-svn: 289605
2016-12-13 22:17:59 +00:00
Eugene Zelenko 8208592707 [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289604
2016-12-13 22:13:50 +00:00
Dehao Chen 0f35fa907d Change CoverageTracker from a global variable to member variable to avoid breaking thread-safety. (NFC)
llvm-svn: 289603
2016-12-13 22:13:18 +00:00
Sanjoy Das c02dda2ab9 Re-land "[SCEVExpander] Use llvm data structures; NFC"
This change re-lands r289215, by reverting r289482.  The underlying
issue that caused it to be reverted has been fixed by Tim Northover in
r289496.

Original commit message for r289215:

[SCEVExpander] Use llvm data structures; NFC

Original commit message for r289482:

Revert "[SCEVExpander] Use llvm data structures; NFC"

This reverts r289215 (git SHA1 cb7b86a1).  It breaks the ubsan build
because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it
tries to cast the empty and tombstone keys to `T *` (due to insufficient
alignment).

This is the relevant stack trace (thanks to Mike Aizatsky):

    #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39
    #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19
    #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23

llvm-svn: 289602
2016-12-13 22:04:58 +00:00
Anna Thomas 65ca8e91cc [IRCE] Avoid loop optimizations on pre and post loops
Summary:
This patch will add loop metadata on the pre and post loops generated by IRCE.
Currently, we have metadata for disabling optimizations such as vectorization,
unrolling, loop distribution and LICM versioning (and confirmed that these
optimizations check for the metadata before proceeding with the transformation).

The pre and post loops generated by IRCE need not go through loop opts (since
these are slow paths).

Added two test cases as well.

Reviewers: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26806

llvm-svn: 289588
2016-12-13 21:05:21 +00:00
Michael Kuperstein 3d23d4a234 [LV] Don't vectorize when we have a small static bound on trip count
We currently check if the exact trip count is known and is smaller than the
"tiny loop" bound. We should be checking the maximum bound on the trip count
instead.

Differential Revision: https://reviews.llvm.org/D27690

llvm-svn: 289583
2016-12-13 20:38:18 +00:00
Peter Collingbourne b56a103462 ADT: Use delete[] to delete the array owned by OwningArrayRef, as we created it with new[].
llvm-svn: 289582
2016-12-13 20:30:12 +00:00
Peter Collingbourne d9af29969a ADT: Add OwningArrayRef class.
This is a MutableArrayRef that owns its array.
I plan to use this in D22296.

Differential Revision: https://reviews.llvm.org/D27723

llvm-svn: 289579
2016-12-13 20:24:24 +00:00
Peter Collingbourne 45102a24c7 Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules.
This implements multi-module support in IRObjectFile.

Differential Revision: https://reviews.llvm.org/D26951

llvm-svn: 289578
2016-12-13 20:20:17 +00:00
Peter Collingbourne c5fecb4f1a Object: Remove module accessors from IRObjectFile, and hide its constructor.
Differential Revision: https://reviews.llvm.org/D27079

llvm-svn: 289577
2016-12-13 20:10:22 +00:00
Peter Collingbourne 77f4c30d6f LTO: Port the legacy LTO API to ModuleSymbolTable.
Differential Revision: https://reviews.llvm.org/D27078

llvm-svn: 289576
2016-12-13 20:01:58 +00:00
Peter Collingbourne ad90369a94 LTO: Port the new LTO API to ModuleSymbolTable.
Differential Revision: https://reviews.llvm.org/D27077

llvm-svn: 289574
2016-12-13 19:43:49 +00:00
Alina Sbirlea 77c5eaaeda Generalize strided store pattern in interleave access pass
Summary:
This patch aims to generalize matching of the strided store accesses to more general masks.
The more general rule is to have consecutive accesses based on the stride:
[x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...]
All elements in the masks need not form a contiguous space, there may be gaps.
As before, undefs are allowed and filled in with adjacent element loads.

Reviewers: HaoLiu, mssimpso

Subscribers: mkuper, delena, llvm-commits

Differential Revision: https://reviews.llvm.org/D23646

llvm-svn: 289573
2016-12-13 19:32:36 +00:00
Matthias Braun fde00fc252 Revert "AArch64CollectLOH: Rewrite as block-local analysis."
This is not always behaving as expected as it turns out block live-in
lists are only correct most of the time. Still waiting for reviews on
https://reviews.llvm.org/D27559 to have them correct all of the time.

See also http://llvm.org/PR31361, rdar://25117107

This reverts commit r288567.
This reverts commit r288561.

llvm-svn: 289570
2016-12-13 19:08:17 +00:00
Alexei Starovoitov 3b9efca8e8 [bpf] change llvm-objdump to print dec instead of hex
since bpf instruction stream is multiple of 8 change llvm-objdump
to print decimal instruction number instead of hex address, so that
users don't have to do this math manually to match kernel verifier output

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 289569
2016-12-13 19:07:08 +00:00
Tim Northover fe7c59adb8 GlobalISel: fix GOT accesses on AArch64.
We were using the correct pseudo-instruction, but because the operand's flags
weren't set correctly we still ended up emitting incorrect relocations during
MC lowering.

llvm-svn: 289566
2016-12-13 18:25:38 +00:00
Greg Clayton c8c1032c0c Make a DWARFDIE class that can help avoid using the wrong DWARFUnit when extracting attributes
Many places pass around a DWARFDebugInfoEntryMinimal and a DWARFUnit. It is easy to get things wrong by using the wrong DWARFUnit with a DWARFDebugInfoEntryMinimal. This patch creates a DWARFDie class that contains the DWARFUnit and DWARFDebugInfoEntryMinimal objects so that they can't get out of sync. All attribute extraction has been moved out of DWARFDebugInfoEntryMinimal and into DWARFDie. DWARFDebugInfoEntryMinimal was also renamed to DWARFDebugInfoEntry.

DWARFDie objects are temporary objects that are used by clients and contain 2 pointers that you always need to have anyway. Keeping them grouped will avoid errors and simplify many of the attribute extracting APIs by not having to pass in a DWARFUnit.

Differential Revision: https://reviews.llvm.org/D27634

llvm-svn: 289565
2016-12-13 18:25:19 +00:00
Marcos Pividori c21b3c949d [libFuzzer] Add missing header needed for Windows.
llvm-svn: 289564
2016-12-13 17:46:48 +00:00
Marcos Pividori 7c1defd738 [libFuzzer] Avoid name collision with Windows API.
Windows uses some macros to replace DeleteFile() by DeleteFileA() or
DeleteFileW(). This was causing an error at link time.
DeleteFile was renamed to RemoveFile().

Differential Revision: https://reviews.llvm.org/D27577

llvm-svn: 289563
2016-12-13 17:46:40 +00:00
Marcos Pividori 67dfacdd80 [libFuzzer] Implement DirName() for Windows.
Implement DirName from scratch to avoid dependencies on external libraries.
It's based on MSDN documentation for Naming Files, Paths, and Namespaces.

The algorithm can't simply start from the end and look backwards for the
first separator, because we need to preserve the prefix that represent
the root location. We shouldn't remove anything there. In Windows we
have many different options, like:
 \\Server\Share\ , \ , C: , C:\ , \\?\C:\ , \\?\UNC\Server\Share\
We remove the last separator in the rest of the path, if it exists.

It was implemented to have a similar behaviour to dirname() in linux,
removing trailing separators, returning "." when the path doesn't
contain separators, etc.

Differential Revision: https://reviews.llvm.org/D27579

llvm-svn: 289562
2016-12-13 17:46:32 +00:00
Marcos Pividori 64d4147396 [libFuzzer] Fix bug in detecting timeouts when input string is empty.
I added a new flag RunningCB to know if the Fuzzer's main thread is
running the CB function, instead of using (!CurrentUnitSize).
(!CurrentUnitSize) doesn't work properly. For example, in FuzzerLoop.cpp,
inside ShuffleAndMinimize() function, we execute the callback with an
empty string (size=0). Previous implementation failed to detect timeouts
in that execution.
Also, I add a regression test for that case.

Differential Revision: https://reviews.llvm.org/D27433

llvm-svn: 289561
2016-12-13 17:46:25 +00:00
Marcos Pividori 178fe58745 [libFuzzer] Clean up headers and file formatting of LibFuzzer files.
Reorganize #includes to follow LLVM Coding Standards.
Include some missing headers. Required to use `Printf()`.

Aside from that, this patch contains no functional change.
It is purely a re-organization.

Differential Revision: https://reviews.llvm.org/D27363

llvm-svn: 289560
2016-12-13 17:46:11 +00:00
Marcos Pividori 6e3d885c79 [libFuzzer] Properly use unsigned for workers, jobs and NumberOfCpuCores.
std:🧵:hardware_concurrency() returns an unsigned, so I modify
NumberOfCpuCores() to return unsigned too.
The number of cpus is used to define the number of workers, so I decided
to update the worker and jobs flags to be declared as unsigned too.

Differential Revision: https://reviews.llvm.org/D27685

llvm-svn: 289559
2016-12-13 17:45:53 +00:00
Marcos Pividori 463f8bdd0b [libFuzzer] Properly use unsigned for Process ID.
Use unsigned for PID instead of signed int. GetCurrentProcessId() returns
an unsigned (DWORD) so we must be sure we can deal with all possible values.
I use a long unsigned to be sure it can hold a 32 bit unsigned (DWORD).

Differential Revision: https://reviews.llvm.org/D27281

llvm-svn: 289558
2016-12-13 17:45:44 +00:00
Marcos Pividori c59b692c85 [libFuzzer] Improve Signal Handler interface.
Add new flags to FuzzingOptions to represent the different conditions
on the signal handling. These options are passed when calling
SetSignalHandler().
This changes simplify the implementation of Windows's exception
handling. Now we can define a unique handler for all the exceptions.

Differential Revision: https://reviews.llvm.org/D27238

llvm-svn: 289557
2016-12-13 17:45:20 +00:00
Rong Xu 3462cac9af Fix the test cases committed in r289521.
llvm-svn: 289556
2016-12-13 17:34:29 +00:00
Simon Pilgrim 5f2db1351f [X86][SSE] Regenerate vector of pointers tests
llvm-svn: 289555
2016-12-13 17:22:39 +00:00
Zachary Turner bc48d20ef7 [ADT] Add llvm::StringLiteral.
StringLiteral is a wrapper around a string literal useful for
replacing global tables of char arrays with global tables of
StringRefs that can initialized in a constexpr context, avoiding
the invocation of a global constructor.

Differential Revision: https://reviews.llvm.org/D27686

llvm-svn: 289551
2016-12-13 17:03:49 +00:00
David Callahan ebcf916c5a [ADCE] Add code to remove dead branches
Summary:
This is last in of a series of patches to evolve ADCE.cpp to support
removing of unnecessary control flow.

This patch adds the code to update the control and data flow graphs
to remove the dead control flow.

Also update unit tests to test the capability to remove dead,
may-be-infinite loop which is enabled by the switch
-adce-remove-loops.

Previous patches:

D23824 [ADCE] Add handling of PHI nodes when removing control flow
D23559 [ADCE] Add control dependence computation
D23225 [ADCE] Modify data structures to support removing control flow
D23065 [ADCE] Refactor anticipating new functionality (NFC)
D23102 [ADCE] Refactoring for new functionality (NFC)

Reviewers: dberlin, majnemer, nadav, mehdi_amini

Subscribers: llvm-commits, david2050, freik, twoh

Differential Revision: https://reviews.llvm.org/D24918

llvm-svn: 289548
2016-12-13 16:42:18 +00:00
Artur Pilipenko 469fcd2afd Use more detailed assertion messages in the code introduced by r289538
llvm-svn: 289545
2016-12-13 16:26:15 +00:00
Artur Pilipenko 79d1255e26 Fix a buildbot failure introduced by r289538
Build failed because of unused variable in product mode.

llvm-svn: 289540
2016-12-13 14:55:31 +00:00
Artur Pilipenko c93cc5955f [DAGCombiner] Match load by bytes idiom and fold it into a single load
Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.

Assuming little endian target:
  i8 *a = ...
  i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
  i32 val = *((i32)a)

  i8 *a = ...
  i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
  i32 val = BSWAP(*((i32)a))

This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.

Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
  i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)

Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.

The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.

Reviewed By: hfinkel, RKSimon, filcab

Differential Revision: https://reviews.llvm.org/D26149

llvm-svn: 289538
2016-12-13 14:21:14 +00:00
Artur Pilipenko 01e86444a0 Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the upcoming user
llvm-svn: 289537
2016-12-13 14:16:02 +00:00
Simon Pilgrim 9dc67c0101 [SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI.
We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this.

llvm-svn: 289534
2016-12-13 13:36:27 +00:00
Simon Dardis c97cfb69ba [mips][rtdyld] Move MIPS relocation resolution to a subclass and implement N32 relocations
N32 relocations are only correct for individual relocations at the moment.
Support for relocation composition will follow in a later patch.

Patch By: Daniel Sanders

Reviwers: vkalintiris, atanasyan

Differential Revision: https://reviews.llvm.org/D27467

llvm-svn: 289532
2016-12-13 11:39:18 +00:00
Simon Dardis e8af792439 [mips] Fix comment to respect 80 chars per line; NFC
llvm-svn: 289530
2016-12-13 11:10:53 +00:00
Simon Dardis 43b5ce492d [mips] Fix compact branch hazard detection
In certain cases it is possible that transient instructions such as
%reg = IMPLICIT_DEF as a single instruction in a basic block to reach
the MipsHazardSchedule pass. This patch teaches MipsHazardSchedule to
properly look through such cases.

Reviewers: vkalintiris, zoran.jovanovic

Differential Revision: https://reviews.llvm.org/D27209

llvm-svn: 289529
2016-12-13 11:07:51 +00:00
Diana Picus 2d9adbf524 [GlobalISel] Move extendRegister where it belongs. NFCI
Apparently I missed this one when I moved ValueHandler back in r288658. Sorry!

llvm-svn: 289528
2016-12-13 10:46:12 +00:00
Craig Topper ac75bca1eb [X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly.
Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed.

Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support.

llvm-svn: 289523
2016-12-13 07:45:45 +00:00
NAKAMURA Takumi b8ea75a010 llvm/test/Transforms/PGOProfile/noreturncall.ll REQUIRES asserts due to -debug-only.
llvm-svn: 289522
2016-12-13 07:04:03 +00:00
Rong Xu 51a1e3c430 [PGO] Fix insane counts due to nonreturn calls
Summary:
Since we don't break BBs for function calls. We might get some insane counts 
(wrap of unsigned) in the presence of noreturn calls.

This patch sets these counts to zero instead of the wrapped number.

Reviewers: davidxl

Subscribers: xur, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D27602

llvm-svn: 289521
2016-12-13 06:41:14 +00:00
Davide Italiano 463bebc319 [SCCP] Debug diagnostic goes under DEBUG(). NFCI.
llvm-svn: 289519
2016-12-13 05:56:04 +00:00
Dylan McKay 1e57fa487b [AVR] Add an 'relax memory operation' pass
Summary:
This pass will be used to relax instructions which use out of bounds
memory accesses to equivalent operations that can work with the
addresses.

The pass currently implements relaxation for the STDWPtrQRr instruction.

Without this pass, an assertion error would be hit in the pseudo expansion pass.

In the future, we will need to add more instructions to this pass. We can do
that on a case-by-case basic.

Reviewers: arsenm, kparzysz

Subscribers: wdng, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D27650

llvm-svn: 289517
2016-12-13 05:53:14 +00:00
Philip Reames 1f1bbac8da [peephole] Enhance folding logic to work for STATEPOINTs
The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are:

    Support for folding multiple operands at once which reference the same load
    Support for folding multiple loads into a single instruction
    Walk all the operands of the instruction for varidic instructions (this is a bug fix!)

Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here.

Differential Revision: https://reviews.llvm.org/D24103

llvm-svn: 289510
2016-12-13 01:38:41 +00:00
Philip Reames 51387a8c28 [Statepoints] Reuse stack slots more than once within a basic block
The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exact once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block does. Net result: as we optimize invokes into calls, lowering gets worse.

The root error here is that the bitmap uses by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again.

Differential Revision: https://reviews.llvm.org/D25243

llvm-svn: 289509
2016-12-13 01:21:15 +00:00
Kostya Serebryany a31300e789 [libFuzzer] don't require extra flags with -minimize_crash=1 (default to -max_total_time=600). Also respect exact_artifact_path when outputting the end result
llvm-svn: 289506
2016-12-13 00:40:47 +00:00
Chris Bieneman 5d58aa80ad Missed a file in r289503.
llvm-svn: 289504
2016-12-13 00:32:43 +00:00
Chris Bieneman a0523fd0cd [LIT] Fix system-windows
Turns out if you were on windows and your default target wasn't windows the system-windows feature wasn't getting enabled.

This fixes that and updates the coff-dwarf test to rely on the new "target-windows" feature. That test was the reason why system-windows was changed to not always be enabled on Windows hosts.

llvm-svn: 289503
2016-12-13 00:29:56 +00:00
Chris Bieneman 5a7c5069da Revert "Suppress LLVM::tools/llvm-symbolizer/coff-dwarf.test for mingw, for now."
This reverts commit r249937.

llvm-svn: 289502
2016-12-13 00:29:51 +00:00
Chris Bieneman e96abc6d45 [llvm-config] Unsupported should be win32
Hopefully this will fix the failing Windows bot.

llvm-svn: 289497
2016-12-12 23:42:08 +00:00
Tim Northover d82cc61744 Stop lying about pointers' required alignments.
These extra specializations were added in the depths of history (r67984 from
2009) and are clearly problematic now. The pointers actually are aligned to the
default (8 bytes), since otherwise UBsan would be complaining loudly.

I *think* it originally made sense because there was no "alignof" to infer the
correct value so the generic case went with what malloc returned (8-byte
aliged objects), and on 32-bit machines this specialization was correct. It
became wrong when we started compiling for 64-bit, and caused a UBSan failure
when we tried to put a ValueHandle into a DenseMap.

Should fix the Green Dragon UBSan bot.

llvm-svn: 289496
2016-12-12 23:29:07 +00:00
Marcos Pividori 681e904419 [libFuzzer] Implement Timers for Windows.
Implemented timeouts for Windows using TimerQueueTimers.
Timers are used to supervise the time of execution of the
callback function that is being fuzzed.

Differential Revision: https://reviews.llvm.org/D27237

llvm-svn: 289495
2016-12-12 23:25:11 +00:00
Sanjay Patel 2a1554a0b6 [x86] fix test specifications
llvm-svn: 289493
2016-12-12 23:16:35 +00:00
Sanjay Patel 1740526e99 [x86] fix test specifications and auto-generate checks
llvm-svn: 289492
2016-12-12 23:15:15 +00:00
Petr Hosek 024a17b06d [CMake] Multi-target builtins build
This change enables building builtins for multiple different targets
using LLVM runtimes directory.

To specify the builtin targets to be built, use the LLVM_BUILTIN_TARGETS
variable, where the value is the list of targets.  To pass a per target
variable to the builtin build, you can set BUILTINS_<target>_<variable>
where <variable> will be passed to the builtin build for <target>.

Differential Revision: https://reviews.llvm.org/D26652

llvm-svn: 289491
2016-12-12 23:15:10 +00:00
Chris Bieneman 1a5e67869e Revert "Disable all llvm-config tests for now, will investigate later"
This reverts commit r260386.

These tests all pass for me locally. I have no idea if they will pass on all configurations, so I'll watch the bots closely.

llvm-svn: 289490
2016-12-12 23:14:58 +00:00
Dan Liew 197d2f0df3 [llvm-config] Fix bug where `--libfiles` and `--names` would produce
incorrect output when LLVM is built with `LLVM_BUILD_LLVM_DYLIB`.

`llvm-config` previously produced output like this

```
$ llvm-config --libfiles
/usr/lib/liblibLLVM-4.0svn.so.so
$ llvm-config --libnames
liblibLLVM-4.0svn.so.so
```

The library prefix and shared library extension were added to
the library name twice which was wrong.

I wanted to write a test cases for this but it looks like **all**
`llvm-config` tests were disabled by r260386 so I'll leave this for
now.

Subscribers: llvm-commits, tstellarAMD

Reviewers: beanz, DiamondLovesYou, axw

Differential Revision: https://reviews.llvm.org/D27393

llvm-svn: 289488
2016-12-12 23:07:22 +00:00
Andrew Kaylor ff6a1edfa8 Avoid infinite loops in branch folding
Differential Revision: https://reviews.llvm.org/D27582

llvm-svn: 289486
2016-12-12 23:05:38 +00:00
Chris Bieneman 7495a4895c clang-format to fix post-commit feedback
Thanks dblaikie!

llvm-svn: 289485
2016-12-12 23:05:15 +00:00
Chris Bieneman f07d05eccd [llvm-config] Fix cflags test looking for "error"
This test is (I think) actually trying to make sure no errors are printed, but it hits on the string "error" in flags.

llvm-svn: 289484
2016-12-12 23:03:28 +00:00
Chris Bieneman 04418623fe Revert "Remove system-libs.test for now"
This reverts commit r260281.

llvm-svn: 289483
2016-12-12 23:03:01 +00:00
Sanjoy Das 804b629812 Revert "[SCEVExpander] Use llvm data structures; NFC"
This reverts r289215 (git SHA1 cb7b86a1).  It breaks the ubsan build
because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it
tries to cast the empty and tombstone keys to `T *` (due to insufficient
alignment).

This is the relevant stack trace (thanks to Mike Aizatsky):

    #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39
    #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19
    #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23

llvm-svn: 289482
2016-12-12 23:00:12 +00:00
Kostya Serebryany 092d5764a1 [libFuzzer] split one slow test into several, for more parallel testing
llvm-svn: 289481
2016-12-12 22:55:25 +00:00
Nico Weber b3901bdde8 Fix MSVC build after 289461; MSVC isn't sure if this is std:: or llvm::
llvm-svn: 289480
2016-12-12 22:46:40 +00:00
Kostya Serebryany a4b43bf8e8 [libFuzzer] make SimpleCmpTest a bit simpler to crack and more verbose
llvm-svn: 289477
2016-12-12 22:39:33 +00:00
Sanjay Patel 62104ee6d9 [x86] fix formatting; NFC
llvm-svn: 289476
2016-12-12 22:31:01 +00:00
Eugene Zelenko 6a9226d9b8 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 289475
2016-12-12 22:23:53 +00:00
Tim Shen 18e7ae672e [APFloatTest] Use std::make_tuple to make GCC 4.8 happy
Differential Revision: https://reviews.llvm.org/D26817

llvm-svn: 289474
2016-12-12 22:16:08 +00:00
Guozhi Wei 1fd553c934 [PPC] Prefer direct move on power8 if load 1 or 2 bytes to VSR
Power8 has MTVSRWZ but no LXSIBZX/LXSIHZX, so move 1 or 2 bytes to VSR through MTVSRWZ is much faster than store the extended value into stack and load it with LXSIWZX.
This patch fixes pr31144.

Differential Revision: https://reviews.llvm.org/D27287

llvm-svn: 289473
2016-12-12 22:09:02 +00:00
Tim Shen 44bde896a5 [APFloat] Implement PPCDoubleDouble add and subtract.
Summary:
I looked at libgcc's implementation (which is based on the paper,
Software for Doubled-Precision Floating-Point Computations", by Seppo Linnainmaa,
ACM TOMS vol 7 no 3, September 1981, pages 272-283.) and made it generic to
arbitrary IEEE floats.

Differential Revision: https://reviews.llvm.org/D26817

llvm-svn: 289472
2016-12-12 21:59:30 +00:00
Matthew Simpson 92ce0230b5 [SLP] Fix sign-extends for type-shrinking
This patch ensures the correct minimum bit width during type-shrinking.
Previously when type-shrinking, we always sign-extended values back to their
original width. However, if we are going to sign-extend, and the sign bit is
unknown, we have to increase the minimum bit width by one bit so the
sign-extend will fill the upper bits correctly. If the sign bit is known to be
zero, we can perform a zero-extend instead. This should fix PR31243.

Reference: https://llvm.org/bugs/show_bug.cgi?id=31243
Differential Revision: https://reviews.llvm.org/D27466

llvm-svn: 289470
2016-12-12 21:11:04 +00:00
Kostya Serebryany 035af9b346 [libFuzzer] build libFuzzer itself with asan
llvm-svn: 289469
2016-12-12 20:58:10 +00:00
Paul Robinson ac7fe5e0c4 Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table.  By default, use this for branch targets
and some other cases that have no specified source location, to
prevent inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).

Updated patch allows enabling or suppressing this behavior for all
unspecified source locations.

Differential Revision: http://reviews.llvm.org/D24180

llvm-svn: 289468
2016-12-12 20:49:11 +00:00
Kostya Serebryany d4be88913e [libFuzzer] respect -max_len during merge
llvm-svn: 289467
2016-12-12 20:39:35 +00:00
Teresa Johnson a29bd6ffcc [ThinLTO] Remove useless code (NFC)
Should have been removed in r288446.

llvm-svn: 289466
2016-12-12 20:34:28 +00:00
Mehdi Amini ef27db879c Refactor BitcodeReader: move Metadata and ValueId handling in their own class/file
Summary:
I'm planning on changing the way we load metadata to enable laziness.
I'm getting lost in this gigantic files, and gigantic class that is the bitcode
reader. This is a first toward splitting it in a few coarse components that
are more easily understandable.

Reviewers: pcc, tejohnson

Subscribers: mgorny, llvm-commits, dexonsmith

Differential Revision: https://reviews.llvm.org/D27646

llvm-svn: 289461
2016-12-12 19:34:26 +00:00
Mehdi Amini bf2090e31a Remove IsMetadataMaterialized from BitcodeReader (NFC)
Summary: It does not seem useful.

Reviewers: pcc, dexonsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27668

llvm-svn: 289457
2016-12-12 19:23:39 +00:00
Geoff Berry d73420d591 [LiveRangeEdit] Add assert string and descriptive comment.
llvm-svn: 289456
2016-12-12 19:12:41 +00:00
Dimitry Andric 59e5cb4342 Fix compile with GCC 5 or later
Summary:

Compiling with GCC 5 or later can fail with a bogus error "constructor
required before non-static data member for
llvm::ValueEnumerator::MDRange::First has been parsed".

This was originally fixed upstream in GCC PR 70528, but later this fix
was reverted, and released versions of GCC still show the bogus error.

To work around this, replace MDRange's declaration of a default
constructor with a definition.

Reviewers: dexonsmith, rsmith, rivanvx

Subscribers: llvm-commits, dim, dexonsmith

Differential Revision: https://reviews.llvm.org/D18730

llvm-svn: 289454
2016-12-12 19:05:52 +00:00
Reid Kleckner 30422eea0f Revert "[SCEVExpand] do not hoist divisions by zero (PR30935)"
Reverts r289412. It caused an OOB PHI operand access in instcombine when
ASan is enabled. Reduction in progress.

Also reverts "[SCEVExpander] Add a test case related to r289412"

llvm-svn: 289453
2016-12-12 18:52:32 +00:00
Simon Atanasyan 5048514c20 [mips] For PIC code convert unconditional jump to unconditional branch
Unconditional branch uses relative addressing which is the right choice
in case of position independent code.

This is a fix for the bug:
https://dmz-portal.mips.com/bugz/show_bug.cgi?id=2445

Differential revision: https://reviews.llvm.org/D27483

llvm-svn: 289448
2016-12-12 17:40:26 +00:00
Nicolai Haehnle f45ea4bbc5 AMDGPU: llvm.amdgcn.interp.mov is a source of divergence
Summary:
While the result is constant across a single primitive, each pixel
shader wave can have pixels from multiple primitives.

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27572

llvm-svn: 289447
2016-12-12 16:52:19 +00:00
Sanjay Patel 052220c5c8 remove stale FIXME note from test; NFC
llvm-svn: 289445
2016-12-12 16:20:21 +00:00
Simon Pilgrim a64d4dc22f [X86] Regenerate vector bitcast/widening tests.
llvm-svn: 289443
2016-12-12 16:15:45 +00:00
Sanjay Patel e730ce87a5 [InstCombine] fix bug when offsetting case values of a switch (PR31260)
We could truncate the condition and then try to fold the add into the
original condition value causing wrong case constants to be used.

Move the offset transform ahead of the truncate transform and return
after each transform, so there's no chance of getting confused values.

Fix for:
https://llvm.org/bugs/show_bug.cgi?id=31260

llvm-svn: 289442
2016-12-12 16:13:52 +00:00
Teresa Johnson 040cc16835 [ThinLTO] Import only necessary DICompileUnit fields
Summary:
As discussed on mailing list, for ThinLTO importing we don't need
to import all the fields of the DICompileUnit. Don't import enums,
macros, retained types lists. Also only import local scoped imported
entities. Since we don't currently import any global variables,
we also don't need to import the list of global variables (added an
assert to verify none are being imported).

This is being done by pre-populating the value map entries to map
the unneeded metadata to nullptr. For the imported entities, we can
simply replace the source module's list with a new list containing
only those needed imported entities. This is done in the IRLinker
constructor so that value mapping automatically does the desired
mapping.

Reviewers: mehdi_amini, dexonsmith, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27635

llvm-svn: 289441
2016-12-12 16:09:30 +00:00
Sanjay Patel 87e2f677d7 [InstCombine] clean up range-for-loops in visitSwitchInst(); NFCI
llvm-svn: 289439
2016-12-12 15:52:56 +00:00
Simon Pilgrim d4ff86b973 [X86] Regenerate test.
llvm-svn: 289438
2016-12-12 15:47:53 +00:00
Sanjay Patel 2b060c7700 [InstCombine] add test to show PR31260 miscompile; NFC
llvm-svn: 289437
2016-12-12 15:28:44 +00:00
Sanjoy Das b1227db1f4 [SCEVExpander] Add a test case related to r289412
llvm-svn: 289435
2016-12-12 14:57:11 +00:00
Simon Pilgrim 4cbe1834e4 Update inline argument comment. NFCI.
combineX86ShufflesRecursively 'HasPSHUFB' flag has been the more generic 'HasVariableMask' flag for some time.

llvm-svn: 289430
2016-12-12 13:43:15 +00:00
Simon Pilgrim 5ebd2b542b [X86][SSE] Add support for combining SSE VSHLI/VSRLI uniform constant shifts.
Fixes some missed constant folding opportunities and allows us to combine shuffles that end with a logical bit shift.

llvm-svn: 289429
2016-12-12 13:33:58 +00:00
Simon Pilgrim 369cd349b9 [X86][SSE] Lower suitably sign-extended mul vXi64 using PMULDQ
PMULDQ returns the 64-bit result of the signed multiplication of the lower 32-bits of vXi64 vector inputs, we can lower with this if the sign bits stretch that far.

Differential Revision: https://reviews.llvm.org/D27657

llvm-svn: 289426
2016-12-12 10:49:15 +00:00
Simon Pilgrim 040a36c176 [SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits
Pre-commit as discussed on D27657

llvm-svn: 289425
2016-12-12 10:29:43 +00:00
Craig Topper 36ecce9bed [X86] Teach selectScalarSSELoad to accept full 128-bit vector loads and the X86ISD::VZEXT_LOAD opcode.
Disable peephole on some of the tests that no longer require it to properly fold scalar intrinsics.

llvm-svn: 289424
2016-12-12 07:57:24 +00:00
Craig Topper f2c6f7abf3 [X86] Change CMPSS/CMPSD intrinsic instructions to use sse_load_f32/f64 as its memory pattern instead of full vector load.
These intrinsics only load a single element. We should use sse_loadf32/f64 to give more options of what loads it can match.

Currently these instructions are often only getting their load folded thanks to the load folding in the peephole pass. I plan to add more types of loads to sse_load_f32/64 so we can match without the peephole.

llvm-svn: 289423
2016-12-12 07:57:21 +00:00
Craig Topper 081c0e2864 [X86] Remove some intrinsic instructions from hasPartialRegUpdate
Summary:
These intrinsic instructions are all selected from intrinsics that have well defined behavior for where the upper bits come from. It's not the same place as the lower bits.

As you can see we were suppressing load folding for these instructions in some cases. In none of the cases was the separate load helping avoid a partial dependency on the destination register. So we should just go ahead and allow the load to be folded.

Only foldMemoryOperand was suppressing folding for these. They all have patterns for folding sse_load_f32/f64 that aren't gated with OptForSize, but sse_load_f32/f64 doesn't allow 128-bit vector loads. It only allows scalar_to_vector and vzmovl of scalar loads to match. There's no reason we can't allow a 128-bit vector load to be narrowed so I would like to fix sse_load_f32/f64 to allow that. And if I do that it changes some of these same test cases to fold the load too.

Reviewers: spatel, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27611

llvm-svn: 289419
2016-12-12 05:07:17 +00:00
Sebastian Pop 8c9cc8c86b [SCEVExpand] do not hoist divisions by zero (PR30935)
SCEVExpand computes the insertion point for the components of a SCEV to be code
generated.  When it comes to generating code for a division, SCEVexpand would
not be able to check (at compilation time) all the conditions necessary to avoid
a division by zero.  The patch disables hoisting of expressions containing
divisions by anything other than non-zero constants in order to avoid hoisting
these expressions past conditions that should hold before doing the division.

The patch passes check-all on x86_64-linux.

Differential Revision: https://reviews.llvm.org/D27216

llvm-svn: 289412
2016-12-12 02:52:51 +00:00
Craig Topper 7fc6d34ed1 [InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead.
llvm-svn: 289411
2016-12-11 22:32:38 +00:00
Simon Pilgrim 831435cb14 [X86][SSE] Add support for combining target shuffles to SHUFPD.
llvm-svn: 289407
2016-12-11 21:26:25 +00:00
Davide Italiano 0a1476c756 [SCCP] Use the appropriate helper function. NFCI.
llvm-svn: 289406
2016-12-11 21:19:03 +00:00
Ayman Musa 7ec4ed55d3 [X86][AVX512] Add missing patterns for broadcast fallback in case load node has multiple uses (for v4i64 and v4f64).
When the load node which the broadcast instruction broadcasts has multiple uses, it cannot be folded.
A fallback pattern is added to catch these cases and provide another solution.

Differential Revision: https://reviews.llvm.org/D27661

llvm-svn: 289404
2016-12-11 20:11:17 +00:00
Sanjoy Das 6de678815c [TBAA] Don't generate invalid TBAA when merging nodes
Summary:
Fix a corner case in `MDNode::getMostGenericTBAA` where we can sometimes
generate invalid TBAA metadata.

Reviewers: chandlerc, hfinkel, mehdi_amini, manmanren

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26635

llvm-svn: 289403
2016-12-11 20:07:25 +00:00
Sanjoy Das 3336f681e3 [Verifier] Add verification for TBAA metadata
Summary:
This change adds some verification in the IR verifier around struct path
TBAA metadata.

Other than some basic sanity checks (e.g. we get constant integers where
we expect constant integers), this checks:

 - That by the time an struct access tuple `(base-type, offset)` is
   "reduced" to a scalar base type, the offset is `0`.  For instance, in
   C++ you can't start from, say `("struct-a", 16)`, and end up with
   `("int", 4)` -- by the time the base type is `"int"`, the offset
   better be zero.  In particular, a variant of this invariant is needed
   for `llvm::getMostGenericTBAA` to be correct.

 - That there are no cycles in a struct path.

 - That struct type nodes have their offsets listed in an ascending
   order.

 - That when generating the struct access path, you eventually reach the
   access type listed in the tbaa tag node.

Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26438

llvm-svn: 289402
2016-12-11 20:07:15 +00:00
Sanjay Patel 81ed3499cd [Constants] don't die processing non-ConstantInt GEP indices in isGEPWithNoNotionalOverIndexing() (PR31262)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=31262

llvm-svn: 289401
2016-12-11 20:07:02 +00:00
Simon Pilgrim 7c98a79f7b [X86][AVX512] Add target shuffle test showing missing PSHUFPD combine.
llvm-svn: 289400
2016-12-11 19:41:23 +00:00
Sebastian Pop e08d9c7c87 instr-combiner: sum up all latencies of the transformed instructions
We have found that -- when the selected subarchitecture has a scheduling model
and we are not optimizing for size -- the machine-instruction combiner uses a
too-simple algorithm to compute the cost of one of the two alternatives [before
and after running a combining pass on a section of code], and therefor it throws
away the combination results too often.

This fix has the potential to help any ISA with the potential to combine
instructions and for which at least one subarchitecture has a scheduling model.
As of now, this is only known to definitely affect AArch64 subarchitectures with
a scheduling model.

Regression tested on AMD64/GNU-Linux, new test case tested to fail on an
unpatched compiler and pass on a patched compiler.

Patch by Abe Skolnik and Sebastian Pop.

llvm-svn: 289399
2016-12-11 19:39:32 +00:00
Simon Pilgrim 8766a76f3d [X86][XOP] Add target shuffle tests showing missing PSHUFPD combine.
llvm-svn: 289398
2016-12-11 19:36:25 +00:00
Sanjoy Das ba1bf87586 [SCEVExpander] Explicitly expand AddRec starts into loop preheader
This is NFC today, but won't be once D27216 (or an equivalent patch) is
in.

This change fixes a design problem in SCEVExpander -- it relied on a
hoisting optimization to generate correct code for add recurrences.
This meant changing the hoisting optimization to not kick in under
certain circumstances (to avoid speculating faulting instructions, say)
would break correctness.

The fix is to make the correctness requirements explicit, and have it
not rely on the hoisting optimization for correctness.

llvm-svn: 289397
2016-12-11 19:02:21 +00:00
Oren Ben Simhon 9683ecbff6 [X86] Regcall - Adding support for mask types
Regcall calling convention passes mask types arguments in x86 GPR registers.
The review includes the changes required in order to support v32i1, v16i1 and v8i1.

Differential Revision: https://reviews.llvm.org/D27148

llvm-svn: 289383
2016-12-11 14:10:52 +00:00