Commit Graph

96564 Commits

Author SHA1 Message Date
Kuba Brecka 44e875ad5b [tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.

Differential Revision: https://reviews.llvm.org/D26266

llvm-svn: 286135
2016-11-07 19:09:56 +00:00
Matt Arsenault f530e8b3f0 AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

llvm-svn: 286134
2016-11-07 19:09:33 +00:00
Matt Arsenault 52f14ec596 AMDGPU: Preserve vcc undef flags when inverting branch
If the branch was on a read-undef of vcc, passes that used
analyzeBranch to invert the branch condition wouldn't preserve
the undef flag resulting in a verifier error.

Fixes verifier failures in a future commit.

Also fix verifier error when inserting copy for vccz
corruption bug.

llvm-svn: 286133
2016-11-07 19:09:27 +00:00
Benjamin Kramer 1697d39eef [MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.

llvm-svn: 286126
2016-11-07 17:47:28 +00:00
Matt Arsenault 2ae5653072 AMDGPU: Try to fix (non-clang?) bot builds
llvm-svn: 286120
2016-11-07 16:52:50 +00:00
Richard Smith 857efb0880 Add -O0 support for @llvm.invariant.group.barrier by discarding it if it gets to ISel.
Differential Revision: https://reviews.llvm.org/D26292

llvm-svn: 286119
2016-11-07 16:47:20 +00:00
Matt Arsenault 314cbf7a3b AMDGPU: Refactor copyPhysReg
Separate the subregister splitting logic to re-use later.

llvm-svn: 286118
2016-11-07 16:39:22 +00:00
Chad Rosier 611b73b1f9 Fix 80-column violations. NFC.
llvm-svn: 286117
2016-11-07 16:28:04 +00:00
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Jonas Paulsson 4f0509fab3 [SystemZ] Correct the SchedModel regarding vector unit / instructions.
* Use a generic vector unit to model the issue unit more accurately.
* Update some vector instructions that actually use the vector unit for more
  than one cycle.

Review: Ulrich Weigand
llvm-svn: 286112
2016-11-07 15:45:06 +00:00
Amara Emerson 614b44bbe9 This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64.
Without this patch, register allocation for the example below fails.

define half @test(half %a1, half %a2) #0 {
entry:
  %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1
  ret half %0
}

Patch by Florian Hahn.

Differential Revision: https://reviews.llvm.org/D25080

llvm-svn: 286111
2016-11-07 15:42:12 +00:00
Chad Rosier d6daac4746 [AArch64] Removed the narrow load merging code in the ld/st optimizer.
This feature has been disabled for some time now, so remove cruft.

Differential Revision: https://reviews.llvm.org/D26248

llvm-svn: 286110
2016-11-07 15:27:22 +00:00
Jonas Paulsson 818431a61a [SystemZ] Fixes in SchedModels for older subtargets.
IssueWidth updated to reflect the capacity of the issue unit correctly.
Correct number of FX and LS units modelled (2, was 1).

Review: Ulrich Weigand
llvm-svn: 286109
2016-11-07 14:47:25 +00:00
Chad Rosier 8f348017b0 [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D26252

llvm-svn: 286108
2016-11-07 14:11:45 +00:00
James Molloy b03e0879fc [Thumb1] Move padding earlier when synthesizing TBBs off of the PC
When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding.

If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset.

In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4.

llvm-svn: 286107
2016-11-07 13:38:21 +00:00
Dylan McKay c988b334b6 [AVR] Enable the ISel, frame analyzer, and alloca passes
llvm-svn: 286095
2016-11-07 06:02:55 +00:00
Craig Topper b110e04851 [AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to selects and native zext/sext.
This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select.

llvm-svn: 286092
2016-11-07 02:12:57 +00:00
Craig Topper 96041c6ba9 [X86] Use StringRef::startswith to reduce a few compares in the intrinsic autoupgrade code.
llvm-svn: 286090
2016-11-07 00:13:42 +00:00
Craig Topper 7e545335d6 [AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to legacy intrinsics and a select.
llvm-svn: 286089
2016-11-07 00:13:39 +00:00
Krzysztof Parzyszek 39d14f3bc3 Reapply r286080 with a phony change in Hexagon's CMakeLists.txt
Cmake has not recognized that Hexagon.td has a new dependency in
HexagonPatterns.td. All changes to that file were not visible to
the build bots.

llvm-svn: 286084
2016-11-06 20:55:57 +00:00
Saleem Abdulrasool 804e12eeb5 ARM: lower fpowi appropriately for Windows ARM
This handles the last case of the builtin function calls that we would
generate code which differed from Microsoft's ABI.  Rather than
generating a call to `__pow{d,s}i2` we now promote the parameter to a
float or double and invoke `powf` or `pow` instead.

Addresses PR30825!

llvm-svn: 286082
2016-11-06 19:46:54 +00:00
Krzysztof Parzyszek f8d38d11b9 Revert r286080: it breaks build bots
llvm-svn: 286081
2016-11-06 19:36:09 +00:00
Krzysztof Parzyszek 9e3520c884 [Hexagon] Remove redundant custom selection code
The clr/set/toggle-bit instructions (with the bit index given as an
immediate operand) had both, custom selection code that generated them,
and selection patterns at the same time. The selection patterns were
not used, because the custom selection code was executed first.
This patch removes the custom code in favor of the selection patterns.
The custom code handled 64-bit registers as well with an immediate bit
index, and so new patterns were added to implement that.

It was also the same case for the instruction "Rd += asr(Rs, Rt)",
except that the custom code did not offer any additional functionality,
and was simply removed.

llvm-svn: 286080
2016-11-06 19:03:38 +00:00
Krzysztof Parzyszek c93815ef04 [Hexagon] Round 5 of selection pattern simplifications
Remove unnecessary type casts in patterns.

llvm-svn: 286079
2016-11-06 18:13:14 +00:00
Krzysztof Parzyszek f914278f8b [Hexagon] Round 4 of selection pattern simplifications
Give simpler or more meaningful names to pat frags and xforms.

llvm-svn: 286078
2016-11-06 18:09:56 +00:00
Krzysztof Parzyszek 846597d081 [Hexagon] Round 3 of selection pattern simplifications
Remove unnecessary C++ functions for SDNode transforms. Move more
pat frags to files where they are used.

llvm-svn: 286077
2016-11-06 18:05:14 +00:00
Krzysztof Parzyszek 84755104b4 [Hexagon] Round 2 of selection pattern simplifications
Add pat frags for any-, sign-, and zero-extensions.

llvm-svn: 286076
2016-11-06 17:56:48 +00:00
Simon Pilgrim 39df78e384 [SelectionDAG] Add support for vector demandedelts in XOR opcodes
llvm-svn: 286075
2016-11-06 16:49:19 +00:00
Craig Topper 46de41330c [AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead upgrade them to a select and the older AVX2 intrinsic.
llvm-svn: 286073
2016-11-06 16:29:19 +00:00
Craig Topper af9b3fe752 [AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286072
2016-11-06 16:29:14 +00:00
Simon Pilgrim dd4809a603 [SelectionDAG] Add support for vector demandedelts in OR opcodes
llvm-svn: 286071
2016-11-06 16:29:09 +00:00
Craig Topper c9467ed31e [AVX-512] Remove intrinsics for 128/256-bit masked shift by single element in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic.
llvm-svn: 286070
2016-11-06 16:29:08 +00:00
Simon Pilgrim b3ad5f7ebf [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsElementInsertion. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286067
2016-11-06 14:20:29 +00:00
Benjamin Kramer f15a8e6a50 [BitcodeWriter] Replace a manual byteswap with read32be.
No functional change intended.

llvm-svn: 286066
2016-11-06 13:26:39 +00:00
Amaury Sechet 74084c44ca Kill deprecated attribute API
Summary:
This kill various depreacated API related to attribute :
 - The deprecated C API attribute based on LLVMAttribute enum.
 - The Raw attribute set format (planned to be removed in 4.0).

Reviewers: bkramer, echristo, mehdi_amini, void

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23039

llvm-svn: 286062
2016-11-06 07:48:46 +00:00
Tim Shen 398f90f024 [APFloat] Make functions that produce APFloaat objects use correct semantics.
Summary:
Fixes PR30869.

In D25977 I meant to change all functions that care about lifetime. I
changed constructors, factory functions, but I missed member/free
functions that return new instances. This patch changes them.

Reviewers: hfinkel, kbarton, echristo, joerg

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26269

llvm-svn: 286060
2016-11-06 07:38:37 +00:00
Craig Topper 5471fc29e4 [AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 addr:)) -> VCVTPS2PDZ128rm
llvm-svn: 286059
2016-11-06 04:12:52 +00:00
Craig Topper 1162857ec4 [AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX instruction when available.
llvm-svn: 286057
2016-11-06 04:12:46 +00:00
Craig Topper 9a4a3af5dd [AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so they can use EVEX instructions when available.
llvm-svn: 286056
2016-11-06 04:12:42 +00:00
Krzysztof Parzyszek 2839b29f4b [Hexagon] Relocate pattern-related bits to proper places
llvm-svn: 286049
2016-11-05 21:44:50 +00:00
Krzysztof Parzyszek 4b4012a5c9 [Hexagon] Round 1 of selection pattern simplifications
Consistently use register class pat frags instead of spelling out
the type and class each time.

llvm-svn: 286048
2016-11-05 21:02:54 +00:00
Simon Pilgrim 4a9f210412 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBlend. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286045
2016-11-05 18:31:57 +00:00
Simon Pilgrim 725174694a [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsZeroOrAnyExtend. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286044
2016-11-05 18:22:13 +00:00
Simon Pilgrim 9f0afc6ae1 [X86][SSE] Reuse zeroable element mask in SSE4A EXTRQ/INSERTQ vector shuffle lowering. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286043
2016-11-05 18:05:13 +00:00
Simon Pilgrim 3cae21960e [X86][SSE] Reuse zeroable element mask in PSHUFB vector shuffle lowering. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286042
2016-11-05 17:53:27 +00:00
Simon Pilgrim 64a592d0a2 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsInsertPS. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286040
2016-11-05 17:27:48 +00:00
Simon Pilgrim 009befbd88 [X86][SSE] Reuse zeroable element mask in lowerVectorShuffleAsBitMask. NFCI
Don't regenerate a zeroable element mask with computeZeroableShuffleElements when its already available.

llvm-svn: 286039
2016-11-05 17:12:19 +00:00
Justin Lebar 54b0be048e [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any valid int64_t to the set.
Summary:
SmallSetVector uses DenseSet, but that means we need to reserve some
values for the empty and tombstone keys.

It seems to me we should have a general way to let us store full-range
ints inside of DenseSets, and furthermore that we probably shouldn't
silently let you add ints into DenseSets without explicitly promising
that they're in range.  But that's a battle for another day; for now,
just fix this code, since we currently do something Very Bad when
compiling ffmpeg.

Fixes PR30914.

Reviewers: jeremyhu

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26323

llvm-svn: 286038
2016-11-05 16:47:25 +00:00
Simon Pilgrim 1af0fc1103 [X86][SSE] Reuse zeroable element mask instead of regenerating it. NFCI
We are repeatedly calling computeZeroableShuffleElements in many shuffle lowering calls for the same shuffle mask/inputs.

This is a first step towards reusing the zeroable result, initially just for lowerVectorShuffleAsShift calls.

llvm-svn: 286037
2016-11-05 16:40:20 +00:00
Krzysztof Parzyszek a8d63dc289 [Hexagon] Split all selection patterns into a separate file
This is just the basic separation, without any cleanup. Further changes
will follow.

llvm-svn: 286036
2016-11-05 15:01:38 +00:00
Simon Pilgrim 1b4e1ac966 Strip trailing whitespace. NFCI.
llvm-svn: 286034
2016-11-05 14:43:04 +00:00
Craig Topper 8f354a7f8b [AVX-512] Use an equality compare instead of StringRef::startswith in a few places in auto upgrade that were looking for the complete intrinsic name anyway.
llvm-svn: 286033
2016-11-05 05:35:23 +00:00
Alina Sbirlea f802c3f975 Correct mprotect page boundries to round up end page. Fixes PR30905.
Summary:
Update the boundries for mprotect.
Patch by Andrew Adams. Fixes PR30905.

Reviewers: loladiro, andrew.w.kaylor, chandlerc

Subscribers: abadams, llvm-commits

Differential Revision: https://reviews.llvm.org/D26312

llvm-svn: 286032
2016-11-05 04:22:15 +00:00
Craig Topper a2f12e0a75 [X86] Remove broken support for autoupgrading llvm.x86.fma4.* intrinsics to llvm.x86.fma.*.
It currently fires an assert if you even try. Looking back, I don't think it ever worked because it only changed the name of the function object, but not the intrinsic ID stored in it. Given that, I think it can be removed since no one has noticed or complained in the past 4 years.

llvm-svn: 286031
2016-11-05 04:00:31 +00:00
Krzysztof Parzyszek b7eb7fc892 [Hexagon] Account for <def,read-undef> when validating moves for predication
llvm-svn: 286009
2016-11-04 20:41:03 +00:00
Weiming Zhao 6100118a52 Fix 24560: assembler does not share constant pool for same constants
Summary: This patch returns the same label if the CP entry with the same value has been created.

Reviewers: eli.friedman, rengolin, jmolloy

Subscribers: majnemer, jmolloy, llvm-commits

Differential Revision: https://reviews.llvm.org/D25804

llvm-svn: 286006
2016-11-04 19:17:32 +00:00
Zvi Rackover 85bc64c734 [X86] Broadcast from memory intructions aren't unfoldable
Broadcast from memory instructions should be treated as moves. They can't be unfolded.

Fixes pr30693.

llvm-svn: 285998
2016-11-04 15:15:19 +00:00
Tom Stellard 2d2d33f1dc Revert "AMDGPU: Add VI i16 support"
This reverts commit r285939 and r285948.  These broke some conformance tests.

llvm-svn: 285995
2016-11-04 13:06:34 +00:00
Jonas Paulsson baeb402014 Comment rewording in MachineScheduler.cpp.
Author: A Trick
llvm-svn: 285991
2016-11-04 08:31:14 +00:00
Chandler Carruth d9ef4b6601 Only log the visit of a return instruction if we in fact found a return
instruction.

This avoids dereferencing null in the debug logging if the instruction
was not in fact a return instruction. This potential bug was found by
PVS-Studio.

This actually fixes the last of the "dereferenced a pointer before
checking it for null" reports in the recent PVS-Studio run. However,
there are quite a few reports of this nature that I did not do anything
to fix because they are pretty glaring false positives. They usually
took the form of quite clear correlated checks or a check made in
a separate function. I've even added asserts anywhere this correlation
wasn't pretty obvious and fundamental to the code.

llvm-svn: 285988
2016-11-04 06:59:50 +00:00
Chandler Carruth 0f139b4f71 Hoist check for TLI above all of the attempts to use it (including one
of which that is hidden inside a separate function call) and helpfully
before building expensive transaction infrastructure. This will avoid
crashing when running CGP in a generic mode if we ever managed to hit
this case.

Note that I spent some time looking at alternatives. CGP is actually
used without a TM or TLI in order to do some target-independent testing.
Further, all of the neighboring optimization techniques actually have
some paths that are effective even in the absence of TLI so this seemed
the correct scope at which to check and bypass logic. It still isn't
clear that long-term support for missing TM/TLI is the right
cost/benefit tradeoff for CGP -- we seem to get relatively little for it
and the code is just littered with checks (and assumptions which
I suspect are still missing some checks).

This at least fixes the potential bug in this code spotted by
PVS-Studio, so we've got that going for us. ;]

llvm-svn: 285987
2016-11-04 06:54:00 +00:00
Xinliang David Li f450b88191 Fix typo
llvm-svn: 285978
2016-11-04 03:00:52 +00:00
Justin Bogner 2c2c6ac7b5 X86: Move a non-null assert to before the pointer is dereferenced
llvm-svn: 285975
2016-11-03 23:55:36 +00:00
Chandler Carruth 651f019297 Sink all of the code relying on the MachO MachineModuleInfo to live
behind the test that the MachineModuleInfo analysis was
actually available and can be used.

While the MachO bits may well be reasonable to assume in the darwin
assembly printer, the analysis isn't constructively guaranteed anywhere
I could find so it seems safest to avoid crashing here.

This issue was found with PVS-Studio. Pretty sure the Clang Static
Anaylzer flags similar issues but we've probably never pointed it at
this code effectively.

llvm-svn: 285972
2016-11-03 23:33:46 +00:00
Weiming Zhao 962eaaea9c [Cortex-M0] Atomic lowering
Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls.

Reviewers: rengolin, efriedma, t.p.northover, jmolloy

Subscribers: efriedma, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D26120

llvm-svn: 285969
2016-11-03 21:49:08 +00:00
Kevin Enderby 7747cb55dc Add support for the ARM_THREAD_STATE64 and
in llvm-objdump for Mach-O files add the printing of the
ARM_THREAD_STATE64 in the same format as
otool-classic(1) on darwin.

To do this the 64-bit ARM general tread state
needed to be defined in include/llvm/Support/MachO.h .

rdar://28985800

llvm-svn: 285967
2016-11-03 20:51:28 +00:00
Tony Jiang 946242b5d2 NFC - Test commit.
Delete an empty line at the end of README.txt file.

llvm-svn: 285964
2016-11-03 20:32:21 +00:00
Adrian Prantl dbfda63695 Add DWARF debug info support for C++11 inline namespaces.
This implements the DWARF 5 DW_AT_export_symbols feature:
http://dwarfstd.org/ShowIssue.php?issue=141212.1

<rdar://problem/18616046>

llvm-svn: 285959
2016-11-03 19:42:02 +00:00
Kostya Serebryany 8a56917492 [libFuzzer] fix -error_exitcode=N, now with a test
llvm-svn: 285958
2016-11-03 19:31:18 +00:00
Justin Bogner f9fb2abb01 PDB: Fix some APIs to avoid use-after-frees
The buffer is already owned by the PDBFile for all of these APIs, so
don't pass it in separately.

llvm-svn: 285953
2016-11-03 18:28:04 +00:00
Tom Stellard cc34983181 AMDGPU/SI: Re add VIInstructions.td to unbreak bots
This file is unused as of r285939, but we need to keep it around
for bots that don't do full rebuilds.   We should be able to delete this
again in a few days.

llvm-svn: 285948
2016-11-03 17:56:46 +00:00
Chandler Carruth 5589aa60c7 Remove a redundant condition found by PVS-Studio.
Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of
stuff.

llvm-svn: 285945
2016-11-03 17:42:02 +00:00
Tom Stellard 2b3379cdff AMDGPU: Add VI i16 support
Patch By: Wei Ding

Differential Revision: https://reviews.llvm.org/D18049

llvm-svn: 285939
2016-11-03 17:13:50 +00:00
Chandler Carruth 30e0029904 Delete a dead store found by PVS-Studio.
Quite sad we still aren't really using aggressive dead code warnings
from Clang that we could potentially use to catch this and so many other
things.

llvm-svn: 285936
2016-11-03 17:01:38 +00:00
Chandler Carruth fca1ff0da2 Fix a bug found by inspection by PVS-Studio.
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.

We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101

llvm-svn: 285930
2016-11-03 16:39:25 +00:00
Alexander Timofeev f867a40bf6 [AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.
hange explores the fact that LDS reads may be reordered even if access
the same location.

Prior the change, algorithm immediately stops as soon as any memory
access encountered between loads that are expected to be merged
together. Although, Read-After-Read conflict cannot affect execution
correctness.

Improves hcBLAS CGEMM manually loop-unrolled kernels performance by 44%.
Also improvement expected on any massive sequences of reads from LDS.

Differential Revision: https://reviews.llvm.org/D25944

llvm-svn: 285919
2016-11-03 14:37:13 +00:00
Zvi Rackover a455864fdf Refactor creation of X86ISD::SETCC nodes to a helper function. NFC.
llvm-svn: 285917
2016-11-03 14:25:24 +00:00
Nicolai Haehnle bea772c6dc DAGCombiner: fix use-after-free when merging consecutive stores
Summary:
Have MergeConsecutiveStores explicitly return information about the stores
that were merged, so that we can safely determine whether the starting
node has been freed.

Reviewers: chandlerc, bogner, niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25601

llvm-svn: 285916
2016-11-03 14:25:04 +00:00
James Molloy e7d97368f2 Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently"
This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 .

llvm-svn: 285912
2016-11-03 14:08:01 +00:00
James Molloy b60d8b1987 [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk.

For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.

llvm-svn: 285893
2016-11-03 10:18:20 +00:00
George Rimar e1924f06d1 [Object/ELF] - Make getSymbol() return Error.
That is consistent with other methods around
and helps to handle error on a caller side.

Differential revision: https://reviews.llvm.org/D26247

llvm-svn: 285886
2016-11-03 08:40:55 +00:00
Craig Topper 7b9cc1474f [AVX-512] Use 'vnot' instead of 'not' in patterns involving vXi1 vectors.
This fixes selection of KANDN instructions and allows us to remove an extra set of patterns for KNOT and KXNOR.

Reviewers: delena, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26134

llvm-svn: 285878
2016-11-03 06:04:28 +00:00
Elena Demikhovsky caaceef4b3 Expandload and Compressstore intrinsics
2 new intrinsics covering AVX-512 compress/expand functionality.
This implementation includes syntax, DAG builder, operation lowering and tests.
Does not include: handling of illegal data types, codegen prepare pass and the cost model.

llvm-svn: 285876
2016-11-03 03:23:55 +00:00
Teresa Johnson 0515fb8d4b [ThinLTO] Handle distributed backend case when doing renaming
Summary:
The recent change I made to consult the summary when deciding whether to
rename (to handle inline asm) in r285513 broke the distributed build
case. In a distributed backend we will only have a portion of the
combined index, specifically for imported modules we only have the
summaries for any imported definitions. When renaming on import we were
asserting because no summary entry was found for a local reference being
linked in (def wasn't imported).

We only need to consult the summary for a renaming decision for the
exporting module. For imports, we would have prevented importing any
references to NoRename values already.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26250

llvm-svn: 285871
2016-11-03 01:07:16 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Adrian McCarthy 4333daab1c Emit S_COMPILE3 record once per TU rather than once per function
This has some ripple effects in several tests.

llvm-svn: 285862
2016-11-02 21:30:35 +00:00
Kevin Enderby fbebe1632a Add the rest of the additional error checks for invalid Mach-O files when
the offsets and sizes of an element of the Mach-O file overlaps with
another element in the Mach-O file.

Some other tests for malformed Mach-O files now run into these
checks so their tests were also adjusted.

llvm-svn: 285860
2016-11-02 21:08:39 +00:00
Eli Friedman b6befc3bc4 DCE math library calls with a constant operand.
On platforms which use -fmath-errno, math libcalls without any uses
require some extra checks to figure out if they are actually dead.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 .

Differential Revision: https://reviews.llvm.org/D25970

llvm-svn: 285857
2016-11-02 20:48:11 +00:00
Krzysztof Parzyszek ead77016d8 [Hexagon] Remove registers coalesced in expand-condsets from live intervals
llvm-svn: 285846
2016-11-02 17:59:54 +00:00
Davide Italiano 6b2bba14a9 [lli/COFF] Set the correct alignment for common symbols
Otherwise we set it always to zero, which is not correct,
and we assert inside alignTo (Assertion failed:
Align != 0u && "Align can't be 0.").

Differential Revision:  https://reviews.llvm.org/D26173

llvm-svn: 285841
2016-11-02 17:32:19 +00:00
Zachary Turner 7251ede7c5 Add CodeViewRecordIO for reading and writing.
Using a pattern similar to that of YamlIO, this allows
us to have a single codepath for translating codeview
records to and from serialized byte streams.  The
current patch only hooks this up to the reading of
CodeView type records.  A subsequent patch will hook
it up for writing of CodeView type records, and then a
third patch will hook up the reading and writing of
CodeView symbols.

Differential Revision: https://reviews.llvm.org/D26040

llvm-svn: 285836
2016-11-02 17:05:19 +00:00
Nicolai Haehnle 368972c3b3 AMDGPU: Allow additional implicit operands on MOVRELS instructions
Summary:
The post-RA scheduler occasionally uses additional implicit operands when
the vector implicit operand as a whole is killed, but some subregisters
are still live because they are directly referenced later. Unfortunately,
this seems incredibly subtle to reproduce.

Fixes piglit spec/glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-wr.shader_test
and others.

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25656

llvm-svn: 285835
2016-11-02 17:03:11 +00:00
Malcolm Parsons 06ac79c210 Fix Clang-tidy readability-redundant-string-cstr warnings
Reviewers: beanz, lattner, jlebar

Subscribers: jholewinski, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26235

llvm-svn: 285832
2016-11-02 16:43:50 +00:00
Nirav Dave 0a392a8e7f [ARM][MC] Cleanup ARM Target Assembly Parser
Summary:
Correctly parse end-of-statement tokens and handle preprocessor
end-of-line comments in ARM assembly processor.

Reviewers: rnk, majnemer

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26152

llvm-svn: 285830
2016-11-02 16:22:51 +00:00
Adrian Prantl f148d6989c Improve and cleanup comments in DwarfExpression.h
llvm-svn: 285829
2016-11-02 16:20:37 +00:00
Matt Arsenault 44deb7914e BranchRelaxation: Fix computing indirect branch block size
llvm-svn: 285828
2016-11-02 16:18:29 +00:00
Adrian Prantl 54286bda2c Simplify control flow in the the DWARF expression compiler
by refactoring common code into a DwarfExpressionCursor wrapper.

llvm-svn: 285827
2016-11-02 16:12:20 +00:00
Adrian Prantl 56527dc804 Emit DW_OP_piece also if the previous value was a constant.
This fixes a bug in the DWARF backend.

llvm-svn: 285826
2016-11-02 16:12:16 +00:00
Simon Pilgrim 93f2f7fb6c Use !operator to test if APInt is zero/non-zero. NFCI.
Avoids APInt construction and slower comparisons.

llvm-svn: 285822
2016-11-02 15:41:15 +00:00
Vasileios Kalintiris e3bb72ea78 [mips] Always run the MipsOptimizePICCall pass.
Summary:
Remove this pass from addMachineSSAOptimization() and register it unconditionally in through addPreRegAlloc(). This pass is required for generating correct PIC calls.

Reviewers: sdardis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26036

llvm-svn: 285814
2016-11-02 15:11:27 +00:00
Joerg Sonnenberger bef3621ad0 Create the virtual register for the global base in the intersection of
GPRC and GPRC_NOR0 (or the 64bit equivalent) and not just the latter.
GPRC_NOR0 contains ZERO as alternative meaning of r0 and is therefore
not a true subclass of GPRC.

llvm-svn: 285813
2016-11-02 15:00:31 +00:00
Aaron Ballman 3ac3a7efff Removing a switch statement that contains a default label, but no case labels. Silences an MSVC warning; NFC.
llvm-svn: 285806
2016-11-02 13:58:57 +00:00
Joerg Sonnenberger 5e31b3ad93 Simplify.
llvm-svn: 285802
2016-11-02 12:45:28 +00:00
Ulrich Weigand 75d2f1b10d [SystemZ] Fix compiler warnings introduced by r285574
SystemZAsmParser::parseOperand returns a bool, not an enum.

llvm-svn: 285800
2016-11-02 11:32:28 +00:00
Kirill Bobyrev 1f1751182e [llvm] FIx if-clause -Wmisleading-indentation issue.
While bootstrapping Clang with recent `gcc 6.2.0` I found a bug related to misleading indentation.

I believe, a pair of `{}` was forgotten, especially given the above similar piece of code:

```
      if (!RDef || !HII->isPredicable(*RDef)) {
        Done = coalesceRegisters(RD, RegisterRef(S1));
        if (Done) {
          UpdRegs.insert(RD.Reg);
          UpdRegs.insert(S1.getReg());
        }
      }
```

Reviewers: kparzysz

Differential Revision: https://reviews.llvm.org/D26204

llvm-svn: 285794
2016-11-02 10:00:40 +00:00
Bjorn Pettersson 7424c8ccd1 [Reassociate] Skip analysis of dead code to avoid infinite loop.
Summary:
It was detected that the reassociate pass could enter an inifite
loop when analysing dead code. Simply skipping to analyse basic
blocks that are dead avoids such problems (and as a side effect
we avoid spending time on optimising dead code).

The solution is using the same Reverse Post Order ordering of the
basic blocks when doing the optimisations, as when building the
precalculated rank map. A nice side-effect of this solution is
that we now know that we only try to do optimisations for blocks
with ranked instructions.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30818

Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini

Subscribers: dberlin

Differential Revision: https://reviews.llvm.org/D26154

llvm-svn: 285793
2016-11-02 08:55:19 +00:00
Dylan McKay 7549b0a013 [AVR] Add instruction selection lowering code
Summary: This adds AVRISelLowering.cpp

Reviewers: arsenm, kparzysz

Subscribers: llvm-commits, modocache, japaric, wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25034

llvm-svn: 285790
2016-11-02 06:47:40 +00:00
Peter Collingbourne ff2c2ec6b2 Bitcode: Check file size before reading bitcode header.
Should unbreak ocaml binding tests.

Also added an llvm-dis test that checks for the same thing.

llvm-svn: 285777
2016-11-02 00:39:11 +00:00
Peter Collingbourne 4e76019e34 Support: Remove MemoryObject and DataStreamer interfaces.
These interfaces are no longer used.

Differential Revision: https://reviews.llvm.org/D26222

llvm-svn: 285774
2016-11-02 00:08:37 +00:00
Peter Collingbourne 028eb5a3f8 Bitcode: Change reader interface to take memory buffers.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106595.html

This change also fixes an API oddity where BitstreamCursor::Read() would
return zero for the first read past the end of the bitstream, but would
report_fatal_error for subsequent reads. Now we always report_fatal_error
for all reads past the end. Updated clients to check for the end of the
bitstream before reading from it.

I also needed to add padding to the invalid bitcode tests in
test/Bitcode/. This is because the streaming interface was not checking that
the file size is a multiple of 4.

Differential Revision: https://reviews.llvm.org/D26219

llvm-svn: 285773
2016-11-02 00:08:19 +00:00
Alex Bradbury 6b2cca7f8f [RISCV] Add bare-bones RISC-V MCTargetDesc
This is enough to compile and link but doesn't yet do anything particularly 
useful. Once an ASM parser and printer are added in the next two patches, the 
whole thing can be usefully tested.

Differential Revision: https://reviews.llvm.org/D23562

llvm-svn: 285770
2016-11-01 23:47:30 +00:00
Alex Bradbury 24d9b13b36 [RISCV 4/10] Add basic RISCV{InstrFormats,InstrInfo,RegisterInfo,}.td
For now, only add instruction definitions for basic ALU operations. Our 
initial target is a working MC layer rather than codegen, so appropriate 
SelectionDAG patterns will come later.

Differential Revision: https://reviews.llvm.org/D23561

llvm-svn: 285769
2016-11-01 23:40:28 +00:00
Matt Arsenault c507cdb4bc AMDGPU: Handle CopyToReg in getOperandRegClass
llvm-svn: 285768
2016-11-01 23:22:17 +00:00
Matt Arsenault 663ab8c119 AMDGPU: Use brev for materializing SGPR constants
This is already done with VGPR immediates and saves 4 bytes.

llvm-svn: 285765
2016-11-01 23:14:20 +00:00
Matt Arsenault 3d463193a9 AMDGPU: Default to using scalar mov to materialize immediate
This is the conservatively correct way because it's easy to
move or replace a scalar immediate. This was incorrect in the case
when the register class wasn't known from the static instruction
definition, but still needed to be an SGPR. The main example of this
is inlineasm has an SGPR constraint.

Also start verifying the register classes of inlineasm operands.

llvm-svn: 285762
2016-11-01 22:55:07 +00:00
Eric Christopher 690f8e587e Move the initialization of PreferredLoopExit into runOnMachineFunction to be near the other function specific initializations.
llvm-svn: 285758
2016-11-01 22:15:50 +00:00
Sam McCall 2a36eee4ca Fix uninitialized access in MachineBlockPlacement.
Summary:
Currently PreferredLoopExit is set only in buildLoopChains, which is
never called if there are no MachineLoops.

MSan is currently broken by this:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio

This is a naive fix to get things green again. iteratee: you may have a better fix.

This change will also mean PreferredLoopExit will not carry over if
buildCFGChains() is called a second time in runOnMachineFunction, this
appears to be the right thing.

Reviewers: bkramer, iteratee, echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26069

llvm-svn: 285757
2016-11-01 22:02:14 +00:00
Matt Arsenault a6319b82ca AMDGPU: Stop creating unused virtual registers
These are only used in the spill to VMEM path. Move them to
the one use.

llvm-svn: 285756
2016-11-01 21:58:07 +00:00
George Burgess IV 66837aba0a [MemorySSA] Tighten up types to make our API prettier. NFC.
Patch by bryant.

Differential Revision: https://reviews.llvm.org/D26126

llvm-svn: 285750
2016-11-01 21:17:46 +00:00
Sanjay Patel 9840cdad4c [ValueTracking] remove TODO comment; NFC
InstCombine should always canonicalize patterns like the one shown in the comment
when visiting 'select' insts in adjustMinMax().

Scalars were already handled there, and vector splats are handled after:
https://reviews.llvm.org/rL285732

llvm-svn: 285744
2016-11-01 20:43:00 +00:00
Matt Arsenault 2d8c289b4b AMDGPU: Workaround for instruction size with literals
Instructions with a 32-bit base encoding with an optional
32-bit literal encoded after them report their size as 4
for the disassembler. Consider these when computing the
MachineInstr size. This fixes problems caused by size estimate
consistency in BranchRelaxation.

llvm-svn: 285743
2016-11-01 20:42:24 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Krzysztof Parzyszek 654dc11b79 [Hexagon] Rename operand/predicate names for unshifted integers
For example, rename s6Ext to s6_0Ext. The names for shifted integers
include the underscore and this will make the naming consistent. It
also exposed a few duplicates that were removed.

llvm-svn: 285728
2016-11-01 19:02:10 +00:00
Matt Arsenault cb578f84e0 BranchRelaxation: Expand unconditional branches first
It's likely if a conditional branch needs to be expanded, the following
unconditional branch will also need expansion. By expanding the
unconditional branch first, the conditional branch can be simply
inverted to jump over the inserted indirect branch block. If the
conditional branch is expanded first, it results in an additional
branch.

This avoids test regressions in future commits.

llvm-svn: 285722
2016-11-01 18:34:00 +00:00
Sanjay Patel 644d7c3b8a [InstCombine] clean up adjustMinMax(); NFCI
1. Change param names for readability
2. Change pointer param to ref
3. Early exit to reduce indent
4. Change switch to if/else

llvm-svn: 285718
2016-11-01 18:15:03 +00:00
Konstantin Zhuravlyov d971a1123f [AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32
This will prevent following regression when enabling i16 support (D18049):

test/CodeGen/AMDGPU/ctlz.ll
test/CodeGen/AMDGPU/ctlz_zero_undef.ll

Differential Revision: https://reviews.llvm.org/D25802

llvm-svn: 285716
2016-11-01 17:49:33 +00:00
Sanjay Patel 7ce658388b [InstCombine] add helper function for adjustMinMax(); NFCI
This is just a cut and paste; clean-up and enhancements to follow.

llvm-svn: 285715
2016-11-01 17:46:08 +00:00
Alex Bradbury b2e5472d85 [RISCV] Add stub backend
This contains just enough for lib/Target/RISCV to compile. Notably a basic 
RISCVTargetMachine and RISCVTargetInfo. At this point you can attempt llc 
-march=riscv32 myinput.ll and will find it fails due to the lack of 
MCAsmInfo.

See http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html for 
further discussion

Differential Revision: https://reviews.llvm.org/D23560

llvm-svn: 285712
2016-11-01 17:27:54 +00:00
Tom Stellard 9677b60288 AMDGPU: Fix buildbots broken by r285704
llvm-svn: 285711
2016-11-01 17:20:03 +00:00
Alex Bradbury 1524f62b97 [RISCV] Add RISC-V ELF defines
Add the necessary definitions for RISC-V ELF files, including relocs. Also 
make necessary trivial change to ELFYaml, llvm-objdump, and llvm-readobj in 
order to work with RISC-V ELFs.

Differential Revision: https://reviews.llvm.org/D23557

llvm-svn: 285708
2016-11-01 16:59:37 +00:00
Alex Bradbury b6e784a240 [RISCV] Recognise riscv32 and riscv64 in triple parsing code
This is the first in a series of 10 initial patches that incrementally add an 
MC layer for RISC-V to LLVM. See 
<http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html> for more 
discussion.

Differential Revision: https://reviews.llvm.org/D23557

llvm-svn: 285707
2016-11-01 16:47:54 +00:00
Alex Bradbury 58eba09949 [TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.

This patch is a prerequisite for D23563

Differential Revision: https://reviews.llvm.org/D23496

llvm-svn: 285705
2016-11-01 16:32:05 +00:00
Tom Stellard 94c21bc088 AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64
I wanted to implement this as a target independent expansion, however when
targets say they want to expand FP_TO_FP16 what they actually want is
the unsafe math expansion when possible and expansion to a libcall in all
other cases.

The only way to make this work as a target independent would be to add logic
to target's TargetLowering construction to mark theses nodes as Expand when
LegalizeDAG can use the unsafe expansion and mark them as LibCall when it
cannot.  I think this would be possible, but I think it would be too fragile
and complex as it would require targets to keep their expansion logic up
to date with the code in LegalizeDAG.

Reviewers: bogner, ab, t.p.northover, arsenm

Subscribers: wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D25999

llvm-svn: 285704
2016-11-01 16:31:48 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
James Molloy 70a3d6df52 [Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables
[Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment]

The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions.

It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size.

TBB example:
Before: lsls r0, r0, #2    After: add  r0, pc
        adr  r1, .LJTI0_0         ldrb r0, [r0, #6]
        ldr  r0, [r0, r1]         lsls r0, r0, #1
        mov  pc, r0               add  pc, r0
  => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4.

The only case that can increase dynamic instruction count is the TBH case:

Before: lsls r0, r4, #2    After: lsls r4, r4, #1
        adr  r1, .LJTI0_0         add  r4, pc
        ldr  r0, [r0, r1]         ldrh r4, [r4, #6]
        mov  pc, r0               lsls r4, r4, #1
                                  add  pc, r4
  => 1 more instruction in prologue. Jump table shrunk by a factor of 2.

So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!)

llvm-svn: 285690
2016-11-01 13:37:41 +00:00
Valery Pykhtin 8a89d3662a [AMDGPU] Expand vector mulhu/mulhs
Differential revision: https://reviews.llvm.org/D26077

llvm-svn: 285684
2016-11-01 10:26:48 +00:00
Nemanja Ivanovic e70fa63390 [PowerPC] Implement vector shift builtins - llvm portion
This patch corresponds to review https://reviews.llvm.org/D26095.
Committing on behalf of Tony Jiang.

llvm-svn: 285681
2016-11-01 09:42:32 +00:00
Serge Pavlov 6ac8e034f6 Allow resolving response file names relative to including file
If a response file included by construct @file itself includes a response file
and that file is specified by relative file name, current behavior is to resolve
the name relative to the current working directory. The change adds additional
flag to ExpandResponseFiles that may be used to resolve nested response file
names relative to including file. With the new mode a set of related response
files may be kept together and reference each other with short position
independent names.

Differential Revision: https://reviews.llvm.org/D24917

llvm-svn: 285675
2016-11-01 06:53:29 +00:00
Sanjoy Das 9c729067e9 [TBAA] Use wrapper objects instead of raw getOperand s; NFC
This is intended to make the semantic intent clearer.

The wrapper objects are now generic to avoid `const_cast` s.  Since
`const` ness is part of the API of `MDNode::getMostGenericTBAA` (and
therefore I can't make things `const` all the way through without some
code churn outside TypeBasedAliasAnalysis.cpp), this seemed like the
cleanest solution.

llvm-svn: 285665
2016-11-01 02:58:30 +00:00
Sanjoy Das a1a77f3a21 [TBAA] Rename accessors to be more idiomatic; NFC
llvm-svn: 285661
2016-11-01 01:21:57 +00:00
Peter Collingbourne d3a6c70b2d Bitcode: Simplify BitstreamWriter::EnterBlockInfoBlock() interface.
No block info block should need to define local abbreviations, so we can
always use a code width of 2.

Also change all block info block writers to use EnterBlockInfoBlock.

Differential Revision: https://reviews.llvm.org/D26168

llvm-svn: 285660
2016-11-01 01:18:57 +00:00
Matt Arsenault f3dd863031 AMDGPU: Whitespace fixes
llvm-svn: 285659
2016-11-01 00:55:14 +00:00
Sanjay Patel 70c5f02d25 [DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded bits (PR30841)
This bug was exposed by using nsw/nuw for more aggressive folds in:
https://reviews.llvm.org/rL284844

The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(),
but we can't just flip flag bits in the DAG; we have to create a new node that has the
bits cleared.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30841 

llvm-svn: 285656
2016-10-31 23:28:45 +00:00
Davide Italiano 51cbe13a3f [Hexagon] Garbage collect dead code.
llvm-svn: 285654
2016-10-31 22:56:56 +00:00
Evgeniy Stepanov 1bd9fc7098 Fix a typo.
Found with PVS-Studio here: http://www.viva64.com/en/b/0446/

llvm-svn: 285652
2016-10-31 22:42:39 +00:00
Saleem Abdulrasool e1aa782bd0 CodeGen: further loosen -O0 CG for WoA division
Generate the slowest possible codepath for noopt CodeGen.  Even trying to be
clever with the negated jump can cause out-of-range jumps.  Use a wide branch
instead. Although the code is modelled simplistically, the later optimizations
would recombine the branching into `cbz` if possible.  This re-enables the
previous optimization as well as hopefully gives us working code in all cases.

Addresses PR30356!

llvm-svn: 285649
2016-10-31 22:12:37 +00:00
Teresa Johnson 002af9bbce [ThinLTO] Disable importing and other cross-module optis at -O0
Summary:
There is no point to importing at -O0, since we won't inline. We should
also disable other cross-module optimizations.

(Plan to backport this fix to the 3.9 branch to fix PR30774)

Reviewers: pcc

Subscribers: johanengelen, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25918

llvm-svn: 285648
2016-10-31 22:12:21 +00:00
Justin Lebar ed1e312f05 [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.
Summary:
This has been replaced by the NVPTXInferAddressSpaces pass.  We've had
the new one as the default with the old one accessible via a flag for
some months now, and we've had no problems.

Reviewers: tra

Subscribers: llvm-commits, jholewinski, jingyue, mgorny

Differential Revision: https://reviews.llvm.org/D26165

llvm-svn: 285642
2016-10-31 21:51:42 +00:00
Kevin Enderby d503940e8f More additional error checks for invalid Mach-O files when
the offsets and sizes of an element of the file overlaps with
another element in the Mach-O file.

This shows the approach to this testing for three elements
and contains for tests for their overlap.  Checking for all the
remain elements will be added next.

llvm-svn: 285632
2016-10-31 20:29:48 +00:00