Commit Graph

252946 Commits

Author SHA1 Message Date
Jason Molenda a1609ff658 Jim unintentionally had the gdb-format specifiers falling through
after r276132 so that 'x/4b' would print out a series of 4 8-byte
quantities.  Fix that, add a test case.

<rdar://problem/29930833> 

llvm-svn: 293002
2017-01-25 01:41:48 +00:00
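For readers unfamiliar with gdb-format specifiers such as the `4b` in `x/4b` (r293002 above), here is a minimal sketch of how a count, format letter, and size letter can be pulled apart. It is not LLDB's parser. The function name `parse_gdb_format` and both lookup tables are hypothetical; the size letters follow gdb's documented `b`/`h`/`w`/`g` convention.

```python
import re

# gdb size letters and the number of bytes each one selects.
SIZE_LETTERS = {"b": 1, "h": 2, "w": 4, "g": 8}
# A subset of gdb display-format letters (assumed list, for illustration only).
FORMAT_LETTERS = set("xduotacfsi")


def parse_gdb_format(spec):
    """Split a specifier such as '4b' into (count, format_letter, size_in_bytes)."""
    match = re.fullmatch(r"(\d*)([a-z]*)", spec)
    if match is None:
        raise ValueError(f"malformed format specifier: {spec!r}")
    count = int(match.group(1)) if match.group(1) else 1
    fmt = None
    size = None
    for letter in match.group(2):
        if letter in SIZE_LETTERS:
            # The size must be recorded here; letting it fall through to a
            # default is the kind of mistake that turns 'b' into 8-byte units.
            size = SIZE_LETTERS[letter]
        elif letter in FORMAT_LETTERS:
            fmt = letter
        else:
            raise ValueError(f"unknown letter {letter!r} in {spec!r}")
    return count, fmt, size
```

With this sketch, `parse_gdb_format("4b")` yields a count of 4 and a size of 1 byte, which is the behaviour the commit restores.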
Arpith Chacko Jacob 4dbf368e14 [OpenMP] Codegen support for 'target teams' on the host.
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.

This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084

llvm-svn: 293001
2017-01-25 01:38:33 +00:00
Tom Stellard 2f3f9855f0 AMDGPU: add support for spilling to a buffer pointed to by a user SGPR
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

Patch By: Dave Airlie

Reviewers: nhaehnle, arsenm, tstellarAMD

Reviewed By: arsenm

Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25428

llvm-svn: 293000
2017-01-25 01:25:13 +00:00
Arpith Chacko Jacob e04da5dee2 [OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device.
This patch adds support for the Spmd construct 'target parallel' on the
NVPTX device. This involves ignoring the num_threads clause on the device
since the number of threads in this combined construct is already set on
the host through the call to __tgt_target_teams().

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29083

llvm-svn: 292999
2017-01-25 01:18:34 +00:00
Kostya Serebryany 99259ee40c [asan] fix __sanitizer_cov_with_check to get the correct caller PC. Before this fix the code relied on the fact that the other function (__sanitizer_cov) is inlined. This was true with clang builds on x86, but not true with gcc builds on x86 and on PPC. This caused bot redness after r292862
llvm-svn: 292998
2017-01-25 01:14:24 +00:00
Arpith Chacko Jacob 33c849a007 [OpenMP] Support for the num_threads-clause on 'target parallel'.
The num_threads-clause on the combined directive applies to the
'parallel' region of this construct. We modify the NumThreadsClause
class to capture the clause expression within the 'target' region.

The offload runtime call for 'target parallel' is changed to
__tgt_target_teams() with 1 team and the number of threads set by
this clause or a default if none.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29082

llvm-svn: 292997
2017-01-25 00:57:16 +00:00
Eugene Zelenko 11f6907f40 [AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292996
2017-01-25 00:29:26 +00:00
Justin Bogner a029531e10 GlobalISel: Use the correct types when translating landingpad instructions
There was a bug here where we were using p0 instead of s32 for the
selector type in the landingpad. Instead of hardcoding these types we
should get the types from the landingpad instruction directly.

Note that we replicate an assert from SDAG here to only support
two-valued landingpads.

llvm-svn: 292995
2017-01-25 00:16:53 +00:00
Kostya Serebryany d843cd55b5 [asan] temporarily disable parts of a test that fail after r292862
llvm-svn: 292994
2017-01-24 23:58:21 +00:00
Kevin Enderby 7a165755ba Fix llvm-objdump so it picks a good CPU for Mach-O files
with CPU_SUBTYPE_ARM_V7S and CPU_SUBTYPE_ARM_V7K.

These two CPU subtypes should default to a cortex-a7 CPU
to give proper disassembly without a -mcpu= flag.

rdar://27431703

llvm-svn: 292993
2017-01-24 23:41:04 +00:00
Marshall Clow 3cd9e94241 Implement LWG2556: Wide contract for future::share()
llvm-svn: 292992
2017-01-24 23:28:25 +00:00
Richard Smith 73edb6d0cc PR31742: Don't emit a bogus "zero size array" extwarn when initializing a
runtime-sized array from an empty list in an array new.

llvm-svn: 292991
2017-01-24 23:18:28 +00:00
Marshall Clow 63b560be69 Change the return type of emplace_[front|back] back to void when building with C++14 or before. Resolves PR31680.
llvm-svn: 292990
2017-01-24 23:09:12 +00:00
Hafiz Abid Qadeer b10fb96541 Provide option to set pc of the file loaded in memory.
Summary: This commit adds an option to set PC to the entry point of the file loaded using "target module load" command. In D28804, Greg asked me to separate this part under a different option.

Reviewers: clayborg

Reviewed By: clayborg

Subscribers: lldb-commits

Differential Revision: https://reviews.llvm.org/D28944

llvm-svn: 292989
2017-01-24 23:07:27 +00:00
Eugene Zelenko 8c6ed0f3a0 [XCore] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292988
2017-01-24 23:02:48 +00:00
Hafiz Abid Qadeer 68d7f37d26 Fix a bug where lldb does not respect the packet size.
Summary: LLDB was using the packet size advertised by the target as the maximum memory size to write in one go. That is wrong because packets carry other overhead besides the memory payload. Also, memory transferred through 'm' and 'M' packets needs 2 bytes in the packet for each byte of memory.

Reviewers: clayborg

Reviewed By: clayborg

Subscribers: lldb-commits

Differential Revision: https://reviews.llvm.org/D28808

llvm-svn: 292987
2017-01-24 22:55:36 +00:00
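The arithmetic behind the packet-size fix (r292987 above) can be sketched as follows. This is an illustration, not LLDB's code: the helper names are hypothetical, and it assumes the advertised limit covers the whole `$M<addr>,<len>:<hex data>#<checksum>` packet, including the `$`, `#`, and two checksum digits.

```python
def packet_length(addr, nbytes):
    """Total characters in an 'M' write packet carrying nbytes of memory."""
    header = f"M{addr:x},{nbytes:x}:"
    framing = len("$") + len("#") + 2  # start marker, end marker, 2 checksum digits
    # Each byte of memory is sent as two hex characters.
    return framing + len(header) + 2 * nbytes


def max_memory_write(packet_size, addr):
    """Largest memory write, in bytes, whose packet fits in packet_size."""
    nbytes = packet_size // 2  # upper bound: hex encoding alone doubles the size
    while nbytes > 0 and packet_length(addr, nbytes) > packet_size:
        nbytes -= 1
    return nbytes
```

Under these assumptions a 1024-character limit and the address 0x20000000 leave room for 503 bytes of memory per packet, roughly half of what the advertised size suggests if it is treated as a payload size.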
Marshall Clow e67179bc6c Remove auto_ptr in C++17. Get it back by defining _LIBCPP_ENABLE_CXX17_REMOVED_AUTO_PTR
llvm-svn: 292986
2017-01-24 22:22:33 +00:00
Matt Arsenault bf67cf7e4b AMDGPU: Remove spurious out branches after a kill
A sequence like this:
  v_cmpx_le_f32_e32 vcc, 0, v0
  s_branch BB0_30
  s_cbranch_execnz BB0_30
  ; BB#29:
  exp null off, off, off, off done vm
  s_endpgm
  BB0_30:
  ; %endif110

is likely wrong. The s_branch instruction will unconditionally jump
to BB0_30 and the skip block (exp done + endpgm) inserted for
performing the kill instruction will never be executed. This results
in a GPU hang with Star Ruler 2.

The s_branch instruction is added during the "Control Flow Optimizer"
pass which seems to re-organize the basic blocks, and we assume
that SI_KILL_TERMINATOR is always the last instruction inside a
basic block. Thus, after inserting a skip block we just go to the
next BB without looking at the subsequent instructions after the
kill, and the s_branch op is never removed.

Instead, we should remove the unconditional out branches and let
s_cbranch_execnz skip the two instructions if the exec mask is non-zero.

This patch fixes the GPU hang and doesn't introduce any regressions
with "make check".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019

Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>

llvm-svn: 292985
2017-01-24 22:18:39 +00:00
Wei Mi f1cf0278e8 Revert rL292621. Caused some internal build bot failures at Apple.
llvm-svn: 292984
2017-01-24 22:15:06 +00:00
Eugene Zelenko 3943d2b0d7 [SystemZ] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292983
2017-01-24 22:10:43 +00:00
Matt Arsenault 7aad8fd8f4 Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Kuba Mracek e4c1dd2c08 [tsan] Enable ignore_noninstrumented_modules=1 on Darwin by default
TSan recently got the "ignore_noninstrumented_modules" flag, which disables tracking of reads and writes that come from noninstrumented modules (via interceptors). This is a way of suppressing false positives coming from system libraries and other noninstrumented code. This patch turns this on by default on Darwin, where it's supposed to replace the previous solution, "ignore_interceptors_accesses", which disables tracking in *all* interceptors. The new approach should re-enable TSan's ability to find races via interceptors on Darwin.

Differential Revision: https://reviews.llvm.org/D29041

llvm-svn: 292981
2017-01-24 21:37:50 +00:00
Rui Ueyama 8981f3aacf Add a file comment to SyntheticSections.h.
llvm-svn: 292980
2017-01-24 21:35:25 +00:00
Dehao Chen a5eb1689dc Explicitly promote indirect calls before sample profile annotation.
Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.

Reviewers: xur, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29040

llvm-svn: 292979
2017-01-24 21:05:51 +00:00
Richard Smith 6536f6058d Strengthen test from r292632 to also check we get the mangling correct for this case.
llvm-svn: 292978
2017-01-24 21:03:48 +00:00
Saleem Abdulrasool 85824ee618 Demangle: correct demangling for CV-qualified functions
When demangling a CV-qualified function type with a final reference type
parameter, we would treat the reference type parameter as an r-value ref
accidentally.  This would result in the improper decoration of the
function type itself.

Resolves PR31741!

llvm-svn: 292976
2017-01-24 20:04:58 +00:00
Saleem Abdulrasool 25ee0a62ac Demangle: use named values for CV qualifiers
Rather than hard-coding magic values of 1, 2, 4 (bit-field), use an enum
to name the values.  NFC.

llvm-svn: 292975
2017-01-24 20:04:56 +00:00
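The idea in r292975 above, naming the bit-field values 1, 2, 4 instead of hard-coding them, can be sketched like this. The class and function names are hypothetical, and the bit assignment (const = 1, volatile = 2, restrict = 4) is an assumption about the demangler rather than something stated in the log. The `r`, `V`, `K` letter order comes from the Itanium C++ ABI mangling grammar.

```python
from enum import IntFlag


class Qualifiers(IntFlag):
    """Named CV-qualifier bits (assumed values, replacing magic 1, 2, 4)."""
    NONE = 0
    CONST = 1
    VOLATILE = 2
    RESTRICT = 4


def parse_cv_qualifiers(mangled):
    """Consume an optional [r][V][K] prefix; return (qualifiers, remainder)."""
    quals = Qualifiers.NONE
    pos = 0
    for letter, flag in (("r", Qualifiers.RESTRICT),
                         ("V", Qualifiers.VOLATILE),
                         ("K", Qualifiers.CONST)):
        if mangled[pos:pos + 1] == letter:
            quals |= flag
            pos += 1
    return quals, mangled[pos:]
```

For example, the type `KFvRmE` mentioned in r292973 starts with `K`, so this sketch reports `Qualifiers.CONST` and leaves `FvRmE` for the function-type parser.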
Ivan Krasin 34e89ad0a4 Revert [AMDGPU][mc][tests][NFC] Add coverage/smoke tests for Gfx7 and Gfx8.
Reason: broke ASAN bots with a global buffer overflow.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/2291

Each test contains 20-30K test cases but takes only several (from 4 to 10)
seconds to complete on an average machine. The tests cover the majority of
AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and are intended
to quickly find out if something is broken.

llvm-svn: 292974
2017-01-24 19:58:59 +00:00
Saleem Abdulrasool f5d26bb142 cxa_demangle: fix rvalue ref check
When checking if the type is an r-value ref, we would not do a complete
check.  This would result in us treating a trailing parameter reference
`&)` as a r-value ref, and improperly inject the cv qualifier on the
type.  We now correctly demangle the type `KFvRmE` as a constant
function rather than a constant reference.

Fixes PR31741!

llvm-svn: 292973
2017-01-24 19:57:05 +00:00
Peter Collingbourne 65cb42c1ce IRGen: Factor out function CodeGenAction::loadModule. NFCI.
llvm-svn: 292972
2017-01-24 19:55:38 +00:00
Daniel Berlin 390dfde0f3 Remove the load hoisting code of MLSM, it is completely subsumed by GVNHoist
Summary:
GVNHoist performs all the optimizations that MLSM does to loads, in a
more general way, and in a faster time bound (MLSM is N^3 in most
cases, N^4 in a few edge cases).

This disables the load portion.

Note that the way ld_hoist_st_sink.ll is written makes one think that
the loads should be moved to the while.preheader block, but

1. Neither MLSM nor GVNHoist does it (they both move them to identical places).

2. MLSM couldn't possibly do it anyway, as the while.preheader block
is not the head of the diamond, while.body is.  (GVNHoist could do it
if it was legal).

3. At a glance, it's not legal anyway because the in-loop loads
conflict with the in-loop store, so the loads must stay in-loop.

I am happy to update the test to use update_test_checks so that
checking is tighter, just was going to do it as a followup.

Note that i can find no particular benefit to the store portion on any
real testcase/benchmark i have (even size-wise).  If we really still
want it, i am happy to commit to writing a targeted store sinker, just
taking the code from the MemorySSA port of MergedLoadStoreMotion
(which is N^2 worst case, and N most of the time).

We can do what it does in a much better time bound.

We also should be both hoisting and sinking stores, not just sinking
them, anyway, since whether we should hoist or sink to merge depends
basically on luck of the draw of where the blockers are placed.

Nonetheless, i have left it alone for now.

Reviewers: chandlerc, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29079

llvm-svn: 292971
2017-01-24 19:55:36 +00:00
Peter Collingbourne 47d2364a51 IRGen: Factor out function clang::FindThinLTOModule. NFCI.
llvm-svn: 292970
2017-01-24 19:54:37 +00:00
Marshall Clow 86e7eae3a5 Add a test to make sure that implicit conversion from error_code to bool will fail
llvm-svn: 292969
2017-01-24 19:44:55 +00:00
Richard Smith 081ad4d3e5 [docs] Add TableGen-based generator for command line argument documentation,
and generate documentation for all (non-hidden) options supported by the
'clang' driver.

llvm-svn: 292968
2017-01-24 19:39:46 +00:00
Marshall Clow 1ee7bf6313 Update status for LWG2733
llvm-svn: 292967
2017-01-24 19:37:09 +00:00
Changpeng Fang c85abbd955 AMDGPU/SI: Give up in promote alloca when a pointer may be captured.
Differential Revision:
  http://reviews.llvm.org/D28970

Reviewer:
  Matt

llvm-svn: 292966
2017-01-24 19:06:28 +00:00
Saleem Abdulrasool c38cd326fc Demangle: avoid butchering parameter type
When demangling a CV-qualified function type with a final parameter with
a reference type, we would insert the CV qualification on the parameter
rather than the function, and in the process adjust the insertion point
by one extra, splitting the type name.  This avoids doing so, even
though the attribution is still incorrect.

llvm-svn: 292965
2017-01-24 18:52:19 +00:00
Mehdi Amini 3406bb6748 Fix test/Driver/embed-bitcode.c on non-Darwin host by setting the target explicitly
llvm-svn: 292964
2017-01-24 18:49:49 +00:00
Saleem Abdulrasool 0c44db8f0a cxa_demangle: avoid butchering the last parameter type
Fix an off-by-one case which would destroy the final parameter in a
CV-qualified function type with a reference.  We still get the CV
qualification incorrect, but at least we do not clobber the type name
any longer.

Partially fixes PR31741.

llvm-svn: 292963
2017-01-24 18:42:56 +00:00
Marshall Clow 0d1b5ce4f9 Implement LWG2733: [fund.ts.v2] gcd / lcm and bool. We already did this for C++17, so replicate the changes in experimental.
llvm-svn: 292962
2017-01-24 18:15:48 +00:00
Mehdi Amini 04e1a0a171 Forward -bitcode_process_mode to ld64 in marker-only mode
Reviewers: steven_wu

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D29066

llvm-svn: 292961
2017-01-24 18:15:21 +00:00
Mehdi Amini 6683e22c25 Split isUsingLTO() outside of embedBitcodeInObject() and embedBitcodeMarkerOnly().
Summary: These accessors map directly to the command line option.

Reviewers: steven_wu

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D29065

llvm-svn: 292960
2017-01-24 18:12:25 +00:00
Chad Rosier 8e11fbd15d [AArch64] Fix typo. NFC.
llvm-svn: 292959
2017-01-24 18:08:10 +00:00
Marshall Clow 77dd30b557 Mark LWG2736 as complete. No code changes, but we have more tests now
llvm-svn: 292958
2017-01-24 18:03:32 +00:00
Amaury Sechet d90f5f6698 Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one.
Summary: As per title. This will add the instructions we are interested in to the worklist.

Reviewers: mehdi_amini, majnemer, andreadb

Differential Revision: https://reviews.llvm.org/D29081

llvm-svn: 292957
2017-01-24 17:48:25 +00:00
Stanislav Mekhanoshin 22a56f2f5a [AMDGPU] Add VGPR copies post regalloc fix pass
Regalloc creates COPY instructions which do not formally use VALU.
That results in v_mov instructions displaced after exec mask modification.
One pass which does this is SIOptimizeExecMasking, but potentially it can be
done by other passes too.

This patch adds a pass immediately after regalloc to add implicit exec
use operand to all VGPR copy instructions.

Differential Revision: https://reviews.llvm.org/D28874

llvm-svn: 292956
2017-01-24 17:46:17 +00:00
Reid Kleckner 310c3d3d26 Fix pc_array bounds check to use elements instead of bytes
pc_array_size and kPcArrayMaxSize appear to be measured in elements, not
bytes, so we shouldn't multiply idx by sizeof(uptr) in this bounds
check.  32-bit Chrome was tripping this assertion because it has 64
million coverage points. I don't think it's worth adding a test that has
that many coverage points.

llvm-svn: 292955
2017-01-24 17:45:35 +00:00
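The units mismatch fixed in r292955 above can be illustrated with a small sketch. It is not the sanitizer runtime's code. The capacity value `1 << 26` and the 4-byte word size are assumptions chosen to model a 32-bit process; only the names `kPcArrayMaxSize` and `sizeof(uptr)` come from the commit message.

```python
# Assumed capacity of the PC array, measured in elements (hypothetical value).
K_PC_ARRAY_MAX_SIZE = 1 << 26
# Stand-in for sizeof(uptr) in a 32-bit process.
WORD_SIZE = 4


def in_bounds_bytes(idx):
    """Old check: scales idx to bytes but compares against an element count."""
    return idx * WORD_SIZE < K_PC_ARRAY_MAX_SIZE


def in_bounds_elements(idx):
    """Fixed check: both sides are measured in elements."""
    return idx < K_PC_ARRAY_MAX_SIZE
```

With these assumed numbers the old check rejects any index at or above 16,777,216, a quarter of the real capacity, so a binary with tens of millions of coverage points fails the assertion even though the array still has room.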
Evandro Menezes 7784cacd91 [AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128'
In order to follow the pattern of the existing 'slow-misaligned-128store'
option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'.

llvm-svn: 292954
2017-01-24 17:34:31 +00:00
Chris Bieneman bef847c3ae [Lanai] Rename LanaiInstPrinter library to LanaiAsmPrinter
Summary:
    This is in keeping with LLVM convention. The classes are InstPrinters, but the library is ${target}AsmPrinter.

This patch is in response to bryant pointing out to me that Lanai was the only backend deviating from convention here. Thanks!

Reviewers: jpienaar, bryant

Subscribers: mgorny, jgosnell, llvm-commits

Differential Revision: https://reviews.llvm.org/D29043

llvm-svn: 292953
2017-01-24 17:27:01 +00:00
Sanjay Patel 562272536a [InstSimplify] try to eliminate icmp Pred (add nsw X, C1), C2
I was surprised to see that we're missing icmp folds based on 'add nsw' in InstCombine, 
but we should handle the InstSimplify cases first because that could make the InstCombine
code simpler.

Here are Alive-based proofs for the logic:

Name: add_neg_constant
Pre: C1 < 0 && (C2 > ((1<<(width(C1)-1)) + C1))
%a = add nsw i7 %x, C1
%b = icmp sgt %a, C2
  =>
%b = false

Name: add_pos_constant
Pre: C1 > 0 && (C2 < ((1<<(width(C1)-1)) + C1 - 1))
%a = add nsw i6 %x, C1
%b = icmp slt %a, C2
  =>
%b = false

Name: nuw
Pre: C1 u>= C2
%a = add nuw i11 %x, C1
%b = icmp ult %a, C2
  =>
%b = false

Differential Revision: https://reviews.llvm.org/D29053

llvm-svn: 292952
2017-01-24 17:03:24 +00:00
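The three Alive proofs in r292952 above can be cross-checked by brute force at small bit widths. The sketch below is an independent sanity check, not Alive and not the InstSimplify code. It assumes the precondition arithmetic wraps at the operand width, so that `1<<(width-1)` is the signed minimum, and it skips inputs where the `nsw`/`nuw` add would overflow, since the result is poison there. The helper names are hypothetical.

```python
from itertools import product


def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def check_folds(width):
    """Exhaustively test the three folds; return how many cases were covered."""
    smin, smax = -(1 << (width - 1)), (1 << (width - 1)) - 1
    umax = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    checked = 0
    # Signed folds: add nsw followed by icmp sgt / icmp slt.
    for x, c1, c2 in product(range(smin, smax + 1), repeat=3):
        if not smin <= x + c1 <= smax:
            continue  # signed overflow: nsw makes the result poison
        if c1 < 0 and c2 > to_signed(sign_bit + c1, width):
            if x + c1 > c2:
                raise AssertionError(("add_neg_constant", x, c1, c2))
            checked += 1
        if c1 > 0 and c2 < to_signed(sign_bit + c1 - 1, width):
            if x + c1 < c2:
                raise AssertionError(("add_pos_constant", x, c1, c2))
            checked += 1
    # Unsigned fold: add nuw followed by icmp ult.
    for x, c1, c2 in product(range(umax + 1), repeat=3):
        if x + c1 > umax:
            continue  # unsigned overflow: nuw makes the result poison
        if c1 >= c2:
            if x + c1 < c2:
                raise AssertionError(("nuw", x, c1, c2))
            checked += 1
    return checked
```

Calling `check_folds(5)` walks every 5-bit combination of `x`, `C1`, and `C2` and raises if any input satisfying a precondition makes the compare true. An exhaustive check at one width is weaker than the Alive proofs, which hold for all widths, but it is a quick way to catch a mistyped precondition.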