Commit Graph

271211 Commits

Author SHA1 Message Date
Keith Wyss 7248a8bc33 [XRay][tools] Disable windows for tests that use an unsupported shell redirect.
The tests are filechecking against stderr and use some magic to make stdout go
away and pipe stderr to FileCheck. This broke bots on windows.

llvm-svn: 312739
2017-09-07 19:10:34 +00:00
Rafael Espindola c20759038b Drop --no-threads from tests.
The performance problem with --threads is fixed.

llvm-svn: 312738
2017-09-07 19:07:49 +00:00
Justin Lebar 78137ec868 [CUDA] When compilation fails, print the compilation mode.
Summary:
That is, instead of "1 error generated", we now say "1 error generated
when compiling for sm_35".

This (partially) solves a usability foogtun wherein e.g. users call a
function that's only defined on sm_60 when compiling for sm_35, and they
get an unhelpful error message.

Reviewers: tra

Subscribers: sanjoy, cfe-commits

Differential Revision: https://reviews.llvm.org/D37548

llvm-svn: 312736
2017-09-07 18:37:16 +00:00
Adrian McCarthy 8fe23bc520 Fix for bug 34510 - Minidump target does not resolve new symbols correctly
Even though the content of the minidump does not change in a debugging session,
frames can't be indiscriminately be cached since modules and symbols can be
explicitly added after the minidump is loaded.

The fix is simple, just let the base Thread::ClearStackFrames() do its job.

submitted by amccarth on behalf of lemo

Bug: https://bugs.llvm.org/show_bug.cgi?id=34510

Differential Revision: https://reviews.llvm.org/D37527

llvm-svn: 312735
2017-09-07 18:29:48 +00:00
Artem Belevich 8af4e23d1e [CUDA] Added rudimentary support for CUDA-9 and sm_70.
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.

On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.

Differential Revision: https://reviews.llvm.org/D37576

llvm-svn: 312734
2017-09-07 18:14:32 +00:00
Keith Wyss 9420ec3378 [XRay][tools] Function call stack based analysis tooling for XRay traces
Second try after fixing a code san problem with iterator reference types.

This change introduces a subcommand to the llvm-xray tool called
"stacks" which allows for analysing XRay traces provided as inputs and
accounting time to stacks instead of just individual functions. This
gives us a more precise view of where in a program the latency is
actually attributed.

The tool uses a trie data structure to keep track of the caller-callee
relationships as we process the XRay traces. In particular, we keep
track of the function call stack as we enter functions. While we're
doing this we're adding nodes in a trie and indicating a "calls"
relatinship between the caller (current top of the stack) and the callee
(the new top of the stack). When we push function ids onto the stack, we
keep track of the timestamp (TSC) for the enter event.

When exiting functions, we are able to account the duration by getting
the difference between the timestamp of the exit event and the
corresponding entry event in the stack. This works even if we somehow
miss the exit events for intermediary functions (i.e. if the exit event
is not cleanly associated with the enter event at the top of the stack).

The output of the tool currently provides just the top N leaf functions
that contribute the most latency, and the top N stacks that have the
most frequency. In the future we can provide more sophisticated query
mechanisms and potentially an export to database feature to make offline
analysis of the stack traces possible with existing tools.

Differential revision: D34863

llvm-svn: 312733
2017-09-07 18:07:48 +00:00
Matt Arsenault d7e2303df2 AMDGPU: Start selecting v_mad_mix_f32
llvm-svn: 312732
2017-09-07 18:05:07 +00:00
Matt Arsenault 61ec738b60 DAG: Allow creating extract_vector_elt post-legalize
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.

This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.

llvm-svn: 312730
2017-09-07 17:24:43 +00:00
Konstantin Zhuravlyov 5f5b586c99 AMDGPU: Handle non-temporal loads and stores
Differential Revision: https://reviews.llvm.org/D36862

llvm-svn: 312729
2017-09-07 17:14:54 +00:00
Anastasia Stulova 257132a019 [OpenCL] Handle taking an address of block captures.
Block captures can have different physical locations
in memory segments depending on the use case (as a function
call or as a kernel enqueue) and in different vendor
implementations.

Therefore it's unclear how to add address space to capture
addresses uniformly. Currently it has been decided to disallow
taking addresses of captured variables until further
clarifications in the spec.

Differential Revision: https://reviews.llvm.org/D36410

llvm-svn: 312728
2017-09-07 17:00:33 +00:00
Peter Smith 20489ec563 [ELF] Always write non-immediate bits for AArch64 branch instruction.
To support errata patching on AArch64 we need to be able to overwrite
an arbitrary instruction with a branch. For AArch64 it is sufficient to
always write all the bits of the branch instruction and not just the
immediate field. This is safe as the non-immediate bits of the branch
instruction are always the same.

Differential Revision: https://reviews.llvm.org/D36745

llvm-svn: 312727
2017-09-07 16:29:52 +00:00
Ted Woodward 9927431d81 Fix lldb-mi test data_read_memory_bytes_global
Summary:
Test was skipped because -data-evaluate-expression was thought
to not work on globals. This is not the case - the issue was clang
removes debug info for globals in cpp files that are not used.

Add a reference to the globals in question, and fix memory patter in
test to match memory pattern in testcase.

Reviewers: ki.stfu, abidh

Reviewed By: ki.stfu

Subscribers: aprantl, lldb-commits

Differential Revision: https://reviews.llvm.org/D37533

llvm-svn: 312726
2017-09-07 16:24:39 +00:00
Konstantin Zhuravlyov c8c9d4a0a6 AMDGPU: Handle more than one memory operand in SIMemoryLegalizer
Differential Revision: https://reviews.llvm.org/D37397

llvm-svn: 312725
2017-09-07 16:14:21 +00:00
Benjamin Kramer 6ef976d5e1 [ARM] Remove redundant vcvt patterns.
These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.

llvm-svn: 312724
2017-09-07 14:52:26 +00:00
Marek Kurdej ceeb8b91e7 [clang-format] Add support for C++17 structured bindings.
Summary:
Before:
```
    auto[a, b] = f();
```

After:
```
    auto [a, b] = f();
```
or, if SpacesInSquareBrackets is true:
```
    auto [ a, b ] = f();
```

Reviewers: djasper

Reviewed By: djasper

Subscribers: cfe-commits, klimek

Differential Revision: https://reviews.llvm.org/D37132

llvm-svn: 312723
2017-09-07 14:28:32 +00:00
Michael Zuckerman 5a385940d3 [X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8|16|32} stride 3).
This patch expands the support of lowerInterleavedload to {8|16|32}x8i stride 3.

LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=3 VF={8|16|32}) and we plan to include the store (deinterleved side).

The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7

into

a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7

Reviewers
1. zvi
2. igor
3. guyblank
4. dorit
5. Ayal

llvm-svn: 312722
2017-09-07 14:02:13 +00:00
Daniel Jasper 392c2ba675 [clang-format] Fix documentation for AllowAllParametersOfDeclarationOnNextLine
The current description of AllowAllParametersOfDeclarationOnNextLine in
the Clang-Format Style Options guide suggests that it is possible to
format function declaration, which fits in a single line (what is not
supported in current clang-format version). Also the example was not
reproducible and mades no sense.

Patch by Lucja Mazur, thank you!

llvm-svn: 312721
2017-09-07 13:45:41 +00:00
Simon Atanasyan 6d7958684b [mips] Use RegisterMCAsmBackend to register all MIPS asm backends. NFC
This change converts the `MipsAsmBackend` constructor to the "standard"
form. It makes possible to use `RegisterMCAsmBackend` for the backends
registrations. Now we pass `Triple` instance to the `MipsAsmBackend`
ctor and deduce all required options like endianness and bitness from
the triple. We still need to implement explicit ABI checking for
providing correct options to backends.

Differential revision: https://reviews.llvm.org/D37519

llvm-svn: 312720
2017-09-07 12:54:26 +00:00
Florian Hahn d39b8a3533 [MachineCombiner] Update instruction depths incrementally for large BBs.
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.

In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.

Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn

Reviewed By: fhahn

Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 312719
2017-09-07 12:49:39 +00:00
Michael Kruse 2f5cbc449a [CodeGen] Bitcast scalar writes to actual value.
The type of NewValue might change due to ScalarEvolution
looking though bitcasts. The synthesized NewValue therefore
becomes the type before the bitcast.

llvm-svn: 312718
2017-09-07 12:15:01 +00:00
Sylvestre Ledru 7372d48c74 Add an usage example of BreakBeforeBraces
Reviewers: djasper

Reviewed By: djasper

Subscribers: klimek, cfe-commits

Differential Revision: https://reviews.llvm.org/D37531

llvm-svn: 312717
2017-09-07 12:09:14 +00:00
Sylvestre Ledru 44d1ef140b Refresh the clang format options doc with the recent changes
Summary:
Looks like we are out of sync between the doc and the code.


Reviewers: djasper

Reviewed By: djasper

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D37558

llvm-svn: 312716
2017-09-07 12:08:49 +00:00
Siddharth Bhat e2950f46c6 [PPCGCodeGen] Document pre-composition with Zero in getExtent. [NFC]
It's weird at first glance that we do this, so I wrote up some
documentation on why we need to perform this process.

llvm-svn: 312715
2017-09-07 11:57:33 +00:00
Florian Hahn cf0cdd4c02 [MachineTraceMetrics] Add computeDepth function (NFCI).
Summary:
This function is used in D36619 to update the instruction depths
incrementally.

Reviewers: efriedma, Gerolf, MatzeB, fhahn

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36696

llvm-svn: 312714
2017-09-07 11:51:30 +00:00
Alex Bradbury c09d5611c4 [Sparc][NFC] Clean up SelectCC lowering
The ARM, BPF, MSP430, Sparc and Mips backends all use a similar code sequence 
for lowering SelectCC. As pointed out by @reames in D29937, this code isn't 
particularly clear and in most of these backends doesn't actually match the 
comments. This patch makes the code sequence clearer for the Sparc backend 
through better variable naming and more accurate comments (e.g. we are 
inserting triangle control flow, _not_ diamond). There is no functional 
change.

Differential Revision: https://reviews.llvm.org/D37194

llvm-svn: 312713
2017-09-07 11:30:55 +00:00
George Rimar 6823c5f0c0 [ELF] - Rename PhdrEntry::First/Last to FirstSec/LastSec. NFC.
As was suggested in D34956 thread.

llvm-svn: 312712
2017-09-07 11:01:10 +00:00
George Rimar 582ede8922 [ELF] - Store pointer to PT_LOAD instead of pointer to first section in OutputSection
It is a bit more convinent and helps to simplify logic 
of program headers allocation a little.

Differential revision: https://reviews.llvm.org/D34956

llvm-svn: 312711
2017-09-07 10:53:07 +00:00
Benjamin Kramer 1a48ddb864 Fixing incorrectly capitalised regexps.
Patch by Sam Allen!

llvm-svn: 312710
2017-09-07 09:54:03 +00:00
Benjamin Kramer b04d84c067 Fixing incorrectly capitalised regexps.
Patch by Sam Allen!

llvm-svn: 312709
2017-09-07 09:54:03 +00:00
Jonas Paulsson 0f056352a8 Revert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them"
This temporarily reverts commit 463fa38 (r311401).

See https://bugs.llvm.org/show_bug.cgi?id=34502

llvm-svn: 312708
2017-09-07 09:13:17 +00:00
Alexander Ivchenko f3a3cd198e [x86] Update to cmov promotion tests for D36711; NFC
Adding i8 -> [i16, i32, i64] and i32 -> i64 cases.
This way we can see what the current codegen looks like.

llvm-svn: 312707
2017-09-07 08:59:05 +00:00
Andrew Ng 6dee736c91 [LLD] Fix padding of .eh_frame when in executable segment
The default padding for an executable segment is the target trap
instruction which for x86_64 is 0xCC. However, the .eh_frame section
requires the padding to be zero. The code that writes the .eh_frame
section assumes that its segment is zero initialized and does not
explicitly write the zero padding. This does not work when the .eh_frame
section is in the executable segment (for example when using
-no-rosegment).

This patch changes the .eh_frame writing code to explicitly write the
zero padding.

Differential Revision: https://reviews.llvm.org/D37462

llvm-svn: 312706
2017-09-07 08:43:56 +00:00
James Henderson 7594c61d2d [ELF] Prevent crash with binary inputs with non-ascii file names
If using --format=binary with an input file name that has one or more non-ascii
characters in, LLD has undefined behaviour (it crashes on my Windows Debug build)
when calling isalnum with these non-ascii characters. Instead, of calling
std::isalnum, this patch uses an internal version that ignores the locale and
checks a specific subset of characters.

Reviewers: ruiu

Differential Revision: https://reviews.llvm.org/D37331

llvm-svn: 312705
2017-09-07 08:30:09 +00:00
Zvi Rackover 25799d93f0 X86: Improve AVX512 fptoui lowering
Summary:
Add patterns for
  fptoui <16 x float> to <16 x i8>
  fptoui <16 x float> to <16 x i16>

Reviewers: igorb, delena, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37505

llvm-svn: 312704
2017-09-07 07:40:34 +00:00
Richard Smith 1363e8f6ed P0702R1: in class template argument deduction from a list of one element, if
that element's type is (or is derived from) a specialization of the deduced
template, skip the std::initializer_list special case.

llvm-svn: 312703
2017-09-07 07:22:36 +00:00
Craig Topper 7bc65e220c [X86] Force shuffle lowering to only create X86ISD::VPERM2X128 with 64-bit element types so we can remove some patterns from isel.
Intrinsic handling is still creating these nodes with 32-bit elements as well. But at least this gets rid of 8 and 16.

Ideally, someday we'll convert the intrinsics to generic vector shuffles and remove the intrinsics.

llvm-svn: 312702
2017-09-07 06:11:10 +00:00
Simon Atanasyan cfad9d5f0f [mips] Replace Triple::Environment check by the isGNUEnvironment() call. NFC
llvm-svn: 312701
2017-09-07 06:05:06 +00:00
Richard Smith 48b35d9a14 Fix off-by-one error in block mangling.
This restores the ABI prior to r214699.

llvm-svn: 312700
2017-09-07 05:41:24 +00:00
Matt Arsenault 65ca292a8d AMDGPU: Don't legalize i16 extloads to i32 with legal i16
Keeping non-i16 extloads makes it easier to match some new
gfx9 load instructions.

llvm-svn: 312699
2017-09-07 05:37:34 +00:00
Peter Collingbourne 681fbb64a4 ModuleSummaryAnalysis: Correctly handle all function operand references.
The current code that handles personality functions when creating a
module summary does not correctly handle the case where a function's
personality function operand refers to the function indirectly
(e.g. via a bitcast). This patch handles such cases by treating
personality function references like any other reference, i.e. by
adding them to the function's reference list. This has the minor side
benefit of allowing personality functions to participate in early
dead stripping.

We do this by calling findRefEdges on the function itself. This way
we also end up handling other function operands (specifically prefix
data and prologue data) for free.

Differential Revision: https://reviews.llvm.org/D37553

llvm-svn: 312698
2017-09-07 05:35:35 +00:00
Kostya Serebryany 754e584076 [libFuzzer] simplify CustomCrossOverTest even more
llvm-svn: 312697
2017-09-07 05:33:05 +00:00
Richard Smith 80acd0fd0b [modules ts] Add test for [basic.link]p3.
llvm-svn: 312696
2017-09-07 05:29:39 +00:00
Kostya Serebryany 57c03648e1 [libFuzzer] simplify CustomCrossOverTest a bit more
llvm-svn: 312695
2017-09-07 05:23:23 +00:00
Craig Topper 9228aee711 [X86] Remove patterns for selecting a v8f32 X86ISD::MOVSS or v4f64 X86ISD::MOVSD.
I don't think we ever generate these. If we did, I would expect we would also be able to generate v16f32 and v8f64, but we don't have those patterns.

llvm-svn: 312694
2017-09-07 05:08:16 +00:00
Marshall Clow 064028bb05 Add even more string_view tests. These found some bugs in the default parameter value for rfind/find_last_of/find_last_not_of
llvm-svn: 312693
2017-09-07 04:19:32 +00:00
Saleem Abdulrasool 5fba8ba9cc ARM: track globals promoted to coalesced const pool entries
Globals that are promoted to an ARM constant pool may alias with another
existing constant pool entry. We need to keep a reference to all globals
that were promoted to each constant pool value so that we can emit a
distinct label for each promoted global. These labels are necessary so
that debug info can refer to the promoted global without an undefined
reference during linking.

Patch by Stephen Crane!

llvm-svn: 312692
2017-09-07 04:00:13 +00:00
Marshall Clow e2addb79b8 Another missing string_view test
llvm-svn: 312691
2017-09-07 03:03:48 +00:00
Marshall Clow b6d73126c8 Add more string_view tests
llvm-svn: 312690
2017-09-07 02:46:09 +00:00
Kostya Serebryany d0386fac26 [libFuzzer] simplify and re-enable CustomCrossOverTest
llvm-svn: 312689
2017-09-07 02:04:06 +00:00
Vedant Kumar b6d2fe5c88 [cmake] Work around more -Wunused-driver-argument warnings
add_compiler_rt_object_libraries should strip out the -msse3 option on
non-macOS Apple platforms.

llvm-svn: 312688
2017-09-07 01:36:47 +00:00