Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.
Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll
Reviewers: resistor, hfinkel, eliben, meheff, jholewinski
Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8576
llvm-svn: 234567
The IPToState table must be emitted after we have generated labels for
all functions in the table. Don't rely on the order of the list of
globals. Instead, utilize WinEHFuncInfo to tell us how many catch
handlers we expect to outline. Once we know we've visited all the catch
handlers, emit the cppxdata.
llvm-svn: 234566
Given something like 'int({}, 1)', we would try to emit a diagnostic
regarding the excess element in the scalar initializer. However, we
assumed that the initializer list had an element in it.
llvm-svn: 234565
When we have an instruction for this (and, thus, don't generate a runtime
call), we need to custom type legalize this (in a trivial way, just as we do
for fp_to_sint).
Fixes PR23173.
llvm-svn: 234561
This covers most of rdar://20490076, but leaves one corner case still open - namely the case where we try to have arguments of the form foo\ bar (unquoted, but slashed) go through argdumper
llvm-svn: 234554
Because no one except Hexagon uses the header, we don't need to maintain
the header in the common directory. Also de-template the function for
readability.
llvm-svn: 234551
For the most common ones (such as fadd), we already did the promotion.
Do the same thing for all the others.
Currently, we'll just crash/assert on all these operations, as
there's no hardware or libcall support whatsoever.
f16 (half) is specified as an interchange - not arithmetic - format,
and is expected to be promoted to single-precision for arithmetic
operations.
While there, teach the legalizer about promoting some of the (mostly
floating-point) operations that we never needed before.
Differential Revision: http://reviews.llvm.org/D8648
See related discussion on the thread for: http://reviews.llvm.org/D8755
llvm-svn: 234550
This patch corresponds to review:
http://reviews.llvm.org/D8398
It adds some builtin functions to access the extended divide and bit permute instructions.
llvm-svn: 234547
This is the patch corresponding to review:
http://reviews.llvm.org/D8406
It adds some missing instructions from ISA 2.06 to the PPC back end.
llvm-svn: 234546
CreateELF.h was included only by ELFReader.h, and it was used only
by ELFReader class. By making the function a member of the class,
we can remove template parameters.
llvm-svn: 234540
formatted_raw_ostream is a wrapper over another stream to add column and line
number tracking.
It is used only for asm printing.
This patch moves the its creation down to where we know we are printing
assembly. This has the following advantages:
* Simpler lifetime management: std::unique_ptr
* We don't compute column and line number of object files :-)
llvm-svn: 234535
We don't actually need a real Mac sysroot to make the test pass, just a
linker. This makes the test pass in environments where no ld is on
PATH.
llvm-svn: 234533
WinEHPrepare was going to have to pattern match the control flow merge
and split that the old lowering used, and that wasn't really feasible.
Now we can teach WinEHPrepare to pattern match this, which is much
simpler:
%fp = call i8* @llvm.frameaddress(i32 0)
call void @func(iN [01], i8* %fp)
This prototype happens to match the prototype used by the Win64 SEH
personality function, so this is really simple.
llvm-svn: 234532
We already do:
concat_vectors(scalar, undef) -> scalar_to_vector(scalar)
When the scalar is legal.
When it's not, but is a truncated legal scalar, we can also do:
concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar)
Which is equivalent, since the upper lanes are undef anyway.
While there, teach the combine to look at more than 2 operands.
Differential Revision: http://reviews.llvm.org/D8883
llvm-svn: 234530
The integer extend optimization tries to fold the extend into the load
instruction. This requires us to identify if the extend has already been
emitted or not and act accordingly on it.
The check that was originally performed for this was not sufficient. Besides
checking the ValueMap for a mapped register we also need to check if the
virtual register has already an associated machine instruction that defines it.
This fixes rdar://problem/20470788.
llvm-svn: 234529
A [*] is only allowed in a declaration for a function, not in its
definition. We didn't correctly recurse on reference types while
looking for it, causing us to crash in CodeGen instead of rejecting it.
llvm-svn: 234528
Summary: This will get the windows bots going.
Test Plan: Build LLDB on Windows.
Reviewers: zturner
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D8934
llvm-svn: 234527
The previous implementation would copy the attribute from the class to
functions that have the class as their return type when the functions
are first declared. This proved to have two flaws:
1) if the class is forward-declared without the attribute and a
function or method with the class as a its return type is declared,
and afterward the class is defined with warn_unused_result, the
function or method would never inherit the attribute, and
2) the check simply failed for functions and methods that are part of
a template instantiation, regardless of whether the class with
warn_unused_result is part of a specific instantiation or part of
the template itself (presumably because those function/method
declaration does not hit the same code path as a non-template one
and so never inherits the attribute).
The new approach is to instead modify the two places where a function or
method call is checked for the warn_unused_result attribute on the decl
by extending the checks to also look for the attribute on the decl's
return type.
Additionally, the check for return types that have the warn_unused_result
now excludes pointers and references to such types, as such return types do
not necessarily imply a transfer of ownership for the underlying object
being referred to by the return value. This does not change the behavior
of functions that are directly given the warn_unused_result attribute.
llvm-svn: 234526