Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.
Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.
This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19191
llvm-svn: 267102
If loop control variable for simd-based directives is explicitly marked
as linear/lastprivate in clauses, codegen for such construct would
crash. Patch fixes this problem.
llvm-svn: 267101
Re-layer the functions in the new (i.e., newly correct) post-order
traversals in ValueEnumerator (r266947) and ValueMapper (r266949).
Instead of adding a node to the worklist in a helper function and
returning a flag to say what happened, return the node itself. This
makes the code way cleaner: the worklist is local to the main function,
there is no flag for an early loop exit (since we can cleanly bury the
loop), and it's perfectly clear when pointers into the worklist might be
invalidated.
I'm fixing both algorithms in the same commit to avoid repeating the
commit message; if you take the time to understand one the other should
be easy. The diff itself isn't entirely obvious since the traversals
have some noise (i.e., things to do), but here's the high-level change:
auto helper = [&WL](T *Op) { auto helper = [](T **&I, T **E) {
=> while (I != E) {
if (shouldVisit(Op)) { T *Op = *I++;
WL.push(Op, Op->begin()); if (shouldVisit(Op)) {
return true; return Op;
} }
return false; return nullptr;
}; };
=>
WL.push(S, S->begin()); WL.push(S, S->begin());
while (!empty()) { while (!empty()) {
auto *N = WL.top().N; auto *N = WL.top().N;
auto *&I = WL.top().I; auto *&I = WL.top().I;
bool DidChange = false;
while (I != N->end())
if (helper(*I++)) { => if (T *Op = helper(I, N->end()) {
DidChange = true; WL.push(Op, Op->begin());
break; continue;
} }
if (DidChange)
continue;
POT.push(WL.pop()); => POT.push(WL.pop());
} }
Thanks to Mehdi for helping me find a better way to layer this.
llvm-svn: 267099
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267098
This removes the interfaces added (and not yet complete) to support
lazy reading of summaries. This support is not expected to be needed
since we are moving to a model where the full index is only being
traversed in the thin link step, instead of the back ends.
(The second part of this that I plan to do next is remove the
GlobalValueInfo from the ModuleSummaryIndex - it was mostly needed to
support lazy parsing of summaries. The index can instead reference the
summary structures directly.)
llvm-svn: 267097
This test used to write a .s file until r266971 fixed that. But on most bots,
the .s file still exists. Add an rm statement to clean up the bots. In a few
days, this statement can go away again.
llvm-svn: 267095
Fix and enable working stack-use-after-scope tests.
Add more failing tests for the feature, for fix later.
PR27453.
Patch by Vitaly Buka.
llvm-svn: 267084
I will eventually make `evaluate` function a usual parse function
rather than a function that works on a separate token list.
This is the first step toward that.
llvm-svn: 267083
WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction
reserves 3 bits for the condition register (rn). As such, we must ensure that
we restrict the register to a low register. Use the tGPR class instead of GPR
to ensure that this is properly constrained. In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.
llvm-svn: 267080
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.
rdar://25254790
llvm-svn: 267075
When printing the properties required by a pass, only print the
properties that are set, and not those that are clear (only properties
that are set are verified, clear properties are "don't-care").
llvm-svn: 267070
You can instruct the linker to not discard sections even if they
are unused and --gc-sections option is given. The linker script
command for doing that is KEEP. The syntax is KEEP(foo) where foo
is a section name. KEEP commands are written in SECTIONS command,
so you can specify the order of sections *and* which sections
will be kept.
Each sub-command in SECTIONS command are translated into SectionRule
object. Previously, each SectionRule has `Keep` bit. However,
if you think about it, this hid information in too deep in elements
of a list. Semantically, KEEP commands aren't really related to
SECTIONS subcommands. We can keep the section list for KEEP in a
separate list. This patch does that.
llvm-svn: 267065
Since there is a copy in every translation unit that uses them, they can
be omitted from the symbol table if the address is not significant.
This still doesn't catch as many cases as the gold plugin. The
difference is that we check canBeOmittedFromSymbolTable in each file and
use lazy loading which limits what it can do. Gold checks it in the merged file.
I think the correct way of getting the same results as gold is just to
cache in the IR the result of canBeOmittedFromSymbolTable.
llvm-svn: 267063
Since r265060 LLVM infers correct __nvvm_reflect attributes, so
explicit declaration of __nvvm_reflect() is no longer needed.
Differential Revision: http://reviews.llvm.org/D19074
llvm-svn: 267062
Summary:
Adds the initial version of a runtime library for the new
EfficiencySanitizer ("esan") family of tools. The library includes:
+ Slowpath code via callouts from the compiler instrumentation for
each memory access.
+ Registration of atexit() to call finalization code.
+ Runtime option flags controlled by the environment variable
ESAN_OPTIONS. The common sanitizer flags are supported such as
verbosity and log_path.
+ An initial simple test.
Still TODO: common code for libc interceptors and shadow memory mapping,
and tool-specific code for shadow state updating.
Reviewers: eugenis, vitalybuka, aizatsky, filcab
Subscribers: filcab, vkalintiris, kubabrecka, llvm-commits, zhaoqin, kcc
Differential Revision: http://reviews.llvm.org/D19168
llvm-svn: 267060