Summary:
Previously, STL allocators were blacklisted in compiler_rt's
cfi_blacklist.txt because they mandated a cast from void* to T* before
object initialization completed. This change moves that logic into the
front end because C++ name mangling supports a substitution compression
mechanism for symbols that makes it difficult to blacklist the mangled
symbol for allocate() using a regular expression.
Motivated by crbug.com/751385.
Reviewers: pcc, kcc
Reviewed By: pcc
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36294
llvm-svn: 310097
HasDeclarationMatcher did not handle DeducedType, it always returned false for deduced types.
So with code like this:
struct X{};
auto x = X{};
This did no longer match:
varDecl(hasType(recordDecl(hasName("X"))))
Because HasDeclarationMatcher didn't resolve the DeducedType of x.
Differential Revision: https://reviews.llvm.org/D36308
llvm-svn: 310095
This adds support for the main 128-bit atomic operations,
using the SystemZ instructions LPQ, STPQ, and CDSG.
Generating these instructions is a bit more complex than usual
since the i128 type is not legal for the back-end. Therefore,
we have to hook the LowerOperationWrapper and ReplaceNodeResults
TargetLowering callbacks.
llvm-svn: 310094
We currently emit a serialization operation (bcr 14, 0) before every
atomic load and after every atomic store. This is overly conservative.
The SystemZ architecture actually does not require any serialization
for atomic loads, and a serialization after an atomic store only if
we need to enforce sequential consistency. This is what other compilers
for the platform implement as well.
llvm-svn: 310093
Summary:
The bug was uncovered after fix of PR23384 (part 3 of 3).
The patch restricts pointer multiplication in SCEV computaion for ICmpZero.
Reviewers: qcolombet
Differential Revision: http://reviews.llvm.org/D36170
From: Evgeny Stupachenko <evstupac@gmail.com>
<evgeny.v.stupachenko@intel.com>
llvm-svn: 310092
This fixes a bug in the ReadFromSymbolizer method of the
Addr2LineProcess class; if the input is too large, the returned buffer
will be null and will consequently fail the CHECK. The proposed fix is
to simply check if the buffer consists of only a null-terminator and
return if so (in effect skipping that frame). I tested by running one of
the unit tests both before and after my change.
Submitted on behalf of david-y-lam.
Reviewers: eugenis, alekseyshl, kcc
Reviewed By: alekseyshl
Differential Revision: https://reviews.llvm.org/D36207
llvm-svn: 310089
Summary:
This intrinsic lets us set inactive lanes to an identity value when
implementing wavefront reductions. In combination with Whole Wavefront
Mode, it lets inactive lanes be skipped over as required by GLSL/Vulkan.
Lowering the intrinsic needs to happen post-RA so that RA knows that the
destination isn't completely overwritten due to the EXEC shenanigans, so
we need another pseudo-instruction to represent the un-lowered
intrinsic.
Reviewers: tstellar, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34719
llvm-svn: 310088
Summary:
Whole Wavefront Wode (WWM) is similar to WQM, except that all of the
lanes are always enabled, regardless of control flow. This is required
for implementing wavefront reductions in non-uniform control flow, where
we need to use the inactive lanes to propagate intermediate results, so
they need to be enabled. We need to propagate WWM to uses (unless
they're explicitly marked as exact) so that they also propagate
intermediate results correctly. We do the analysis and exec mask munging
during the WQM pass, since there are interactions with WQM for things
that require both WQM and WWM. For simplicity, WWM is entirely
block-local -- blocks are never WWM on entry or exit of a block, and WWM
is not propagated to the block level. This means that computations
involving WWM cannot involve control flow, but we only ever plan to use
WWM for a few limited purposes (none of which involve control flow)
anyways.
Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There
isn't yet a way to turn WWM off -- that will be added in a future
change.
Finally, it turns out that turning on inactive lanes causes a number of
problems with register allocation. While the best long-term solution
seems like teaching LLVM's register allocator about predication, for now
we need to add some hacks to prevent ourselves from getting into trouble
due to constraints that aren't currently expressed in LLVM. For the gory
details, see the comments at the top of SIFixWWMLiveness.cpp.
Reviewers: arsenm, nhaehnle, tpr
Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D35524
llvm-svn: 310087
Summary:
Right now, the WQM pass conflates two different things when tracking the
Needs of an instruction:
1. Needs can be StateWQM, which is propagated to other instructions, and
means that this instruction (and everything it depends on) must be
calculated in WQM.
2. Needs can be StateExact, which is not propagated to other
instructions, and means that this instruction must not be calculated in
WQM and WQM-ness must not be propagated past this instruction.
This works now because there are only two different states, but in the
future we want to be able to express things like "calculate this in WQM,
but please disable WWM and don't propagate it" (to implement
@llvm.amdgcn.set.inactive). In order to do this, we need to split the
per-instruction Needs field in two: a new Needs field, which can only
contain StateWQM (and in the future, StateWWM) and is propagated to
sources, and a Disables field, which can also contain just StateWQM or
nothing for now.
We keep the per-block tracking the same for now, by translating
Needs/Disables to the old representation with only StateWQM or
StateExact. The other place that needs special handling is when we
emit the state transitions. We could just translate back to the old
representation there as well, which we almost do, but instead of 0 as a
placeholder value for "any state," we explicitly or together all the
states an instruction is allowed to be in. This lets us refactor the
code in preparation for WWM, where we'll need to be able to handle
things like "this instruction must be in Exact or WQM, but not WWM."
Reviewers: arsenm, nhaehnle, tpr
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D35523
llvm-svn: 310086
Summary:
Previously, we assumed that certain types of instructions needed WQM in
pixel shaders, particularly DS instructions and image sampling
instructions. This was ok because with OpenGL, the assumption was
correct. But we want to start using DPP instructions for derivatives as
well as other things, so the assumption that we can infer whether to use
WQM based on the instruction won't continue to hold. This intrinsic lets
frontends like Mesa indicate what things need WQM based on their
knowledge of the API, rather than second-guessing them in the backend.
We need to keep around the old method of enabling WQM, but eventually we
should remove it once Mesa catches up. For now, this will let us use DPP
instructions for computing derivatives correctly.
Reviewers: arsenm, tpr, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D35167
llvm-svn: 310085
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
Summary:
It was added to support clang warnings about includes with case
mismatches, but it ended up not being necessary.
Reviewers: twoh, rafael
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D36328
llvm-svn: 310078
This revision ports all libFuzzer tests apart from the unittest to LIT.
The advantages of doing so include:
- Tests being self-contained
- Much easier debugging of a single test
- No need for using a two-stage compilation
The unit-test is still compiled using CMake, but it does not need a
freshly built compiler.
NOTE: The previous two-stage bot configuration will NOT work, as in the
second stage build LLVM_USE_SANITIZER is set, which disables ASAN from
being built.
Thus bots will be reconfigured in the next few commits.
Differential Revision: https://reviews.llvm.org/D36295
llvm-svn: 310075
This is a continuation of https://reviews.llvm.org/D36219
This patch uses reverse mapping (encoding->name) in
ARMInstPrinter::printBankedRegOperand to get rid of
hard-coded values (as pointed out by @olista01).
Reviewed by: @fhahn, @rovka, @olista01
Differential Revision: https://reviews.llvm.org/D36260
llvm-svn: 310072
The frontend may have requested a higher alignment for any reason, and
downstream optimizations may already have taken advantage of it. We
should keep the same alignment when moving the allocation from the
parameter area to the local variable area.
Fixes PR34038
llvm-svn: 310071
Summary:
`case:` and `default:` would normally parse as labels for a `switch` block.
However in TypeScript, they can be used in field declarations, e.g.:
interface I {
case: string;
}
This change special cases parsing them in declaration lines to avoid wrapping
them.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D36148
llvm-svn: 310070
Summary: We originally set the hotness threshold as 99.9% to be consistent with gcc FDO. But because the inline heuristic is different between 2 compilers: llvm uses bottom-up algorithm while gcc uses priority based. The LLVM algorithm tends to inline too much early that prevents hot callsites from further inlined into its caller. Due to this restriction, we think it is reasonable to lower the hotness threshold to give priority to those that are really hot. Our experiments show that this change would improve performance on large applications. Note that the inline heuristic has great room for further tuning. Once the inline heuristics are refined, we could adjust this threshold to allow inlining for less hot callsites.
Reviewers: davidxl, tejohnson, eraman
Reviewed By: tejohnson
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D36317
llvm-svn: 310065
Summary:
The (not (sext)) case is really (xor (sext), -1) which should have been simplified to (sext (xor, 1)) before we got here. So we shouldn't need to handle it.
With that taken care of we only need to two cases so don't need the swap anymore. This makes us in sync with the equivalent code in visitOr so inline this to match.
Reviewers: spatel, eli.friedman, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36240
llvm-svn: 310063
Adds function attributes to index: ReadNone, ReadOnly, NoRecurse, NoAlias. This attributes will be used for future ThinLTO optimizations that will propagate function attributes across modules.
llvm-svn: 310061
Name: narrow_shift
Pre: C1 < 8
%zx = zext i8 %x to i32
%l = lshr i32 %zx, C1
=>
%narrowC = trunc i32 C1 to i8
%ns = lshr i8 %x, %narrowC
%l = zext i8 %ns to i32
http://rise4fun.com/Alive/jIV
This isn't directly applicable to PR34046 as written, but we
need to have more narrowing folds like this to be sure that
rotate patterns are recognized.
llvm-svn: 310060
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.
Committed on behalf of @jbhateja (Jatin Bhateja)
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 310058
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
llvm-svn: 310057
The method forwardSpeculatable forwards speculatively executable
instructions and is currently the only way to forward an
instruction.
In the future we intend to add more methods.
llvm-svn: 310056
Summary:
This fixes PR31777.
If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.
In the future we should also support vectors of integers. And maybe
float/double if we can.
Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30703
llvm-svn: 310055
Summary:
This commit allows matchSelectPattern to recognize clamp of float
arguments in the presence of FMF the same way as already done for
integers.
This case is a little different though. With integers, given the
min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX
"automatically". That is not the case for float, because for them only
full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care
about NaNs. On the other hand, some backends (e.g. X86) have only
FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM
nodes are illegal thus selection is not happening. So I decided to do
such kind of transformation in IR (InstCombiner) instead of
complicating the logic in the backend.
Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper
Reviewed By: efriedma
Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D33186
llvm-svn: 310054
Summary:
- add more tests
- pr27236.ll: rename %tmpN -> %N because otherwise a FileCheck
variable for newly appeared unnamed value would use the same name as
tmpN (as generated by update_test_checks.py)
- run update_test_checks.py
Reviewers: efriedma
Reviewed By: efriedma
Subscribers: n.bozhenov, llvm-commits
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D35002
llvm-svn: 310053
Summary:
Testing the new-pm passes becomes much easier once they behave more like the
old passes in terms of the order in which Scops are processed and printed. This
requires three changes:
- ScopInfo: Use an ordered map to store scops
- ScopInfo: Iterate and print Scops in reverse order to match legacy PM behaviour
- ScopDetection: print function name in ScopAnalysisPrinter
Reviewers: grosser, Meinersbur, bollu
Reviewed By: grosser
Subscribers: pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D36303
llvm-svn: 310052
Summary:
The check doesn't fully support smart-ptr usages inside macros, which
may cause incorrect fixes, or even crashes, ignore them for now.
Reviewers: alexfh
Reviewed By: alexfh
Subscribers: JDevlieghere, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D36264
llvm-svn: 310050
Error was:
field of type 'llvm::ArgDescriptor' has private default constructor
const AMDGPUFunctionArgInfo AMDGPUArgumentUsageInfo::ExternFunctionInfo{};
^
llvm-svn: 310048
Summary:
M-class profiles do not support ARM execution mode, so providing
-marm/-mno-thumb does not make sense in combination with -mcpu/-march
options that support the M-profile.
This is a follow-up patch to D35569 and it seemed pretty clear that we
should emit an error in the driver in this case.
We probably also should warn/error if the provided -mcpu/-march options
do not match, e.g. -mcpu=cortex-m0 -march=armv8-a is invalid, as
cortex-m0 does not support armv8-a. But that should be a separate patch
I think.
Reviewers: echristo, richard.barton.arm, rengolin, labrinea, charles.baylis
Reviewed By: rengolin
Subscribers: aemerson, javed.absar, kristof.beyls, cfe-commits
Differential Revision: https://reviews.llvm.org/D35826
llvm-svn: 310047
D35945 introduces change when there is useless to check Error flag
in few places, but ErrorCount must be checked instead.
But then we probably can just check ErrorCount always. That should simplify
things. Patch do that.
Differential revision: https://reviews.llvm.org/D36266
llvm-svn: 310046
This is PR33889,
Patch adds support of combination of linkerscript and
-symbol-ordering-file option.
If no sorting commands are present in script inside section declaration
and no --sort-section option specified, code uses sorting from ordering
file if any exist.
Differential revision: https://reviews.llvm.org/D35843
llvm-svn: 310045