I needed a reader-writer lock for a downstream project and noticed that
llvm has one. Function.cpp is the only file in-tree that refers to it.
To anyone reading this: are you using RWMutex in out-of-tree code? Maybe
it's not worth keeping around any more...
Since we're not actually using RWMutex *here*, remove the #include (and
a few other stale headers while we're at it).
llvm-svn: 278178
It's always hard to remember when to include this file, and
when you do include it it's hard to remember what preprocessor
check it needs to be behind, and then you further have to remember
whether it's windows.h or win32.h which you need to include.
This patch changes the name to PosixApi.h, which is more appropriately
named, and makes it independent of any preprocessor setting.
There's still the issue of people not knowing when to include this,
because there's not a well-defined set of things it exposes other
than "whatever is missing on Windows", but at least this should
make it less painful to fix when problems arise.
This patch depends on LLVM revision r278170.
llvm-svn: 278177
In case there are compilers that support neither __FUNCSIG__ or
__PRETTY_FUNCTION__, we fall back to __func__ as a last resort,
which should be guaranteed by C++11 and C99.
llvm-svn: 278176
Summary:
This hopefully fixes PR28825. The problem now was that a value from the
original loop was used in a subloop, which became a sibling after separation.
While a subloop doesn't need an lcssa phi node, a sibling does, and that's
where we broke LCSSA. The most natural way to fix this now is to simply call
formLCSSA on the original loop: it'll do what we've been doing before plus
it'll cover situations described above.
I think we don't need to run formLCSSARecursively here, and we have an assert
to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy
to be corrected here though.
I also changed a run line in the test from '-lcssa -loop-unroll' to
'-lcssa -loop-simplify -indvars', because it exercises LCSSA
preservation to the same extent, but also makes less unrelated
transformation on the CFG, which makes it easier to verify.
Reviewers: chandlerc, sanjoy, silvas
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23288
llvm-svn: 278173
This patch adds -emscripten-cxx-exceptions-whitelist option to
WebAssemblyLowerEmscriptenExceptions pass. This options is the list of
function names in which Emscripten-style exception handling is enabled.
This is to support emscripten's EXCEPTION_CATCHING_WHITELIST which
exists because of the performance impact of emscripten's non-zero-cost
EH method.
Patch by Heejin Ahn
Differential Revision: https://reviews.llvm.org/D23292
llvm-svn: 278171
MSVC doesn't have this, it only has __FUNCSIG__. So this adds
a new macro called LLVM_PRETTY_FUNCTION which evaluates to the
right thing on any platform.
llvm-svn: 278170
When using libunwind and not building as standalone project, we
can directly depend on the unwind library target.
Differential Revision: https://reviews.llvm.org/D23289
llvm-svn: 278169
For now put them all in the entry block. This should be correct but may give
poor runtime performance. Hopefully MachineSinking combined with
isReMaterializable can solve those issues, but if not the interface is sound
enough to support alternatives.
llvm-svn: 278168
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.
Differential Revision: https://reviews.llvm.org/D22942
llvm-svn: 278161
The fix for PR28705 will be committed consecutively.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.
A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
S1 = S2 + C_a
S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.
In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.
Differential Revision: https://reviews.llvm.org/D21313
llvm-svn: 278160
Summary:
The corresponding LLVM change: D23217.
LazyVector::iterator breaks, because int isn't an iterator type.
Since iterator_adaptor_base shouldn't be blamed to break at the call to
iterator_traits<int>::xxx, I'd rather "fix" LazyVector::iterator.
The perfect solution is to model "relative pointer", but it's beyond the goal of this patch.
Reviewers: chandlerc, bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D23218
llvm-svn: 278156
Let the driver pass the option to frontend. Do not set precision metadata for division instructions when this option is set. Set function attribute "correctly-rounded-divide-sqrt-fp-math" based on this option.
Differential Revision: https://reviews.llvm.org/D22940
llvm-svn: 278155
In the coverage report, the line and count columns have been swapped to make it more readable.
A follow-up commit in compiler-rt is needed
Differential Revision: https://reviews.llvm.org/D23281
llvm-svn: 278152
Adjust target features for amdgcn target when -cl-denorms-are-zero is set.
Denormal support is controlled by feature strings fp32-denormals fp64-denormals in amdgcn target. If -cl-denorms-are-zero is not set and the command line does not set fp32/64-denormals feature string, +fp32-denormals +fp64-denormals will be on for GPU's supporting them.
A new virtual function virtual void TargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts, TargetOptions &TargetOpts) const is introduced to allow adjusting target option by codegen option.
Differential Revision: https://reviews.llvm.org/D22815
llvm-svn: 278151
I've put some work into the Google Benchmark library in order to make it easier
to benchmark libc++. These changes have already been upstreamed into
Google Benchmark and this patch applies the changes to the in-tree version.
The main improvement in the addition of a 'compare_bench.py' script which
makes it very easy to compare benchmarks. For example to compare the native
STL to libc++ you would run:
`$ compare_bench.py ./util_smartptr.native.out ./util_smartptr.libcxx.out`
And the output would look like:
RUNNING: ./util_smartptr.native.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 62 ns 62 ns 10937500
BM_SharedPtrIncDecRef 31 ns 31 ns 23972603
BM_WeakPtrIncDecRef 28 ns 28 ns 23648649
RUNNING: ./util_smartptr.libcxx.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 46 ns 46 ns 14957265
BM_SharedPtrIncDecRef 31 ns 31 ns 22435897
BM_WeakPtrIncDecRef 34 ns 34 ns 21084337
Comparing ./util_smartptr.native.out to ./util_smartptr.libcxx.out
Benchmark Time CPU
-----------------------------------------------------
BM_SharedPtrCreateDestroy -0.26 -0.26
BM_SharedPtrIncDecRef +0.00 +0.00
BM_WeakPtrIncDecRef +0.21 +0.21
llvm-svn: 278147
This is handy in case by the time clang-rename is invoked, an external
tool already genereated a list of oldname -> newname pairs to handle.
Reviewers: omtcyfz
Differential Revision: https://reviews.llvm.org/D23198
llvm-svn: 278145
A UD2 might make its way into the program via a call to @llvm.trap.
Obviously, calls are not terminators. However, we modeled the X86
instruction, UD2, as a terminator. Later on, this confuses the epilogue
insertion machinery which results in the epilogue getting inserted
before the UD2. For some platforms, like x64, the result is a
violation of the ABI.
Instead, model UD2/UD2B as a side effecting instruction which may
observe memory.
llvm-svn: 278144
As detailed on D22726, much of the shift combining code assume constant values will fit into a uint64_t value and calls ConstantSDNode::getZExtValue where it probably shouldn't (leading to asserts). Using APInt directly avoids this problem but we encounter other assertions if we attempt to compare/operate on 2 APInt of different bitwidths.
This patch adds a helper function to ensure that 2 APInt values are zero extended as required so that they can be safely used together. I've only added an initial example use for this to the '(SHIFT (SHIFT x, c1), c2) --> (SHIFT x, (ADD c1, c2))' combines. Further cases can easily be added as required.
Differential Revision: https://reviews.llvm.org/D23007
llvm-svn: 278141
Summary: Add test to detect the C++ include paths are passed to both CUDA host and device frontends.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D22946
llvm-svn: 278140
It's surprising that you have to pass /Z7 in addition to -gcodeview to
get debug info. The sanitizer runtime, for example, expects that if the
compiler supports the -gline-tables-only flag, then it will emit debug
info.
llvm-svn: 278139
Summary:
We teach alias analysis that invariant.start is readonly.
This helps with GVN and memcopy optimizations that currently treat.
invariant.start as a clobber.
We need to treat this as readonly, so that DSE does not incorrectly
remove stores prior to the invariant.start
Reviewers: sanjoy, reames, majnemer, dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23214
llvm-svn: 278138
no prof data for func warning is turned off by default
due to its high verbosity and minimal usefulness.
Differential Revision: http://reviews.llvm.org/D23295
llvm-svn: 278127
Ensure the right scalar allocations are used as the host location of data
transfers. For the device code, we clear the allocation cache before device
code generation to be able to generate new device-specific allocation and
we need to make sure to add back the old host allocations as soon as the
device code generation is finished.
llvm-svn: 278126
This increases the readability of the IR and also clarifies that the GPU
inititialization is executed _after_ the scalar initialization which needs
to before the code of the transformed scop is executed.
Besides increased readability, the IR should not change. Specifically, I
do not expect any changes in program semantics due to this patch.
llvm-svn: 278125