Summary:
LoopRotate manually updates the DoomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator.
When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr.
This patch teaches LoopRotate to handle this case and fixes [[ https://bugs.llvm.org/show_bug.cgi?id=33701 | PR33701 ]].
In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree.
Reviewers: dberlin, davide, uabelho, grosser
Subscribers: efriedma, mzolotukhin
Differential Revision: https://reviews.llvm.org/D35074
llvm-svn: 307828
ManagedStatic<sys::Mutex> would lazilly allocate a sys::Mutex to lock
when reporting an OOM, which is a bad idea.
The three STL implementations that I know of use pthread_mutex_lock and
EnterCriticalSection to implement std::mutex. I'm pretty sure that
neither of those allocate heap memory.
It seems that we unconditionally use std::mutex without testing
LLVM_ENABLE_THREADS elsewhere in the codebase, so this should be
portable.
llvm-svn: 307827
Some libFuzzer tests on Linux would fail with bizarre error messages
unless llvm-symbolizer binary is present.
Differential Revision: https://reviews.llvm.org/D35313
llvm-svn: 307826
The current code relies on the assumption that tests are included only
if LLVM_USE_SANITIZE_COVERAGE is enabled.
This commit makes it easier to relax the assumption in the future, as
the variable LIBFUZZER_FLAGS_BASE is used further in libFuzzer tests.
Differential Revision: https://reviews.llvm.org/D35314
llvm-svn: 307825
If taskloop directive has no associated nogroup clause, it must emitted
inside implicit taskgroup block. Runtime supports it, but we need to
generate implicit taskgroup block explicitly to support future
reductions codegen.
llvm-svn: 307822
Summary: socket() is better to include SOCK_CLOEXEC in its type argument to avoid the file descriptor leakage.
Reviewers: chh, Eugene.Zelenko, alexfh, hokein, aaron.ballman
Reviewed By: chh, alexfh
Subscribers: srhines, mgorny, JDevlieghere, xazax.hun, cfe-commits
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D34913
llvm-svn: 307818
A generic variant of IMPLICIT_DEF was added in r306875, but this
survives to selection and hits a `Cannot Select`. Add handling that
converts the note to a regular IMPLICIT_DEF.
llvm-svn: 307817
Summary:
Add a sequence number that identifies a ptx_kernel's parent Scop within a function to it's name to differentiate it from other kernels produced from the same function, yet different Scops.
Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well.
Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ | kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately.
Reviewers: grosser, bollu
Reviewed By: grosser
Subscribers: mehdi_amini, nemanjai, pollydev, kbarton
Tags: #polly
Differential Revision: https://reviews.llvm.org/D35176
llvm-svn: 307814
the diagnostic to its enum value
This will be used by a script that invokes clang in a debugger and forces it
to stop when it reports a particular diagnostic.
Differential Revision: https://reviews.llvm.org/D35306
llvm-svn: 307813
As promised in D35003.
Uses -codegenprepare instead of -instcombine since we hit the same
buggy path anyway, and CGP lets us keep this test really simple
(instcombine likes turning the alloca T, N into alloca [N x T], which
hides the bug this is testing for).
llvm-svn: 307811
Summary:
This follows the addition of `GetRandom` with D34412. We remove our
`/dev/urandom` code and use the new function. Additionally, change the PRNG for
a slightly faster version. One of the issues with the old code is that we have
64 full bits of randomness per "next", using only 8 of those for the Salt and
discarding the rest. So we add a cached u64 in the PRNG that can serve up to
8 u8 before having to call the "next" function again.
During some integration work, I also realized that some very early processes
(like `init`) do not benefit from `/dev/urandom` yet. So if there is no
`getrandom` syscall as well, we have to fallback to some sort of initialization
of the PRNG.
Now a few words on why XoRoShiRo and not something else. I have played a while
with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64
are usually faster but produce respectively 15 & 31 bits of entropy, meaning
that to get a full 64-bit, you would need to call them several times. The simple
XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test
suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and
produces full 64 bits.
%%%
root@tulip-chiphd:/data # ./randtest.arm
[+] starting xs32...
[?] xs32 duration: 22431833053ns
[+] starting lcg32...
[?] lcg32 duration: 14941402090ns
[+] starting pcg32...
[?] pcg32 duration: 44941973771ns
[+] starting xs128p...
[?] xs128p duration: 48889786981ns
[+] starting lcg64...
[?] lcg64 duration: 33831042391ns
[+] starting xos128p...
[?] xos128p duration: 44850878605ns
root@tulip-chiphd:/data # ./randtest.aarch64
[+] starting xs32...
[?] xs32 duration: 22425151678ns
[+] starting lcg32...
[?] lcg32 duration: 14954255257ns
[+] starting pcg32...
[?] pcg32 duration: 37346265726ns
[+] starting xs128p...
[?] xs128p duration: 22523807219ns
[+] starting lcg64...
[?] lcg64 duration: 26141304679ns
[+] starting xos128p...
[?] xos128p duration: 14937033215ns
%%%
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: aemerson, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35221
llvm-svn: 307798
FastIsel can't handle them, so we would end up crashing during
register class selection.
Fixes PR26522.
Differential Revision: https://reviews.llvm.org/D35272
llvm-svn: 307797
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.
Reviewers: eli.friedman, reames, mkazantsev, skatkov
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34884
llvm-svn: 307796
Summary:
This patch fixes bug https://bugs.llvm.org/show_bug.cgi?id=3313: a comment line
was aligned with the next #ifdef even in the presence of an empty line between
them.
Reviewers: djasper, klimek
Reviewed By: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D35296
llvm-svn: 307795
Patch removes restriction about moving location counter
backwards outside of output sections declarations.
That may be useful for some apps relying on such scripts,
known example is linux kernel.
Differential revision: https://reviews.llvm.org/D34977
llvm-svn: 307794
This fixes PR33712.
Imagine following script and code:
VER1 { global: foo; local: *; };
VER2 { global: foo; };
.global bar
bar:
.symver bar, foo@VER1
.global zed
zed:
.symver zed, foo@@VER2
We add foo@@VER2 as foo to symbol table, because have to resolve references to
foo for default symbols.
Later we are trying to assign symbol versions from script. For that we are searching for 'foo'
again. Here it is placed under VER1 and VER2 at the same time, we find it twice and trying to
set version again both times, hence LLD shows a warning.
Though sample code is correct: we have 2 different versions of foo.
Patch gives a symbol version extracted from name a priority over version set by script.
Differential revision: https://reviews.llvm.org/D35207
llvm-svn: 307792
Summary:
NetBSD shell sh(1) does not support ">& /dev/null" construct.
This is bashism. The portable and POSIX solution is to use:
"> /dev/null 2>&1".
This change fixes 22 Unexpected Failures on NetBSD/amd64
for the "check-llvm" target.
Sponsored by <The NetBSD Foundation>
Reviewers: joerg, dim, rnk
Reviewed By: joerg, rnk
Subscribers: rnk, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D35277
llvm-svn: 307789
When we have a diamond ifcvt the fallthough block will have a branch at the end
of it that disappears when predicated, so discount it from the predication cost.
Differential Revision: https://reviews.llvm.org/D34952
llvm-svn: 307788
1. Add SyncClock::ResetImpl which removes code
duplication between ctor and Reset.
2. Move SyncClock::Resize to SyncClock methods,
currently it's defined between ThreadClock methods.
llvm-svn: 307785
Store file descriptors from loop.m_read_fds (if FORCE_PSELECT is
defined) and signals from loop.m_signals that need to be processed in
MainLoop::RunImpl::ProcessEvents() into a separate vector and then
iterate over this container to invoke the callbacks.
This prevents a problem where when the code iterated directly over
m_read_fds/m_signals, a callback invoked from within the loop could
modify these variables and invalidate the loop iterator. This would then
result in an assertion failure in llvm::DenseMapIterator::operator++().
Differential Revision: https://reviews.llvm.org/D35298
llvm-svn: 307782
Don't create sync object if it does not exist yet. For example, an atomic
pointer is initialized to nullptr and then periodically acquire-loaded.
llvm-svn: 307778
The test should have been added in 289682
"tsan: allow Java VM iterate over allocated objects"
but I forgot to avn add.
Author: Alexander Smundak (asmundak)
Reviewed in https://reviews.llvm.org/D27720
llvm-svn: 307776