Currently, there is a big overhead in reporting of loop metadata through
ittnotify. The pair of functions: __kmp_str_loc_init/__kmp_str_loc_free are
replaced with strchr/atoi calls. Thus, a lot of time consuming actions are
skipped - many memory allocations/deallocations, heavy string duplication, etc.
The loop metadata only needs line and column info from the source string, so no
allocations and string splitting actually needed.
Patch by Andrey Churbanov
Differential Revision: http://reviews.llvm.org/D21309
llvm-svn: 272698
Cleanup - unused code removal.
TODO: consider to remove (replace with flag class methods)
also kmp_wait_64 and kmp_release_64 routines.
Patch by Andrey Churbanov
Differential Revision: http://reviews.llvm.org/D21332
llvm-svn: 272697
On OSX this test sometimes fails due to the
``LLVMFuzzer-FullCoverageSetTest-TracePC`` program going over the
default 2GiB memory limit. This shouldn't be happening and needs
investigating. For now just disable the test so we can set up an
OSX buildbot.
Differential Revision: http://reviews.llvm.org/D21319
llvm-svn: 272696
After r272599, -DLLVM_BUILD_INSTRUMENTED passes a default argument to
-fprofile-instr-generate. This confuses the perf-helper script because
the runtime emits a note stating that the default is overridden by the
LLVM_PROFILE_FILE environment variable.
Change the perf-helper script s.t it does not treat these notes as
failures.
This isn't a strictly NFC change, but I don't see a simple way to add a
test for it.
llvm-svn: 272695
For <N x i32> type mul, pmuludq will be used for targets without SSE41, which
often introduces many extra pack and unpack instructions in vectorized loop
body because pmuludq generates <N/2 x i64> type value. However when the operands
of <N x i32> mul are extended from smaller size values like i8 and i16, the type
of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which
generates better code. For targets with SSE41, pmulld is supported so no
shrinking is needed.
Differential Revision: http://reviews.llvm.org/D20931
llvm-svn: 272694
This reverts commit 879139e1c6577b09df52de56a6bab856a19ed185.
This was committed accidentally when I blindly typed git svn
dcommit instead of the command to generate a patch.
llvm-svn: 272693
This patch also includes some refactoring.
Prior to this patch, we tagged all CFLAA attributes as unknown. This is
suboptimal, since it meant that any Value used as an argument would be
considered to alias any other Value that existed.
Now that we have the machinery to tag sets below the set for an
arbitrary value with attributes, it's okay to be less conservative with
arguments. (Specifically, we still tag the set under an argument with
unknown).
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21262
llvm-svn: 272690
that makes allocation fail. "UL" is 32-bit and shift by 40 will make
the value overflow and become 0.
Patch by Wei Wang
Differential Revision: http://reviews.llvm.org/D21310
llvm-svn: 272689
This patch refactors CFLAA's graph building code. This makes keeping
track of common state (TargetLibraryInfo, ...) easier.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21261
llvm-svn: 272688
OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with
45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that
41 is deprecated and to use 45 instead.
llvm-svn: 272687
Summary:
This removes the last usage of the Platform plugin in NPL. It was being
used for determining the architecture of the debugged process. I replace
the call that went through the Platform plugin with a lower level call
on the ObjectFile directly.
Reviewers: tberghammer
Subscribers: uweigand, nitesh.jain, omjavaid, lldb-commits
Differential Revision: http://reviews.llvm.org/D21324
llvm-svn: 272686
This patch fixes various races between the time the private state thread is signaled to exit and the time it actually exits (during which it no longer responds to events). Previously, this was consistently causing 2-second timeout delays on process detach/stop for us.
This also prevents crashes that were caused by the thread controlling its own owning pointer while the controller was using it (copying the thread wrapper is not enough to mitigate this, since the internal thread object was getting reset anyway). Again, we were seeing this consistently.
Differential Revision: http://reviews.llvm.org/D21296
llvm-svn: 272682
Implement the same optimization committed under r271004 on non-EHABI,
non-SJLJ unwinder as well.
Change-Id: I7f80ed91a75d1e778b50ba87cf8fb68658a083c7
llvm-svn: 272680
Summary:
Adds a version of sigaction that uses a raw system call, to avoid circular
dependencies and support calling sigaction prior to setting up
interceptors. The new sigaction relies on an assembly sigreturn routine
for its restorer, which is Linux x86_64-only for now.
Uses the new sigaction to initialize the working set tool's shadow fault
handler prior to libc interceptor being set up. This is required to
support instrumentation invoked during interceptor setup, which happens
with an instrumented tcmalloc or other allocator compiled with esan.
Adds a test that emulates an instrumented allocator.
Reviewers: aizatsky
Subscribers: vitalybuka, tberghammer, zhaoqin, danalbert, kcc, srhines, eugenis, llvm-commits, kubabrecka
Differential Revision: http://reviews.llvm.org/D21083
llvm-svn: 272676
Summary:
We now use a standard fixup type applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.
This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272675
Change EmitGlobalVariable to check final assembler section is in BSS
before using .lcomm/.comm directive. This prevents globals from being
put into .bss erroneously when -data-sections is used.
This fixes PR26570.
Reviewers: echristo, rafael
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21146
llvm-svn: 272674
The feature allows for conditional assembly etc.
TODO: make those symbols read-only.
Test added.
Differential Revision: http://reviews.llvm.org/D21238
llvm-svn: 272673
We move the loop rotate functions in a separate class to avoid passing multiple
parameters to each function. This cleanup will help with further development of
loop rotation. NFC.
Patch written by Aditya Kumar and Sebastian Pop.
Differential Revision: http://reviews.llvm.org/D21311
llvm-svn: 272672
Summary:
This new alias takes a comma separated list of prefixes which allows
'--check-prefix=A --check-prefix=B --check-prefix=C' to be written as
'--check-prefixes=A,B,C'.
Reviewers: probinson
Subscribers: probinson, llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D21293
llvm-svn: 272670
Instead of always using addu to adjust the stack pointer when the
size out is of the range of an addiu instruction, use subu so that
a smaller constant can be generated.
This can give savings of ~3 instructions whenever a function has a
a stack frame whose size is out of range of an addiu instruction.
This change may break some naive stack unwinders.
Partially resolves PR/26291.
Thanks to David Chisnall for reporting the issue.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D21321
llvm-svn: 272666
We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256.
Found by Oliver Stannard and csmith!
llvm-svn: 272665
Summary:
The SimplifyLibCalls part of InstCombine generates calls to those otherwise.
I wonder if at some point we shouldn't just call disableAllFunctions() and
then enable functions on a whitelist basis...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96495
Reviewers: arsenm, tstellarAMD
Subscribers: llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21282
llvm-svn: 272664
The two patches together enable clang to support targets like
"x86_64-pc-linux-musl" and build binaries against musl-libc instead of
glibc. This make it easy for clang to work on some musl-based systems
like Alpine Linux and certain flavors of Gentoo.
Patch by Lei Zhang.
llvm-svn: 272662
Summary:
This is similar to other loop pragmas like 'vectorize'. Currently it
only has state values: distribute(enable) and distribute(disable). When
one of these is specified the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if
Loop Distribution in not enabled globally. Analogously, with 'disable'
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant which makes its handling slightly
different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'. So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas. We just need to check for duplicates of
the same pragma.
Reviewers: rsmith, dexonsmith, aaron.ballman
Subscribers: bob.wilson, cfe-commits, hfinkel
Differential Revision: http://reviews.llvm.org/D19403
llvm-svn: 272656
PR27458 highlights that the MIPS backend does not have well formed
MIR for atomic operations (among other errors).
This patch adds expands and corrects the LL/SC descriptions and uses
for MIPS(64).
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D19719
llvm-svn: 272655
This patch implements the N32 case where -mno-shared is in effect. The case
where -mshared is in effect will be added later since doing that now requires
additional changes to how we handle %hi(%neg(%gp_rel(foo))) expressions to
emit the three relocations as three relocations (currently only one of the
three would be emitted) which then requires further changes to our MCFixup
handling.
While we could fix both cases together, fixing the -mno-shared case allows us
to fix the ELFCLASS bug (where N32 incorrectly uses ELFCLASS64 instead of
ELFCLASS32) in a way that allows cpsetup.s to check for a correct output instead
of another incorrect output.
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D21131
llvm-svn: 272652
Ensuring that the PHI are all single-operand is not performed in the
second pass added by the previous pass. This removes the assert from
the first pass.
llvm-svn: 272650