Currently LLVM supports vectorization of horizontal reduction
instructions with initial value set to 0. Patch supports vectorization
of reduction with non-zero initial values. Also it supports a
vectorization of instructions with some extra arguments, like:
float f(float x[], int a, int b) {
float p = a % b;
p += x[0] + 3;
for (int i = 1; i < 32; i++)
p += x[i];
return p;
}
Patch allows vectorization of this kind of horizontal reductions.
Differential Revision: https://reviews.llvm.org/D28961
llvm-svn: 293994
This reverts commit r293970.
After more discussion, this belongs to the linker side and
there is no added value to do it at this level.
llvm-svn: 293993
Exit loop analysis early if suitable private access found.
Do not account for GEPs which are invariant to loop induction variable.
Do not account for Allocas which are too big to fit into register file anyway.
Add option for tuning: -amdgpu-unroll-threshold-private.
Differential Revision: https://reviews.llvm.org/D29473
llvm-svn: 293991
If LLVM was configured with an x86_64-apple-macosx host triple, this
test would fail, as the API works but the triple isn't in the whitelist.
llvm-svn: 293990
On Windows, the symbols "___stop___sancov_guards" and "___start___sancov_guards"
are not defined automatically. So, we need to take a different approach.
We define 3 sections:
Section ".SCOV$A" will only hold a variable ___start___sancov_guard.
Section ".SCOV$M" will hold the main data.
Section ".SCOV$Z" will only hold a variable ___stop___sancov_guards.
When linking, they will be merged sorted by the characters after the $, so we
can use the pointers of the variables ___[start|stop]___sancov_guard to know the
actual range of addresses of that section.
In this diff, I updated instrumentation to include all the guard arrays in
section ".SCOV$M".
Differential Revision: https://reviews.llvm.org/D28434
llvm-svn: 293987
While looking to add support for placing singular types (types that will
only be emitted in one place (such as attached to a strong vtable or
explicit template instantiation definition)) not in type units (since
type units have overhead) I stumbled across that change causing an
increase in pubtypes.
Turns out we were missing some types from type units if they were only
referenced from other type units and not from the debug_info section.
This fixes that, following GCC's line of describing the offset of such
entities as the CU die (since there's no compile unit-relative offset
that would describe such an entity - they aren't in the CU). Also like
GCC, this change prefers to describe the type stub within the CU rather
than the "just use the CU offset" fallback where possible. This may give
the DWARF consumer some opportunity to find the extra info in the type
stub - though I'm not sure GDB does anything with this currently.
The size of the pubnames/pubtypes sections now match exactly with or
without type units enabled.
This nearly triples (+189%) the pubtypes section for a clang self-host
and grows pubnames by 0.07% (without compression). For a total of 8%
increase in debug info sections of the objects of a Split DWARF build
when using type units.
llvm-svn: 293971
When a symbol is not exported outside of the
DSO, it is can be hidden. Usually we try to internalize
as much as possible, but it is not always possible, for
instance a symbol can be referenced outside of the LTO
unit, or there can be cross-module reference in ThinLTO.
This is a recommit of r293912 after fixing build failures,
and a recommit of r293918 after fixing LLD tests.
Differential Revision: https://reviews.llvm.org/D28978
llvm-svn: 293970
Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.
Reviewers: pcc, ruiu
Reviewed By: pcc
Subscribers: mgorny, mehdi_amini
Differential Revision: https://reviews.llvm.org/D29059
llvm-svn: 293967
Summary: llvm/CodeGen/CommandFlags.h a utility function InitTargetOptionsFromCodeGenFlags which is used to set target options from flags based on the command line. The command line flags are stored in globals defined in the same file, and including the file in multiple places causes the globals to be defined multiple times, leading to linker errors. This change adds a single place in lld where these globals are defined and exports only the utility function. This makes it possible to call InitTargetOptionsFromCodeGenFlags from multiple places in lld, which a follow-up change will do.
Reviewers: davide, ruiu
Reviewed By: davide, ruiu
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D29058
llvm-svn: 293965
Summary: This allows clients of the LTO API to determine the name of the fallback symbol for COFF weak externals.
Reviewers: pcc
Reviewed By: pcc
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D29365
llvm-svn: 293960
On Windows, the symbols "___stop___sancov_guards" and "___start___sancov_guards"
are not defined automatically. So, we need to take a different approach.
We define 3 sections: ".SCOV$A", ".SCOV$M" and ".SCOV$Z".
Section ".SCOV$A" will only hold a variable ___start___sancov_guard.
Section ".SCOV$M" will hold the main data.
Section ".SCOV$Z" will only hold a variable ___stop___sancov_guards.
When linking, they will be merged sorted by the characters after the $, so we
can use the pointers of the variables ___[start|stop]___sancov_guard to know the
actual range of addresses of that section.
___[start|stop]___sancov_guard should be defined only once per module. On
Windows, we have 2 different cases:
+ When considering a shared runtime:
All the modules, main executable and dlls, are linked to an auxiliary static
library dynamic_runtime_thunk.lib. Because of that, we include the delimiters
in `SancovDynamicRuntimeThunk`.
+ When considering a static runtime:
The main executable in linked to the static runtime library.
All the dlls are linked to an auxiliary static library dll_thunk.
Because of that, we include the delimiter to both `SancovDllThunk` and
`SANITIZER_LIBCDEP_SOURCES` (which is included in the static runtime lib).
Differential Revision: https://reviews.llvm.org/D28435
llvm-svn: 293959
In Windows, when sanitizers are implemented as a shared library (DLL), users can
redefine and export a new definition for weak functions, in the main executable,
for example:
extern "C" __declspec(dllexport)
void __sanitizer_cov_trace_pc_guard(u32* guard) {
// Different implementation provided by the client.
}
However, other dlls, will continue using the default implementation imported
from the sanitizer dll. This is different in linux, where all the shared
libraries will consider the strong definition.
With the implementation in this diff, when the dll is initialized, it will check
if the main executable exports the definition for some weak function (for
example __sanitizer_cov_trace_pc_guard). If it finds that function, then it will
override the function in the dll with that pointer. So, all the dlls with
instrumentation that import __sanitizer_cov_trace_pc_guard__dll() from asan dll,
will be using the function provided by the main executable.
In other words, when the main executable exports a strong definition for a weak
function, we ensure all the dlls use that implementation instead of the default
weak implementation.
The behavior is similar to linux. Now, every user that want to override a weak
function, only has to define and export it. The same for Linux and Windows, and
it will work fine. So, there is no difference on the user's side.
All the sanitizers will include a file sanitizer_win_weak_interception.cc that
register sanitizer's weak functions to be intercepted in the binary section WEAK
When the sanitizer dll is initialized, it will execute weak_intercept_init()
which will consider all the CB registered in the section WEAK. So, for all the
weak functions registered, we will check if a strong definition is provided in
the main executable.
All the files sanitizer_win_weak_interception.cc are independent, so we do not
need to include a specific list of sanitizers.
Now, we include [asan|ubsan|sanitizer_coverage]_win_weak_interception.cc and
sanitizer_win_weak_interception.cc in asan dll, so when it is initialized, it
will consider all the weak functions from asan, ubsan and sanitizer coverage.
After this diff, sanitizer coverage is fixed for MD on Windows. In particular
libFuzzer can provide custom implementation for all sanitizer coverage's weak
functions, and they will be considered by asan dll.
Differential Revision: https://reviews.llvm.org/D29168
llvm-svn: 293958
In this diff I update the code for asan on Windows, so we can intercept
SetUnhandledExceptionFilter and catch some exceptions depending on the result of
IsHandledDeadlyException() (which depends on asan flags).
This way we have the same behavior on Windows and Posix systems.
On Posix, we intercept signal and sigaction, so user's code can only register
signal handlers for signals that are not handled by asan.
After this diff, the same happens on Windows, user's code can only register
exception handlers for exceptions that are not handled by asan.
Differential Revision: https://reviews.llvm.org/D29463
llvm-svn: 293957
In Windows, when the sanitizer is implemented as a shared library (DLL), we need
an auxiliary static library dynamic_runtime_thunk that will be linked to the
main executable and dlls.
In the sanitizer DLL, we are exposing weak functions with WIN_WEAK_EXPORT_DEF(),
which exports the default implementation with __dll suffix. For example: for
sanitizer coverage, the default implementation of __sanitizer_cov_trace_cmp is
exported as: __sanitizer_cov_trace_cmp__dll.
In the dynamic_runtime_thunk static library, we include weak aliases to the
imported implementation from the dll, using the macro WIN_WEAK_IMPORT_DEF().
By default, all users's programs that include calls to weak functions like
__sanitizer_cov_trace_cmp, will be redirected to the implementation in the dll,
when linking to dynamic_runtime_thunk.
After this diff, we are able to compile code with sanitizer coverage
instrumentation on Windows. When the instrumented object files are linked with
clang-rt_asan_dynamic_runtime_thunk-arch.lib all the weak symbols will be
resolved to the implementation imported from asan dll.
All the files sanitizer_dynamic_runtime_thunk.cc are independent, so we do not
need to include a specific list of sanitizers.
Now, we compile: [asan|ubsan|sanitizer_coverage]_win_dynamic_runtime_thunk.cc
and sanitizer_win_dynamic_runtime_thunk.cc to generate
asan_dynamic_runtime_thunk.lib, because we include asan, ubsan and sanitizer
coverage in the address sanitizer library.
Differential Revision: https://reviews.llvm.org/D29158
llvm-svn: 293953
In this diff, I update current implementation of the interception in dll_thunks
to consider the special case of weak functions.
First we check if the client has redefined the function in the main executable
(for example: __sanitizer_cov_trace_pc_guard). It we can't find it, then we look
for the default implementation (__sanitizer_cov_trace_pc_guard__dll). The
default implementation is always available because the static runtime is linked
to the main executable.
Differential Revision: https://reviews.llvm.org/D29155
llvm-svn: 293952
When the sanitizer is implemented as a static library and is included in the
main executable, we need an auxiliary static library dll_thunk that will be
linked to the dlls that have instrumentation, so they can refer to the runtime
in the main executable. Basically, it uses interception to get a pointer the
function in the main executable and override its function with that pointer.
Before this diff, all of the implementation for dll_thunks was included in asan.
In this diff I split it into different sanitizers, so we can use other
sanitizers regardless of whether we include asan or not.
All the sanitizers include a file sanitizer_win_dll_thunk.cc that register
functions to be intercepted in the binary section: DLLTH
When the dll including dll_thunk is initialized, it will execute
__dll_thunk_init() implemented in: sanitizer_common/sanitizer_win_dll_thunk.cc,
which will consider all the CB registered in the section DLLTH. So, all the
functions registered will be intercepted, and redirected to the implementation
in the main executable.
All the files "sanitizer_win_dll_thunk.cc" are independent, so we don't need to
include a specific list of sanitizers. Now, we compile: asan_win_dll_thunk.cc
ubsan_win_dll_thunk.cc, sanitizer_coverage_win_dll_thunk.cc and
sanitizer_win_dll_thunk.cc, to generate asan_dll_thunk, because we include asan,
ubsan and sanitizer coverage in the address sanitizer library.
Differential Revision: https://reviews.llvm.org/D29154
llvm-svn: 293951
Summary: Some compilers, including MSVC and Clang, allow linker options to be specified in source files. In the legacy LTO API, there is a getLinkerOpts() method that returns linker options for the bitcode module being processed. This change adds that method to the new API, so that the COFF linker can get the right linker options when using the new LTO API.
Reviewers: pcc, ruiu, mehdi_amini, tejohnson
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D29207
llvm-svn: 293950
On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI.
llvm-svn: 293944
This moves the following classes from Core -> Utility.
ConstString
Error
RegularExpression
Stream
StreamString
The goal here is to get lldbUtility into a state where it has
no dependendencies except on itself and LLVM, so it can be the
starting point at which to start untangling LLDB's dependencies.
These are all low level and very widely used classes, and
previously lldbUtility had dependencies up to lldbCore in order
to use these classes. So moving then down to lldbUtility makes
sense from both the short term and long term perspective in
solving this problem.
Differential Revision: https://reviews.llvm.org/D29427
llvm-svn: 293941
This test fails consistently on Ubuntu 16.xx powerpc64 LE systems.
The cause is being investigated and in the meantime disable it so
the buildbots can run cleanly.
llvm-svn: 293939
1. Added comments for options
2. Added missing option cl::desc field
3. Uniified function filter option for graph viewing.
Now PGO count/raw-counts share the same
filter option: -view-bfi-func-name=.
llvm-svn: 293938
On ELF every section can have a corresponding section symbol. When in
an assembly file we have
.quad .text
the '.text' refers to that symbol.
The way we used to handle them is to leave .text an undefined symbol
until the very end when the object writer would map them to the
actual section symbol.
The problem with that is that anything before the end would see an
undefined symbol. This could result in bad diagnostics
(test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results
when using the asm streamer (est/MC/Mips/expansion-jal-sym-pic.s).
Fixing this will also allow using the section symbol earlier for
setting sh_link of SHF_METADATA sections.
This patch includes a few hacks to avoid changing our behaviour when
handling conflicts between section symbols and other symbols. I
reported pr31850 to track that.
llvm-svn: 293936
This is the second in the series of patches to enable adding
of machine sched-models for ARM processors easier and compact.
This patch focuses on integer instructions and adds missing
sched definitions.
Reviewers: rovka, rengolin
Differential Revision: https://reviews.llvm.org/D29127
llvm-svn: 293935
In r283838, we added the capability of splitting unspillable register.
When doing so we had to make sure the split live-ranges were also
unspillable and we did that by marking the related live-ranges in the
delegate method that is called when a new vreg is created.
However, by accessing the live-range there, we also triggered their lazy
computation (LiveIntervalAnalysis::getInterval) which is not what we
want in general. Indeed, later code in LiveRangeEdit is going to build
the live-ranges this lazy computation may mess up that computation
resulting in assertion failures. Namely, the createEmptyIntervalFrom
method expect that the live-range is going to be empty, not computed.
Thanks to Mikael Holmén <mikael.holmen@ericsson.com> for noticing and
reporting the problem.
llvm-svn: 293934