Commit Graph

813 Commits

Author SHA1 Message Date
Alexandre Ganea 09158252f7 [ThinLTO] Allow usage of all hardware threads in the system
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.

One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.

When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).

Differential Revision: https://reviews.llvm.org/D75153
2020-03-27 10:20:58 -04:00
Fangrui Song 08ff4dc9ad [LTO] onfig::addSaveTemps: clear ResolutionFile upon an error
Otherwise ld.lld -save-temps will crash when writing to ResolutionFile.

llvm-lto2 -save-temps does not crash because it exits immediately.

Reviewed By: evgeny777

Differential Revision: https://reviews.llvm.org/D75426
2020-03-02 17:49:04 -08:00
Francis Visoiu Mistrih 94cbe13073 [LTO][Legacy] Add explicit dependency on BinaryFormat
This fixes some windows bots.
2020-02-28 15:50:43 -08:00
Francis Visoiu Mistrih e551b737c3 [LTO][Legacy] Add new API to query Mach-O CPU (sub)type
Tools working with object files on Darwin (e.g. lipo) may need to know
properties like the CPU type and subtype of a bitcode file. The logic of
converting a triple to a Mach-O CPU_(SUB_)TYPE should be provided by
LLVM instead of relying on tools to re-implement it.

Differential Revision: https://reviews.llvm.org/D75067
2020-02-28 12:56:05 -08:00
Alexandre Ganea 6f846c8504 Improve comments after 8404aeb56a. 2020-02-18 14:25:21 -05:00
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Bill Wendling c55cf4afa9 Revert "Remove redundant "std::move"s in return statements"
The build failed with

  error: call to deleted constructor of 'llvm::Error'

errors.

This reverts commit 1c2241a793.
2020-02-10 07:07:40 -08:00
Bill Wendling 1c2241a793 Remove redundant "std::move"s in return statements 2020-02-10 06:39:44 -08:00
Johannes Doerfert b0c77c36d2 [Attributor] Add an Attributor CGSCC pass and run it
In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always design to be used on a
subset of functions which makes this patch mostly mechanical.

The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D70767
2020-02-08 21:27:34 -06:00
Johannes Doerfert 9548b74a83 [OpenMP] Introduce the OpenMPOpt transformation pass
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.

The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.

This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69930
2020-02-08 14:47:03 -06:00
Russell Gallop e7cb374433 [LLD][ELF] Add time-trace to ELF LLD
This adds some of LLD specific scopes and picks up optimisation scopes
via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in
77e6bb3c.

Differential Revision: https://reviews.llvm.org/D71060
2020-02-06 12:14:13 +00:00
Francis Visoiu Mistrih 7531a5039f [Remarks] Extend the RemarkStreamer to support other emitters
This extends the RemarkStreamer to allow for other emitters (e.g.
frontends, SIL, etc.) to emit remarks through a common interface.

See changes in llvm/docs/Remarks.rst for motivation and design choices.

Differential Revision: https://reviews.llvm.org/D73676
2020-02-04 17:16:02 -08:00
Jonas Devlieghere b2924d9956 [llvm] Replace SmallStr.str().str() with std::string conversion operator.
Use the std::string conversion operator introduced in
d7049213d0.
2020-01-29 21:16:46 -08:00
Gabor Horvath 31ae0165c3 [LTO] Add optimization remarks for removed functions
This only works with regular LTO for now.

Differential Revision: https://reviews.llvm.org/D73597
2020-01-29 15:53:51 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Nico Weber 81c67da0f2 remove an include that's unused after r347592 2020-01-16 12:49:54 -05:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Teresa Johnson d0aad9f56e [LTO] Constify lto::Config reference passed to backends (NFC)
The lto::Config object saved on the global LTO object should not be
updated by any of the LTO backends. Otherwise we could run into
interference between threads utilizing it. Motivated by some proposed
changes that would have caused it to get modified in the ThinLTO
backends.
2020-01-13 12:26:17 -08:00
Wei Mi 21a4710c67 [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP
down to pass builder in ltobackend.

Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.

Differential Revision: https://reviews.llvm.org/D72386
2020-01-09 21:13:11 -08:00
evgeny ad364956ed [ThinLTO] Show preserved symbols in DOT files
Differential revision: https://reviews.llvm.org/D71608
2019-12-18 18:33:15 +03:00
Rui Ueyama 69da7e29de Revert an accidental commit af5ca40b47 2019-12-13 15:17:40 +09:00
Rui Ueyama af5ca40b47 temporary 2019-12-13 14:35:03 +09:00
Teresa Johnson c8e0bb3b2c [LTO] Support for embedding bitcode section during LTO
Summary:
This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the `-lto-embed-bitcode` flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via `ld.lld -mllvm=-lto-embed-bitcode`.

This feature allows doing something comparable to `clang -c -fembed-bitcode`, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a `-bitcode_bundle` section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as [[ https://github.com/travitch/whole-program-llvm | wllvm ]] or [[ https://github.com/SRI-CSL/gllvm | gllvm ]]. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver.

Patch by Josef Eisl <josef.eisl@oracle.com>

Reviewers: #llvm, #clang, rsmith, pcc, alexshap, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, aheejin, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits, #llvm, #clang

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68213
2019-12-12 12:34:19 -08:00
Francis Visoiu Mistrih 7902d6cc80 [Remarks][ThinLTO] Use the correct file extension based on the format
Since we now have multiple formats, the ThinLTO remark files should also
respect that.
2019-12-02 13:04:43 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Francis Visoiu Mistrih bffdee8ef3 [LTO][Legacy] Add API for passing LLVM options separately
In order to correctly pass options to LLVM, including options containing
spaces which are used as delimiters for multiple options in
lto_codegen_debug_options, add a new API:
lto_codegen_debug_options_array.

Unfortunately, tools/lto has no testing infrastructure yet, so there are
no tests associated with this patch.

Differential Revision: https://reviews.llvm.org/D70463
2019-11-19 16:30:37 -08:00
evgeny ef5e3b85ee [ThinLTO] Simplify code. NFC 2019-11-19 15:51:25 +03:00
evgeny 3d708bf5c2 Recommit "[ThinLTO] Add correctness check for RO/WO variable import"
ValueInfo has user-defined 'operator bool' which allows incorrect implicit conversion
to GlobalValue::GUID (which is unsigned long). This causes bugs which are hard to
track and should be removed in future.
2019-11-15 16:13:19 +03:00
Reid Kleckner 4c1a1d3cf9 Add missing includes needed to prune LLVMContext.h include, NFC
These are a pre-requisite to removing #include "llvm/Support/Options.h"
from LLVMContext.h: https://reviews.llvm.org/D70280
2019-11-14 15:23:15 -08:00
Benjamin Kramer 360f661733 Revert "[ThinLTO] Add correctness check for RO/WO variable import"
This reverts commit a2292cc537. Breaks
clang selfhost w/ThinLTO.
2019-11-14 16:07:13 +01:00
evgeny a2292cc537 [ThinLTO] Add correctness check for RO/WO variable import
This patch adds an assertion check for exported read/write-only
variables to be also in import list for module. If they aren't
we may face linker errors, because read/write-only variables are
internalized in their source modules. The patch also changes
export lists to store ValueInfo instead of GUID for performance
considerations.

Differential revision: https://reviews.llvm.org/D70128
2019-11-14 12:24:05 +03:00
Reid Kleckner 1dfede3122 Move CodeGenFileType enum to Support/CodeGen.h
Avoids the need to include TargetMachine.h from various places just for
an enum. Various other enums live here, such as the optimization level,
TLS model, etc. Data suggests that this change probably doesn't matter,
but it seems nice to have anyway.
2019-11-13 16:39:34 -08:00
Michael Liao 950b800c45 Fix compilation warning on the trailing whitespace. NFC. 2019-10-24 09:55:06 -04:00
Benjamin Kramer bfa3f0c316 Hide implementation details in anonymous namespaces. NFC. 2019-10-24 10:48:43 +02:00
Eugene Leviant eb34c3e8a4 [ThinLTOCodeGenerator] Add support for index-based WPD
Differential revision: https://reviews.llvm.org/D68950

llvm-svn: 375219
2019-10-18 10:54:14 +00:00
Oliver Stannard 3b598b9c86 Reland: Dead Virtual Function Elimination
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.

Original commit message:

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 375094
2019-10-17 09:58:57 +00:00
Guillaume Chatelet 0e62011df8 [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68944

llvm-svn: 374880
2019-10-15 11:24:36 +00:00
Jorge Gorbe Moya b052331bd6 Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
Oliver Stannard 9f6a873268 Dead Virtual Function Elimination
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539
2019-10-11 11:59:55 +00:00
Jordan Rose aab67b571a ADT: Save a word in every StringSet entry
Add a specialization to StringMap (actually StringMapEntry) for a
value type of NoneType (the type of llvm::None), and use it for
StringSet. This'll save us a word from every entry in a StringSet,
used for alignment with the size_t that stores the string length.

I could have gone all the way to some kind of empty base class
optimization, but that seemed like overkill. Someone can consider
adding that in the future, though.

https://reviews.llvm.org/D68586

llvm-svn: 374440
2019-10-10 20:22:53 +00:00
Teresa Johnson 077cc3fcb0 [ThinLTO/WPD] Ensure devirtualized targets use promoted symbol when necessary
Summary:
This fixes a hole in the handling of devirtualized targets that were
local but need promoting due to devirtualization in another module. We
were not correctly referencing the promoted symbol in some cases. Make
sure the code that updates the name also looks at the ExportedGUIDs set
by utilizing a callback that checks all conditions (the callback
utilized by the internalization/promotion code).

Reviewers: pcc, davidxl, hiraditya

Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68159

llvm-svn: 373485
2019-10-02 16:36:59 +00:00
Steven Wu 34d80461ff [LTO][Legacy] Add new C inferface to query libcall functions
Summary:
This is needed to implemented the same approach as lld (implemented in r338434)
for how to handling symbols that can be generated by LTO code generator
but not present in the symbol table for linker that uses legacy C APIs.

libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.

rdar://problem/52853974

Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

Reviewed By: tejohnson

Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67568

llvm-svn: 372021
2019-09-16 18:49:54 +00:00
Jan Korous f69c91780f [Support] Add overload writeFileAtomically(std::function Writer)
Differential Revision: https://reviews.llvm.org/D67424

llvm-svn: 371890
2019-09-13 20:08:27 +00:00
Tim Northover f1c2892912 AArch64: support arm64_32, an ILP32 slice for watchOS.
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.

llvm-svn: 371722
2019-09-12 10:22:23 +00:00
Fangrui Song 6b9df910d0 [LTO] Avoid calling GlobalValue::getGUID (MD5) twice
llvm-svn: 371593
2019-09-11 07:38:21 +00:00
Benjamin Kramer dc5f805d31 Do a sweep of symbol internalization. NFC.
llvm-svn: 369803
2019-08-23 19:59:23 +00:00
Teresa Johnson ea314fd476 [ThinLTO] Fix handling of weak interposable symbols
Summary:
Keep aliasees alive if their alias is live, otherwise we end up with an
alias to a declaration, which is invalid. This can happen when the
aliasee is weak and non-prevailing.

This fix exposed the fact that we were then attempting to internalize
the weak symbol, which was not exported as it was not prevailing. We
should not internalize interposable symbols in general, unless this is
the prevailing copy, since it can lead to incorrect inlining and other
optimizations. Most of the changes in this patch are due to the
restructuring required to pass down the prevailing callback.

Finally, while implementing the test cases, I found that in the case of
a weak aliasee that is still marked not live because its alias isn't
live, after dropping the definition we incorrectly marked the
declaration with weak linkage when resolving prevailing symbols in the
module. This was due to some special case handling for symbols marked
WeakLinkage in the summary located before instead of after a subsequent
check for the symbol being a declaration. It turns out that we don't
actually need this special case handling any more (looking back at the
history, when that was added the code was structured quite differently)
- we will correctly mark with weak linkage further below when the
definition hasn't been dropped.

Fixes PR42542.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66264

llvm-svn: 369766
2019-08-23 15:18:58 +00:00
Taewook Oh 213d8a9f13 [NewPM][PassInstrumentation] IR printing support for (Thin)LTO
Summary: IR printing has not been correctly supported with (Thin)LTO if the new pass manager is enabled. Previously we only get outputs from backend(codegen) passes, as they are still under legacy pass manager even when the new pass manager is enabled. This patch addresses the issue and enables IR printing for optimization passes with new pass manager + (Thin)LTO setting.

Reviewers: fedor.sergeev, philip.pfaffe

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66253

llvm-svn: 369024
2019-08-15 17:47:44 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Fangrui Song d9b948b6eb Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC
F_{None,Text,Append} are kept for compatibility since r334221.

llvm-svn: 367800
2019-08-05 05:43:48 +00:00
Teresa Johnson d2df54e6a5 [ThinLTO] Implement index-based WPD
This patch adds support to the WholeProgramDevirt pass to perform
index-based WPD, which is invoked from ThinLTO during the thin link.

The ThinLTO backend (WPD import phase) behaves the same regardless of
whether the WPD decisions were made with the index-based or (the
existing) IR-based analysis.

Depends on D54815.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D55153

llvm-svn: 367679
2019-08-02 13:10:52 +00:00
Reid Kleckner f002fcb2ad Open native file handles to avoid converting from FDs, NFC
Follow up to r365588.

llvm-svn: 365820
2019-07-11 20:29:32 +00:00
Reid Kleckner cc418a3af4 [Support] Move llvm::MemoryBuffer to sys::fs::file_t
Summary:
On Windows, Posix integer file descriptors are a compatibility layer
over native file handles provided by the C runtime. There is a hard
limit on the maximum number of file descriptors that a process can open,
and the limit is 8192. LLD typically doesn't run into this limit because
it opens input files, maps them into memory, and then immediately closes
the file descriptor. This prevents it from running out of FDs.

For various reasons, I'd like to open handles to every input file and
keep them open during linking. That requires migrating MemoryBuffer over
to taking open native file handles instead of integer FDs.

Reviewers: aganea, Bigcheese

Reviewed By: aganea

Subscribers: smeenai, silvas, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, llvm-commits, zturner

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63453

llvm-svn: 365588
2019-07-10 00:34:13 +00:00
Eugene Leviant 3aef35288b [ThinLTO] Attempt to recommit r365188 after alignment fix
llvm-svn: 365215
2019-07-05 15:25:05 +00:00
Eugene Leviant e91f86f0ac Reverted r365188 due to alignment problems on i686-android
llvm-svn: 365206
2019-07-05 13:26:05 +00:00
Eugene Leviant 820cc01d1e [ThinLTO] Attempt to recommit r365040 after caching fix
It's possible that some function can load and store the same
variable using the same constant expression:

store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**)
%42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**)

The bitcast expression was mistakenly cached while processing loads,
and never examined later when processing store. This caused @bar to
be mistakenly treated as read-only variable. See load-store-caching.ll.

llvm-svn: 365188
2019-07-05 12:00:10 +00:00
Reid Kleckner f7e52fbdb5 Revert [ThinLTO] Optimize writeonly globals out
This reverts r365040 (git commit 5cacb91475)

Speculatively reverting, since this appears to have broken check-lld on
Linux. Partial analysis in https://crbug.com/981168.

llvm-svn: 365097
2019-07-04 00:03:30 +00:00
Eugene Leviant 5cacb91475 [ThinLTO] Optimize writeonly globals out
Differential revision: https://reviews.llvm.org/D63444

llvm-svn: 365040
2019-07-03 14:14:52 +00:00
Francis Visoiu Mistrih 34667519dc [Remarks] Extend -fsave-optimization-record to specify the format
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.

For now, only YAML is supported.

llvm-svn: 363573
2019-06-17 16:06:00 +00:00
Aaron Puchert e1dc495e63 [Clang] Harmonize Split DWARF options with llc
Summary:
With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.

To use Split DWARF with remote compilation, one needs to either

* make sure only relative paths are used, and mirror the build directory
  structure of the client on the server,
* inject the desired file name on the client directly.

Since llc already supports the latter solution, we're just copying that
over. We allow setting the actual output filename separately from the
value of the DW_AT_[GNU_]dwo_name attribute in the skeleton CU.

Fixes PR40276.

Reviewers: dblaikie, echristo, tejohnson

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D59673

llvm-svn: 363496
2019-06-15 15:38:51 +00:00
Francis Visoiu Mistrih 7a21113ce8 Reland: [Remarks] Refactor optimization remarks setup
* Add a common function to setup opt-remarks
* Rename common options to the same names
* Add error types to distinguish between file errors and regex errors

llvm-svn: 363415
2019-06-14 16:20:51 +00:00
Francis Visoiu Mistrih e4147ea1ef Revert "[Remarks] Refactor optimization remarks setup"
This reverts commit 6e6e3af55b.

This breaks greendragon.

llvm-svn: 363343
2019-06-14 00:05:56 +00:00
Francis Visoiu Mistrih 6e6e3af55b [Remarks] Refactor optimization remarks setup
* Add a common function to setup opt-remarks
* Rename common options to the same names
* Add error types to distinguish between file errors and regex errors

llvm-svn: 363328
2019-06-13 21:46:57 +00:00
Ben Dunbobbin 564d248ec2 [ThinLTO]LTO]Legacy] Fix dependent libraries support by adding querying of the IRSymtab
Dependent libraries support for the legacy api was committed in a
broken state (see: https://reviews.llvm.org/D60274). This was missed
due to the painful nature of having to integrate the changes into a
linker in order to test. This change implements support for dependent
libraries in the legacy LTO api:

- I have removed the current api function, which returns a single
string, and   added functions to access each dependent library
specifier individually.

- To reduce the testing pain, I have made the api functions as thin as
possible to   maximize coverage from llvm-lto.

- When doing ThinLTO the system linker will load the modules lazily
when scanning   the input files. Unfortunately, when modules are
lazily loaded there is no access   to module level named metadata. To
fix this I have added api functions that allow   querying the IRSymtab
for the dependent libraries. I hope to expand the api in the   future
so that, eventually, all the information needed by a client linker
during   scan can be retrieved from the IRSymtab.

Differential Revision: https://reviews.llvm.org/D62935

llvm-svn: 363140
2019-06-12 11:07:56 +00:00
Johannes Doerfert aade782a98 [Attributor] Pass infrastructure and fixpoint framework
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames

Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59918

llvm-svn: 362578
2019-06-05 03:02:24 +00:00
Sam Clegg 9d21f510ee Fix -DBUILD_SHARED_LIBS=ON build after rL362160
Differential Revision: https://reviews.llvm.org/D62709

llvm-svn: 362180
2019-05-31 01:04:00 +00:00
Francis Visoiu Mistrih 6ada11f134 [Remarks][NFC] Move the serialization to lib/Remarks
Separate the remark serialization to YAML from the LLVM Diagnostics.

This adds a new serialization abstraction: remarks::Serializer. It's
completely independent from lib/IR and it provides an easy way to
replace YAML by providing a new remarks::Serializer.

Differential Revision: https://reviews.llvm.org/D62632

llvm-svn: 362160
2019-05-30 21:45:59 +00:00
Ben Dunbobbin 1d16515fb4 [ELF] Implement Dependent Libraries Feature
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.

Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.

The design goals were to provide:

- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
  environments (MSVC in particular).

Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.

In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:

1. There are no competing defined symbols in a given set of libraries, or
   if they exist, the program owner doesn't care which is linked to their
   program.
2. There may be circular dependencies between libraries.

The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:

.section ".deplibs","MS",@llvm_dependent_libraries,1
         .asciz "foo"

For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.

LLD processes the dependent library specifiers in the following way:

1. Dependent libraries which are found from the specifiers in .deplibs sections
   of relocatable object files are added when the linker decides to include that
   file (which could itself be in a library) in the link. Dependent libraries
   behave as if they were appended to the command line after all other options. As
   a consequence the set of dependent libraries are searched last to resolve
   symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
   to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
   strings in a .deplibs section by; first, handling the string as if it was
   specified on the command line; second, by looking for the string in each of the
   library search paths in turn; third, by looking for a lib<string>.a or
   lib<string>.so (depending on the current mode of the linker) in each of the
   library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
   dependent libraries.

Rationale for the above points:

1. Adding the dependent libraries last makes the process simple to understand
   from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
   failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
   will affect all of the dependent libraries. There is a potential problem of
   surprise for developers, who might not realize that these options would apply
   to these "invisible" input files; however, despite the potential for surprise,
   this is easy for developers to reason about and gives developers the control
   that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
   find input files. The different search methods are tried by the linker in most
   obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
   ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
   is not necessary: if finer control is required developers can fall back to using
   the command line directly.

RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.

Differential Revision: https://reviews.llvm.org/D60274

llvm-svn: 360984
2019-05-17 03:44:15 +00:00
Eugene Leviant 053c6fc2b8 [ThinLTO] Don't internalize weak writeable variables
Variables with linkonce_odr and weak_odr linkage shouldn't be internalized
if they're not readonly. Otherwise we may end up with multiple copies of
such variable, so reads and writes will become inconsistent

Differential revision: https://reviews.llvm.org/D61255

llvm-svn: 360577
2019-05-13 11:53:05 +00:00
Teresa Johnson 37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Steven Wu 6c9f6fd11b [ThinLTO] Adding architecture name into saved object filename
Summary:
For ThinLTOCodegenerator, it has an option to save the object file
outputs into a directory which is essential for debug info. Tools like lldb
and dsymutil will look for these object files for debug info.

On Darwin platform, you can link fat binaries with one single clang
driver invocation like:
 $ clang -arch x86_64 -arch i386 -Wl,-object_path_lto,$TMPDIR ...
Unfornately, the output object files for one architecture is going to
overwrite the previous ones and one architecture slice will end up with
no debug info. One example for this is to turn on ThinLTO for sanitizer
dylibs in compiler-rt project.

To fix the issue, add the name for the architecture into the name of the
output object file.

rdar://problem/35482935

Reviewers: tejohnson, bd1976llvm, dexonsmith, JDevlieghere

Reviewed By: dexonsmith

Subscribers: mehdi_amini, aprantl, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60924

llvm-svn: 359508
2019-04-29 21:39:54 +00:00
Alina Sbirlea 0499a2f961 [NewPassManager] Adding pass tuning options: loop vectorize.
Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Reviewers: chandlerc

Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59723

llvm-svn: 358763
2019-04-19 16:11:59 +00:00
Florian Hahn b340497f76 [LTO] Add plumbing to save stats during LTO on Darwin.
Gold and ld on Linux already support saving stats, but the
infrastructure is missing on Darwin. Unfortunately it seems like the
configuration from lib/LTO/LTO.cpp is not used.

This patch adds a new LTOStatsFile option and adds plumbing in Clang to
use it on Darwin, similar to the way remarks are handled.

Currnetly the handling of LTO flags seems quite spread out, with a bunch
of duplication. But I am not sure if there is an easy way to improve
that?

Reviewers: anemet, tejohnson, thegameg, steven_wu

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D60516

llvm-svn: 358753
2019-04-19 12:36:41 +00:00
Steven Wu 05a358cdcd [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
Summary:
Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool.

ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.

Now ThinLTO using the legacy LTO API will requires data layout in
Module.

"internalize" thinlto action in llvm-lto is updated to run both
"promote" and "internalize" with the same configuration as
ThinLTOCodeGenerator. The old "promote" + "internalize" option does not
produce the same output as ThinLTOCodeGenerator.

This fixes: PR41236
rdar://problem/49293439

Reviewers: tejohnson, pcc, kromanova, dexonsmith

Reviewed By: tejohnson

Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60421

llvm-svn: 358601
2019-04-17 17:38:09 +00:00
Steven Wu f41e70d6eb Revert [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
This reverts r357931 (git commit 8b70a5c11e)

llvm-svn: 357932
2019-04-08 18:53:21 +00:00
Steven Wu 8b70a5c11e [ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
Summary:
ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.

This fixes: PR41236
rdar://problem/49293439

Reviewers: tejohnson, pcc, dexonsmith

Reviewed By: tejohnson

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60226

llvm-svn: 357931
2019-04-08 18:24:10 +00:00
Francis Visoiu Mistrih dd42236c6c Reland "[Remarks] Add -foptimization-record-passes to filter remark emission"
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.

This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`

will only emit the remarks coming from the pass `inline`.

This adds:

* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin

Differential Revision: https://reviews.llvm.org/D59268

Original llvm-svn: 355964

llvm-svn: 355984
2019-03-12 21:22:27 +00:00
Francis Visoiu Mistrih 1d6c47ad2b Revert "[Remarks] Add -foptimization-record-passes to filter remark emission"
This reverts commit 20fff32b7d.

llvm-svn: 355976
2019-03-12 20:54:18 +00:00
Francis Visoiu Mistrih 20fff32b7d [Remarks] Add -foptimization-record-passes to filter remark emission
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.

This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`

will only emit the remarks coming from the pass `inline`.

This adds:

* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin

Differential Revision: https://reviews.llvm.org/D59268

llvm-svn: 355964
2019-03-12 20:28:50 +00:00
Francis Visoiu Mistrih b8a847c0a3 Reland "[Remarks] Refactor remark diagnostic emission in a RemarkStreamer"
This allows us to store more info about where we're emitting the remarks
without cluttering LLVMContext. This is needed for future support for
the remark section.

Differential Revision: https://reviews.llvm.org/D58996

Original llvm-svn: 355507

llvm-svn: 355514
2019-03-06 15:20:13 +00:00
Francis Visoiu Mistrih 6b622ebea0 Revert "[Remarks] Refactor remark diagnostic emission in a RemarkStreamer"
This reverts commit 2e8c4997a2089f8228c843fd81b148d903472e02.

Breaks bots.

llvm-svn: 355511
2019-03-06 14:52:37 +00:00
Francis Visoiu Mistrih 9052f50cb4 [Remarks] Refactor remark diagnostic emission in a RemarkStreamer
This allows us to store more info about where we're emitting the remarks
without cluttering LLVMContext. This is needed for future support for
the remark section.

Differential Revision: https://reviews.llvm.org/D58996

llvm-svn: 355507
2019-03-06 14:32:08 +00:00
Rong Xu db29a3a438 [PGO] Context sensitive PGO (part 3)
Part 3 of CSPGO changes (mostly related to PassMananger).

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 355330
2019-03-04 20:21:27 +00:00
Teresa Johnson d0b1f30b32 [ThinLTO] Detect partially split modules during the thin link
Summary:
The changes to disable LTO unit splitting by default (r350949) and
detect inconsistently split LTO units (r350948) are causing some crashes
when the inconsistency is detected in multiple threads simultaneously.
Fix that by having the code always look for the inconsistently split
LTO units during the thin link, by checking for the presence of type
tests recorded in the summaries.

Modify test added in r350948 to remove single threading required to fix
a bot failure due to this issue (and some debugging options added in the
process of diagnosing it).

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57561

llvm-svn: 354062
2019-02-14 21:22:50 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Teresa Johnson 290a839891 [LTO] Record whether LTOUnit splitting is enabled in index
Summary:
Records in the module summary index whether the bitcode was compiled
with the option necessary to enable splitting the LTO unit
(e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit).

The information is passed down to the ModuleSummaryIndex builder via a
new module flag "EnableSplitLTOUnit", which is propagated onto a flag
on the summary index.

This is then used during the LTO link to check whether all linked
summaries were built with the same value of this flag. If not, an error
is issued when we detect a situation requiring whole program visibility
of the class hierarchy. This is the case when both of the following
conditions are met:
1) We are performing LowerTypeTests or Whole Program Devirtualization.
2) There are type tests or type checked loads in the code.

Note I have also changed the ThinLTOBitcodeWriter to also gate the
module splitting on the value of this flag.

Reviewers: pcc

Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D53890

llvm-svn: 350948
2019-01-11 18:31:57 +00:00
Easwaran Raman b45994b843 Refactor synthetic profile count computation. NFC.
Summary:
Instead of using two separate callbacks to return the entry count and the
relative block frequency, use a single callback to return callsite
count. This would allow better supporting hybrid mode in the future as
the count of callsite need not always be derived from entry count (as in
sample PGO).

Reviewers: davidxl

Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D56464

llvm-svn: 350755
2019-01-09 20:10:27 +00:00
Matthew Voss 62fcfc5adb [ThinLTO] Remove dllimport attribute from locally defined symbols
Summary:
The LTO/ThinLTO driver currently creates invalid bitcode by setting 
symbols marked dllimport as dso_local. The compiler often has access 
to the definition (often dllexport) and the declaration (often 
dllimport) of an object at link-time, leading to a conflicting 
declaration. This patch resolves the inconsistency by removing the
dllimport attribute.

Reviewers: tejohnson, pcc, rnk, echristo

Reviewed By: rnk

Subscribers: dmikulin, wristow, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D55627

llvm-svn: 349667
2018-12-19 19:07:45 +00:00
Easwaran Raman 5a7056fa03 [ThinLTO] Compute synthetic function entry count
Summary:
This patch computes the synthetic function entry count on the whole
program callgraph (based on module summary) and writes the entry counts
to the summary. After function importing, this count gets attached to
the IR as metadata. Since it adds a new field to the summary, this bumps
up the version.

Reviewers: tejohnson

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D43521

llvm-svn: 349076
2018-12-13 19:54:27 +00:00
Peter Collingbourne ff9aaa25e8 LTO: Don't internalize available_externally globals.
This breaks C and C++ semantics because it can cause the address
of the global inside the module to differ from the address outside
of the module.

Differential Revision: https://reviews.llvm.org/D55237

llvm-svn: 348321
2018-12-05 00:09:36 +00:00
George Burgess IV cf5ecb1adb [ThinLTO] Look through aliases when computing hash keys
Without this, we don't consider types used by aliasees in our cache key.
This caused issues when using the same cache for thin-linking the same
TU with different sets of virtual call candidates for a virtual call
inside of a constructor. That's sort of a mouthful. :)

Differential Revision: https://reviews.llvm.org/D55060

llvm-svn: 348216
2018-12-04 00:02:33 +00:00
Teresa Johnson 93f9996278 [ThinLTO] Import local variables from the same module as caller
Summary:
We can sometimes end up with multiple copies of a local variable that
have the same GUID in the index. This happens when there are local
variables with the same name that are in different source files having the
same name/path at compile time (but compiled into different bitcode objects).

In this case make sure we import the copy in the caller's module.
This enables importing both of the variables having the same GUID
(but which will have different promoted names since the module paths,
and therefore the module hashes, will be distinct).

Importing the wrong copy is particularly problematic for read only
variables, since we must import them as a local copy whenever
referenced. Otherwise we get undefs at link time.

Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed
for testing the distributed index case via clang, which will be sent as
a separate clang-side patch shortly. We were previously not doing the
dead code/read only computation before computing imports when testing
distributed index generation (like it was for testing importing and
other ThinLTO mechanisms alone).

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D55047

llvm-svn: 347886
2018-11-29 17:02:42 +00:00
Teresa Johnson 5f312ad450 [ThinLTO] Consolidate cache key computation between new/old LTO APIs
Summary:
The old legacy LTO API had a separate cache key computation, which was
a subset of the cache key computation in the new LTO API (from what I
can tell this is largely just because certain features such as CFI,
dsoLocal, etc are only utilized via the new LTO API). However, having
separate computations is unnecessary (much of the code is duplicated),
and can lead to bugs when adding new optimizations if both cache
computation algorithms aren't updated properly - it's much easier to
maintain if we have a single facility.

This patch refactors the old LTO API code to use the cache key
computation from the new LTO API. To do this, we set up an lto::Config
object and fill in the fields that the old LTO was hashing (the others
will just use the defaults).

There are two notable changes:
- I added a Freestanding flag to the LTO Config. Currently this is only
used by the legacy LTO API. In the patch that added it (D30791) I had
asked about adding it to the new LTO API, but it looks like that was not
addressed. This should probably be discussed as a follow up to this
change, as it is orthogonal.
- The legacy LTO API had some code that was hashing the GUID of all
preserved symbols defined in the module. I looked back at the history of
this (which was added with the original hashing in the legacy LTO API in
D18494), and there is a comment in the review thread that it was added
in preparation for future internalization. We now do the internalization
of course, and that is handled in the new LTO API cache key computation
by hashing the recorded linkage type of all defined globals. Therefore I
didn't try to move over and keep the preserved symbols handling.

Reviewers: steven_wu, pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D54635

llvm-svn: 347592
2018-11-26 20:40:37 +00:00
Eugene Leviant bf46e7410c [ThinLTO] Internalize readonly globals
An attempt to recommit r346584 after failure on OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
test case

llvm-svn: 347033
2018-11-16 07:08:00 +00:00
Steven Wu fa43892d6f Revert "[ThinLTO] Internalize readonly globals"
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.

llvm-svn: 346768
2018-11-13 17:35:04 +00:00
Jonas Devlieghere 45eb84f340 [Support] Make error banner optional in logAllUnhandledErrors
In a lot of places an empty string was passed as the ErrorBanner to
logAllUnhandledErrors. This patch makes that argument optional to
simplify the call sites.

llvm-svn: 346604
2018-11-11 01:46:03 +00:00
Eugene Leviant be8d19967a [ThinLTO] Internalize readonly globals
This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions

Differential revision: https://reviews.llvm.org/D49362

llvm-svn: 346584
2018-11-10 08:31:21 +00:00
Pirama Arumuga Nainar e61652a384 [LTO] Drop non-prevailing definitions only if linkage is not local or appending
Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in fumcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436
2018-11-08 20:10:07 +00:00
Xin Tong 7ca744488f [ThinLTO] Add an option to disable (thin)lto internalization.
Summary:
LTO and ThinLTO optimizes the IR differently.

One source of differences is the amount of internalizations that
can happen.

Add an option to enable/disable internalization so that other
differences can be studied in isolation. e.g. inlining.

There are other things lto and thinlto do differently, I will add
flags to enable/disable them as needed.

Reviewers: tejohnson, pcc, steven_wu

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D53294

llvm-svn: 346140
2018-11-05 15:49:46 +00:00
Fedor Sergeev bd6b2138b9 [NewPM] teach -passes= to emit meaningful error messages
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.

TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.

Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246

llvm-svn: 344685
2018-10-17 10:36:23 +00:00
Fedor Sergeev a01be0f217 Revert "[NewPM] teach -passes= to emit meaningful error messages"
This reverts r344519 due to failures in pipeline-parsing test.

llvm-svn: 344524
2018-10-15 15:36:08 +00:00
Fedor Sergeev 4155a77e98 [NewPM] teach -passes= to emit meaningful error messages
Summary:
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.

TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.

Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246

llvm-svn: 344519
2018-10-15 15:00:18 +00:00
Richard Smith 6c67662816 Add a flag to remap manglings when reading profile data information.
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.

The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.

Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.

Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.

This is the LLVM side of Clang r344199.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51249

llvm-svn: 344200
2018-10-10 23:13:47 +00:00
Warren Ristow febfc4e89b [LTO] Account for overriding lib calls via the alias attribute
Given a library call that is represented as an llvm intrinsic call, but
later transformed to an actual call, if an overriding definition of that
library routine is provided indirectly via an alias, prevent LTO from
eliminating the definition.

This is a fix for PR38547.

Differential Revision: https://reviews.llvm.org/D52836

llvm-svn: 344198
2018-10-10 22:54:31 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Fedor Sergeev a43fd9522d [PassTiming] cleaning up legacy PassTimingInfo interface. NFCI.
During D51276 discussion it was decided that legacy PassTimingInfo
interface can not be reused for new pass manager's implementation
of -time-passes.

This is a cleanup in preparation for D51276 to make legacy interface
as concise as possible, moving the PassTimingInfo from the header
into the anonymous legacy namespace in .cpp.

It is rather close to a revert of rL340872 in a sense that it hides
the interface and gets rid of templates. However as compared to
a complete revert it resides in a different translation unit and has
an additional pass-instance counting funcitonality (PassIDCountMap).

Reviewers: philip.pfaffe
Differential Revision: https://reviews.llvm.org/D52356

llvm-svn: 343104
2018-09-26 13:01:43 +00:00
Teresa Johnson 7fb39dfa7c [ThinLTO] Efficiency fix for writing type id records in per-module indexes
Summary:
In D49565/r337503, the type id record writing was fixed so that only
referenced type ids were emitted into each per-module index for ThinLTO
distributed builds. However, this still left an efficiency issue: each
per-module index checked all type ids for membership in the referenced
set, yielding O(M*N) performance (M indexes and N type ids).

Change the TypeIdMap in the summary to be indexed by GUID, to facilitate
correlating with type identifier GUIDs referenced in the function
summary TypeIdInfo structures. This allowed simplifying other
places where a map from type id GUID to type id map entry was previously
being used to aid this correlation.

Also fix AsmWriter code to handle the rare case of type id GUID
collision.

For a large internal application, this reduced the thin link time by
almost 15%.

Reviewers: pcc, vitalybuka

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51330

llvm-svn: 343021
2018-09-25 20:14:40 +00:00
Caroline Tice 3dea3f9e0a Pass code-model through Module IR to LTO which will use it.
Currently the code-model does not get saved in the module IR,
so if a code model is specified when compiling with LTO,
it gets lost and is not propagated properly to LTO. This patch,
along with one for the front end, fixes that.

Differential Revision: https://reviews.llvm.org/D52322

llvm-svn: 342760
2018-09-21 18:41:31 +00:00
Steven Wu ec53c89fde [ThinLTOCodeGenerator] Avoid Rehash StringMap in ThreadPool
Summary:
During threaded thinLTO, it is possible that the entry for current
module doesn't exist in StringMaps (like ExportLists, ResolvedODR,
etc.). Using operator[] might trigger a rehash for the StringMap, which
might happen on multiple threads at the same time.

rdar://problem/43846199

Reviewers: tejohnson, mehdi_amini, kromanova, pcc

Reviewed By: tejohnson

Subscribers: dang, inglorion, eraman, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52049

llvm-svn: 342263
2018-09-14 19:38:21 +00:00
Steven Wu cf90203b0b [ThinLTO] Fix memory corruption in ThinLTOCodeGenerator when CodeGenOnly was specified
Summary:
Issue occurs when doing ThinLTO with CodeGenOnly flag.
TMBuilder.TheTriple is assigned to by multiple threads in an unsafe way resulting in double-free of std::string memory.

Pseudocode:
if (CodeGenOnly) {
  // Perform only parallel codegen and return.
  ThreadPool Pool;
  int count = 0;
  for (auto &ModuleBuffer : Modules) {
    Pool.async([&](int count) {
    ...
      /// Now call OutputBuffer = codegen(*TheModule);
      /// Which turns into initTMBuilder(moduleTMBuilder, Triple(TheModule.getTargetTriple()));
      /// Which turns into

      TMBuilder.TheTriple = std::move(TheTriple);   // std::string = "....."
      /// So, basically std::string assignment to same string on multiple threads = memory corruption

  }

  return;
}

Patch by Alex Borcan

Reviewers: llvm-commits, steven_wu

Reviewed By: steven_wu

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51651

llvm-svn: 341422
2018-09-04 22:54:17 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Bob Haarman eae4742d81 [LTO] Don't internalize declarations
Summary:
Some links were failing with "Global is external, but doesn't have
external or weak linkage!" in ThinLTO builds with debug
information. This happened when we elide the body of a global that is
referenced by debug info. This results in a declaration, which we
would then internalize - but declarations cannot be internal. This
change avoids the problem by not internalizing these declarations.

Fixes PR38046.

Reviewers: pcc, tejohnson

Subscribers: mehdi_amini, aprantl, hiraditya, JDevlieghere, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49777

llvm-svn: 338100
2018-07-27 05:40:29 +00:00
Teresa Johnson b963c0b658 [LTO] Handle __imp_ (dllimport) symbols consistently with lld
Summary:
Similar to what lld already does for dllimport symbols which are
prefaced with __imp_ (see lld patch r240620), strip off the __imp_
prefix in LTO. Otherwise we can get 2 separate GlobalResolution for
a single symbol, the dllimport declaration, and the definition, which
leads to incorrect LTO handling.

Fixes PR38105.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49138

llvm-svn: 337762
2018-07-23 22:33:57 +00:00
Teresa Johnson 28023dbed7 [ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM
Summary:
Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass
manager. Add a couple of tests for both PMs based on the clang tests
tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test
through llvm-lto2 and not with distributed ThinLTO.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49429

llvm-svn: 337461
2018-07-19 14:51:32 +00:00
Teresa Johnson d68935c5ac Restore "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commit r337081, therefore restoring r337050 (and fix in
r337059), with test fix for bot failure described after the original
description below.

In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

The new commit removes a test that no longer makes sense
(Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the
reverse-iteration bot. The test depends on the order of processing the
summary call edges, and actually depended on the old problematic
behavior of selecting more than one summary for a given GUID when
encountered with different thresholds. There was no guarantee even
before that we would eventually pick the linkonce copy with the hottest
call edges, it just happened to work with the test and the old code, and
there was no guarantee that we would end up importing the selected
version of the copy that had the hottest call edges (since the backend
would effectively import only one of the selected copies).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

llvm-svn: 337184
2018-07-16 15:30:27 +00:00
Teresa Johnson b78c5d0602 Revert "[ThinLTO] Ensure we always select the same function copy to import"
This reverts commits r337050 and r337059. Caused failure in
reverse-iteration bot that needs more investigation.

llvm-svn: 337081
2018-07-14 01:45:49 +00:00
Teresa Johnson d94c0594d9 [ThinLTO] Ensure we always select the same function copy to import
In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

llvm-svn: 337050
2018-07-13 21:35:51 +00:00
Teresa Johnson c0320ef47b [ThinLTO] Use std::map to get determistic imports files
Summary:
I noticed that the .imports files emitted for distributed ThinLTO
backends do not have consistent ordering. This is because StringMap
iteration order is not guaranteed to be deterministic. Since we already
have a std::map with this information, used when emitting the individual
index files (ModuleToSummariesForIndex), use it for the imports files as
well.

This issue is likely causing some unnecessary rebuilds of the ThinLTO
backends in our distributed build system as the imports files are inputs
to those backends.

Reviewers: pcc, steven_wu, mehdi_amini

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48783

llvm-svn: 336721
2018-07-10 20:06:04 +00:00
Andrew Ng 089303d8ff [ThinLTO] Update ThinLTO cache file atimes when on Windows
ThinLTO cache file access times are used for expiration based pruning
and since Vista, file access times are not updated by Windows by
default:

https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance

This means on Windows, cache files are currently being pruned from
creation time. This change manually updates cache files that are
accessed by ThinLTO, when on Windows.

Patch by Owen Reynolds.

Differential Revision: https://reviews.llvm.org/D47266

llvm-svn: 336276
2018-07-04 14:17:10 +00:00
Peter Collingbourne 881ba10465 LTO: Keep file handles open for memory mapped files.
On Windows we've observed that if you open a file, write to it, map it into
memory and close the file handle, the contents of the memory mapping can
sometimes be incorrect. That was what we did when adding an entry to the
ThinLTO cache using the TempFile and MemoryBuffer classes, and it was causing
intermittent build failures on Chromium's ThinLTO bots on Windows. More
details are in the associated Chromium bug (crbug.com/786127).

We can prevent this from happening by keeping a handle to the file open while
the mapping is active. So this patch changes the mapped_file_region class to
duplicate the file handle when mapping the file and close it upon unmapping it.

One gotcha is that the file handle that we keep open must not have been
created with FILE_FLAG_DELETE_ON_CLOSE, as otherwise the operating system
will prevent other processes from opening the file. We can achieve this
by avoiding the use of FILE_FLAG_DELETE_ON_CLOSE altogether.  Instead,
we use SetFileInformationByHandle with FileDispositionInfo to manage the
delete-on-close bit. This lets us remove the hack that we used to use to
clear the delete-on-close bit on a file opened with FILE_FLAG_DELETE_ON_CLOSE.

A downside of using SetFileInformationByHandle/FileDispositionInfo as
opposed to FILE_FLAG_DELETE_ON_CLOSE is that it prevents us from using
CreateFile to open the file while the flag is set, even within the same
process. This doesn't seem to matter for almost every client of TempFile,
except for LockFileManager, which calls sys::fs::create_link to create a
hard link from the lock file, and in the process of doing so tries to open
the file. To prevent this change from breaking LockFileManager I changed it
to stop using TempFile by effectively reverting r318550.

Differential Revision: https://reviews.llvm.org/D48051

llvm-svn: 334630
2018-06-13 18:03:14 +00:00
Teresa Johnson 4ffc3e7834 [ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC)
With the upcoming patch to add summary parsing support, IsAnalysis would
be true in contexts where we are not performing module summary analysis.
Rename to the more specific and approprate HaveGVs, which is essentially
what this flag is indicating.

llvm-svn: 334140
2018-06-06 22:22:01 +00:00
Peter Collingbourne 3aa30e8062 IRGen: Write .dwo files when -split-dwarf-file is used together with -fthinlto-index.
Differential Revision: https://reviews.llvm.org/D47597

llvm-svn: 333677
2018-05-31 18:25:59 +00:00
Peter Collingbourne 274c4f7ab4 Fix a make_unique ambiguity.
llvm-svn: 332889
2018-05-21 20:56:28 +00:00
Peter Collingbourne c5a9765cea LTO: Replace split dwarf implementation that uses objcopy with one that uses direct emission.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47091

llvm-svn: 332884
2018-05-21 20:26:49 +00:00
Peter Collingbourne 9a45114b3c CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.
Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881
2018-05-21 20:16:41 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Teresa Johnson 81d9207317 [LTO] Handle Task=-1 passed to addSaveTemps
Summary:
This change is necessary for D46464, which will pass -1 as the Task
ID for distributed backends, so that the save temps files don't end
up with "4294967295" in their path. For distributed back ends, when -1
is passed, don't append any Task ID.

An existing test (tools/clang/test/CodeGen/thinlto_backend.ll) will
fail without this change after D46464.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D46488

llvm-svn: 331591
2018-05-05 14:37:20 +00:00
Teresa Johnson b77ab0966e [LTO] Allow pass remarks with hotness to be set when emitting to stderr
Summary:
Set setDiagnosticsHotnessRequested before the early exit check for a
diagnostic output file, so that pass remarks with hotness works when
emitting pass remarks to stderr (e.g. via -pass-remarks=.).

Also fix the llvm-lto2 diagnistic handler so that it only calls exit(1)
when the diagnistic is an error type. Otherwise the new test invocation
of llvm-lto2 with -pass-remarks causes it to fail. The new code is
consistent with the diagnostic handler elsewhere (e.g. on the
LLVMContext).

Reviewers: pcc, davide

Subscribers: fhahn, mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D46387

llvm-svn: 331569
2018-05-04 23:59:34 +00:00
Teresa Johnson 85cc298c1a [ThinLTO] Add support for optimization remarks to thinBackend
Summary:
Support was added to the regular LTO backend, but not thinBackend.
This patch adds that support.

Reviewers: pcc, davide

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D46376

llvm-svn: 331481
2018-05-03 20:24:12 +00:00
Nico Weber 432a38838d IWYU for llvm-config.h in llvm, additions.
See r331124 for how I made a list of files missing the include.
I then ran this Python script:

    for f in open('filelist.txt'):
        f = f.strip()
        fl = open(f).readlines()

        found = False
        for i in xrange(len(fl)):
            p = '#include "llvm/'
            if not fl[i].startswith(p):
                continue
            if fl[i][len(p):] > 'Config':
                fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
                found = True
                break
        if not found:
            print 'not found', f
        else:
            open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184
2018-04-30 14:59:11 +00:00
Florian Hahn d4332eb3b7 [LTO] Add stats-file option to LTO/Config.h.
This patch adds a StatsFile option to LTO/Config.h and updates both
LLVMGold and llvm-lto2 to set it.

Reviewers: MatzeB, tejohnson, espindola

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D45531

llvm-svn: 330411
2018-04-20 10:18:36 +00:00
Weiming Zhao 79f2d093dd Rename ObjectMemoryBuffer to SmallVectorMemoryBuffer; NFCI
Summary: As discussed in https://reviews.llvm.org/D45606, it makes more sense to name the class as SmallVectorMemoryBuffer

Reviewers: bkramer, dblaikie

Reviewed By: dblaikie

Subscribers: mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D45661

llvm-svn: 330107
2018-04-16 03:44:03 +00:00
Weiming Zhao 7c16977a8f NFC: Move ObjectMemoryBuffer to support
Summary:
Since the class is used by both MCJIT and LTO, it makes more sense to move it to Support lib.
This is a follow up patch to r329929 and https://reviews.llvm.org/D45244

Reviewers: bkramer, dblaikie

Reviewed By: bkramer

Subscribers: mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D45606

llvm-svn: 330093
2018-04-15 05:17:14 +00:00
Mandeep Singh Grang 5416af7af6 [LTO] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer D44363 for a list of all the required patches.

Reviewers: pcc, mehdi_amini, ruiu

Reviewed By: ruiu

Subscribers: ruiu, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D45137

llvm-svn: 330053
2018-04-13 19:12:20 +00:00
Yunlian Jiang bd200b9ff6 Enable debug fission for thinLTO linked via gold-plugin
Summary: This enables debug fission on implicit ThinLTO when linked with gold. It will put the .dwo files in a directory specified by user. 

Reviewers: tejohnson, pcc, dblaikie

Reviewed By: pcc

Subscribers: JDevlieghere, mehdi_amini, inglorion

Differential Revision: https://reviews.llvm.org/D44792

llvm-svn: 329988
2018-04-13 05:03:28 +00:00
Weiming Zhao 1bd40005ba Add missing vtable anchors
Summary: This patch adds anchor() for MemoryBuffer, raw_fd_ostream, RTDyldMemoryManager, SectionMemoryManager, etc.

Reviewers: jlebar, eli.friedman, dblaikie

Reviewed By: dblaikie

Subscribers: mehdi_amini, mgorny, dblaikie, weimingz, llvm-commits

Differential Revision: https://reviews.llvm.org/D45244

llvm-svn: 329861
2018-04-11 23:09:20 +00:00
Ekaterina Romanova 0b01dfbba6 Prevent data races in concurrent ThinLTO processes.
Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache files corruption).

1. Place temp file to the same place where the caching directory is (instead of creating it the directory pointed to by TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to non-atomic "direct" write to the cache file.
2. if rename failed do not write to the cache file directly (direct write to the file is non-atomic and could cause data race conditions).
3. if cache file doesn't exist (e.g., because 'rename' failed or because some other reasons), bypass using the cache altogether.

Differential Revision:  https://reviews.llvm.org/D45076

llvm-svn: 328904
2018-03-30 21:35:42 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
David Blaikie 8820929011 Sink Analysis/ObjectUtil(canBeOmittedFromSymbolTable) into IR so it can be legitimately be used by Object/IRSymtab
llvm-svn: 328135
2018-03-21 19:23:45 +00:00
Adam Nemet ade40dd3c8 [LTO] Return proper error object rather than null LTOModule
This caused a crash in LTOModule::createInLocalContext.

rdar://37926841

llvm-svn: 327359
2018-03-13 04:37:01 +00:00
Bob Haarman fb2d342299 Revert "[LTO] Support filtering by hotness threshold"
This reverts commit 1f3bd185c53beb6aa68446974b7e80837abd6ef0 (r326107)
because it fails
ThinLTO/X86/diagnostic-handler-remarks-with-hotness.ll.

llvm-svn: 326975
2018-03-08 01:13:10 +00:00
Adam Nemet b4ce3573c4 [LTO] Support filtering by hotness threshold
This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO.

Next is to change the clang driver to pass this
with -fdiagnostics-hotness-threshold.

Differential Revision: https://reviews.llvm.org/D41465

llvm-svn: 326107
2018-02-26 18:37:45 +00:00
Vitaly Buka a139b69e12 [ThinLTO] Always create linked objects file for --thinlto-index-only=
Summary:
ThinLTO indexing may decide to skip all objects. If we don't write something to
the list build system may consider this as failure or linker can reuse a file
from the previews build.

Reviewers: pcc, tejohnson

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D43415

llvm-svn: 325819
2018-02-22 19:06:15 +00:00
Teresa Johnson a344fd3db6 [LTO] Remove unused Path parameter to AddBufferFn
Summary:
With D43396, no clients use the Path parameter anymore.

Depends on D43396.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D43400

llvm-svn: 325619
2018-02-20 20:21:53 +00:00
Charles Saternos b040fcc693 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Third attempt - moved function from lambda to static function due to build failures.

llvm-svn: 325506
2018-02-19 15:14:50 +00:00
Simon Pilgrim 0efed32577 Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

Reverted due to buildbot failures

llvm-svn: 325454
2018-02-18 00:01:36 +00:00
Charles Saternos 35878ee7a4 [ThinLTO] Add GraphTraits for FunctionSummaries
Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to construct find SCC's in ThinLTO analysis passes.

Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function).

llvm-svn: 325448
2018-02-17 21:39:24 +00:00