Commit Graph

24 Commits

Author SHA1 Message Date
Eugene Leviant 3aef35288b [ThinLTO] Attempt to recommit r365188 after alignment fix
llvm-svn: 365215
2019-07-05 15:25:05 +00:00
Eugene Leviant e91f86f0ac Reverted r365188 due to alignment problems on i686-android
llvm-svn: 365206
2019-07-05 13:26:05 +00:00
Eugene Leviant 820cc01d1e [ThinLTO] Attempt to recommit r365040 after caching fix
It's possible that some function can load and store the same
variable using the same constant expression:

store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**)
%42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**)

The bitcast expression was mistakenly cached while processing loads,
and never examined later when processing store. This caused @bar to
be mistakenly treated as read-only variable. See load-store-caching.ll.

llvm-svn: 365188
2019-07-05 12:00:10 +00:00
Reid Kleckner f7e52fbdb5 Revert [ThinLTO] Optimize writeonly globals out
This reverts r365040 (git commit 5cacb91475)

Speculatively reverting, since this appears to have broken check-lld on
Linux. Partial analysis in https://crbug.com/981168.

llvm-svn: 365097
2019-07-04 00:03:30 +00:00
Eugene Leviant 5cacb91475 [ThinLTO] Optimize writeonly globals out
Differential revision: https://reviews.llvm.org/D63444

llvm-svn: 365040
2019-07-03 14:14:52 +00:00
Teresa Johnson 37b80122bd [ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59709

llvm-svn: 360466
2019-05-10 20:08:24 +00:00
Teresa Johnson 290a839891 [LTO] Record whether LTOUnit splitting is enabled in index
Summary:
Records in the module summary index whether the bitcode was compiled
with the option necessary to enable splitting the LTO unit
(e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit).

The information is passed down to the ModuleSummaryIndex builder via a
new module flag "EnableSplitLTOUnit", which is propagated onto a flag
on the summary index.

This is then used during the LTO link to check whether all linked
summaries were built with the same value of this flag. If not, an error
is issued when we detect a situation requiring whole program visibility
of the class hierarchy. This is the case when both of the following
conditions are met:
1) We are performing LowerTypeTests or Whole Program Devirtualization.
2) There are type tests or type checked loads in the code.

Note I have also changed the ThinLTOBitcodeWriter to also gate the
module splitting on the value of this flag.

Reviewers: pcc

Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D53890

llvm-svn: 350948
2019-01-11 18:31:57 +00:00
Easwaran Raman 5a7056fa03 [ThinLTO] Compute synthetic function entry count
Summary:
This patch computes the synthetic function entry count on the whole
program callgraph (based on module summary) and writes the entry counts
to the summary. After function importing, this count gets attached to
the IR as metadata. Since it adds a new field to the summary, this bumps
up the version.

Reviewers: tejohnson

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D43521

llvm-svn: 349076
2018-12-13 19:54:27 +00:00
Eugene Leviant bf46e7410c [ThinLTO] Internalize readonly globals
An attempt to recommit r346584 after failure on OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
test case

llvm-svn: 347033
2018-11-16 07:08:00 +00:00
Steven Wu fa43892d6f Revert "[ThinLTO] Internalize readonly globals"
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.

llvm-svn: 346768
2018-11-13 17:35:04 +00:00
Eugene Leviant be8d19967a [ThinLTO] Internalize readonly globals
This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions

Differential revision: https://reviews.llvm.org/D49362

llvm-svn: 346584
2018-11-10 08:31:21 +00:00
Teresa Johnson 63ee0e73e4 [ThinLTO] Parse module summary index from assembly
Summary:
Adds assembly parsing support for the module summary index (follow on
to r333335 which added the assembly writing support).

I added support to llvm-as to invoke the index parsing, so that it can
create either a bitcode file with a Module and a per-module index, or
a combined index without a Module.

I will send follow on patches soon to do the following:
- add support to tools such as llvm-lto2 to parse the per-module indexes
from assembly instead of bitcode when testing the thin link.
- verification support.

Depends on D47844 and D47842.

Reviewers: pcc, dexonsmith, mehdi_amini

Subscribers: inglorion, eraman, steven_wu, llvm-commits

Differential Revision: https://reviews.llvm.org/D47905

llvm-svn: 335602
2018-06-26 13:56:49 +00:00
Teresa Johnson fb89e7a943 [ThinLTO] Fix a few more test match issues
Fix a few more bot failures due to r333335:
- don't match path other than file name, since the delimiter is
different for Windows
- The summary IDs in thinlto-function-summary-refgraph.ll may vary
and therefore can't be matched exactly, because the ordering depends
on the iteration order of the index map which is keyed by GUID. The GUID
for private values will depend on the path.

llvm-svn: 333338
2018-05-26 03:50:29 +00:00
Teresa Johnson 08d5b4ef0d [ThinLTO] Print module summary index to assembly
Summary:
Implements AsmWriter support for printing the module summary index to
assembly with the format discussed in the RFC "LLVM Assembly format for
ThinLTO Summary".

Implements just enough of the parsing support to recognize and ignore
the summary entries. As agreed in the RFC thread, this will be the
behavior when assembling the IR. A follow on change will implement
parsing/assembling of the summary entries for use by tools that
currently build the summary index from bitcode.

Reviewers: dexonsmith, pcc

Subscribers: inglorion, eraman, steven_wu, dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D46699

llvm-svn: 333335
2018-05-26 02:34:13 +00:00
Teresa Johnson f368101567 [ThinLTO] Serialize WithGlobalValueDeadStripping index flag for distributed backends
Summary:
A recent fix to drop dead symbols (r323633) did not work for ThinLTO
distributed backends because we lose the WithGlobalValueDeadStripping
set on the index during the thin link. This patch adds a new flags
record to the bitcode format for the index, and serializes this flag
for the combined index (it would always be 0 for the per-module index
generated by the compile step, so no need to serialize the new flags
record there until/unless we add another flag that applies to the
per-module indexes).

Generally this flag should always be set for the distributed backends,
which are necessarily performed after the thin link. However, if we were
to simply set this flag on the index applied to the distributed backends
(invoked via clang), we would lose the ability to disable dead stripping
via -compute-dead=false for debugging purposes.

Reviewers: grimar, pcc

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D42799

llvm-svn: 324444
2018-02-07 04:05:59 +00:00
Charles Saternos 75da10d1b2 [ThinLTO] Add FunctionAttrs to ThinLTO index
Adds function attributes to index: ReadNone, ReadOnly, NoRecurse, NoAlias. This attributes will be used for future ThinLTO optimizations that will propagate function attributes across modules.

llvm-svn: 310061
2017-08-04 16:00:58 +00:00
Dehao Chen 64c46574b0 Increase the import-threshold for crtical functions.
Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D35096

llvm-svn: 307439
2017-07-07 21:01:00 +00:00
Peter Collingbourne 92648c25a4 Bitcode: Write the irsymtab to disk.
Differential Revision: https://reviews.llvm.org/D33973

llvm-svn: 306487
2017-06-27 23:50:11 +00:00
Teresa Johnson 2a6b7991d4 Restrict call metadata based hotness detection to Sample PGO mode
Summary:
Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).

This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling, and to add in profile summary metadata
so that getProfileCount can get the summary.

Reviewers: davidxl, danielcdh

Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32877

llvm-svn: 302844
2017-05-11 23:18:05 +00:00
Peter Collingbourne a0f371a106 Bitcode: Add a string table to the bitcode format.
Add a top-level STRTAB block containing a string table blob, and start storing
strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in
the string table.

This change allows us to share names between globals and comdats as well
as between modules, and improves the efficiency of loading bitcode files by
no longer using a bit encoding for symbol names. Once we start writing the
irsymtab to the bitcode file we will also be able to share strings between
it and the module.

On my machine, link time for Chromium for Linux with ThinLTO decreases by
about 7% for no-op incremental builds or about 1% for full builds. Total
bitcode file size decreases by about 3%.

As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html

Differential Revision: https://reviews.llvm.org/D31838

llvm-svn: 300464
2017-04-17 17:51:36 +00:00
Dehao Chen 190f17cae7 Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummary
Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount.

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31154

llvm-svn: 298404
2017-03-21 17:22:35 +00:00
Dehao Chen a60cdd3881 Add function importing info from samplepgo profile to the module summary.
Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D30053

llvm-svn: 296498
2017-02-28 18:09:44 +00:00
Peter Collingbourne 0c30f089d5 IR: Eliminate non-determinism in the module summary analysis.
Also make the summary ref and call graph vectors immutable. This means
a smaller API surface and fewer places to audit for non-determinism.

Differential Revision: https://reviews.llvm.org/D27875

llvm-svn: 290200
2016-12-20 21:12:28 +00:00
Piotr Padlewski d9830eb79f [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00