Each function summary has an attached list of type identifier GUIDs. The
idea is that during the regular LTO phase we would match these GUIDs to type
identifiers defined by the regular LTO module and store the resolutions in
a top-level "type identifier summary" (which will be implemented separately).
Differential Revision: https://reviews.llvm.org/D27967
llvm-svn: 290280
Also make the summary ref and call graph vectors immutable. This means
a smaller API surface and fewer places to audit for non-determinism.
Differential Revision: https://reviews.llvm.org/D27875
llvm-svn: 290200
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
This reapplies r289902 with additional testcase upgrades and a change
to the Bitcode record for DIGlobalVariable, that makes upgrading the
old format unambiguous also for variables without DIExpressions.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
llvm-svn: 290153
This reverts commit 289920 (again).
I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
has not DIExpression. Unfortunately it is not possible to safely upgrade
these variables without adding a flag to the bitcode record indicating which
version they are.
My plan of record is to roll the planned follow-up patch that adds a
unit: field to DIGlobalVariable into this patch before recomitting.
This way we only need one Bitcode upgrade for both changes (with a
version flag in the bitcode record to safely distinguish the record
formats).
Sorry for the churn!
llvm-svn: 289982
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
This reapplies r289902 with additional testcase upgrades.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
llvm-svn: 289920
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.
Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:
(1) The DIGlobalVariable should describe the source level variable,
not how to get to its location.
(2) It makes it unsafe/hard to update the expressions when we call
replaceExpression on the DIGLobalVariable.
(3) It makes it impossible to represent a global variable that is in
more than one location (e.g., a variable with multiple
DW_OP_LLVM_fragment-s). We also moved away from attaching the
DIExpression to DILocalVariable for the same reasons.
<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769
llvm-svn: 289902
so we can stop using DW_OP_bit_piece with the wrong semantics.
The entire back story can be found here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161114/405934.html
The gist is that in LLVM we've been misinterpreting DW_OP_bit_piece's
offset field to mean the offset into the source variable rather than
the offset into the location at the top the DWARF expression stack. In
order to be able to fix this in a subsequent patch, this patch
introduces a dedicated DW_OP_LLVM_fragment operation with the
semantics that we used to apply to DW_OP_bit_piece, which is what we
actually need while inside of LLVM. This patch is complete with a
bitcode upgrade for expressions using the old format. It does not yet
fix the DWARF backend to use DW_OP_bit_piece correctly.
Implementation note: We discussed several options for implementing
this, including reserving a dedicated field in DIExpression for the
fragment size and offset, but using an custom operator at the end of
the expression works just fine and is more efficient because we then
only pay for it when we need it.
Differential Revision: https://reviews.llvm.org/D27361
rdar://problem/29335809
llvm-svn: 288683
This interface allows clients to write multiple modules to a single
bitcode file. Also introduce the llvm-cat utility which can be used
to create a bitcode file containing multiple modules.
Differential Revision: https://reviews.llvm.org/D26179
llvm-svn: 288195
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.
This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.
As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
Differential Revision: https://reviews.llvm.org/D22793
llvm-svn: 286514
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106595.html
This change also fixes an API oddity where BitstreamCursor::Read() would
return zero for the first read past the end of the bitstream, but would
report_fatal_error for subsequent reads. Now we always report_fatal_error
for all reads past the end. Updated clients to check for the end of the
bitstream before reading from it.
I also needed to add padding to the invalid bitcode tests in
test/Bitcode/. This is because the streaming interface was not checking that
the file size is a multiple of 4.
Differential Revision: https://reviews.llvm.org/D26219
llvm-svn: 285773
- Add alignment attribute to DIVariable family
- Modify bitcode format to match new DIVariable representation
- Update tests to match these changes (also add bitcode upgrade test)
- Expect that frontend passes non-zero align value only when it is not default
(was forcibly aligned by alignas()/_Alignas()/__atribute__(aligned())
Differential Revision: https://reviews.llvm.org/D25073
llvm-svn: 284678
Summary:
When there is a call to an alias in the same module, we were not
adding a call edge. So we could incorrectly think that the alias
was dead if it was inlined in that function, despite having a
reference imported elsewhere. This resulted in unsats at link time.
Add a call edge when the call is to an alias.
Reviewers: davide, mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25384
llvm-svn: 283664
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.
I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.
I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.
Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.
Reviewers: eraman, mehdi_amini, tejohnson
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24638
llvm-svn: 282437
Summary:
Emit an empty summary section, instead of no summary section, when
there are no global variables in the index. This ensures that LTO
will treat these files as ThinLTO inputs, instead of as regular
LTO inputs.
In addition to not being what the user likely intended when
compiling with -flto=thin, the current behavior is problematic for
distributed build systems that expect to get ThinLTO index and imports
files back for each input compiled with -flto=thin. Combining into
a single regular LTO module also reduces the backend parallelism.
And in the case where the index was suppressed due to uses in
inline assembly, combining into a single LTO module could provoke
renaming of duplicates that we were trying to prevent by suppressing
the index.
This change required a couple of fixes to handle the empty summary
section.
Reviewers: mehdi_amini
Subscribers: mehdi_amini, llvm-commits, pcc
Differential Revision: https://reviews.llvm.org/D24779
llvm-svn: 282037
Previous we were issuing an error when linking a module containing
the new Objective-C metadata structure for class properties with an
"old" one.
Now instead we downgrade the module flag so that the Objective-C
runtime does not expect the new metadata structure.
This is consistent with what ld64 is doing on binary files.
Differential Revision: https://reviews.llvm.org/D24620
llvm-svn: 281685
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.
Fixes PR30362. A program for upgrading test cases is attached to that
bug.
Differential Revision: http://reviews.llvm.org/D20147
llvm-svn: 281284
Fork off compatibility.ll for the 3.9 release. The *.bc file in this
commit was produced using a Release build of the release_39 branch.
llvm-svn: 281059
Summary:
This file should be referenced from
thinlto-function-summary-callgraph-pgo.ll file,
but someone forgot to use it there. Everything worked because
we store pgo data about callsite blocks, so there is no need to have
pgo count of @func.
Reviewers: tejohnson, eraman, mehdi_amini
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24309
llvm-svn: 280882
Summary:
Port the NameAnonFunction pass and add a test.
Depends on D23439.
Reviewers: mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23440
llvm-svn: 278509
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).
Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.
Reviewers: mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23439
llvm-svn: 278508
Summary:
This patch adds IsVariadicFunction bit to summary in order
to not import variadic functions. Inliner doesn't inline
variadic functions because it is hard to reason about it.
This one small fix improves Importer by about 16%
(going from 86% to 100% of imported functions that are
inlined anywhere)
on some spec benchmarks like 'int' and others.
Reviewers: eraman, mehdi_amini, tejohnson
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D23339
llvm-svn: 278432
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.
Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.
Reviewers: reames, joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18714
llvm-svn: 274485
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.
This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).
Reviewers: aprantl, dexonsmith
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D21614
llvm-svn: 274325
The function name Module::empty() is slightly misleading in that it
only tests for the presence of functions in the module. However we
still want to emit the module summary if the module contains only
global variables or aliases. The presence of such entities can be
determined simply by checking the summary directly, as we are doing
below.
Differential Revision: http://reviews.llvm.org/D21669
llvm-svn: 273638
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709
When we have "Image Info Version" module flag but don't have "Class Properties"
module flag, set "Class Properties" module flag to 0, so we can correctly emit
errors when one module has the flag set and another module does not.
rdar://26469641
llvm-svn: 270791
Summary:
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.
Also, in the future unused non-kernels may be optimized.
For now, also accept SPIR_KERNEL for HCC frontend.
Also, add bitcode compatibility tests for missing calling conventions
except AVR_BUILTIN which doesn't have parse code.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, joker.eph, llvm-commits
llvm-svn: 268717
Summary:
With the removal of support for lazy parsing of combined index summary
records (e.g. r267344), we no longer need to include the summary record
bitcode offset in the VST entries for definitions. Change the combined
index format to be similar to the per-module index format in using value
ids to cross-reference from the summary record to the VST entry (rather
than the summary record bitcode offset to cross-reference in the other
direction).
The visible changes are:
1) Add the value id to the combined summary records
2) Remove the summary offset from the combined VST records, which has
the following effects:
- No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all
combined index VST entries now only contain the value id and
corresponding GUID.
- No longer have duplicate VST entries in the case where there are
multiple definitions of a symbol (e.g. weak/linkonce), as they all
have the same value id and GUID.
An implication of #2 above is that in order to hook up an alias to the
correct aliasee based on the value id of the aliasee recorded in the
combined index alias record, we need to scan the entries in the index
for that GUID to find the one from the same module (i.e. the case where
there are multiple entries for the aliasee). But the reader no longer
has to maintain a special map to hook up the alias/aliasee.
Reviewers: joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19481
llvm-svn: 267712
Add tests for some missing cases to bitcode upgrade in r267296.
- DICompositeType with an 'elements:' field, which will cause it to be
involved in a cycle after the upgrade.
- A DIDerivedType that references a class in 'extraData:'.
I updated test/Bitcode/dityperefs-3.8.ll with the missing cases and
regenerated test/Bitcode/dityperefs-3.8.ll.bc.
llvm-svn: 267332
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.
Differential Revision: http://reviews.llvm.org/D19404
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
It seems we still have some ordering issue in the combined index
emission, but I can't figure out why right now.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267306
This is a fixup for r267304.
The test was sensitive to the path in a subtle way:
the index in memory is sorted by GUID, which are hashes
that include the source filename for local globals.
Teresa recently added a directive at the IR level, so
we can specify it here to make the test independent of
the path.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267305
Summary:
As discussed in D18298, some local globals can't
be renamed/promoted (because they have a section, or because
they are referenced from inline assembly).
To be able to detect naming collision, we need to keep around
the "GUID" using their original name without taking the linkage
into account.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19454
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267304
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*. It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.
Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType. The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.
This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata. Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.
The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
llvm-svn: 267296
Since forward references for uniqued node operands are expensive (and
those for distinct node operands are cheap due to
DistinctMDOperandPlaceholder), minimize forward references in uniqued
node operands.
Moreover, guarantee that when a cycle is broken by a distinct node, none
of the uniqued nodes have any forward references. In
ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs
first, delaying distinct nodes until all uniqued nodes have been
handled. This guarantees that uniqued nodes only have forward
references when there is a uniquing cycle (since r267276 changed
ValueEnumerator::organizeMetadata to partition distinct nodes in front
of uniqued nodes as a post-pass).
Note that a single uniqued subgraph can hit multiple distinct nodes at
its leaves. Ideally these would themselves be emitted in post-order,
but this commit doesn't attempt that; I think it requires an extra pass
through the edges, which I'm not convinced is worth it (since
DistinctMDOperandPlaceholder makes forward references quite cheap
between distinct nodes).
I've added two testcases:
- test/Bitcode/mdnodes-distinct-in-post-order.ll is just like
test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes
instead of uniqued ones. This confirms that, in the absence of
uniqued nodes, distinct nodes are still emitted in post-order.
- test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal
example where a naive post-order traversal would cause one uniqued
node to forward-reference another. IOW, it's the motivating test.
llvm-svn: 267278
When an operand of a distinct node hasn't been read yet, the reader can
use a DistinctMDOperandPlaceholder. This is much cheaper than forward
referencing from a uniqued node. Change
ValueEnumerator::organizeMetadata to partition distinct nodes and
uniqued nodes to reduce the overhead of cycles broken by distinct nodes.
Mehdi measured this for me; this removes most of the RAUW from the
importing step of -flto=thin, even after a WIP patch that removes
string-based DITypeRefs (introducing many more cycles to the metadata
graph).
llvm-svn: 267276
Emit metadata nodes in post-order. The iterative algorithm from r266709
failed to maintain this property. After understanding my mistake, it
wasn't too hard to write a test with llvm-bcanalyzer (and I've actually
made this change once before: see r220340).
This also reverts the "noisy" testcase change from r266709. That should
have been more of a red flag :/.
Note: The same bug crept into the ValueMapper in r265456. I'm still
working on the fix.
llvm-svn: 266947
Use a worklist instead of recursing through MDNode operands in
ValueEnumerator. The actual record output order has changed slightly,
but otherwise there's no functionality change.
I had to update test/Bitcode/metadata-function-blocks.ll. I renumbered
nodes so they continue to match the implicit record ids.
llvm-svn: 266709
To be able to work accurately on the reference graph when taking
decision about internalizing, promoting, renaming, etc. We need
to have the alias information explicit.
Differential Revision: http://reviews.llvm.org/D18836
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266517
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.
Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.
Motivation
----------
Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.
We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.
Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.
http://reviews.llvm.org/D19034
<rdar://problem/25256815>
llvm-svn: 266446
Summary:
To be able to work accurately on the reference graph when taking decision
about internalizing, promoting, renaming, etc. We need to have the alias
information explicit.
Reviewers: tejohnson
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18836
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266214
Summary:
Let keep llvm-as "dumb": it converts textual IR to bitcode. This
commit removes the dependency from llvm-as to libLLVMAnalysis.
We'll add back summary in llvm-as if we get to a textual
representation for it at some point. In the meantime, opt seems
like a better place for that.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19032
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266131
`allocsize` is a function attribute that allows users to request that
LLVM treat arbitrary functions as allocation functions.
This patch makes LLVM accept the `allocsize` attribute, and makes
`@llvm.objectsize` recognize said attribute.
The review for this was split into two patches for ease of reviewing:
D18974 and D14933. As promised on the revisions, I'm landing both
patches as a single commit.
Differential Revision: http://reviews.llvm.org/D14933
llvm-svn: 266032
Summary:
This is the first step in also serializing the index out to LLVM
assembly.
The per-module summary written to bitcode is moved out of the bitcode
writer and to a new analysis pass (ModuleSummaryIndexWrapperPass).
The pass itself uses a new builder class to compute index, and the
builder class is used directly in places where we don't have a pass
manager (e.g. llvm-as).
Because we are computing summaries outside of the bitcode writer, we no
longer can use value ids created by the bitcode writer's
ValueEnumerator. This required changing the reference graph edge type
to use a new ValueInfo class holding a union between a GUID (combined
index) and Value* (permodule index). The Value* are converted to the
appropriate value ID during bitcode writing.
Also, this enables removal of the BitWriter library's dependence on the
Analysis library that was previously required for the summary computation.
Reviewers: joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18763
llvm-svn: 265941
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. Patch
for Clang http://reviews.llvm.org/D15524
Differential Revision: http://reviews.llvm.org/D15525
llvm-svn: 265667
Whenever metadata is only referenced by a single function, emit the
metadata just in that function block. This should improve lazy-loading
by reducing the amount of metadata in the global block.
For now, this should catch all DILocations, and anything else that
happens to be referenced only by a single function.
It's also a first step toward a couple of possible future directions
(which this commit does *not* implement):
1. Some debug info metadata is only referenced from compile units and
individual functions. If we can drop the link from the compile
unit, this optimization will get more powerful.
2. Any uniqued metadata that isn't referenced globally can in theory be
emitted in every function block that references it (trading off
bitcode size and full-parse time vs. lazy-load time).
Note: this assumes the new BitcodeReader error checking from r265223.
The metadata stored in function blocks gets purged after parsing each
function, which means unresolved forward references will get lost.
Since all the global metadata should have already been resolved by the
time we get to the function metadata blocks we just need to check for
that case. (If for some reason we need to handle bitcode that fails the
checks in r265223, the fix is to store about-to-be-dropped unresolved
nodes in MetadataList::shrinkTo until they can be handled succesfully by
a future call to MetadataList::tryToResolveCycles.)
llvm-svn: 265226
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.
This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.
Differential Revision: http://reviews.llvm.org/D18092
llvm-svn: 265189
This is intended to be used for ThinLTO incremental build.
Differential Revision: http://reviews.llvm.org/D18213
This is a recommit of r265095 after fixing the Windows issues.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265111
This reverts commit r265096, r265095, and r265094.
Windows build is broken, and the validation does not pass.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265102
This is intended to be used for ThinLTO incremental build.
Differential Revision: http://reviews.llvm.org/D18213
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265095
Spiritually reapply commit r264409 (reverted in r264410), albeit with a
bit of a redesign.
Firstly, avoid splitting the big blob into multiple chunks of strings.
r264409 imposed an arbitrary limit to avoid a massive allocation on the
shared 'Record' SmallVector. The bug with that commit only reproduced
when there were more than "chunk-size" strings. A test for this would
have been useless long-term, since we're liable to adjust the chunk-size
in the future.
Thus, eliminate the motivation for chunk-ing by storing the string sizes
in the blob. Here's the layout:
vbr6: # of strings
vbr6: offset-to-blob
blob:
[vbr6]: string lengths
[char]: concatenated strings
Secondly, make the output of llvm-bcanalyzer readable.
I noticed when debugging r264409 that llvm-bcanalyzer was outputting a
massive blob all in one line. Past a small number, the strings were
impossible to split in my head, and the lines were way too long. This
version adds support in llvm-bcanalyzer for pretty-printing.
<STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 {
'abc'
'def'
'ghi'
}
From the original commit:
Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.
llvm-svn: 264551
The implementation is fairly obvious. This is preparation for using
some blobs in bitcode.
For clarity (and perhaps future-proofing?), I moved the call to
JumpToBit in BitstreamCursor::readRecord ahead of calling
MemoryObject::getPointer, since JumpToBit can theoretically (a) read
bytes, which (b) invalidates the blob pointer.
This isn't strictly necessary the two memory objects we have:
- The return of RawMemoryObject::getPointer is valid until the memory
object is destroyed.
- StreamingMemoryObject::getPointer is valid until the next chunk is
read from the stream. Since the JumpToBit call is only going ahead
to a word boundary, we'll never load another chunk.
However, reordering makes it clear by inspection that the blob returned
by BitstreamCursor::readRecord will be valid.
I added some tests for StreamingMemoryObject::getPointer and
BitstreamCursor::readRecord.
llvm-svn: 264549
Optimize output of MDStrings in bitcode. This emits them in big blocks
(currently 1024) in a pair of records:
- BULK_STRING_SIZES: the sizes of the strings in the block, and
- BULK_STRING_DATA: a single blob, which is the concatenation of all
the strings.
Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.
I needed to add support for blobs to streaming input to get the test
suite passing.
- StreamingMemoryObject::getPointer reads ahead and returns the
address of the blob.
- To avoid a possible reallocation of StreamingMemoryObject::Bytes,
BitstreamCursor::readRecord needs to move the call to JumpToEnd
forward so that getPointer is the last bitstream operation.
llvm-svn: 264409
Summary: If TBAA is on an intrinsic and it gets upgraded and drops the TBAA we hit an odd assert. We should just upgrade the TBAA first because it doesn't have side-effects.
Reviewers: reames, apilipenko, manmanren
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18229
llvm-svn: 263673
Fork off compatibility.ll for the 3.8 release. The *.bc file in this
commit was produced using a Release build of the release_38 branch.
llvm-svn: 263620
(Resubmitting after fixing missing file issue)
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.
A companion clang patch will immediately succeed this patch to reflect
this renaming.
llvm-svn: 263513
With the changes in r263275, there are now more than just functions in
the summary. Completed the renaming of data structures (started in
r263275) to reflect the wider scope. In particular, changed the
FunctionIndex* data structures to ModuleIndex*, and renamed related
variables and comments. Also renamed the files to reflect the changes.
A companion clang patch will immediately succeed this patch to reflect
this renaming.
llvm-svn: 263490
Summary:
This patch adds support for including a full reference graph including
call graph edges and other GV references in the summary.
The reference graph edges can be used to make importing decisions
without materializing any source modules, can be used in the plugin
to make file staging decisions for distributed build systems, and is
expected to have other uses.
The call graph edges are recorded in each function summary in the
bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
there is PGO, where the ValueId can be mapped to the function GUID via
the ValueSymbolTable. In the function index in memory, the call graph
edges reference the target via the CalleeGUID instead of the
CalleeValueId.
The reference graph edges are recorded in each summary record with a
list of referenced value IDs, which can be mapped to value GUID via the
ValueSymbolTable.
Addtionally, a new summary record type is added to record references
from global variable initializers. A number of bitcode records and data
structures have been renamed to reflect the newly expanded scope of the
summary beyond functions. More cleanup will follow.
Reviewers: joker.eph, davidxl
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D17212
llvm-svn: 263275
Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser.
Reviewers: arsenm
Subscribers: dylanmckay, llvm-commits
Differential Revision: http://reviews.llvm.org/D16348
llvm-svn: 262600
This restores commit r260408, along with a fix for a bot failure.
The bot failure was caused by dereferencing a unique_ptr in the same
call instruction parameter list where it was passed via std::move.
Apparently due to luck this was not exposed when I built the compiler
with clang, only with gcc.
llvm-svn: 260442
Summary:
This patch uses the lower 64-bits of the MD5 hash of a function name as
a GUID in the function index, instead of storing function names. Any
local functions are first given a global name by prepending the original
source file name. This is the same naming scheme and GUID used by PGO in
the indexed profile format.
This change has a couple of benefits. The primary benefit is size
reduction in the combined index file, for example 483.xalancbmk's
combined index file was reduced by around 70%. It should also result in
memory savings for the index file in memory, as the in-memory map is
also indexed by the hash instead of the string.
Second, this enables integration with indirect call promotion, since the
indirect call profile targets are recorded using the same global naming
convention and hash. This will enable the function importer to easily
locate function summaries for indirect call profile targets to enable
their import and subsequent promotion.
The original source file name is recorded in the bitcode in a new
module-level record for use in the ThinLTO backend pipeline.
Reviewers: davidxl, joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D17028
llvm-svn: 260408
Summary:
Adds the linkage type to both the per-module and combined function
summaries, which subsumes the current islocal bit. This will eventually
be used to optimized linkage types based on global summary-based
analysis.
Reviewers: joker.eph
Subscribers: joker.eph, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D16943
llvm-svn: 259993
This patch enables llvm-bcanalyzer to print the bitcode wrapper header
if the file has one, which is needed to test the changes made in
r258627 (bitcode-wrapper-header-armv7m.ll is the test case for r258627).
Differential Revision: http://reviews.llvm.org/D16642
llvm-svn: 259162
Summary:
Funclet EH personalities require a tree-like nesting among funclets
(enforced by the ParentPad linkage in the IR), and also require that
unwind edges conform to certain rules with respect to the tree:
- An unwind edge may exit 0 or more ancestor pads
- An unwind edge must enter exactly one EH pad, which must be distinct
from any exited pads
- A cleanupret's edge must exit its cleanuppad
Describe these rules in the LangRef, and enforce them in the verifier.
Reviewers: rnk, majnemer, andrew.w.kaylor
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15961
llvm-svn: 257272
With r256990, bogner introduced comprehensive tests for constant arrays
and vectors. We no longer need the existing ones because they are
redundant.
llvm-svn: 256991
I added a couple of tests in r256982, but vedantk suggested that they
fit better into compatibility.ll, since they could catch format breaks
later on there.
llvm-svn: 256990
In r254991 I allowed ConstantDataVectors to contain elements of
HalfTy, but I missed updating the bitcode reader and writer to handle
this, so now we crash if we try to emit bitcode on programs that have
constant vectors of half.
This fixes the issue and adds test coverage for reading and writing
constant sequences in bitcode.
llvm-svn: 256982
Summary: A catchswitch cannot be a parent of a cleanuppad or another catchswitch.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15841
llvm-svn: 256690
Summary:
This patch introduces two new function attributes
InaccessibleMemOnly: This attribute indicates that the function may only access memory that is not accessible by the program/IR being compiled. This is a weaker form of ReadNone.
inaccessibleMemOrArgMemOnly: This attribute indicates that the function may only access memory that is either not accessible by the program/IR being compiled, or is pointed to by its pointer arguments. This is a weaker form of ArgMemOnly
Test cases have been updated. This revision uses this (d001932f3a) as reference.
Reviewers: jmolloy, hfinkel
Subscribers: reames, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D15499
llvm-svn: 255778