This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.
The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches. For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.
The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.
Thanks to Adrian Prantl for stewarding this patch!
llvm-svn: 285094
These functions are about classifying a global which will actually be
emitted, so it does not make sense for them to take a GlobalValue which may
for example be an alias.
Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to
look through aliases before using TargetLoweringObjectFile interfaces. These
are functional changes but all appear to be bug fixes.
Differential Revision: https://reviews.llvm.org/D25917
llvm-svn: 285006
- Add alignment attribute to DIVariable family
- Modify bitcode format to match new DIVariable representation
- Update tests to match these changes (also add bitcode upgrade test)
- Expect that frontend passes non-zero align value only when it is not default
(was forcibly aligned by alignas()/_Alignas()/__atribute__(aligned())
Differential Revision: https://reviews.llvm.org/D25073
llvm-svn: 284678
Summary:
The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation:
1. add a new metadata for function section prefix
2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function
3. output the section prefix set by CGP
Reviewers: davidxl, eraman
Subscribers: vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D24989
llvm-svn: 284533
In futher patches we shall have alignment field added to DIVariable family
and switching from uint64_t to uint32_t will save 4 bytes per variable.
Differential Revision: https://reviews.llvm.org/D25620
llvm-svn: 284482
If -coverage is passed, but -g is not, clang populates the PassManager
pipeline with StripSymbols(debugOnly = true).
The stripSymbol pass therefore scans the list of named metadata,
drops !llvm.dbg.cu, but leaves !llvm.gcov and !0 (the compileUnit MD)
around. The verifier runs, and finds out that there's a CU not listed
in !llvm.dbg.cu (as it was previously dropped) -> crash.
When we strip debug info, so, check if there's coverage data,
and strip it as well, in order to avoid pending metadata left around.
Differential Revision: https://reviews.llvm.org/D25689
llvm-svn: 284418
The Register Calling Convention (RegCall) was introduced by Intel to optimize parameter transfer on function call.
This calling convention ensures that as many values as possible are passed or returned in registers.
This commit presents the basic additions to LLVM CodeGen in order to support RegCall in X86.
Differential Revision: http://reviews.llvm.org/D25022
llvm-svn: 284108
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:
va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
llvm-svn: 283671
Value names may be prefixed with a binary '1' to indicate that the
backend should not modify the symbols due to any platform naming
convention.
This should not show up in the YAML opt record file because it breaks
the YAML parser.
llvm-svn: 283656
We need to add an entry in the combined-index for modules that have
a hash but otherwise empty summary, this is needed so that we can
get the hash for the module.
Also, if no entry is present in the combined index for a module, we
need to skip it when trying to compute a cache entry.
Differential Revision: https://reviews.llvm.org/D25300
llvm-svn: 283654
This adds a new function to DebugInfo.cpp that takes an llvm::Module
as input and removes all debug info metadata that is not directly
needed for line tables, thus effectively stripping all type and
variable information from the module.
The primary motivation for this feature was the bitcode work flow
(cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html
for more background). This is not wired up yet, but will be in
subsequent patches. For testing, the new functionality is exposed to
opt with a -strip-nonlinetable-debuginfo option.
The secondary use-case (and one that works right now!) is as a
reduction pass in bugpoint. I added two new bugpoint options
(-disable-strip-debuginfo and -disable-strip-debug-types) to control
the new features. By default it will first attempt to remove all debug
information, then only the type info, and then proceed to hack at any
remaining MDNodes.
llvm-svn: 283473
This came out of a discussion in https://reviews.llvm.org/D25285.
There used to be various other llvm.dbg.* nodes, but we don't support
upgrading them and we want to reserve the namespace for future uses.
This also removes an entirely obsolete and bitrotted testcase for PR7662.
Reapplies 283390 with a forgotten testcase.
llvm-svn: 283400
This came out of a discussion in https://reviews.llvm.org/D25285.
There used to be various other llvm.dbg.* nodes, but we don't support
upgrading them and we want to reserve the namespace for future uses.
This also removes an entirely obsolete and bitrotted testcase for PR7662.
llvm-svn: 283390
Summary:
These are analog to the existing LLVMConstExactSDiv and LLVMBuildExactSDiv
functions.
Reviewers: deadalnix, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D25259
llvm-svn: 283269
If the llvm. prefix is dropped other parts of llvm don't see this as
an intrinsic. This means that the number of regular symbols depends
on the context the module is loaded into, which causes LTO to abort.
Fixes PR30509.
llvm-svn: 283117
This change teaches getEquivalentICmp to be smarter about generating
ICMP_NE and ICMP_EQ predicates.
An earlier version of this change was landed as rL283057 which had a
use-after-free bug. This new version has a fix for that bug, and a (C++
unittests/) test case that would have triggered it rL283057.
llvm-svn: 283078
They've broken the sanitizer-bootstrap bots. Reverting while I investigate.
Original commit messages:
r283057: "[ConstantRange] Make getEquivalentICmp smarter"
r283058: "[SCEV] Rely on ConstantRange instead of custom logic; NFCI"
llvm-svn: 283062
(Re-committed after moving the template specialization under the yaml
namespace. GCC was complaining about this.)
This allows various presentation of this data using an external tool.
This was first recommended here[1].
As an example, consider this module:
1 int foo();
2 int bar();
3
4 int baz() {
5 return foo() + bar();
6 }
The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):
remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
--- !Missed
Pass: inline
Name: NotInlined
DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 }
Function: baz
Hotness: 30
Args:
- Callee: foo
- String: will not be inlined into
- Caller: baz
...
--- !Missed
Pass: inline
Name: NotInlined
DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 }
Function: baz
Hotness: 30
Args:
- Callee: bar
- String: will not be inlined into
- Caller: baz
...
This is a summary of the high-level decisions:
* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:
ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
DEBUG_TYPE, "NotInlined", &I)
<< NV("Callee", Callee) << " will not be inlined into "
<< NV("Caller", CS.getCaller()) << setIsVerbose());
NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.
Subsequent patches will update ORE users to use the new streaming API.
* I am using YAML I/O for writing the YAML file. YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types. Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.
On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).
* The YAML stream is stored in the LLVM context.
* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".
* As before hotness is computed in the analysis pass instead of
DiganosticInfo. This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.
[1] https://reviews.llvm.org/D19678#419445
Differential Revision: https://reviews.llvm.org/D24587
llvm-svn: 282539
This allows various presentation of this data using an external tool.
This was first recommended here[1].
As an example, consider this module:
1 int foo();
2 int bar();
3
4 int baz() {
5 return foo() + bar();
6 }
The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):
remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
--- !Missed
Pass: inline
Name: NotInlined
DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 }
Function: baz
Hotness: 30
Args:
- Callee: foo
- String: will not be inlined into
- Caller: baz
...
--- !Missed
Pass: inline
Name: NotInlined
DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 }
Function: baz
Hotness: 30
Args:
- Callee: bar
- String: will not be inlined into
- Caller: baz
...
This is a summary of the high-level decisions:
* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:
ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
DEBUG_TYPE, "NotInlined", &I)
<< NV("Callee", Callee) << " will not be inlined into "
<< NV("Caller", CS.getCaller()) << setIsVerbose());
NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.
Subsequent patches will update ORE users to use the new streaming API.
* I am using YAML I/O for writing the YAML file. YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types. Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.
On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).
* The YAML stream is stored in the LLVM context.
* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".
* As before hotness is computed in the analysis pass instead of
DiganosticInfo. This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.
[1] https://reviews.llvm.org/D19678#419445
Differential Revision: https://reviews.llvm.org/D24587
llvm-svn: 282499
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.
This is a recommit of r281806 after fixing the accessor to return
a pointer instead of a reference and updating all the call-sites.
llvm-svn: 281813
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.
llvm-svn: 281806
Previous we were issuing an error when linking a module containing
the new Objective-C metadata structure for class properties with an
"old" one.
Now instead we downgrade the module flag so that the Objective-C
runtime does not expect the new metadata structure.
This is consistent with what ld64 is doing on binary files.
Differential Revision: https://reviews.llvm.org/D24620
llvm-svn: 281685
If TBAA is on an intrinsic and it gets upgraded, it'll delete the call
instruction that we collected in a vector. Even if we were to use
WeakVH, it'll drop the TBAA and we'll hit the assert on the upgrade
path.
r263673 gave a shot to make sure the TBAA upgrade happens before
intrinsics upgrade, but failed to account for all cases.
Instead of collecting instructions in a vector, this patch makes it
just upgrade the TBAA on the fly, because metadata are always
already loaded at this point.
Differential Revision: https://reviews.llvm.org/D24533
llvm-svn: 281549
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.
Fixes PR30362. A program for upgrading test cases is attached to that
bug.
Differential Revision: http://reviews.llvm.org/D20147
llvm-svn: 281284
Summary:
While woring on mapping attributes in the C API, it clearly appeared that the recent changes in the API on the C++ side left Function and Call/Invoke with an attribute API that grew in an ad hoc manner. This makes it difficult to work with it, because one doesn't know which overloads exists and which do not.
Make sure that getter/setter function exists for both enum and string version. Remove inconsistent getter/setter, unless they have many callsites.
This should make it easier to work with attributes in the future.
This doesn't change how attribute works.
Reviewers: bkramer, whitequark, mehdi_amini, void
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21514
llvm-svn: 281019
Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes:
Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4
Flags are now strongly typed
Patch by: Victor Leschuk <vleschuk@gmail.com>
Differential Revision: https://reviews.llvm.org/D23766
llvm-svn: 280700
Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes:
* Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4
* Flags are now strongly typed
Patch by: Victor Leschuk <vleschuk@gmail.com>
Differential Revision: https://reviews.llvm.org/D23766
llvm-svn: 280686
Delete the dead code for Write(ilist_iterator) in the IR Verifier,
inline report(ilist_iterator) at its call sites in the MachineVerifier,
and use simple_ilist<>::iterator in SymbolTableListTraits.
The only remaining reference to ilist_iterator outside of the ilist
implementation is from MachineInstrBundleIterator. I'll get rid of that
in a follow-up.
llvm-svn: 280565
If an attribute name has special characters such as '\01', it is not
properly printed in LLVM assembly language format. Since the format
expects the special characters are printed as it is, it has to contain
escape characters to make it printable.
Before:
attributes #0 = { ... "counting-function"="^A__gnu_mcount_nc" ...
After:
attributes #0 = { ... "counting-function"="\01__gnu_mcount_nc" ...
Reviewers: hfinkel, rengolin, rjmccall, compnerd
Subscribers: nemanjai, mcrosier, hans, shenhan, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D23792
llvm-svn: 280357
Guarantee that ilist_traits<T>::transferNodesFromList is only called
when nodes are actually changing lists.
I also moved all the callbacks to occur *first*, before the operation.
This is the only choice for iplist<T>::merge, so we might as well be
consistent. I expect this to have no effect in practice, although it
simplifies the logic in both iplist<T>::transfer and iplist<T>::insert.
llvm-svn: 280122
Assuming the default FP env, we should not treat fdiv and frem any differently in terms of
trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env.
This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in
the backend after:
https://reviews.llvm.org/rL279970
llvm-svn: 279973
Summary:
[Coroutines] Part 9: Add cleanup subfunction.
This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex3.ll)
Intrinsic Changes:
* coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine.
* coro.id gets an extra parameter that points back to a coroutine function. This allows to check whether a coro.id describes the enclosing function or it belongs to a different function that was later inlined.
CoroSplit now creates three subfunctions:
# f$resume - resume logic
# f$destroy - cleanup logic, followed by a deallocation code
# f$cleanup - just the cleanup code
CoroElide pass during devirtualization replaces coro.destroy with either f$destroy or f$cleanup depending whether heap elision is performed or not.
Other fixes, improvements:
* Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point.
* Switched to using variable width suspend index field (no longer limited to 32 bit index field can be as little as i1 or as large as i<whatever-size_t-is>)
Reviewers: majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23844
llvm-svn: 279971
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.
llvm-svn: 279696
In cases where .dwo/.dwp files are guaranteed to be available, skipping
the extra online (in the .o file) inline info can save a substantial
amount of space - see the original r221306 for more details there.
llvm-svn: 279650
Philip commented on r279113 to ask for better comments as to
when to use the different versions of getName. Its also possible
to assert in the simple case that we aren't an overloaded intrinsic
as those have to use the more capable version of getName.
Thanks for the comments Philip.
llvm-svn: 279466
unit for use in the PreservedAnalyses set.
This doesn't have any important functional change yet but it cleans
things up and makes the analysis substantially more efficient by
avoiding querying through the type erasure for every analysis.
I also think it makes it much easier to reason about how analyses are
preserved when walking across pass managers and across IR unit
abstractions.
Thanks to Sean and Mehdi both for the comments and suggestions.
Differential Revision: https://reviews.llvm.org/D23691
llvm-svn: 279360
When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m
come from std::string allocations in Intrinsic::getName(). Turns out this method only
returns a std::string because it needs to handle overloads, but that is not the common case.
This adds an overload of getName which just returns a StringRef when there are no overloads
and so saves on the allocations.
llvm-svn: 279113
Summary:
Looking at the implementation, GenericDomTree has more specific
requirements on NodeRef, e.g. NodeRefObject->getParent() should compile,
and NodeRef should be a pointer. We can remove the pointer requirement,
but it seems to have little gain, given the limited use cases.
Also changed GraphTraits<Inverse<Inverse<T>> to be more accurate.
Reviewers: dblaikie, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23593
llvm-svn: 278961
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.
llvm-svn: 278902
IndVarSimplify::sinkUnusedInvariants calls
BasicBlock::getFirstInsertionPt on the ExitBlock and moves instructions
before it. This can return end(), so it's not safe to dereference. Add
an iterator-based overload to Instruction::moveBefore to avoid the UB.
llvm-svn: 278886
It is pretty easy to get it down to O(nlogn + mlogm). This
implementation has the added benefit of automatically deduplicating
entries between the two sets.
llvm-svn: 278837
I have audited all the callers of concatenate and none require duplicate
entries to service concatenation.
These duplicates serve no purpose but to needlessly embiggen the IR.
N.B. Layering getMostGenericAliasScope on top of concatenate makes it
O(nlogn + mlogm) instead of O(n*m).
llvm-svn: 278836
Summary:
1. Make coroutine representation more robust against optimization that may duplicate instruction by introducing coro.id intrinsics that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as reliable indicator of coroutine identify when a particular coroutine call gets inlined.
2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin will get duplicated.
3. doc + test + code updated to support the new intrinsic.
Reviewers: mehdi_amini, majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D23412
llvm-svn: 278481
End iterators are usually sentinels, not actually Instruction* at all.
Stop casting to it just to get an iterator back.
There is likely no observable functionality change here right now
(although this is relying on UB, I doubt it was triggering anything),
but I'll be removing the cast soon.
llvm-svn: 278346
a sufficiently low alignment for the IR load created.
There is no test case because we don't have any test cases for the *IR*
produced by the autoupgrade, only the x86 assembly, and it happens that
the x86 assembly for this intrinsic as it is tested in the autoupgrade
path just happens to not produce a separate load instruction where we
might have observed the alignment.
I'm going to follow up on the original commit to suggest getting
IR-level testing in addition to the asm level testing here so that we
can see and test these kinds of issues. We might never get an x86
instruction out with an alignment constraint, but we could stil
miscompile code by folding against the alignment marked on (or inferred
for in this case) the load.
llvm-svn: 278203
I needed a reader-writer lock for a downstream project and noticed that
llvm has one. Function.cpp is the only file in-tree that refers to it.
To anyone reading this: are you using RWMutex in out-of-tree code? Maybe
it's not worth keeping around any more...
Since we're not actually using RWMutex *here*, remove the #include (and
a few other stale headers while we're at it).
llvm-svn: 278178
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278078
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278077
Summary:
This is the 4c patch of the coroutine series. CoroElide pass now checks if PostSplit coro.begin
is referenced by coro.subfn.addr intrinsics. If so replace coro.subfn.addrs with an appropriate coroutine
subfunction associated with that coro.begin.
Documentation and overview is here: http://llvm.org/docs/Coroutines.html.
Upstreaming sequence (rough plan)
1.Add documentation. (https://reviews.llvm.org/D22603)
2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659)
3.Add empty coroutine passes. (https://reviews.llvm.org/D22847)
4.Add coroutine devirtualization + tests.
ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998)
c) Do devirtualization <= we are here
5.Add CGSCC restart trigger + tests.
6.Add coroutine heap elision + tests.
7.Add the rest of the logic (split into more patches)
Reviewers: majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D23229
llvm-svn: 277908
Summary:
If a profile has no samples for a function, then the function "entry count" is set to the value 0. Several places in the code test that if the Function::getEntryCount is defined at all. Here we change to treat a 0 entry count the same as undefined.
In particular, this fixes a problem in getLayoutSuccessorProbThreshold in MachineBlockPlacement.cpp where we use a different and inferior heuristic for laying out basic blocks.
Reviewers: danielcdh, dnovillo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23082
llvm-svn: 277849
This is the forth patch in the coroutine series. CoroEaly pass now lowers coro.resume
and coro.destroy intrinsics by replacing them with an indirect call to an address
returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes
devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate
function address.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22998
llvm-svn: 277765
This is a fix for PR28697.
An MDNode can indirectly refer to a GlobalValue, through a
ConstantAsMetadata. When the GlobalValue is deleted, the MDNode operand
is reset to `nullptr`. If the node is uniqued, this can lead to a
hard-to-detect cache invalidation in a Metadata map that's shared across
an LLVMContext.
Consider:
1. A map from Metadata* to `T` called RemappedMDs.
2. A node that references a global variable, `!{i1* @GV}`.
3. Insert `!{i1* @GV} -> SomeT` in the map.
4. Delete `@GV`, leaving behind `!{null} -> SomeT`.
Looking up the generic and uninteresting `!{null}` gives you `SomeT`,
which is likely related to `@GV`. Worse, `SomeT`'s lifetime may be tied
to the deleted `@GV`.
This occurs in practice in the shared ValueMap used since r266579 in the
IRMover. Other code that handles more than one Module (with different
lifetimes) in the same LLVMContext could hit it too.
The fix here is a partial revert of r225223: in the rare case that an
MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the
Value hierarchy), drop uniquing if it gets replaced with `nullptr`.
This changes step #4 above to leave behind `distinct !{null} -> SomeT`,
which can't be confused with the generic `!{null}`.
In theory, this can cause some churn in the LLVMContext's MDNode
uniquing map when Values are being deleted. However:
- The number of GlobalValues referenced from uniqued MDNodes is
expected to be quite small. E.g., the debug info metadata schema
only references GlobalValues from distinct nodes.
- Other Constants have the lifetime of the LLVMContext, whose teardown
is careful to drop references before deleting the constants.
As a result, I don't expect a compile time regression from this change.
llvm-svn: 277625
Summary:
This commit changes the Verifier class to accept a Module via the
constructor to make it obvious that a specific instance of the class is
only intended to work with a specific module. The `updateModule` setter
(despite being private) was making this fact less transparent.
There are fields in the `Verifier` class like `DeoptimizeDeclarations`
and `GlobalValueVisited` which are module specific, so a given
Verifier instance will not in fact work across multiple modules today.
This change just makes that more obvious.
The motivation is to make it easy to get to the datalayout of the
module unambiguously. That is required to verify that `inttoptr` and
`ptrtoint` constant expressions are well typed in the face of
non-integral pointer types.
Reviewers: dexonsmith, bkramer, majnemer, chandlerc
Subscribers: mehdi_amini, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D23040
llvm-svn: 277409
This broke some out-of-tree AMDGPU tests that relied on the old behavior
wherein isIntrinsic() would return true for any function that starts
with "llvm.". And in general that change will not play nicely with
out-of-tree backends.
llvm-svn: 277087
Summary:
This change adds a `ni` specifier in the `datalayout` string to denote
pointers in some given address spaces as "non-integral", and adds some
typing rules around these special pointers.
Reviewers: majnemer, chandlerc, atrick, dberlin, eli.friedman, tstellarAMD, arsenm
Subscribers: arsenm, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D22488
llvm-svn: 277085
Summary:
getName() involves a hashtable lookup, so is expensive given how
frequently isIntrinsic() is called. (In particular, many users cast to
IntrinsicInstr or one of its subclasses before calling
getIntrinsicID().)
This has an incidental functional change: Before, isIntrinsic() would
return true for any function whose name started with "llvm.", even if it
wasn't properly an intrinsic. The new behavior seems more correct to
me, because it's strange to say that isIntrinsic() is true, but
getIntrinsicId() returns "not an intrinsic".
Some callers want the old behavior -- they want to know whether the
caller is a recognized intrinsic, or might be one in some other version
of LLVM. For them, we added Function::hasLLVMReservedName(), which
checks whether the name starts with "llvm.".
This change is good for a 1.5% e2e speedup compiling a large Eigen
benchmark.
Reviewers: bogner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22065
llvm-svn: 276942
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address
space.
With this change, these intrinsics are overloaded for any adddress space
for memory objects
and we can use these llvm invariant intrinsics in non-default address
spaces.
Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)
This overloaded intrinsic is needed for representing final or invariant
memory in managed languages.
Reviewers: apilipenko, reames
Subscribers: llvm-commits
llvm-svn: 276447
As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.
This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly).
Differential Revision: https://reviews.llvm.org/D22460
llvm-svn: 276416
Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.
With this change, these intrinsics are overloaded for any adddress space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.
Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)
This overloaded intrinsic is needed for representing final or invariant memory in managed languages.
Reviewers: tstellarAMD, reames, apilipenko
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22519
llvm-svn: 276316
As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.
This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
Differential Revision: https://reviews.llvm.org/D22460
llvm-svn: 276281
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
A companion clang patch is at D22105
Differential Revision: https://reviews.llvm.org/D22106
llvm-svn: 275981
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution. My goal is to shake out the design issues before scaling
it up to other types and passes.
Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count. It's only printed in opt
currently since clang prints the diagnostic fields directly. E.g.:
remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)
A new API added is similar to emitOptimizationRemarkMissed. The
difference is that it additionally takes a code region that the
diagnostic corresponds to. From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.)
This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context. If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.
A new command-line option is added to turn this on in opt.
My plan is to switch all user of emitOptimizationRemark* to use this
module instead.
Reviewers: hfinkel
Subscribers: rcox2, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D21771
llvm-svn: 275583
This splits out the intrinsic table such that generic intrinsics come
first and target specific intrinsics are grouped by target. From here
we can find out which target an intrinsic is for or differentiate
between generic and target intrinsics.
The motivation here is to make it easier to move target specific
intrinsic handling out of generic code.
llvm-svn: 275575
The many levels of nesting inside the responsible code made it easy for
bugs to sneak in. Flattening the logic makes it easier to see what's
going on.
llvm-svn: 275244
Summary: http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible to have branch weight to have only one elements. Also fix the assertion failure in inliner when checking the instruction type to include "invoke" instruction.
Reviewers: mkuper, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22228
llvm-svn: 275079
Summary:
For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch.
E.g.
if (A1 && A2 && A3 && ..... && A10) {
for (i=0; i < 100000000; i++) {
callsite();
}
}
Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value.
In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness.
Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR.
Reviewers: davidxl, eraman, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22118
llvm-svn: 275073
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look
through returned-argument functions when answering queries. This is essential
so that we don't loose all other AA information when supplementing with
llvm.noalias.
Differential Revision: http://reviews.llvm.org/D9383
llvm-svn: 275035
In order to make the optimizer smarter about using the 'returned' argument
attribute (generally, but motivated by my llvm.noalias intrinsic work), add a
utility function to Call/InvokeInst, and CallSite, to make it easy to get the
returned call argument (when one exists).
P.S. There is already an unfortunate amount of code duplication between
CallInst and InvokeInst, and this adds to it. We should probably clean that up
separately.
Differential Revision: http://reviews.llvm.org/D22204
llvm-svn: 275031
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.
Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.
Reviewers: reames, joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18714
llvm-svn: 274485
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.
This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).
Reviewers: aprantl, dexonsmith
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D21614
llvm-svn: 274325
CodeView need to know the offset of the storage allocation for a
bitfield. Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.
This fixes PR28162.
Differential Revision: http://reviews.llvm.org/D21782
llvm-svn: 274200
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.
There is also a corresponding clang patch.
llvm-svn: 274100
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 274043
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
llvm-svn: 273892
There are two separate issues:
- LLVM doesn't consider infinite loops to be side effects: we happily
hoist/sink above/below loops whose bounds are unknown.
- The absence of the noreturn attribute is insufficient for us to know
if a function will definitely return. Relying on noreturn in the
middle-end for any property is an accident waiting to happen.
llvm-svn: 273762
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.
This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.
Differential Revision: http://reviews.llvm.org/D21121
llvm-svn: 273756
The bitset metadata currently used in LLVM has a few problems:
1. It has the wrong name. The name "bitset" refers to an implementation
detail of one use of the metadata (i.e. its original use case, CFI).
This makes it harder to understand, as the name makes no sense in the
context of virtual call optimization.
2. It is represented using a global named metadata node, rather than
being directly associated with a global. This makes it harder to
manipulate the metadata when rebuilding global variables, summarise it
as part of ThinLTO and drop unused metadata when associated globals are
dropped. For this reason, CFI does not currently work correctly when
both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
globals, and fails to associate metadata with the rebuilt globals. As I
understand it, the same problem could also affect ASan, which rebuilds
globals with a red zone.
This patch solves both of those problems in the following way:
1. Rename the metadata to "type metadata". This new name reflects how
the metadata is currently being used (i.e. to represent type information
for CFI and vtable opt). The new name is reflected in the name for the
associated intrinsic (llvm.type.test) and pass (LowerTypeTests).
2. Attach metadata directly to the globals that it pertains to, rather
than using the "llvm.bitsets" global metadata node as we are doing now.
This is done using the newly introduced capability to attach
metadata to global variables (r271348 and r271358).
See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
Differential Revision: http://reviews.llvm.org/D21053
llvm-svn: 273729
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.
Also start using it in a few places where it is obviously the right thing
to use.
Differential Revision: http://reviews.llvm.org/D21580
llvm-svn: 273470
Move Verifier::verifyIntrinsicType to Intrinsics::matchIntrinsicsType. Will be used to accumulate overloaded types of a given intrinsic by the upcoming patch to fix intrinsics names when overloaded types are renamed.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D19372
llvm-svn: 273424
This change is motivated by an upcoming change to the metadata representation
used for CFI. The indirect function call checker needs type information for
external function declarations in order to correctly generate jump table
entries for such declarations. We currently associate such type information
with declarations using a global metadata node, but I plan [1] to move all
such metadata to global object attachments.
In bitcode, metadata attachments for function declarations appear in the
global metadata block. This seems reasonable to me because I expect metadata
attachments on declarations to be uncommon. In the long term I'd also expect
this to be the case for CFI, because we'd want to use some specialized bitcode
format for this metadata that could be read as part of the ThinLTO thin-link
phase, which would mean that it would not appear in the global metadata block.
To solve the lazy loaded metadata issue I was seeing with D20147, I use the
same bitcode representation for metadata attachments for global variables as I
do for function declarations. Since there's a use case for metadata attachments
in the global metadata block, we might as well use that representation for
global variables as well, at least until we have a mechanism for lazy loading
global variables.
In the assembly format, the metadata attachments appear after the "declare"
keyword in order to avoid a parsing ambiguity.
[1] http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
Differential Revision: http://reviews.llvm.org/D21052
llvm-svn: 273336
Summary:
This seems like the least intrusive way to pass this information
through.
Fixes PR28151
Reviewers: majnemer, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21444
llvm-svn: 273053
We convert `Default` to `NotPIC` so that target independent code
can reason about this correctly.
Differential Revision: http://reviews.llvm.org/D21394
llvm-svn: 273024
pass manager passes' `run` methods.
This removes a bunch of SFINAE goop from the pass manager and just
requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead
argument. This is a small price to pay for the simplicity of the system
as a whole, despite the noise that changing it causes at this stage.
This will also helpfull allow us to make the signature of the run
methods much more flexible for different kinds af passes to support
things like intelligently updating the pass's progression over IR units.
While this touches many, many, files, the changes are really boring.
Mostly made with the help of my trusty perl one liners.
Thanks to Sean and Hal for bouncing ideas for this with me in IRC.
llvm-svn: 272978
Summary: This reverts the changes to Globals.cpp and IRMover.cpp in
"[IR] Copy comdats in GlobalObject::copyAttributesFrom" (D20631,
rL270743).
The DeadArgElim test is left unchanged, and we change DAE to explicitly
copy comdats.
The reverted change breaks copyAttributesFrom when the destination lives
in a different module from the source. The decision in D21255 was to
revert this patch and handle comdat copying separately from
copyAttributesFrom.
Reviewers: majnemer, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21403
llvm-svn: 272855
Summary: As per title. This completes the C API Attribute support.
Reviewers: Wallbraker, whitequark, echristo, rafael, jyknight
Subscribers: mehdi_amini
Differential Revision: http://reviews.llvm.org/D21365
llvm-svn: 272811
Summary: The current naming not only doesn't convey the meaning of what this does, but worse, it convey the wrong meaning. This was a major source of confusion understanding the code, so I'm applying the boy scout rule here and making it better after I leave.
Reviewers: void, bkramer, whitequark
Differential Revision: http://reviews.llvm.org/D21264
llvm-svn: 272725
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709