This is useful when need to defer the construction,
e.g. using Regex as a member of class.
Differential revision: https://reviews.llvm.org/D24101
llvm-svn: 280339
Many lists want to override only allocation semantics, or callbacks for
iplist. Split these up to prevent code duplication.
- Specialize ilist_alloc_traits to change the implementations of
deleteNode() and createNode().
- One common desire is to do nothing deleteNode() and disable
createNode(). Specialize ilist_alloc_traits to inherit from
ilist_noalloc_traits for that behaviour.
- Specialize ilist_callback_traits to use the addNodeToList(),
removeNodeFromList(), and transferNodesFromList() callbacks.
As a drive-by, add some coverage to the callback-related unit tests.
llvm-svn: 280128
Split out a new, low-level intrusive list type with clear semantics.
Unlike iplist (and ilist), all operations on simple_ilist are intrusive,
and simple_ilist never takes ownership of its nodes. This enables an
intuitive API that has the right defaults for intrusive lists.
- insert() takes references (not pointers!) to nodes (in iplist/ilist,
passing a reference will cause the node to be copied).
- erase() takes only iterators (like std::list), and does not destroy
the nodes.
- remove() takes only references and has the same behaviour as erase().
- clear() does not destroy the nodes.
- The destructor does not destroy the nodes.
- New API {erase,remove,clear}AndDispose() take an extra Disposer
functor for callsites that want to call some disposal routine (e.g.,
std::default_delete).
This list is not currently configurable, and has no callbacks.
The initial motivation was to fix iplist<>::sort to work correctly (even
with callbacks in ilist_traits<>). iplist<> uses simple_ilist<>::sort
directly. The new test in unittests/IR/ModuleTest.cpp crashes without
this commit.
Fixing sort() via a low-level layer provided a good opportunity to:
- Unit test the low-level functionality thoroughly.
- Modernize the API, largely inspired by other intrusive list
implementations.
Here's a sketch of a longer-term plan:
- Create BumpPtrList<>, a non-intrusive list implemented using
simple_ilist<>, and use it for the Token list in
lib/Support/YAMLParser.cpp. This will factor out the only real use of
createNode().
- Evolve the iplist<> and ilist<> APIs in the direction of
simple_ilist<>, making allocation/deallocation explicit at call sites
(similar to simple_ilist<>::eraseAndDispose()).
- Factor out remaining calls to createNode() and deleteNode() and remove
the customization from ilist_traits<>.
- Transition uses of iplist<>/ilist<> that don't need callbacks over to
simple_ilist<>.
llvm-svn: 280107
This reverts commit r280016, and the followups of r280017, r280027,
r280051, r280058, and r280059.
MSVC's implementation of std::promise does not get along with
llvm::Error. It uses its promised value too much like a normal value
type.
llvm-svn: 280100
behaviors, and add a callB (blacking call) primitive.
callB is a blocking call primitive for threaded code where the RPC responses are
being processed on a separate thread. (For single threaded code callST should
continue to be used instead).
No unit test yet: Last time I commited a threaded unit test it deadlocked on
one of the s390x builders. I'll try to re-enable that test first, and add a new
test if I can sort out the deadlock issue.
llvm-svn: 280051
I'm working on a lower-level intrusive list that can be used
stand-alone, and splitting the files up a bit will make the code easier
to organize. Explode the ilist headers in advance to improve blame
lists in the future.
- Move ilist_node_base from ilist_node.h to ilist_node_base.h.
- Move ilist_base from ilist.h to ilist_base.h.
- Move ilist_iterator from ilist.h to ilist_iterator.h.
- Move ilist_node_access from ilist.h to ilist_node.h to support
ilist_iterator.
- Update unit tests to #include smaller headers.
- Clang-format the moved things.
I noticed in transit that there is a simplify_type specialization for
ilist_iterator. Since there is no longer an implicit conversion from
ilist<T>::iterator to T*, this doesn't make sense (effectively it's a
form of implicit conversion). For now I've added a FIXME.
llvm-svn: 280047
And rename the tests inside from ilistTest to IListTest. This makes the
file sort properly in the CMakeLists.txt (previously, sorting would
throw it down to the end of the list) and is consistent with the tests
I've added more recently.
Why use IListNodeBaseTest.cpp (and a test name of IListNodeBaseTest)?
- ilist_node_base_test is the obvious thing, since this is testing
ilist_node_base. However, gtest disallows underscores in test names.
- ilist_node_baseTest fails for the same reason.
- ilistNodeBaseTest is weird, because it isn't in our usual
TitleCaseTest form that we use for tests, and it also doesn't have the
name of the tested class in it.
- IlistNodeBaseTest matches TitleCaseTest, but "Ilist" is hard to read,
and really "ilist" is an abbreviation for "IntrusiveList" so the
lowercase "list" is strange.
- That left IListNodeBaseTest.
Note: I made this move in two stages, with a temporary filename of
ilistTestTemp in between in r279524. This was in the hopes of avoiding
problems on Git and SVN clients on case-insensitive filesystems,
particularly on buildbots with incremental checkouts.
llvm-svn: 280033
Reverse iterators to doubly-linked lists can be simpler (and cheaper)
than std::reverse_iterator. Make it so.
In particular, change ilist<T>::reverse_iterator so that it is *never*
invalidated unless the node it references is deleted. This matches the
guarantees of ilist<T>::iterator.
(Note: MachineBasicBlock::iterator is *not* an ilist iterator, but a
MachineInstrBundleIterator<MachineInstr>. This commit does not change
MachineBasicBlock::reverse_iterator, but it does update
MachineBasicBlock::reverse_instr_iterator. See note at end of commit
message for details on bundle iterators.)
Given the list (with the Sentinel showing twice for simplicity):
[Sentinel] <-> A <-> B <-> [Sentinel]
the following is now true:
1. begin() represents A.
2. begin() holds the pointer for A.
3. end() represents [Sentinel].
4. end() holds the poitner for [Sentinel].
5. rbegin() represents B.
6. rbegin() holds the pointer for B.
7. rend() represents [Sentinel].
8. rend() holds the pointer for [Sentinel].
The changes are #6 and #8. Here are some properties from the old
scheme (which used std::reverse_iterator):
- rbegin() held the pointer for [Sentinel] and rend() held the pointer
for A;
- operator*() cost two dereferences instead of one;
- converting from a valid iterator to its valid reverse_iterator
involved a confusing increment; and
- "RI++->erase()" left RI invalid. The unintuitive replacement was
"RI->erase(), RE = end()".
With vector-like data structures these properties are hard to avoid
(since past-the-beginning is not a valid pointer), and don't impose a
real cost (since there's still only one dereference, and all iterators
are invalidated on erase). But with lists, this was a poor design.
Specifically, the following code (which obviously works with normal
iterators) now works with ilist::reverse_iterator as well:
for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;)
fooThatMightRemoveArgFromList(*RI++);
Converting between iterator and reverse_iterator for the same node uses
the getReverse() function.
reverse_iterator iterator::getReverse();
iterator reverse_iterator::getReverse();
Why doesn't iterator <=> reverse_iterator conversion use constructors?
In order to catch and update old code, reverse_iterator does not even
have an explicit conversion from iterator. It wouldn't be safe because
there would be no reasonable way to catch all the bugs from the changed
semantic (see the changes at call sites that are part of this patch).
Old code used this API:
std::reverse_iterator::reverse_iterator(iterator);
iterator std::reverse_iterator::base();
Here's how to update from old code to new (that incorporates the
semantic change), assuming I is an ilist<>::iterator and RI is an
ilist<>::reverse_iterator:
[Old] ==> [New]
reverse_iterator(I) (--I).getReverse()
reverse_iterator(I) ++I.getReverse()
--reverse_iterator(I) I.getReverse()
reverse_iterator(++I) I.getReverse()
RI.base() (--RI).getReverse()
RI.base() ++RI.getReverse()
--RI.base() RI.getReverse()
(++RI).base() RI.getReverse()
delete &*RI, RE = end() delete &*RI++
RI->erase(), RE = end() RI++->erase()
=======================================
Note: bundle iterators are out of scope
=======================================
MachineBasicBlock::iterator, also known as
MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent
MachineInstr bundles. The idea is that each operator++ takes you to the
beginning of the next bundle. Implementing a sane reverse iterator for
this is harder than ilist. Here are the options:
- Use std::reverse_iterator<MBB::i>. Store a handle to the beginning of
the next bundle. A call to operator*() runs a loop (usually
operator--() will be called 1 time, for unbundled instructions).
Increment/decrement just works. This is the status quo.
- Store a handle to the final node in the bundle. A call to operator*()
still runs a loop, but it iterates one time fewer (usually
operator--() will be called 0 times, for unbundled instructions).
Increment/decrement just works.
- Make the ilist_sentinel<MachineInstr> *always* store that it's the
sentinel (instead of just in asserts mode). Then the bundle iterator
can sniff the sentinel bit in operator++().
I initially tried implementing the end() option as part of this commit,
but updating iterator/reverse_iterator conversion call sites was
error-prone. I have a WIP series of patches that implements the final
option.
llvm-svn: 280032
Optional.
For void functions the return type of a nonblocking call changes from
Expected<future<Optional<bool>>> to Expected<future<Error>>, and for functions
returning T the return type changes from Expected<future<Optional<T>>> to
Expected<future<Expected<T>>>.
Inner results need to be checked (since the RPC connection may have dropped
out before a result came back) and Error/Expected provide stronger checking
requirements. It also allows us drop the crufty 'optionalToError' function and
just collapse Errors in the single-threaded call primitives.
llvm-svn: 280016
Instead of putting all possible requests into a single table, we can perform
the extremely dense lookup based on opcode and type-index in constant time
using multi-dimensional array-like things.
This roughly halves the time spent doing legalization, which was dominated by
queries against the Actions table.
llvm-svn: 280011
Summary: No functional changes, just refactoring to make D23947 simpler.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23954
llvm-svn: 279982
switch to using one indirect stub manager per logical dylib rather than one per
input module.
LogicalDylib is a helper class used by the CompileOnDemandLayer to manage
symbol resolution between modules during lazy compilation. In particular, it
ensures that internal symbols resolve correctly even in the case where multiple
input modules contain the same internal symbol name (which must to be promoted
to external hidden linkage so that functions in any given module can be split
out by lazy compilation). LogicalDylib's resolution scheme (before this commit)
required one stub-manager per input module. This made recompilation of functions
(by adding a module containing a new definition) difficult, as the stub manager
for any given symbol was bound to the module that supplied the original
definition. By using one stubs manager for the whole logical dylib symbols can
be more easily replaced, although support for doing this is not included in this
patch (it will be implemented in a follow up).
llvm-svn: 279952
The InitializerList test had undefined behavior by creating a dangling pointer to the temporary initializer list. This patch removes the undefined behavior in the test by creating the initializer list directly.
Reviewers: mehdi_amini, dblaikie
Differential Revision: https://reviews.llvm.org/D23890
llvm-svn: 279783
In cases where .dwo/.dwp files are guaranteed to be available, skipping
the extra online (in the .o file) inline info can save a substantial
amount of space - see the original r221306 for more details there.
llvm-svn: 279650
manager, including both plumbing and logic to handle function pass
updates.
There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.
I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.
The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.
I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.
The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:
- We operate at three levels within the infrastructure: RefSCC, SCC, and
Node. In each case, we are working bottom up and so we want to
continue to iterate on the "lowest" node as the graph changes. Look at
how we iterate over nodes in an SCC running function passes as those
function passes mutate the CG. We continue to iterate on the "lowest"
SCC, which is the one that continues to contain the function just
processed.
- The call graph structure re-uses SCCs (and RefSCCs) during mutation
events for the *highest* entry in the resulting new subgraph, not the
lowest. This means that it is necessary to continually update the
current SCC or RefSCC as it shifts. This is really surprising and
subtle, and took a long time for me to work out. I actually tried
changing the call graph to provide the opposite behavior, and it
breaks *EVERYTHING*. The graph update algorithms are really deeply
tied to this particualr pattern.
- When SCCs or RefSCCs are split apart and refined and we continually
re-pin our processing to the bottom one in the subgraph, we need to
enqueue the newly formed SCCs and RefSCCs for subsequent processing.
Queuing them presents a few challenges:
1) SCCs and RefSCCs use wildly different iteration strategies at
a high level. We end up needing to converge them on worklist
approaches that can be extended in order to be able to handle the
mutations.
2) The order of the enqueuing need to remain bottom-up post-order so
that we don't get surprising order of visitation for things like
the inliner.
3) We need the worklists to have set semantics so we don't duplicate
things endlessly. We don't need a *persistent* set though because
we always keep processing the bottom node!!!! This is super, super
surprising to me and took a long time to convince myself this is
correct, but I'm pretty sure it is... Once we sink down to the
bottom node, we can't re-split out the same node in any way, and
the postorder of the current queue is fixed and unchanging.
4) We need to make sure that the "current" SCC or RefSCC actually gets
enqueued here such that we re-visit it because we continue
processing a *new*, *bottom* SCC/RefSCC.
- We also need the ability to *skip* SCCs and RefSCCs that get merged
into a larger component. We even need the ability to skip *nodes* from
an SCC that are no longer part of that SCC.
This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.
We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.
Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:
- It is really nice to do this a function at a time because that
function is likely hot in the cache. This means we want even the
function pass adaptor to support online updates to the call graph!
- To update the call graph after arbitrary function pass mutations is
quite hard. We have to build a fairly comprehensive set of
data structures and then process them. Fortunately, some of this code
is related to the code for building the cal graph in the first place.
Unfortunately, very little of it makes any sense to share because the
nature of what we're doing is so very different. I've factored out the
one part that made sense at least.
- We need to transfer these updates into the various structures for the
CGSCC pass manager. Once those were more sanely worked out, this
became relatively easier. But some of those needs necessitated changes
to the LazyCallGraph interface to make it significantly easier to
extract the changed SCCs from an update operation.
- We also need to update the CGSCC analysis manager as the shape of the
graph changes. When an SCC is merged away we need to clear analyses
associated with it from the analysis manager which we didn't have
support for in the analysis manager infrsatructure. New SCCs are easy!
But then we have the case that the original SCC has its shape changed
but remains in the call graph. There we need to *invalidate* the
analyses associated with it.
- We also need to invalidate analyses after we *finish* processing an
SCC. But the analyses we need to invalidate here are *only those for
the newly updated SCC*!!! Because we only continue processing the
bottom SCC, if we split SCCs apart the original one gets invalidated
once when its shape changes and is not processed farther so its
analyses will be correct. It is the bottom SCC which continues being
processed and needs to have the "normal" invalidation done based on
the preserved analyses set.
All of this is mostly background and context for the changes here.
Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.
Differential Revision: http://reviews.llvm.org/D21464
llvm-svn: 279618
Re-apply this patch, hopefully I will get away without any warnings
in the constructor now.
This patch removes the MachineFunctionAnalysis. Instead we keep a
map from IR Function to MachineFunction in the MachineModuleInfo.
This allows the insertion of ModulePasses into the codegen pipeline
without breaking it because the MachineFunctionAnalysis gets dropped
before a module pass.
Peak memory should stay unchanged without a ModulePass in the codegen
pipeline: Previously the MachineFunction was freed at the end of a codegen
function pipeline because the MachineFunctionAnalysis was dropped; With
this patch the MachineFunction is freed after the AsmPrinter has
finished.
Differential Revision: http://reviews.llvm.org/D23736
llvm-svn: 279602
Change this pass constructor to just accept a const TargetMachine * and
use INITIALIZE_TM_PASS, that way we can get rid of the dummy
constructor. The pass will still fail when calling the default
constructor leading to TM == nullptr, this is no different than before
but is more in line what other codegen passes are doing and avoids the
dummy constructor.
llvm-svn: 279598
Re-apply this commit with the deletion of a MachineFunction delegated to
a separate pass to avoid use after free when doing this directly in
AsmPrinter.
This patch removes the MachineFunctionAnalysis. Instead we keep a
map from IR Function to MachineFunction in the MachineModuleInfo.
This allows the insertion of ModulePasses into the codegen pipeline
without breaking it because the MachineFunctionAnalysis gets dropped
before a module pass.
Peak memory should stay unchanged without a ModulePass in the codegen
pipeline: Previously the MachineFunction was freed at the end of a codegen
function pipeline because the MachineFunctionAnalysis was dropped; With
this patch the MachineFunction is freed after the AsmPrinter has
finished.
Differential Revision: http://reviews.llvm.org/D23736
llvm-svn: 279564
Instructions like G_ICMP have multiple types that may need to be legalized (the
boolean output and nearly arbitrary inputs in this case). So the legalizer must
be capable of deciding what to do for each of them separately.
llvm-svn: 279554
I'll rename this to IListTest.cpp after a waiting period (tonight?
tomorrow?), with a full explanation in that commit.
First, I'm moving it aside because Git doesn't play well with case-only
filename changes on case-insensitive file systems (and I suspect the
same is true of SVN). This two-stage change should help to avoid
spurious failures on bots that don't do clean checkouts.
llvm-svn: 279524
This patch removes the MachineFunctionAnalysis. Instead we keep a
map from IR Function to MachineFunction in the MachineModuleInfo.
This allows the insertion of ModulePasses into the codegen pipeline
without breaking it because the MachineFunctionAnalysis gets dropped
before a module pass.
Peak memory should stay unchanged without a ModulePass in the codegen
pipeline: Previously the MachineFunction was freed at the end of a codegen
function pipeline because the MachineFunctionAnalysis was dropped; With
this patch the MachineFunction is freed after the AsmPrinter has
finished.
Differential Revision: http://reviews.llvm.org/D23736
llvm-svn: 279502
Separate algorithms in iplist<T> that don't depend on T into ilist_base,
and unit test them.
While I was adding unit tests for these algorithms anyway, I also added
unit tests for ilist_node_base and ilist_sentinel<T>.
To make the algorithms and unit tests easier to write, I also did the
following minor changes as a drive-by:
- encapsulate Prev/Next in ilist_node_base to so that algorithms are
easier to read, and
- update ilist_node_access API to take nodes by reference.
There should be no real functionality change here.
llvm-svn: 279484
Summary: Before the change, *Opt never actually gets updated by the end
of toNext(), so for every next time the loop has to start over from
child_begin(). This bug doesn't affect the correctness, since Visited prevents
it from re-entering the same node again; but it's slow.
Reviewers: dberris, dblaikie, dannyb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23649
llvm-svn: 279482
Summary:
We are going to combine poisoning of red zones and scope poisoning.
PR27453
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23623
llvm-svn: 279373
Currently nodes_iterator may dereference to a NodeType* or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later.
Differential Revision: https://reviews.llvm.org/D23704
Differential Revision: https://reviews.llvm.org/D23705
llvm-svn: 279326
This reverts commit r279053, reapplying r278974 after fixing PR29035
with r279104.
Note that r279312 has been committed in the meantime, and this has been
rebased on top of that. Otherwise it's identical to r278974.
Note for maintainers of out-of-tree code (that I missed in the original
message): if the new isKnownSentinel() assertion is firing from
ilist_iterator<>::operator*(), this patch has identified a bug in your
code. There are a few common patterns:
- Some IR-related APIs htake an IRUnit* that might be nullptr, and pass
in an incremented iterator as an insertion point. Some old code was
using "&*++I", which in the case of end() only worked by fluke. If
the IRUnit in question inherits from ilist_node_with_parent<>, you can
use "I->getNextNode()". Otherwise, use "List.getNextNode(*I)".
- In most other cases, crashes on &*I just need to check for I==end()
before dereferencing.
- There's also occasional code that sends iterators into a function, and
then starts calling I->getOperand() (or other API). Either check for
end() before the entering the function, or early exit.
Note for if the static_assert with HasObsoleteCustomization is firing
for you:
- r278513 has examples of how to stop using custom sentinel traits.
- r278532 removed ilist_nextprev_traits since no one was using it. See
lld's r278469 for the only migration I needed to do.
Original commit message follows.
----
This removes the undefined behaviour (UB) in ilist/ilist_node/etc.,
mainly by removing (gutting) the ilist_sentinel_traits customization
point and canonicalizing on a single, efficient memory layout. This
fixes PR26753.
The new ilist is a doubly-linked circular list.
- ilist_node_base has two ilist_node_base*: Next and Prev. Size-of: two
pointers.
- ilist_node<T> (size-of: two pointers) is a type-safe wrapper around
ilist_node_base.
- ilist_iterator<T> (size-of: two pointers) operates on an
ilist_node<T>*, and downcasts to T* on dereference.
- ilist_sentinel<T> (size-of: two pointers) is a wrapper around
ilist_node<T> that has some extra API for list management.
- ilist<T> (size-of: two pointers) has an ilist_sentinel<T>, whose
address is returned for end().
The new memory layout matches ilist_half_embedded_sentinel_traits<T>
exactly. The Head pointer that previously lived in ilist<T> is
effectively glued to the ilist_half_node<T> that lived in
ilist_half_embedded_sentinel_traits<T>, becoming the Next and Prev in
the ilist_sentinel_node<T>, respectively. sizeof(ilist<T>) is now the
size of two pointers, and there is never any additional storage for a
sentinel.
This is a much simpler design for a doubly-linked list, removing most of
the corner cases of list manipulation (add, remove, etc.). In follow-up
commits, I intend to move as many algorithms as possible into a
non-templated base class (ilist_base) to reduce code size.
Moreover, this fixes the UB in ilist_iterator/getNext/getPrev
operations. Previously, ilist_iterator<T> operated on a T*, even when
the sentinel was not of type T (i.e., ilist_embedded_sentinel_traits and
ilist_half_embedded_sentinel_traits). This added UB to all operations
involving end(). Now, ilist_iterator<T> operates on an ilist_node<T>*,
and only downcasts when the full type is guaranteed to be T*.
What did we lose? There used to be a crash (in some configurations) on
++end(). Curiously (via UB), ++end() would return begin() for users of
ilist_half_embedded_sentinel_traits<T>, but otherwise ++end() would
cause a nice dependable nullptr dereference, crashing instead of a
possible infinite loop. Options:
1. Lose that behaviour.
2. Keep it, by stealing a bit from Prev in asserts builds.
3. Crash on dereference instead, using the same technique.
Hans convinced me (because of the number of problems this and r278532
exposed on Windows) that we really need some assertion here, at least in
the short term. I've opted for #3 since I think it catches more bugs.
I added only a couple of unit tests to root out specific bugs I hit
during bring-up, but otherwise this is tested implicitly via the
extensive usage throughout LLVM.
Planned follow-ups:
- Remove ilist_*sentinel_traits<T>. Here I've just gutted them to
prevent build failures in sub-projects. Once I stop referring to them
in sub-projects, I'll come back and delete them.
- Add ilist_base and move algorithms there.
- Check and fix move construction and assignment.
Eventually, there are other interesting directions:
- Rewrite reverse iterators, so that rbegin().getNodePtr()==&*rbegin().
This allows much simpler logic when erasing elements during a reverse
traversal.
- Remove ilist_traits::createNode, by deleting the remaining API that
creates nodes. Intrusive lists shouldn't be creating nodes
themselves.
- Remove ilist_traits::deleteNode, by (1) asserting that lists are empty
on destruction and (2) changing API that calls it to take a Deleter
functor (intrusive lists shouldn't be in the memory management
business).
- Reconfigure the remaining callback traits (addNodeToList, etc.) to be
higher-level, pulling out a simple_ilist<T> that is much easier to
read and understand.
- Allow tags (e.g., ilist_node<T,tag1> and ilist_node<T,tag2>) so that T
can be a member of multiple intrusive lists.
llvm-svn: 279314
This spiritually reapplies r279012 (reverted in r279052) without the
r278974 parts. The differences:
- Only the HasGetNext trait exists here, so I've only cleaned up (and
tested) it. I still added HasObsoleteCustomization since I know
this will be expanding when r278974 is reapplied.
- I changed the unit tests to use static_assert to catch problems
earlier in the build.
- I added negative tests for the type traits.
Original commit message follows.
----
Change the ilist traits to use decltype instead of sizeof, and add
HasObsoleteCustomization so that additions to this list don't
need to be added in two places.
I suspect this will now work with MSVC, since the trait tested in
r278991 seems to work. If for some reason it continues to fail on
Windows I'll follow up by adding back the #ifndef _MSC_VER.
llvm-svn: 279312
was done to hopefully appease MSVC.
As an upside, this also implements the suggestion Sanjoy made in code
review, so two for one! =]
I'll be watching the bots to see if there are still issues.
llvm-svn: 279295
solve completely opaque MSVC build errors. It complains about lots of
stuff with this change without givin nearly enough information to even
try to fix.
llvm-svn: 279231
to run methods, both for transform passes and analysis passes.
This also allows the analysis manager to use a different set of extra
arguments from the pass manager where useful. Consider passes over
analysis produced units of IR like SCCs of the call graph or loops.
Passes of this nature will often want to refer to the analysis result
that was used to compute their IR units (the call graph or LoopInfo).
And for transformations, they may want to communicate special update
information to the outer pass manager. With this change, it becomes
possible to have a run method for a loop pass that looks more like:
PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM,
LoopInfo &LI, LoopUpdateRecord &UR);
And to query the analysis manager like:
AM.getResult<MyLoopAnalysis>(L, LI);
This makes accessing the known-available analyses convenient and clear,
and it makes passing customized data structures around easy.
My initial use case is going to be in updating the pass manager layers
when the analysis units of IR change. But there are more use cases here
such as having a layer that lets inner passes signal whether certain
additional passes should be run because of particular simplifications
made. Two desires for this have come up in the past: triggering
additional optimization after successfully unrolling loops, and
triggering additional inlining after collapsing indirect calls to direct
calls.
Despite adding this layer of generic extensibility, the *only* change to
existing, simple usage are for places where we forward declare the
AnalysisManager template. We really shouldn't be doing this because of
the fragility exposed here, but currently it makes coping with the
legacy PM code easier.
Differential Revision: http://reviews.llvm.org/D21462
llvm-svn: 279227
This is a little class template that just builds an inheritance chain of
empty classes. Despite how simple this is, it can be used to really
nicely create ranked overload sets. I've added a unittest as much to
document this as test it. You can pass an object of this type as an
argument to a function overload set an it will call the first viable and
enabled candidate at or below the rank of the object.
I'm planning to use this in a subsequent commit to more clearly rank
overload candidates used for SFINAE. All credit for this technique and
both lines of code here to Richard Smith who was helping me rewrite the
SFINAE check in question to much more effectively capture the intended
set of checks.
llvm-svn: 279197
This reverts commit r279086, reapplying r279084. I'm not sure what I
ran before, because the compile failure for ADTTests reproduced locally.
The problem is that TestRev is calling BidirectionalVector::rbegin()
when the BidirectionalVector is const, but rbegin() is always non-const.
I've updated BidirectionalVector::rbegin() to be callable from const.
Original commit message follows.
--
As a follow-up to r278991, add some tests that check that
decltype(reverse(R).begin()) == decltype(R.rbegin()), and get them
passing by adding std::remove_reference to has_rbegin.
I'm using static_assert instead of EXPECT_TRUE (and updated the other
has_rbegin check from r278991 in the same way) since I figure that's
more helpful.
llvm-svn: 279091
As a follow-up to r278991, add some tests that check that
decltype(reverse(R).begin()) == decltype(R.rbegin()), and get them
passing by adding std::remove_reference to has_rbegin.
I'm using static_assert instead of EXPECT_TRUE (and updated the other
has_rbegin check from r278991 in the same way) since I figure that's
more helpful.
llvm-svn: 279084
Summary:
We are going to combine poisoning of red zones and scope poisoning.
PR27453
Reviewers: kcc, eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23623
llvm-svn: 279020
RTDyldMemoryManager::getSymbolAddressInProcess()
This should allow JIT'd code for win32 to find in-process symbols. See
http://llvm.org/PR28699 .
Patch by James Holderness. Thanks James!
llvm-svn: 279016
Change the ilist traits to use decltype instead of sizeof, and add
HasObsoleteCustomization so that additions to this list don't need to be
added in two places.
I suspect this will now work with MSVC, since the trait tested in
r278991 seems to work. If for some reason it continues to fail on
Windows I'll follow up by adding back the #ifndef _MSC_VER.
llvm-svn: 279012
Duncan found that reverse worked on mutable rbegin(), but the has_rbegin
trait didn't work with a const method. See http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382890.html
for more details.
Turns out this was already solved in clang with has_getDecl. Copied that and made it work for rbegin.
This includes the tests Duncan attached to that thread, including the traits test.
llvm-svn: 278991
This removes the undefined behaviour (UB) in ilist/ilist_node/etc.,
mainly by removing (gutting) the ilist_sentinel_traits customization
point and canonicalizing on a single, efficient memory layout. This
fixes PR26753.
The new ilist is a doubly-linked circular list.
- ilist_node_base has two ilist_node_base*: Next and Prev. Size-of: two
pointers.
- ilist_node<T> (size-of: two pointers) is a type-safe wrapper around
ilist_node_base.
- ilist_iterator<T> (size-of: two pointers) operates on an
ilist_node<T>*, and downcasts to T* on dereference.
- ilist_sentinel<T> (size-of: two pointers) is a wrapper around
ilist_node<T> that has some extra API for list management.
- ilist<T> (size-of: two pointers) has an ilist_sentinel<T>, whose
address is returned for end().
The new memory layout matches ilist_half_embedded_sentinel_traits<T>
exactly. The Head pointer that previously lived in ilist<T> is
effectively glued to the ilist_half_node<T> that lived in
ilist_half_embedded_sentinel_traits<T>, becoming the Next and Prev in
the ilist_sentinel_node<T>, respectively. sizeof(ilist<T>) is now the
size of two pointers, and there is never any additional storage for a
sentinel.
This is a much simpler design for a doubly-linked list, removing most of
the corner cases of list manipulation (add, remove, etc.). In follow-up
commits, I intend to move as many algorithms as possible into a
non-templated base class (ilist_base) to reduce code size.
Moreover, this fixes the UB in ilist_iterator/getNext/getPrev
operations. Previously, ilist_iterator<T> operated on a T*, even when
the sentinel was not of type T (i.e., ilist_embedded_sentinel_traits and
ilist_half_embedded_sentinel_traits). This added UB to all operations
involving end(). Now, ilist_iterator<T> operates on an ilist_node<T>*,
and only downcasts when the full type is guaranteed to be T*.
What did we lose? There used to be a crash (in some configurations) on
++end(). Curiously (via UB), ++end() would return begin() for users of
ilist_half_embedded_sentinel_traits<T>, but otherwise ++end() would
cause a nice dependable nullptr dereference, crashing instead of a
possible infinite loop. Options:
1. Lose that behaviour.
2. Keep it, by stealing a bit from Prev in asserts builds.
3. Crash on dereference instead, using the same technique.
Hans convinced me (because of the number of problems this and r278532
exposed on Windows) that we really need some assertion here, at least in
the short term. I've opted for #3 since I think it catches more bugs.
I added only a couple of unit tests to root out specific bugs I hit
during bring-up, but otherwise this is tested implicitly via the
extensive usage throughout LLVM.
Planned follow-ups:
- Remove ilist_*sentinel_traits<T>. Here I've just gutted them to
prevent build failures in sub-projects. Once I stop referring to them
in sub-projects, I'll come back and delete them.
- Add ilist_base and move algorithms there.
- Check and fix move construction and assignment.
Eventually, there are other interesting directions:
- Rewrite reverse iterators, so that rbegin().getNodePtr()==&*rbegin().
This allows much simpler logic when erasing elements during a reverse
traversal.
- Remove ilist_traits::createNode, by deleting the remaining API that
creates nodes. Intrusive lists shouldn't be creating nodes
themselves.
- Remove ilist_traits::deleteNode, by (1) asserting that lists are empty
on destruction and (2) changing API that calls it to take a Deleter
functor (intrusive lists shouldn't be in the memory management
business).
- Reconfigure the remaining callback traits (addNodeToList, etc.) to be
higher-level, pulling out a simple_ilist<T> that is much easier to
read and understand.
- Allow tags (e.g., ilist_node<T,tag1> and ilist_node<T,tag2>) so that T
can be a member of multiple intrusive lists.
llvm-svn: 278974
This is used to mark functions with the C++11 [[ noreturn ]] or C11 _Noreturn
attributes.
Patch by Victor Leschuk!
https://reviews.llvm.org/D23167
llvm-svn: 278940
Now the tests of TargetParser is in place:
unittests/Support/TargetParserTest.cpp.
So the tests in TripleTest.cpp which actually stressing TargetParser's behavior could be removed.
llvm-svn: 278899
These splices are interesting because they involve swapping two nodes in
the same list. There are two ways to do this. Assuming:
A -> B -> [Sentinel]
You can either:
- splice B before A, with: L.splice(A, L, B) or
- splice A before Sentinel, with: L.splice(L.end(), L, A) to create:
B -> A -> [Sentinel]
These two swapping-splices are somewhat interesting corner cases for
maintaining the list invariants. The tests pass even with my new ilist
implementation, but I had some doubts about the latter when I was
looking at weird UB effects. Since I can't find equivalent explicit
test coverage elsewhere it seems prudent to commit.
llvm-svn: 278887
Pattern match has some paths which can operate on constant instructions,
but not all. This adds a version of m_value() to return const Value* and
changes ICmp matching to use auto so that it can match both constant and
mutable instructions.
Tests also included for both mutable and constant ICmpInst matching.
This will be used in a future commit to constify ValueTracking.cpp.
llvm-svn: 278570
Remove all ilist_iterator to pointer casts. There were two reasons for
casts:
- Checking for an uninitialized (i.e., null) iterator. I added
MachineInstrBundleIterator::isValid() to check for that case.
- Comparing an iterator against the underlying pointer value while
avoiding converting the pointer value to an iterator. This is
occasionally necessary in MachineInstrBundleIterator, since there is
an assertion in the constructors that the underlying MachineInstr is
not bundled (but we don't care about that if we're just checking for
pointer equality).
To support the latter case, I rewrote the == and != operators for
ilist_iterator and MachineInstrBundleIterator.
- The implicit constructors now use enable_if to exclude
const-iterator => non-const-iterator conversions from overload
resolution (previously it was a compiler error on instantiation, now
it's SFINAE).
- The == and != operators are now global (friends), and are not
templated.
- MachineInstrBundleIterator has overloads to compare against both
const_pointer and const_reference. This avoids the implicit
conversions to MachineInstrBundleIterator that assert, instead just
checking the address (and I added unit tests to confirm this).
Notably, the only remaining uses of ilist_iterator::getNodePtrUnchecked
are in ilist.h, and no code outside of ilist*.h directly relies on this
UB end-iterator-to-pointer conversion anymore. It's still needed for
ilist_*sentinel_traits, but I'll clean that up soon.
llvm-svn: 278478
Summary: Make Optional's behavior the same as the coming std::optional.
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23178
llvm-svn: 278397
Summary: make_scope_exit() is described in C++ proposal p0052r2, which uses RAII to do cleanup works at scope exit.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22796
llvm-svn: 278251
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).
Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278079
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278077
The current approach isn't a long-term viable pattern. Given the set of
architectures A, vendors V, operating systems O, and environments E, it
does |A| * |V| * |O| * |E| * 4! tests. As LLVM grows, this test keeps
getting slower, despite my working very hard to make it get some
"optimizations" even in -O0 builds in order to lower the constant
factors. Fundamentally, we're doing an unreasonable amount of work.i
Looking at the specific thing being tested -- the goal seems very
clearly to be testing the *permutations*, not the *combinations*. The
combinations are driving up the complexity much more than anything else.
Instead, test every possible value for a given triple entry in every
permutation of *some* triple. This really seems to cover the core goal
of the test. Every single possible triple component is tested in every
position. But because we keep the rest of the triple constant, it does
so in a dramatically more scalable amount of time. With this model we do
(|A| + |V| + |O| + |E|) * 4! tests.
For me on a debug build, this goes from running for 19 seconds to 19
milliseconds, or a 1000x improvement. This makes a world of difference
for the critical path of 'ninja check-llvm' and other extremely common
workflows.
Thanks to Renato, Dean, and David for the helpful review comments and
helping me refine the explanation of the change.
Differential Revision: https://reviews.llvm.org/D23156
llvm-svn: 277912
String pooling is not guaranteed by the standard, so if
you're comparing two different string literals for equality,
you have to use strcmp.
llvm-svn: 277831
This is a follow-up to r277637. It teaches MemorySSA that invariant
loads (and loads of provably constant memory) are always liveOnEntry.
llvm-svn: 277640
This is a fix for PR28697.
An MDNode can indirectly refer to a GlobalValue, through a
ConstantAsMetadata. When the GlobalValue is deleted, the MDNode operand
is reset to `nullptr`. If the node is uniqued, this can lead to a
hard-to-detect cache invalidation in a Metadata map that's shared across
an LLVMContext.
Consider:
1. A map from Metadata* to `T` called RemappedMDs.
2. A node that references a global variable, `!{i1* @GV}`.
3. Insert `!{i1* @GV} -> SomeT` in the map.
4. Delete `@GV`, leaving behind `!{null} -> SomeT`.
Looking up the generic and uninteresting `!{null}` gives you `SomeT`,
which is likely related to `@GV`. Worse, `SomeT`'s lifetime may be tied
to the deleted `@GV`.
This occurs in practice in the shared ValueMap used since r266579 in the
IRMover. Other code that handles more than one Module (with different
lifetimes) in the same LLVMContext could hit it too.
The fix here is a partial revert of r225223: in the rare case that an
MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the
Value hierarchy), drop uniquing if it gets replaced with `nullptr`.
This changes step #4 above to leave behind `distinct !{null} -> SomeT`,
which can't be confused with the generic `!{null}`.
In theory, this can cause some churn in the LLVMContext's MDNode
uniquing map when Values are being deleted. However:
- The number of GlobalValues referenced from uniqued MDNodes is
expected to be quite small. E.g., the debug info metadata schema
only references GlobalValues from distinct nodes.
- Other Constants have the lifetime of the LLVMContext, whose teardown
is careful to drop references before deleting the constants.
As a result, I don't expect a compile time regression from this change.
llvm-svn: 277625
This fixes a bug where we'd sometimes cache overly-conservative results
with our walker. This bug was made more obvious by r277480, which makes
our cache far more spotty than it was. Test case is llvm-unit, because
we're likely going to use CachingWalker only for def optimization in the
future.
The bug stems from that there was a place where the walker assumed that
`DefNode.Last` was a valid target to cache to when failing to optimize
phis. This is sometimes incorrect if we have a cache hit. The fix is to
use the thing we *can* assume is a valid target to cache to. :)
llvm-svn: 277559
Summary: By generalize the interface, users are able to inject more flexible Node token into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's a NFC.
Reviewers: dblaikie, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22937
llvm-svn: 277399
This patch replaces RuntimeDyld::SymbolInfo with JITSymbol: A symbol class
that is capable of lazy materialization (i.e. the symbol definition needn't be
emitted until the address is requested). This can be used to support common
and weak symbols in the JIT (though this is not implemented in this patch).
For consistency, RuntimeDyld::SymbolResolver is renamed to JITSymbolResolver.
For space efficiency a new class, JITEvaluatedSymbol, is introduced that
behaves like the old RuntimeDyld::SymbolInfo - i.e. it is just a pair of an
address and symbol flags. Instances of JITEvaluatedSymbol can be used in
symbol-tables to avoid paying the space cost of the materializer.
llvm-svn: 277386
are very handy when parsing text.
They are essentially a combination of startswith and a self-modifying
drop_front, or endswith and drop_back respectively.
Differential Revision: https://reviews.llvm.org/D22723
llvm-svn: 277288
Summary:
This change fixes issues with `LLVM_CONSTEXPR` functions and
`TrailingObjects::FixedSizeStorage`. In particular, some of the
functions marked `LLVM_CONSTEXPR` used by `FixedSizeStorage` were not
implemented such that they evaluate successfully as part of a constant
expression despite constant arguments.
This change also implements a more traditional template-meta path to
accommodate MSVC, and adds unit tests for `FixedSizeStorage`.
Drive-by fix: the access control for members of `TrailingObjectsImpl` is
tightened.
Reviewers: faisalv, rsmith, aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D22668
llvm-svn: 277270
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.
I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.
llvm-svn: 277213
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.
llvm-svn: 277190
When coming from an IR label type, we set a 0 NumElements, but not
when constructing an LLT using unsized(), causing comparisons to fail.
Pick one variant and fix the other.
llvm-svn: 277161
This was a pure virtual base class whose purpose was to abstract
away the notion of how you retrieve the layout of a discontiguous
stream of blocks in an Msf file. This led to too many layers of
abstraction making it difficult to figure out what was going on
and extend things. Ultimately, a stream's layout is decided by
its length and the array of block numbers that it lives on. So
rather than have an abstract base class which can return this in
any number of ways, it's more straightforward to simply store them
as fields of a trivial struct, and also to give a more appropriate
name.
This patch does that. It renames IMsfStreamData to MsfStreamLayout,
and deletes the 2 concrete implementations, DirectoryStreamData
and IndexedStreamData. MsfStreamLayout is a trivial struct
with the necessary data.
llvm-svn: 277018
These loop from 0 to AEK_XSCALE, which is currently defined as 0x80000000, and
thus the tests loop over the entire int range, which is unreasonable
and also too slow in debug builds.
llvm-svn: 276969
Add unittest to {ARM | AArch64}TargetParser,and by the way correct problems as below:
1.Correct a incorrect indexing problem in AArch64TargetParser. The architecture enumeration
is shared across ARM and AArch64 in original implementation.But In the code,I just used the
index which was offset by the ARM, and this would index into the array incorrectly. To make
AArch64 has its own arch enum,or we will do a lot of slowly iterating.
2.Correct a spelling error. The parameter of llvm::AArch64::getArchExtName.
3.Correct a writing mistake, in llvm::ARM::parseArchISA.
Differential Revision: https://reviews.llvm.org/D21785
llvm-svn: 276957
Change the syntax to use `%0.sub8` to denote a subregister.
This seems like a more natural fit to denote subregisters; I also plan
to introduce a new ":classname" syntax in upcoming patches to denote the
register class of a vreg.
Note that this commit disallows plain identifiers to start with a '.'
character. This shouldn't affect anything as external names/IR
references are all prefixed with '$'/'%', plain identifiers are only
used for instruction names, register mask names and subreg indexes.
Differential Revision: https://reviews.llvm.org/D22390
llvm-svn: 276815
If we move a last-use register read to a later position we may skip
intermediate segments. This may require us to not only extend the
segment before the NewIdx, but also extend the segment live-in to
OldIdx.
This switches LiveIntervalTest to use AMDGPU so we can test subregister
liveness.
llvm-svn: 276724
This adds versions of operator + and - which are optimized for the LHS/RHS of the
operator being RValue's. When an RValue is available, we can use its storage space
instead of allocating new space.
On code such as ConstantRange which makes heavy use of APInt's over 64-bits in size,
this results in significant numbers of saved allocations.
Thanks to David Blaikie for all the review and most of the code here.
llvm-svn: 276470
This provides a better layering of responsibilities among different
aspects of PDB writing code. Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this. Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF. So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.
llvm-svn: 276458
This allows ErrorAsOutParameter to work better with "optional" errors. For
example, consider a function where for certain input values it is known that
the function can't fail. This can now be written as:
Result foo(Arg X, Error *Err) {
ErrorAsOutParameter EAO(Err);
if (<Error Condition>) {
if (Err)
*Err = <report error>;
else
llvm_unreachable("Unexpected failure!");
}
}
Rather than having to construct an ErrorAsOutParameter under every conditional
where Err is known to be non-null.
llvm-svn: 276430
This provides an elegant pattern to solve the "construct if not in map
already" problem we have many times in LLVM. Without try_emplace we
either have to rely on a sentinel value (nullptr) or do two lookups.
llvm-svn: 276277
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.
I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:
No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total
Changes since the initial commit:
- When handling odd-length inputs, call ThreadPool::wait() before merging the
last profile. Should fix a race/off-by-one (see r275937).
Differential Revision: https://reviews.llvm.org/D22438
llvm-svn: 275938
Add a "-j" option to llvm-profdata to control the number of threads
used. Auto-detect NumThreads when it isn't specified, and avoid spawning
threads when they wouldn't be beneficial.
I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:
No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total
Differential Revision: https://reviews.llvm.org/D22438
llvm-svn: 275921
Summary:
Given that we had a bug on max/minUIntN(64), these should have tests
too.
Reviewers: rnk
Subscribers: dylanmckay, llvm-commits
Differential Revision: https://reviews.llvm.org/D22443
llvm-svn: 275723
Summary:
Previously we were doing 1 << S. "1" is an int, so this doesn't work
when S >= 32.
This patch also adds some static_asserts to these functions to ensure
that we don't hit UB by shifting left too much.
Reviewers: rnk
Subscribers: llvm-commits, dylanmckay
Differential Revision: https://reviews.llvm.org/D22441
llvm-svn: 275719
Doing "I++" inside of an EXPECT_* triggers
warning: expression with side effects has no effect in an unevaluated context
because EXPECT_* partially expands to
EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(i++)) == 1)>
which is an unevaluated context.
llvm-svn: 275717
Summary:
This shift is undefined behavior (and, as compiled by clang, gives the
wrong answer for maxUIntN(64)).
Reviewers: mkuper
Subscribers: llvm-commits, jroelofs, rsmith
Differential Revision: https://reviews.llvm.org/D22430
llvm-svn: 275656
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file. We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now. We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.
Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.
llvm-svn: 275629
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged. This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.
This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.
llvm-svn: 275627
Doing "I++" inside of an EXPECT_* triggers
warning: expression with side effects has no effect in an unevaluated context
because EXPECT_* partially expands to
EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(MockObjects[I++] + 1)) == 1)>
which is an unevaluated context.
llvm-svn: 275293
Summary: Normally when you do a bitwise operation on an enum value, you
get back an instance of the underlying type (e.g. int). But using this
macro, bitwise ops on your enum will return you back instances of the
enum. This is particularly useful for enums which represent a
combination of flags.
Suppose you have a function which takes an int and a set of flags. One
way to do this would be to take two numeric params:
enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
void Fn(int Num, int Flags);
void foo() {
Fn(42, F2 | F3);
}
But now if you get the order of arguments wrong, you won't get an error.
You might try to fix this by changing the signature of Fn so it accepts
a SomeFlags arg:
enum SomeFlags { F1 = 1, F2 = 2, F3 = 4, ... };
void Fn(int Num, SomeFlags Flags);
void foo() {
Fn(42, static_cast<SomeFlags>(F2 | F3));
}
But now we need a static cast after doing "F2 | F3" because the result
of that computation is the enum's underlying type.
This patch adds a mechanism which gives us the safety of the second
approach with the brevity of the first.
enum SomeFlags {
F1 = 1, F2 = 2, F3 = 4, ..., F_MAX = 128,
LLVM_MARK_AS_BITMASK_ENUM(F_MAX)
};
void Fn(int Num, SomeFlags Flags);
void foo() {
Fn(42, F2 | F3); // No static_cast.
}
The LLVM_MARK_AS_BITMASK_ENUM macro enables overloads for bitwise
operators on SomeFlags. Critically, these operators return the enum
type, not its underlying type, so you don't need any static_casts.
An advantage of this solution over the previously-proposed BitMask class
[0, 1] is that we don't need any wrapper classes -- we can operate
directly on the enum itself.
The approach here is somewhat similar to OpenOffice's typed_flags_set
[2]. But we skirt the need for a wrapper class (and a good deal of
complexity) by judicious use of enable_if. We SFINAE on the presence of
a particular enumerator (added by the LLVM_MARK_AS_BITMASK_ENUM macro)
instead of using a traits class so that it's impossible to use the enum
before the overloads are present. The solution here also seamlessly
works across multiple namespaces.
[0] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150622/283369.html
[1] http://lists.llvm.org/pipermail/llvm-commits/attachments/20150623/073434b6/attachment.obj
[2] https://cgit.freedesktop.org/libreoffice/core/tree/include/o3tl/typed_flags_set.hxx
Reviewers: chandlerc, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22279
llvm-svn: 275292
Because of the goop involved in the EXPECT_EQ macro, we were getting the
following warning
expression with side effects has no effect in an unevaluated context
because the "I++" was being used inside of a template type:
switch (0) case 0: default: if (const ::testing::AssertionResult gtest_ar = (::testing::internal:: EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(Args[I++])) == 1)>::Compare("Args[I++]", "&A", Args[I++], &A))) ; else ::testing::internal::AssertHelper(::testing::TestPartResult::kNonFatalFailure, "../src/unittests/IR/FunctionTest.cpp", 94, gtest_ar.failure_message()) = ::testing::Message();
llvm-svn: 275291
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.
This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).
Reviewers: aprantl, dexonsmith
Subscribers: aaboud, amccarth, llvm-commits
Differential Revision: http://reviews.llvm.org/D21614
llvm-svn: 274325
When concatenating two error lists the ErrorList::join method (which is called
by joinErrors) was failing to set the checked bit on the second error, leading
to a 'failure to check error' assertion.
llvm-svn: 274249
re-insertion of entries into the worklist moves them to the end.
This is fairly similar to a SetVector, but helps in the case where in
addition to not inserting duplicates you want to adjust the sequence of
a pop-off-the-back worklist.
I'm not at all attached to the name of this data structure if others
have better suggestions, but this is one that David Majnemer brought up
in IRC discussions that seems plausible.
I've trimmed the interface down somewhat from SetVector's interface
because several things make less sense here IMO: iteration primarily.
I'd prefer to add these back as we have users that need them. My use
case doesn't even need all of what is provided here. =]
I've also included a basic unittest to make sure this functions
reasonably.
Differential Revision: http://reviews.llvm.org/D21866
llvm-svn: 274198
This fixes an issue where occurrence counts would be unexpectedly
reset when parsing different parts of a command line multiple
times.
**ORIGINAL COMMIT MESSAGE**
This allows command line tools to use syntaxes like the following:
llvm-foo.exe command1 -o1 -o2
llvm-foo.exe command2 -p1 -p2
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
llvm-foo.exe --help
llvm-foo.exe command1 --help
llvm-foo.exe command2 --help
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.
Reviewed By: beanz
llvm-svn: 274171
This allows command line tools to use syntaxes like the following:
llvm-foo.exe command1 -o1 -o2
llvm-foo.exe command2 -p1 -p2
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
llvm-foo.exe --help
llvm-foo.exe command1 --help
llvm-foo.exe command2 --help
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.
Reviewed By: beanz
Differential Revision: http://reviews.llvm.org/D21485
llvm-svn: 274054
The bitset metadata currently used in LLVM has a few problems:
1. It has the wrong name. The name "bitset" refers to an implementation
detail of one use of the metadata (i.e. its original use case, CFI).
This makes it harder to understand, as the name makes no sense in the
context of virtual call optimization.
2. It is represented using a global named metadata node, rather than
being directly associated with a global. This makes it harder to
manipulate the metadata when rebuilding global variables, summarise it
as part of ThinLTO and drop unused metadata when associated globals are
dropped. For this reason, CFI does not currently work correctly when
both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
globals, and fails to associate metadata with the rebuilt globals. As I
understand it, the same problem could also affect ASan, which rebuilds
globals with a red zone.
This patch solves both of those problems in the following way:
1. Rename the metadata to "type metadata". This new name reflects how
the metadata is currently being used (i.e. to represent type information
for CFI and vtable opt). The new name is reflected in the name for the
associated intrinsic (llvm.type.test) and pass (LowerTypeTests).
2. Attach metadata directly to the globals that it pertains to, rather
than using the "llvm.bitsets" global metadata node as we are doing now.
This is done using the newly introduced capability to attach
metadata to global variables (r271348 and r271358).
See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
Differential Revision: http://reviews.llvm.org/D21053
llvm-svn: 273729
This change is motivated by an upcoming change to the metadata representation
used for CFI. The indirect function call checker needs type information for
external function declarations in order to correctly generate jump table
entries for such declarations. We currently associate such type information
with declarations using a global metadata node, but I plan [1] to move all
such metadata to global object attachments.
In bitcode, metadata attachments for function declarations appear in the
global metadata block. This seems reasonable to me because I expect metadata
attachments on declarations to be uncommon. In the long term I'd also expect
this to be the case for CFI, because we'd want to use some specialized bitcode
format for this metadata that could be read as part of the ThinLTO thin-link
phase, which would mean that it would not appear in the global metadata block.
To solve the lazy loaded metadata issue I was seeing with D20147, I use the
same bitcode representation for metadata attachments for global variables as I
do for function declarations. Since there's a use case for metadata attachments
in the global metadata block, we might as well use that representation for
global variables as well, at least until we have a mechanism for lazy loading
global variables.
In the assembly format, the metadata attachments appear after the "declare"
keyword in order to avoid a parsing ambiguity.
[1] http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
Differential Revision: http://reviews.llvm.org/D21052
llvm-svn: 273336
We recently made MemorySSA own the walker it creates. As a part of this,
the MSSA test fixture was changed to have a `Walker*` instead of a
`unique_ptr<Walker>`. So, we no longer need to do `&*Walker` in order to
get a `Walker*`.
llvm-svn: 273189
pass manager passes' `run` methods.
This removes a bunch of SFINAE goop from the pass manager and just
requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead
argument. This is a small price to pay for the simplicity of the system
as a whole, despite the noise that changing it causes at this stage.
This will also helpfull allow us to make the signature of the run
methods much more flexible for different kinds af passes to support
things like intelligently updating the pass's progression over IR units.
While this touches many, many, files, the changes are really boring.
Mostly made with the help of my trusty perl one liners.
Thanks to Sean and Hal for bouncing ideas for this with me in IRC.
llvm-svn: 272978
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise
we will get dangling pointer inside BranchProbabilityInfo cache.
Differential Revision: http://reviews.llvm.org/D20957
llvm-svn: 272891
Differential Revision: http://reviews.llvm.org/D19842
Corresponding clang patch: http://reviews.llvm.org/D19843
Re-commit after addressing issues with of generating too many warnings for Windows and asan test failures
Patch by Eric Niebler
llvm-svn: 272555
undef uses are no real uses of a register and must be ignored by
findLastUseBefore() so that handleMove() does not produce invalid live
intervals in some cases.
This fixed http://llvm.org/PR28083
llvm-svn: 272446
This fixes an alignment issue by forcing all cached allocations
to be 8 byte aligned, and also fixes an issue arising on big
endian systems by writing ulittle32_t's instead of uint32_t's
in the test.
llvm-svn: 272437
This adds method and tests for writing to a PDB stream. With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157
llvm-svn: 272369
Summary:
Now DISubroutineType has a 'cc' field which should be a DW_CC_ enum. If
it is present and non-zero, the backend will emit it as a
DW_AT_calling_convention attribute. On the CodeView side, we translate
it to the appropriate enum for the LF_PROCEDURE record.
I added a new LLVM vendor specific enum to the list of DWARF calling
conventions. DWARF does not appear to attempt to standardize these, so I
assume it's OK to do this until we coordinate with GCC on how to emit
vectorcall convention functions.
Reviewers: dexonsmith, majnemer, aaboud, amccarth
Subscribers: mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D21114
llvm-svn: 272197
The architecture enumeration is shared across ARM and AArch64. However, the
data is not. The code incorrectly would index into the array using the
architecture index which was offset by the ARMv7 architecture enumeration. We
do not have a marker for indicating the architectural family to which the
enumeration belongs so we cannot be clever about offsetting the index (at least
it is not immediately apparent to me). Instead, fall back to the tried-and-true
method of slowly iterating the array (its not a large array, so the impact of
this is not too high).
Because of the incorrect indexing, if we were lucky, we would crash, but usually
we would return an invalid StringRef. We did not have any tests for the AArch64
target parser previously;. Extend the previous tests I had added for ARM to
cover AArch64 for ensuring that we return expected StringRefs.
Take the opportunity to change some iterator types to references.
This work is needed to support parsing `.arch name` directives in the AArch64
target asm parser.
llvm-svn: 272145
This allows mapping of any endian-aware type whose underlying
type (e.g. uint32_t) provides a ScalarTraits specialization.
Reviewed by: majnemer
Differential Revision: http://reviews.llvm.org/D21057
llvm-svn: 272049
Arrange to call verify(Function &) on each function, followed by
verify(Module &), whether the verifier is being used from the pass or
from verifyModule(). As a side effect, this fixes an issue that caused
us not to call verify(Function &) on unmaterialized functions from
verifyModule().
Differential Revision: http://reviews.llvm.org/D21042
llvm-svn: 271956
Also fix slice wrappers drop_front and drop_back.
The unittests are pretty awkward, but do the job; alternatives
welcome!
..and yes, I do have ArrayRefs with more than 4 billion elements.
llvm-svn: 271546
We only considered the length of the operation and the length of the
StreamRef without considered what it meant for the offset to be at a
non-zero position.
llvm-svn: 271496
Add support for the new pass manager to MemorySSA pass.
Change MemorySSA to be computed eagerly upon construction.
Change MemorySSAWalker to be owned by the MemorySSA object that creates
it.
Reviewers: dberlin, george.burgess.iv
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D19664
llvm-svn: 271432
This will be necessary to allow the global merge pass to attach
multiple debug info metadata nodes to global variables once we reverse
the edge from DIGlobalVariable to GlobalVariable.
Differential Revision: http://reviews.llvm.org/D20414
llvm-svn: 271358
This tidies up some code that was manually constructing RuntimeDyld::SymbolInfo
instances from JITSymbols. It will save more mess in the future when
JITSymbol::getAddress is extended to return an Expected<TargetAddress> rather
than just a TargetAddress, since we'll be able to embed the error checking in
the conversion.
llvm-svn: 271350
StringError can be used to represent Errors that aren't recoverable based on
the error type, but that have a useful error message that can be reported to
the user or logged.
llvm-svn: 270948
APInt::slt was copying the LHS and RHS in to temporaries then making
them unsigned so that it could use an unsigned comparision. It did
this even on the paths which were trivial to give results for, such
as the sign bit of the LHS being set while RHS was not set.
This changes the logic to return out immediately in the trivial cases,
and use an unsigned comparison in the remaining cases. But this time,
just use the unsigned comparison directly without creating any temporaries.
This works because, for example:
true = (-2 slt -1) = (0xFE ult 0xFF)
Also added some tests explicitly for slt with APInt's larger than 64-bits
so that this new code is tested.
Using the memory for 'opt -O2 verify-uselistorder.lto.opt.bc -o opt.bc'
(see r236629 for details), this reduces the number of allocations from
26.8M to 23.9M.
llvm-svn: 270881
Since r268966 the modern Verifier pass defaults to stripping invalid debug info
in nonasserts builds. This patch ports this behavior back to the legacy
Verifier pass as well. The primary motivation is that the clang frontend
accepts bitcode files as input but is still using the legacy pass pipeline.
Background: The problem I'm trying to solve with this sequence of patches is
that historically we've done a really bad job at verifying debug info. We want
to be able to make the verifier stricter without having to worry about breaking
bitcode compatibility with existing producers. For example, we don't necessarily
want IR produced by an older version of clang to be rejected by an LTO link just
because of malformed debug info, and rather provide an option to strip it. Note
that merely outdated (but well-formed) debug info would continue to be
auto-upgraded in this scenario.
http://reviews.llvm.org/D20629
<rdar://problem/26448800>
llvm-svn: 270768
This removes the subclasses of ProfileSummary, moves the members of the derived classes to the base class.
Differential Revision: http://reviews.llvm.org/D20390
llvm-svn: 270143
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Changes since the initial commit:
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
- Remove the base ProfError class to work around an MSVC ICE.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 270020
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
llvm-svn: 269988
Summary:
Add support to control where files for a distributed backend (the
individual index files and optional imports files) are created.
This is invoked with a new thinlto-prefix-replace option in the gold
plugin and llvm-lto. If specified, expects a string of the form
"oldprefix:newprefix", and instead of generating these files in the
same directory path as the corresponding bitcode file, will use a path
formed by replacing the bitcode file's path prefix matching oldprefix
with newprefix.
Also add a new replace_path_prefix helper to Path.h in libSupport.
Depends on D19636.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19644
llvm-svn: 269771
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Changes since the initial commit:
- Address undefined-var-template warning.
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 269694
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Changes since the initial commit:
- Fix error message printing in llvm-profdata.
- Check errors in loadTestingFormat() + annotateAllFunctions().
- Defer error handling in InstrProfIterator to InstrProfReader.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 269491
Transition InstrProf and Coverage over to the stricter Error/Expected
interface.
Differential Revision: http://reviews.llvm.org/D19901
llvm-svn: 269462
a sequence of values.
It increments through the values in the half-open range: [Begin, End),
producing those values when indirecting the iterator. It should support
integers, iterators, and any other type providing these basic arithmetic
operations.
This came up in the C++ standards committee meeting, and it seemed like
a useful construct that LLVM might want as well, and I wanted to
understand how easily we could solve it. I suspect this can be used to
write simpler counting loops even in LLVM along the lines of:
for (int i : seq(0, v.size())) {
...
};
As part of this, I had to fix the lack of a proxy object returned from
the operator[] in our iterator facade.
Differential Revision: http://reviews.llvm.org/D17870
llvm-svn: 269390
Summary:
...loop after the last iteration.
This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.
This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
avoid this from becoming an excessively slow analysis.
The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up, I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look at.
Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrances around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.
Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when if we had continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.
We could use a techinque similar (but much simpler) within the inliner
as well to avoid considering speculated code in the inline cost.
Reviewers: chandlerc
Subscribers: sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D11758
llvm-svn: 269388
Remove the ModuleLevelChanges argument, and the ability to create new
subprograms for cloned functions. The latter was added without review in
r203662, but it has no in-tree clients (all non-test callers pass false
for ModuleLevelChanges [1], so it isn't reachable outside of tests). It
also isn't clear that adding a duplicate subprogram to the compile unit is
always the right thing to do when cloning a function within a module. If
this functionality comes back it should be accompanied with a more concrete
use case.
Furthermore, all in-tree clients add the returned function to the module.
Since that's pretty much the only sensible thing you can do with the function,
just do that in CloneFunction.
[1] http://llvm-cs.pcc.me.uk/lib/Transforms/Utils/CloneFunction.cpp/rCloneFunction
Differential Revision: http://reviews.llvm.org/D18628
llvm-svn: 269110
Add convenience function to create MachineModuleInfo and
MachineFunctionAnalysis passes and add them to a pass manager.
Despite factoring out some shared code in
LiveIntervalTest/LLVMTargetMachine this will be used by my upcoming llc
change.
llvm-svn: 269002
allow the transformation to strip invalid debug info.
This patch separates the Verifier into an analysis and a transformation
pass, with the transformation pass optionally stripping malformed
debug info.
The problem I'm trying to solve with this sequence of patches is that
historically we've done a really bad job at verifying debug info. We want
to be able to make the verifier stricter without having to worry about
breaking bitcode compatibility with existing producers. For example, we
don't necessarily want IR produced by an older version of clang to be
rejected by an LTO link just because of malformed debug info, and rather
provide an option to strip it. Note that merely outdated (but well-formed)
debug info would continue to be auto-upgraded in this scenario.
http://reviews.llvm.org/D19988
rdar://problem/25818489
This reapplies r268937 without modifications.
llvm-svn: 268966
allow the transformation to strip invalid debug info.
This patch separates the Verifier into an analysis and a transformation
pass, with the transformation pass optionally stripping malformed
debug info.
The problem I'm trying to solve with this sequence of patches is that
historically we've done a really bad job at verifying debug info. We want
to be able to make the verifier stricter without having to worry about
breaking bitcode compatibility with existing producers. For example, we
don't necessarily want IR produced by an older version of clang to be
rejected by an LTO link just because of malformed debug info, and rather
provide an option to strip it. Note that merely outdated (but well-formed)
debug info would continue to be auto-upgraded in this scenario.
http://reviews.llvm.org/D19988
rdar://problem/25818489
llvm-svn: 268937
toString() consumes an Error and returns a string representation of its
contents. This commit also adds a message() method to ErrorInfoBase for
convenience.
Differential Revision: http://reviews.llvm.org/D19883
llvm-svn: 268465
A loop pass that didn't preserve this entire set of passes wouldn't
play well with other loop passes, since these are generally a basic
requirement to do any interesting transformations to a loop.
Adds a helper to get the set of analyses a loop pass should preserve,
and checks that any loop pass we run satisfies the requirement.
llvm-svn: 268444
We have it for StringRef but not ArrayRef, and ArrayRef has drop_back,
so I see no reason it shouldn't have drop_front. Splitting this out of a
change that I have that will use this funcitonality.
llvm-svn: 268434
Check for success values in the CoverageMappingTest unit test file.
This is part of a series of patches to transition ProfileData over to
the stricter Error/Expected interface.
llvm-svn: 268420
Check for success values in the InstrProfTest unit test file.
This is part of a series of patches to transition ProfileData over to
the stricter Error/Expected interface.
llvm-svn: 268402
This patch fixes two somewhat related bugs in MemorySSA's caching
walker. These bugs were found because D19695 brought up the problem
that we'd have defs cached to themselves, which is incorrect.
The bugs this fixes are:
- We would sometimes skip the nearest clobber of a MemoryAccess, because
we would query our cache for a given potential clobber before
checking if the potential clobber is the clobber we're looking for.
The cache entry for the potential clobber would point to the nearest
clobber *of the potential clobber*, so if that was a cache hit, we'd
ignore the potential clobber entirely.
- There are times (sometimes in DFS, sometimes in the getClobbering...
functions) where we would insert cache entries that say a def
clobbers itself.
There's a bit of common code between the fixes for the bugs, so they
aren't split out into multiple commits.
This patch also adds a few unit tests, and refactors existing tests a
bit to reduce the duplication of setup code.
llvm-svn: 268087
In gcc, \ escapes every character in response files. It is true that this makes
it harder to mention Windows files in rsp files, but not doing this means clang
disagrees with gcc, and also disagrees with the shell (on non-Windows) which
rsp file quoting is supposed to match. clang isn't free to choose what to do
here.
In general, the idea for response files is to take bits of your command line
and write them to a file unchanged, and have things work the same way. Since
the command line would've been interpreted by the shell, things in the rsp file
need to be subject to the same shell quoting rules.
People who want to put Windows-style paths in their response files either need
to do any of:
* escape their backslashes
* or use clang-cl which uses cl.exe/cmd.exe quoting rules
* pass --rsp-quoting=windows to clang to tell it to use
cl.exe/cmd.exe quoting rules for response files.
Fixes PR27464.
http://reviews.llvm.org/D19417
llvm-svn: 267556
This replaces use of std::error_code and ErrorOr in the ORC RPC support library
with Error and Expected. This required updating the OrcRemoteTarget API, Client,
and server code, as well as updating the Orc C API.
This patch also fixes several instances where Errors were dropped.
llvm-svn: 267457
If several regions cover the same area of code, we have to restore
the combined value for that area when return from a nested region.
This patch achieves that by combining regions before calling buildSegments.
Differential Revision: http://reviews.llvm.org/D18610
llvm-svn: 267390
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*. It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.
Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType. The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.
This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata. Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.
The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
llvm-svn: 267296
Each reference to an unresolved MDNode is expensive, since the RAUW
support in MDNode uses a separate allocation and side map. Since
a distinct MDNode doesn't require its operands on creation (unlike
uniuqed nodes, there's no need to check for structural equivalence),
use nullptr for any of its unresolved operands. Besides reducing the
burden on MDNode maps, this can avoid allocating temporary MDNodes in
the first place.
We need some way to track operands. Invent DistinctMDOperandPlaceholder
for this purpose, which is a Metadata subclass that holds an ID and
points at its single user. DistinctMDOperandPlaceholder::replaceUseWith
is just like RAUW, but its name highlights that there is only ever
exactly one use.
There is no support for moving (or, obviously, copying) these. Move
support would be possible but expensive; leaving it unimplemented
prevents user error. In the BitcodeReader I originally considered
allocating on a BumpPtrAllocator and keeping a vector of pointers to
them, and then I realized that std::deque implements exactly this.
A couple of obvious follow-ups:
- Change ValueEnumerator to emit distinct nodes first to take more
advantage of this optimization. (How convenient... I think I might
have a couple of patches for this.)
- Change DIBuilder and its consumers (like CGDebugInfo in clang) to
use something like this when constructing debug info in the first
place.
llvm-svn: 267270
A DenseMap doesn't store the hashes, so it needs to recompute them when
the table is resized.
In some applications the hashing cost is noticeable. That is the case
for example in lld for symbol names (StringRef).
This patch adds a templated structure that can wraps any value that can
go in a DenseMap and caches the hash.
llvm-svn: 266981
Add a new method, DICompositeType::buildODRType, that will create or
mutate the DICompositeType for a given ODR identifier, and use it in
LLParser and BitcodeReader instead of DICompositeType::getODRType.
The logic is as follows:
- If there's no node, create one with the given arguments.
- Else, if the current node is a forward declaration and the new
arguments would create a definition, mutate the node to match the
new arguments.
- Else, return the old node.
This adds a missing feature supported by the current DITypeIdentifierMap
(which I'm slowly making redudant). The only remaining difference is
that the DITypeIdentifierMap has a "the-last-one-wins" rule, whereas
DICompositeType::buildODRType has a "the-first-one-wins" rule.
For now I'm leaving behind DICompositeType::getODRType since it has
obvious, low-level semantics that are convenient for unit testing.
llvm-svn: 266786
The second test in this file is actually testing DICompositeType API,
not LLVMContext API (after r266742 moved it to a higher level). This
really doesn't make sense in an LLVMContextTest. Rename the tests
before adding more.
llvm-svn: 266764
Lift the API for debug info ODR type uniquing up a layer. Instead of
clients managing the map directly on the LLVMContext, add a static
method to DICompositeType called getODRType and handle the map in the
background. Also adds DICompositeType::getODRTypeIfExists, so far just
for convenience in the unit tests.
This simplifies the logic in LLParser and BitcodeReader. Because of
argument spam there are actually a few more lines of code now; I'll see
if I come up with a reasonable way to clean that up.
llvm-svn: 266742
Tighten up the API for debug info ODR type uniquing in LLVMContext. The
only reason to allow other DIType subclasses is to make the unit tests
prettier :/.
llvm-svn: 266737
As per David's review, rename everything in the new API for ODR type
uniquing of debug info.
ensureDITypeMap => enableDebugTypeODRUniquing
destroyDITypeMap => disableDebugTypeODRUniquing
hasDITypeMap => isODRUniquingDebugTypes
llvm-svn: 266713
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.
This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).
This commit reapplies r266436, which was reverted by r266458,
with the .covmapping file serialized in v1 format.
Differential Revision: http://reviews.llvm.org/D18787
llvm-svn: 266620
This reverts commit r266477.
This commit introduces cyclic dependency. This commit has "Analysis" depend on "ProfileData",
while "ProfileData" depends on "Object", which depends on "BitCode", which
depends on "Analysis".
llvm-svn: 266619
Three problems:
1. <future> can't be easily used. If you must use it, see
include/Support/ThreadPool.h for how.
2. constexpr problems, even after 266588.
3. Move assignment operators can't be defaulted in MSVC2013.
llvm-svn: 266615
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
asynchronous call/handle. Also updates the ORC remote JIT API to use the new
scheme.
The previous version of the RPC tools only supported void functions, and
required the user to manually call a paired function to return results. This
patch replaces the Procedure typedef (which only supported void functions) with
the Function typedef which supports return values, e.g.:
Function<FooId, int32_t(std::string)> Foo;
The RPC primitives and channel operations are also expanded. RPC channels must
support four new operations: startSendMessage, endSendMessage,
startRecieveMessage and endRecieveMessage, to handle channel locking. In
addition, serialization support for tuples to RPCChannels is added to enable
multiple return values.
The RPC primitives are expanded from callAppend, call, expect and handle, to:
appendCallAsync - Make an asynchronous call to the given function.
callAsync - The same as appendCallAsync, but calls send on the channel when
done.
callSTHandling - Blocking call for single-threaded code. Wraps a call to
callAsync then waits on the result, using a user-supplied
handler to handle any callbacks from the remote.
callST - The same as callSTHandling, except that it doesn't handle
callbacks - it expects the result to be the first return.
expect and handle - as before.
handleResponse - Handle a response from the remote.
waitForResult - Wait for the response with the given sequence number to arrive.
llvm-svn: 266581
Rather than relying on the structural equivalence of DICompositeType to
merge type definitions, use an explicit map on the LLVMContext that
LLParser and BitcodeReader consult when constructing new nodes.
Each non-forward-declaration DICompositeType with a non-empty
'identifier:' field is stored/loaded from the type map, and the first
definiton will "win".
This map is opt-in: clients that expect ODR types from different modules
to be merged must call LLVMContext::ensureDITypeMap.
- Clients that just happen to load more than one Module in the same
LLVMContext won't magically merge types.
- Clients (like LTO) that want to continue to merge types based on ODR
identifiers should opt-in immediately.
I have updated LTOCodeGenerator.cpp, the two "linking" spots in
gold-plugin.cpp, and llvm-link (unless -disable-debug-info-type-map) to
set this.
With this in place, it will be straightforward to remove the DITypeRef
concept (i.e., referencing types by their 'identifier:' string rather
than pointing at them directly).
llvm-svn: 266549
Stop memoizing ConstantAsMetadata in ValueMapper::mapMetadata. Now we
have to recompute it, but these metadata aren't particularly common, and
it restricts the lifetime of the Metadata map unnecessarily.
(The motivation is that I have a patch which uses a single Metadata map
for the lifetime of IRMover. Mehdi profiled r266446 with the patch
applied and we saw a pretty big speedup in lib/Linker.)
llvm-svn: 266513
This reverts commit r266507, reapplying r266503 (and r266505
"ValueMapper: Use API from r266503 in unit tests, NFC") completely
unchanged.
I reverted because of a bot failure here:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810/
However, looking more closely, the failure was from a host-compiler
crash (clang 3.7.1) when building:
lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/DwarfAccelTable.cpp.o
I didn't modify that file, or anything it includes, with that commit.
The next build (which hadn't picked up my revert) got past it:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16811/
I think this was just unfortunate timing. I suppose the bot must be
flakey.
llvm-svn: 266510
This reverts commit r266503, in case it's the root cause of this bot
failure:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/16810
I'm also reverting r266505 -- "ValueMapper: Use API from r266503 in unit
tests, NFC" -- since it's in the way.
llvm-svn: 266507
Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count.
Differential Revision: http://reviews.llvm.org/D18622
llvm-svn: 266477
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.
Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.
Motivation
----------
Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.
We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.
Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.
http://reviews.llvm.org/D19034
<rdar://problem/25256815>
llvm-svn: 266446
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.
This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).
Differential Revision: http://reviews.llvm.org/D18787
llvm-svn: 266436
At the same time, fixes InstructionsTest::CastInst unittest: yes
you can leave the IR in an invalid state and exit when you don't
destroy the context (like the global one), no longer now.
This is the first part of http://reviews.llvm.org/D19094
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266379
Fix a major bug from r265456. Although it's now much rarer, ValueMapper
sometimes has to duplicate cycles. The
might-transitively-reference-a-temporary counts don't decrement on their
own when there are cycles, and you need to call MDNode::resolveCycles to
fix it.
r265456 was checking the input nodes to see if they were unresolved.
This is useless; they should never be unresolved. Instead we should
check the output nodes and resolve cycles on them.
llvm-svn: 266258
A DISubprogram on x86_64 was 48 bytes. During an LTO build we
end up allocating *a lot* of these (see Duncan's numbers on
llvm-dev and/or my numbers in the review link).
This change reduces the size to 40 bytes, with a nice effect
on peak memory usage when LTO'ing clang.
There are more classes in the hierarchy which can be compacted
so more patches will come. DISubprogram was the biggest offender
in my profiling, anyway.
Differential Revision: http://reviews.llvm.org/D18918
llvm-svn: 266241
Raw function pointer collected by value
profile data may be from external functions
that are not instrumented. They won't have
mapping data to be used by the deserializer.
Force the value to be 0 in this case.
llvm-svn: 265890
Prevent the Metadata side-table in ValueMap from growing unnecessarily
when RF_NoModuleLevelChanges. As a drive-by, make ValueMap::hasMD,
which apparently had no users until I used it here for testing, actually
compile.
llvm-svn: 265828
Stop adding MDString to the Metadata section of the ValueMap in
MapMetadata. It blows up the size of the map for no benefit, since we
can always return quickly anyway.
There is a potential follow-up that I don't think I'll push on right
away, but maybe someone else is interested: stop checking for a
pre-mapped MDString, and move the `isa<MDString>()` checks in
Mapper::mapSimpleMetadata and MDNodeMapper::getMappedOp in front of the
`VM.getMappedMD()` calls. While this would preclude explicitly
remapping MDStrings it would probably be a little faster.
llvm-svn: 265827
This reverts commit r265765, reapplying r265759 after changing a call from
LocalAsMetadata::get to ValueAsMetadata::get (and adding a unit test). When a
local value is mapped to a constant (like "i32 %a" => "i32 7"), the new debug
intrinsic operand may no longer be pointing at a local.
http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/19020/
The previous coommit message follows:
--
This is a partial re-commit -- maybe more of a re-implementation -- of
r265631 (reverted in r265637).
This makes RF_IgnoreMissingLocals behave (almost) consistently between
the Value and the Metadata hierarchy. In particular:
- MapValue returns nullptr or "metadata !{}" for missing locals in
MetadataAsValue/LocalAsMetadata bridging paris, depending on
the RF_IgnoreMissingLocals flag.
- MapValue doesn't memoize LocalAsMetadata-related results.
- MapMetadata no longer deals with LocalAsMetadata or
RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but
I realized during testing it would make the patch simpler with no
loss of generality.)
r265631 went too far, making both functions universally ignore
RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt.
Reassociate (and possibly other passes) don't currently maintain
dominates-use invariants for metadata operands, resulting in IR like
this:
define void @foo(i32 %arg) {
call void @llvm.some.intrinsic(metadata i32 %x)
%x = add i32 1, i32 %arg
}
If the inliner chooses to inline @foo into another function, then
RemapInstruction will call `MapValue(metadata i32 %x)` and assert that
the return is not nullptr.
I've filed PR27273 to add a Verifier check and fix the underlying
problem in the optimization passes.
As a workaround, return `!{}` instead of nullptr for unmapped
LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match
the behaviour of r265631.
Original commit message:
ValueMapper: Make LocalAsMetadata match function-local Values
Start treating LocalAsMetadata similarly to function-local members of
the Value hierarchy in MapValue and MapMetadata.
- Don't memoize them.
- Return nullptr if they are missing.
This also cleans up ConstantAsMetadata to stop listening to the
RF_IgnoreMissingLocals flag.
llvm-svn: 265768
This is a partial re-commit -- maybe more of a re-implementation -- of
r265631 (reverted in r265637).
This makes RF_IgnoreMissingLocals behave (almost) consistently between
the Value and the Metadata hierarchy. In particular:
- MapValue returns nullptr or "metadata !{}" for missing locals in
MetadataAsValue/LocalAsMetadata bridging paris, depending on
the RF_IgnoreMissingLocals flag.
- MapValue doesn't memoize LocalAsMetadata-related results.
- MapMetadata no longer deals with LocalAsMetadata or
RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but
I realized during testing it would make the patch simpler with no
loss of generality.)
r265631 went too far, making both functions universally ignore
RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt.
Reassociate (and possibly other passes) don't currently maintain
dominates-use invariants for metadata operands, resulting in IR like
this:
define void @foo(i32 %arg) {
call void @llvm.some.intrinsic(metadata i32 %x)
%x = add i32 1, i32 %arg
}
If the inliner chooses to inline @foo into another function, then
RemapInstruction will call `MapValue(metadata i32 %x)` and assert that
the return is not nullptr.
I've filed PR27273 to add a Verifier check and fix the underlying
problem in the optimization passes.
As a workaround, return `!{}` instead of nullptr for unmapped
LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match
the behaviour of r265631.
Original commit message:
ValueMapper: Make LocalAsMetadata match function-local Values
Start treating LocalAsMetadata similarly to function-local members of
the Value hierarchy in MapValue and MapMetadata.
- Don't memoize them.
- Return nullptr if they are missing.
This also cleans up ConstantAsMetadata to stop listening to the
RF_IgnoreMissingLocals flag.
llvm-svn: 265759
Remove the assertion that disallowed the combination, since
RF_IgnoreMissingLocals should have no effect on globals. As it happens,
RF_NullMapMissingGlobalValues asserted in MapValue(Constant*,...), so I
also changed a cast to a cast_or_null to get my test passing.
llvm-svn: 265633
Start treating LocalAsMetadata similarly to function-local members of
the Value hierarchy in MapValue and MapMetadata.
- Don't memoize them.
- Return nullptr if they are missing.
This also cleans up ConstantAsMetadata to stop listening to the
RF_IgnoreMissingLocals flag.
llvm-svn: 265631
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.
The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).
This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.
As a follow-up I'll add:
bool operator<(AtomicOrdering, AtomicOrdering) = delete;
bool operator>(AtomicOrdering, AtomicOrdering) = delete;
bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.
Reviewers: jyknight, reames
Subscribers: jyknight, llvm-commits
Differential Revision: http://reviews.llvm.org/D18775
llvm-svn: 265602
Instead of copying arguments from the source function to the
destination, steal them. This has a few advantages.
- The ValueMap doesn't need to be seeded with (or cleared of)
Arguments.
- Often the destination function won't have created any arguments yet,
so this avoids malloc traffic.
- Argument names don't need to be copied.
Because argument lists are lazy, this required a new
Function::stealArgumentListFrom helper.
llvm-svn: 265519
This commit completely rewrites Mapper::mapMetadata (the implementation
of llvm::MapMetadata) using an iterative algorithm. The guts of the new
algorithm are in MDNodeMapper::map, the entry function in a new class.
Previously, Mapper::mapMetadata performed a recursive exploration of the
graph with eager "just in case there's a reason" malloc traffic.
The new algorithm has these benefits:
- New nodes and temporaries are not created eagerly.
- Uniquing cycles are not duplicated (see new unit test).
- No recursion.
Given a node to map, it does this:
1. Use a worklist to perform a post-order traversal of the transitively
referenced unmapped nodes.
2. Track which nodes will change operands, and which will have new
addresses in the mapped scheme. Propagate the changes through the
POT until fixed point, to pick up uniquing cycles that need to
change.
3. Map all the distinct nodes without touching their operands. If
RF_MoveDistinctMetadata, they get mapped to themselves; otherwise,
they get mapped to clones.
4. Map the uniqued nodes (bottom-up), lazily creating temporaries for
forward references as needed.
5. Remap the operands of the distinct nodes.
Mehdi helped me out by profiling this with -flto=thin. On his workload
(importing/etc. for opt.cpp), MapMetadata sped up by 15%, contributed
about 50% less to persistent memory, and made about 100x fewer calls to
malloc. The speedup is less than I'd hoped. The profile mainly blames
DenseMap lookups; perhaps there's a way to reduce them (e.g., by
disallowing remapping of MDString).
It would be nice to break the strange remaining recursion on the Value
side: MapValue => materializeInitFor => RemapInstruction => MapValue. I
think we could do this by having materializeInitFor return a worklist of
things to be remapped.
llvm-svn: 265456
Some Include What You Use suggestions were used too.
Use anonymous namespaces in source files.
Differential revision: http://reviews.llvm.org/D18778
llvm-svn: 265454
destruction.
This makes the Expected<T> class behave like Error, even when in success mode.
Expected<T> values must be checked to see whether they contain an error prior
to being dereferenced, assigned to, or destructed.
llvm-svn: 265446
Summary:
A character within a string literal is not escaped correctly.
In this case, there is no semantic change because the invalid character turn out to be NUL anyway.
note: "\0x12" is equivalent to {0, 'x', '1', '2'} and not { 12 }.
This issue was found by clang-tidy.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18747
llvm-svn: 265376
We were using array_pod_sort on an array of type 'Attribute', which
wraps a pointer to AttributeImpl. For the most part this didn't matter
because the printing code prints enum attributes in a defined order, but
integer attributes such as 'align' and 'dereferenceable' were not
ordered.
Furthermore, AttributeImpl::operator< was broken for integer attributes.
An integer attribute is a kind and an integer value, and both pieces
need to be compared.
By fixing the comparison operator, we can go back to std::sort, and
things look good now. This should fix clang arm-swiftcall.c test
failures on Windows.
llvm-svn: 265361
Support seeding a ValueMap with nullptr for Metadata entries, a
situation I didn't consider in the Metadata/Value split.
I added a ValueMapper::getMappedMD accessor that returns an
Optional<Metadata*> with the mapped (possibly null) metadata. IRMover
needs to use this to avoid modifying the map when it's checking for
unneeded subprograms. I updated a call from bugpoint since I find the
new code clearer.
llvm-svn: 265228
Provide a class to generate a SHA1 from a sequence of bytes, and
a convenience raw_ostream adaptor.
This will be used to provide a "build-id" by hashing the Module
block when writing bitcode. ThinLTO will use this information for
incremental build.
Reapply r265094 which was reverted in r265102 because it broke
MSVC bots (constexpr is not supported).
http://reviews.llvm.org/D16325
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265107
This reverts commit r265096, r265095, and r265094.
Windows build is broken, and the validation does not pass.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265102
Provide a class to generate a SHA1 from a sequence of bytes, and
a convenience raw_ostream adaptor.
This will be used to provide a "build-id" by hashing the Module
block when writing bitcode. ThinLTO will use this information for
incremental build.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265094
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.
http://reviews.llvm.org/D18612
<rdar://problem/25427165>
llvm-svn: 265077
Commit r260791 contained an error in that it would introduce a cross-module
reference in the old module. It also introduced O(N^2) complexity in the
module cloner by requiring the entire module to be visited for each function.
Fix both of these problems by avoiding use of the CloneDebugInfoMetadata
function (which is only designed to do intra-module cloning) and cloning
function-attached metadata in the same way that we clone all other metadata.
Differential Revision: http://reviews.llvm.org/D18583
llvm-svn: 264935
There is code under review that requires StringMap to have a copy constructor,
and this makes StringMap more consistent with our other containers (like
DenseMap) that have copy constructors.
Differential Revision: http://reviews.llvm.org/D18506
llvm-svn: 264906
Using ArrayRef in annotateValueSite's parameter instead of using an array
and it's size.
Differential Revision: http://reviews.llvm.org/D18568
llvm-svn: 264879
Since we have moved to a model where functions are imported in bulk from
each source module after making summary-based importing decisions, there
is no longer a need to link metadata as a postpass, and all users have
been removed.
This essentially reverts r255909 and follow-on fixes.
llvm-svn: 264763
Function names in ObjC can have spaces in them. This interacts poorly
with name compression, which uses spaces to separate PGO names. Fix the
issue by using a different separator and update a test.
I chose "\01" as the separator because 1) it's non-printable, 2) we
strip it from PGO names, and 3) it's the next natural choice once "\00"
is discarded (that one's overloaded).
What's changed since the original commit?
- I fixed up the covmap-V2 binary format tests using a linux VM.
- I weakened the CHECK lines in instrprof-comdat.h to account for the
fact that there have been bugfixes to clang coverage. These will be
fixed up in a follow-up.
- I added an assert to make sure we don't get bitten by this again.
- I constructed the c-general.profraw file without name compression
enabled to appease some bots.
Differential Revision: http://reviews.llvm.org/D18516
llvm-svn: 264658
Explicitly check that artificial byte limit is rounded correctly by
exposing BitstreamReader::Size through a new accessor, getSizeIfKnown.
The original code for rounding (from r264547) wasn't obviously correct,
and even though r264623 cleaned it up (by calling llvm::alignTo) I think
it's worth testing.
llvm-svn: 264650
Function names in ObjC can have spaces in them. This interacts poorly
with name compression, which uses spaces to separate PGO names. Fix the
issue by using a different separator and update a test.
I chose "\01" as the separator because 1) it's non-printable, 2) we
strip it from PGO names, and 3) it's the next natural choice once "\00"
is discarded (that one's overloaded).
This reverts the revert commit beaf3d18. What's changed?
- I fixed up the covmap-V2 binary format tests using a linux VM.
- I updated the expected counts in instrprof-comdat.h to account for
the fact that there have been bugfixes to clang coverage.
- I added an assert to make sure we don't get bitten by this again.
Differential Revision: http://reviews.llvm.org/D18516
llvm-svn: 264641
Function names in ObjC can have spaces in them. This interacts poorly
with name compression, which uses spaces to separate PGO names. Fix the
issue by using a different separator and update a test.
I chose "\01" as the separator because 1) it's non-printable, 2) we
strip it from PGO names, and 3) it's the next natural choice once "\00"
is discarded (that one's overloaded).
Differential Revision: http://reviews.llvm.org/D18516
llvm-svn: 264587
When emitting coverage mappings for functions with local linkage and an
unknown filename, we use "<unknown>:func" for the PGO function name. The
problem is that we don't strip "<unknown>" from the name when loading
coverage data, like we do for other file names. Fix that and add a test.
llvm-svn: 264559
Split helper out of EmitRecordWithAbbrevImpl called emitBlob to reduce
code duplication, and add a few tests for it.
No functionality change intended.
llvm-svn: 264550
The implementation is fairly obvious. This is preparation for using
some blobs in bitcode.
For clarity (and perhaps future-proofing?), I moved the call to
JumpToBit in BitstreamCursor::readRecord ahead of calling
MemoryObject::getPointer, since JumpToBit can theoretically (a) read
bytes, which (b) invalidates the blob pointer.
This isn't strictly necessary the two memory objects we have:
- The return of RawMemoryObject::getPointer is valid until the memory
object is destroyed.
- StreamingMemoryObject::getPointer is valid until the next chunk is
read from the stream. Since the JumpToBit call is only going ahead
to a word boundary, we'll never load another chunk.
However, reordering makes it clear by inspection that the blob returned
by BitstreamCursor::readRecord will be valid.
I added some tests for StreamingMemoryObject::getPointer and
BitstreamCursor::readRecord.
llvm-svn: 264549
Change the filename to indicate this is a test, rename the tests, move
them into an anonymous namespace, and rename some variables. All to
match our usual style before making further changes.
llvm-svn: 264548
Allow users of SimpleBitstreamCursor to limit the number of bytes
available to the cursor. This is preparation for instantiating a cursor
that isn't allowed to load more bytes from a StreamingMemoryObject (just
move around the ones already-loaded).
llvm-svn: 264547
Add API to SimpleBitstreamCursor to allow users to translate between
byte addresses and pointers.
- jumpToPointer: move the bit position to a particular pointer.
- getPointerToByte: get the pointer for a particular byte.
- getPointerToBit: get the pointer for the byte of the current bit.
- getCurrentByteNo: convenience function for assertions and tests.
Mainly adds unit tests (getPointerToBit/Byte already has a use), but
also preparation for eventually using jumpToPointer.
llvm-svn: 264546
This makes us no longer relying on move-construction elision by the compiler.
Suggested by D. Blaikie.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264475
This helper method creates a pre-checked Error suitable for use as an out
parameter in a constructor. This avoids the need to have the constructor
check a known-good error before assigning to it.
llvm-svn: 264467
This is a recommit of r264414 after fixing the buildbot failure caused by
incompatible use of std::vector.erase().
The original message:
Add erase() which returns an iterator pointing to the next element after the
erased one. This makes it possible to erase selected elements while iterating
over the SetVector :
while (I != E)
if (test(*I))
I = SetVector.erase(I);
else
++I;
Reviewers: qcolombet, mcrosier, MatzeB, dblaikie
Subscribers: dberlin, dblaikie, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18281
llvm-svn: 264450
Summary:
Add erase() which returns an iterator pointing to the next element after the
erased one. This makes it possible to erase selected elements while iterating
over the SetVector :
while (I != E)
if (test(*I))
I = SetVector.erase(I);
else
++I;
Reviewers: qcolombet, mcrosier, MatzeB, dblaikie
Subscribers: dberlin, dblaikie, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D18281
llvm-svn: 264414
Summary:
Loading IR with debug info improves MDString::get() from 19ms to 10ms.
This is a rework of D16597 with adding an "emplace" method on the StringMap
to avoid requiring the MDString move ctor to be public.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17920
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264386
Summary:
StringMap ctor accepts an initialize size, but expect it to be
rounded to the next power of 2. The ctor can handle that directly
instead of expecting clients to round it. Also, since the map will
resize itself when 75% full, take this into account an initialize
a larger initial size to avoid any growth.
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18344
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264385
Summary:
Just running the loop in the unittests for a few more iterations
(till 48) exhibit that the condition on the limit was not handled
properly in r263522.
Rewrite the test to use a class to count move/copies that happens
when inserting into the map.
Also take the opportunity to refactor the logic to compute the
number of buckets required for a given number of entries in the map.
Use this when constructing a DenseMap with a desired size given to
the constructor (and add a tests for this).
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18345
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264384
This is a temporary crutch to enable code that currently uses std::error_code
to be incrementally moved over to Error. Requiring all Error instances be
convertible enables clients to call errorToErrorCode on any error (not just
ECErrors created by conversion *from* an error_code).
This patch also moves code for Error from ErrorHandling.cpp into a new
Error.cpp file.
llvm-svn: 264221