reference-edge SCCs.
This essentially builds a more normal call graph as a subgraph of the
"reference graph" that was the old model. This allows both to exist and
the different use cases to use the aspect which addresses their needs.
Specifically, the pass manager and other *ordering* constrained logic
can use the reference graph to achieve conservative order of visit,
while analyses reasoning about attributes and other properties derived
from reachability can reason about the direct call graph.
Note that this isn't necessarily complete: it doesn't model edges to
declarations or indirect calls. Those can be found by scanning the
instructions of the function if desirable, and in fact every user
currently does this in order to handle things like calls to instrinsics.
If useful, we could consider caching this information in the call graph
to save the instruction scans, but currently that doesn't seem to be
important.
An important realization for why the representation chosen here works is
that the call graph is a formal subset of the reference graph and thus
both can live within the same data structure. All SCCs of the call graph
are necessarily contained within an SCC of the reference graph, etc.
The design is to build 'RefSCC's to model SCCs of the reference graph,
and then within them more literal SCCs for the call graph.
The formation of actual call edge SCCs is not done lazily, unlike
reference edge 'RefSCC's. Instead, once a reference SCC is formed, it
directly builds the call SCCs within it and stores them in a post-order
sequence. This is used to provide a consistent platform for mutation and
update of the graph. The post-order also allows for very efficient
updates in common cases by bounding the number of nodes (and thus edges)
considered.
There is considerable common code that I'm still looking for the best
way to factor out between the various DFS implementations here. So far,
my attempts have made the code harder to read and understand despite
reducing the duplication, which seems a poor tradeoff. I've not given up
on figuring out the right way to do this, but I wanted to wait until
I at least had the system working and tested to continue attempting to
factor it differently.
This also requires introducing several new algorithms in order to handle
all of the incremental update scenarios for the more complex structure
involving two edge colorings. I've tried to comment the algorithms
sufficiently to make it clear how this is expected to work, but they may
still need more extensive documentation.
I know that there are some changes which are not strictly necessarily
coupled here. The process of developing this started out with a very
focused set of changes for the new structure of the graph and
algorithms, but subsequent changes to bring the APIs and code into
consistent and understandable patterns also ended up touching on other
aspects. There was no good way to separate these out without causing
*massive* merge conflicts. Ultimately, to a large degree this is
a rewrite of most of the core algorithms in the LCG class and so I don't
think it really matters much.
Many thanks to the careful review by Sanjoy Das!
Differential Revision: http://reviews.llvm.org/D16802
llvm-svn: 261040
Add support for trimming a single kind of character from a StringRef.
This makes the common case of trimming null bytes much neater. It's also
probably a bit speedier too, since it avoids creating a std::bitset in
find_{first,last}_not_of.
llvm-svn: 260925
Summary:
Export the CloneDebugInfoMetadata utility, which clones all debug info
associated with a function into the first module. Also use this function
in CloneModule on each function we clone (the CloneFunction entrypoint
already does this).
Without this, cloning a module will lead to DI quality regressions,
especially since r252219 reversed the Function <-> DISubprogram edge
(before we could get lucky and have this edge preserved if the
DISubprogram itself was, e.g. due to location metadata).
This was verified to fix missing debug information in julia and
a unittest to verify the new behavior is included.
Patch by Yichao Yu! Thanks!
Reviewers: loladiro, pcc
Differential Revision: http://reviews.llvm.org/D17165
llvm-svn: 260791
As support expands to more runtimes, we'll need to
distinguish between more than just HSA and unknown.
This also lets us stop using unknown everywhere.
llvm-svn: 260790
Add another interface to function annotateValueSite() which directly uses the
VauleData array.
Differential Revision: http://reviews.llvm.org/D17108
llvm-svn: 260741
The patch adds a parameter in annotateValueSite() to control the max number
of records written to the value profile meta data for each value site. The
default is kept as the current value of 3.
Differential Revision: http://reviews.llvm.org/D17084
llvm-svn: 260450
Patch by Rong Xu
The problem is exposed by intra-module indirect call promotion where
prof symtab is created from module which does not contain all symbols
from the program. With partial symtab, the result needs to be checked
more strictly.
llvm-svn: 260361
This patch adds a new class, OrcI386, which contains the hooks needed to
support lazy-JITing on i386 (currently only for Pentium 2 or above, as the JIT
re-entry code uses the FXSAVE/FXRSTOR instructions).
Support for i386 is enabled in the LLI lazy JIT and the Orc C API, and
regression and unit tests are enabled for this architecture.
llvm-svn: 260338
This pass implements whole program optimization of virtual calls in cases
where we know (via bitset information) that the list of callees is fixed. This
includes the following:
- Single implementation devirtualization: if a virtual call has a single
possible callee, replace all calls with a direct call to that callee.
- Virtual constant propagation: if the virtual function's return type is an
integer <=64 bits and all possible callees are readnone, for each class and
each list of constant arguments: evaluate the function, store the return
value alongside the virtual table, and rewrite each virtual call as a load
from the virtual table.
- Uniform return value optimization: if the conditions for virtual constant
propagation hold and each function returns the same constant value, replace
each virtual call with that constant.
- Unique return value optimization for i1 return values: if the conditions
for virtual constant propagation hold and a single vtable's function
returns 0, or a single vtable's function returns 1, replace each virtual
call with a comparison of the vptr against that vtable's address.
Differential Revision: http://reviews.llvm.org/D16795
llvm-svn: 260312
compiler-specific issues. Instead, repeat an 'operator delete' definition in
each derived class that is actually deleted, and give up on the static type
safety of an error when sized delete is accidentally used on a type derived
from TrailingObjects.
llvm-svn: 260190
This fixes undefined behavior in C++14 due to the size of the object being
deleted being different from sizeof(dynamic type) when it is allocated with
trailing objects.
MSVC seems to have several bugs around using-declarations changing the access
of a member inherited from a base class, so use forwarding functions instead of
using-declarations to make TrailingObjects::operator delete accessible where
desired.
llvm-svn: 260180
Summary:
Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.
I plan to add more tests as I add new functionality and find/fix bugs.
Reviewers: chandlerc, hfinkel, sanjoy
Subscribers: zzheng, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D16623
llvm-svn: 260169
The Windows bots have been failing for the last two days, with:
FAILED: C:\PROGRA~2\MICROS~1.0\VC\bin\amd64\cl.exe -c LLVMContextImpl.cpp
D:\buildslave\clang-x64-ninja-win7\llvm\lib\IR\LLVMContextImpl.cpp(137) :
error C2248: 'llvm::TrailingObjects<llvm::AttributeSetImpl,
llvm::IndexAttrPair>::operator delete' :
cannot access private member declared in class 'llvm::AttributeSetImpl'
TrailingObjects.h(298) : see declaration of
'llvm::TrailingObjects<llvm::AttributeSetImpl,
llvm::IndexAttrPair>::operator delete'
AttributeImpl.h(213) : see declaration of 'llvm::AttributeSetImpl'
llvm-svn: 260053
-fsized-deallocation. Disable sized deallocation for all objects derived from
TrailingObjects, as we expect the storage allocated for these objects to be
larger than the size of their dynamic type.
llvm-svn: 259942
Unfortunately, ProgramInfo::ProcessId is signed on Unix and unsigned on
Windows, breaking the standard fix of using '0U' in the gtest
expectation.
llvm-svn: 259704
With this patch, the profile summary data will be available in indexed
profile data file so that profiler reader/compiler optimizer can start
to make use of.
Differential Revision: http://reviews.llvm.org/D16258
llvm-svn: 259626
differentiate between indirect references to functions an direct calls.
This doesn't do a whole lot yet other than change the print out produced
by the analysis, but it lays the groundwork for a very major change I'm
working on next: teaching the call graph to actually be a call graph,
modeling *both* the indirect reference graph and the call graph
simultaneously. More details on that in the next patch though.
The rest of this is essentially a bunch of over-engineering that won't
be interesting until the next patch. But this also isolates essentially
all of the churn necessary to introduce the edge abstraction from the
very important behavior change necessary in order to separately model
the two graphs. So it should make review of the subsequent patch a bit
easier at the cost of making this patch seem poorly motivated. ;]
Differential Revision: http://reviews.llvm.org/D16038
llvm-svn: 259463
With poorly chosen custom parameters, the line table encoding logic would
sometimes end up generating a special opcode bigger than 255, which is wrong.
The set of default parameters that LLVM uses isn't subject to this bug.
When carefully chosing the line table parameters, it's impossible to fall into the
corner case that this patch fixes. The standard however doesn't require that these
parameters be carefully chosen. And even if it did, we shouldn't generate broken
encoding.
Add a unittest for this specific encoding bug, and while at it, create some unit
tests for the encoding logic using different sets of parameters.
llvm-svn: 259334
Add an option to llvm-profdata merge for writing out sparse indexed
profiles. These profiles omit InstrProfRecords for functions which are
never executed.
Differential Revision: http://reviews.llvm.org/D16727
llvm-svn: 259258
This reverts commit r259117.
The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.
llvm-svn: 259126
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:
- .cv_file: Similar to DWARF .file directives
- .cv_loc: Similar to the DWARF .loc directive, but starts with a
function id. CodeView line tables are emitted by function instead of
by compilation unit, so we needed an extra field to communicate this.
Rather than overloading the .loc direction further, we decided it was
better to have our own directive.
- .cv_stringtable: Emits the codeview string table at the current
position. Currently this just contains the filenames as
null-terminated strings.
- .cv_filechecksums: Emits the file checksum table for all files used
with .cv_file so far. There is currently no support for emitting
actual checksums, just filenames.
This moves the line table emission code down into the assembler. This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.
David Majnemer collaborated on this patch.
llvm-svn: 259117
at least as big as the mach header to be identified as a Mach-O file and
make sure smaller files are not identified as a Mach-O files but as
unknown files. Also fix identify_magic() so it looks at all 4 bytes of
the filetype field when determining the type of the Mach-O file.
Then fix the macho-invalid-header test case to check that it is an
unknown file and make sure it does not get the error for
object_error::parse_failed. And also update the unit tests.
llvm-svn: 258883
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi
Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16471
llvm-svn: 258861
Summary:
Update ObjectTransformLayer::addObjectSet to take the object set by
value rather than reference and pass it to the base layer with move
semantics rather than copy, to match r258185's changes to
ObjectLinkingLayer.
Update the unit test to verify that ObjectTransformLayer's signature stays
in sync with ObjectLinkingLayer's.
Reviewers: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16414
llvm-svn: 258630
Using an array instead of ArrayRef would allow type inference, but
(short of using C99) one would still need to write
typedef uint16_t VT[];
LE.write(VT{0x1234, 0x5678});
llvm-svn: 258535
they're needed.
Prior to this patch objects were loaded (via RuntimeDyld::loadObject) when they
were added to the ObjectLinkingLayer, but were not relocated and finalized until
a symbol address was requested. In the interim, another object could be loaded
and finalized with the same memory manager, causing relocation/finalization of
the first object to fail (as the first finalization call may have marked the
allocated memory for the first object read-only).
By deferring the loadObject call (and subsequent memory allocations) until an
object file is needed we can avoid prematurely finalizing memory.
llvm-svn: 258185
Previously these were Darwin-only. Since the switch to direct binary emission
of stubs, trampolines and resolver blocks, these should work on other *nix
platforms too.
These tests can be enabled on Windows once known issues with ORC's handling of
Windows symbol mangling (see e.g. https://llvm.org/PR25940) have been fixed.
llvm-svn: 258031
Summary:
Simplify the memory management of mock IR in test AlterInvokeBundles.
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16211
llvm-svn: 257892
Summary:
Before this the Verifier didn't complain if the GlobalVariable
referenced from a DIGlobalVariable was not in fact in the correct
module (it would crash while writing bitcode though). Fix this by
always checking parantage of GlobalValues while walking constant
expressions and changing the DIGlobalVariable visitor to also
visit the constant it contains.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D16059
llvm-svn: 257825
Summary:
We already have the inverse verification that we only use globals
that are defined in this module. This essentially catches the
same mistake, but when verifying the module that contains the
definition.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15272
llvm-svn: 257823
Summary:
The overloads of CallInst::Create and InvokeInst::Create that are used to
adjust operand bundles purport to create a new instruction "identical in
every way except [for] the operand bundles", so copy the DebugLoc along
with everything else.
Reviewers: sanjoy, majnemer
Subscribers: majnemer, dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D16157
llvm-svn: 257745
Summary:
The problem here is that an enum class can not be implicitly converted to an
integer. That assumption snuck back into PointerIntPair. This commit fixes the
issue and more importantly adds some unittests to make sure that we do not break
this again.
rdar://23594806
Reviewers: gribozavr
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16131
llvm-svn: 257574
Summary: Add SaturatingMultiplyAdd convenience function template since A + (X * Y) comes up frequently when doing weighted arithmetic.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15385
llvm-svn: 257532
This patch adds utilities to ORC for managing a remote JIT target. It consists
of:
1. A very primitive RPC system for making calls over a byte-stream. See
RPCChannel.h, RPCUtils.h.
2. An RPC API defined in the above system for managing memory, looking up
symbols, creating stubs, etc. on a remote target. See OrcRemoteTargetRPCAPI.h.
3. An interface for creating high-level JIT components (memory managers,
callback managers, stub managers, etc.) that operate over the RPC API. See
OrcRemoteTargetClient.h.
4. A helper class for building servers that can handle the RPC calls. See
OrcRemoteTargetServer.h.
The system is designed to work neatly with the existing ORC components and
functionality. In particular, the ORC callback API (and consequently the
CompileOnDemandLayer) is supported, enabling lazy compilation of remote code.
Assuming this doesn't trigger any builder failures, a follow-up patch will be
committed which tests these utilities by using them to replace LLI's existing
remote-JITing demo code.
llvm-svn: 257305
RuntimeDyld::MemoryManager.
The RuntimeDyld::MemoryManager::reserveAllocationSpace method is called when
object files are loaded, and gives clients a chance to pre-allocate memory for
all segments. Previously only the size of each segment (code, ro-data, rw-data)
was supplied but not the alignment. This hasn't caused any problems so far, as
most clients allocate via the MemoryBlock interface which returns page-aligned
blocks. Adding alignment arguments enables finer grained allocation while still
satisfying alignment restrictions.
llvm-svn: 257294
llvm\unittests\ExecutionEngine\Orc\ObjectLinkingLayerTest.cpp(115) : error C2327: 'llvm::OrcExecutionTest::TM' : is not a type name, static, or enumerator
llvm\unittests\ExecutionEngine\Orc\ObjectLinkingLayerTest.cpp(115) : error C2065: 'TM' : undeclared identifier
FYI, "this->TM" was valid even before moving class SectionMemoryManagerWrapper.
llvm-svn: 257290
type.
This makes it easy and safe to use a set of flags as one elmenet of
a tagged union with pointers. There is quite a bit of code that has
historically done this by casting arbitrary integers to "pointers" and
assuming that this was safe and reliable. It is neither, and has started
to rear its head by triggering safety asserts in various abstractions
like PointerLikeTypeTraits when the integers chosen are invariably poor
choices for *some* platform and *some* situation. Not to mention the
(hopefully unlikely) prospect of one of these integers actually getting
allocated!
With this, it will be straightforward to build type safe abstractions
like this without being error prone. The abstraction itself is also
remarkably simple thanks to the implicit conversion.
This use case and pattern was also independently created by the folks
working on Swift, and they're going to incrementally add any missing
functionality they find.
Differential Revision: http://reviews.llvm.org/D15844
llvm-svn: 257284
This is a much more general and powerful form of PointerUnion. It
provides a reasonably complete sum type (from type theory) for
pointer-like types. It has several significant advantages over the
existing PointerUnion infrastructure:
1) It allows more than two pointer types to participate without awkward
nesting structures.
2) It directly exposes the tag so that it is convenient to write
switches over the possible members.
3) It can re-use the same type for multiple tag values, something that
has been worked around by either abusing PointerIntPair or defining
nonce types and doing unsafe pointer casting.
4) It supports customization of the PointerLikeTypeTraits used for
specific member types. This means it could (in theory) be used even
with types that are over-aligned on allocation to expose larger
numbers of bits to the tag.
All in all, I think it is at least complimentary to the existing
infrastructure, and a strict improvement for some use cases.
Differential Revision: http://reviews.llvm.org/D15843
llvm-svn: 257282
managers.
Prior to this patch, recursive finalization (where finalization of one
RuntimeDyld instance triggers finalization of another instance on which the
first depends) could trigger memory access failures: When the inner (dependent)
RuntimeDyld instance and its memory manager are finalized, memory allocated
(but not yet relocated) by the outer instance is locked, and relocation in the
outer instance fails with a memory access error.
This patch adds a latch to the RuntimeDyld::MemoryManager base class that is
checked by a new method: RuntimeDyld::finalizeWithMemoryManagerLocking, ensuring
that shared memory managers are only finalized by the outermost RuntimeDyld
instance.
This allows ORC clients to supply the same memory manager to multiple calls to
addModuleSet. In particular it enables the use of user-supplied memory managers
with the CompileOnDemandLayer which must reuse the supplied memory manager for
each function that is lazily compiled.
llvm-svn: 257263
Done in InstrProfWriter to eliminate the need for client
code to do the sorting. The operation is done once and reused
many times so it is more efficient. Update unit test to remove
sorting. Also update expected output of affected tests.
llvm-svn: 257145
For a new record with weight != 1, only edge profiling
counters are scaled, VP data is not properly scaled.
This patch refactors the code and fixes the problem.
Also added sort by count interface (for follow up patch).
llvm-svn: 257143
Fix PR24852 (crash with -debug -instcombine)
Patch by Than McIntosh <thanm@google.com>
Summary:
Add guards to the asm writer to prevent crashing
when dumping an instruction that has no basic
block.
Differential Revision: http://reviews.llvm.org/D15798
From: Than McIntosh <thanm@google.com>
llvm-svn: 257094
This adds a unittest for the support added in r256648 to add
a flag that can be used to prevent RAUW on temporary metadata
used as a map key.
llvm-svn: 256938
...and mark it as merely an input_iterator rather than a forward_iterator,
since it is destructive. And then rewrite == to take advantage of that.
Patch by Alex Denisov!
llvm-svn: 256913
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings. Convert these to native line endings.
There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.
Reviewers: joerg, aaron.ballman
Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15848
llvm-svn: 256707
This is part of the effort/prepration to reduce the size
instr-pgo (object, binary, memory footprint, and raw data).
The functionality is currently off by default and not yet
used by any clients.
llvm-svn: 256667
This is necessary to use them as part of pointer traits and is generally
useful. I've added unit test coverage to isolate and ensure this works
correctly.
I'll watch the build bots to try to see if any compilers can't tolerate
this bit of magic (and much credit goes to Richard Smith for coming up
with this magical production!) but give a shout if you see issues.
llvm-svn: 256553
Previously, the code enforced non-decreasing alignment of each trailing
type. However, it's easy enough to allow for realignment as needed, and
thus avoid the developer having to think about the possiblilities for
alignment requirements on all architectures.
(E.g. on Linux/x86, a struct with an int64 member is 4-byte aligned,
while on other 32-bit archs -- and even with other OSes on x86 -- it has
8-byte alignment. This sort of thing is irritating to have to manually
deal with.)
llvm-svn: 256533
We didn't actually statically check this, and so it worked 25% of the
time for me. =/ Really sorry it took so long to fix, I shouldn't leave
the commit log editor window open without saving and landing the commit.
=[
llvm-svn: 256528
MSC18 Debug didn't merge them.
FIXME: I tweaked just to appease a builder. Almost string literals should be addressed identically there.
llvm-svn: 256459
Summary:
We need to actually remove the use of the personality function,
otherwise we can run into trouble if we want to e.g. delete
the personality function because ther's no way to get rid of
its uses. Do this by resetting to ConstantPointerNull value
that the operands are set to when first allocated.
Reviewers: vsk, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15752
llvm-svn: 256345
Creator and lookup interfaces are added to this symtab class.
The new interfaces will be used by InstrProf Readers and writer.
A unit test is also added for the new APIs.
llvm-svn: 256092
Remove all checks that required main thread to run faster than tasks in
ThreadPool, and yields which are now unnecessary. This should fix some
bot failures.
llvm-svn: 256056
- Automatic alignment of the base type for the alignment requirements
of the trailing types.
- Support for an arbitrary numbers of trailing types, instead of only
1 or 2, by using a variadic template implementation.
Upcoming commits to clang will take advantage of both of these features.
Differential Revision: http://reviews.llvm.org/D12439
llvm-svn: 256054
Type specific declarations have been moved to Type.h and error handling
routines have been moved to ErrorHandling.h. Both are included in Core.h
so nothing should change for projects directly including the headers,
but transitive dependencies may be affected.
llvm-svn: 255965
The current BranchProbability::normalizeProbabilities() forbids known and
unknown probabilities to coexist in the list. This was once used to help
capture probability exceptions but has caused some reported build
failures (https://llvm.org/bugs/show_bug.cgi?id=25838).
This patch removes this restriction by evenly distributing the complement
of the sum of all known probabilities to unknown ones. We could still
treat this as an abnormal behavior, but it is better to emit warnings in
our future profile validator.
Differential revision: http://reviews.llvm.org/D15548
llvm-svn: 255934
Summary: Surface counter overflow when merging profile data. Merging still occurs on overflow but counts saturate to the maximum representable value. Overflow is reported to the user.
Reviewers: davidxl, dnovillo, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15547
llvm-svn: 255825
Summary:
This change adds support for specifying a weight when merging profile data with the llvm-profdata tool.
Weights are specified by using the --weighted-input=<weight>,<filename> option. Input files not specified
with this option (normal positional list after options) are given a default weight of 1.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: davidxl, dnovillo, bogner, silvas
Subscribers: silvas, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D15306
llvm-svn: 255659
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
This is a recommit of r255444 ; trying to workaround a bug in the
MSVC 2013 standard library. I think I was hit by:
http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable
Recommit of r255589, trying to please g++ as well.
Differential Revision: http://reviews.llvm.org/D15464
From: mehdi_amini <mehdi_amini@91177308-0d34-0410-b5e6-96231b3b80d8>
llvm-svn: 255593
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
This is a recommit of r255444 ; trying to workaround a bug in the
MSVC 2013 standard library. I think I was hit by:
http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable
Differential Revision: http://reviews.llvm.org/D15464
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255589
This patch converts code that has access to a LLVMContext to not take a
diagnostic handler.
This has a few advantages
* It is easier to use a consistent diagnostic handler in a single program.
* Less clutter since we are not passing a handler around.
It does make it a bit awkward to implement some C APIs that return a
diagnostic string. I will propose new versions of these APIs and
deprecate the current ones.
llvm-svn: 255571
This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp)
to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add
support to clang, and extend FMF to the DAG for calls.
Motivating example:
%y = fmul fast float %x, %x
%z = tail call float @sqrtf(float %y)
We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide
attribute for unsafe-math, but we really want to trigger on the instructions themselves:
%z = tail call fast float @sqrtf(float %y)
because in an LTO build it's possible that calls with fast semantics have been inlined into a
function with non-fast semantics.
The code changes and tests are based on the recent commits that added "notail":
http://reviews.llvm.org/rL252368
and added FMF to fcmp:
http://reviews.llvm.org/rL241901
Differential Revision: http://reviews.llvm.org/D14707
llvm-svn: 255555
Make sure to check that the destination type is sized.
A check was present but was incorrectly checking the source type
instead.
Patch by Amaury SECHET!
Differential Revision: http://reviews.llvm.org/D15264
llvm-svn: 255536
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
Differential Revision: http://reviews.llvm.org/D15464
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255444
Summary:
Adds support for in-memory round-trip of sample profile data along with basic
round trip unit tests. This will also make it easier to include unit tests for
future changes to sample profiling.
Reviewers: davidxl, dnovillo, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15211
llvm-svn: 255264
Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
Differential Revision: http://reviews.llvm.org/D14687
llvm-svn: 255245
The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>)
overload has had a typo in it since it was written, where it will
create a Vector instead of an Array. This obviously doesn't work at
all, but it turns out that until r254991 there weren't actually any
callers of this overload. Fix the typo and add some test coverage.
llvm-svn: 255157
Summary:
Improve SaturatingAdd()/SaturatingMultiply() to use bool * to optionally return overflow result.
This should make it clearer that the value is returned at callsites and reduces the size of the implementation.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15219
llvm-svn: 255128
Currently, vectors of halfs end up as ConstantVectors, but there isn't
a good reason they can't be ConstantDataVectors. This should save some
memory.
llvm-svn: 254991
This is needed to support linking of module-level metadata as a
postpass after function importing, where we will be leaving temporary
metadata on imported instructions until the postpass metadata import.
Also added unittest. Split from D14838.
llvm-svn: 254914
Before this patch the diagnostic handler was optional. If it was not
passed, the one in the LLVMContext was used.
That is probably not a pattern we want to follow. If each area has an
optional callback, there is a sea of callbacks and it is hard to follow
which one is called.
Doing this also found cases where the callback is a nice addition, like
testing that no errors or warnings are reported.
The other option is to always use the diagnostic handler in the
LLVMContext. That has a few problems
* To implement the C API we would have to set the diag handler and then
set it back to the original value.
* Code that creates the context might be far away from code that wants
the diagnostics.
I do have a patch that implements the second option and will send that as
an RFC.
llvm-svn: 254777
This class is turning into a useful interface, rather than an implementation
detail, so I'm dropping the 'Base' suffix.
No functional change.
llvm-svn: 254693
This change adds support for an optional weight when merging profile data with the llvm-profdata tool.
Weights are specified by adding an option ':<weight>' suffix to the input file names.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: dnovillo, bogner, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14547
llvm-svn: 254669
Summary: This changes overflow handling during instrumentation profile merge. Rathar than throwing away records that would result in counter overflow, merged counts are instead clamped to the maximum representable value. A warning about counter overflow is still surfaced to the user as before.
Reviewers: dnovillo, davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14893
llvm-svn: 254525
By including the module name in the error message.
This makes the error message much more useful and
saves a trip to the debugger.
Reviewers: dexonsmith
Subscribers: dexonsmith, llvm-commits
Differential Revision: http://reviews.llvm.org/D14473
llvm-svn: 254437
Raw profile writer needs to write all data of one kind in one continuous block,
so the buffer needs to be pre-allocated and passed to the writer method in
pieces for function profile data. The change adds the support for raw value data
writing.
llvm-svn: 254219
This is one of the many steps to commonize value profiling support between profile
runtime and compiler/llvm tools.
After this change, profiler runtime now can share the same C APIs to do VP
serialization/deseriazation with LLVM host tools (and produces value data
in identical format between indexed and raw profile).
It is not yet enabled in profiler runtime yet.
Also added a unit test case to test runtime profile data serialization/deserialization
interfaces implemented using common closure code.
llvm-svn: 254110
Summary: Adds the ability for callers to detect when saturation occurred on the result of saturating addition/multiplication.
Reviewers: davidxl, silvas, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14931
llvm-svn: 253921
Summary:
This change fixes the SaturatingMultiply<T>() function template to not cause undefined behavior with T=uint16_t.
Thanks to Richard Smith's contribution, it also no longer requires an integer division.
Patch by Richard Smith.
Reviewers: silvas, davidxl
Subscribers: rsmith, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14845
llvm-svn: 253870
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.
The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.
The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14755
llvm-svn: 253675
It caused link errors of the form:
InstrProfiling.c:(.text.__llvm_profile_instrument_target+0x1c0): undefined reference to `__sync_fetch_and_add_8'
We had a network outage at the time of the commit so the first build to show a
problem is http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/10827
llvm-svn: 253656
Ubsan detected undefined behavior in the MathExtras SaturatingMultiply test.
This change disables the test while it is being investigated.
llvm-svn: 253539
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253497
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
No functional changes.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253412
Summary:
This patch changes the behavior of path::system_temp_directory() on Windows to be closer to GetTempPath Windows API call. Enforces path separator to be the native one, makes path absolute, etc. GetTempPath is not used directly because of limitations/implementation bugs on Windows 7.
Windows specific unit tests are added. Most of them runs in separated process with modified environment variables.
This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 253345
Useful utility function; this wasn't too hard to do before, but also wasn't
obviously discoverable. Make it explicit. Reviewed offline by Michael
Gottesman.
llvm-svn: 253254
Summary:
* ARMv6KZ is the "canonical" name, given in the ARMARM
* ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
* ARMv6ZK is a popular misspelling, which we should support as an alias.
The patch corrects the handling of the names.
Functional changes:
* ARMv6Z no longer treated as an architecture in its own right
* ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
* arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K
* default ARMv6K CPU changed to arm1176j-s
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14568
llvm-svn: 253206
Re-implement `ilist_node::getNextNode()` and `getPrevNode()` without
relying on the sentinel having a "next" pointer. Instead, get access to
the owning list and compare against the `begin()` and `end()` iterators.
This only works when the node *can* get access to the owning list. The
new support is in `ilist_node_with_parent<>`, and any class `Ty`
inheriting from `ilist_node<NodeTy>` that wants `getNextNode()` and/or
`getPrevNode()` should inherit from
`ilist_node_with_parent<NodeTy, ParentTy>` instead. The requirements:
- `NodeTy` must have a `getParent()` function that returns the parent.
- `ParentTy` must have a `getSublistAccess()` static that, given a(n
ignored) `NodeTy*` (to determine which list), returns a member field
pointer to the appropriate `ilist<>`.
This isn't the cleanest way to get access to the owning list, but it
leverages the API already used in the IR hierarchy (see, e.g.,
`Instruction::getSublistAccess()`).
If anyone feels like ripping out the calls to `getNextNode()` and
`getPrevNode()` and replacing with direct iterator logic, they can also
remove the access function, etc., but as an incremental step, I'm
maintaining the API where it's currently used in tree.
If these requirements are *not* met, call sites with access to the ilist
can call `iplist<NodeTy>::getNextNode(NodeTy*)` directly, as in
ilistTest.cpp.
Why rewrite this?
The old code was broken, calling `getNext()` on a sentinel that possibly
didn't have a "next" pointer at all! The new code avoids that
particular flavour of UB (see the commit message for r252538 for more
details about the "lucky" memory layout that made this function so
interesting).
There's still some UB here: the end iterator gets downcast to `NodeTy*`,
even when it's a sentinel (which is typically
`ilist_half_node<NodeTy*>`). I'll tackle that in follow-up commits.
See this llvm-dev thread for more details:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
What's the danger?
There might be some code that relies on `getNextNode()` or
`getPrevNode()` *never* returning `nullptr` -- i.e., that relies on them
being broken when the sentinel is an `ilist_half_node<NodeTy>`. I tried
to root out those cases with the audits I did leading up to r252380, but
it's possible I missed one or two. I hope not.
(If (1) you have out-of-tree code, (2) you've reverted r252380
temporarily, and (3) you get some weird crashes with this commit, then I
recommend un-reverting r252380 and auditing the compile errors looking
for "strange" implicit conversions.)
llvm-svn: 252694
- Make indexed value profile data more compact by peeling out
the per-site value count field into its own smaller sized array.
- Introduced formal data structure definitions to specify value
profile data layout in indexed format. Previously the layout
of the data is only assumed in the client code (scattered in
three different places : size computation, EmitData, and ReadData
- The new data structure serves as a central place for layout documentation.
- Add interfaces to force BE output for value profile data (testing purpose)
- Add byte swap unit tests
Differential Revision: http://reviews.llvm.org/D14401
llvm-svn: 252563
Summary:
In general GetTempDir follows the same logic as the replaced code: checks env variables TMP, TEMP, USERPROFILE in order. However, it also perform other checks like making separators native (\), making the path absolute, etc.
This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 252366
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.
For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.
This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.
Since this is an IR change, a bitcode upgrade has been provided.
Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.
Differential Revision: http://reviews.llvm.org/D14265
llvm-svn: 252219
Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.
Reviewers: yaron.keren
Subscribers: yaron.keren, llvm-commits
Differential Revision: http://reviews.llvm.org/D14278
llvm-svn: 252039
Bypassing LLVM for this has a number of benefits:
1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
work on Windows as the resolver block was in Darwin asm).
2) For cross-process JITs, it allows resolver blocks and trampolines to be
emitted directly in the target process, reducing cross process traffic.
3) It should be marginally faster.
llvm-svn: 251933
Summary:
The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for current system user.
On Windows and Darwin it returns a path to system-specific user cache directory.
On Linux it follows XDG Base Directory Specification, what is:
- use non-empty $XDG_CACHE_HOME env var,
- use $HOME/.cache.
Reviewers: chapuni, aaron.ballman, rafael
Subscribers: rafael, aaron.ballman, llvm-commits
Differential Revision: http://reviews.llvm.org/D13801
llvm-svn: 251784
1. Added a set of public interfaces in InstrProfRecord
class to access (read/write) value profile data.
2. Changed IndexedProfile reader and writer code to
use the newly defined interfaces and hide implementation
details.
3. Added a couple of unittests for value profiling:
- Test new interfaces to get and set value profile data
- Test value profile data merging with various scenarios.
No functional change is expected. The new interfaces will also
make it possible to change on-disk format of value prof data
to be more compact (to be submitted).
llvm-svn: 251771
This complements CopyConstructorNotSmallTest. If we are testing the copy
constructor in such a way, we should also probably test assignment in the same
way.
llvm-svn: 251736
GNU tools require elfiamcu to take up the entire OS field, so, e.g.
i?86-*-linux-elfiamcu is not considered a legal triple.
Make us compatible.
Differential Revision: http://reviews.llvm.org/D14081
llvm-svn: 251390
This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251222
In this mode it just tries to tail merge the strings without imposing any other
format constrains. It will not, for example, add a null byte between them.
Also add support for keeping a tentative size and offset if we decide to
not optimize after all.
This will be used shortly in lld for merging SHF_STRINGS sections.
llvm-svn: 251153
Summary: This will be used in a future change to ScalarEvolution.
Reviewers: hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13612
llvm-svn: 250975
"external" AA wrapper pass.
This is a generic hook that can be used to thread custom code into the
primary AAResultsWrapperPass for the legacy pass manager in order to
allow it to merge external AA results into the AA results it is
building. It does this by threading in a raw callback and so it is
*very* powerful and should serve almost any use case I have come up with
for extending the set of alias analyses used. The only thing not well
supported here is using a *different order* of alias analyses. That form
of extension *is* supportable with the new pass manager, and I can make
the callback structure here more elaborate to support it in the legacy
pass manager if this is a critical use case that people are already
depending on, but the only use cases I have heard of thus far should be
reasonably satisfied by this simpler extension mechanism.
It is hard to test this using normal facilities (the built-in AAs don't
use this for obvious reasons) so I've written a fairly extensive set of
custom passes in the alias analysis unit test that should be an
excellent test case because it models the out-of-tree users: it adds
a totally custom AA to the system. This should also serve as
a reasonably good example and guide for out-of-tree users to follow in
order to rig up their existing alias analyses.
No support in opt for commandline control is provided here however. I'm
really unhappy with the kind of contortions that would be required to
support that. It would fully re-introduce the analysis group
self-recursion kind of patterns. =/
I've heard from out-of-tree users that this will unblock their use cases
with extending AAs on top of the new infrastructure and let us retain
the new analysis-group-free-world.
Differential Revision: http://reviews.llvm.org/D13418
llvm-svn: 250894
Summary: This patch replaces usage of deprecated SHGetFolderPathW with SHGetKnownFolderPath. The usage of SHGetKnownFolderPath is wrapped to allow queries for other "known" folders in the near future.
Reviewers: aaron.ballman, gbedwell
Subscribers: chapuni, llvm-commits
Differential Revision: http://reviews.llvm.org/D13753
llvm-svn: 250501
This patch adds the underlying infrastructure for an AVR backend to be included into LLVM. It is the first of a series of patches aimed at moving the out-of-tree AVR backend into the tree.
It consists of adding a new`Triple` target 'avr'.
llvm-svn: 250492
With r250345 and r250343, we start to observe the following failure
when bootstrap clang with lto and pgo:
PHI node entries do not match predecessors!
%.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895
label %30998
label %_ZNK4llvm3EVTeqES0_.exit19.thread.i
LLVM ERROR: Broken function found, compilation aborted!
I will re-commit this if the bot does not recover.
llvm-svn: 250366
Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB).
This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added.
Differential revision: http://reviews.llvm.org/D10979
llvm-svn: 250345
On Windows, fs::rename() could fail is another process was reading the
file at the same time using fs::openFileForRead(). In most cases the user
wouldn't notice as fs::rename() will continue to retry for 2000ms. Typically
this is enough for the read to complete and a retry to succeed, but if the
disk is being it too hard then the response time might be longer than the
retry time and the rename would fail with a permission error.
Add FILE_SHARE_DELETE to the sharing flags for CreateFileW() in
fs::openFileForRead() and try ReplaceFileW() prior to MoveFileExW()
in fs::rename().
Based on an initial patch by Edd Dawson!
Differential Revision: http://reviews.llvm.org/D13647
llvm-svn: 250046
Summary:
As per Duncan's review for D12536, I extracted the sub-byte bit aligned
reading and writing code into lib/Support, and generalized it. Added calls from
BackpatchWord. Also added unittests.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13189
llvm-svn: 248897
Add support to the indexed instrprof reader and writer for the format
that will be used for value profiling.
Patch by Betul Buyukkurt, with minor modifications.
llvm-svn: 248833
HHVM calling convention, hhvmcc, is used by HHVM JIT for
functions in translated cache. We currently support LLVM back end to
generate code for X86-64 and may support other architectures in the
future.
In HHVM calling convention any GP register could be used to pass and
return values, with the exception of R12 which is reserved for
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as args, which are our virtual machine's stack pointer
and frame pointer respectively.
When we enter translation cache via hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed
to standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.
One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to standard C calling
convention with an exception of first argument which is passed in RBP
(before we use RDI, RSI, etc.)
Differential Revision: http://reviews.llvm.org/D12681
llvm-svn: 248832
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:
Pros:
1. It uses only a half space of the current one.
2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.
Cons:
1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.
One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.
Differential revision: http://reviews.llvm.org/D12603
llvm-svn: 248633
and assert when mask is too large to apply in the small case,
previously the extra words were silently ignored.
clang-format the entire function to match current code standards.
This is a rewrite of r247972 which was reverted in r247983 due to
warning and possible UB on 32-bits hosts.
llvm-svn: 247993
Extend mask value to 64 bits before taking its complement and assert when mask is
too large to apply in the small case (previously the extra words were silently ignored).
http://reviews.llvm.org/D11890
Patch by James Touton!
llvm-svn: 247972
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).
Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.
This reverts commit 236160.
llvm-svn: 247585
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).
The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.
The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.
Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.
I'll start using this in code in the next commit.
llvm-svn: 247249
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.
Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.
llvm-svn: 247244
The purpose is to allow templated wrapper to work with either
ArrayRef or any convertible operation:
template<typename Container>
void wrapper(const Container &Arr) {
impl(makeArrayRef(Arr));
}
with Container being a std::vector, a SmallVector, or an ArrayRef.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247214
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167